Copy data is a term used to describe the additional copies of production data that are created to feed other processes in the organization. Functions like backup, disaster recovery, development/test, reporting, and analytics all need copies of data. Some need their copy to be as close to the original as possible; others need a static copy maintained for as long as possible.
In this one-minute StorageShort, Steve Kenniston, VP of Technology for Catalogic, and I discuss why all these copies are needed and the problems they cause.
The big problem with copy data is that all of these copies are legitimate; they can’t simply be deleted or moved to tape like an archive data set. The system that stores these copies needs to capture them quickly, automate how they are delivered to the other processes, and in many cases provide access with a reasonable level of performance.
For example, test/dev needs to execute against a very recent copy of the data and be given storage performance similar to production so that teams can properly measure scale and user experience.
As we discuss in our webinar, the growth of copy data is far outpacing the growth of production data. A copy data management solution can limit the physical capacity impact of copy data by leveraging snapshots, and it enhances those snapshots to make them more usable as a data protection and long-term retention strategy. First, these solutions provide search, so that a specific file or file version can be found among potentially thousands of snapshots. Second, some of them provide automation, so that copies can be presented to the various processes described above.
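To make the two capabilities above concrete, here is a minimal sketch of a snapshot catalog that supports both file-version search and handing a point-in-time copy to a consumer like test/dev. All class and method names are illustrative assumptions, not any vendor's actual API; real products index snapshot metadata rather than copying data blocks, which this sketch mimics by storing only filename-to-version mappings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Snapshot:
    taken_at: datetime
    files: dict  # filename -> version id (metadata only; blocks stay shared)

class SnapshotCatalog:
    """Hypothetical copy-data-management index over many snapshots."""

    def __init__(self):
        self.snapshots = []

    def capture(self, taken_at, files):
        # A snapshot records only metadata, keeping capacity impact low.
        self.snapshots.append(Snapshot(taken_at, dict(files)))

    def find_file_versions(self, filename):
        # Search: find every version of one file across all snapshots.
        return sorted({s.files[filename] for s in self.snapshots
                       if filename in s.files})

    def latest_before(self, cutoff):
        # Automation: hand a process (e.g. test/dev) the freshest copy
        # at or before a chosen point in time.
        candidates = [s for s in self.snapshots if s.taken_at <= cutoff]
        return max(candidates, key=lambda s: s.taken_at, default=None)

# Example usage with three snapshots taken over one day
catalog = SnapshotCatalog()
t0 = datetime(2015, 6, 1, 8, 0)
catalog.capture(t0, {"orders.db": "v1", "report.xlsx": "v1"})
catalog.capture(t0 + timedelta(hours=4), {"orders.db": "v2", "report.xlsx": "v1"})
catalog.capture(t0 + timedelta(hours=8), {"orders.db": "v3"})

print(catalog.find_file_versions("orders.db"))  # all versions of one file
print(catalog.latest_before(t0 + timedelta(hours=5)).taken_at)
```

The point of the sketch is the separation of concerns: capture is cheap because only metadata is recorded, while the catalog layer adds the search and automated-presentation features that make snapshots viable for retention.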

