Copy data is getting a lot of attention, and solutions to manage it are flooding the market. But a question few of these solutions let you ask is, “WHERE should that copy data reside?” For the most part, they assume and expect that the organization will host its copy data on-premises. Is that the right place to store copy data?
What is Copy Data?
To clarify, copy data is any copy of primary data. IT uses it for data protection, testing and development, reporting, and analytics. The problem for most organizations is that these copies are exactly that: separate, full copies for each of those use cases. Copy data management solutions promise to eliminate that problem. Typically, these solutions capture one copy of primary data and then present virtual copies of that data to the various tasks that need them. Again, the problem is that these solutions assume all this data is on-premises.
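To make the idea of a virtual copy concrete, here is a minimal sketch in Python, assuming a copy-on-write design. The class names `GoldenCopy` and `VirtualCopy` and the block contents are hypothetical illustrations, not the interface of any particular copy data management product.

```python
class GoldenCopy:
    """The single captured copy of primary data, as block number -> bytes."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        return self._blocks[block_no]


class VirtualCopy:
    """A writable view of the golden copy that stores only the blocks it changes."""

    def __init__(self, golden):
        self._golden = golden
        self._changed = {}  # block number -> bytes written through this view

    def read(self, block_no):
        # Serve this view's own version if it has one, otherwise the shared block.
        if block_no in self._changed:
            return self._changed[block_no]
        return self._golden.read(block_no)

    def write(self, block_no, data):
        # Writes never touch the golden copy; they land in the private overlay.
        self._changed[block_no] = data

    def extra_blocks(self):
        """Number of blocks this view consumes beyond the golden copy."""
        return len(self._changed)


# Hypothetical data: one golden copy shared by two use cases.
golden = GoldenCopy({0: b"customers", 1: b"orders", 2: b"invoices"})
test_dev = VirtualCopy(golden)
reporting = VirtualCopy(golden)

test_dev.write(1, b"orders-scrubbed")  # only test/dev sees this change
```

In this sketch the reporting view still reads the original block, and the test/dev view consumes extra space only for the one block it changed.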
The Problem with On-premises Copy Data
Copy data can be 10X to 20X the capacity of primary storage. Copy data management solutions address that problem and reduce copy data capacity requirements to 2X of primary data. But all of this data, virtual or not, still needs to be processed by something. That means the organization needs to stand up compute infrastructure to do the work of data protection, test/dev, reporting, and analytics.
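The capacity multipliers above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes a hypothetical 100 TB of primary data; only the 10X, 20X, and 2X multipliers come from the text.

```python
def copy_data_capacity(primary_tb, multiplier):
    """Capacity consumed by copy data, in TB, for a given multiple of primary."""
    return primary_tb * multiplier


PRIMARY_TB = 100  # hypothetical primary data size, chosen only for illustration

unmanaged_low = copy_data_capacity(PRIMARY_TB, 10)   # 10X of primary
unmanaged_high = copy_data_capacity(PRIMARY_TB, 20)  # 20X of primary
managed = copy_data_capacity(PRIMARY_TB, 2)          # 2X with copy data management

savings_low = unmanaged_low - managed
savings_high = unmanaged_high - managed

print(f"Unmanaged copy data: {unmanaged_low} to {unmanaged_high} TB")
print(f"Managed copy data:   {managed} TB")
print(f"Capacity avoided:    {savings_low} to {savings_high} TB")
```

Under these assumptions, managed copy data needs 200 TB instead of 1,000 to 2,000 TB. The calculation says nothing about compute, which is the cost the rest of this section is concerned with.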
The Cloud Advantage
What if that first copy of data was in the cloud? As with traditional copy data management solutions, virtual copies could easily be presented to the various use cases. But unlike traditional solutions, the cloud also has virtual compute that could be spun up to process those copies. The availability of pay-as-you-need compute could not only save the organization money, it also opens up new possibilities. For example, imagine testing new code and being able to spin up thousands of compute nodes to truly stress the code.
A Step Further
A cloud-hosted copy of data has one similarity with on-premises copies: there is still a 2X growth of data. What if primary storage were in the cloud and active data were cached in the data center? Then copies could be presented virtually, directly from primary storage in the cloud, while still leveraging cloud compute.
There is a concern over cloud latency. While the cache provides instant response, on a cache miss the time it takes for data to travel from the cloud to the data center will almost certainly impact performance. A potential solution is to use a metro point of presence provided by a regional solution provider. The basic design would then be a small cache on-premises and a much larger storage tier at the regional provider, which sits in front of the public cloud. The result is that in the event of an on-premises cache miss, the metro MSP provides the data in milliseconds.
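The three-tier read path described above can be sketched as follows. This is a simplified model, not a real storage implementation: the `Tier` class, the `read` function, and all latency figures are hypothetical values chosen only to show how a metro tier shortens a cache miss.

```python
class Tier:
    """One storage tier with a fixed, assumed access latency in milliseconds."""

    def __init__(self, name, latency_ms, data=None):
        self.name = name
        self.latency_ms = latency_ms
        self.data = dict(data or {})


def read(key, tiers):
    """Look up key from the fastest tier to the slowest.

    Returns (value, name of the tier that served it, total latency in ms).
    On a hit in a slower tier, the value is copied into every faster tier.
    """
    elapsed = 0.0
    for i, tier in enumerate(tiers):
        elapsed += tier.latency_ms
        if key in tier.data:
            value = tier.data[key]
            for faster in tiers[:i]:
                faster.data[key] = value  # warm the tiers closer to the user
            return value, tier.name, elapsed
    raise KeyError(key)


# Hypothetical latencies and contents, for illustration only.
on_prem = Tier("on-premises cache", 0.5)
metro = Tier("metro provider", 5.0, {"block-7": b"warm"})
cloud = Tier("public cloud", 60.0, {"block-7": b"warm", "block-9": b"cold"})
tiers = [on_prem, metro, cloud]

first = read("block-7", tiers)   # misses on-premises, served by the metro tier
second = read("block-7", tiers)  # now served from the on-premises cache
cold = read("block-9", tiers)    # falls all the way through to the public cloud
```

With these assumed numbers, the metro hit costs 5.5 ms where a trip to the public cloud costs 65.5 ms, which is the gap the metro point of presence is meant to close.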
Copy data management is a must-have for organizations of all sizes. But copy data is an area where the cloud might be a better option. The ability to make zero-cost virtual copies AND provide compute to process that data gives the cloud a distinct advantage.
Managing copy data in the cloud is also a way to reduce disaster recovery costs. The enterprise can use copy data as part of a disaster recovery as a service (DRaaS) solution. To learn more about how cloud copy data management can reduce disaster recovery costs, check out our on-demand webinar, “How to Cut Disaster Recovery Expenses – Improve Recovery Times.”