2015 will be THE year of copy data management. Multiple vendors will bring solutions to the market. Many of these solutions will leverage snapshot technology in one form or another to reduce the capacity requirements of the secondary data copies that drive data protection, business analytics, and test/dev operations. But there is another key resource that needs to be saved: time. Copy data solutions need to provide a high level of orchestration and analysis so that system administrators can work more efficiently and with less chance for error.
What is Copy Data?
As Storage Switzerland discussed in its article “What is Copy Data”, copy data solutions reduce the capacity required for secondary data sets. Many of them, as stated above, leverage a form of snapshot technology to create these copies, but that secondary copy is made on an independent system, not on the production system. Snapshot tools tend to provide a management layer that prescribes the number of snapshots that can be taken and the location in which they are stored. But snapshots are just the beginning, and many of the first wave of copy data solutions stop there. The second wave of copy data solutions will do more than reduce the capacity requirements of the secondary data set; they will improve its operational value by providing orchestration and analysis of those copies.
In this article Storage Switzerland will explain the importance of copy data orchestration and how it can save IT its most valuable resource: time.
Orchestration is More Than Scheduling
Copy data solutions without an orchestration aspect require the storage administrator to be involved in every aspect of copy data management. Orchestration is more than just scheduling, which a few copy data solutions already provide. Scheduling covers the timing and frequency of data capture, but a fully automated copy data solution should provide more granular control over where the snapshot is taken, how long it is kept, and how and where it is leveraged. These controls allow the storage administrator to meet granular retention service levels for various types of data across multiple applications.
While today’s copy data solutions use space-efficient snapshot technology, or allow the administrator to leverage the snapshot technology they already have from their storage vendor, those snapshots still consume capacity eventually. The longer the snapshots are in place, the greater their capacity impact. Without direct control over snapshot expiration, as well as the ability to vary that expiration time by application or location, the storage administrator has to constantly monitor snapshot utilization and make “best guesses” as to when to delete the various snapshot instances. Manually integrating copy data retention with an organization’s retention service levels is a challenging proposition with a high chance for error.
Automation of expiration, a basic first step in copy data orchestration, removes the mundane task of capacity management from the storage administrator’s packed to-do list. The administrator should be able to, at point of creation, not only schedule snapshot frequency and the expiration of that copy, but also the expiration of any additional copies of the snapshot, such as a vault or mirror made for disaster recovery or data distribution purposes. Doing so not only frees up the storage administrator’s time, but also reduces the chance for error and allows for much easier integration with the organization’s overall service levels.
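As a rough illustration of the policy described above, the sketch below models a retention policy with separate expiration windows for the local snapshot and its DR replica. All names here are hypothetical; a real copy data product would expose its own policy objects:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical policy object; field names are illustrative, not a vendor API.
@dataclass
class SnapshotPolicy:
    frequency: timedelta          # how often a snapshot is captured
    local_retention: timedelta    # expire the local snapshot after this long
    replica_retention: timedelta  # expire the DR vault/mirror copy after this long

def expired_snapshots(snapshots, policy, now):
    """Return the ids of snapshots whose age exceeds the local retention window."""
    return [sid for sid, taken_at in snapshots.items()
            if now - taken_at > policy.local_retention]

# Example: capture every four hours, keep locally for 2 days, at DR for 30 days.
policy = SnapshotPolicy(frequency=timedelta(hours=4),
                        local_retention=timedelta(days=2),
                        replica_retention=timedelta(days=30))
now = datetime(2015, 1, 10, 12, 0)
snapshots = {"snap-001": datetime(2015, 1, 7, 12, 0),   # 3 days old -> expired
             "snap-002": datetime(2015, 1, 10, 8, 0)}   # 4 hours old -> kept
print(expired_snapshots(snapshots, policy, now))  # -> ['snap-001']
```

With a policy like this attached at point of creation, the engine, not the administrator, decides when each copy is deleted, which is what keeps retention aligned with service levels.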
Orchestration of Copy Utilization
The next step in copy data orchestration goes beyond just making the basic copy. In this step the copy data engine should provide control and automation over how that copy is used. One example of where copy data can be valuable is for the test/dev environment. A copy data solution should allow the test/dev environment to work on a near-production copy of the data.
The production application and its data can easily be copied by a copy data solution, leveraging snapshots, every four hours for protection, as an example. It would make sense for the test/dev team to want their environment refreshed with this four-hour copy of the data. But think of the work that would be required of the storage administrator.
Every four hours the storage administrator would have to create a view into this protected copy of data and gracefully move that view into position for the test and development team. The situation is even worse in a disaster recovery test, since servers or virtual machines need to be started in an isolated network environment, applications started, verification code executed, and reports of successful instantiation of the application sent.
The Power of Copy Data Orchestration
If the use of copy data can be controlled as easily as its creation, then the time savings can be significant and the value of the copy data solution increases. Using the above example, a copy data solution with automation could automatically create a view into the four-hour copy of data and present it to the test/dev environment. It could also run various commands to make sure the test/dev environment is properly refreshed to “see” the latest data.
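The refresh workflow just described can be sketched as a small ordered pipeline. The step functions here stand in for vendor-specific operations (snapshot clone, export, post-refresh scripts) and are assumptions, not a real product API:

```python
# Illustrative test/dev refresh workflow: clone the protected copy,
# present it to the test/dev hosts, then run post-refresh commands.
def refresh_test_dev(snapshot_id, run_step):
    steps = [
        ("clone",   f"create a writable view of {snapshot_id}"),
        ("present", "export the view to the test/dev hosts"),
        ("refresh", "run post-refresh commands so applications see the new data"),
    ]
    for name, description in steps:
        run_step(name, description)   # delegate to the copy data engine
    return [name for name, _ in steps]

log = []
order = refresh_test_dev("snap-0400", lambda name, desc: log.append(name))
print(order)  # -> ['clone', 'present', 'refresh']
```

The point of the sketch is the sequencing: once the three steps are codified, the engine can repeat them every four hours without the storage administrator touching the environment.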
The power of orchestration is nowhere more evident than when used for disaster recovery testing. While virtualization and replication have provided organizations with better DR preparation, one of the basic challenges of disaster recovery testing remains: the time it takes to actually verify the DR environment. A copy data management solution with proper orchestration could allow an organization to test their FULL DR readiness on an almost daily basis.
Orchestration allows the storage administrator to ensure that data is “copied” to the DR site per the organization’s service levels. It then allows the organization to create a view into that DR copy and automatically start virtual machines, start their applications, and run verification code. The verification code could issue a report back to the IT team. This capability answers the most common question about data protection and disaster recovery: “will it work?”
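A minimal sketch of such a verification report, assuming each DR check can be expressed as a pass/fail function (the check names are illustrative, not part of any specific product):

```python
# Hypothetical DR test harness: run each named check and roll the
# results up into a single "will it work?" answer.
def run_dr_test(checks):
    results = {name: check() for name, check in checks.items()}
    results["dr_ready"] = all(results.values())  # overall readiness verdict
    return results

checks = {
    "vms_started_in_isolated_network": lambda: True,
    "applications_started":            lambda: True,
    "verification_code_passed":        lambda: True,
}
report = run_dr_test(checks)
print(report["dr_ready"])  # -> True
```

Because the harness is automated, the same report can be generated on an almost daily basis rather than once or twice a year.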
Many large enterprises, and especially service providers, are faced with the challenge of having to manage too many consoles or software GUIs. The second wave of copy data solutions needs to offer a RESTful API. This type of interface allows large enterprises and service providers to integrate the copy data solution into existing operational consoles.
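To show the integration pattern, the sketch below builds an HTTP request to a hypothetical snapshot endpoint using only the Python standard library. The base URL, resource path, and payload fields are all assumptions; a real product would document its own REST resources:

```python
import json
import urllib.request

# Hypothetical REST call: ask a copy data engine to snapshot a volume
# with a given retention, from any existing operational console or script.
def build_snapshot_request(base_url, volume, retention_hours):
    body = json.dumps({"volume": volume,
                       "retention_hours": retention_hours}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/snapshots",   # illustrative endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_snapshot_request("https://copydata.example.com", "prod-db", 48)
print(req.get_method(), req.full_url)
# -> POST https://copydata.example.com/api/v1/snapshots
```

The request is built but not sent here; in practice the console would pass it to `urllib.request.urlopen` (or an equivalent HTTP client) and act on the response.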
Improving the capacity efficiency of secondary data copies is only the first step for copy data solutions. The next steps are to orchestrate and then leverage this data. Companies like Catalogic are providing these abilities in their copy data solutions today, so that data centers can save not only the capital expenditure of secondary storage capacity, but also the operational expenditure of storage administration time. Along with this operational savings comes a reduction in errors, more rapid deployment of the secondary copy, and greater confidence in the organization’s ability to recover from a disaster.
Sponsored by Catalogic Software