For organizations needing to protect mission-critical systems, software-based replication is an ideal way to meet and potentially exceed expectations. In fact, replication can be implemented so cost-effectively that the scope of coverage can extend beyond mission-critical systems to cover business-critical systems as well. The goal of this design is to provide a cost-effective yet reliable way to implement replication.
Step 1 – Software Selection
Selecting the right replication software is a critical first step. Most applications today do a good job of covering the basics like granular data movement, failover management and write-order fidelity. What separates them is how well they support specific environments like VMware, cloud providers like Amazon, and how well they scale to support a large number of replication jobs.
Part of the software selection process is aligning the choices with the organization’s needs. For example, if the cloud is not in the organization’s future, then cloud support is not critical, and if there is only a need to replicate a handful of servers, then an application that can manage hundreds of replication jobs is overkill.
Step 2 – Hardware Matters
One of the advantages of a software-based replication tool is that it can support a wide variety of hardware. This means the secondary target does not need to be an exact duplicate of the primary storage system. But a word of warning: the secondary storage target does need to deliver a reasonable level of performance and reliability. This is the system the organization will count on in the event of a disaster. If its performance isn’t anywhere close to that of the primary storage system, users are likely to complain. And, of course, if this secondary storage system also crashes during the disaster, the organization will be completely down.
Step 3 – Consider Two Targets
Given the declining cost of mid-range storage systems it makes an increasing amount of sense to have two replication targets. Ideally, the primary storage system is replicating to a local on-premises storage system and to a remote storage system at another site or in the cloud.
The advantage of dual targets is that the on-premises target can be used for recovery from minor disasters like a storage system failure. If that situation occurs, administrators can simply point servers at the second storage system and be back up and running within minutes of where they left off. This secondary target, assuming it has the capability, can also be snapshotted and those snapped volumes presented to other applications for test-dev, reporting or backup.
Three Systems for the Price of One
While the cost of three systems might seem impossible to afford, it is important to remember that these systems do not need to match the capacity of the primary storage system. The mission-critical servers are a fraction of the total server population, and they are often among the smaller consumers of storage capacity.
Also remember that only the most active data is being replicated, not historical information. Often the capacity of what really has to be replicated is less than 10% of the total capacity the enterprise consumes. That means if the organization has 100TB of production data, it only needs a 10TB target system on-premises and a 10TB target system in the remote location. The remaining 90% is going to be protected by other processes within the data center.
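The sizing math above can be sketched as a quick back-of-the-envelope calculation. The 10% active-data ratio is the illustrative figure used here, not a universal constant; measure your own environment before buying hardware:

```python
# Back-of-the-envelope sizing for replication targets.
# Assumes the illustrative 10% "active data" ratio from the text.

def target_capacity_tb(total_production_tb: float, active_ratio: float = 0.10) -> float:
    """Capacity needed on EACH replication target (on-prem and remote)."""
    return total_production_tb * active_ratio

total = 100.0  # TB of production data
per_target = target_capacity_tb(total)
print(f"Each target needs ~{per_target:.0f}TB; "
      f"remaining {total - per_target:.0f}TB is protected by other processes")
# → Each target needs ~10TB; remaining 90TB is protected by other processes
```

With dual targets, the same 10TB figure applies to both the on-premises and the remote system, so the incremental hardware spend scales with the active data set, not with total enterprise capacity.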
Step 4 – Start Small
Considering these are mission-critical systems it is advised that the organization pay for initial implementation services. But be careful not to let the vendor just come in and take over. Instead, require that installation be done by the IT team under the oversight and guidance of the vendor or the value-added reseller. This lets the organization expand the scope of the installation on its own without having to bring the vendor back in.
Second, don’t start with all the mission-critical applications at once. In fact, don’t start with a mission-critical application in the first place. Start with an application that is fairly active but where a failure won’t be a resume-generating event. Then, as the IT team becomes more comfortable with the process, move on to additional servers of increasing criticality.
Step 5 – Test
Test, test, test. Once the replication software is implemented and bytes of data are magically appearing on another storage system, make sure the replicated copy is tested for accuracy. The best way to confirm this accuracy is to have a copy of the application try to access the data on the replicated storage system. IT should test a variety of failure scenarios: a failover where there was time for one last sync, and a failover where the interruption was sudden.
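One simple way to spot-check replica accuracy is to compare checksums of files on the primary and replicated volumes. This is only a minimal sketch; as noted above, the real test is having a copy of the application access the replicated data. The mount-point paths in the example are hypothetical:

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large files don't exhaust memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def compare_trees(primary: Path, replica: Path) -> list[str]:
    """Return relative paths of files that are missing or differ on the replica."""
    mismatches = []
    for src in primary.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(primary)
        dst = replica / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            mismatches.append(str(rel))
    return sorted(mismatches)

# Example usage (hypothetical mount points):
# bad = compare_trees(Path("/mnt/primary"), Path("/mnt/replica"))
# print("Replica OK" if not bad else f"Mismatched files: {bad}")
```

Run a check like this only after the replication software reports the volumes in sync; comparing against a mid-write replica will report false mismatches.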
Another key testing point is to make sure failback works. Test failing back with some of the data remaining on the primary system, and test failing back with no data (simulating a data center loss). Make sure the time it takes to perform both failover and failback is understood and well-documented.
Replication, especially software-based replication, is a valuable tool that enables organizations to cost-effectively meet the recovery point and recovery time objectives of their mission-critical applications. It also opens up new possibilities, like leveraging a second system on-site and using the cloud for disaster recovery.
As we discuss in our webinar, “Hitting Your Data Protection Sweet Spot”, replication is a key part in creating a balanced data protection strategy.