Verifying that you’re ready for a disaster is difficult but not impossible. The three elements of your infrastructure that must be present for a successful recovery from a disaster are compute, network, and storage. Let’s take a look at how you prepare each of these elements for disaster.
Making sure you have adequate computing resources in a disaster is significantly easier today than it was in days past, thanks to the invention of server virtualization and cloud computing. In the old days, you had to provide your own servers during a disaster or contract with another company to provide them. Matching DR server hardware to production server hardware was a constantly losing battle, because DR hardware had to be updated every time production hardware was updated. Now that both sides of the equation use server virtualization, updating the DR configuration costs nothing but the time spent using the configuration tool to update the VM configurations. There are also tools that can completely automate that process.
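As a rough illustration of what automating that update can look like, the sketch below compares production VM definitions against their DR counterparts and reports any drift. The dictionary layout and the field names (vcpus, memory_gb) are assumptions made for this example and do not come from any particular virtualization product. In practice the inputs would be exported from whatever configuration tool you use.

```python
def find_config_drift(prod, dr):
    """Return the VMs whose DR definition differs from production.

    prod and dr map a VM name to a dict of settings. Both the layout
    and the setting names are hypothetical, chosen for illustration.
    """
    drift = {}
    for vm, spec in prod.items():
        dr_spec = dr.get(vm)
        if dr_spec is None:
            # The VM exists in production but was never defined at the DR site.
            drift[vm] = "missing from DR"
            continue
        # Record (production value, DR value) for every setting that differs.
        diffs = {
            key: (value, dr_spec.get(key))
            for key, value in spec.items()
            if dr_spec.get(key) != value
        }
        if diffs:
            drift[vm] = diffs
    return drift


if __name__ == "__main__":
    production = {
        "web01": {"vcpus": 4, "memory_gb": 16},
        "db01": {"vcpus": 8, "memory_gb": 64},
    }
    dr_site = {"web01": {"vcpus": 4, "memory_gb": 8}}
    for name, problem in find_config_drift(production, dr_site).items():
        print(name, problem)
```

A check like this could run every time production changes, so the DR definitions never fall behind unnoticed.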
Similarly, making sure you have the appropriate network infrastructure for disaster is a lot easier than it used to be. Most infrastructure providers have much more bandwidth available than a typical DR customer would require in order to perform their job. Therefore, the hardest part of making the network infrastructure ready for disaster is setting up the DNS and VPN configurations so that the DR network can behave as if it is part of the local data center. This is not a simple process, but the IT administrator can automate it.
Preparing the storage infrastructure and the data that resides on it is also significantly easier than it used to be, thanks to the advent of DR Ready storage. It is now much simpler to define and make ready storage of equivalent capacity and performance. It’s also very simple to continuously update the data on that storage. With modern storage at the right level of abstraction, automation is significantly easier.
Trust but verify
Just because it is possible to define and make ready all of the elements of the infrastructure in advance, that doesn’t mean it will actually be ready in the case of a disaster. The only way to verify that you’re ready for disaster is to test a recovery on a regular basis. Unfortunately, most people test only a small portion of their infrastructure when doing a DR test – if they test anything. They recover a single application or database. In fact, many people doing a recovery test perform only a data restore or verification. They do not actually restore an entire application.
It’s important to understand that modern applications usually use a variety of resources residing on several interdependent systems. Where historically you could restore a single database and know the application using the database was on the same system, this is no longer the case. So proper DR testing must first acknowledge many systems are related and they must be recovered in groups. These groups are referred to as consistency groups because the data between different VMs must be from a single point in time or it is not consistent. You cannot recover two different databases to two different points in time and then expect them to work together without some type of referential integrity issue.
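To make the single-point-in-time requirement concrete, here is a minimal sketch that picks the most recent recovery point shared by every VM in a consistency group. It assumes that each VM has a list of snapshot timestamps and that a group-consistent snapshot carries an identical timestamp on every member. Both are simplifications for illustration, not a description of how any specific storage product tracks snapshots.

```python
from datetime import datetime


def latest_common_recovery_point(snapshots):
    """Return the newest timestamp present for every VM in the group.

    snapshots maps a VM name to an iterable of datetime objects
    (a hypothetical layout). Returns None when the VMs share no
    point in time, meaning no consistent recovery is possible.
    """
    common = None
    for times in snapshots.values():
        current = set(times)
        # Keep only the timestamps that every VM seen so far has.
        common = current if common is None else common & current
    return max(common) if common else None


if __name__ == "__main__":
    t1 = datetime(2024, 1, 1, 1, 0)
    t2 = datetime(2024, 1, 1, 2, 0)
    t3 = datetime(2024, 1, 1, 3, 0)
    group = {"orders-db": [t1, t2, t3], "billing-db": [t1, t2], "app": [t2, t3]}
    print(latest_common_recovery_point(group))
```

In this example the newest snapshot of orders-db is t3, but billing-db has no snapshot at t3, so the group as a whole can only be recovered to t2.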
This is why DR Ready storage needs to be able to understand the data being stored, and understand the concept of consistency groups so that all related data can be restored to the same point in time. Any DR tests should be restoring consistency groups, not simply a single database, application, or file system. IT personnel must be aware of this and trained on this by performing frequent DR tests. Being ready for disaster is as much about preparing your personnel and automation process as it is about preparing your infrastructure. If the first time your personnel are using your DR infrastructure is during a disaster, it’s going to be a disaster.
There is no reason a modern company should suffer a major outage during a disaster. But the reality is far from ideal. Stories abound of companies that do not survive major outages. Don’t let that be your company. Contract cloud services that can be made available in an instant when disaster strikes, and test them up front. The technology is there and the cost is reasonable. Just make sure to verify your trust in these services, especially since automation and data awareness make verification much easier and less expensive than it used to be.
Sponsored by Tintri