While every IT professional will admit that it is important, disaster recovery testing generally falls to the bottom of the IT to-do list, right after a root canal. But if testing is not a key component, the process of creating a disaster recovery plan (DRP) is a wasted effort. In fact, most recoveries that either fail or don’t meet the required objectives do so because of a lack of testing.
Taking the Pain Out of DR Testing
1 – Frequency
Ironically, frequency is the first key to less painful DR testing. The more often IT tests the DR plan, the better the personnel performing the recovery get at doing the job. Frequent testing, at least initially, means more failures and higher frustration, but that same frequency forces the organization to address problems or at least be aware of them.
Instead of testing once a year or even once a quarter, try to test once per week or at least once per month.
2 – Scope
The second key is to limit the scope, which frequent testing now makes practical. Depending on the frequency schedule, test only one application or scenario at a time. During a quarter, try to test nine different applications or scenarios. For example, in week one, test a failure of the MS-SQL server. The next week, test a failure of the MS-SQL storage system. It is the same environment, but the situations differ and the recoveries probably do too. After MS-SQL, move on to another environment and test various scenarios in it.
At some point during the year, a full-scale DR test should be performed, just to make sure all the tested parts come together.
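The rotation described above can be sketched as a simple schedule generator. This is a minimal illustration, not a prescribed tool: the scenario list, the "File services" environment, the start date, and the `build_schedule` helper are all hypothetical, and only the MS-SQL scenarios come from the example in the text.

```python
from datetime import date, timedelta
from itertools import cycle, islice

# Hypothetical scenario list. Only the MS-SQL entries come from the article;
# "File services" is an invented placeholder for "another environment".
SCENARIOS = [
    ("MS-SQL", "server failure"),
    ("MS-SQL", "storage system failure"),
    ("File services", "server failure"),
    ("File services", "storage system failure"),
]


def build_schedule(start, weeks, scenarios):
    """Return one (test_date, environment, failure) tuple per week.

    Scenarios are taken in order and repeated once the list is exhausted,
    so each mini-test covers exactly one scenario.
    """
    rotation = islice(cycle(scenarios), weeks)
    return [
        (start + timedelta(weeks=i), environment, failure)
        for i, (environment, failure) in enumerate(rotation)
    ]


# Assumed example: nine weekly mini-tests starting on an arbitrary Monday.
schedule = build_schedule(date(2024, 1, 1), 9, SCENARIOS)
for test_date, environment, failure in schedule:
    print(f"{test_date.isoformat()}  {environment}: {failure}")
```

Keeping the scenario list as plain data makes it easy to reorder or extend as new environments are added, and the annual full-scale test can simply be booked as a separate calendar entry.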
3 – Document and Report
The third key is to document and report on the results, even if the news is not good. Document the process and results of each mini-test. Make note of the time involved and how much data is lost. Report these results to stakeholders. The report should be relatively informal so it does not slow down the distribution of information. Make sure to note in the report that the results are based on a test and should be considered a best case.
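One lightweight way to capture each mini-test is a small record that holds the time involved and the data lost, and compares both against the objectives. The sketch below is only an illustration: the `MiniTestResult` class, its field names, the choice to express data loss as a span of time, and the sample numbers are all assumptions, not part of any particular DR product.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class MiniTestResult:
    """Hypothetical record for one mini-test; all field names are assumptions."""

    scenario: str
    recovery_time: timedelta         # time involved in the recovery
    data_loss: timedelta             # data lost, expressed as a span of time
    target_recovery_time: timedelta  # assumed recovery-time objective
    target_data_loss: timedelta      # assumed data-loss objective

    def met_objectives(self) -> bool:
        """True only if both the time and the data-loss objectives were met."""
        return (
            self.recovery_time <= self.target_recovery_time
            and self.data_loss <= self.target_data_loss
        )

    def summary(self) -> str:
        """One informal line suitable for a quick stakeholder report."""
        status = "met" if self.met_objectives() else "missed"
        return (
            f"{self.scenario}: recovered in {self.recovery_time}, "
            f"data loss {self.data_loss}, objectives {status} "
            f"(test result, treat as best case)"
        )


# Invented sample numbers, for illustration only.
result = MiniTestResult(
    scenario="MS-SQL server failure",
    recovery_time=timedelta(hours=3),
    data_loss=timedelta(minutes=30),
    target_recovery_time=timedelta(hours=2),
    target_data_loss=timedelta(hours=1),
)
print(result.summary())
```

A one-line summary per test keeps the report informal, while the stored records make it possible to compare results for the same scenario over time.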
Create non-disclosure agreements and then share test reports with trusted outside advisers (consultants and vendors). See if they can make suggestions for areas of improvement and costs to do so.
Not every improvement requires spending money; some of the best improvements are free. A change in backup order, server grouping or protected data scope can significantly improve recovery time. But, of course, sometimes IT should buy new hardware or software, or maybe leverage the cloud, to make a difference in both recovery times and costs.
4 – Improve
The final key is to set a goal of improving the process constantly, even if the current plan is meeting objectives. Recovering faster with less data loss will be a constant ask from the organization, and even if the organization is not asking, faster recovery is a competitive advantage. Some of this improvement will come from repetition, some from process improvement and some from additional investment in software or hardware. The key is to keep improving.
StorageSwiss Take
The goal of DR testing is to recover, regardless of the situation, in a known time frame. In most disaster situations, it’s the not knowing that drives employees crazy and customers away. Continuous testing provides that predictability. Ideally, the testing should create a muscle memory effect so that IT staff can recover without thinking, because in a real disaster their minds are probably on other matters.
To learn more about taking the pain out of DR testing, read “The Three Steps to Maximum Disaster Recovery Success”.