Disaster recovery changes with each passing year. Most of those changes relate to ever-faster recoveries and to protection from new threats like ransomware. In 2020, the pressure to recover faster remains critical, but there are also new requirements that IT needs to work into its disaster recovery plans.
A new expectation of disaster recovery is that the organization not only has a plan and tests it, but can also prove that the plan works and can have the organization back up and running in a reasonable time. Demonstrating the ability to recover internally is not unusual, but organizations now also need to prove to external regulatory bodies that they can recover, as part of compliance with regulations like GDPR and CCPA. Additionally, just working is not enough; the DR plan has to work within specific timeframes. The challenge is that these timeframes are subjective, so IT has no set target that indicates how quickly it needs to recover. While IT doesn’t need to assume the worst-case scenario of an “up in minutes” recovery, recoveries that take days or weeks are clearly not acceptable.
To “prove it”, IT needs to test its recovery capabilities continually. Testing DR only once or twice a year is no longer acceptable. The challenge with annual or semiannual tests is that IT often returns with a punch list of components that didn’t work, and it then has to work out solutions to those failures before the next DR test, which may be six months or more away.
Given the new regulations and even internal expectations, organizations need DR testing to be flawless, every time. IT needs to shift disaster recovery testing into a highly automated process, where sections of the plan are tested monthly or even every two weeks. Workflows are required to make sure that applications and operations are brought up in the right order every time, so that the personnel testing recovery can execute the DR process with a single click. Additionally, testing needs to be performed within a virtual lab or in the cloud, so that IT doesn’t need to go offsite to complete the test.
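As a rough illustration of what "the right order every time" means in a workflow, the sketch below derives a start-up order from a dependency map and runs a test that stops at the first failure. The service names, the `DEPENDENCIES` map, and the `start_service` callback are hypothetical; a real DR product would supply its own workflow engine. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each service lists the services
# that must already be running before it can start.
DEPENDENCIES = {
    "storage": set(),
    "database": {"storage"},
    "auth": {"database"},
    "app-server": {"database", "auth"},
    "web-frontend": {"app-server"},
}


def recovery_order(dependencies):
    """Return a start-up order in which every service follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def run_dr_test(dependencies, start_service):
    """Start services in order; stop at the first failure and report it.

    start_service is a caller-supplied function (an assumption here)
    that returns True when the named service came up successfully.
    """
    started = []
    for name in recovery_order(dependencies):
        if not start_service(name):
            return {"passed": False, "failed_at": name, "started": started}
        started.append(name)
    return {"passed": True, "failed_at": None, "started": started}


if __name__ == "__main__":
    # Simulated test run in which every service starts cleanly.
    print(run_dr_test(DEPENDENCIES, lambda name: True))
```

Because the order is computed from the dependency map rather than written down by hand, a scheduled monthly or biweekly run uses the same sequence each time, and the result record shows exactly where a test stopped.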
Another new expectation of DR is for the organization to know what data the DR process is storing. Regulations like GDPR and CCPA grant users the right to ask organizations to delete their data, or at least to make sure it doesn’t come back into production. Data stored within the backup process is not immune to these regulations. Consequently, organizations need visibility into the data that the backup applications are storing. Today this may include peering into files so that the organization knows which data refers to a specific user account or may contain personally identifiable information (PII). Providing this kind of information means the backup solution needs a rich metadata index that can give context-level information about the data it is protecting.
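To make the idea of a metadata index more concrete, here is a minimal in-memory sketch. The `BackupEntry` fields and the `MetadataIndex` class are invented for illustration and do not reflect any particular backup product; a real index would be persistent and populated by content scanning.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupEntry:
    backup_id: str        # which backup job holds the file
    path: str             # file path inside the backup
    user_ids: frozenset   # user accounts the file refers to
    contains_pii: bool    # flagged by a (hypothetical) content scan


class MetadataIndex:
    """Toy index answering: which backed-up files refer to a given user?"""

    def __init__(self):
        self._by_user = defaultdict(list)

    def add(self, entry):
        for uid in entry.user_ids:
            self._by_user[uid].append(entry)

    def find_user(self, user_id):
        return list(self._by_user.get(user_id, []))

    def restore_exclusions(self, erased_users):
        """Paths that should not come back into production on restore."""
        return sorted({e.path for uid in erased_users for e in self.find_user(uid)})


if __name__ == "__main__":
    index = MetadataIndex()
    index.add(BackupEntry("job-1", "/crm/alice.json", frozenset({"alice"}), True))
    index.add(BackupEntry("job-1", "/crm/shared.csv", frozenset({"alice", "bob"}), True))
    print(index.restore_exclusions({"alice"}))
```

With an index like this, a deletion request becomes a lookup followed by an exclusion list for future restores, rather than a manual search through backup sets.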
Another fundamental expectation, although not new, is the ability to provide all of these capabilities while lowering costs. IT is under more pressure than ever to reduce expenses, and disaster recovery is not immune from this expectation. IT needs to take two steps to lower disaster recovery costs. First, it needs to look into archiving older backup datasets by moving old data either to tape or to the cloud. In a disaster, most data, as much as 80%, is not needed, at least not immediately. The primary focus is to restore the latest copy of data, not old data.
Second, IT also needs to consider outsourcing the DR site to the cloud, also known as Disaster Recovery as a Service (DRaaS). With DRaaS, the organization can leverage cloud resources for tests and actual DR situations without paying for those resources when they are not in use. Backup archiving also helps to lower the cost of DR in the cloud. The only data the organization needs in the cloud is the data required during disaster recovery. Older data can be restored later if required.
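The split between data needed for DR and data that can be archived could be sketched as follows. The 30-day `keep_days` window, the record layout, and the function names are assumptions made for the example, not a recommendation; the only rule taken from the text is that the latest copy of each application's data stays available for recovery.

```python
from datetime import date, timedelta


def split_for_dr(backups, today, keep_days=30):
    """Split backups into (dr_set, archive_set).

    The newest backup of each application always stays in the DR set,
    even when it is older than the cutoff.
    """
    cutoff = today - timedelta(days=keep_days)
    latest = {}
    for b in backups:
        if b["app"] not in latest or b["taken"] > latest[b["app"]]["taken"]:
            latest[b["app"]] = b

    dr_set, archive_set = [], []
    for b in backups:
        if latest[b["app"]] is b or b["taken"] >= cutoff:
            dr_set.append(b)
        else:
            archive_set.append(b)
    return dr_set, archive_set


def archived_fraction(dr_set, archive_set):
    """Share of total backup capacity that moves to tape or cloud archive."""
    total = sum(b["size_gb"] for b in dr_set + archive_set)
    return sum(b["size_gb"] for b in archive_set) / total if total else 0.0


if __name__ == "__main__":
    backups = [
        {"app": "crm", "taken": date(2020, 1, 30), "size_gb": 100},
        {"app": "crm", "taken": date(2019, 10, 1), "size_gb": 400},
    ]
    dr, archive = split_for_dr(backups, today=date(2020, 1, 31))
    print(archived_fraction(dr, archive))
```

Only the DR set needs to live in the cloud DR tier; the archive set can sit on cheaper media and be restored later if required.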
Still, Recover Faster
In addition to these newer requirements, IT must continue to make sure that the recovery effort itself meets the expectations of demanding users. Fast disaster recovery requires rapid and frequent backups to reduce the recovery point objective (RPO). Achieving the recovery time objective (RTO) means having the data on a backup device from which IT can restore it quickly. It also means having recovery servers available, at least for mission-critical applications. Here, another advantage of the cloud is that computing resources are always available. IT must also factor cloud hypervisor conversion time into the RTO equation, though. In some cases, a hybrid DR plan may work best: a few very mission-critical applications are recovered at a DR site with dedicated standby servers, while the DR plan uses the cloud for servers with an RTO of four hours or more.
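The hybrid decision can be expressed as a simple placement rule. The sketch below is illustrative only: the `App` fields and the per-application time estimates are hypothetical, and adding restore time to conversion time is a simplifying assumption about how cloud recovery time accumulates. The four-hour threshold is the one mentioned above.

```python
from dataclasses import dataclass

# From the text: the cloud handles servers with an RTO of four hours or more.
CLOUD_RTO_THRESHOLD_HOURS = 4.0


@dataclass
class App:
    name: str
    rto_hours: float          # recovery time objective
    restore_hours: float      # estimated time to restore data (assumed input)
    conversion_hours: float   # estimated hypervisor conversion time (assumed input)


def estimated_cloud_recovery_hours(app):
    """Simplified model: data restore plus hypervisor conversion."""
    return app.restore_hours + app.conversion_hours


def place(app):
    """Return 'standby' for dedicated DR-site servers, or 'cloud'."""
    if app.rto_hours < CLOUD_RTO_THRESHOLD_HOURS:
        return "standby"
    if estimated_cloud_recovery_hours(app) > app.rto_hours:
        # Conversion time pushes cloud recovery past the objective.
        return "standby"
    return "cloud"


if __name__ == "__main__":
    for app in (App("erp", 1, 0.5, 0.25), App("wiki", 8, 3, 2), App("files", 4, 3, 2)):
        print(app.name, place(app))
```

The second check is where conversion time matters: an application whose RTO nominally qualifies for the cloud may still need a standby server once conversion is counted.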
Disaster recovery planning is a mix of time-tested best practices and new approaches to meet new requirements. In our next blog, we’ll focus on the time-tested best practices before delving into new techniques. Storage Switzerland will be at Commvault Go in Denver, offering its Disaster Recovery workshop at no additional charge to attendees. For more information, go to https://www.commvault.com/go