What’s The Best Way To Rapidly Recover Data?

Posted on December 13, 2017 by George Crump

There are more methods to recover data than ever. It used to be that recovery meant loading a tape drive, scanning the whole tape to find the job that had the needed data, extracting that data from the job and finally restoring it. Today, organizations can choose from snapshots, replication, recovery in-place and changed block recovery. Which of these methods is best and what are the downsides of each?

Balancing Recovery Expectations

In the end, assuming a successful protection process, recovery comes down to balancing expectations against costs. Setting and maintaining service level objectives (SLOs) is the key to understanding those expectations. It is essential for users to understand that while everyone may want instant recovery and forever retention, the organization likely can’t afford to deliver that level of service to all parties.

Understanding Fast Recovery Technologies

Generally speaking, the faster a solution promises recovery, the more expensive it is and the more exposed it is to damage. Snapshots can create an instant copy of a production application. They don’t consume additional storage space until changes start to occur to the active copy and they don’t need to transfer data to a remote system. But if the storage system fails, the snapshots typically fail with it.
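To make the space behavior concrete, here is a minimal sketch of a copy-on-write snapshot, which is one common way snapshots are implemented. The `Volume` class and its method names are hypothetical and written only for illustration; real storage systems work at a much lower level and may use other techniques such as redirect-on-write.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        # Each snapshot maps block index -> the data that block held
        # when the snapshot was taken (stored only once it is overwritten).
        self.snapshots = []

    def snapshot(self):
        # Nothing is copied at creation time, so the snapshot is effectively instant.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, preserve the old block in every snapshot
        # that has not already saved its own copy of it.
        for snap in self.snapshots:
            snap.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id):
        # Unchanged blocks are read from the active volume itself.
        snap = self.snapshots[snap_id]
        return [snap.get(i, block) for i, block in enumerate(self.blocks)]

    def snapshot_space(self, snap_id):
        # Number of blocks the snapshot has had to preserve so far.
        return len(self.snapshots[snap_id])


vol = Volume(["a", "b", "c"])
snap_id = vol.snapshot()
space_before = vol.snapshot_space(snap_id)
vol.write(1, "B")
space_after = vol.snapshot_space(snap_id)
point_in_time = vol.read_snapshot(snap_id)
```

Note that `read_snapshot` depends on the live `blocks` list for everything that has not changed, which is why the snapshot is lost if the storage system holding the active copy fails.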

Replication solves the storage system failure problem. The replication process captures every change and sends it to a second storage system, which makes very rapid recovery possible. The recovery point objective (RPO) is very low because changes are sent to the second system as they happen, and the recovery time objective (RTO) is low because data is stored in an active state on a live file system.

In the event of a storage hardware failure, the application only needs to redirect to the standby system. But if the failure is software-based, such as data corruption, replication will copy that corruption to the standby almost as soon as it occurs on the primary system.
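The difference between the two failure types can be shown with a small sketch. The `ReplicatedPair` class below is a hypothetical toy model of synchronous replication, not any product's API; it assumes every write is applied to the standby at the same moment it lands on the primary.

```python
class ReplicatedPair:
    """Toy model of a primary and a standby kept in sync (illustrative only)."""

    def __init__(self):
        self.primary = {}
        self.standby = {}

    def write(self, key, value):
        # Every change is sent to the standby as it happens,
        # whether the change is valid or not.
        self.primary[key] = value
        self.standby[key] = value

    def fail_over(self):
        # After a hardware failure, the application redirects to the standby.
        return self.standby


pair = ReplicatedPair()
pair.write("orders.db", "good data")

# Hardware failure: the standby holds the latest good data.
after_hardware_failure = dict(pair.fail_over())

# Software failure: a bug writes corrupt data, and replication copies it.
pair.write("orders.db", "corrupt data")
after_corruption = dict(pair.fail_over())
```

In this model the standby protects against losing the primary but never holds an older, clean version of the data, which is the gap that point-in-time methods are meant to fill.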

Instant recovery solutions, also called recovery in-place, back up the production application on a schedule, typically every one to four hours, which still delivers a reasonable RPO. They can also mount a working image of the protected system directly from the backup. The problem with this method is that the performance of the system storing the backup is now critical. In addition, the data eventually has to be transferred back to the production system, which likely means an outage when that happens.

Changed block recovery is the last of the rapid recovery technologies and also the least common. It is essentially the inverse of a changed block backup in that it restores only the data required to move the application back to a specific point in time. A changed block backup updates the backup copy with new or changed data. A changed block recovery replaces new or changed data on the production system with old data, moving it back to the desired point in time. Assuming, as with a changed block backup, that most of the data has not changed, only a small part of the data needs to be transferred.
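A short sketch of the idea follows. It assumes the production volume and the point-in-time backup are equal-length lists of blocks and finds the differences by direct comparison; both function names are hypothetical, and a real product would more likely rely on change tracking than on comparing every block.

```python
def changed_blocks(current, target):
    """Return the indexes of blocks that differ between two equal-length volumes."""
    return [i for i, (c, t) in enumerate(zip(current, target)) if c != t]


def changed_block_recover(production, backup):
    """Roll production back to the backup's point in time.

    Only blocks that differ are copied. Returns the number of blocks transferred.
    """
    indexes = changed_blocks(production, backup)
    for i in indexes:
        production[i] = backup[i]
    return len(indexes)


backup = ["a", "b", "c", "d", "e"]          # state at the desired point in time
production = ["a", "b", "X", "d", "Y"]      # two blocks changed since then

transferred = changed_block_recover(production, backup)
```

Here only two of the five blocks are transferred, yet production ends up identical to the point-in-time copy. The sketch also shows the dependency: the three unchanged blocks are never rewritten, so they must still be intact on the production system.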

The good news is that the data is being recovered to a system designed to be production class, and the transfer itself should complete very quickly. The bad news is that, first, data still has to be transferred and, second, most of the data on the storage system must be intact. Given that most failures today are the result of software bugs or user error, which usually leave most of the data intact, this recovery method has a lot of potential.

When to Use What?

The pros and cons of each of these rapid recovery solutions explain why an organization should use multiple methods. Certainly, in preparation for the worst case, a separate, disconnected backup copy is critical. Beyond that, some form of rapid recovery is needed. For most organizations, a combination of replication and instant (in-place) recovery should meet all of their SLOs while staying within budget. The key is to make sure the chosen solutions complement each other and that the chosen hardware will allow those solutions to live up to their potential.

About George Crump

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.

