High Availability vs. Instant Recovery

Posted on February 26, 2018 by George Crump

All applications in a data center are critical to some degree, and while a few are mission-critical, most can be down for at least a few minutes without seriously impacting the organization. The problem is that, when it comes to providing availability, most organizations take an either-or approach. Either they use a High Availability (HA) product, like replication, for all their applications or, because of the cost of HA, they rely on backup alone, which exposes them to the risk of hours of downtime. But now there is an alternative that establishes a middle ground: instant recovery.

The Differences Between HA and Instant Recovery

High Availability solutions come in several forms, but short of deploying a full-scale application cluster, most organizations will leverage a replication solution. Replication also comes in several forms. It can be built into the storage system and replicate data as it changes from one storage system to another, typically on a per-volume basis. It can be loaded onto the hypervisor and replicate between storage systems within or between hypervisor clusters. It can be installed on the server or virtual machine running the application, and some applications have a replication function built in. Lastly, some backup applications are adding replication to their capabilities.

Generally speaking, replication works by making a real-time or near real-time copy of new or changed data onto another storage system, either in the data center, at another site owned by the organization, or in the cloud. It should be able to meet recovery time objectives (RTO) and recovery point objectives (RPO) of just a few minutes, which makes it ideal for mission-critical application recovery.

Replication requires a doubling of capacity, and while many replication solutions can maintain a point-in-time history of protected data, they are not designed to support long-term (or even medium-term) data retention. That is the role of backup and archive. Replication cannot replace backup and archive and, in most cases, it runs as a separate task outside of the backup process. It also uses a separate storage system, independent of backup storage.

Replication solutions have become much more affordable than they were in the past. They are also becoming more flexible, able to replicate data to the cloud, for example. This flexibility is especially apparent in hypervisor-based replication solutions. An option for some organizations is to use replication for all their major applications and to use backup only for long-term data retention.

Instant recovery, also known (depending on the vendor) as Live Recovery, Boot from Backup, or Recovery in Place, is integrated into the backup process, meaning it can be managed from the data protection solution and can leverage the same storage as the backup. The feature requires granular, sub-file backup capabilities, known as block-level incremental backup or changed block tracking. These granular backup techniques allow repeated backups of the same data to be executed throughout the day. While the backups cannot execute as data changes, a backup frequency of every 15 minutes is very possible.
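To make the idea of a block-level incremental backup concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular backup product works: the block size, the function names, and the approach of comparing per-block hashes are all hypothetical. Real changed block tracking is typically maintained by the hypervisor or a filter driver as writes occur, rather than by rehashing the whole volume.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size in bytes


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of a volume image."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(previous: list[str], data: bytes,
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block_index: block_bytes} for blocks that differ from the
    hashes recorded at the previous backup (new blocks count as changed)."""
    changed = {}
    for index, digest in enumerate(block_hashes(data, block_size)):
        if index >= len(previous) or previous[index] != digest:
            start = index * block_size
            changed[index] = data[start:start + block_size]
    return changed


# Example: a four-block volume where only block 2 changes between backups.
volume_v1 = b"A" * BLOCK_SIZE * 4
volume_v2 = (volume_v1[:2 * BLOCK_SIZE]
             + b"B" * BLOCK_SIZE
             + volume_v1[3 * BLOCK_SIZE:])

baseline = block_hashes(volume_v1)
delta = changed_blocks(baseline, volume_v2)
print(sorted(delta))  # indexes of the blocks this incremental pass would copy
```

Because only the changed blocks are copied, each pass stays small, which is what makes running a backup every 15 minutes practical.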

The second part of instant recovery is its ability to instantiate a volume on the backup storage device. If a primary storage system fails or data is corrupted, the backup software can create and mount a virtual volume of the data for the application. Assuming backups run every 15 minutes, the exposure window is up to 15 minutes of data loss. Typical recovery times should fall within that same window, since no data has to be copied from backup storage to primary storage, at least not right away.
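The virtual volume described above can be pictured as a read-only backup image with a writable layer on top. The following Python sketch is a simplified model of that idea only; the class name, its methods, and the in-memory dictionaries are hypothetical stand-ins and do not reflect any vendor's implementation.

```python
class InstantRecoveryVolume:
    """Hypothetical virtual volume mounted directly from backup storage.

    Reads fall through to the backup image; writes land in an overlay,
    so the backup copy itself is never modified.
    """

    def __init__(self, backup_blocks: dict[int, bytes]):
        self._backup = backup_blocks           # blocks from the last backup
        self._overlay: dict[int, bytes] = {}   # blocks written since mount

    def read(self, index: int) -> bytes:
        # Prefer data written after the mount, else serve the backup block.
        if index in self._overlay:
            return self._overlay[index]
        return self._backup[index]

    def write(self, index: int, data: bytes) -> None:
        self._overlay[index] = data

    def blocks_to_migrate(self) -> dict[int, bytes]:
        """Merged view to copy to production storage at a later time."""
        return {**self._backup, **self._overlay}


# Example: mount the last backup, let the application keep writing.
last_backup = {0: b"boot", 1: b"data-old"}
volume = InstantRecoveryVolume(last_backup)
volume.write(1, b"data-new")

print(volume.read(0), volume.read(1))
```

The application is usable as soon as the volume is mounted, because nothing is copied up front. The merged view is what eventually has to be moved back to a production storage system.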

While it will not be acceptable for every application in the environment to lose 15 minutes of data and potentially suffer 15 minutes of outage, the vast majority of the organization's applications should be able to tolerate it. If the organization can cut the number of applications that need faster recovery and less data loss to one or two, then the cost savings can be significant. Instant recovery comes with an application (backup) that the organization has to have anyway. It also leverages backup storage, which the organization also already has to have. And it uses the same data as the backup, so extra capacity is not consumed by HA copies.
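One way to reason about which applications fit each approach is to compare their tolerance for data loss and downtime against the backup interval. The Python sketch below assumes the 15-minute interval used in this article; the application names and their tolerances are made up for illustration, and the rule itself is a simplification rather than a sizing method.

```python
from dataclasses import dataclass

BACKUP_INTERVAL_MIN = 15  # backup frequency assumed in the article


@dataclass
class App:
    name: str
    max_data_loss_min: float  # tolerable data loss (RPO), in minutes
    max_downtime_min: float   # tolerable outage (RTO), in minutes


def protection_tier(app: App, interval_min: float = BACKUP_INTERVAL_MIN) -> str:
    """Pick instant recovery when the app tolerates one full backup interval
    of both data loss and downtime; otherwise fall back to replication."""
    if (app.max_data_loss_min >= interval_min
            and app.max_downtime_min >= interval_min):
        return "instant recovery"
    return "replication"


# Hypothetical applications and tolerances, for illustration only.
apps = [
    App("order-processing", max_data_loss_min=1, max_downtime_min=3),
    App("intranet-wiki", max_data_loss_min=60, max_downtime_min=120),
    App("reporting", max_data_loss_min=15, max_downtime_min=30),
]

tiers = {app.name: protection_tier(app) for app in apps}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

In this made-up inventory only one application needs replication, which is the outcome the cost argument above depends on.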

There are some downsides to instant recovery that the organization needs to keep in mind. First, since the backup storage may be counted on as primary storage at some point, it has to have decent performance and reasonable hardware redundancy. Second, at some point, the instantly recovered volume will need to be migrated to a volume on a production system. A plan has to be in place for when that migration should occur and what its performance or downtime impact will be. And third, there will be some downtime and some data loss. Expectation setting is key to using instant recovery.

StorageSwiss Take

Instant recovery may be the only rapid recovery solution that many organizations need, provided that 15-minute recovery is acceptable for all their applications. The cost savings of moving recovery expectations from three minutes to fifteen are significant. While instant recovery does require some additional capabilities from backup storage, those capabilities should not add significantly to the cost.

At the same time, HA is no longer for the elite few. Software-based replication makes implementing HA much more affordable. In fact, some organizations will find that by broadly implementing replication, they can decrease their backup investment while providing sub-five-minute RPO/RTO.

Most organizations will blend the two, using replication for mission-critical applications and instant recovery for the rest.

About George Crump

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.

