The majority of data center environments are now virtualized, most often with VMware as the primary hypervisor. These organizations have also virtualized mission-critical applications that need to be recovered quickly in the event of a disaster. Most IT professionals rely on snapshot replication to keep their DR facility ready in the event of a failure. While snapshot replication is a massive improvement over traditional backup, it may no longer meet the expectations of users.
The Challenges of Snapshot Replication
Most storage systems build their replication solutions on top of their snapshot capabilities. When a snapshot is taken, the blocks of data that changed since the previous snapshot job are replicated to the remote site. Therein lies the first problem: data changes are not sent in real time as they occur; a snapshot has to happen first. If the organization executes a snapshot every few minutes, the time delay is probably not a problem. But most organizations do not take snapshots every few minutes, or even every hour. In my experience, “aggressive” snapshot strategies execute snapshots every four hours or so, and a once- or twice-a-day schedule is far more common.
There is a reason for the delay between snapshot events: something has to happen before the snapshot is triggered. The application being snapshotted needs to be quiesced so that the snapshot captures a clean copy of its data; otherwise you risk a dirty snapshot, which can delay or even prevent recovery. My colleague Curtis Preston details the importance of application-aware snapshots in this blog.
The other problem with snapshot-based replication is that it is often point-to-point: it is designed to replicate data from point A to point B, and few solutions support a point B in the cloud. This monolithic approach to replication ignores the reality that many organizations are multi-site, with storage infrastructure in each of those sites. It also ignores the fact that virtually all organizations have access to public cloud storage and compute. This lack of recognition of the dispersed data center is ironic, since one of the key advantages of VMware is the workload portability it brings.
It might be time for organizations to rethink their storage infrastructure. Today, storage architectures are implemented on a per-site basis, and “connectivity” is point-to-point rather than a distributed fabric or mesh. Disaster recovery, whether by snapshot replication or backup, is a secondary process that must be managed and monitored independently of primary storage.
Distributed storage moves the organization away from the monolithic designs of the past. Nodes are implemented at each site, but the intelligence is distributed across sites. Data can be replicated multiple times within a site and to multiple sites automatically, which means VMs move seamlessly between sites with no special data preparation steps. Disaster recovery is integrated into the storage system: as data changes, it is distributed across a cluster that spans multiple locations.
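The placement policy behind that integration can be sketched in a few lines. This is a hypothetical round-robin placement function, not Hedvig's or any vendor's actual algorithm: each block's replicas are spread across distinct sites so that losing an entire site never removes every copy.

```python
from itertools import cycle
from typing import Dict, List

def place_replicas(blocks: List[int], sites: List[str],
                   copies: int) -> Dict[int, List[str]]:
    """Assign each block `copies` replicas on distinct sites, rotating
    the starting site so load spreads evenly across the cluster."""
    placement: Dict[int, List[str]] = {}
    ring = cycle(range(len(sites)))
    for block in blocks:
        start = next(ring)
        placement[block] = [sites[(start + i) % len(sites)]
                            for i in range(copies)]
    return placement

# Two copies of every block, spread across an on-prem pair and a cloud site.
layout = place_replicas(blocks=[0, 1, 2],
                        sites=["site-a", "site-b", "cloud"], copies=2)
```

Because replication happens at write time as a property of the storage layer, there is no separate replication job to schedule, and the DR copy is never waiting on the next snapshot window.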
In our on demand webinar, Storage Switzerland and Hedvig compare the various VMware replication techniques and then compare them to distributed storage. In the webinar we cover how to create a highly available infrastructure that provides not only disaster recovery but complete workload mobility. Register today and get an exclusive copy of our latest white paper “What to expect from your VMware Storage“.