A DRaaS solution can use several methods to get data to the provider, and each method directly impacts RPO and RTO. IT needs to select the method carefully to meet the organization’s objectives and budget.
The primary reason for the rapid adoption of Disaster Recovery as a Service (DRaaS) is that it solves a problem that has been nagging at IT for as long as there has been an IT. But not all DRaaS solutions are created equal, and how your DRaaS solution gets data to the cloud will directly impact how quickly you can recover and how much data will be lost in the process.
As a quick refresher, most DRaaS solutions operate by copying data from the local data center to either a public or private cloud provider. Once that data is stored in the cloud, it sits patiently, waiting for disaster to strike. When it does, the cloud provider can position the data so you can start your applications as virtual machines in its cloud.
The advantage is that, assuming you can resolve the networking issues, you have a DR site with available compute and storage that you don’t have to pay for until you need it. Other than testing your plan, and assuming your data center is not in a war zone, you may only need to declare a disaster once or twice in your entire career. You do have to pay for the storage you consume as part of the prep work, but that cost pales in comparison to a fully equipped second data center.
Preparation via Cloud Backup
Meeting strict recovery point objectives (RPO) and recovery time objectives (RTO) is possible with a DRaaS solution, but you have to know how the cloud is being prepped. In the classic backup-to-cloud solution, prep is done as part of the backup process, which could mean that data is only sent to the cloud once per night. Some cloud backup solutions have fine-tuned their offerings to back up more frequently, but backup is seldom continuous. The frequency of backup impacts RPO: the less often data is protected, the greater the RPO window.
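To make the link between backup frequency and RPO concrete, here is a rough sketch. It assumes the worst case is a disaster that strikes just before the next backup finishes landing in the cloud; the function name and the schedules are hypothetical and only illustrate the arithmetic.

```python
def worst_case_rpo_hours(backup_interval_hours: float, transfer_hours: float = 0.0) -> float:
    """Upper bound on data loss, in hours of changes, for a periodic backup.

    Assumption: a disaster can strike just before the next backup finishes
    landing in the cloud, so everything written since the previous backup
    started is at risk.
    """
    if backup_interval_hours <= 0 or transfer_hours < 0:
        raise ValueError("interval must be positive and transfer time non-negative")
    return backup_interval_hours + transfer_hours


# Hypothetical schedules, each with an assumed 2-hour upload window.
for label, interval in [("nightly", 24.0), ("every 4 hours", 4.0), ("hourly", 1.0)]:
    at_risk = worst_case_rpo_hours(interval, transfer_hours=2.0)
    print(f"{label}: up to {at_risk:.0f} hours of changes at risk")
```

The point of the sketch is simply that the window shrinks with the interval but never drops below the time it takes to move the data.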
The backup method also has a recovery challenge: most backup solutions store data in a backup format. When disaster does strike (or a test is planned), data has to be restored out of that format and into one that the cloud’s compute layer can use. It also needs to move from cheap backup storage to a more production-grade storage tier. All this movement and conversion takes time, and it impacts the RTO.
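As a back-of-the-envelope illustration of why conversion and movement hurt RTO, the sketch below adds up the steps described above. It assumes the steps run one after another and at a steady rate; the function, its parameters, and the throughput figures are hypothetical, not measurements from any product.

```python
def backup_rto_hours(data_tb: float,
                     restore_tb_per_hour: float,
                     tier_copy_tb_per_hour: float,
                     boot_hours: float = 0.0) -> float:
    """Rough recovery time when data must leave a backup format first.

    Assumption: format conversion, the move to production-grade storage,
    and application start-up happen sequentially.
    """
    if restore_tb_per_hour <= 0 or tier_copy_tb_per_hour <= 0:
        raise ValueError("throughput values must be positive")
    convert_hours = data_tb / restore_tb_per_hour   # restore out of backup format
    move_hours = data_tb / tier_copy_tb_per_hour    # copy to the production tier
    return convert_hours + move_hours + boot_hours


# Hypothetical example: 10 TB, 2 TB/h restore, 5 TB/h tier copy, 30 min to boot.
print(f"Estimated RTO: {backup_rto_hours(10, 2, 5, boot_hours=0.5):.1f} hours")
```

Because both data-dependent terms scale with the amount of data, the estimate grows linearly as the protected data set grows.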
Preparation via Cloud Replication
A significant step forward comes from replication software companies that are adding support for cloud targets. These solutions replicate data to the cloud at much tighter intervals, in some cases as it changes. The increase in copy frequency lowers RPO significantly. Data is typically stored on production-class storage in the cloud data center. Depending on the solution, the data is transformed to work in the cloud as it lands or soon after.
While the RPO is lower with the cloud replication method, it is not zero. There is a lag before the replication process triggers, and a further lag while the data makes its way up to the cloud.
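The two lags can be added up in a simple model. This is a sketch under stated assumptions: changes are sent in one batch per trigger, the link runs at its full rated speed, and the function and parameter names are hypothetical.

```python
def replication_rpo_seconds(trigger_interval_s: float,
                            changed_mb: float,
                            link_mbps: float,
                            network_latency_ms: float = 0.0) -> float:
    """Approximate worst-case exposure for interval-based replication.

    Assumption: the worst case is a change made just after a trigger fires,
    which then waits a full interval before being shipped over the link.
    """
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    transfer_s = (changed_mb * 8) / link_mbps  # megabytes to megabits, then seconds
    return trigger_interval_s + transfer_s + network_latency_ms / 1000.0


# Hypothetical example: 60 s trigger, 125 MB of changes, 100 Mbps link, 40 ms latency.
rpo = replication_rpo_seconds(60, 125, 100, network_latency_ms=40)
print(f"Approximate RPO: {rpo:.2f} seconds")
```

Even with a trigger interval of zero, the transfer and latency terms remain, which is why the RPO never reaches zero.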
The RTO of a cloud replication solution is very good since the data is already stored in a live format on production storage. The challenge is that you are now paying for production storage twice: once at your primary data center and once in the cloud.
Preparation via Cloud Caching
A third method of cloud preparation is appearing on the market: cloud caching. Instead of pushing data up to the cloud, cloud caching solutions push data down to the primary data center. Since only the most active data is pushed down, or more accurately cached, the need for on-premises storage is reduced significantly. The cloud caching method has the advantage that the data is already in the cloud and is updated as it changes, so RPOs can be very tight. And since it also stores data on production cloud storage, RTOs are near instant.
Most cloud caching solutions have the same problem as cloud replication: their RPO weakness is the amount of time it takes to get data across the network to the provider, which is essentially a latency issue. But they also have the added problem of cache-miss latency, and a cache miss could occur much more frequently than a disaster.
A way to work around the cloud caching latency problem is to establish regional points of presence (POPs) that are only milliseconds of latency away from the primary site. That way, new data and data updates are copied and stored in almost real time. An on-premises cache miss will also be unnoticeable to users and applications, since the regional POP can service it in a few milliseconds.
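The effect of a cache miss on everyday performance can be approximated with a weighted average of hit and miss latency. The sketch below is illustrative only: the hit ratio and the latency figures for a distant cloud region and a regional POP are hypothetical placeholders, not vendor numbers.

```python
def average_read_latency_ms(hit_ratio: float, local_ms: float, miss_ms: float) -> float:
    """Expected read latency for an on-premises cache backed by remote storage.

    Assumption: every read is either a local hit or a single remote fetch.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * local_ms + (1.0 - hit_ratio) * miss_ms


# Hypothetical figures: 90% hit ratio, 1 ms local reads.
backends = {"distant cloud region": 40.0, "regional POP": 2.0}
for name, miss_ms in backends.items():
    avg = average_read_latency_ms(0.9, local_ms=1.0, miss_ms=miss_ms)
    print(f"{name}: about {avg:.1f} ms average read latency")
```

The closer the miss latency is to the local latency, the less the hit ratio matters, which is the argument for placing the POP near the primary site.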
Each of the methods for preparing the cloud to help you recover from disaster is a significant improvement over doing nothing, and each data center needs to make its own decision. While the cloud cache method represents the largest commitment to the cloud, it does offer very low RPOs and RTOs. It also delivers cost savings, as there is no longer a need to continue to invest in primary on-premises storage.
How a DRaaS solution gets data to the cloud is only one element in improving the time required for an organization to recover from a disaster. To learn more about improving DR recovery times while also reducing costs, watch our on-demand webinar, “How to Cut Disaster Recovery Expenses – Improve Recovery Times”.
Sponsored by ClearSky Data