Most of the time IT spends on disaster recovery planning is actually spent on protection: making sure data is copied to a secondary storage system and a secondary site, either by backup or by replication. Some advanced recovery planners may have incorporated testing into their plan. But very few have thought about the recovery process itself. Now, with the impact of Hurricanes Harvey and Irma on full display, IT professionals need to think not only about the recovery of their personal lives but also about the recovery of the organization for which they work.
Major Disaster Recovery is Different
In a year that is showing the negative impact of ransomware, the thought of a major recovery effort – one in which the entire data center is wiped out – has been put on the back burner. As the organizations impacted by Hurricane Harvey are finding out, major disaster recovery efforts are something entirely different.
In many cases, the time pressure to recover is eased, as organizations find they truly can go a few days without data and still be in business. At some point, however, life has to be breathed back into the organization and a recovery process has to start. In other cases, the business has failed over to other locations or the cloud and is operational relatively quickly. In both situations, though, the business is eventually, and hopefully, back up and running, at least from a data perspective. Now the problem becomes how to get out of the DR site and back to the primary data center.
It is Never Too Soon To Plan
Once the organization is back up and running at the DR site it is never too soon to start preparing for recovery. The first step in the recovery process is to treat the DR site as the primary site. That means implementing a backup and DR strategy for the DR site like what was in place for the original primary data center. While lightning may not strike twice, disaster might – so be prepared.
The next step is to start thinking about moving back to the original data center location. In any disaster, but especially hurricanes, there will be data centers that start disaster recovery operations but end up with a primary data center that remains intact. There will be others that have partial damage and others that are a complete and total loss. Each requires a different strategy for recovery.
If the data center is intact or partially survives, that means some or all of the data is also still intact. But since there was a failover, more than likely the DR site has a more recent copy of data. If that is the case, IT should not copy all the data back from the DR site. It should, instead, save time and only copy the data that was changed while the DR site was acting as the primary. Most replication tools should be able to handle this type of request. Many backup software products will not. Before the recovery starts, understand the recovery granularity of the data protection tools the organization is using.
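As a rough illustration of copying back only what changed, the Python sketch below lists the files under a directory that were modified after the failover time. It is a file-level simplification under stated assumptions: the function name `changed_since`, the directory layout, and the use of modification times are all hypothetical, and real replication tools typically track changes at the block or journal level rather than by timestamp.

```python
import os


def changed_since(root, failover_epoch):
    """Return relative paths of files under root modified after failover_epoch.

    failover_epoch is a Unix timestamp marking when the DR site took over
    (an assumption for this sketch; a real tool would use its own change log).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Anything written after the failover exists only at the DR site.
            if os.path.getmtime(path) > failover_epoch:
                changed.append(os.path.relpath(path, root))
    return sorted(changed)
```

Only the paths returned here would need to travel back to the primary site; everything else is assumed to match what survived there, which should be verified before relying on it.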
If the data center is a total loss, and that may be the case for most of the organizations impacted by Harvey, then a full recovery to a new primary data center is required. It is also safe to assume that none of the physical hardware is suitable for use. Water that is partially salt water, mixed with diesel and who knows what else, is not kind to electronic hardware. These organizations will have to replace it all.
The best move is to have all of the new hardware shipped to the DR site so an initial recovery can be done via local network connections. Then ship the hardware to the new primary site while maintaining operations at the DR site. When the new equipment is installed and implemented at the new primary site, do a smaller re-sync operation so that only the data that changed during transport and implementation has to be sent.
The Cloud Recovery Problem
Recovery is one of the challenges of using the cloud for backup and disaster recovery, even with DRaaS. DRaaS solutions are ideal for rapid recovery, but moving the full data set back across the Internet connection may be a challenge. In theory, the organization could continue to run operations in the cloud while data trickles back across the Internet. The organization needs to understand the cost involved and how long it will take to replicate potentially petabytes of data. None of the technology an organization uses during the backup process will help here.
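To get a feel for how long that trickle might take, a back-of-the-envelope estimate can help. The Python sketch below is only an approximation: the function name `transfer_days` and the `efficiency` factor (the share of the link actually available for replication) are assumptions for illustration, and it ignores protocol overhead, throttling, and data that keeps changing during the transfer.

```python
def transfer_days(data_tb, link_mbps, efficiency=0.7):
    """Estimate days needed to move data_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of the link usable for the transfer.
    Uses decimal units: 1 TB = 10**12 bytes, 1 Mbps = 10**6 bits per second.
    """
    total_bits = data_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    seconds = total_bits / usable_bits_per_second
    return seconds / 86400  # seconds in a day
```

For example, `transfer_days(1000, 1000, 1.0)` puts 1 PB over a fully utilized 1 Gbps link at roughly 93 days, which is why shipping disks or hardware is often worth considering.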
Organizations should look for DRaaS providers that offer at least one of three options: no charge for compute once replication back to the primary site has started, the ability to create a disk or NAS copy of the data and ship that copy to the new primary data center, or permission to ship new physical hardware to the cloud provider’s data center for local restoration.
Priority one for most IT personnel is, as it should be, to take care of their family and personal property. But eventually thoughts return to the business. Getting systems back up and running becomes critical. Most organizations will eventually get back to operation at some level, losing some productivity and data (eliminating that loss is a subject for another day). Returning to the primary data center is THE challenge facing organizations in the south Texas, Louisiana and Florida areas, and while it may take months for that return to occur, preparing for it should begin now.
At Storage Switzerland, backup and recovery are full-time jobs. We are fortunate to have several webinars already scheduled that meet the needs of organizations impacted by Harvey and of IT professionals who are heeding the warning signs.