As discussed in our last entry, replication is an ideal way for most organizations to meet the data protection and data recovery demands of their mission critical applications. But there is a choice that IT needs to make when selecting a replication solution. Should they select a hardware-based solution, one that is integrated into their storage system, or a software-based solution that requires separate implementation but provides freedom of choice?
What is Hardware-Based Replication?
Hardware-based replication is still software. It is just integrated into the storage system. Most storage systems today include Ethernet connectivity, so granting them access to a WAN connection is relatively easy. Also, enabling replication on these systems is straightforward and with a few clicks the storage administrator can replicate volumes of data. Hardware-based replication does not require the installation of any software on the client, which further simplifies installation. Finally, the client-less replication means that any operating system that can connect to the array can be protected.
Most hardware-based replication solutions leverage the storage system’s snapshot capability to feed data to the target system. After making the initial copy, a snapshot is taken on a periodic basis and the changes in that snapshot are replicated to the secondary storage system. The window of data exposure depends on the time between snapshots. Since most organizations take snapshots less often than every 15 minutes, hardware-based replication may not meet an organization’s mission critical recovery needs.
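The snapshot cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular array implements it: snapshots are represented as plain dictionaries mapping block numbers to contents, and the function names `snapshot_delta` and `apply_delta` are hypothetical.

```python
def snapshot_delta(previous, current):
    """Return the blocks that differ between two snapshots.

    Each snapshot is modeled as a dict of block number -> block contents.
    A value of None in the result marks a block removed since `previous`.
    """
    delta = {}
    for block, data in current.items():
        if previous.get(block) != data:
            delta[block] = data          # new or changed block
    for block in previous:
        if block not in current:
            delta[block] = None          # block no longer present
    return delta


def apply_delta(target, delta):
    """Apply a delta to the target copy in place."""
    for block, data in delta.items():
        if data is None:
            target.pop(block, None)
        else:
            target[block] = data


# Initial full copy to the secondary system.
snap_1 = {0: b"boot", 1: b"db-v1", 2: b"logs"}
target = dict(snap_1)

# One periodic cycle: only the differences are sent.
snap_2 = {0: b"boot", 1: b"db-v2", 3: b"new"}
delta = snapshot_delta(snap_1, snap_2)
apply_delta(target, delta)
```

In this model, anything written to the source after `snap_2` is taken does not reach the target until the next snapshot cycle, which is the exposure window in question.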
Hardware-based replication has some other notable downsides. First, in almost every case, the target system must be from the same vendor as the source system. Some of these vendors will allow replication from a higher-end system to one of their lower-end systems, but the premium the customer pays for a name brand storage system is definitely felt at least twice. Considering the customer may rarely use that second system, the extra expense may feel wasteful. The hardware limitation also means an organization using multiple vendors to meet its primary storage needs will have to manage several different replication interfaces.
The second concern stems from the first: if the source and target systems must be from the same vendor, then replicating data to the cloud is more limited. A few of these vendors have enabled the software of their storage systems to run in the cloud, which would enable cloud replication.
The third concern is that, because they lack an agent, these systems lose granularity over the data they are protecting. They operate at a volume level, so if the volume is 90% extraneous data and 10% mission critical data, these solutions need to replicate the whole volume.
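To put rough numbers on that, here is a small Python sketch. The 1,000 GB volume size is a made-up figure for illustration, and `initial_copy_gb` is a hypothetical helper, not part of any product.

```python
def initial_copy_gb(volume_gb, critical_percent, granular):
    """Size of the initial copy, in GB.

    granular=False models volume-level replication (everything is copied).
    granular=True models replication limited to the mission critical data.
    """
    if granular:
        return volume_gb * critical_percent // 100
    return volume_gb


volume_gb = 1000        # assumed volume size, for illustration only
critical_percent = 10   # the 10% mission critical share from the example

whole_volume = initial_copy_gb(volume_gb, critical_percent, granular=False)
critical_only = initial_copy_gb(volume_gb, critical_percent, granular=True)
print(whole_volume, critical_only)
```

Under these assumptions, the volume-level approach moves ten times as much data for the initial copy as a tool that can select only the critical data.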
What is Software-Based Replication?
Software-based replication is not integrated into the array. This means a software client needs to be installed on the servers being protected and each server needs to be individually managed and monitored. Some replication software will install at the hypervisor level so a single installation will protect all of the virtual machines in an environment. Typically, the client software will replicate data, at a byte or block level, as it changes to a target system, narrowing the window of exposure.
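As a rough illustration of replicating data as it changes, the Python sketch below captures each block write on the source and forwards it to a target. It is a toy model under stated assumptions: the `ReplicatedVolume` class is hypothetical, a dictionary stands in for the secondary system, and real products handle networking, ordering guarantees and failures that are left out here.

```python
from collections import deque


class ReplicatedVolume:
    """Toy source volume that forwards every block write to a target."""

    def __init__(self, target):
        self.blocks = {}          # local (source) copy
        self.target = target      # dict standing in for the secondary system
        self.pending = deque()    # writes captured but not yet shipped

    def write(self, block, data):
        """Apply a write locally and queue it for replication."""
        self.blocks[block] = data
        self.pending.append((block, data))

    def ship(self):
        """Send queued writes to the target in the order they occurred."""
        shipped = 0
        while self.pending:
            block, data = self.pending.popleft()
            self.target[block] = data
            shipped += 1
        return shipped


target = {}
volume = ReplicatedVolume(target)
volume.write(1, b"db-v1")
volume.write(1, b"db-v2")   # a later change to the same block
volume.write(2, b"logs")
shipped = volume.ship()
```

Because each write is queued the moment it happens, the exposure in this model is limited to whatever is still sitting in `pending`, rather than everything written since the last snapshot.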
Despite this seemingly more difficult installation, software-based replication has some significant advantages. First, it is near-realtime. As data changes, it is sent to the secondary storage system, so there is not the exposure window of hardware-based replication. Second, its ability to run at the client means it can be more granular and replicate only the data sets needed in a disaster. It may also have better interfaces with applications, allowing for the creation of a cleaner snapshot.
The third advantage is probably the biggest. With software-based replication the secondary target can be from a variety of vendors, including second tier vendors that may be significantly less expensive than the name brand vendors. The lowering of costs may also enable the organization to include the protection of a greater number of applications with the software.
This hardware flexibility also helps on the source side. The source systems can also be from a variety of vendors, which means an organization with primary storage systems from multiple vendors can use a single replication tool to protect mission critical applications. It also means that when the organization decides to upgrade its production hardware, it doesn’t have to change its mission critical replication process.
The final advantage is that software-based replication, because it is flexible in target selection, is better equipped to work with cloud providers. In fact, several replication vendors allow the replication of data to all of the major public cloud storage providers.
The big advantage of hardware-based replication is that it can protect hundreds of servers with very low installation requirements because it protects at the volume level. There is also the appearance of the capability being free since it is bundled with the storage hardware. Of course, it’s not really free; it is just factored into the price.
Software-based replication has the big advantage of flexibility. The hardware targets and sources can be from a variety of vendors, which allows the organization to keep costs down. It can also provide more frequent protection, minimizing exposure, something important to mission-critical applications.
Since mission critical applications are typically few in number, the fact that replication software has a client component to be installed on every server it protects is of less consequence. The gain in better granularity, better hardware flexibility and more continuous protection offsets the theoretical challenges of installing the client on a handful of mission critical servers.