The 3-2-1 rule states that an organization should have three copies of data on two different types of media, with IT storing one copy off-site and preferably offline. Traditionally, organizations count on the backup process to meet the 3-2-1 rule’s requirements. The problem is that designing the backup architecture to meet these requirements is expensive and complicated. Additionally, storing and accessing offline data is proving more challenging, and even offline copies are not always safe from a malware attack.
The Problem with Designing Backup for 3-2-1
Designing backup architectures to live up to the 3-2-1 standard requires transferring all data from primary storage to a secondary storage device and then replicating or shipping a copy of the copy off-site. The cost of software to perform these functions, the networks to make it possible, and the secondary storage to store all the data quickly adds up. Then there are the management costs of an IT administrator to babysit the process.
The core problem in designing a backup architecture to satisfy the 3-2-1 requirement is that the backup software operates outside of the primary storage systems it protects. While there is some integration between snapshots and backup systems, the backup software still needs to transfer data from primary storage to a secondary storage system. It also then needs to manage the replication of that data to another storage system at another location.
To achieve some efficiency, most modern backup applications back up data as images instead of discrete files. As organizations start to address the rising tide of data privacy issues, the need to search and remove specific datasets within backups dramatically increases. Image backups can’t offer this functionality without first restoring the image.
Can Primary Storage Meet the 3-2-1 Requirement?
But what if IT abandons the notion of backup as the means to meet the 3-2-1 requirement? Modern primary storage can almost fulfill the 3-2-1 requirement. Snapshots provide point-in-time protection, and they feed the creation of a second copy of data on a separate system. The data on that system can be snapshotted again for further point-in-time granularity. Most primary storage systems can also replicate data to a remote location.
In this configuration, the primary storage system meets the three-copy requirement and part of the one-copy-off-site requirement. It’s also relatively simple to design and manage. However, it does not meet the requirement for two different types of media, nor does it meet the requirement for a disconnected, offline copy.
The more significant problem is that designing a primary storage architecture to meet the 3-2-1 requirements is very expensive. Most vendors’ storage systems only replicate to a similarly configured system. In most situations, the organization has to purchase three of the same system, two on-premises and one at a remote site. It also has to pay for that remote site and manage it. This expense is why the overwhelming majority of organizations continue to invest in traditional backup architectures, which, despite their complexity, are still considerably less expensive.
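As a rough illustration of the assessment above, the following Python sketch checks a set of copies against each part of the 3-2-1 rule. The `DataCopy` class, the `check_3_2_1` function, and the three-array layout are hypothetical names and values chosen for this example; the layout assumes all three systems use the same media, as is typical when a vendor only replicates between similarly configured systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data set (hypothetical model for illustration)."""
    name: str
    media: str           # e.g. "flash", "disk", "tape", "object"
    offsite: bool
    offline: bool = False


def check_3_2_1(copies):
    """Report which parts of the 3-2-1 rule a set of copies satisfies."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media_types": len({c.media for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
        "offsite_copy_offline": any(c.offsite and c.offline for c in copies),
    }


# Assumed layout: snapshot-fed second system plus a remote replica,
# all on the same kind of array.
primary_storage_design = [
    DataCopy("production array", media="flash", offsite=False),
    DataCopy("second on-premises array", media="flash", offsite=False),
    DataCopy("remote replica", media="flash", offsite=True),
]

result = check_3_2_1(primary_storage_design)
for requirement, met in result.items():
    print(f"{requirement}: {'met' if met else 'NOT met'}")
```

Under these assumptions the design satisfies the copy count and the off-site location, but not the media diversity or the offline copy.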
Using Cloud Storage to Create 3-2-1-Viable Primary Storage
Meeting the 3-2-1 rule through primary storage is less complicated, which should lead to greater consistency, but it does need to overcome the cost problem and close the gaps where it does not meet all of the 3-2-1 requirements. Cloud storage, if used as primary storage, may enable the organization to meet the 3-2-1 requirements cost-effectively.
A big challenge to using the cloud as primary storage is overcoming the latency concern. IT can reduce latency by deploying an on-premises appliance that intelligently caches data. The problem is that with the traditional approach to cloud as primary storage, cache misses are fulfilled from the cloud, which is far away, introducing enough latency to cause an application to crash. The way to overcome this challenge is to have a second tier located regionally, at the edge, that is only milliseconds away in terms of latency.
With the latency concern addressed, using the cloud as 3-2-1-viable primary storage becomes possible. The three-tier architecture of cloud as primary storage means that active data is automatically in three locations (on-premises, secondary tier, and public cloud) and is replicated to two or more locations of a large cloud provider. Cloud as primary storage more than meets the three-copy requirement and the one-copy-off-site requirement of the 3-2-1 rule, and it does so in a vastly more cost-effective way.
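The read path described above can be sketched as a read-through lookup across tiers. This is a minimal model, not any vendor's implementation: the `Tier` class, the `read` function, the block names, and the latency figures (0.5 ms, 5 ms, 80 ms) are illustrative placeholders, not measurements.

```python
class Tier:
    """A storage tier with an assumed, fixed access latency."""

    def __init__(self, name, latency_ms, data=None):
        self.name = name
        self.latency_ms = latency_ms
        self.data = dict(data or {})


def read(key, tiers):
    """Look in each tier in order; on a hit, copy the block into
    every faster tier that missed so the next read is served closer."""
    elapsed_ms = 0.0
    for depth, tier in enumerate(tiers):
        elapsed_ms += tier.latency_ms
        if key in tier.data:
            value = tier.data[key]
            for faster in tiers[:depth]:
                faster.data[key] = value
            return value, tier.name, elapsed_ms
    raise KeyError(key)


cache = Tier("on-premises cache", 0.5)
edge = Tier("regional edge tier", 5.0, {"block-7": b"payload"})
cloud = Tier("public cloud", 80.0, {"block-7": b"payload", "block-9": b"cold"})
tiers = [cache, edge, cloud]

first = read("block-7", tiers)    # cache miss, served by the edge tier
second = read("block-7", tiers)   # now served by the on-premises cache
cold = read("block-9", tiers)     # falls through to the public cloud

for label, (_, served_by, ms) in [("first", first), ("second", second), ("cold", cold)]:
    print(f"{label}: served by {served_by} in {ms} ms")
```

In this model a cache miss for recently active data costs only the edge tier's round trip, and only data absent from both nearer tiers pays the full trip to the public cloud.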
The two-different-types-of-media part of the 3-2-1 rule is harder to meet. Traditionally, “two types of media” meant copying data from a hard disk-based system to a tape-based system. Modern interpretations suggest that copying from a flash or hard disk-based infrastructure to a dedicated hard disk or object storage-driven infrastructure meets the spirit of the two-media requirement.
If the organization is genuinely concerned about a viable threat to the worldwide hard drive installed base from a weapon or hack, then the best place to make a copy to tape is at the secondary tier. The reality that data is available on multiple types of storage (flash and hard disk) and in different storage formats (file, block, and object) meets at least the spirit of the two-media requirement.
In the modern era, the two-media requirement is more about keeping a copy that is resistant to modification by outside influences. If that is the concern, then a large cloud copy can be set to write once, read many (WORM), or IT can limit authorization so that no service other than the storage service can update the second copy.
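The two protections mentioned above, write-once behavior and restricting who may write, can be illustrated with a small in-memory model. The `WriteOnceStore` class and the caller names are hypothetical; a real deployment would rely on the cloud provider's own immutability and access-control features rather than application code like this.

```python
class WriteOnceStore:
    """In-memory model of a write-once, read-many copy that only
    one designated service is allowed to write to."""

    def __init__(self, authorized_writer):
        self._objects = {}
        self._authorized_writer = authorized_writer

    def put(self, caller, key, data):
        if caller != self._authorized_writer:
            raise PermissionError(f"{caller!r} may not write to this copy")
        if key in self._objects:
            raise PermissionError(f"{key!r} is write-once and already exists")
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


def attempt(store, caller, key, data):
    """Try a write and report whether the store accepted it."""
    try:
        store.put(caller, key, data)
        return "accepted"
    except PermissionError:
        return "rejected"


store = WriteOnceStore(authorized_writer="storage-service")
initial = attempt(store, "storage-service", "snap-001", b"data")
foreign = attempt(store, "unknown-process", "snap-002", b"x")
overwrite = attempt(store, "storage-service", "snap-001", b"tampered")
print(initial, foreign, overwrite)
```

The point of the sketch is that neither an unauthorized process nor a compromised authorized one can alter an object once it has been written.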
The Cost Advantage
The significant disadvantage of using primary storage to meet the 3-2-1 rule has been its expense. The cloud as primary storage eliminates this disadvantage.
First, cloud as primary storage reduces the organization’s dependency on on-premises storage. It is practical for the organization to eventually have a compute-only data center and access all data through the on-premises cache. In the 3-2-1 construct, cloud as primary storage eliminates all the primary storage systems and the on-premises secondary storage systems.
Second, cloud as primary storage eliminates the need for backups. New or modified data is stored on the on-premises cache and copied almost instantly (within milliseconds) to the regional tier. A few seconds later, a copy is sent to the large cloud provider. Some providers never charge the customer for this replication and simply provide a single durable copy as part of a service agreement.
Third, cloud as primary storage eliminates the need for a dedicated disaster recovery site. If there is a site failure, the organization can restart applications in the cloud, since all the data is there in its native format, or it can reconnect to the regional tier from an alternate location.
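The write path in the second point and the site-failure case in the third can be sketched together. This is a simplified model with hypothetical names (`TieredWriter`, `drain_to_cloud`, `lose_on_premises_site`); it does not model real timing, and the explicit drain step stands in for the delayed copy to the large cloud provider.

```python
from collections import deque


class TieredWriter:
    """Model of a write landing in the cache, the regional tier,
    and (after a short delay) the public cloud."""

    def __init__(self):
        self.cache = {}
        self.regional = {}
        self.cloud = {}
        self._pending_cloud = deque()

    def write(self, key, data):
        self.cache[key] = data
        self.regional[key] = data          # copied right away in this model
        self._pending_cloud.append(key)    # cloud copy follows later

    def drain_to_cloud(self):
        """Send queued updates to the cloud; return how many were sent."""
        sent = 0
        while self._pending_cloud:
            key = self._pending_cloud.popleft()
            self.cloud[key] = self.regional[key]
            sent += 1
        return sent

    def copies(self, key):
        return sum(key in tier for tier in (self.cache, self.regional, self.cloud))

    def lose_on_premises_site(self):
        self.cache.clear()

    def read(self, key):
        for tier in (self.cache, self.regional, self.cloud):
            if key in tier:
                return tier[key]
        raise KeyError(key)


writer = TieredWriter()
writer.write("invoice.db", b"v1")
copies_before = writer.copies("invoice.db")   # cache + regional tier
sent = writer.drain_to_cloud()
copies_after = writer.copies("invoice.db")    # all three tiers
writer.lose_on_premises_site()
recovered = writer.read("invoice.db")         # served from the regional tier
print(copies_before, sent, copies_after, recovered)
```

Even after the on-premises cache is lost, the data remains readable from the regional tier or the cloud, which is what makes restarting from an alternate location possible.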
The cloud as primary storage is compelling just from its data protection capabilities. The simplicity with which it allows the organization to meet 3-2-1 requirements promises to make data protection headaches a thing of the past. Combine cloud as primary storage’s data protection capabilities with its potential to lower costs and better leverage cloud resources, and it becomes a solution that any organization should consider.