Not all restores are created equal, even during a disaster. The reality is that IT should not recover most mission-critical systems from a backup of any kind (flash, disk or tape). Instead, IT should recover these critical systems from a replicated copy of data, most likely held on a second production-quality storage system. The backup process handles all the other, less time-sensitive recoveries. First, there are a handful of applications that have a recovery time objective (RTO) of an hour or two. These applications should be on backup disk and leverage a recovery in-place feature, now common in most backup applications. With the most mission-critical and business-important applications up and running, IT can turn its attention to the remaining applications and data sets. One can make a case that these remaining recoveries can come from tape.
Depending on the organization, as much as 80% of all backup data should be stored exclusively on tape, which can dramatically reduce backup storage infrastructure costs. Assuming you agree with the analysis in our blog “Comparing the Price of Disk-only Backup to Tape Integrated Backup,” an organization can cut backup infrastructure hardware costs by as much as 60%. Using a tape-integrated methodology also reduces long-term costs, since tape libraries expand capacity by simply adding more media, not by adding more libraries.
Given the realities of how restores actually happen, it makes sense to ask whether backups themselves can change. Can IT send data directly to tape and bypass disk altogether? First, let’s discuss what can’t back up directly to tape. Tape can ingest data at incredible rates, but it needs to be fed data consistently or it has to slow itself down and then speed back up, a process called shoe shining. Any data set with many small files, or any network-constrained environment, should use disk as a cache and then move the backup jobs from disk to tape. This caching method still saves an organization money since it only uses the disk capacity for a short time.
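The streaming requirement described above can be approximated with a quick calculation. The following is only a minimal sketch: the function names, the per-file overhead, and the drive's minimum streaming rate are hypothetical values chosen for illustration, not vendor figures, and a real assessment should use measured throughput.

```python
def effective_feed_rate(avg_file_mb, per_file_overhead_s, source_mb_per_s, network_mb_per_s):
    """Estimate the sustained rate (MB/s) at which a source can feed a tape drive.

    Each file costs a fixed overhead (open, metadata lookup) plus its transfer
    time, so many small files drag the average rate down. The network link
    caps whatever the source can read.
    """
    seconds_per_file = per_file_overhead_s + avg_file_mb / source_mb_per_s
    read_rate = avg_file_mb / seconds_per_file
    return min(read_rate, network_mb_per_s)


def backup_target(feed_rate_mb_per_s, drive_min_stream_mb_per_s):
    """Pick a target: stream straight to tape only if the drive can be kept fed."""
    if feed_rate_mb_per_s >= drive_min_stream_mb_per_s:
        return "direct-to-tape"
    return "disk-cache-then-tape"


# Hypothetical inputs: 100 KB files vs. 5 GB files on the same storage and network.
small = effective_feed_rate(0.1, 0.01, 500, 1000)
large = effective_feed_rate(5000, 0.01, 500, 1000)
print(backup_target(small, 100))  # small files cannot keep the drive streaming
print(backup_target(large, 100))  # large files can
```

With these assumed numbers, the small-file data set delivers roughly 10 MB/s and is routed through the disk cache, while the large-file data set stays close to the source's full read rate.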
The first data set that can go straight to tape, bypassing disk, might be surprising: mission-critical data! In most cases, IT is already replicating mission-critical data for recovery and is not counting on the backup process at all. Most mission-critical data can stream to tape very effectively because it is large and easy to send to tape in parallel jobs. As a result, tape drives are able to quickly get up to their rated speeds and stay there for the majority of the job.
The second type of backup job that can go directly to tape is any file system that stores large files like high-resolution video or audio. These files are typically unique compared with the other data in the environment, so there is no deduplication benefit, and the file sizes are large enough to keep a tape drive running at full performance.
Most organizations that embrace tape (and we gave you a lot of reasons to do so in our Online Guide, “Reintroducing Tape to the Data Center”) will send all backups to disk first and later copy those jobs to tape. As we discussed in that series, this approach reduces infrastructure costs significantly.
IT, however, can realize even greater savings by looking at three scenarios where moving data to tape sooner is a better option. Any data that does not need sub-four-hour recovery, or that has another recovery option like replication, is a viable candidate for direct-to-tape or cache-to-tape backups. Data sets without a rapid restore requirement but with a lot of small files, or situations where organizations are using block-level incremental backups, should leverage disk as a cache and then move the job to tape after it completes. Mission-critical applications may be able to keep tape drives running at full performance for the entire backup, as should large-file data sets. Given the potential percentage of the total data set that these systems represent, the additional savings beyond just moving old backup jobs to tape should be significant for many organizations.
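The rules of thumb above can be written down as a simple decision function. This is a sketch under stated assumptions, not a definitive policy: the `DataSet` fields, the `backup_path` name and the four-hour threshold parameter are hypothetical, and each organization will want to tune the criteria to its own environment.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    rto_hours: float                 # recovery time objective expected from backup
    replicated: bool                 # another recovery option already exists
    many_small_files: bool = False
    block_level_incremental: bool = False
    network_constrained: bool = False


def backup_path(ds, rapid_rto_hours=4.0):
    """Return 'disk', 'cache-to-tape' or 'direct-to-tape' for a data set."""
    # Data that needs rapid recovery and has no other recovery option
    # stays on backup disk.
    if ds.rto_hours < rapid_rto_hours and not ds.replicated:
        return "disk"
    # Workloads that cannot feed a tape drive consistently land on disk
    # first and move to tape once the job completes.
    if ds.many_small_files or ds.block_level_incremental or ds.network_constrained:
        return "cache-to-tape"
    # Everything else (replicated mission-critical data, large-file
    # data sets) can stream straight to tape.
    return "direct-to-tape"


for ds in [
    DataSet("erp-database", rto_hours=0.25, replicated=True),
    DataSet("mail-server", rto_hours=2, replicated=False),
    DataSet("home-directories", rto_hours=24, replicated=False, many_small_files=True),
    DataSet("video-archive", rto_hours=48, replicated=False),
]:
    print(ds.name, "->", backup_path(ds))
```

Checking the recovery requirement first mirrors the order of the argument: the restore path decides whether disk is needed at all, and only then does the shape of the data decide between cache-to-tape and direct-to-tape.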