With the release of LTO-8, tape media and hardware vendors are once again boasting about how affordable it is to store data on tape. With a compressed capacity of 30TB per tape, they are right in doing so. At Storage Switzerland, we continue to see tape as part of a data protection and archive strategy, but we also think that the tape industry’s decision to lead with price is a mistake. The price delta between disk and tape has always been there, but increasingly organizations are buying disk to fulfill traditional tape roles like backup and archive storage.
What’s Wrong With Tape
Part of the challenge with deciding if tape makes sense in your data center is understanding what it will be used for. Since the 1990s, the role of tape has traditionally been as storage for backup and archive data. In fact, for both of those processes, it was the ONLY target. In the early 2000s, most organizations began to mix disk into the backup environment. Today, many organizations send all backups and archives initially to a disk-based secondary storage system, and an increasing number of organizations don’t use tape at all for any function.
Tape’s reputation for poor reliability as a storage medium is undeserved. The truth is that it is reliable and extremely cost effective. If IT uses tape to store data, it needs to follow best practices similar to those for disk-based storage, most notably migrating data to new media every several years and keeping current with technology.
The one problem tape-based storage can’t overcome, when compared to disk-based secondary storage, is “time to first bit.” Certainly, once data is found and is being streamed, tape is fast, faster in most cases than a hard-disk system. But to realize that speed, the data being retrieved needs to be contiguous and fairly large. A single access of a 500KB file will never reveal tape’s true speed.
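To make the “time to first bit” point concrete, here is a minimal sketch of how a fixed delay before the first byte drags down the effective speed of a small read. The function name and the sample values (a 60-second delay and a 300MB/s streaming rate) are illustrative placeholders, not measurements of any particular drive or library.

```python
def effective_throughput_mb_s(size_mb: float,
                              first_bit_s: float,
                              stream_mb_s: float) -> float:
    """Effective MB/s for one retrieval.

    size_mb      -- size of the data being recalled, in MB
    first_bit_s  -- delay before the first byte arrives (mount, load,
                    seek); a hypothetical input, not a measured value
    stream_mb_s  -- sustained streaming rate once data is flowing;
                    also a hypothetical input
    """
    if size_mb <= 0 or stream_mb_s <= 0 or first_bit_s < 0:
        raise ValueError("size and rate must be positive, delay non-negative")
    # Total time is the fixed delay plus the time spent streaming.
    total_seconds = first_bit_s + size_mb / stream_mb_s
    return size_mb / total_seconds


if __name__ == "__main__":
    # Placeholder values chosen only to show the shape of the curve.
    for size_mb in (0.5, 500.0, 1_000_000.0):
        rate = effective_throughput_mb_s(size_mb, first_bit_s=60.0,
                                         stream_mb_s=300.0)
        print(f"{size_mb:>12,.1f} MB -> {rate:10.3f} MB/s effective")
```

With any fixed delay, the effective rate for a 500KB file stays close to zero, while a very large contiguous read approaches the streaming rate, which is the behavior described above.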
The challenge is that the world today is all about access. Recall, right or wrong, is all about instant gratification. Even if the file hasn’t been accessed in years, when users want old data they want it NOW! Users simply don’t want to wait for data to be recalled, even if that retrieval is just a few minutes slower than disk. While the tape community will make a strong case about the cost savings of those few minutes of wait time, organizations are repeatedly voting with their wallets that even the shortest wait time is just too much.
Tape, once the initial investment is out of the way, is incredibly cost effective: as little as $90 for 6-12TB of data. The problem is getting that initial investment out of the way. In most cases, an organization considering tape is going to use a tape library so tape media can automatically be placed in drives. The organization needs to buy a robotic tape library with at least two tape drives in it. New, these drives can be over $4,000 raw, and the organization needs to prepare to pay a premium when they are “customized” for use in a tape library. The library is, essentially, sheet metal wrapped around tape media shelves with a robotic arm that is precise enough to pick up a piece of tape media and gently place it in the drive. The library itself costs $8,000 for a very small system and scales in price as the number of supported slots increases.
To justify the cost of a library, the organization has to have a large enough capacity requirement to amortize the cost of the library across that capacity. As a result, most tape library vendors are almost solely focused on sales of their largest systems. They’ve found the competition against midrange disk-based secondary storage systems to be too steep.
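As a rough illustration of that amortization, the sketch below uses the figures quoted above: an $8,000 small library, two drives at $4,000, and $90 cartridges counted at the 6TB low end. Treating $4,000 as a per-drive price is an assumption, as are the function names, and the disk price per TB is a hypothetical input the reader must supply. The model ignores the library premium on drives, slot scaling, power, cooling and staffing, so it is a lower bound on tape’s upfront cost rather than a quote.

```python
import math
from typing import Optional

LIBRARY_COST = 8_000.0   # article: a very small library
DRIVE_COST = 4_000.0     # article: "over $4,000 raw"; assumed to be per drive
DRIVE_COUNT = 2          # article: at least two drives
CARTRIDGE_COST = 90.0    # article: as little as $90 per cartridge
CARTRIDGE_TB = 6.0       # article: 6-12TB; the low end is used here

FIXED_COST = LIBRARY_COST + DRIVE_COUNT * DRIVE_COST


def tape_total_cost(capacity_tb: float) -> float:
    """Upfront hardware plus enough whole cartridges for capacity_tb."""
    cartridges = math.ceil(capacity_tb / CARTRIDGE_TB)
    return FIXED_COST + cartridges * CARTRIDGE_COST


def tape_cost_per_tb(capacity_tb: float) -> float:
    """Total tape cost spread across the capacity actually stored."""
    return tape_total_cost(capacity_tb) / capacity_tb


def break_even_tb(disk_cost_per_tb: float) -> Optional[float]:
    """Capacity at which tape's total cost matches disk's.

    disk_cost_per_tb is a hypothetical figure supplied by the caller.
    Returns None when disk is no more expensive per TB than tape media,
    in which case tape never catches up under this model.
    """
    media_per_tb = CARTRIDGE_COST / CARTRIDGE_TB
    if disk_cost_per_tb <= media_per_tb:
        return None
    return FIXED_COST / (disk_cost_per_tb - media_per_tb)


if __name__ == "__main__":
    for tb in (60, 600, 6_000):
        print(f"{tb:>6} TB on tape: ${tape_cost_per_tb(tb):,.2f} per TB")
```

Storing 60TB under these assumptions costs $16,900, or roughly $282 per TB, because the fixed hardware dominates. With a placeholder disk price of $95 per TB (not a quoted figure), break_even_tb(95.0) returns 200.0, meaning tape only becomes cheaper past 200TB.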
What’s Right with Tape
According to the tape industry, the big advantage tape has, assuming there is enough of a capacity requirement to offset the cost of the initial investment, is a tremendous price advantage. This advantage becomes greater if the organization can live with off-line media, tapes not in the library, which many can. This is especially true if the long term cost of powering and cooling a disk array is factored in.
Power-down technologies on disk arrays simply have not been embraced. Using power-down disk technology is also especially problematic with scale-out storage systems, the architecture that drives most disk archive systems. Only one object storage vendor has solved the riddle of powering down nodes within a scale-out disk cluster.
The second advantage tape has, especially in recent years, is that it is essentially offline. While disk systems could and should take steps to protect themselves from attack, nothing is as protected as a disconnected copy of data. Tape should survive even the worst cyber-attack since the media is not directly accessible. Ironically, surviving the newest form of disaster, ransomware, may be the single reason IT needs to justify the investment.
The third advantage is its diversity. In a very disk-based world, it is simply not disk. The value of having data on something other than disk has yet to be realized, but there is a working assumption that if a virus of some form infects every hard disk system in the world, tape will be the only means of digitally preserving information.
The final advantage is tape’s portability. Dozens of tapes, totaling hundreds of terabytes, can be put into a shipping container and shipped around the world in less than 24 hours. Disk systems are simply not designed to be moved, and all the internet bandwidth in the world won’t allow hundreds of TBs to be shipped that quickly.
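The arithmetic behind the portability claim can be sketched as follows. The 300TB figure is an illustrative stand-in for “hundreds of terabytes,” the function names are hypothetical, and the calculation assumes decimal units and a fully utilized link with no protocol overhead, so real transfers would be slower.

```python
BITS_PER_TB = 8 * 10**12      # decimal terabytes
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def required_gbps(capacity_tb: float, hours: float) -> float:
    """Sustained link speed (Gbps) needed to move capacity_tb in `hours`."""
    bits = capacity_tb * BITS_PER_TB
    return bits / (hours * SECONDS_PER_HOUR) / 10**9


def transfer_days(capacity_tb: float, link_gbps: float) -> float:
    """Days needed to move capacity_tb over a fully utilized link."""
    bits = capacity_tb * BITS_PER_TB
    return bits / (link_gbps * 10**9) / SECONDS_PER_DAY


if __name__ == "__main__":
    tb = 300.0  # illustrative value for "hundreds of terabytes"
    print(f"{tb:.0f} TB in 24 hours needs {required_gbps(tb, 24):.1f} Gbps")
    print(f"{tb:.0f} TB over 1 Gbps takes {transfer_days(tb, 1.0):.1f} days")
```

Under these idealized assumptions, moving 300TB in 24 hours needs about 27.8Gbps sustained end to end, and the same data over a dedicated 1Gbps link takes about 27.8 days.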
The Cloud Use Case
Tape media manufacturers, tape library manufacturers and pro-tape IT pundits are quick to point to major cloud providers’ use of tape as part of their backup infrastructure. They say this to justify their belief that every organization should have a tape library. But the reality is that most organizations don’t have a data center that looks like, or is staffed like, that of a major cloud provider. The organization needs to make its own assessment as to whether it makes sense to leverage tape as part of its backup/archive strategy.
Cloud providers have needs that align almost perfectly with the upsides of tape. They have the capacity requirements, they can control expectations through pricing and service levels, they have the staff to manage the tape operations and they need potentially the most protection from cyber attack.
That’s not to say tape is only of interest to cloud providers. Non-cloud organizations just need to evaluate tape more carefully. The reality for most small to large data centers is that they probably can’t store enough data on tape to cost-justify the large upfront investment. For them, even if tape is a less expensive storage option, it will take a long time to realize those savings. That said, there is clearly value in having data stored on a disconnected piece of media, and on a different medium than hard disk. In this use case, it becomes the backup of last resort, and if the organization is out of options, that is a resort worth checking into.
The cheapest technology doesn’t always win, and tape vendors may be barking up the wrong tree by constantly discussing tape’s price advantages. Instead, they need to lead with the advantages of an off-line, disconnected copy and media diversity.
There are clearly advantages to a data protection and data archive architecture that leverages tape, but we live in an instant-gratification world and disk systems have the advantage of delivering that gratification. And for that access and speed the organization may actually pay less, at least initially. The reason to buy tape, except for cloud providers and enterprises, has to be about much more than its being the cheaper storage medium.