As my colleague, George Crump, discussed in a previous article, “What is Better than Cloud Storage for Cold Data”, cloud storage is great for processing active data but becomes increasingly expensive for storing cold data that is seldom accessed. We have previously examined a few weaknesses of cloud storage, such as latency and bandwidth issues. Until now, however, we have not examined in any detail what it actually costs to store large quantities of cold and archive data in the cloud long term, or to retrieve any of that archived data. There is a reason that many organizations are now starting to question their decision to store large quantities of cold and archive data in the cloud long term.
Doing the Math
First, we will examine what a relatively large amount of cold data archives would cost if stored long term in the cloud.
Checking long-term storage pricing from several major Cloud Service Providers (CSPs), we find that the lowest price is $.007 per GB per month for archive class storage.
If an organization needed to store just 100TB of data in the cloud for the next 10 years, what would it cost? At $.007/GB per month, the numbers would look like this:
- Average Monthly Cost for 100TB stored in the cloud is $700.00
- Annual Cost for 100TB: $700.00 x 12 = $8,400.00
- Cost for 100TB for 5 years: $8,400.00 x 5 = $42,000.00
- Cost for 100TB for 10 years: $8,400.00 x 10 = $84,000.00
If the same organization needed to store 500TB of data in the cloud, the numbers would look like this:
- Average Monthly Cost for 500TB stored in the cloud would now be $3,500.00
- Annual Cost for 500TB would be $42,000.00
- At 5 years, the same 500TB would have cost $210,000.00
- At 10 years, that 500TB would have cost $420,000.00
Additionally, if that same organization needed to store 1PB of data in the cloud, the costs would be:
- Average Monthly Cost for 1PB stored in the cloud would be $7,000.00
- Annual Cost for 1PB would be $84,000.00
- At 5 years, the same 1PB would have cost $420,000.00
- At 10 years, that 1PB would have cost $840,000.00
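The figures above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the $.007 per GB per month price quoted earlier, decimal units (1TB = 1,000GB, which is what the figures imply), flat capacity, and no retrieval or transfer fees. The function name is illustrative only.

```python
from decimal import Decimal

PRICE_PER_GB_MONTH = Decimal("0.007")  # archive-class price quoted in the article
GB_PER_TB = 1000  # assumption: decimal units, matching the article's figures


def cloud_storage_cost(terabytes: int, years: int) -> Decimal:
    """Flat storage cost: no data growth, no retrieval or transfer fees."""
    monthly = terabytes * GB_PER_TB * PRICE_PER_GB_MONTH
    return monthly * 12 * years


# 100TB, 500TB and 1PB (1,000TB) at 1, 5 and 10 years
for tb in (100, 500, 1000):
    costs = [cloud_storage_cost(tb, y) for y in (1, 5, 10)]
    print(f"{tb}TB:", ", ".join(f"${c:,.2f}" for c in costs))
```

Using Decimal rather than floating point keeps the dollar amounts exact.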
These base costs are simply for storing the data quantities listed above and do not take into account that data will typically continue to grow 50% or more each year at a compounded rate. Also not included in these costs are other additional transaction costs that are incurred when data is accessed, deleted, downloaded or transferred to another region or location.
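To get a feel for what compounded growth does to the bill, here is a rough sketch. It assumes a simplified model that is not taken from any provider's pricing: capacity is constant within each year and grows by a fixed percentage at year end, the price stays at $.007 per GB per month, and transaction fees are still excluded. The function and parameter names are hypothetical.

```python
from decimal import Decimal

PRICE_PER_GB_MONTH = Decimal("0.007")  # archive-class price quoted in the article
GB_PER_TB = 1000  # assumption: decimal units


def cost_with_growth(start_tb: int, years: int,
                     annual_growth: Decimal = Decimal("0.5")) -> Decimal:
    """Cumulative storage cost when stored capacity compounds once per year."""
    total = Decimal(0)
    tb = Decimal(start_tb)
    for _ in range(years):
        total += tb * GB_PER_TB * PRICE_PER_GB_MONTH * 12  # one year at current size
        tb *= 1 + annual_growth  # assumed: growth is applied at year end
    return total


flat = cost_with_growth(100, 5, Decimal("0"))
grown = cost_with_growth(100, 5)
print(f"100TB over 5 years, no growth:  ${flat:,.2f}")
print(f"100TB over 5 years, 50% growth: ${grown:,.2f}")
```

Under these assumptions, 100TB growing 50% per year costs $110,775.00 over 5 years, compared with $42,000.00 if the data set never grows.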
Costs to retrieve archive data are often complicated to compute and can result in substantial charges depending on the quantity of data to retrieve, how the data was stored (individual files or large archive files) and how it is retrieved.
Ordinarily, cold and archive data is seldom accessed, but unexpected e-discovery requests triggered by litigation from private or government entities can result in the need to retrieve significant amounts of data in a very short time frame. Additionally, many organizations are beginning to see potential value in their historical data, and this is leading to data mining operations to monetize that historical data, which can also require the retrieval of substantial amounts of data.
There are also other considerations, such as cloud-side encryption, who controls the encryption keys, the actual chain of custody of your data, and security. Data that is stored electronically on any network is always potentially susceptible to hacking attacks.
A good example of this from a couple of years ago was Code Spaces, a code hosting services company that was well known in the IaaS (Infrastructure as a Service)/DevOps community and that was forced out of business by a hacking attack. This company was state of the art in 2014 and had the bulk of its resources and infrastructure fully in the cloud. This included its backups, some of which it considered off-site because multiple copies were distributed throughout its CSP’s network. However, a hacker was able to gain control of its CSP account, and when Code Spaces tried to regain control, the hacker proceeded to destroy almost all of its data, machine configurations, virtual machines and backups. The backups may have been off-site, but they were still online. Unable to recover its data, the company was forced out of business. This incident underscores the importance of having off-line backups of your data as well as proper security and backup protocols.
But What about Tape?
A check of the web shows a current average price for LTO-6 tapes of approximately $30.00 each. At the native capacity of 2.5TB per tape, this makes the cost of tape approximately $.012 per GB, a one-time charge that is less than two months of cloud storage at $.007 per GB per month. That points to the key difference between tape and cloud storage. The cost of cloud storage is a recurring one that you keep paying month after month, year after year, decade after decade, but the acquisition cost for tape is a one-time expense.
Storing 1PB of data on LTO-6 tapes at their native capacity of 2.5TB each would require 400 tapes. At $30 each, that would be a onetime cost of $12,000.00. At the compressed capacity of 6.25TB each, you would only need 160 tapes for a onetime cost of $4,800.00.
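The tape counts and media costs above follow from a simple division, sketched below. The sketch assumes decimal units (1PB = 1,000TB) and the $30 per tape price quoted earlier. Whether the compressed capacity is actually achieved depends on how compressible the data is. The function names are illustrative only.

```python
import math
from decimal import Decimal

TAPE_PRICE = Decimal("30.00")    # approximate LTO-6 price quoted in the article
NATIVE_TB = Decimal("2.5")       # LTO-6 native capacity per tape
COMPRESSED_TB = Decimal("6.25")  # LTO-6 compressed capacity per tape


def tapes_needed(data_tb: int, tape_tb: Decimal) -> int:
    """Number of whole tapes required, rounding up any partial tape."""
    return math.ceil(Decimal(data_tb) / tape_tb)


def media_cost(data_tb: int, tape_tb: Decimal) -> Decimal:
    """One-time media purchase cost for the given data set."""
    return tapes_needed(data_tb, tape_tb) * TAPE_PRICE


for label, capacity in (("native", NATIVE_TB), ("compressed", COMPRESSED_TB)):
    count = tapes_needed(1000, capacity)
    print(f"1PB at {label} capacity: {count} tapes, ${media_cost(1000, capacity):,.2f}")
```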
Another cost factor with tape would be for offsite vaulting at a vendor that provides a secure climate controlled facility with proper chain of custody safeguards. The cost of this type of service will vary depending on the vendor, number of tapes being stored, how they are stored, and charges for each pickup and delivery trip as well as any fuel surcharges.
Using figures for one secure off-site vaulting service, storing our 1PB of data on 160 LTO-6 tapes (the count at compressed capacity), in containers, with one pickup/delivery trip per day, yielded the following costs:
- Storing 160 containerized tapes at $0.89 per tape is 160 tapes x $0.89 = $142.40 per month
- Storing 160 containerized tapes for one year would cost $142.40 x 12 = $1,708.80 per year
- Storing 160 containerized tapes for 5 years would cost $1,708.80 x 5 = $8,544.00
- The cost at 10 years would be $1,708.80 x 10 = $17,088.00
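The vaulting arithmetic is the per-tape monthly rate multiplied out over the storage period, and can be checked with a short sketch. It assumes the $0.89 per tape per month rate from the one vendor quoted above and a constant tape count. Pickup and delivery charges and fuel surcharges are not modeled. The function name is hypothetical.

```python
from decimal import Decimal

RATE_PER_TAPE_MONTH = Decimal("0.89")  # rate quoted for one vaulting vendor


def vaulting_cost(tapes: int, years: int) -> Decimal:
    """Storage-only vaulting cost; trip and fuel charges are excluded."""
    return tapes * RATE_PER_TAPE_MONTH * 12 * years


print(f"Monthly: ${160 * RATE_PER_TAPE_MONTH:,.2f}")
for years in (1, 5, 10):
    print(f"{years} year(s): ${vaulting_cost(160, years):,.2f}")
```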
Additional ongoing costs that would also need to be considered are for tape hardware and maintenance as well as new media. Since LTO-6 has been out for a few years now, most if not all organizations using tape libraries have already upgraded their drives so the main cost here would be for maintenance on their existing hardware. These costs will vary by organization depending on their service providers and the hardware being covered and response times the organization requires.
Another soft cost will be for the worker that handles loading and rotation of tapes to and from the library and the vaulting facility. This will also vary by organization depending on their hardware setup and tape rotation requirements.
Something else to consider is that well-established large organizations and some of the larger SMBs (Small and Medium-sized Businesses) that have been around for a decade or more and have large quantities of data to store also have data centers, infrastructure and storage, including tape libraries and cloud gateway appliances. They have therefore already invested in the very things that cloud storage would spare an organization that does not have them. These organizations are also unlikely to rip out and throw away all this equipment.
Tape today provides very compelling features such as low acquisition cost, backward compatibility, scalability, high performance, longevity, high capacity, and portability. The LTFS open tape format and backward compatibility with earlier LTO versions help ensure that you will be able to read and restore data in the future without the need for proprietary applications.
More importantly, tape can also provide a final line of defense against hacking attacks and data corruption that may affect copies of data stored on disk in the enterprise or in the cloud.
Ultimately, each organization will need to examine closely the costs of storing their cold data and archive data locally versus in the cloud to determine which strategy, “rent vs buy”, will be the most cost effective for them.
Sponsored by Fujifilm Dternity, Powered by StrongBox