Facebook recently put out a video in which an engineering manager shows off a new optical jukebox the company has designed to hold BluRay DVDs that store “cold data”. According to the company, this is users’ oldest pictures and videos, which consume hard disk space and power but most likely won’t ever be viewed.
This is what I like to call the “Andy Rooney Archive” – from a comment the old “60 Minutes” columnist made years before anybody had ever heard of “Big Data”. He said “you never throw anything away until you make a copy of it.” That’s a pretty good description of this kind of data. It’s like the box holding your spouse’s stuff in the garage that they have probably forgotten about, but will remember as soon as you put it on the Goodwill truck. Facebook probably figures it’s safer to stick these files on a shelf in their garage (or their DVD library) than to chance deleting them.
I agree that spinning disk is no place for this kind of data, and at first blush a zero-power medium like optical makes economic sense. In fact, Facebook’s calculations show it will save them 50% in storage costs and 80% in energy costs over their existing ‘cold’ hard disk storage. My question is “why DVDs?”
Why not Tape?
Facebook mentioned using 100GB disks, but even if these are available (BluRay.com only lists 50GB dual-layer disks), this technology still makes no sense compared to LTO tape. It’s more expensive than LTO6, and the gap will widen considerably with the upcoming LTO7 and LTO8 generations. It’s also less dense on a TB/frame basis, so a DVD archive implementation will consume significantly more data center floor space. And then there’s the issue of a non-manufacturer like Facebook getting into the hardware design business at this level.
Using the 100GB DVD assumption, a rack of 10,000 disks would hold 1PB of data. All comparisons are uncompressed since pictures and videos don’t compress well with any data reduction process. A tape library frame full of LTO6 cartridges would give you just north of 2PB (~900 tapes * 2.4TB per cartridge). That’s twice the storage density, again, assuming 100GB DVDs and currently available LTO6.
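The density comparison above is simple enough to check in a few lines. This is only a sketch using the figures quoted in this article (10,000 disks per rack, an assumed 100GB disk, roughly 900 tapes per frame at 2.4TB each) with decimal units. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope density check using the figures quoted in the article.
# Decimal units: 1 PB = 1,000 TB = 1,000,000 GB.
DISCS_PER_RACK = 10_000   # article's rack size
DISC_GB = 100             # assumed 100GB disk (not yet available)
TAPES_PER_FRAME = 900     # approximate frame slot count used in the article
TAPE_TB = 2.4             # LTO6 capacity figure used in the article

def optical_rack_pb(discs=DISCS_PER_RACK, disc_gb=DISC_GB):
    """Uncompressed capacity of one optical rack, in PB."""
    return discs * disc_gb / 1_000_000

def tape_frame_pb(tapes=TAPES_PER_FRAME, tape_tb=TAPE_TB):
    """Uncompressed capacity of one tape library frame, in PB."""
    return tapes * tape_tb / 1_000

optical = optical_rack_pb()
tape = tape_frame_pb()
print(f"optical rack: {optical:.2f} PB")
print(f"LTO6 frame:   {tape:.2f} PB")
print(f"tape holds {tape / optical:.2f}x as much per frame")
```

Swapping `DISC_GB` to 50 shows how much wider the gap is with the disks that can actually be bought today.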
100GB DVDs are not yet available, and 50GB disks show up on retail websites for a little more than $10, so we’ll assume Facebook can get them for $5. That puts the cost of a 10,000-disk frame at $50,000, for 1/2PB of data. LTO6 comes in at ~$75 per cartridge, so let’s assume Facebook could get them for $50. That means a frame of LTO6 tapes (Spectra Logic’s T-Finity, as an example, holds 920 tapes) would cost $46,000, for 2PB of data. Bottom line: LTO6 is roughly 1/4 the price per GB of currently available DVDs.
If we assume Facebook can get 100GB DVDs for the same price, they’re still twice the cost of LTO6. When LTO7 comes out, that frame of LTO tapes would hold almost three times more (6.4TB/cartridge), and LTO8 will double that again. Assuming cartridge prices stay about the same, this puts LTO7 at roughly 1/6 the cost of DVDs (the 100GB versions that aren’t yet available) and LTO8 at 1/12 the cost. And then there’s the question of roadmap. The LTO consortium has done an excellent job turning out new generations every couple of years, each with committed-to increases in capacity and performance. Can Facebook rely on the DVD industry?
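For anyone who wants to rerun the cost comparison with their own prices, here is a rough sketch of the math. Every number is one of this article’s assumptions: $5 per disk, $50 per cartridge, 920 tapes per frame, and the roadmap capacities for LTO7 and LTO8. Holding the cartridge price at $50 for future generations is also just an assumption, and the helper name `cost_per_pb` is made up for illustration.

```python
# Media cost per PB under the article's assumed prices (uncompressed, decimal units).
def cost_per_pb(units, unit_price_usd, unit_capacity_gb):
    """Total media cost divided by total capacity in PB."""
    total_cost = units * unit_price_usd
    total_pb = units * unit_capacity_gb / 1_000_000
    return total_cost / total_pb

scenarios = {
    "DVD 50GB":  cost_per_pb(10_000, 5, 50),     # available today
    "DVD 100GB": cost_per_pb(10_000, 5, 100),    # assumed, same $5 price
    "LTO6":      cost_per_pb(920, 50, 2_400),    # article's 2.4TB figure
    "LTO7":      cost_per_pb(920, 50, 6_400),    # roadmap capacity, price assumed flat
    "LTO8":      cost_per_pb(920, 50, 12_800),   # roadmap capacity, price assumed flat
}

baseline = scenarios["DVD 100GB"]
for name, usd in scenarios.items():
    print(f"{name:10s} ${usd:>10,.0f} per PB  ({usd / baseline:.2f}x the 100GB DVD cost)")
```

The ratios land close to the rounded figures in the text. They differ slightly because the text rounds the LTO6 frame to an even 2PB.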
LTO tape is alive and well, and actually thriving in this new world of Big Data Archives and cloud storage. If Netflix and other streaming services continue their march, Facebook may be the only DVD consumer in the world. And what about BluRay drives? This may be a bigger issue than simply higher media costs, as keeping an otherwise obsolete technology alive is difficult (and expensive). Just ask the US government about the cost and complexity of using technology that’s years out of date.
Building a robot?
Building a viable robot is not an easy do-it-yourself project, even for a company with Facebook’s resources. It’s true, they’ve figured out how to design and build (or have built for them) scale-out storage servers. But designing and building a robot that’s robust, reliable and still cost-effective will be a tougher challenge than engineering data electronics using off-the-shelf storage and server hardware.
Storage Swiss Take
This isn’t a hit piece on Facebook for trying to find an answer to what’s undoubtedly a pressing issue. Every time a new generation of iPhone comes out with a higher-resolution camera, the disk drive vendors open more champagne, since they know every Instagram user will now consume that much more capacity. We agree that long-term storage of archive data is best left to technologies that are denser than HDDs and don’t consume power. We just don’t understand why they’re not using a medium like tape that’s more economical, more proven and more likely to be around in 5 or 10 years.