Like a lot of technologies that are past their prime, RAID will continue to serve a function for a very long time. It’s an inexpensive way to string together multiple drives and protect yourself against the loss of one or more of them. There is a huge amount of inertia behind RAID, volumes, and file systems. Let’s face it: the IT world understands them.
But when it comes to storing petabytes of unstructured data, RAID systems have more limitations than advantages. The biggest issue is the rebuild time of multi-terabyte drives, which is measured in days, and unfortunately drives are only getting bigger. If you’re running RAID 3, 4, or 5 and you lose a second drive during the rebuild, you lose data. The whole point of RAID 6 is to mitigate the risk of a double-disk failure during a multi-day rebuild. But it is still possible to lose three drives in the same RAID array, and RAID 6 won’t help you if that happens during your rebuild. And of course the array’s performance is degraded for the entire rebuild.
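To put the rebuild window in perspective, here is a back-of-envelope estimate of the chance that a second drive fails mid-rebuild. The annual failure rate, rebuild time, and array size are illustrative assumptions, and the calculation treats drive failures as independent, which field data suggests is optimistic:

```python
# Rough sketch of why long rebuilds matter: the chance that a second
# drive in the same array fails during the rebuild window.
AFR = 0.02               # annual failure rate per drive (assumed)
REBUILD_DAYS = 3         # multi-terabyte rebuilds can take days
SURVIVING_DRIVES = 11    # e.g. a 12-drive RAID group after one loss

p_one = AFR * REBUILD_DAYS / 365                 # per-drive risk in the window
p_second = 1 - (1 - p_one) ** SURVIVING_DRIVES   # any surviving drive failing

print(f"P(second failure during rebuild) = {p_second:.2%}")
```

Under these assumptions the risk per rebuild is small, but multiply it across hundreds of arrays and years of operation and it becomes an expected event rather than a freak one.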
Another issue not addressed by RAID is the unrecoverable bit error rate of SATA drives, which is typically specified as one error per 10^14 bits read, or roughly one error per 12.5 TB read. Think about that: with 10 TB drives now shipping in quantity, you can expect an unrecoverable error nearly every time you read a full drive. With this in mind, IT professionals really need to consider spreading the risk across more than a couple of disk drives.
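Here’s a quick way to see what that spec means for a full-drive read; the drive size is an illustrative assumption:

```python
# Back-of-envelope: probability of hitting at least one unrecoverable
# read error (URE) while reading a full drive end to end, assuming the
# commonly quoted SATA spec of one URE per 1e14 bits read.
URE_RATE = 1e-14             # errors per bit read (vendor spec)
DRIVE_TB = 10                # hypothetical drive size in terabytes

bits_read = DRIVE_TB * 1e12 * 8          # total bits in a full-drive read
p_clean = (1 - URE_RATE) ** bits_read    # chance of a completely clean read
p_error = 1 - p_clean

print(f"P(at least one URE on a full {DRIVE_TB} TB read) = {p_error:.0%}")
```

At the 10^-14 spec, a full read of a 10 TB drive has better-than-even odds of hitting an unrecoverable error, which is exactly what a RAID rebuild has to do across every surviving drive.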
There is also nothing in RAID that deals with geographic dispersion. It is up to some higher-level process to get the data to another location, and that process is complicated at best. It’s also extremely expensive. It forces you to make a number of complicated decisions, such as whether to replicate synchronously or asynchronously, whether to replicate the entire volume or just part of it, and how many different locations to replicate to. And since replication faithfully replicates human errors and corruption, you’ll also need some type of version control or backup software.
Enter Object Storage
Object Storage takes away a lot of those questions and addresses the human error issue as well. Here’s how it works: treat data as objects and specify what type of protection each object should have. Want to be able to survive the simultaneous loss of three different data centers? No problem. Just configure your object storage system to do that.
Object storage systems do this in one of two ways: by keeping multiple copies (replication) or by using erasure coding. Erasure coding is a parity-based method of protecting data, but that’s where its similarity to RAID ends. Explaining its details is beyond the scope of this blog post, but suffice it to say that it uses a parity scheme that spans geographic locations (without the issues RAID has) and ensures that each object can be read from at least n locations. The multiple-copies method is easier to understand: it simply makes sure each object is copied to n locations. If you lose one or more of those locations (or a node within a location), the system copies the missing objects to another location, using one of the surviving locations as the source.
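The multiple-copies repair loop can be sketched as a toy model; the location names, layout, and repair logic here are purely illustrative, not any particular product’s behavior:

```python
# Toy model of copy-count repair: each object should exist at n
# locations; if a location drops out, missing copies are re-replicated
# using any surviving location as the source.
TARGET_COPIES = 3
locations = {
    "dc-east":  {"obj-1", "obj-2"},
    "dc-west":  {"obj-1", "obj-2"},
    "dc-south": {"obj-1", "obj-2"},
}

def repair(locations, target=TARGET_COPIES):
    """Bring every object back up to `target` copies."""
    all_objects = set().union(*locations.values())
    for obj in all_objects:
        holders = [loc for loc in locations if obj in locations[loc]]
        spares = [loc for loc in locations if obj not in locations[loc]]
        while len(holders) < target and spares:
            dest = spares.pop()
            locations[dest].add(obj)   # copy from any surviving holder
            holders.append(dest)

# Simulate losing an entire data center, then bringing a new one online.
del locations["dc-west"]
locations["dc-backup"] = set()
repair(locations)
print(all(sum(obj in objs for objs in locations.values()) == TARGET_COPIES
          for obj in ("obj-1", "obj-2")))   # prints True
```

The key point is that repair is object-by-object and source-agnostic: there is no single array to rebuild, so no multi-day window where one more failure loses data.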
User error, corruption, and malicious attacks are easier to overcome as well. Each new version of an object creates a new object, and previous versions are still available. A restore can be as simple as changing a pointer to the older version of the affected object(s).
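A toy model makes the pointer trick concrete; this is an illustrative sketch, not any vendor’s actual implementation:

```python
# Sketch of version-chain restore: every write creates a new immutable
# version, and "restore" just moves the current pointer back.
class VersionedObject:
    def __init__(self):
        self.versions = []    # immutable history, oldest first
        self.current = None   # index of the live version

    def put(self, data):
        self.versions.append(data)
        self.current = len(self.versions) - 1

    def get(self):
        return self.versions[self.current]

    def restore(self, version_index):
        # No data is copied: the restore is just a pointer change.
        self.current = version_index

doc = VersionedObject()
doc.put("v1: original report")
doc.put("v2: ransomware-encrypted garbage")   # corruption arrives
doc.restore(0)                                # roll back instantly
print(doc.get())   # prints "v1: original report"
```

Because the bad write never overwrote anything, recovery costs a pointer update rather than a restore from backup media.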
Object storage systems are also effectively infinitely scalable, since there are no volumes or file systems to manage. There are no forklift upgrades needed to migrate large amounts of data: simply insert a new node into the system, replicate objects onto it, and retire the old node.
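That upgrade path fits in a few lines; the node names and cluster layout here are illustrative:

```python
# Sketch of the insert-replicate-retire upgrade path: objects drain
# from the old node to the new one, with no volume or file system
# to migrate underneath them.
cluster = {
    "old-node": {"obj-1", "obj-2", "obj-3"},
    "node-b":   {"obj-4"},
}

def retire(cluster, old, new):
    cluster[new] = set(cluster[old])   # insert new node, replicate onto it
    del cluster[old]                   # retire the old node

retire(cluster, "old-node", "new-node")
print(sorted(cluster))   # prints ['new-node', 'node-b']
```

In a real system the replication step runs in the background while both nodes serve reads, so the swap is invisible to applications.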
Object storage may not have killed RAID, but it sure makes RAID look ill-prepared for the future.