One of the things I like about participating in the forums on Spiceworks is that I get great, raw questions. One came in this week during a discussion around object storage: “I’d like to consider myself reasonably clued up, but I don’t even really know what object storage is vs. traditional block or file, let alone when you’d use it. When should I start looking at object storage? We have about 30 TB of capacity.” In other words, is object storage for normal data centers?
That is a very fair question, and one the object storage industry struggles to answer crisply. Let’s cover the first part of the question: “What is object storage?” Object storage at its most basic is an alternative means of organizing and accessing data. Instead of using the classic file directory structure, you reference each file (or object) by a unique number. The analogy I like to give is the dry cleaner. If dry cleaners didn’t give you a ticket with a number, they would have to find your clothes by your last name and then your first name, which for a dry cleaner of any size would mean a long wait for you. Instead, they give you a ticket with a number on it, and that number gives them the exact location of your clothes on the conveyor belt. If you lose the ticket, they can look the number up. Much faster.
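The ticket idea can be sketched in a few lines of Python. This is only a toy illustration, not how any particular product works: the class name FlatObjectStore and its methods are made up for this example, and a real system would spread objects across many nodes rather than hold them in one in-memory dictionary.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store (hypothetical): one flat namespace, no directories."""

    def __init__(self):
        # Maps object ID -> (data, metadata). There is no folder tree to walk.
        self._objects = {}

    def put(self, data, **metadata):
        """Store the data and hand back a generated ID (the "ticket number")."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, metadata)
        return object_id

    def get(self, object_id):
        """Fetch the data directly by ID, with no path lookup."""
        return self._objects[object_id][0]

    def metadata(self, object_id):
        """Return a copy of the metadata stored alongside the object."""
        return dict(self._objects[object_id][1])
```

The caller keeps the returned ID and presents it later, much as you would hand over a dry cleaning ticket. The metadata kept with each object plays the role of looking up a lost ticket by name.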
The second part of your question, “who needs it”, is even more interesting. The truth is, not many organizations do today. A standard NAS or a scale-out NAS is good enough for most. But there are a few reasons that organizations should consider it, especially if their file data is larger than 30 or so TBs, like our questioner’s. The first reason, and one that most object storage vendors overplay, is single-volume capacity, which for object storage is limitless in theory and certainly reaches dozens if not hundreds of PBs. The problem is that the “devil” you know, NAS, is quite capable and can also scale to multi-PB capacities per volume.
The second reason, and another popular one among object storage vendors, is massive file (or object) counts. For very large environments, file count is a problem. For example, I work with a well-known online photo sharing site that stores 70 PBs of data. The sheer number of files is more than most NAS systems can handle. Reality check: most data centers don’t yet need hundreds of PBs of capacity or need to store trillions of files.
Why Do You Really Need Object Storage?
The first reason in my opinion that object storage makes sense for data centers (even though a NAS can handle the capacity and file count demand) is reliability. Object storage systems are typically built from a group of clustered servers or storage nodes. They can be set to survive two or more node failures and still serve data uninterrupted.
Even more important, object storage is not dependent on RAID 5, RAID 6, or even RAID 10. Rebuild times with these techniques can be measured in days, especially with the high-capacity drives that you would want to use to store so much data. Object storage leverages either replication or erasure coding (or sometimes both) for protection from drive failures. A rebuild does not have to recreate the entire drive, only the data that was actually stored on it.
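To make erasure coding a little more concrete, here is a minimal Python sketch of the simplest possible scheme: split an object into k fragments and add one XOR parity fragment, so that any single lost fragment can be recomputed from the survivors. The function names are invented for this illustration, and this is an assumption-laden toy. Production systems generally use stronger codes that survive several simultaneous failures.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings together."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # zero-pad so fragments are equal length
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def rebuild(fragments, missing):
    """Recompute the fragment at index `missing` from all the other fragments."""
    survivors = [f for i, f in enumerate(fragments) if i != missing]
    return reduce(xor_bytes, survivors)
```

If each fragment lives on a different drive or node, losing one drive means recomputing only the fragments that drive held, object by object, rather than reconstructing every block of the drive regardless of whether it held data.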
The other reason is cost. Most object storage systems can be built from commodity, off-the-shelf servers using commodity, off-the-shelf drives. And thanks to the reliability described above, this can be done safely.
The big objection to object storage is accessibility. It doesn’t use CIFS, SMB or NFS like a NAS does. Ideally, you would alter your applications to interface with the object storage system via its RESTful interface. But almost every object storage vendor provides a gateway that allows more standard protocol access without having to recode your application. This allows you to get started without altering your applications or learning a new interface. Many data centers use object storage via one of these gateway services and have no plans to change.
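For a sense of what talking to a RESTful interface looks like, the Python sketch below builds (but does not send) the HTTP requests for storing and retrieving an object. Everything specific here is an assumption for illustration: the endpoint objects.example.com, the bucket/key URL layout, and the helper names. Real services differ in URL format and require authentication, so treat this as the general shape only.

```python
import urllib.request

# Hypothetical endpoint; a real object store publishes its own URL scheme.
BASE_URL = "https://objects.example.com"


def build_put(bucket, key, data):
    """Build an HTTP PUT request that would store `data` under bucket/key."""
    return urllib.request.Request(
        f"{BASE_URL}/{bucket}/{key}",
        data=data,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


def build_get(bucket, key):
    """Build an HTTP GET request that would retrieve the object at bucket/key."""
    return urllib.request.Request(f"{BASE_URL}/{bucket}/{key}", method="GET")
```

Against a real endpoint, passing one of these requests to urllib.request.urlopen would perform the transfer. A gateway hides this exchange behind a familiar file share, which is why applications do not need to be recoded to get started.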
Object storage is not for every data center, but it can be valuable infrastructure for more data centers than most people think. It solves the reliability and cost issues that many data centers, regardless of size, face today. If you are over a couple dozen TBs, you should take the time to learn about object storage and see if it is for you.
Storage Switzerland is the place to learn about object storage. Simply type “Object” into the search box at the top of this page and you will see over a dozen search results. But the articles below are a great place to start: