In previous columns we’ve discussed what exactly object storage is (and is not) and what advantages this technology brings to the storage infrastructure. It’s more scalable than a NAS system, more economical than a traditional RAID-based storage array and able to provide better data protection than even storage systems that create multiple copies of data. But where are object storage systems being deployed today?
Big Data Archive
Object storage is ideal for replacing scale-out NAS in large, file-based archives, providing a cost-effective solution for petabyte-scale data sets. Its ability to support a file system layer and its capacity for nearly limitless expansion within a single namespace are very appealing for many use cases. Examples include companies that must maintain digital content libraries that can grow extremely large but still need faster access than tape can deliver. And thanks to recent developments, tape now supports REST, allowing it to augment object storage infrastructures as a high-capacity and/or long-term archive tier.
In addition to scalability, cloud storage and backup providers need an infrastructure that’s also resilient, since the data they hold is often the ‘copy of last resort’. But it must also be cost-effective. Compared with traditional protection methods like RAID and replication, object storage systems that leverage erasure coding are much more capacity-efficient, which keeps per-GB costs down. They’re also more effective at protecting data, since they can survive multiple simultaneous failures without losing information. And, by geographically dispersing data objects, these systems can provide multi-site DR protection as well.
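To put rough numbers on the capacity-efficiency argument, here is a minimal sketch that compares the raw capacity needed under replication with that needed under erasure coding. The function names and the 10+4 fragment layout are our own illustrative assumptions, not the parameters of any particular product, and the arithmetic ignores metadata and rebuild headroom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionProfile:
    name: str
    raw_per_usable: float    # raw capacity consumed per unit of usable data
    failures_tolerated: int  # simultaneous losses survivable without data loss


def replication_profile(copies: int) -> ProtectionProfile:
    """Keep `copies` full copies; any copies - 1 of them can be lost."""
    return ProtectionProfile(f"{copies}-way replication", float(copies), copies - 1)


def erasure_profile(k: int, m: int) -> ProtectionProfile:
    """Split each object into k data fragments plus m coded fragments.

    Any k of the k + m fragments are enough to rebuild the object,
    so up to m fragments can be lost.
    """
    return ProtectionProfile(f"{k}+{m} erasure coding", (k + m) / k, m)


def raw_capacity_needed(usable_tb: float, profile: ProtectionProfile) -> float:
    """Raw terabytes required to hold `usable_tb` of protected data."""
    return usable_tb * profile.raw_per_usable


if __name__ == "__main__":
    usable_tb = 1000.0  # hypothetical 1 PB archive
    # The 10+4 layout is an illustrative assumption.
    for profile in (replication_profile(3), erasure_profile(10, 4)):
        raw = raw_capacity_needed(usable_tb, profile)
        print(f"{profile.name}: {raw:.0f} TB raw, "
              f"survives {profile.failures_tolerated} failures")
```

Under these assumptions, three-way replication needs three units of raw capacity per unit of data and survives two losses, while the 10+4 layout needs 1.4 units and survives four. That gap is where the per-GB savings come from.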
Object storage is also becoming the predominant architecture for web-scale environments like social media sites and web-based enterprises that handle file data. To support near-unlimited capacity requirements, commodity, scale-out hardware is the norm, and as a software-based technology, object storage fits right in. It can be integrated with low-cost, modular hardware and purchased as turnkey storage nodes, as Shutterfly has done, or used to create a storage infrastructure from the ground up, as Amazon and Google have done.
Science and Intel
For the same reasons it appeals in the commercial use cases mentioned above (scalability, economics, file orientation, REST connectivity, etc.), object storage is finding its way into the public sector as well. Its impressive list of capabilities is making object storage the standard for projects that create very large unstructured data sets in areas such as radio astronomy, life sciences, remote sensing and intelligence, to name a few examples. And, not surprisingly, open source software such as Ceph and OpenStack is becoming more common as well.
Big Data Analytics
Distributed processing technologies like Hadoop and NoSQL are becoming more common in use cases that involve data ingest, data mining and real-time analytics. With erasure coding, object storage can provide data protection without creating multiple copies of each file, as the Hadoop Distributed File System (HDFS) does, making it an attractive replacement for HDFS. And as a software solution, it can be integrated into scale-out storage nodes that have the compute power to also serve as processing nodes. The open source object storage solutions mentioned above are a good fit for many Hadoop environments as well.
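To show how data can be protected without keeping full copies, here is a minimal sketch of the simplest possible erasure code: k data fragments plus a single XOR parity fragment. This is a deliberately simplified stand-in. It tolerates only one lost fragment, whereas production object stores typically use Reed-Solomon-style codes that tolerate several. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(frag_len * k, b"\0")      # zero-pad the final fragment
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, frags)
    return frags + [parity]


def decode(fragments: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the original data when at most one fragment (None) is missing."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single-parity code can only recover one lost fragment")
    repaired = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        # XOR of all surviving fragments reproduces the missing one.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(repaired[:-1])[:original_len]  # drop parity and padding


if __name__ == "__main__":
    payload = b"sensor readings ingested for analytics"
    stored = encode(payload, k=4)   # 5 fragments, each on a different node
    stored[2] = None                # simulate losing one node
    assert decode(stored, len(payload)) == payload
```

With k = 4, the five fragments consume 1.25 times the size of the original data, compared with 3 times for HDFS's default of three replicas, and the object still survives the loss of any one fragment.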
Storage Swiss Take
As a storage architecture, there’s a lot to like about object storage, a conclusion supported by the growing list of organizations adopting it in both the public and private sectors. Designed primarily to support data access by computers instead of people, object storage systems combine the best of SAN and NAS technologies. To date, the primary use cases have been those that require scalability into the petabyte range and a cost structure that organizations can still build a business around. But vendors are now coming out with object-based storage systems with capacities as low as a few tens of terabytes, something we expect to see more of. With tape and now disk drives being designed with a REST interface, object storage is poised to be the future of cost-effective, highly scalable, long-term file storage in many different environments.