OpenStack is a set of tools for building and managing computing platforms for cloud data centers. Public cloud providers have widely adopted OpenStack, and now the toolset is working its way into the enterprise in the form of private clouds. One of the critical decisions these data centers need to make is which of the OpenStack storage modules to use. The choices range from the much-assembly-required OpenStack Swift, Cinder, and Ceph to turnkey solutions like SwiftStack, Red Hat’s Ceph, and storage vendors’ support of Cinder.
Don’t Overthink It
The best advice when trying to determine which storage option is best for OpenStack is not to overthink it. OpenStack is often a greenfield, net-new project in the data center, so while it does provide the opportunity to think differently about storage, IT professionals can still leverage their previous experience. For example, in the traditional environment most databases are stored on block storage, virtual machines often run on block or NAS storage, and backups go to high-density NAS. In OpenStack, databases are stored on block storage, virtual machines run on block or CIFS/NFS storage, and object storage holds file data.
Know What Needs To Be Stored
As a result, the first step in selecting the right storage module for an OpenStack implementation is understanding what needs to be stored. The selection typically comes down to a structured vs. unstructured data decision but, as is typical of a cloud infrastructure, a couple of other considerations factor into that decision.
First, there is the consideration of data distribution. Is all the data going to be accessed from a single location, or will it need to be distributed across many data centers around the world? Second, unstructured data raises its own question: can it be stored as objects, or must a traditional POSIX file system be used? The scalability and durability of object storage will better serve modern applications as well as updated legacy applications.
Many data sets will combine these different data models. In general, for data that is going to be accessed by legacy applications from a single location, block-based options like Ceph and Cinder make the most sense. For unstructured data, an object storage solution like OpenStack Swift or the more turnkey SwiftStack is the likely choice. The exception is a legacy application or device that needs a traditional file system interface such as NFS or CIFS. An advantage of SwiftStack vs. OpenStack Swift is that it natively supports NFS and CIFS.
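The rule of thumb above can be written down as a short Python sketch. This is only an illustration of the reasoning in this section, not a sizing tool; the `Workload` class, its field names, and the returned strings are all hypothetical, and real data sets will often need more than one answer.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a data set (field names are illustrative)."""
    structured: bool           # e.g. databases or virtual machine images
    multi_site: bool           # must the data be distributed across data centers?
    needs_file_protocol: bool  # legacy app or device that only speaks NFS/CIFS


def suggest_storage(w: Workload) -> str:
    """Return a storage suggestion that follows the article's rule of thumb."""
    # Structured data accessed from a single location: block storage.
    if w.structured and not w.multi_site:
        return "block (Ceph or Cinder)"
    # Everything else lands on object storage; NFS/CIFS access is the exception
    # that points to SwiftStack rather than plain OpenStack Swift.
    if w.needs_file_protocol:
        return "object with NFS/CIFS access (SwiftStack)"
    return "object (OpenStack Swift or SwiftStack)"


if __name__ == "__main__":
    print(suggest_storage(Workload(structured=True, multi_site=False,
                                   needs_file_protocol=False)))
```

Treating every non-block case as object storage is a deliberate simplification; it mirrors the article's advice but ignores cost, capacity, and performance targets.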
Block Based Comparison
For applications that will benefit the most from block-based storage, there are two primary choices: Ceph and Cinder.
Sage Weil created Ceph as part of a Ph.D. project at the University of California, Santa Cruz. It started as an open source project in 2004. Weil was CTO at Inktank, a company that sold a subscription-based, commercially supported version of the product. Red Hat purchased Inktank in April of 2014 to beef up its open-source storage offerings.
Cinder was originally a component of the Nova project, which is the codename for the OpenStack Compute service. It was known as “nova-volume” until it broke off into the independent Cinder project, which was first released in the latter part of 2012.
Ceph is primarily for OpenStack administrators who want to build their scale-out storage infrastructure using commodity servers with internal storage. These servers essentially become nodes in the OpenStack infrastructure. Cinder is primarily for existing storage hardware vendors. It allows them to integrate their storage hardware and its software more seamlessly into the OpenStack infrastructure.
There are two likely decision points when trying to decide between the two options. First, how much time does the IT team have to assemble and build their storage hardware solution? If they have the time to evaluate and assemble the hardware that will run the Ceph software, then this solution can deliver acceptable performance at very aggressive price points.
The second decision point is dependent on the type of workload. Ceph, like most scale-out storage solutions, performs better when there are many workloads making many parallel random I/O requests. A solution like Cinder, however, will typically perform better if there are a few workloads making many random or sequential I/O requests.
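As a rough illustration, the two decision points can be combined in a small Python function. The function name, parameter names, and return strings are hypothetical, and the "either" branch reflects that the article does not rank one decision point above the other.

```python
def suggest_block_backend(team_can_build_hardware: bool,
                          many_parallel_workloads: bool) -> str:
    """Combine the two decision points described above into one suggestion."""
    # Both points favor a self-built, scale-out design.
    if team_can_build_hardware and many_parallel_workloads:
        return "Ceph"
    # Neither point favors it: lean on a vendor array integrated via Cinder.
    if not team_can_build_hardware and not many_parallel_workloads:
        return "Cinder with vendor storage"
    # The points disagree, so the team has to weigh build time against workload.
    return "either: weigh build time against workload profile"


if __name__ == "__main__":
    print(suggest_block_backend(team_can_build_hardware=True,
                                many_parallel_workloads=True))
```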
Object Based Comparison
For distributed databases and especially large unstructured workloads, object storage is going to be the go-to choice. While Ceph can provide some object storage capabilities, the two primary options are OpenStack Swift and SwiftStack. Swift began in 2009 as a project at Rackspace, one of the original supporters of OpenStack, to build what is now Rackspace Cloud Files, which replaced the former Mosso CloudFS. Since its formation within OpenStack, the leading contributor to the OpenStack Swift project has been SwiftStack, which also maintains a subscription-based, more enterprise-class product built on OpenStack Swift.
Deciding between OpenStack Swift and SwiftStack is similar to choosing between Cinder and Ceph, except that SwiftStack is entirely software. Being software based means that either choice will require the “building” of the standard server hardware infrastructure. Investing the time in hardware makes sense when building an object storage infrastructure. This justification comes from the need for massive, affordable capacity built with standard servers and HDDs. Also, performance, while important, is less critical in most object storage environments, so there is less pressure to make the perfect hardware selection.
SwiftStack includes features that make it more compelling for the enterprise. These features make it easier to integrate, manage, and monitor the object storage environment. SwiftStack also provides a complete, professional support experience.
For example, SwiftStack provides support for both CIFS and NFS, while OpenStack Swift does not. For many environments this is essential, since the objects they want to store and process come from devices or sensors that can only transfer data via one of these legacy protocols. Storage Switzerland provided a detailed comparison of OpenStack Swift and SwiftStack in its recent article “OpenStack Swift – Should Enterprise Customers DIY or use SwiftStack?”.
Mixing Block and Object
In the end, most data centers will require both block storage and object storage. The block storage will be for databases and virtual machines, while object storage is for the unstructured data that those databases and virtual machines will likely process. Object is also a perfect fit for storing backups of VMs and databases. The good news is that OpenStack provides integration for both types of storage. It also provides options for choosing the type of storage that best fits the use case and the data center’s capabilities. One of the strengths of OpenStack is that the interaction with these different types of storage can all be automated and integrated through the OpenStack management framework.
The choice of which storage type to use should not be another case of paralysis by analysis. The decision-making process is similar to legacy environments. There are a few more options, but those options can be easily compared to the data center’s needs and capabilities. Again, most data centers will end up with both object and block storage. Most enterprises should strongly consider the more turnkey approaches: a storage vendor that can provide integration via Cinder, and a turnkey object storage solution like SwiftStack.
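To make the automation point concrete, here is a minimal Python sketch that provisions both storage types through one connection. It assumes the openstacksdk library, whose connection object exposes `block_storage` and `object_store` proxies; the function name, volume name, size, container name, and the `mycloud` entry are hypothetical placeholders, and error handling is left out.

```python
def provision_storage(conn, volume_name, size_gb, container_name):
    """Create one block volume and one object container via the same connection.

    `conn` is assumed to be an openstacksdk connection, for example:
        import openstack
        conn = openstack.connect(cloud="mycloud")  # hypothetical clouds.yaml entry
    """
    # Block storage service: a volume for a database or virtual machine.
    volume = conn.block_storage.create_volume(name=volume_name, size=size_gb)
    # Object storage service: a container for unstructured data or backups.
    container = conn.object_store.create_container(name=container_name)
    return volume, container
```

Passing the connection in as an argument keeps the sketch independent of any particular cloud, and lets it be exercised against a stand-in object before it is pointed at a real deployment.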