The easy button for OpenStack Swift gets better

Earlier this year we compared the work involved in building a storage infrastructure around OpenStack Swift with the work involved in using SwiftStack. While the names OpenStack Swift and SwiftStack are similar, the process of getting to the point where you can write data is not. Deploying OpenStack Swift can be a tedious process that requires a lot of time and patience, whereas SwiftStack is a much more turnkey experience.

The OpenStack framework has developed the unfortunate reputation of being the framework to use “if you have more time than money.” Given the reality of IT staffing, that is often a non-starter for enterprises, which frequently have neither. This is the business case for SwiftStack: to bring automated storage deployment and operation to OpenStack Swift, making it more appealing to enterprise data centers.

Adding Capacity

One of the certainties in life (other than death and taxes) is that cloud storage systems will need to add capacity. Scale-out architectures offer tremendous freedom to expand. But when it comes to adding capacity to a storage node, just as with any storage system, you can’t simply plug drives in.

For example, if you add drives to a node in the cluster, the cluster will by default expand to use all of the added capacity once the drives are inventoried and assigned. Swift works to keep the cluster balanced, which is a great thing when you need it but not ideal when it is unwanted. The amount of data to rebalance is proportional to the percentage of capacity in use.
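As a rough back-of-the-envelope sketch of that relationship, the snippet below estimates how much data would land on newly added drives at different fill levels. It assumes data ends up spread evenly in proportion to raw capacity after the rebalance, which is a simplification. The function name and the capacity figures are illustrative and do not come from Swift itself.

```python
def data_moved_to_new_drives(existing_tb, added_tb, fraction_used):
    """Estimate the TB that lands on new drives after a full rebalance.

    Assumption: data ends up evenly spread in proportion to raw capacity.
    This is a simplification, not a model of Swift's ring placement.
    """
    stored_tb = existing_tb * fraction_used
    share_of_new_drives = added_tb / (existing_tb + added_tb)
    return stored_tb * share_of_new_drives


# Illustrative figures: a 400 TB cluster that gains 100 TB of new drives.
for used in (0.25, 0.50, 0.75):
    moved = data_moved_to_new_drives(existing_tb=400, added_tb=100, fraction_used=used)
    print(f"{used:.0%} full: about {moved:.0f} TB moves to the new drives")
```

Under these assumptions the data in flight grows in step with how full the cluster already is, which is why a nearly full cluster is the riskiest time to add capacity all at once.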

If a cluster is ~75% full, data equal to as much as 75% of the added capacity is now heading to the new drives from various points in the cluster. Savvy OpenStack Swift administrators will instead bring only small amounts of the capacity into the cluster at a time by using the weight parameter. The administrator gradually increases the share of each drive that is used so as not to flood the cluster with data to be transferred. This process can take hours of manual intervention as the weight parameter is slowly increased.
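To make the manual process concrete, here is a minimal sketch of how an administrator might plan the weight increments. It only builds and prints `swift-ring-builder` command strings and never runs them. The builder file name, the device search value `d12`, the target weight, and the number of steps are all hypothetical.

```python
def weight_schedule(target_weight, steps):
    """Return evenly spaced weights that end at target_weight."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [round(target_weight * i / steps, 2) for i in range(1, steps + 1)]


def ring_commands(builder_file, device, target_weight, steps):
    """Build the command strings an admin might run for each increment."""
    commands = []
    for weight in weight_schedule(target_weight, steps):
        commands.append(f"swift-ring-builder {builder_file} set_weight {device} {weight}")
        commands.append(f"swift-ring-builder {builder_file} rebalance")
    return commands


# Hypothetical builder file and device; the commands are printed, not executed.
for line in ring_commands("object.builder", "d12", target_weight=100, steps=4):
    print(line)
```

In practice each increment would also involve distributing the updated ring to the nodes and waiting for data movement to settle before taking the next step, which is where the hours of manual intervention come from.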

SwiftStack Alternative

Drive inventory and management, which are complicated in Swift, are made easy with SwiftStack. Because SwiftStack has OpenStack Swift at its core, it has to add capacity in the same way. The SwiftStack Controller, though, uses the weight parameter in Swift to automate the gradual addition of capacity.

SwiftStack takes complexity and risk out of adding capacity, enabling admins to carry out expansion during work hours rather than having to wait until off hours. It even provides a monitoring interface so you can check the progress of the capacity expansion within the cluster.

The Same Thing in Reverse

When drives are removed from a cluster, the same rebalancing must occur. To avoid unwanted rebalancing in Swift, admin intervention is required to choose the appropriate action. Specifically, if a node goes offline, nothing happens immediately, as the same node could return. Only when an admin confirms that it will be gone for a long time or forever does the rebalancing start.

Rebalancing when a node is taken offline (e.g. for decommissioning) is similar to a capacity expansion. It should be done gradually over a period of many hours. With SwiftStack, experts have refined these operations to carry minimal risk, to the point that they are automated and simple for admins to execute. That way, focus and energy go toward the important things that make a difference up the stack, and SwiftStack takes care of the important things you rely on down the stack.
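The reverse case can be sketched the same way: step a drive's weight down to zero so that data drains off gradually before the drive or node is removed. The starting weight and step size here are illustrative assumptions, and the function only plans the sequence of weights.

```python
def drain_schedule(current_weight, step):
    """Return descending weights from current_weight down to 0."""
    if step <= 0:
        raise ValueError("step must be positive")
    weights = []
    weight = current_weight
    while weight > 0:
        # Never go below zero on the final step.
        weight = max(0, weight - step)
        weights.append(weight)
    return weights


print(drain_schedule(current_weight=100, step=30))  # [70, 40, 10, 0]
```

Each weight in the list would correspond to one rebalance, with the device removed from the ring only after it reaches zero and has emptied.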

Now with Erasure Codes

In the time since our last post, the OpenStack Kilo release has arrived, and with it support for erasure codes in Swift. Erasure codes are like RAID 5 for a cloud object storage system: data and parity fragments are distributed across drives in nodes in as unique a fashion as possible.

Compared with data protection using replicas (which is like RAID 1), the result is the opportunity to have more usable capacity with the same number of drives. As is the case with RAID, there are tradeoffs when choosing which level of data protection to use.
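The capacity difference is simple arithmetic. The sketch below compares usable capacity under replication and under erasure coding for the same raw capacity. The 1,200 TB figure, the three-replica policy, and the 10 data plus 4 parity fragment scheme are illustrative assumptions, not settings taken from this article.

```python
def usable_with_replicas(raw_tb, replica_count):
    """Every object is stored replica_count times in full."""
    return raw_tb / replica_count


def usable_with_erasure_coding(raw_tb, data_fragments, parity_fragments):
    """Only the parity fragments are overhead."""
    return raw_tb * data_fragments / (data_fragments + parity_fragments)


raw = 1200  # TB of raw capacity, an assumed figure
print(f"3 replicas: {usable_with_replicas(raw, 3):.0f} TB usable")
print(f"10+4 erasure coding: {usable_with_erasure_coding(raw, 10, 4):.0f} TB usable")
```

With these numbers replication leaves about a third of the raw capacity usable, while the erasure-coded layout leaves roughly 71%. The tradeoff is the extra work needed to encode objects on write and reassemble them on read.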

Conclusion

Adding or removing capacity in OpenStack Swift can be challenging if you are doing it yourself from scratch. The same is true of adding nodes to a cluster or upgrading nodes within the cluster. With standard OpenStack Swift it is a series of manual steps, whereas with SwiftStack those same steps are automated by the SwiftStack Controller and managed from its web interface.

Twelve years ago George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS and SAN, Virtualization, Cloud and Enterprise Flash. Prior to founding Storage Switzerland he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.

One comment on “The easy button for OpenStack Swift gets better”
  1. Tim Wessels says:

    Well, just how realistic is it to consider adding individual HDDs to a storage cluster node? If you are adding HDD capacity you are really talking about adding another storage server to the cluster, not a couple of HDDs. Replacing failed HDDs in storage servers is a maintenance task whose scheduling is based on size of the storage cluster and the types of data protection being used. Object replication is not like RAID-1 or mirroring an entire HDD with another HDD. Object replication can be 2x, 3x, 4x or higher depending on your requirements. Most object-based storage software defaults to 3x replication. Most erasure codes (forward error correction codes) are based on Reed-Solomon or its variants. Erasure coding may be superficially compared to a RAID-5 disk array, but it is much more than that in reality. Object replication is best suited for warmer data that may be stored in clusters that span multiple locations. Erasure coding is best suited for colder data that is stored in a single location, although hierarchical erasure coding looks like it will be able to address the latency issues that affect reading erasure coded objects stored in multiple locations. SwiftStack’s value-add to OpenStack Swift is a management control plane that makes Swift storage consumable and manageable by storage administrators.
