The Benefits of Software-Defined, Server-Side Storage

Hyperscale Data Centers, Managed Service Providers, Cloud Service Providers and large Enterprises all face a similar challenge: how to cost-effectively scale their cloud/virtual infrastructures so that maximum return on investment can be achieved. The answer is to build scale-out compute infrastructures that support very high numbers of virtual machines, something called “big virtualization”.

Data center managers in these environments know that each physical server added to the infrastructure eats away at that ROI. Consequently, they strive to design very dense virtual machine environments in which each server’s compute resources are maximized before adding another server. The problem is that big virtualization exacerbates the storage infrastructure challenges these environments already face. In this article we will discuss how software-defined, server-side storage can overcome the storage issues related to big virtualization.

The Legacy Storage Challenge to Big Virtualization

As virtualization began to move from test/development environments into production, shared storage was required so that its key capabilities, like virtual machine migration, distributed resource management and site recovery management, could be exploited. In the early phases of virtualization, virtual machine counts per host were low enough that existing shared storage infrastructures could be used. However, as the number of virtual machines in these environments began to increase, the random I/O from multiple applications running on a single physical server, all sharing a single network connection, exposed the weakness in this design.

First, the storage system, which was likely hard-drive based, introduced latency as hard drives thrashed back and forth, rotating into position to locate the data that each VM was requesting or storing. Second, the network itself became a bottleneck as those single connections and relatively low-bandwidth infrastructures, typically 1GbE or 4Gb FC, became saturated by multiple virtual machines making those storage requests.
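To make the scale of the problem concrete, the rough sketch below compares the aggregate demand of a few dozen VMs against a single shared 1GbE link and a modest pool of hard drives. The per-VM figures and drive count are illustrative assumptions, not measurements from any specific environment.

```python
# Rough back-of-envelope look at the two bottlenecks described above.
# All per-VM demand figures and drive counts are illustrative assumptions.

LINK_MB_S = 1 * 1000 / 8        # usable bandwidth of a shared 1GbE link, ~125 MB/s
HDD_IOPS = 150                  # rough random IOPS from a single 7.2K RPM drive
SPINDLES = 12                   # drives behind the array controller

vm_count = 40                   # a modestly dense "big virtualization" host
per_vm_mb_s = 5                 # assumed average throughput per VM
per_vm_iops = 100               # assumed small-block random I/O per VM

print(f"Network demand: {vm_count * per_vm_mb_s} MB/s "
      f"against roughly {LINK_MB_S:.0f} MB/s on the shared link")
print(f"Disk demand: {vm_count * per_vm_iops} random IOPS "
      f"against roughly {HDD_IOPS * SPINDLES} IOPS from {SPINDLES} spindles")
```

Even with these conservative assumptions, both the shared link and the spinning disks are oversubscribed well before the server’s CPUs run out of headroom.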

Legacy Storage Band-Aids

The initial fixes for these storage performance problems included dramatically increasing the number of available hard disks and adding multiple or multi-port network cards to the servers. While these steps did increase virtual machine counts, they typically didn’t tax the processing power of the physical servers, meaning the potential for even greater VM density existed if the storage I/O challenge could be addressed. Also, while inexpensive, these solutions were essentially band-aids and more serious steps needed to be taken.

The next approach was to use the performance of flash-based storage, leading to the flood of solid-state-assisted and solid-state-only storage systems and the investments they drove in network upgrades. This pushed virtual machine counts into the low double digits, which began to stress the controllers on these storage systems. In other words, the processing capacity of the storage became a bottleneck. And while the pressure eased, the network between the physical servers and the fast SSD storage remained a bottleneck.

The next strategy attempted to solve the storage controller bottleneck with a scale-out topology, using an architecture similar to that of physical servers to get around the limitations of a single storage controller. While effective, scale-out storage suffers from the same problems that server infrastructures do: the lack of effective CPU utilization and inefficient storage capacity management. This inefficiency comes from the requirement to add storage nodes in order to increase performance, since each node adds more capacity, whether it’s needed or not.
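A quick illustration of that coupling between performance and capacity: if each node contributes a fixed amount of both, buying enough nodes to hit a performance target can leave a large amount of capacity stranded. The per-node figures below are assumptions chosen only to show the effect.

```python
# Illustrative only: per-node performance and capacity figures are assumptions.

NODE_IOPS = 20_000           # performance each scale-out storage node adds
NODE_CAPACITY_TB = 60        # capacity each node adds, needed or not

required_iops = 200_000      # what the virtualized workloads actually need
required_capacity_tb = 200

nodes_for_performance = -(-required_iops // NODE_IOPS)   # ceiling division
capacity_purchased = nodes_for_performance * NODE_CAPACITY_TB

print(f"Nodes needed to hit the performance target: {nodes_for_performance}")
print(f"Capacity purchased: {capacity_purchased} TB, "
      f"{capacity_purchased - required_capacity_tb} TB more than required")
```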

The ensuing requirements to fine-tune the storage network and balance the use of hard disk drives for capacity and SSDs for performance have created a very complex environment. This is especially true in the provider markets mentioned above, which have limited control over which applications run on their infrastructures but still have specific performance service levels to meet.

The problem is that none of these band-aids truly embraces the architecture that creates these providers’ storage challenges in the first place: a scale-out compute architecture in which all the individual nodes are both independent and mutually supportive at the same time. They also remain totally dependent on a storage network middle tier to connect the servers to the storage. The cost, time and skill set required to keep this network tuned for maximum performance against rapidly changing workloads may be too much to ask of any data center.

The Move Server Side

As a result of all of this expense and complexity, many companies have begun to investigate server-side storage technologies to off-load storage network I/O. These typically take the form of SSD caching solutions installed in the server. While they do off-load I/O from the storage network, it’s typically only read activity. Also, each of these solutions complicates some of the advanced features that provided the sparks to light the fuse for virtualization and the cloud in the first place, namely the benefits of application mobility. Shared storage and a storage network still had a role to play.

Software-Defined, Server-Side Storage for Big Virtualization

The answer to the providers’ and large enterprises’ challenges, when creating a scalable environment that can support dozens of virtual machines per server, may be to combine the server, storage and network infrastructures into a single layer, similar to what Compuverde has done with its software-defined storage solution. To be cost-effective, this approach would need to leverage hard disk drives for capacity, shared for mobility, while also using host SSD cache for performance.

The key is to leverage the scale-out storage model and virtualize the controller function to run in the compute server alongside the other virtualized workloads. This addresses the inefficient CPU utilization common in scale-out storage environments. Virtualizing the storage controller function also allows the physical storage to be installed inside the physical server, which is less expensive and easier to manage. In other words, the storage node of the scale-out storage architecture is merged into the physical server of the compute infrastructure.

In this design each physical server is now running a virtualized storage controller in addition to its standard virtual machines. These virtual controllers work together, much as physical nodes do in a scale-out storage system, to stripe the available storage capacity on each physical host so that it’s presented as a single storage element. The virtual controllers leverage the same network that the physical servers use for communication, allowing that storage to be shared among all the servers and VMs. This means that workloads can now shift easily between hosts as needed, yet no additional network fabric has to be purchased or maintained.
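As a minimal sketch of the idea, the code below shows one way virtual controllers could map stripes of a logical volume onto the hosts that physically hold them, using a hash of the stripe address. The striping scheme and host names are assumptions for illustration only; they are not Compuverde’s actual placement algorithm.

```python
import hashlib

# Hypothetical hosts, each running a virtual storage controller alongside its VMs.
HOSTS = ["host-01", "host-02", "host-03", "host-04"]
STRIPE_SIZE = 1 << 20   # 1 MiB stripes

def stripe_owner(volume: str, offset: int, hosts=HOSTS) -> str:
    """Return the host whose local disks hold the stripe containing this offset."""
    stripe = offset // STRIPE_SIZE
    digest = hashlib.sha1(f"{volume}:{stripe}".encode()).digest()
    return hosts[int.from_bytes(digest[:8], "big") % len(hosts)]

# Any VM, on any host, sees one logical volume; its local virtual controller
# routes each stripe over the existing server network to the host that stores it.
for offset in range(0, 4 * STRIPE_SIZE, STRIPE_SIZE):
    print(f"vm-volume-7 @ {offset:>8}: {stripe_owner('vm-volume-7', offset)}")
```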

Compuverde takes this concept a bit further by leveraging internal SSDs or NVRAM in each of the physical servers, not as a cache but as a true tier that holds all or most of the data for each VM resident on that physical system. This allows the primary copy of data for each physical system to be served from solid-state technology while the hard disk layer is used for redundancy and VM transportability.
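The sketch below captures that placement idea under stated assumptions: the primary copy of a VM’s data sits on the solid-state tier of the host running the VM, while hard-disk replicas on other hosts provide redundancy and keep the VM transportable. The names and single-replica policy are hypothetical, not Compuverde’s implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Placement:
    primary_host: str            # serves I/O from local SSD/NVRAM
    replica_hosts: List[str]     # hold hard-disk copies for redundancy/mobility

def place_vm_data(vm_host: str, all_hosts: List[str], replicas: int = 1) -> Placement:
    """Primary copy stays local to the VM's host; replicas land on other hosts' HDDs."""
    others = [h for h in all_hosts if h != vm_host]
    return Placement(primary_host=vm_host, replica_hosts=others[:replicas])

hosts = ["host-01", "host-02", "host-03", "host-04"]
print(place_vm_data("host-03", hosts))
# -> Placement(primary_host='host-03', replica_hosts=['host-01'])
```

Because the copy on hard disk is shared, the VM can be restarted or migrated to another host without the data having to follow it first.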

Conclusion

Big virtualization can create a rapid and long-lasting ROI for the providers and enterprises that choose to invest in it. But big virtualization also leads to big storage challenges. There are ways to address these challenges in legacy environments by upgrading and tuning the storage network as well as adding and managing solid-state disks, but each brings layers of complexity, which can be the death of any large-scale project.

Software-Defined, Server-Side Storage allows the consolidation of three layers of data center infrastructure (and three sources of data center problems), compute, network and storage, into a single scalable and manageable solution. For environments looking for ways to significantly increase virtual machine density, it deserves strong consideration versus the band-aids being applied to legacy storage.

Compuverde is a client of Storage Switzerland


George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.
