There are two popular trends dominating the storage landscape today. The first is scale-out storage where performance and capacity can expand linearly as nodes are added to a cluster. The second is the use of flash as a way to accelerate performance and allow a single storage system to respond to a variety of workloads. These two technologies are being combined to create scale-out storage systems that leverage flash. The objective is to create a single storage system that can scale to handle the wide variety of workload types in the data center today and eliminate the sprawl of storage systems dedicated to a single workload type.
Why Scale Out?
Traditionally scale-out storage was designed for capacity-centric environments. These environments were constantly adding new storage systems to keep up with the demands of perpetual data growth. The problem was two-fold. First was the obvious problem that a traditional scale-up storage system could only grow so large. Second was the not-so-obvious problem that as capacity was added to the traditional scale-up storage system, performance began to decline because the storage controllers had to manage more and more disk drives.
Scale-out storage was initially introduced to fix both of these problems. Because each node increased network bandwidth and processing power along with capacity, the performance of the overall storage architecture did not decline as the system grew. And because nodes can be added almost infinitely, the raw capacity problem was resolved.
Now though, more environments are facing an I/O performance challenge in addition to a capacity challenge. This is driven by the creation of denser environments, where server hosts are asked to do far more than they once were. Instead of hosting a single application, the modern server supports many applications running inside virtual machines.
To address the need for more I/O performance, scale-out storage systems are incorporating flash technology into their designs. The problem is that flash can overload traditional controllers designed for disk, and flash performance can also easily be negated by the behind-the-scenes latency incurred as a cluster of storage nodes scales out.
The Scale-Out Storage Challenge
As stated above, scale-out storage was supposed to be the answer to storage system sprawl, since the technology can reduce the need to implement a dedicated storage system per workload. However, scale-out has usually been a better fit for large file sharing or archiving use cases, and adding flash doesn't by itself make these systems suitable for transactional workloads like virtualization infrastructure and databases. As scale-out systems have tried to move beyond file sharing and address the demands of large databases or virtual infrastructures, the inter-node communication mentioned above tends to become the bottleneck.
Inter-node communication is needed in a scale-out storage system so that every member of the storage cluster can be aware of the others and of what data each node holds. This back-end communication requires a backend network suited to an extremely high volume of very small I/O transmissions. Some vendors have gone to the extent of dedicating a private InfiniBand network to try to solve this specific problem. The downside of this approach is that InfiniBand is expensive.
Other vendors will suggest creating faster nodes that leverage either a mixture of hard disk drives (HDDs) and solid state disks (SSDs), or a complete switch-over to SSDs. Similar to a faster backend network, this approach merely lets the latency be absorbed faster. It does not directly eliminate or reduce the inefficiency of the inter-node communication itself.
Adding flash to a storage system removes the bottleneck in the disk drives, but it moves this bottleneck to other parts of the storage system. For example, the inherently slow speed of mechanical hard drives meant the single or dual controllers in traditional systems could handle the load. Now, the speed of flash storage means a single PCIe flash card can overload a single controller, so in a scale-out architecture, both the controller and scale-out network need to be examined for bottlenecks.
A high performance backend network and the use of flash are both vital for scale-out storage systems to be suitable for an increasing number of workloads like server and desktop virtualization. But these solutions also need to address both controller bottlenecks and the inter-node communication issue to deliver effective performance scaling for a variety of workloads.
The Software Defined Network Solution
The inter-node communication problem that haunts scale-out storage is essentially a networking problem, and one that’s not solved alone by making the underlying storage faster with SSDs. It’s also not solved by making the network itself faster. The communication that goes across that network has to become more efficient and more intelligent. This is leading companies like Coho Data to integrate software defined networking (SDN) into their storage systems.
SDN for A Storage Professional
While storage administrators may have heard the term 'software defined networking', many may not know exactly what the technology is or what it means for storage. Networking, at least in the IP sense, has always decoupled control and data, which enabled the forwarding of data to be done as fast as possible. Higher-level control protocols like spanning tree handled topology decisions, but that control plane was isolated on each switch itself.
The problem was that each switch from each manufacturer had to be managed independently, requiring a very labor-intensive networking operations process to provision and configure network connectivity. Take for example a new business unit that needs a separate network for some functions but must still have access to the rest of the corporate network for other functions. To set this up properly requires a lot of planning and a very slow rollout. SDN’s goal is to reduce the time required to fulfill that sort of request.
OpenFlow is effectively an API set that generalizes that kind of configuration work. It allows a network administrator to configure flow forwarding rules. For example, a rule could be set up to send certain types of traffic to a specific set of switches. Instead of having to manage every port in the environment, SDN provides a single point of control that executes network policies based on a traffic rule instead of a port number. A data center typically has far fewer types of traffic than it does numbers of ports, in essence reducing individual points of management.
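The match/action model behind this can be illustrated with a toy sketch in plain Python. This is not a real OpenFlow controller API; the field names, priorities and actions are invented for illustration. The point is that a handful of traffic rules replaces per-port configuration.

```python
# Toy illustration of OpenFlow-style match/action flow rules.
# Not a real controller API: field names, priorities and action
# strings are hypothetical, chosen only to show the model.

def matches(rule, packet):
    """A rule matches if every field it specifies equals the packet's value."""
    return all(packet.get(k) == v for k, v in rule["match"].items())

def forward(flow_table, packet):
    """Return the action of the highest-priority matching rule."""
    for rule in sorted(flow_table, key=lambda r: -r["priority"]):
        if matches(rule, packet):
            return rule["action"]
    return "drop"  # default when no rule matches

# One rule per traffic type, instead of configuring every port.
flow_table = [
    {"priority": 100, "match": {"proto": "tcp", "dst_port": 3260},
     "action": "output:storage-nodes"},   # iSCSI traffic to the storage cluster
    {"priority": 10, "match": {"proto": "tcp"},
     "action": "output:default-uplink"},  # everything else over TCP
]

print(forward(flow_table, {"proto": "tcp", "dst_port": 3260}))
```

Because the data center has far fewer traffic types than ports, the flow table above stays short no matter how many ports the fabric has.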
What SDN Means For Storage
If a storage system has evolved to leverage an SDN API set, it can act on data traffic while it is in flight instead of at the end of the network communication. A capability like OpenFlow allows this in-flight adaptation to happen on commodity, off-the-shelf networking hardware instead of proprietary or custom hardware.
The net result is an adaptive storage system that can support a wide variety of workloads and data types while still leveraging commodity hardware, which keeps costs down. It allows the storage system to make management and data placement decisions centrally, with a holistic view of the storage infrastructure. This is a significant improvement for IP-based storage systems that leverage NAS or iSCSI: the head can be decoupled from the storage management intelligence, resolving many of the latency concerns of traditional scale-out storage systems, where a single node becomes the bottleneck as it parses data.
With SDN powered scale-out storage systems, the control process can be moved around to the least loaded node in the cluster. That node can then receive and parse its data set in parallel to other nodes acting on other data sets.
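As a rough sketch of that idea, selecting the least loaded node is a simple minimization over reported load. The node names and the load metric below are invented for illustration; a real system would use whatever health telemetry the cluster exposes.

```python
# Hypothetical sketch: pick the least-loaded cluster node to host the
# control process for an incoming data set. Node names and the load
# metric (fraction busy, 0.0-1.0) are illustrative only.

def least_loaded(node_loads):
    """Return the name of the node reporting the lowest load."""
    return min(node_loads, key=node_loads.get)

cluster_load = {"node-a": 0.82, "node-b": 0.35, "node-c": 0.61}
print(least_loaded(cluster_load))  # node-b
```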
Leveraging SDN for storage also allows the storage administrator, and eventually the storage system itself, to make data placement decisions based on data type. Because the storage SDN understands which data is being sent across its network, just as a traditional network SDN does, a particular workload could be tagged with commands like 'flash only', 'prefer flash' or 'force to disk'. In the future these systems could add network-level Quality of Service (QoS), assuring not only storage performance but network I/O performance as well.
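A minimal sketch of how such tags might drive tier selection follows. The tag names mirror the examples above; the tiering logic itself is hypothetical, not a description of any vendor's implementation.

```python
# Hypothetical sketch of tag-driven data placement. The tags mirror the
# examples in the text ('flash only', 'prefer flash', 'force to disk');
# the tier-selection logic is illustrative only.

def place(tag, flash_free_bytes, size_bytes):
    """Choose a storage tier for a write based on its workload tag."""
    if tag == "force to disk":
        return "disk"
    if tag == "flash only":
        if flash_free_bytes < size_bytes:
            raise RuntimeError("flash tier full; 'flash only' write rejected")
        return "flash"
    if tag == "prefer flash":
        # Fall back to disk when flash cannot hold the write.
        return "flash" if flash_free_bytes >= size_bytes else "disk"
    return "disk"  # untagged workloads default to disk

print(place("prefer flash", flash_free_bytes=10**9, size_bytes=4096))  # flash
```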
Leveraging SDN also allows for better control over where in the storage cluster data is placed. Traditional scale-out storage makes all the disk act like a single pool of storage where all data and all nodes are treated equally. SDN will allow for the intelligent placement of data in terms of storage type, number of nodes and which specific nodes are selected.
Intelligently placing data within a storage system is not necessarily a new concept, but leveraging SDN to make decisions about data while it’s still in transit is new. Companies like Coho Data are leveraging software defined networking and software defined storage to provide highly flexible, highly reliable and highly cost effective storage systems that can support a wide variety of workloads. In a single system, they can reduce cost and storage system sprawl while addressing the storage performance demands of the modern virtualized data center.
Coho Data is a client of Storage Switzerland