Hyperconverged Infrastructure (HCI) seems like a dream come true for IT professionals trying to deal with rapid data growth. HCI is built from a cluster of nodes, with each node providing compute, storage performance and storage capacity. When the organization needs more of a particular resource, it just adds another node. The problem is that scaling by adding nodes wastes resources, increases network complexity and limits the variety of workloads the cluster can support.
Creating a Storage Problem to Solve a Storage Problem?
When HCI solutions claim simplicity, they are basing much of this claim on eliminating the need for dedicated storage systems and storage networks. And initially, at three nodes, there is a case to be made. But when the data center tries to increase the level of HCI adoption or grows the environment, problems set in.
The reality is that most hyperconverged solutions are software-defined storage solutions repackaged to run within a hypervisor construct. Instead of being installed on dedicated hardware, the software is now installed as a virtual machine on each node in the cluster. Data, depending on the vendor, is striped across nodes or written locally, then replicated to other nodes (the source of a lot of east-west traffic).
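To make the east-west cost concrete, here is a minimal sketch of the write-then-replicate case. It assumes a simple model in which every write lands on the local node first and each additional copy crosses the cluster network exactly once; the function name and the replication factor used are illustrative and not taken from any particular vendor. Striped layouts behave differently and are not modeled here.

```python
def east_west_write_gb(write_gb: float, replication_factor: int) -> float:
    """Estimate network traffic generated by replicating local writes.

    Assumption: one copy stays on the node that received the write, and
    each of the remaining (replication_factor - 1) copies is sent to a
    different node over the cluster network.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return write_gb * (replication_factor - 1)


# Hypothetical example: 100 GB of VM writes kept as three copies.
traffic_gb = east_west_write_gb(100, 3)
print(f"{traffic_gb} GB of replication traffic for 100 GB written")
```

Under these assumptions, every gigabyte a virtual machine writes puts two more gigabytes on the network between nodes, before any rebalancing or node synchronization is counted.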
Is Scale the HCI Achilles Heel?
The vision for HCI is to simplify the data center. Each node within the HCI cluster has everything that a data center will need as it expands: compute, hypervisor, storage software, performance storage, capacity storage and networking. Initially, with the first three nodes, all is well. However, as the data center grows and adds nodes, the HCI vision begins to get foggy.
The problem with HCI scale is that each node comes with all of the above resources, whereas most expansions require only one, typically capacity, so the rest are wasted. Additionally, most HCI vendors know that nodes are limited in the number of workloads they support and recommend running only one workload type per node, wasting even more resources. Both situations lead to rapid growth in unevenly utilized nodes, which creates a networking problem.
The Resources Dilemma
Again, as IT adds nodes to the cluster, each of the three primary resources is added: compute, storage and networking. However, most data centers only need to add one particular resource consistently. Some data centers are always capacity-constrained, so their primary reason for adding nodes is to get capacity, and in an HCI environment, they end up with very low CPU utilization. Conversely, there are data centers that mostly need more compute, and they end up with massive amounts of free storage space.
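The arithmetic behind that imbalance can be sketched in a few lines. This is an illustration under stated assumptions, not a sizing tool: the node specification (32 cores, 20 TB), the workload figures and the three-node minimum are hypothetical numbers chosen for the example, and all nodes are assumed to be identical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Resources bundled into every node (hypothetical figures)."""
    cores: int
    capacity_tb: float


def cluster_utilization(spec: NodeSpec, need_cores: int, need_tb: float,
                        min_nodes: int = 3) -> dict:
    """Size a cluster of identical nodes and report how much of it is used.

    The node count is driven by whichever resource runs out first; the
    other resource is bought along with it whether it is needed or not.
    """
    nodes = max(
        math.ceil(need_cores / spec.cores),
        math.ceil(need_tb / spec.capacity_tb),
        min_nodes,
    )
    return {
        "nodes": nodes,
        "cpu_utilization": need_cores / (nodes * spec.cores),
        "capacity_utilization": need_tb / (nodes * spec.capacity_tb),
    }


# A capacity-constrained data center: modest compute, lots of data.
spec = NodeSpec(cores=32, capacity_tb=20.0)
result = cluster_utilization(spec, need_cores=64, need_tb=400.0)
print(result)
```

With these example numbers, capacity dictates a 20-node cluster, and the 64 cores actually required are only 10% of the 640 cores purchased. Swap the figures around and the same function shows a compute-bound site stranding storage instead.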
To combat this problem, HCI vendors have tried to create products that have more compute than storage or more storage than compute. The problem is that most HCI vendors can’t mix these types of nodes, or they can’t differentiate between node types within the same cluster. As a result, data centers that have one workload that needs compute and another that requires capacity end up having to create and manage two separate clusters.
The Flexibility Problem
The storage supporting virtual infrastructures is getting pushed harder than ever as workloads like MS-SQL, Exchange and other traditionally stand-alone, mission-critical applications are virtualized. HCI vendors typically recommend that the customer create a cluster for each of these workloads or, at a minimum, dedicate a node to each. This makes supporting mixed workloads and scaling beyond the initial configuration more complex. Again, having multiple clusters clouds the original HCI simplicity vision.
The Networking Bottleneck
If the customer can rationalize the wasting of resources and the potential existence of multiple clusters, they face one more hurdle. A cluster, as node count grows, becomes increasingly complex to manage. The networking of dozens of nodes requires careful design and consideration.
The communication between nodes, often called east-west traffic, becomes almost overwhelming. Studies indicate that in a large HCI cluster as much as 75% of network traffic is storage IO and node synchronization.
What was considered a cheap alternative to a series of dedicated networks suddenly becomes expensive as IT professionals find that they need to upgrade network switches and implement specific network management tools to monitor and manage network communications.
In virtualized and cloud environments, storage is always a challenge, and that challenge gave HCI its early beachhead. Nevertheless, the HCI vision assumes that storage complexity is eternal and that storage systems can’t be designed to be easier to use and more in line with virtualized and cloud data center strategies. The truth is that there are storage systems that are VM- and cloud-aware and can fit easily into that plan while maintaining the predictability and efficiency advantages of separate, dedicated compute and storage tiers.
Sponsored by Tintri