Converged infrastructure (CI) and hyperconverged infrastructure (HCI) are common terms, but open converged infrastructure (OCI) is new. What is it and how is it different from the other architectures? This article starts by examining the similarities and differences between CI and HCI, along with their advantages and disadvantages. It finishes with a description of OCI and how it was designed to give you the advantages of CI and HCI without their disadvantages.
The Evolution of Virtualized Infrastructure
X86-based virtualization started almost 20 years ago and was simply software that ran on the same hardware that regular operating systems ran on. Instead of installing Windows or Linux, IT professionals installed a hypervisor like VMware ESX and created a virtualized server. One challenge was that the use of off-the-shelf components not designed for virtualization led to mixed results depending on which components were selected.
Eventually some companies began to feel that preselected components could provide a more predictable result and faster startup when using virtualization, and converged infrastructure (CI) was born. The companies who managed to lead in the space used established storage, compute, and network vendors to build their systems, and tested them against each other for interoperability. But even these companies were not usually using components specifically designed for converged infrastructure; they were simply selecting components that worked well in a virtualized environment.
The traditional storage arrays used to build CI products were often very expensive and inflexible. They were not the software-defined products that are prevalent today, and they suffered from many of the limitations of traditional arrays, including a central bottleneck in the storage controller and LUN-based management.
Hyperconverged Infrastructure (HCI)
Hyperconverged infrastructure (HCI) products were created to address many of these limitations. They did so by changing one major aspect of how traditional infrastructure products were built: sharing. Both traditional virtualization and CI systems shared a storage array between multiple servers, thus maintaining a stateless server node. Any node that has access to the same storage can assume another node's workload if that node goes offline, which led to such innovative products as vMotion. But nodes in an HCI cluster are not stateless, as the reader will see in the following paragraphs.
HCI systems typically store data across the nodes of the cluster. That data is either striped or copied to specific nodes. HCI systems are therefore always sold as clusters, because data must be stored in more than one location for redundancy.
Since they are not using a shared storage infrastructure, each write must be replicated to one or more nodes for redundancy. The replication of data requires a lot of inter-node traffic, which also creates an interdependence, and occasionally a performance impact, between the nodes. This is why nodes are no longer stateless – they depend on each other for both storage and compute functions. One challenge with larger cluster sizes is that, with one node already offline, a drive or node failure elsewhere in the cluster can affect the availability and data integrity of the entire HCI cluster. One can maintain three copies of data to mitigate this risk, but at a significant cost premium.
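To put a rough number on that cost premium, here is a minimal sketch that assumes plain full-copy replication, with no erasure coding, compression, or deduplication. The function names and the 120 TB figure are illustrative assumptions, not taken from any HCI product.

```python
def usable_capacity_tb(raw_tb: float, copies: int) -> float:
    """Usable capacity when every block is stored `copies` times.

    Assumes simple full-copy replication; real products may use
    erasure coding or data reduction, which changes the math.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return raw_tb / copies


def raw_needed_tb(usable_tb: float, copies: int) -> float:
    """Raw capacity that must be purchased to present `usable_tb`."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return usable_tb * copies
```

Under these assumptions, a cluster with 120 TB of raw capacity presents 60 TB when it keeps two copies and 40 TB when it keeps three, so moving from two copies to three costs a third of the usable space.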
Because each node also has to carry a portion of the compute and storage load, a second challenge is that most HCI configurations also require nodes to be similar in their configuration, especially for systems that stripe data across the nodes in a RAID-like fashion. Adding newer, faster nodes to an HCI cluster that stripes data won’t necessarily add performance to the cluster, since the overall cluster performance will be held back by the slower nodes. This is due to the interdependence between the nodes that is present in an HCI cluster.
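The effect of striping across mismatched nodes can be sketched with a deliberately simplified model. It assumes every stripe places an equal share of data on every node, so an I/O finishes only when the slowest node finishes. The function name and the throughput figures are hypothetical.

```python
from typing import Sequence


def striped_throughput(node_mbps: Sequence[float]) -> float:
    """Aggregate throughput of an evenly striped cluster.

    Toy model: each node handles an equal share of every stripe,
    so each node can contribute no more than the slowest node does.
    """
    if not node_mbps:
        raise ValueError("cluster must have at least one node")
    return len(node_mbps) * min(node_mbps)
```

In this model, three nodes at 500 MB/s deliver 1,500 MB/s. Adding a fourth node capable of 2,000 MB/s raises the total to only 2,000 MB/s, not the 3,500 MB/s that a simple sum would suggest, because the new node is paced by the older ones.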
HCI configurations have definitely created a simplified way to add virtualization to your environment. Customers simply purchase one cluster and install their VMs. Their compute, storage, and even data protection needs are addressed. However, a third challenge with scaling an HCI cluster is that there is no way to scale compute and capacity independently. If an organization needs more storage but not more compute, it will end up buying more compute as it buys more storage. This leads to waste, inefficiency, and management headaches as the system scales.
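A small sizing sketch shows how this coupling strands resources. It assumes identical nodes that each bring a fixed number of cores and a fixed amount of usable capacity; the function names and every figure here are made up for illustration.

```python
import math


def hci_nodes_needed(cores_needed: int, tb_needed: float,
                     cores_per_node: int, tb_per_node: float) -> int:
    """Nodes required when compute and capacity come only as a bundle."""
    by_compute = math.ceil(cores_needed / cores_per_node)
    by_capacity = math.ceil(tb_needed / tb_per_node)
    # Whichever resource runs out first dictates the node count.
    return max(by_compute, by_capacity)


def stranded_cores(cores_needed: int, tb_needed: float,
                   cores_per_node: int, tb_per_node: float) -> int:
    """Cores purchased beyond what the workload actually asked for."""
    nodes = hci_nodes_needed(cores_needed, tb_needed,
                             cores_per_node, tb_per_node)
    return nodes * cores_per_node - cores_needed
```

For a workload that needs 64 cores and 400 TB on nodes offering 32 cores and 20 TB each, capacity dictates 20 nodes even though compute alone would need only 2. That is 640 cores purchased and 576 of them stranded.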
Since each node is stateful, data protection is required at all levels of the configuration. The storage must be redundant both within the node and across nodes, which means that there are multiple layers of storage redundancy. This also tends to create waste due to excessive use of parity data for the multiple levels of redundancy.
Open Converged Infrastructure
A new type of virtualized infrastructure, referred to as open converged infrastructure (OCI), is attempting to address the limitations of virtual infrastructure, converged infrastructure, and hyperconverged infrastructure. At first glance, one might think that its architecture – stateless nodes that share common storage – looks a lot like a converged infrastructure model, but it is actually quite different.
An OCI system consists of two main components: one or more compute nodes, and one or more data nodes. Each compute node has compute, network, and internal flash storage. The data node has NVRAM and spinning disk.
VMs run in the compute node, which performs compression, encryption, erasure coding, and snapshots of all the data prior to writing it to the local flash in the compute node. All data is also synchronously written to NVRAM on the data node, where it is globally de-duped against all other data and stored on spinning disk. Compute nodes are typically sized to handle the entire active working set of VM data (after compression), and the data nodes are sized to hold all data, including historical versions of the active working set (e.g. older snapshots).
This separates two aspects of storage: performance and capacity. The local flash in each node meets the performance requirements of the VMs. The capacity requirements, created by additional copies and historical snapshots of a VM, are stored on the data node. This allows the data node to use less expensive components, as performance is not a factor in its design. In effect, the compute nodes handle all primary workloads and the data node provides efficient secondary storage for persistent data.
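The write path described above can be sketched as a toy model. This is not Datrium's implementation: the class names are hypothetical, encryption, erasure coding, snapshots, and NVRAM are omitted, and deduplication is reduced to a content-hash lookup. It assumes compression is deterministic, so identical payloads produce identical blocks.

```python
import hashlib
import zlib


class DataNode:
    """Shared capacity tier: keeps one copy of each unique block."""

    def __init__(self):
        self.blocks = {}  # content hash -> compressed block

    def write(self, block: bytes) -> str:
        digest = hashlib.sha256(block).hexdigest()
        # Global dedupe: a block already seen from any compute node
        # is not stored a second time.
        self.blocks.setdefault(digest, block)
        return digest


class ComputeNode:
    """Performance tier: serves VM reads from its own local flash."""

    def __init__(self, data_node: DataNode):
        self.data_node = data_node
        self.flash = {}  # (vm, offset) -> compressed block

    def write(self, vm: str, offset: int, payload: bytes) -> str:
        block = zlib.compress(payload)        # data reduction happens here
        self.flash[(vm, offset)] = block      # local copy for fast reads
        return self.data_node.write(block)    # synchronous durable copy

    def read(self, vm: str, offset: int) -> bytes:
        return zlib.decompress(self.flash[(vm, offset)])
```

If two compute nodes write the same payload for different VMs, each keeps its own flash copy for performance, while the data node stores the block only once. The compute nodes never talk to each other in this sketch, which is the property the architecture relies on to keep them stateless.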
How is This Open Converged?
A CI environment is very closed; everything stays inside a single, monolithic configuration that typically uses traditional storage. An HCI configuration is so tightly knit between nodes that it is also rather closed. The term open converged is meant to convey how returning the nodes to their original stateless configuration makes things more fluid and open. It allows for flexibility that previous architectures don't.
There is no inter-node traffic except during a vMotion. This means that compute nodes are no longer reliant on each other, and as a result performance is more predictable. In addition, all nodes except one compute node can fail and the system remains available.
All data is protected because all data is also stored on the shared data node. There is no need for multiple nodes to protect the data. This makes each node stateless again, which simplifies the configuration.
It also allows for more options when purchasing nodes. Faster nodes can be purchased for VMs that need more performance, and less expensive nodes can be purchased for those that do not, all supported within a single system. As an environment ages, newer nodes can be added to the configuration and the power of those nodes can be immediately realized, since they are not reliant on the compute capabilities of the older, slower nodes.
An open converged system provides a turnkey system with a single point of support. But unlike CI and HCI architectures, an OCI product can allow customers to use their own hardware and service it themselves. The OCI architecture is completely software defined, offering customers all the advantages of that concept without some of the HCI limitations mentioned above.
The OCI architecture offers a new variation on architectures purpose-built for virtualization. It allows high performance for those apps that need it, while allowing customers to buy less expensive nodes for those apps that don't. It allows compute nodes to go back to being stateless, as they were before the advent of HCI architectures. Perhaps most important, it offers the simplicity of the HCI architecture without its limitations.
Sponsored by Datrium
Hey Curtis. I enjoyed reading this piece and will look forward to seeing the OCI architecture come to fruition. Take care, Greg