The All-Flash Array market is at a crossroads. The SSDs inside these arrays are getting faster, but the systems that hold them are not. Today a single NVMe SSD is capable of more than 700k IOPS, yet most AFAs can only sustain 500k IOPS. The limitation is not the drives these systems use, but the internal and external networking they use to communicate with the drives and the physically attached servers.
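The size of that gap is easy to quantify. A minimal sketch, assuming a hypothetical 24-drive array (the per-drive and system figures come from the article; the drive count is an illustrative assumption):

```python
# Illustrative arithmetic: aggregate raw drive IOPS vs. delivered system IOPS.
# DRIVE_IOPS and SYSTEM_IOPS are the article's figures; NUM_DRIVES is an
# assumed drive count for a typical 2U AFA, not a vendor specification.
DRIVE_IOPS = 700_000      # single NVMe SSD
SYSTEM_IOPS = 500_000     # typical sustained AFA throughput
NUM_DRIVES = 24           # assumed

raw_iops = DRIVE_IOPS * NUM_DRIVES
utilization = SYSTEM_IOPS / raw_iops

print(f"Aggregate raw drive IOPS: {raw_iops:,}")
print(f"System delivers about {utilization:.1%} of the raw drive capability")
```

Even with generous assumptions, the array delivers only a few percent of what its drives could do in aggregate, which is why the bottleneck has to be somewhere other than the media.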
The Persistent Local Storage Workaround
The most common workaround for the gap between NVMe SSD performance and network speed is a persistent local storage model. The advantage of local persistent storage is extremely high performance with very low latency.
The disadvantages are a lack of availability (if the server fails, its data goes with it) and a lack of mobility: if a container or virtual machine moves to another physical host, its data must be copied to the new server, which means downtime or degraded performance until the copy completes.
Most persistent local storage designs overcome these disadvantages with either multiple replicas or erasure coding. Multiple replicas have the advantage of maintaining local performance, improving availability and providing mobility, at least to the hosts that hold a replica. The downside is the 3X (or more) consumption of expensive, high-performance capacity.
Erasure coding reduces the capacity overhead of replication but re-introduces network latency and increases internal CPU utilization.
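The capacity trade-off between the two approaches can be sketched with a quick calculation. The 3-replica and 4+2 erasure-coding parameters below are common illustrative examples, not E8 Storage specifics:

```python
# Raw capacity consumed per byte of usable data under each protection scheme.
# The parameters (3 replicas, 4 data + 2 parity shards) are assumed examples.

def replication_overhead(replicas: int) -> float:
    """With N full replicas, every usable byte consumes N raw bytes."""
    return float(replicas)

def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """With k+m erasure coding, every usable byte consumes (k+m)/k raw bytes."""
    return (data_shards + parity_shards) / data_shards

print(replication_overhead(3))   # 3 replicas -> 3.0X raw capacity
print(erasure_overhead(4, 2))    # 4+2 scheme -> 1.5X raw capacity
```

A 4+2 scheme halves the capacity bill relative to triple replication while still tolerating two failures, but every write must now touch multiple nodes, which is where the network latency and CPU cost re-enter the picture.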
The answer to achieving high performance is not to abandon the advantages of shared storage but to fix the problems with it. In the end, the key bottlenecks of shared storage are the controller architecture, the internal network and the external network.
Introducing E8 Storage
E8 Storage is a high-performance, low-latency storage system built on NVMe and high-speed networks. The goal is to deliver more performance on fewer servers, with latency close to that of locally attached storage. The system uses NVMe SSDs internally and is designed to work with technologies like Mellanox's Remote Direct Memory Access (RDMA) network adapters.
The solution also includes a client component that provides direct access to the remote storage and offloads some of the IO work from the E8 Storage appliance itself, enabling the appliance to maintain a more consistent performance profile. E8 Storage also leverages RDMA over Converged Ethernet (RoCE), a network protocol that allows remote direct memory access over an Ethernet network. As a result, data path operations from the client to the storage bypass the controller CPU on the E8 Storage appliance, removing those controllers as the performance bottleneck they are in legacy systems.
The combination of NVMe drives, split IO/compute responsibilities, and the use of RoCE delivers performance and latencies very similar to locally attached storage, but with all the availability, data protection and instance mobility that shared storage is known for. E8 Storage claims its system can sustain ten million, yes 10 million, 4K read IOPS, 40GB/s of read bandwidth and latencies only slightly higher than locally attached SSDs.
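Those two headline numbers are internally consistent, which is a quick sanity check worth running. A minimal calculation, taking 4K reads as 4,096 bytes:

```python
# Sanity check: do 10M 4K read IOPS and ~40 GB/s read bandwidth agree?
# Figures are from the article; the 4,096-byte block size is an assumption
# (treating "4K" as 4 KiB).
iops = 10_000_000
block_bytes = 4096

bandwidth_gb = iops * block_bytes / 1e9   # decimal gigabytes per second
print(f"{bandwidth_gb:.1f} GB/s")         # close to the claimed 40 GB/s
```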
E8 Storage was one of the first NVMe-native storage systems to come to market, and one of the first to provide NVMe over Fabrics connectivity to the hosts. As is often the case, the initial release of the solution does not come with the complete complement of data services that enterprise systems offer.
Most environments that need E8 Storage levels of performance, like SQL and NoSQL databases or real-time analytics, don’t need or even want those services. These are environments where every additional increase in IOPS performance pays for itself many times over, and E8 Storage with its 10,000,000 IOPS may be just what the doctor ordered.