In most cases CPU utilization is a direct result of how quickly the storage architecture can respond to IO requests. Generally speaking, the lower the CPU utilization, the more time the CPU spends waiting for the storage architecture to respond. While All-Flash Arrays delivered a step-level improvement, lowering IO wait time and increasing CPU utilization, there is still vast room for improvement, and environments like Oracle and MS-SQL can benefit if IO wait time is lowered further. If CPU utilization can be increased another step-level, organizations can improve response times, lower costs and simplify architectures.
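The relationship between storage latency and CPU utilization can be sketched with a deliberately simple model: if each request alternates between compute and IO wait, utilization is the fraction of time spent computing. The millisecond figures below are illustrative assumptions chosen to reproduce the utilization levels discussed in this article, not measurements.

```python
# Minimal sketch: CPU utilization as a function of per-request IO wait.
# Assumes a request alternates strictly between compute and IO wait;
# the 0.3 ms compute time and the storage latencies are hypothetical.

def cpu_utilization(compute_ms: float, io_wait_ms: float) -> float:
    """Fraction of time the CPU spends computing rather than waiting on IO."""
    return compute_ms / (compute_ms + io_wait_ms)

# Same 0.3 ms of compute per request; only the storage latency changes.
hdd_util   = cpu_utilization(0.3, 4.0)    # ~7%  (hard drive era)
flash_util = cpu_utilization(0.3, 0.7)    # ~30% (typical all-flash)
fast_util  = cpu_utilization(0.3, 0.033)  # ~90% (the step-level target)

print(f"HDD:   {hdd_util:.0%}")
print(f"Flash: {flash_util:.0%}")
print(f"Fast:  {fast_util:.0%}")
```

The model makes the article's point concrete: once media latency drops, shaving the remaining fraction of a millisecond is what moves utilization from 30% toward 90%.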
Typical CPU utilization when connected to an all-flash array is around 30%, up from the 7% common during the hard drive array era, but again, there is obvious room for further improvement. While an all-flash array gives much of the data center more performance than it will ever need, applications like Oracle and MS-SQL need more. The workaround for most organizations is to stand up an ever-increasing number of servers in a cluster. That gets expensive, both from a physical server hardware investment and especially from a software licensing perspective, since most database application vendors charge by the CPU core used.
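The licensing math behind that workaround can be sketched as follows. The workload size and the per-core price are hypothetical; the point is simply that if each core is only doing useful work 30% of the time, an organization must license roughly three times the cores it would need at 90% utilization.

```python
import math

# Hedged sketch of per-core licensing economics. The 48-core workload
# and the $10,000/core price are hypothetical placeholders.

def cores_needed(busy_cores_required: float, utilization: float) -> int:
    """Cores to license when each core does useful work only
    `utilization` fraction of the time."""
    return math.ceil(busy_cores_required / utilization)

LICENSE_PER_CORE = 10_000  # hypothetical $/core

for util in (0.30, 0.90):
    cores = cores_needed(48, util)  # workload needing 48 fully busy cores
    print(f"{util:.0%} utilized: {cores} cores, "
          f"${cores * LICENSE_PER_CORE:,} in licenses")
```

Under these assumptions the same workload needs 160 licensed cores at 30% utilization but only 54 at 90%, which is where the cost argument for faster storage comes from.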
Where All-Flash Arrays Fall Short
Most all-flash arrays have three key components. First, there is, of course, the flash NAND, which is essentially the same from system to system. Then there are the CPU and the storage software. The CPU manages IO through the system and drives the storage software, which provides the feature set. In a scale-out architecture the software also has to manage the storage cluster.
The problem is that the interaction between these components takes time, which adds latency and decreases performance. This is why, when an all-flash array vendor upgrades the storage system's CPU, it will typically report an increase in performance even though the other internal components did not change. The flash is often the same, and sometimes the software adds more features (which adds more latency). The more powerful CPU processes all of these tasks faster and provides access to more storage server RAM to improve metadata handling.
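The diminishing return on CPU upgrades can be shown with a toy latency stack. The microsecond figures are illustrative assumptions: a faster controller CPU shrinks only the software and metadata terms, while the flash media term stays fixed.

```python
# Hedged sketch: array response time as a stack of component latencies.
# The 90 us media and 110 us software figures are hypothetical.

def array_latency_us(media_us: float, software_us: float,
                     cpu_speedup: float = 1.0) -> float:
    """Total latency when a faster CPU accelerates only the software path."""
    return media_us + software_us / cpu_speedup

base     = array_latency_us(90, 110)       # 200 us total
upgraded = array_latency_us(90, 110, 2.0)  # 145 us: 2x CPU, only ~1.4x faster

print(f"before upgrade: {base:.0f} us, after 2x CPU: {upgraded:.0f} us")
```

Doubling CPU speed cuts latency by well under half, which is why the article argues vendors must also optimize or offload the software path rather than rely on faster processors alone.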
NVMe will help these systems reduce latency further, but not to the point that application servers will reach even 50% CPU utilization. All-flash vendors will need to substantially increase CPU performance, optimize their storage software or offload some of the storage IO processing. More likely, all of the above.
Vexata is a new data systems company focused on the business-critical application performance market, where applications like Oracle, MS-SQL, SAS and KX need lower latency from the storage architecture, higher CPU utilization and lower software licensing costs. In short, the goal of the Vexata system is to deliver closer to 90% CPU utilization on the same application hardware.
Vexata offers a shared storage system, the VX-100 System, that attaches via the traditional Fibre Channel protocol, which is very common in Oracle, MS-SQL, SAS and KX environments. The system can scale out from 4 to 16 enterprise storage modules (ESMs). Storage nodes are blade servers, and each node has 4 NVMe SSDs installed. The nodes are clustered together via 10GbE ports (up to 64) that aggregate the available storage. Metadata handling is done on a per-node basis, essentially eliminating any overhead from the metadata management process, which in turn allows for greater scale.
What makes VX-100 Systems unique, though, is that each node leverages an FPGA (field-programmable gate array) running what Vexata calls the VX-OS Distributed Operating System. The FPGA is an acceleration engine for cut-through, high-bandwidth and reliable IO distribution. It also handles services like RAID 6 calculations and encryption.
VX-OS also allows Vexata to use a single-socket x86 processor to drive its storage software. The combination allows Vexata's VX-100 to support 7X as many users and deliver 20X as many IOPS with a 4X reduction in latency, all without having to rip and replace the entire infrastructure or leverage riskier server-side in-memory models. Vexata claims that all of this performance can be had at prices competitive with current all-flash array technology. Most all-flash arrays require a high-end processor to deliver any reasonable level of performance; Vexata's use of an FPGA allows it to deliver better performance on a much less expensive single-socket CPU.
While all-flash has alleviated some performance problems for the enterprise, most organizations have a few applications that demand more. These are often the applications that make the organization money or keep its customers happy, and they need better performance than the typical all-flash array can provide. To meet this performance need, companies are willing to invest high percentages of their IT budgets. It is possible that with Vexata they won't have to. For the cost of an all-flash array they can get a system that allows them to see better utilization of their application servers' CPUs, which means more users and lower licensing costs.