When an organization implements an all-flash array, its objective is to meet user demand for improved application responsiveness. But the organization should also expect the flash array to drive down the cost of delivering that level of performance. The problem is that while all-flash arrays do tend to improve the user experience, they don’t increase application utilization of server CPUs enough to reduce costs.
How Can an All-Flash Array Drive Down Costs?
An all-flash array is almost always more expensive than the storage system it replaces, so how can it drive down costs? The flash storage system can reduce cost in several ways. First, it should reduce the number of servers the organization needs to deploy per application. Before the introduction of flash arrays, the average CPU utilization for an application’s server was about 7%. All-flash systems have moved that utilization up to about 30%, which should mean fewer servers.
But even with that increase in utilization, most environments still have CPUs that are idle 70% of the time! NVMe-based flash arrays are supposed to resolve the utilization problem, but in reality, most application servers reach only about 50% utilization when connected to an NVMe-based flash array. That still leaves an opportunity for another 30% to 40% increase in utilization and a further reduction in physical server requirements.
The lack of fully utilized servers means an organization has to buy two to three times as many physical servers as it should. It also means the organization has to buy two to three times as many software licenses as it needs. Keep in mind that after just a couple of years of use, the cost of software licensing can far outstrip the cost of storage.
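To see where a figure like "two to three times" can come from, here is a minimal Python sketch of the arithmetic. The workload size (30 fully busy servers' worth of work) and the function name `servers_needed` are illustrative assumptions, not measurements. The utilization levels are the ones quoted in this article: 7%, 30% and 50%, plus a 75% target.

```python
def servers_needed(busy_server_equivalents: int, utilization_pct: int) -> int:
    """Servers required to carry a workload at a given average CPU utilization.

    busy_server_equivalents: hypothetical workload size, expressed as the
        number of servers it would fill if each ran at 100% CPU.
    utilization_pct: average CPU utilization actually achieved, 1 to 100.
    """
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be between 1 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-busy_server_equivalents * 100 // utilization_pct)


WORKLOAD = 30  # assumed workload, for illustration only

scenarios = [
    ("Before flash", 7),
    ("All-flash array", 30),
    ("NVMe-based flash array", 50),
    ("Target", 75),
]

for label, pct in scenarios:
    count = servers_needed(WORKLOAD, pct)
    print(f"{label:<24} {pct:>3}% utilization -> {count} servers")
```

Under these assumptions, the same workload needs 100 servers at 30% utilization but only 40 at 75%, a ratio of 2.5. If software is licensed per server, the license count scales the same way.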
What’s the Bottleneck?
Why can’t current systems deliver higher levels of utilization? Mostly, it is because all-flash systems descend from hard disk drive designs. While storage software is more flash-aware than it once was, it is still typically loaded with features that add latency.
In the end, the CPUs that drive the storage system are asked to do too much work. These CPUs have to drive all IO through the system, run the storage software and manage data protection functions. They no longer have the latency of hard disks to hide behind. Flash responds almost instantly, shining a light on all the other storage system components.
The Solution to Driving Down Cost
We can find the solution in the past. When shared storage systems first came to market, CPU horsepower was not as plentiful as it is today. Storage systems often offloaded some of the IO or data management features to FPGAs or ASICs. Then, in the mid-2000s, CPU horsepower increased while becoming less expensive. The extra CPU horsepower, plus the continued use of hard disks, gave storage system vendors just enough processing power, and more than enough latency, to hide sophisticated functions behind.
Again, flash exposes that latency, and the complexity and scale of workloads are increasing exponentially. It is time to return to offload engines to help drive storage performance to the next level. There is precedent for offload throughout the modern data center, such as graphics processors in big data. Storage vendors should follow that lead so they can deliver systems that allow 75% or greater CPU utilization.
To learn more about why flash systems are not living up to their full potential, and how to design systems so they can, join us for our on-demand webinar, “All-Flash For Databases: 5 Reasons Why Current Systems Are Off Target.”