If all-flash arrays provide instant, unlimited performance, why manage how much of that performance they deliver through techniques like quality of service (QoS)? For many organizations, one of the attractions of all-flash arrays is the set-and-forget answer to performance: install the array and all performance problems magically disappear. Until they come back. Instead of opening the performance floodgates, IT professionals should control the flow to make sure that mission-critical applications get the performance they need.
The All-Flash Adoption Cycle
Organizations buy most all-flash arrays to solve a very specific performance problem. It could be a database that can’t handle the user load or an analytics job that can’t process data quickly enough. In almost all cases, that initial flash array purchase solves the problem, often to the point of eliminating it.
At that point the organization realizes that it has excess performance available, so much so that it feels unlimited. And thanks to the data efficiency techniques common in all-flash arrays, it can add additional applications without consuming too much of that expensive flash capacity. It seems like nirvana.
As time goes on, IT adds more and more workloads. Then the original application has a spike in performance need, but now, instead of having the whole flash array’s performance available to itself, it is sharing that performance with dozens if not hundreds of workloads. The result? The all-flash array can no longer deliver the performance to the very application IT used to justify the flash purchase.
Managing Performance Allocation
Managing storage capacity is relatively easy, even with data efficiency doing its job. IT administrators can see capacity being consumed. Data has gravity; it is essentially permanent. And, again thanks to data efficiency, it is relatively hard to fill up an all-flash array.
Performance, in terms of IOPS and bandwidth, is more temporal. An application needs peak I/O response for a short period, typically a few seconds, and then it doesn’t. When the gaps between those I/O demands are seconds or minutes long, the hardware typically has time to respond to all the requests. But if the array is loaded with dozens of workloads all requiring moderate performance, and a few of those applications suddenly need peak performance, problems occur.
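To make that arithmetic concrete, here is a minimal sketch (the IOPS figures are illustrative assumptions, not measurements from any specific array) of how workloads that fit comfortably at their average demand can exceed the array’s limit when even a few of them peak at the same moment:

```python
# Illustrative numbers only: a 100,000-IOPS array hosting 20 workloads.
ARRAY_MAX_IOPS = 100_000
AVERAGE_IOPS = 2_000   # typical steady-state demand per workload
PEAK_IOPS = 30_000     # short burst demand per workload
workloads = 20

# Steady state: everyone at their average fits easily.
steady_total = workloads * AVERAGE_IOPS
print(steady_total)  # 40,000 -- well within the array's limit

# Now just three workloads spike at the same moment.
spiking = 3
burst_total = (workloads - spiking) * AVERAGE_IOPS + spiking * PEAK_IOPS
print(burst_total)  # 124,000 -- more than the array can deliver
```

The point is that average utilization tells you almost nothing about whether coincident peaks will oversubscribe the array.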
There are only a few choices when it comes to all-flash performance management. First, the organization can buy an all-flash array that will handle the peak I/O demands of ALL the workloads it supports. That’s expensive. Second, the organization can limit which workloads go on the flash array to make sure there is no conflict with the original workload’s requirements. That means performance and capacity on the system go unused most of the time. The better alternative is to manage performance with QoS capabilities.
What Does All-Flash QoS Look Like?
All-flash quality of service should have two aspects. First, it should offer a reservation capability, so that certain applications always have a certain amount of performance set aside for them. This is the performance equivalent of thick-provisioning capacity. Second, these settings should carry minimum and maximum attributes. For example, an MS-SQL application should always get a minimum of 5,000 IOPS but never more than 15,000 IOPS.
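As a sketch of what the maximum side of such a policy can look like under the hood, here is a generic token-bucket cap in Python. This is an illustration of the general technique, not any vendor’s implementation; the `QosPolicy` and `TokenBucket` names and the IOPS figures are assumptions for the example:

```python
import time
from dataclasses import dataclass

@dataclass
class QosPolicy:
    min_iops: int  # performance reserved for the workload (the floor)
    max_iops: int  # hard ceiling the workload may not exceed

class TokenBucket:
    """Enforces a max-IOPS cap: tokens refill at max_iops per second,
    and each I/O consumes one token. I/Os without a token get throttled."""
    def __init__(self, policy: QosPolicy):
        self.rate = policy.max_iops
        self.tokens = float(policy.max_iops)
        self.last = time.monotonic()

    def allow_io(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should queue or delay this I/O

# The MS-SQL example from the text: at least 5,000 IOPS, never more than 15,000.
sql_policy = QosPolicy(min_iops=5_000, max_iops=15_000)
bucket = TokenBucket(sql_policy)
```

Note that the bucket only enforces the ceiling; honoring the 5,000-IOPS floor is a scheduling and admission-control problem on the array side.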
If the organization chooses a system with QoS capabilities, then as it onboards applications it should set all new workloads to a maximum of 5,000 to 10,000 IOPS, and let any application that needs more performance have its dial turned up from the remaining pool. The result should be a more consistent performance experience while enabling a greater number of workloads to be deployed.
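That onboarding discipline can be sketched as simple bookkeeping against the array’s rated IOPS. The `IopsPool` helper and the numbers below are hypothetical, for illustration only: reservations are subtracted from capacity, and a workload is admitted (or dialed up) only if the remaining pool covers it.

```python
class IopsPool:
    """Tracks minimum-IOPS reservations against an array's rated capacity."""
    def __init__(self, array_max_iops: int):
        self.capacity = array_max_iops
        self.reserved = 0

    def reserve(self, min_iops: int) -> bool:
        """Admit (or dial up) a workload only if its reservation still fits."""
        if self.reserved + min_iops > self.capacity:
            return False  # would over-commit the guaranteed performance
        self.reserved += min_iops
        return True

    def headroom(self) -> int:
        """Unreserved IOPS left to hand out when an application needs more."""
        return self.capacity - self.reserved

# Illustrative: a 100,000-IOPS array; the mission-critical app reserves its floor.
pool = IopsPool(100_000)
pool.reserve(5_000)
print(pool.headroom())  # 95,000 IOPS left in the pool to dial up later
```

Because reservations never exceed capacity, there is always performance in reserve for the mission-critical application’s spike.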
The goal of an all-flash array investment should be to place the maximum number of workloads on it without impacting the performance of mission-critical applications. Data efficiency will manage much of the capacity concern, but performance needs to be managed too. QoS allows IT to do a controlled release of performance so that it will always have some in reserve when mission-critical applications need it.
Managing performance is just one aspect of an all-flash deployment. Click here to watch our on-demand webinar, “You Bought All-Flash, Now What?”, to learn what else you should be paying attention to as you deploy all-flash.
Could there also be an argument for doing this based on the known slowdown that occurs after the initial installation of flash? I.e., if performance is limited for most apps, they will never notice the system slowing and won’t have a reason to complain.
We would go for an all-flash array only if we don’t want to manually manage the dataset in it. That would definitely mean a huge dataset being used by multiple applications. In such a scenario, if a rogue application constantly hogs all the IOPS, the other applications are going to suffer and get low IOPS. In that case, QoS per share/volume would guarantee that no rogue application could misbehave.