When evaluating a new storage system, especially an all-flash array, the number of IOPS (Input/Output Operations Per Second) that the storage system can sustain is often used to differentiate one storage system from another. But is this really a standard that has any merit given the demands of today’s data center and the capabilities of today’s storage systems?
There are three factors that, when combined, tell the full story of storage performance: transfer rate (bandwidth), latency, and IOPS. Most storage vendors tend to focus on IOPS to brag about how fast their storage system is. But measuring storage system performance by IOPS only has value if the workloads using that storage system are IOPS-demanding.
Transfer Rate vs. IOPS
There are many variables to consider when trying to determine the overall performance of a storage system. There are external factors, like how the data is being read from or written to the storage system and the speed of the storage network fabric itself. There are also internal considerations, like the CPU power of the storage compute engine (the storage controller), the efficiency of the storage software and, of course, the speed of the storage media installed in the storage system.
For the purposes of this article, we’ll assume that all the external factors are equal. If that is the case, then transfer rate is essentially the speed at which the storage controller can move a contiguous block of data through the storage software to the storage media. It is typically measured in MB/s, and a high transfer rate is important, especially for workloads that are sequential in nature.
IOPS, however, are different; they are measured as a count of operations per second rather than as a volume of data. The figure refers to the maximum number of reads and writes to non-contiguous storage locations that a system can complete each second. On hard disks these operations are typically dominated by seek time, the time it takes a disk drive to position its read/write heads at the correct location. Because this positioning of heads is so time-consuming, the importance of storage controller CPU power and the efficiency of storage software is greatly diminished in a hard disk array. Flash arrays virtually eliminate seek time from consideration and, as such, they make the other variables, like the power of the storage controller and the efficiency of the storage software, far more important. The storage controller and storage software can no longer hide behind the bad performance of the hard drive. Flash exposes them for what they are.
By way of example, let’s compare how two workloads accessing the same amount of data place significantly different I/O demands on the storage system. The first workload requires reading ten 750MB files, 7.5GB in total, and it takes 100 seconds for the transfer to occur. This means that the transfer rate is 75MB/s and, if each file is read with a single sequential request, the workload issues only 10 I/O operations, which is well within the capabilities of a single hard disk. The second workload requires reading ten thousand 750KB files, the same 7.5GB of data, but it issues 10,000 I/O operations to non-contiguous locations. Since the typical disk drive can’t sustain more than about 200 IOPS, the positioning time for all of those requests adds up, and this workload won’t get done in the same 100 seconds. This is an example of how different workloads can require significantly different performance while using the same storage capacity.
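The comparison can be sketched with a rough model. The function below is an illustration only, not a measurement: it assumes each file is read with one request, that the drive pays one positioning delay per request at its maximum rate of 200 IOPS, and that 75MB/s is the drive's sequential transfer rate. The function name and parameters are hypothetical.

```python
def estimated_read_seconds(file_count, file_size_mb,
                           transfer_mb_per_s=75.0, max_iops=200.0):
    """Rough read time for a set of equally sized files on one hard disk.

    Assumes one request per file: each request costs one positioning
    delay (1 / max_iops seconds) plus the time to stream the file.
    """
    positioning_seconds = file_count / max_iops
    transfer_seconds = (file_count * file_size_mb) / transfer_mb_per_s
    return positioning_seconds + transfer_seconds


# Both workloads move the same 7.5GB (decimal units: 1MB = 1000KB).
large_files = estimated_read_seconds(file_count=10, file_size_mb=750.0)
small_files = estimated_read_seconds(file_count=10_000, file_size_mb=0.75)

print(f"ten 750MB files:          {large_files:.2f} s")
print(f"ten thousand 750KB files: {small_files:.2f} s")
```

Under these assumptions the large-file workload finishes in about 100 seconds, while the small-file workload needs about 150 seconds, because 50 seconds are spent positioning rather than transferring data.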
Do IOPS Matter? – Simple Answer, No
With the definition of IOPS out of the way, the next question is whether an IT professional should be concerned about the potential IOPS performance of a storage system. IOPS were a far more important measurement in the hard disk array era because the potential number of IOPS was often less than what the data center needed. In the all-flash array era the opposite is true. Most all-flash arrays will deliver far more IOPS than most data centers will need.
IOPS Measurements Can’t Be Trusted
The other problem with using IOPS as a way to differentiate between flash storage systems is that there are too many ways to generate an IOPS number, as our illustration above indicates. IOPS can be significantly impacted by the size of the block used, the mix of read/write activity and the amount of randomness in the I/O stream. Even if vendors all standardized on how each of these variables was to be set, the result would have little relevance to the data center. For example, if all vendors agreed to report IOPS from tests using 4k block sizes and a 50% random read/write mix, the resulting number would have little meaning to a data center whose workloads were generating 32k blocks with an 80% read ratio. Finally, most data centers are going to have multiple workloads running on their all-flash array, so the array will likely have to support a wide variety of workloads with varying read/write mixes.
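One way to see why block size matters is the basic relationship between the three numbers: transfer rate equals IOPS multiplied by block size. The sketch below applies that relationship to a made-up figure of 100,000 IOPS; the figure and the function name are assumptions chosen for illustration, not vendor results.

```python
def throughput_mb_per_s(iops, block_size_kb):
    """Transfer rate implied by an IOPS figure at a given block size.

    Uses decimal units (1MB = 1000KB).
    """
    return iops * block_size_kb / 1000.0


quoted_iops = 100_000  # hypothetical headline number, not a real result

at_4k = throughput_mb_per_s(quoted_iops, block_size_kb=4)
at_32k = throughput_mb_per_s(quoted_iops, block_size_kb=32)

print(f"{quoted_iops} IOPS at 4k blocks:  {at_4k:.0f} MB/s")
print(f"{quoted_iops} IOPS at 32k blocks: {at_32k:.0f} MB/s")
```

The same headline number implies 400MB/s at 4k blocks but 3,200MB/s at 32k blocks, so an IOPS figure quoted without its block size says little about whether a system can carry a given workload.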
The Right Measurement
The right way to measure the performance of an all-flash array or even a hybrid array is to develop performance statistics based on particular workloads or a mix of workloads. For example, run a SQL performance test and a VDI performance test at the same time on the same storage system and instead of reporting on IOPS consumed, report on data that is more tangible and relevant to the data center. In this case, it might be the number of simultaneous SQL users and VDI instances supported while still maintaining acceptable response times.
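As a minimal sketch of that kind of reporting, the code below takes the results of a series of combined test runs, ordered from lightest to heaviest load, and returns the heaviest one that still met a response time target. The class, the function, the 20ms target and every number in the sample data are hypothetical placeholders; real values would come from the data center's own tests.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    sql_users: int        # simultaneous SQL users in this run
    vdi_instances: int    # simultaneous VDI instances in this run
    response_ms: float    # response time observed during the run


def largest_supported_load(trials, max_response_ms):
    """Return the last trial before response time first exceeds the target.

    Assumes trials are ordered from lightest to heaviest load.
    Returns None if even the lightest trial misses the target.
    """
    supported = None
    for trial in trials:
        if trial.response_ms > max_response_ms:
            break
        supported = trial
    return supported


# Made-up sample data, for illustration only.
trials = [
    TrialResult(sql_users=100, vdi_instances=50, response_ms=4.0),
    TrialResult(sql_users=200, vdi_instances=100, response_ms=6.5),
    TrialResult(sql_users=400, vdi_instances=200, response_ms=12.0),
    TrialResult(sql_users=800, vdi_instances=400, response_ms=35.0),
]

result = largest_supported_load(trials, max_response_ms=20.0)
print(result)
```

With this sample data the answer is reported as 400 SQL users and 200 VDI instances, a figure that maps directly onto capacity planning in a way a raw IOPS number does not.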
As stated above, most all-flash arrays will deliver more performance than most data centers can take advantage of today. But today is the operative word here. As virtual server density, virtual desktop density and users per database instance all continue to scale, data centers will need more and more performance. The flash media itself will become slightly faster, but the key roadblock to expanding performance will be the storage controller and the efficiency of the storage software.
The features and capabilities of the storage software will add overhead to flash performance. The efficiency of that storage software in how it executes these various capabilities is critical to overall performance. Fortunately, the all-flash vendor has access to ever-increasing compute power that can mask most of the overhead of the storage software. It is critical though that the flash vendor be able to provide an upgrade path for their controller hardware so that their customers are able to take advantage of the increased power of each iteration of Intel CPUs.
Using IOPS as a way to differentiate between flash arrays is a risky practice. Most systems provide more IOPS than the typical data center needs. These data centers can better spend their time by looking for a flash array that provides the features they need at a price they can afford, as well as an upgrade path to continue to stay ahead of performance demand.
For those data centers that do need high enough performance that IOPS could be relevant, the vendor-provided IOPS numbers vary too much in how they are produced to provide meaningful differentiation. These data centers are better off requesting specific results based on a mixture of workloads that closely matches their environment.
Excellent points. Many are confused by what sounds good and what really matters. IOPS is one, and IO latency is another. First, IOPS or IO latency does not automatically translate to specific I/O performance. Second, I/O performance does not automatically translate to application performance. Focusing solely on simple IOPS or IO latency results in a “twice-removed mindset” from the reality of what really matters – application performance. This article helps to promote the correct understanding toward achieving the right balance.
Great article guys and spot on. However, neither transfer rate nor IOPS gives a full picture of workload. There are two other big factors: I/O block size and transaction RESPONSE TIME as measured at the server or VM. Why does nobody talk about this? Because only Virtual Instruments can measure response time (server or VM to storage total response), and Virtual Instruments can correlate IOPS, MB/sec, I/O block size and response time all on one chart. That gives you the FULL picture of workload and will lead to optimal design of storage, switches, servers and VMs.
This is not going to resolve itself soon. I agree that benchmarking in a workload context is ideal for customer decision-making. For example, $/desktop is a key metric for VDI decision-makers. But this requires a non-trivial investment on the vendor side, only covers the most popular and well-understood workloads, and ignores the widespread challenge of defining performance for a multi-workload virtualized (blender) environment. VMmark is another good benchmark, but it reflects the performance of the full reference architecture rather than merely storage.
Vendors and customers can always benefit from a naive benchmark, such as the usual 4KB 70/30 IOPS or sequential read throughput, because we need apples-to-apples competitive comparisons that are simple, unambiguous, and not dependent on the nuanced complexities of networking, host configuration, and application best practices.
I’m glad to see more discussion of IOPS and bytes/sec, and of the workloads that bring the different measures to greater importance.
I hope a growing number of systems plan for the peak and typical number of outstanding I/Os as well. With more powerful CPUs on one side and more capable storage on the other, many systems experience a frustrating bottleneck of queued I/O in the middle, keeping both CPUs and storage more idle than they need to be.