Storage Switzerland was recently commissioned by Brocade, Dell, Emulex and Violin Memory to audit a lab test in which we were able to design an environment that delivered over 2 million IOPS. As we went through the testing process I was struck by two things. First, to generate this type of performance takes a village or, in data center terms, an infrastructure.
Each component of the test (the Dell servers, the Emulex Gen 5 Fibre Channel HBAs, the Brocade Gen 5 Fibre Channel Switch and of course the Violin Memory Arrays) played a critical role in achieving this goal. My second thought was “so” as in “so what?” In other words, what does this mean for the average data center that doesn’t need anything close to 2 million IOPS?
With the participating companies, we will be producing a detailed report, chalk-talk video and a webinar in which you will be able to ask us questions about the results directly. For now though I wanted to provide you with a preview of what we found.
Our test environment consisted of four Dell PowerEdge R910 servers, each with four Emulex Gen 5 Fibre Channel (FC) LPE-1602 cards. Those cards were used to connect the servers to a Brocade Gen 5 Fibre Channel 6520 Switch with 16 Gbps performance. On the other end were four Violin 6616 Flash Memory SAN Arrays. In other words, this is the best of the best. The environment was tuned for performance without much regard to cost. Some environments need and can justify this configuration; frankly, many others can’t. If you’re in the latter group, this test should still be important to you, and we’ll cover that in the “What Does This Mean To Me?” section below.
What is Gen 5?
As we covered in a recent column, Brocade switches and Emulex HBAs leverage Gen 5 Fibre Channel with 16 Gbps performance. Gen 5 Fibre Channel is the purpose-built, data center-proven network infrastructure for storage, delivering unmatched reliability, scalability and performance. This allows them to unleash the full potential of high-density server virtualization, cloud architectures, and flash storage.
With the gear assembled, it came time to start our tests. We found that many of the standard benchmarking suites available are not ready for this type of system. They expect hard drives and ask for configurations that only make sense in a hard-drive-based world. Sometimes simpler is better, so we went with FIO, a standard, easy-to-access testing tool. We ran a variety of tests that ranged from 100% reads to various read/write mixes. In our report we will detail a number of these scenarios because no two workloads are identical and the differences matter.
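To make the two workload shapes concrete, here is a minimal sketch that renders FIO job-file text for a 100% random read test and a 70/30 read/write mix. It is an illustration, not the configuration used in the lab: the device path, queue depth, job count and runtime are hypothetical placeholders, and treating the mixed workload as random I/O is an assumption. The FIO options themselves (rw, rwmixread, bs, iodepth, numjobs, direct, ioengine) are standard ones.

```python
# Sketch only: renders fio job-file text for the two workload shapes the
# article mentions. The device path, queue depth, job count and runtime are
# hypothetical placeholders, not the values used in the lab test.

def fio_job(name, read_pct, device="/dev/sdX", block_size="8k",
            iodepth=32, numjobs=8, runtime_s=60):
    """Return fio job-file text for a random workload with the given read percentage."""
    lines = [
        f"[{name}]",
        f"filename={device}",   # placeholder target, replace before running fio
        "ioengine=libaio",      # Linux asynchronous I/O engine
        "direct=1",             # bypass the page cache
        f"bs={block_size}",
        f"iodepth={iodepth}",
        f"numjobs={numjobs}",
        f"runtime={runtime_s}",
        "time_based",
        "group_reporting",
    ]
    if read_pct == 100:
        lines.append("rw=randread")
    else:
        # Assumption: the mixed tests were random I/O as well.
        lines.append("rw=randrw")
        lines.append(f"rwmixread={read_pct}")
    return "\n".join(lines) + "\n"


read_only_job = fio_job("randread-100", 100)
mixed_job = fio_job("randrw-70-30", 70)

print(read_only_job)
print(mixed_job)
```

Each printed section can be saved to a file and passed to fio once the placeholder device is replaced with a real target.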
As an overview, though, our results ranged from 2.3 million IOPS on 100% random read tests to 1.7 million IOPS with a 70/30 read/write mix (using an 8K block size).
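For a rough sense of what those IOPS figures mean in bandwidth terms, the arithmetic can be sketched as follows. This assumes that “8K” means 8 KiB (8,192 bytes) and that the same size applied to both the read-only and mixed tests; neither detail is confirmed here, so treat the output as an estimate.

```python
# Back-of-the-envelope conversion from IOPS to throughput.
# Assumption: "8K" means 8 KiB (8192 bytes) for both workloads.

BLOCK_BYTES = 8 * 1024


def throughput_gb_per_s(iops, block_bytes=BLOCK_BYTES):
    """Return aggregate throughput in decimal gigabytes per second."""
    return iops * block_bytes / 1e9


read_only_gbps = throughput_gb_per_s(2_300_000)  # 100% random read result
mixed_gbps = throughput_gb_per_s(1_700_000)      # 70/30 read/write result

print(f"100% random read: {read_only_gbps:.2f} GB/s")
print(f"70/30 mix:        {mixed_gbps:.2f} GB/s")
```

Under these assumptions the read-only result works out to roughly 18.8 GB/s and the mixed result to roughly 13.9 GB/s across the whole environment.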
The Flash Village
We expected the Violin arrays to be fast; what impressed me was how well the Gen 5 Fibre Channel infrastructure responded. In the report we will detail the results of the testing with fewer arrays, but it was the infrastructure that allowed us to create a shared environment that generated the combined IOPS result.
The Dell PowerEdge servers were needed so that FIO could generate the data streams required to stress the environment. The Emulex adapters played an obvious role by getting that data off the servers and onto the network and, probably most important, the Brocade switch made sure that traffic flowed seamlessly back and forth through the network. Remember, data was both read and written, so the importance of the network cannot be overstated.
This test supports Storage Switzerland’s long-standing position that performance tuning requires a holistic approach. The move to All-Flash arrays has to be justified by delivering performance that enables the business to innovate. Putting very fast memory based storage at the end of a sub-optimal infrastructure does not deliver the return on investment (ROI) needed to justify the flash investment. The whole environment needs to be flash optimized, not just the storage system.
What Does This Mean To Me?
Probably the most important part of our report is going to be explaining to readers just why achieving 2 million IOPS is important to them when, more than likely, the majority of them need less than 500K IOPS. If we don’t accomplish this, then all we have is an interesting science experiment, not a practical application of flash technology.
I think most users are under the impression that storage system vendors have vast labs filled with storage guys just running tests all day. In reality this is rarely the case. Businesses, especially in the modern era, are lean and focused on the customer and the project at hand. It is rare that we can slow down long enough to go through the process of actually testing the limits of these devices, especially devices that can generate this level of performance.
This isn’t saying storage companies don’t do performance tests on their products; Violin certainly does, exhaustively. It’s just not the norm to put together an infrastructure that few customers will need (today). But like the ‘concept car’ that customers want, these projects teach manufacturers (and analysts) more about their technologies and push the envelope to benefit the storage products that the majority of their customers will buy.
The result is that tests like this become significant opportunities to verify what we think is theoretically possible and to learn when our theories don’t pan out. As noted above this test verified the importance of the community of vendors as well as the raw performance potential of the Violin arrays. They also leave you wanting more; as I write this I can already think of four other scenarios that I’d like to either test or re-test.
But I Only Need 500k IOPS (or less)
OK, so what if you’re in that broad category of users who only need 500K IOPS? The good news is that some of the scenarios we ran were focused on more cost-effective, single-unit configurations, which we’ll detail in the report as well. Until then, even if you don’t need 500K IOPS, we can still say that this test confirmed the importance of the infrastructure, the ‘community’, in optimizing the performance of a complex system.
In other words, getting the most out of a single flash array requires the same thing a four-array ‘concept car’ does: the right compute, storage and networking elements, all properly configured and tested. This was evidenced by the impressive results we got with a single array: read-only performance in the 870K IOPS range and mixed read/write performance around 457K IOPS.
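One way to relate the single-array numbers to the four-array results is to compute how close the combined figure comes to four times the single-array figure. The sketch below does only that arithmetic. It assumes the single-array mixed test used the same 70/30 ratio as the four-array test, which is not stated here, and the inputs are the rounded figures quoted in this preview, so the percentages are indicative at best and say nothing about where any limit lies.

```python
# Rough scaling comparison between one array and four arrays.
# Assumptions: the quoted figures are rounded, and the single-array mixed
# test used the same read/write ratio as the four-array test.

def scaling_efficiency(multi_iops, single_iops, units):
    """Fraction of perfect linear scaling achieved by the multi-unit result."""
    return multi_iops / (single_iops * units)


read_eff = scaling_efficiency(2_300_000, 870_000, 4)
mixed_eff = scaling_efficiency(1_700_000, 457_000, 4)

print(f"Read-only scaling across 4 arrays: {read_eff:.0%}")
print(f"Mixed scaling across 4 arrays:     {mixed_eff:.0%}")
```

The same function can be reused with the per-scenario figures from the full report once they are available.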
Storage Swiss Take
The good news from our lab test of the Brocade, Dell, Emulex and Violin Memory systems is that there’s something for everyone. For those who need to and can push the limit, we have. For those who can’t, we can show how to design an environment that provides optimal performance for you.