If all storage were the same price, all storage would be solid state storage. Data tiering would end, there would be no performance tuning, and the discussion of storage performance problems would be a thing of the past. Of course, not all storage is the same price, and as a result the data center has to deal with fairly complex methods to affordably deploy solid state storage, including mixing it with mechanical hard disk drives. Flash-only storage systems promise to deliver the ‘solid state data center’, but can they overcome the problem of cost?
Cost is the overriding challenge that can prevent flash-only storage systems from going mainstream in the data center. There’s no question they can meet the performance demands of the typical enterprise, and reliability concerns are quickly being addressed. Cost remains a concern, but thanks to their extreme performance capabilities, all-flash systems can leverage a multitude of storage efficiency techniques to drive that cost down.
Companies like Pure Storage are helping the industry reach the point where all-flash storage systems can be price competitive with high-speed (15K RPM) mechanical hard drive systems. Theoretically, high-speed disk array systems could leverage the same storage efficiency techniques that all-flash systems do. The problem is that hard disks don’t have the performance required to manage and execute those functions.
Multi-Step Approach To Driving Out Flash Cost
Thanks to the excess storage performance that an all-flash system has, companies like Pure Storage can use every storage efficiency technique available. And they’re still able to deliver performance that’s considerably higher and more consistent than mechanical hard drive arrays, even those assisted by a flash tier or cache.
The first step for companies delivering flash-only storage solutions is to thinly provision all volumes on the system to make sure that capacity is only used when it’s actually needed. In certain environments, like VMware, and even in some storage systems, the overuse of thin provisioning can significantly impact performance. This is because each write of data carries the overhead of provisioning volume space. The impact is seen on both mechanical hard drive systems and HDD systems assisted by flash, since most don’t use flash for write I/O.
Because flash storage has to clear out a flash cell before writing data to it, the technology is slower at writing data than it is at reading data. Despite this, flash storage usually outperforms hard disk storage when writing data. This performance advantage widens when the flash is properly managed using advanced flash controllers and storage software designed specifically for flash, as is the case with Pure Storage. As a result, the extra I/O caused by some write functions has little to no impact on the server/user experience.
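To make the write-time overhead concrete, here is a minimal Python sketch of a thin-provisioned volume. It is a toy model only; the class name ThinVolume and its methods are hypothetical and do not represent any vendor's implementation. Space is allocated the first time a block is written, which is the extra step that each new write pays for.

```python
class ThinVolume:
    """Toy thin-provisioned volume: physical space is allocated on first write."""

    def __init__(self, logical_blocks: int, block_size: int = 4096):
        self.logical_blocks = logical_blocks  # size promised to the host
        self.block_size = block_size
        self._blocks = {}                     # logical block address -> data

    def write(self, lba: int, data: bytes) -> bool:
        """Write one block; return True if this write had to allocate space."""
        if not 0 <= lba < self.logical_blocks:
            raise IndexError("LBA outside the volume")
        if len(data) > self.block_size:
            raise ValueError("data larger than one block")
        allocated_now = lba not in self._blocks  # the provisioning overhead
        # Pad short writes with zeros so every stored block is full size.
        self._blocks[lba] = data.ljust(self.block_size, b"\x00")
        return allocated_now

    def read(self, lba: int) -> bytes:
        if not 0 <= lba < self.logical_blocks:
            raise IndexError("LBA outside the volume")
        # Unwritten blocks read back as zeros and consume no physical space.
        return self._blocks.get(lba, b"\x00" * self.block_size)

    @property
    def physical_bytes(self) -> int:
        return len(self._blocks) * self.block_size

    @property
    def logical_bytes(self) -> int:
        return self.logical_blocks * self.block_size
```

A volume created with 1,000 logical blocks reports a logical size of about 4 MB but consumes nothing until the first write, and only the first write to a given block returns True.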
Most studies show that in traditional “thick provisioned” volumes only 30-40% of storage allocated to applications actually has data written to it. This means that 60% or more of the storage that’s been allocated is wasted. Because a thinly provisioned all-flash system consumes capacity only when data is actually written, close to 100% of the capacity in use holds real data, and the effective cost of capacity is greatly reduced.
The next step is to decrease the amount of data being written to those storage systems through always-on deduplication and compression. All-flash systems, again leveraging their excess performance capabilities, can perform deduplication and compression on data as it’s being received by the storage system. This has several positive effects.
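The arithmetic behind that claim can be sketched in a few lines of Python. The function name and the price of 1.00 per raw gigabyte are illustrative assumptions, chosen only to make the ratio easy to read; the 40% utilization figure is the upper end of the range quoted above.

```python
def cost_per_gb_of_data(price_per_raw_gb: float, utilization: float) -> float:
    """Cost of each GB that actually holds data, given the fraction of
    purchased capacity that has data written to it."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return price_per_raw_gb / utilization


# Hypothetical price of 1.00 per raw GB (an assumption, not a market figure).
thick = cost_per_gb_of_data(1.00, utilization=0.40)  # thick provisioned
thin = cost_per_gb_of_data(1.00, utilization=1.00)   # consumed only on write
print(f"thick: {thick:.2f} per GB of data, thin: {thin:.2f} per GB of data")
```

At 40% utilization each gigabyte of real data costs 2.5 times the raw price, while a fully utilized system pays only the raw price.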
First, inline deduplication and compression make sure that no redundant data is ever written to the device, meaning that no capacity is wasted, even for a moment, on duplicate data. In primary storage the rate of deduplication effectiveness is lower than it is in the backup process. Stated another way, “mileage may vary”. The typical use case produces a roughly 5X increase in efficiency, and when these systems are used in a virtual server or virtual desktop environment, that rate can grow to 10X or more.
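As a rough illustration of the inline approach, the Python sketch below fingerprints each fixed-size chunk as it arrives and stores (compressed) only the chunks it has not seen before. It is a simplified model under stated assumptions: the DedupStore class, the 4 KB fixed chunk size, SHA-256 fingerprints and zlib compression are all choices made for this example, not a description of how any particular array works.

```python
import hashlib
import zlib
from typing import Dict, List


class DedupStore:
    """Toy inline deduplication + compression store (fixed-size chunks)."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self._chunks: Dict[str, bytes] = {}  # fingerprint -> compressed chunk
        self.logical_bytes = 0               # bytes the hosts think they wrote

    def write(self, data: bytes) -> List[str]:
        """Store data; return the list of fingerprints needed to read it back."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Inline: a duplicate chunk is detected before it is ever stored.
            if fingerprint not in self._chunks:
                self._chunks[fingerprint] = zlib.compress(chunk)
            recipe.append(fingerprint)
        self.logical_bytes += len(data)
        return recipe

    def read(self, recipe: List[str]) -> bytes:
        return b"".join(zlib.decompress(self._chunks[fp]) for fp in recipe)

    @property
    def physical_bytes(self) -> int:
        return sum(len(c) for c in self._chunks.values())

    def reduction_ratio(self) -> float:
        """Logical bytes written divided by physical bytes stored."""
        if self.physical_bytes == 0:
            return 0.0
        return self.logical_bytes / self.physical_bytes
```

Writing the same virtual machine image twice stores its chunks only once, which is why virtual server and virtual desktop environments tend to see the higher ratios. The ratio reported by this toy depends entirely on the test data, so it should not be read as a prediction of real-world results.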
All-flash systems also enable one more storage efficiency feature in the virtual environment use case, the aggressive use of cloning. For the same reasons that thin provisioning is avoided (dynamic provisioning on write I/O) in virtual environments, cloning of virtual machines is often used sparingly. In an all-flash storage environment this ability is again available and can be used aggressively since the performance impact of dynamic provisioning features like thin provisioning, snapshots and cloning will typically go unnoticed.
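One common way to make clones cheap is to copy only a volume's block map and let the clone share the underlying blocks until it diverges. The Python sketch below shows that idea in its simplest form. The BlockPool and Volume names are hypothetical, the redirect-on-write scheme is an assumption for illustration, and the toy never reclaims unreferenced blocks.

```python
import itertools
from typing import Dict, Optional


class BlockPool:
    """Shared pool of physical blocks, addressed by an opaque block id."""

    def __init__(self):
        self.blocks: Dict[int, bytes] = {}
        self._ids = itertools.count()

    def put(self, data: bytes) -> int:
        block_id = next(self._ids)
        self.blocks[block_id] = data
        return block_id


class Volume:
    """Toy volume whose clones share blocks until one side overwrites them."""

    def __init__(self, pool: BlockPool, block_map: Optional[Dict[int, int]] = None):
        self.pool = pool
        self.block_map: Dict[int, int] = dict(block_map or {})  # LBA -> block id

    def write(self, lba: int, data: bytes) -> None:
        # Redirect-on-write: new data always lands in a fresh block, so other
        # volumes that still point at the old block are unaffected.
        self.block_map[lba] = self.pool.put(data)

    def read(self, lba: int) -> bytes:
        return self.pool.blocks[self.block_map[lba]]

    def clone(self) -> "Volume":
        # Only the map is copied; no data blocks are duplicated.
        return Volume(self.pool, self.block_map)
```

Cloning a golden image this way consumes no additional data blocks; the pool grows only when a clone writes something of its own.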
All-flash systems have another advantage beyond the storage efficiency techniques detailed above. They can leverage less expensive flash media than the SLC flash that’s commonly used with tiering and caching techniques. All types of flash media have a specific endurance, or the greatest number of write cycles they can sustain reliably. The greater the endurance, the higher the cost. SLC has the highest endurance (about 100K write cycles) but also costs the most. Tiering and caching systems are high-turnover environments where data is constantly being moved into and out of the flash storage area as it becomes active or inactive. This high turnover means significant write traffic and, as a result, requires high write-endurance media like SLC.
All-flash systems, like those offered by Pure Storage, don’t have multiple tiers to move data between. They use the above storage efficiency techniques to fit all the data in the environment on flash storage. As a result the turnover rate of data coming into and out of their systems is similar to a standard tier 1 storage system. And these systems can use more cost-effective eMLC or MLC memory to store data and still meet the levels of reliability required in the enterprise.
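A back-of-the-envelope endurance estimate shows why turnover matters. The Python sketch below assumes perfect wear leveling and treats write amplification as a simple divisor. The 100K cycle figure for SLC comes from the text above; the 5,000 cycle figure for MLC, the capacities and the daily write volume are illustrative assumptions only.

```python
def years_of_endurance(capacity_gb: float, write_cycles: float,
                       writes_gb_per_day: float,
                       write_amplification: float = 1.0) -> float:
    """Idealized media lifetime in years, assuming writes are spread evenly
    across all cells (perfect wear leveling)."""
    if writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write rate and write amplification must be positive")
    total_writable_gb = capacity_gb * write_cycles / write_amplification
    return total_writable_gb / writes_gb_per_day / 365.0


DAILY_WRITES_GB = 10_000   # assumed write traffic, identical in every case
SLC_CYCLES = 100_000       # figure quoted in the article
MLC_CYCLES = 5_000         # assumed, for illustration only

# Small cache tier absorbing all of the write traffic.
cache_slc = years_of_endurance(500, SLC_CYCLES, DAILY_WRITES_GB)
cache_mlc = years_of_endurance(500, MLC_CYCLES, DAILY_WRITES_GB)

# Large all-flash system: the same traffic spread over far more capacity.
all_flash_mlc = years_of_endurance(20_000, MLC_CYCLES, DAILY_WRITES_GB)

print(f"cache on SLC: {cache_slc:.1f} years")
print(f"cache on MLC: {cache_mlc:.1f} years")
print(f"all-flash on MLC: {all_flash_mlc:.1f} years")
```

Under these assumptions the small cache wears out lower-endurance media in well under a year, while the same write traffic spread across a much larger all-flash pool leaves that media with a comfortable lifetime.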
The ROI Of All Flash
The storage efficiency techniques described above add up. Initially, less storage needs to be purchased than before thanks to the combination of system-wide thin provisioning (which reduces upfront purchases by potentially 60%) and compression/deduplication (which reduces the real capacity required by 5 to 10X). These two techniques also curtail the rate at which storage demands will grow over time, further adding to the cost savings.
Finally, there is the advantage of safely using less expensive flash media that’s about half the price of the premium SLC flash that caching/tiering processes are forced to use. Also, as Storage Switzerland discusses in the article “The Challenges with SSD Caching and Tiering”, the users of all-flash systems don’t have to contend with the impact of cache misses.
When the storage efficiency techniques and the ability to safely use lower-cost memory are combined, all-flash storage systems can typically match the price of high-performance, tier 1, HDD-based storage systems. What’s more, they’re now working their way into the price band of tier 2 storage systems. In addition, all-flash systems don’t have the complexities associated with managing the tiering and caching processes, nor do they need IT managers to constantly analyze how much of their IT budget should be spent on the SSD tier.
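The way these factors compound can be sketched with simple arithmetic. The Python function below multiplies the three effects described in this article: the fraction of allocated capacity that actually holds data, the data reduction ratio and the relative price of the media. It assumes the factors are independent and ignores RAID, spare capacity and controller costs, so it is an idealized illustration rather than a pricing model. The function name and the normalized SLC price of 1.00 are assumptions.

```python
def flash_cost_per_allocated_gb(slc_price_per_gb: float,
                                utilization: float = 0.40,
                                data_reduction: float = 5.0,
                                media_price_factor: float = 0.5) -> float:
    """Idealized flash cost for each GB allocated to applications.

    utilization        -- fraction of allocated capacity with data written
    data_reduction     -- combined deduplication/compression ratio (5 means 5X)
    media_price_factor -- media price relative to SLC (0.5 means half)
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    if data_reduction < 1.0 or media_price_factor <= 0.0:
        raise ValueError("invalid reduction ratio or price factor")
    raw_gb_per_allocated_gb = utilization / data_reduction
    return slc_price_per_gb * media_price_factor * raw_gb_per_allocated_gb


# Normalized SLC price of 1.00 per GB (an assumption for readability).
typical = flash_cost_per_allocated_gb(1.00)                       # 5X reduction
virtual = flash_cost_per_allocated_gb(1.00, data_reduction=10.0)  # 10X reduction
print(f"typical: {typical:.2f}, virtualized: {virtual:.2f} per allocated GB")
```

With 40% utilization, 5X data reduction and media at half the price of SLC, each allocated gigabyte costs about 4% of what fully provisioned SLC capacity would. That gap is what allows all-flash systems to approach the price of high-performance HDD arrays despite flash's higher raw price per gigabyte.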