Is All-Flash Deduplication a Must Have?

In the early days of All-Flash Arrays (AFAs), deduplication was a key catalyst for adoption. Server and desktop virtualization environments benefited greatly from the technology because of the similarity between virtual machine images. Those environments also didn't place the same performance demands on the system as scale-up databases and other applications. Applying deduplication to databases and similar workloads doesn't deliver the same capacity gains, so it wastes compute resources. IT planners now need to reconsider whether deduplication is a "must have" in their AFAs.

Deduplication is an investment. In a storage system, the vendor invests CPU and memory resources in the hope that the return is a reduction in capacity requirements. Deduplication on production storage is nowhere near as effective as it is on backup storage. While some vendors claim 5:1 deduplication ratios, most customers actually see about 3:1 when workloads like databases and virtualized infrastructure are blended together.

In an era when flash capacity cost more than ten dollars per gigabyte, the investment made sense: the customer effectively got $30 of storage for every $10 purchased, a $20 savings. Today, flash storage costs less than $1 per gigabyte, so that same 3:1 ratio saves only $2 per gigabyte.
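The arithmetic above can be sketched in a few lines; the prices and ratio are the article's illustrative figures, not measurements:

```python
def savings_per_raw_gb(price_per_gb: float, dedupe_ratio: float) -> float:
    """Capacity savings per purchased (raw) GB at a given dedupe ratio.

    A 3:1 ratio means each raw GB stores 3 GB of logical data, so the
    buyer avoids purchasing (ratio - 1) additional gigabytes.
    """
    return price_per_gb * (dedupe_ratio - 1)

# Early flash era: $10/GB raw, 3:1 dedupe -> $20 saved per raw GB
print(savings_per_raw_gb(10.0, 3.0))  # 20.0

# Today: $1/GB raw, same 3:1 ratio -> only $2 saved per raw GB
print(savings_per_raw_gb(1.0, 3.0))  # 2.0
```

The same ratio yields a tenth of the dollar savings once the raw price falls by 10x, which is the article's core point.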

If deduplication could be applied in a way that is 100% seamless to production storage and carries no added cost, then even a fifty-cent-per-gigabyte savings might be worth it. Deduplication, though, always has an impact. Storage vendors can hide that impact by using more powerful processors and more memory, but every attempt to hide deduplication's impact increases the cost of the all-flash array.

Another challenge facing vendors is that there is less latency in the storage infrastructure to hide behind. First-generation all-flash arrays use SAS-based flash drives and connect to servers via traditional SCSI protocols. SAS and SCSI add enough overhead that deduplication's impact might be less noticeable, though in most cases it still was. As next-generation flash and networking technologies based on NVMe come to market, latency drops significantly and the overhead of deduplication is more exposed than ever.

The final concern with deduplication is one of need. First-generation flash drives held 128GB to 256GB; flash drives are now available in double-digit terabytes. Many organizations can meet their capacity needs with a single AFA and half a dozen drives. Deduplicating data across those drives means an organization may end up with 3X the capacity it will actually need.
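A quick sizing sketch shows why a handful of modern drives can already exceed most capacity needs. The drive count and size below are hypothetical examples, not a specific product configuration:

```python
def raw_capacity_tb(drive_count: int, drive_size_tb: float) -> float:
    """Raw capacity of a small array, ignoring RAID and formatting overhead."""
    return drive_count * drive_size_tb

# Half a dozen 15.36 TB SSDs (a common large-capacity drive size)
# already deliver over 90 TB raw, before any deduplication.
print(raw_capacity_tb(6, 15.36))  # 92.16
```

At that scale, a 3:1 dedupe ratio turns 92 TB raw into roughly 276 TB effective, which may be far more than the organization will ever use.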

The Dedupe Tax

Another issue with deduplication is that vendors offering the feature tend to charge more per raw GB than vendors without it. This is known as the dedupe tax. For example, if the cost of flash capacity is $3 per raw GB, and deduplication could drive that cost down to $1 per GB, but the vendor charges $2 per GB, then the organization is paying more per GB than it should, even though it appears to be saving $1 per GB.
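The dedupe tax in that example can be made explicit with a few lines of arithmetic, using the article's illustrative figures:

```python
raw_cost = 3.0                         # vendor's flash cost, $/GB raw
dedupe_ratio = 3.0                     # claimed data reduction
cost_floor = raw_cost / dedupe_ratio   # $1/GB: what dedupe actually enables
charged = 2.0                          # effective $/GB billed to the customer

# The "tax" is the gap between the price charged and the cost floor
# that deduplication made possible.
dedupe_tax = charged - cost_floor
print(f"cost floor ${cost_floor:.2f}/GB, tax ${dedupe_tax:.2f}/GB")
```

The customer sees a $1/GB "savings" versus the $3 raw price, while paying a $1/GB premium over what the technology actually cost the vendor.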

Life Without Deduplication

All-flash array vendors have conditioned IT to demand data deduplication. IT professionals need to reconsider deduplication's value. It may be a nice-to-have for certain environments, but it is certainly not a must-have, and it does not lower the price.

To learn more about why all-flash arrays cost so much and what to do about it, watch our on-demand webinar "How to Define a 92TB, 500k IOPS for less than $95k".


George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.

One comment on "Is All-Flash Deduplication a Must Have?"
  1. Simona says:

    Hi George! Interesting read! It actually made me think of our environment (mixed SSD/SATA). We have very heterogeneous data. On one of our 80TB SSD volume groups I see 3:1 savings, on the other 10:1. I find it extremely hard to estimate the savings before buying, which makes the decision difficult. Plus, vendors tend to make all-flash arrays (or even more so, NVMe) and double-digit TB disks so expensive, you don't think twice about it. Also, you want a certain number of disks to spread the I/O. For us, deduplication is just an added bonus, not a reason for SSD.

