Should you be able to turn All-Flash Deduplication off?

Deduplication, along with compression, provides the ability to use premium-priced flash capacity more efficiently. But capacity efficiency comes with at least some performance impact. This is especially true on all-flash arrays, where data efficiency features can’t hide behind hard disk drive latency. This has led some all-flash vendors, like Violin Memory, to claim that an on/off switch for deduplication should be a requirement on all-flash. Is that the case?

History of the Deduplication problem

When networked flash was initially introduced to enterprise data centers, most of the vendors came from the DRAM storage system market. Vendors like Violin Memory, Texas Memory Systems (now IBM) and Kaminario delivered storage systems that were high on performance but short on features. The concern was that each feature added would lower performance and in their markets performance was everything, or so they believed. These vendors clung to these beliefs as they introduced their first flash systems. They were very high performance-wise, but very sparse on features.

Then vendors like Pure Storage, SolidFire and Tegile all introduced flash systems that were feature-rich, or at least had more data management options than the ones mentioned above. The most notable features added were deduplication and compression, which promised to bridge the price gap between flash and HDD. Interestingly, these systems performed at about half the storage I/O rate of the performance-focused systems.

Clearly something was different. The focus on affordability led to systems that had more features and whose hardware was not purpose-built for the low latency and high performance attributes of flash.

Does it Matter?

The big question, though, is: does it even matter? Does the focus on purpose-built hardware, with features that are added only when needed, make a difference? If all you do is look at the numbers, then the answer is yes. A purpose-built flash system without features should be able to deliver 700k+ IOPS in a single unit (i.e. without scale-out). Most feature-rich flash storage systems that are built on more commodity hardware are delivering between 250k and 400k IOPS.

So, does it matter? If you have an environment, or even an aggregate of multiple environments, that requires more than 400k IOPS, then the answer is a resounding yes. But the reality is that the overwhelming majority of data centers don’t need anything close to 250k IOPS, let alone 400k IOPS. In fact, most of the data centers that we work with cannot generate aggregated performance of more than 50k IOPS. The demand for performance will grow over time, but there is certainly some headroom here.
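To put that headroom in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the IOPS figures quoted above; the function name and the labels are hypothetical and chosen purely for illustration, and rated IOPS is of course only a crude proxy for real-world performance.

```python
def utilization(demand_iops: int, capability_iops: int) -> float:
    """Fraction of an array's rated IOPS consumed by a given demand."""
    if capability_iops <= 0:
        raise ValueError("capability_iops must be positive")
    return demand_iops / capability_iops


# Figures quoted in this post; the labels are hypothetical.
TYPICAL_DEMAND_IOPS = 50_000
ARRAY_CAPABILITY_IOPS = {
    "feature-rich, low end": 250_000,
    "feature-rich, high end": 400_000,
    "performance-focused": 700_000,
}

for label, capability in ARRAY_CAPABILITY_IOPS.items():
    used = utilization(TYPICAL_DEMAND_IOPS, capability)
    print(f"{label}: {used:.1%} of rated IOPS in use")
```

Even against the low end of the feature-rich range, a 50k IOPS data center would be using only about a fifth of what the array is rated for.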

Always on Dedupe? We vote Yes

For most (99%) data centers, the ability to turn deduplication on and off is a non-issue. In fact, a case could be made that since it won’t impact your performance experience, leaving deduplication and/or compression turned on is a must. This is because in almost every environment there will be some efficiency gain. Also, in order to generate the I/O requests required to make an all-flash array sweat, we need aggregate workloads. Some of those workloads will deduplicate and others may not. It is simply easier not to have to separate these different workloads, and to let deduplication work if it can and just be background noise if it cannot.
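The mixed-workload point can be illustrated with a toy model. The following is a minimal sketch, not how any particular array implements deduplication: it assumes fixed 4 KB blocks and SHA-256 fingerprints, and the class name `DedupeStore` and both workloads are invented for the example. Random bytes stand in for data that will not deduplicate, such as encrypted data.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size; real arrays vary


class DedupeStore:
    """Toy inline deduplicating block store (illustrative only)."""

    def __init__(self) -> None:
        self.blocks = {}         # fingerprint -> block payload
        self.logical_blocks = 0  # blocks written by hosts

    def write(self, data: bytes) -> None:
        """Split data into fixed-size blocks and store each unique block once."""
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).digest()
            self.logical_blocks += 1
            # Keep the payload only the first time a fingerprint is seen.
            self.blocks.setdefault(fingerprint, block)

    def reduction_ratio(self) -> float:
        """Logical blocks written divided by unique blocks stored."""
        if not self.blocks:
            return 1.0
        return self.logical_blocks / len(self.blocks)


store = DedupeStore()

# Workload A: 100 clones of the same 8-block image (deduplicates well).
golden_image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
for _ in range(100):
    store.write(golden_image)

# Workload B: 200 blocks of random data (does not deduplicate).
store.write(os.urandom(200 * BLOCK_SIZE))

print(f"logical blocks: {store.logical_blocks}")
print(f"unique blocks:  {len(store.blocks)}")
print(f"reduction:      {store.reduction_ratio():.2f}:1")
```

In this sketch the two workloads share one store without being separated. The clones collapse to a handful of blocks, the random data passes through at its full size, and the array as a whole still shows a net reduction.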

Dedupe will get Better

The other mistake that the anti-deduplication crowd is making is they are working under the assumption that deduplication as a technology won’t get any better. This, again, is simply not the case. Deduplication will get better, dramatically so. It will get better at identifying duplicates and will be able to verify redundant data much quicker.

A case in point is Permabit; their deduplication appliance can perform inline deduplication at a rate of 180k IOPS and it is only limited by the CPU inside the appliance. So, even if they do nothing with their software code, their deduplication rate will get faster as Moore’s law continues to work its magic on processing power.

Maybe your Dedupe is bad?

This is not to imply that vendors who provide the ability to turn deduplication on and off have an inferior product; they just took a different path to the market. They typically started with a performance-rich but feature-poor solution and then added features like deduplication and compression. This gives them the ability to service the performance fringe. The 1% of data centers that actually need that performance will appreciate the capability; the other 99% won’t care.

Storage Swiss Take

First, the overwhelming majority of workloads will benefit from deduplication and/or compression. We have also seen significant value in these two data efficiency technologies when they are integrated and functioning in tandem rather than operating as standalone processes. Second, for most data centers the answer to the dedupe on or off question is a resounding “doesn’t matter”. Their performance demands simply are not going to push any all-flash array, regardless of whether it has deduplication or not. Finally, for environments where extreme performance does matter, that purchase is likely to be an entirely separate decision well outside of the purview of core IT.

Eight years ago, George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS and SAN, Virtualization, Cloud and Enterprise Flash. Prior to founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.

7 comments on “Should you be able to turn All-Flash Deduplication off?”
  1. Reid Earls says:

    George, Violin Memory does work with customers that can generate aggregated performance of more than 50K IOPS. These same customers are intelligent and understand that for every positive aspect of a storage efficiency feature, there is also a negative. When it comes to deduplication, the negative is that it can add unnecessary latency to I/O operations where the application data does not deduplicate: for example, databases, images, and encrypted data. Unlike AFAs targeted at the SMB space, Violin Memory gives customers the freedom to choose what is best for them.

  2. Reid Earls says:

    Always-on deduplication helps the storage vendor. Selective deduplication helps the customer.

  3. George Crump says:

    Reid, thanks for reading and commenting. I have no doubt that customers with a need for more than 50k IOPS exist; we speak with them frequently. Always-on deduplication is irrelevant in sub-400k IOPS data centers. To say that it “helps the storage vendor” is a bit of a stretch. And a case could be made that non-integrated deduplication, “selective dedupe”, has a higher overhead than integrated deduplication. As I clearly pointed out in the post, I don’t think selective deduplication is bad; I just think the claims of its advantages that vendors like Violin are making are overstated for 99% of the world’s data centers.

  4. Reid Earls says:

    George, I can point you to multiple AFA vendors who proudly state that “always-on” deduplication helps their architecture minimize writes to the flash (thus extending its life) and, as a result, increases their overall performance because they’re writing less data. They lead each and every presentation with this statement. So to state it helps the vendor is not a stretch at all. What it also helps to give them, when they choose to spin it in their favor, is an attractive usable $/GB. When they cherry-pick VDI use cases with 6:1 efficiency, it makes their outrageously expensive raw $/GB solution look more attractive from a usable $/GB standpoint. That DEFINITELY helps the storage vendor.

    However, when a dataset doesn’t deduplicate well, attempting to deduplicate it anyway:
    – does not help minimize the writes to the flash media, and therefore…
    – does not help their performance, and
    – adds unnecessary latency to the I/O operations

    Based on these facts, it makes perfect sense to give customers the freedom to choose NOT to deduplicate datasets that they know will not benefit. This helps the customer’s performance and doesn’t waste controller CPU cycles doing operations that result in no value.

    So I’ll say it again. Always-on deduplication benefits the storage vendor. Selective deduplication benefits the customer.

    • George Crump says:

      And again I say that for environments with less than a 500k IOPS requirement it is a non-issue. For those that need more, it may be an issue, but as I state in the post, the actual impact vs. the return on dedupe investment probably still makes dedupe worth it. I’m not saying that selective dedupe is bad; I am just saying that demanding it be listed as a requirement is a stretch for 99 percent of data centers.

    • Najt says:

      Read my comment at the bottom regarding always on deduplication.

  5. Najt says:

    The only workload I have seen in practice that cannot be deduplicated (or where deduplication gives only marginal results) is encrypted data. Pretty much everything else can be deduplicated.

    First example:
    A Windows Server 2012 R2 machine with deduplication enabled on its data disk reports a 73% deduplication rate (almost 4:1 reduction). When this server was moved onto Pure Storage, the array reported a 2.8:1 data reduction for the LUN on which the server now resides. Even if Pure Storage wasn’t able to find many duplicate blocks inside this particular VM, it certainly found quite a few duplicate blocks of data inside other LUNs across the array. So this is clearly a benefit of array-wide deduplication.

    Second example:
    VDI with linked clones, where all 250+ VMs share 6 replica disks and persistent data disks range from 5 to 15 GB in size. On the LUNs hosting these VDI desktops we are observing a reduction ratio of around 5:1.

    So in my opinion:
    Violin Memory is a dragster: extremely fast, but a niche use case.
    Pure Storage is a BMW M5: still very, very fast, although not as fast as a dragster, but much more usable and comfortable.

    P.S. I am not connected to Pure Storage in any way, I’m just a happy customer.
