The Top 3 Deduplication Trends

When combined with compression and thin provisioning, deduplication makes disk backup a cost-effective strategy and now makes a primary storage tier consisting only of flash reasonable to consider. In our on-demand webinar, Permabit CEO Tom Cook and I discuss the top trends in data efficiency and how they will impact the market. Below is my take on these trends. Join us for the webinar to see where Tom and I disagree.

Trend #1 – Deduplication as Table Stakes for All-Flash

Almost every all-flash array vendor has shipped some form of data efficiency with its arrays, but few have delivered all three components at the same time. The problem is that each member of the data efficiency trifecta has a significant role. Thin provisioning gives back capacity that is allocated but unused, captive storage that cannot otherwise be optimized. Deduplication identifies redundant data across files, which makes it ideal for virtualized environments. Compression identifies redundancy within a file and is typically ideal for database environments. If data efficiency will be counted on to drive the cost of flash down, then the vendor should deliver all three components.
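
To make the difference between deduplication and compression concrete, here is a minimal Python sketch. It is not any vendor's implementation: the fixed 4 KB block size, the SHA-256 fingerprints, the use of zlib, and the function name `dedupe_and_compress` are all assumptions chosen for illustration. Deduplication stores each distinct block once across all files, and compression then shrinks each stored block on its own.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products vary


def dedupe_and_compress(files, block_size=BLOCK_SIZE):
    """Return (logical_bytes, deduped_bytes, stored_bytes) for a list of byte strings.

    logical_bytes: what the files would occupy with no data efficiency
    deduped_bytes: size after storing each distinct block only once
    stored_bytes:  size after also compressing each distinct block
    """
    store = {}  # fingerprint -> (raw block length, compressed block)
    logical = 0
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            logical += len(block)
            fp = hashlib.sha256(block).hexdigest()
            if fp not in store:
                # First time this content is seen: keep one compressed copy.
                store[fp] = (len(block), zlib.compress(block))
    deduped = sum(raw_len for raw_len, _ in store.values())
    stored = sum(len(compressed) for _, compressed in store.values())
    return logical, deduped, stored


if __name__ == "__main__":
    # Two hypothetical "VM images" that share a base and differ in one block.
    base = bytes(range(256)) * 16 * 4
    images = [base + b"a" * BLOCK_SIZE, base + b"b" * BLOCK_SIZE]
    print(dedupe_and_compress(images))
```

In this toy model the shared base is stored once no matter how many images contain it, which is why deduplication pays off in virtualized environments, while compression still helps on the blocks that are unique.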

But data efficiency is no longer the sole path to reduced flash cost. With the introduction of high-density triple-level cell (TLC) flash and 3D NAND, non-optimized systems can meet very aggressive price points. These systems will need a small SLC or MLC tier to act as a write shock absorber for the less durable TLC tier, but practical use of TLC is well within reach.

Trend #2 – Cloud Independent Deduplication

There is an incorrect assumption that cloud storage vendors are already deduplicating data stored in the cloud. Many cloud vendors do not implement deduplication, and those that do often don’t pass the savings on to their subscribers. Even if a cloud storage vendor deduplicated data and passed the capacity savings on to its customers, you may not want it because you could be locked into that vendor's data efficiency solution.

Cloud-independent deduplication lets data efficiency not only deliver capacity savings but also improve efficiency as data moves into, out of and between clouds. The significant gain, in addition to capacity savings, is network transfer savings.

Trend #3 – Global Deduplication

The concept of leveraging deduplication to save on network transfer is not new; WAN vendors have done this for years. But the combination of leveraging deduplication to minimize network transfers and to maximize storage capacities across internal and external storage could be the most significant aspect of deduplication. Think about how much data transfers between storage systems today. Globally applied deduplication could eliminate many of those transfers.

Global deduplication could play a role in a virtual environment that is leveraging VVOLS to downgrade or upgrade the service levels of a virtual machine. For example, if an administrator needs to downgrade a VM from an all-flash array to a high-capacity hard disk array, VVOLS engages Storage vMotion. Storage vMotion transfers the entire virtual machine from the flash array to the HDD array, even though the HDD array may already hold dozens of very similar VMs. With global deduplication, only the data that is unique to that VM would need to be transferred. Storage vMotion could complete in a fraction of the time and the typical network concerns would be alleviated.
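
The idea of transferring only the unique data can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Storage vMotion or any shipping product works: the fixed 4 KB blocks, the SHA-256 fingerprints, and the names `plan_migration` and `target_index` are all hypothetical. The source fingerprints each block of the VM, checks the fingerprints against the index of blocks the target array already holds, and sends only the blocks that are missing along with a "recipe" the target can use to reassemble the image.

```python
import hashlib


def fingerprint(block):
    """Content fingerprint used to recognize identical blocks."""
    return hashlib.sha256(block).hexdigest()


def split_blocks(data, block_size=4096):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def plan_migration(vm_image, target_index, block_size=4096):
    """Return (recipe, blocks_to_send).

    recipe:         ordered fingerprints that describe the whole VM image
    blocks_to_send: fingerprint -> block, only for blocks the target lacks
    """
    recipe = []
    blocks_to_send = {}
    for block in split_blocks(vm_image, block_size):
        fp = fingerprint(block)
        recipe.append(fp)
        if fp not in target_index and fp not in blocks_to_send:
            blocks_to_send[fp] = block
    return recipe, blocks_to_send


if __name__ == "__main__":
    # Hypothetical blocks: two that similar VMs on the target already
    # contain, and one that is unique to the VM being moved.
    os_a, os_b, unique = b"\x01" * 4096, b"\x02" * 4096, b"\x03" * 4096
    target_index = {fingerprint(os_a), fingerprint(os_b)}
    recipe, to_send = plan_migration(os_a + os_b + unique, target_index)
    print(len(recipe), "blocks in image,", len(to_send), "block(s) to transfer")
```

In this sketch the fingerprints still cross the network, but they are tiny compared to the blocks they stand in for, which is where the network transfer savings would come from.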


These trends promise to keep deduplication and data efficiency at the top of the conversation for years to come. The amount of data to be stored is growing faster than the cost of storage is declining. Data efficiency is evolving into more than a technology to reduce storage cost; it is becoming a technology that can enable a faster response to data center demands.

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.
