All-Flash Arrays are a performance sledgehammer, obliterating most performance problems and eliminating a great deal of storage management complexity. But that sledgehammer is expensive, so All-Flash vendors are delivering a variety of solutions to drive down the cost per GB of these systems while maintaining their performance advantages. Front and center among these solutions are deduplication and compression.
While almost every All-Flash Array vendor has announced deduplication and compression as features, only one actively promotes the use of both; others say they have the features yet do not recommend them beyond X or Y TB. Ever wonder why?
The Deduplication Vapor
While most All-Flash Array vendors state that they have deduplication and compression, many have not yet delivered both of them in a single system. Some vendors who claim to have deduplication will ask you to turn it off under extreme performance conditions or when the capacity of the system grows too large. This is an indicator that their deduplication code can’t scale to meet performance or capacity demands. Still other All-Flash vendors will say they have deduplication “on the roadmap” and have had it there for a long time, with no real delivery in sight. I’d call that check-box dedupe. “Yes we’ll have it but don’t use it!”
Lack of Deduplication Effectiveness
Another challenge that we see in All-Flash Arrays is a lack of deduplication effectiveness: efficiency rates that are not nearly as high as we would expect from the technology. The more granular the data examination, the more redundancy you can find and the higher the level of efficiency you can deliver. The problem is that more granular data examination is also more time- and resource-consuming. As a result, vendors sacrifice maximum efficiency for performance, lowering the potential return on the deduplication investment and raising the cost per GB of the storage system.
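The granularity trade-off can be seen in a toy fixed-block deduplication sketch (illustrative only; block sizes and data are made up, and no vendor's implementation works exactly this way). Smaller blocks expose more redundancy and yield a higher deduplication ratio, but they also mean more blocks to hash and index:

```python
import hashlib

def dedupe_ratio(data: bytes, block_size: int) -> float:
    """Fixed-block deduplication: hash each block, count unique blocks.

    Returns logical blocks per unique block (higher = more savings).
    """
    seen = set()
    total = 0
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        seen.add(hashlib.sha256(block).digest())  # fingerprint the block
        total += 1
    return total / len(seen)

# Synthetic data whose redundancy repeats at a 4 KB granularity
data = (b"A" * 4096 + b"B" * 4096) * 64

# Finer-grained examination finds far more redundancy than coarse blocks,
# at the cost of three times as many hash and index operations.
print(dedupe_ratio(data, 4096))   # fine-grained: high ratio
print(dedupe_ratio(data, 12288))  # coarse-grained: much lower ratio
```

On this synthetic data the 4 KB pass collapses everything to two unique blocks, while the 12 KB pass misses most of the redundancy because block boundaries no longer line up with the repeating pattern.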
Why the differences?
At its core deduplication is essentially a mathematical exercise. At the outset it looks easily doable, but as you peel back the layers it gets increasingly complex. There are different formulas that can be leveraged to derive the deduplication answer, and as always some of those formulas are better than others. But rest assured that each has trade-offs, and it takes experience to weave your way through the complexity. Companies that try to develop deduplication “in house” are essentially part-time math students, leveraging parts of formulas from other deduplication systems to fit them into their unique environment. Eventually it becomes clear that the problem is bigger and more complex than was initially thought.
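One example of those trade-offs is fingerprint choice (a hedged sketch, not any vendor's algorithm): a weak checksum such as CRC32 is cheap to compute but collides, so every match must be verified byte-for-byte, while a strong hash such as SHA-256 costs more per block but makes collisions negligible in practice:

```python
import hashlib
import zlib

def count_duplicates(blocks, strong=True) -> int:
    """Index blocks by fingerprint and count duplicates found.

    strong=True  -> SHA-256 fingerprints (slower, collisions negligible)
    strong=False -> CRC32 fingerprints (fast, but collisions possible,
                    so a byte-for-byte verify is required on a match)
    """
    index = {}   # fingerprint -> block contents (kept for verification)
    dupes = 0
    for block in blocks:
        fp = hashlib.sha256(block).digest() if strong else zlib.crc32(block)
        match = index.get(fp)
        if match is not None and match == block:  # confirm, don't trust the hash
            dupes += 1
        else:
            index[fp] = block
    return dupes

# Usage: two of these three blocks are identical
blocks = [b"x" * 8192, b"y" * 8192, b"x" * 8192]
print(count_duplicates(blocks, strong=True))
print(count_duplicates(blocks, strong=False))
```

Real systems layer many more such choices on top of this (inline vs. post-process, index sizing, memory vs. disk placement of fingerprints), which is where the complexity compounds.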
The best solution is to go to the professors of deduplication: companies that specialize in the technology and have years of experience implementing it. We are fortunate to have two of those professors on our upcoming CEO roundtable on deduplication, “The Truth – All-Flash Deduplication”: Tom Cook and Jered Floyd from Permabit.
This Storage Switzerland roundtable offers IT planners and storage system designers insight on what to look for and what to look out for when considering adding deduplication to All-Flash Arrays. The live event will cover topics including what deduplication is, why it is so valuable in the All-Flash use case and what can go wrong if not properly implemented. Attendees will have an opportunity to ask questions of the participants at the end of the discussion.
Vendors and end users looking to add or upgrade deduplication in their Flash Array offerings can register for the event here. All attendees will receive an advance copy of Storage Switzerland’s lab report “Breaking the 1 Million IOPS Inline Deduplication Barrier”.