Data centers are ready for the increased data efficiency that primary storage deduplication and compression can deliver, especially on flash-based storage systems. The cost of flash justifies these data reduction technologies, and flash's high performance actually enables their use. The problem is that, for the most part, the only vendors that have been able to deliver a consistent data efficiency strategy are startups, not tier-1 storage vendors.
Tier-1 storage vendors face a much bigger set of challenges when it comes to providing primary data efficiency to their customers. They have to integrate data efficiency into legacy code bases, support deduplication and compression across multiple storage product lines, and they are bound to a set release cycle by which they can deliver these changes to their customers.
The cycle time it takes for tier-1 vendors to overcome these challenges is allowing startups to gain footholds in legacy accounts, footholds that often come by peeling off certain IT projects or initiatives. For example, one startup vendor may establish their initial beachhead by providing storage for a VDI project, then another may be chosen to support a database application. This gradual loss of the enterprise data center equates to “death by a thousand cuts.”
The Startup Advantage
Why are startups winning? Most are winning with all-flash or flash-assisted storage systems designed to deliver most of the performance of flash (enabled by the high IOPS inherent in flash technology) at a reasonable price (enabled by dedupe/compression technology). Almost every startup vendor now claims price parity with what they call “high performance disk systems”, a.k.a. tier-1 storage. This claim is supported partly by startups’ lower overhead and their willingness to take lower margins (sell for less). But the key driver of hard disk price parity is an unapologetic implementation of data efficiency technologies like deduplication and compression. These startups also have the advantage of a consistent message and a non-overlapping product line. Their sales and marketing people wake up every day knowing exactly what they have to sell.
The Challenges to Tier-1 Deduplication and Compression
What are tier-1 vendors waiting for? Don’t they see the gradual erosion happening before their eyes? Partly the answer is “no”, because at this point the success of startup vendors can be described as “only a flesh wound”. But large vendors do know they need to move to address the demand for deduplication and compression, especially in flash-based primary storage.
First, the process of implementing data efficiency in a tier-1 storage array, after that array and its software have been on the market for over a decade, is no small task. It requires the integration of deduplication and compression algorithms into what is likely a very large, long-lived code base.
The reason is that deduplication and compression sit directly in the write path: they significantly transform data before it lands on media, and in the case of a duplicate block they may not write it at all, even though the application believes the write completed. If this software is written incorrectly or integrated poorly, both data integrity and performance can be degraded. And, as a tier-1 vendor, new features have to work perfectly and consistently across tens of thousands of customer deployments.
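The write-elision behavior described above can be sketched in a few lines. This is a minimal, hypothetical illustration of inline block deduplication (names like `DedupStore` are invented for the example, not any vendor's actual implementation): each block is identified by a content hash, and a duplicate write updates only a reference count rather than touching the backing store.

```python
import hashlib

class DedupStore:
    """Minimal inline-dedupe sketch: duplicate blocks are detected by
    content hash and never written to the backing store a second time."""

    def __init__(self):
        self.blocks = {}          # content hash -> block data (backing store)
        self.refcount = {}        # content hash -> logical references
        self.physical_writes = 0  # how many writes actually hit media

    def write(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blocks:
            # New unique block: perform the physical write.
            self.blocks[key] = data
            self.physical_writes += 1
        # Duplicate block: the application "wrote" it, but no physical
        # write occurred -- only the reference count changes.
        self.refcount[key] = self.refcount.get(key, 0) + 1
        return key

    def read(self, key: str) -> bytes:
        return self.blocks[key]

store = DedupStore()
k1 = store.write(b"A" * 4096)
k2 = store.write(b"A" * 4096)  # duplicate: elided at the physical layer
k3 = store.write(b"B" * 4096)
# Three logical writes, but only two physical writes
```

The sketch also shows why a bug here is so dangerous: a hash-table or reference-count error silently returns the wrong data or frees a block that is still referenced, which is exactly the data-integrity risk a tier-1 vendor cannot afford at scale.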
Startups have a clear advantage here. Without an installed base, they’re able to start with a clean slate and integrate deduplication and compression directly into the storage software as it is being created. They also start small, one customer at a time. If deduplication and compression cause a performance or data loss problem, it can be quickly (and quietly) addressed.
This reality of the tier-1 development process has led to the second challenge: tier-1 storage vendors are starting to deliver deduplication and compression separately. Some vendors have delivered deduplication only (great for unstructured data) and others have delivered compression only (great for structured data). While the efficiency that even one of these features can provide has value, delivering both is critical to providing an enterprise class solution. When delivered in a single storage solution the data center can benefit from efficiency across mixed workloads of both structured and unstructured data, which is what most enterprises have.
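The value of shipping both features together can be shown with simple arithmetic. The ratios below are hypothetical, chosen only to illustrate how deduplication and compression multiply, and how each dominates on a different workload type; they are not benchmarks of any product.

```python
def effective_capacity(raw_tb, dedupe_ratio, compression_ratio):
    """Logical capacity delivered by raw_tb of physical storage when
    deduplication and compression savings multiply together.
    All ratios here are illustrative assumptions."""
    return raw_tb * dedupe_ratio * compression_ratio

# Unstructured data (e.g. VDI images): dedupe does most of the work.
vdi = effective_capacity(10, dedupe_ratio=4.0, compression_ratio=1.5)

# Structured data (e.g. databases): compression does most of the work.
db = effective_capacity(10, dedupe_ratio=1.2, compression_ratio=2.5)
```

With both features present, the same 10 TB of flash serves each workload well; ship only one feature and whichever workload depends on the other sees little benefit.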
The third challenge stems from the second: data efficiency is showing up on some of the tier-1 storage vendors’ products, but not others. Tier-1 storage vendors typically have two or three storage solutions targeting different customer segments or workload types. These different product lines have often come via acquisition, which of course means a separate source code tree. It’s not uncommon for a tier-1 vendor’s larger customers to have a mixture of that vendor’s products.
As a result of the different source code trees, vendors are rolling out different deduplication and compression engines for each platform; they’re not integrated and typically can’t work together. This also means that some of a vendor’s product lines have complete data efficiency, others have deduplication OR compression and still others have no data efficiency features whatsoever.
This lack of consistent deduplication and compression engines and deployment strategies makes the development effort that much harder for these vendors, and it makes day-to-day operation more confusing for their customers. If a customer moves data from one of the tier-1 vendor’s storage product lines to another, the data has to be rehydrated (un-deduplicated) and then deduplicated again when it arrives on the new storage product. As a result, one of the key benefits of staying with a single vendor, consistency, is eliminated, and the customer often decides they might as well take a chance on a startup.
The Tier-1 Fix
To overcome these challenges, tier-1 storage vendors should be looking for data efficiency software, in either a software development kit (SDK) or a rapid deployment model using Linux, that provides deduplication and compression functionality without having to code it themselves or go through a painful integration process. Data efficiency software would enable them to integrate deduplication and compression uniformly across all product lines. Companies like Permabit Technology are providing data efficiency software via their Albireo SDK and Albireo VDO offerings, which leverage years of real-world use and thousands of implementations worldwide.
There is a challenge even with tier-1 storage vendors using an SDK or a rapid deployment approach like VDO. While these tools can speed development, they still take time to integrate and fully test. And the timing of a significant storage software release may be driven by far more than the integration of deduplication and compression alone. Tier-1 vendors need a gateway to deduplication and compression, something that will bridge the gap between full integration in the future and losing specific storage projects today.
Data efficiency vendors need to begin providing capabilities like deduplication and compression via a gateway. This would allow tier-1 vendors to bring these capabilities to their existing offerings, now, and to stop startup storage vendors from surgically cutting away at their business.
Startup storage vendors are making inroads into traditional tier-1 storage accounts through the intelligent combination of flash, deduplication and compression. If tier-1 vendors are not careful, these footholds will slowly grow into complete account loss. Tier-1 storage vendors need to move now to stop the bleeding; they can no longer wait for internal development efforts to fulfill the promise of tier-1 storage data efficiency.
This Article is Sponsored by Permabit Technology