Is Dedupe Overrated?

Advanced data reduction technologies are central to the purpose-built backup appliance (PBBA) market segment. They shrink the amount of data that’s actually committed to storage and help make these relatively fixed-capacity devices feasible. But data reduction only provides temporary relief from the problem of data growth, and in some ways it has insulated backup systems from the design requirements they need to keep up. Instead of relying on dedupe, backup storage needs to be scalable, economical and transparently upgradable in order to meet the challenge of exploding backup data sets.

Watch the on-demand webinar "Four Assumptions that are Killing Your Backup Storage"

Symptom relief

Deduplication has reduced the sheer volume of data that backup systems have to store by identifying segments of a file that are the same as those in other files or previous versions already stored. But these are temporary benefits. As companies are seeing, data growth eventually wins out, especially with unchanging data sets or file types that don’t respond well to deduplication.
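
To make that mechanism concrete, below is a minimal sketch of segment-level dedupe: cut the data into chunks, fingerprint each chunk with a cryptographic hash, and commit only chunks whose fingerprint hasn’t been seen before. The fixed-size chunking, the dedupe_store function and the in-memory index are illustrative assumptions, not any vendor’s implementation; production systems generally use variable-size (content-defined) chunking and persistent index structures.

    import hashlib

    # index maps SHA-256 fingerprints to the chunk data already stored;
    # recipe is the ordered list of fingerprints needed to rebuild a stream.
    def dedupe_store(stream, index, chunk_size=4096):
        recipe = []
        for offset in range(0, len(stream), chunk_size):
            chunk = stream[offset:offset + chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp not in index:       # new segment: commit it to storage
                index[fp] = chunk
            recipe.append(fp)         # repeat segment: store a reference only
        return recipe

    # Two backups that differ only in their tails share almost every chunk.
    index = {}
    dedupe_store(b"A" * 8192 + b"monday-data", index)
    dedupe_store(b"A" * 8192 + b"tuesday-data", index)
    print(f"unique chunks stored: {len(index)}")   # 3 unique chunks, not 6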

The result: backup storage systems are filling up. The standby data reduction tools that have made disk-based backups attractive for the past decade aren’t enough. What’s more, companies are finding out that backup systems typically aren’t designed for incremental scaling. This results in an accumulation of backup systems, also referred to as “backup appliance sprawl”. It’s becoming clear that backup storage needs to be designed more like modern cloud or object storage systems, which can scale incrementally, almost without limit, and remain cost-effective.

Scale-out architecture, not silos of PBBAs

PBBA vendors need to take a page from the cloud storage vendors’ playbook and develop systems that can expand easily and grow to a much larger scale. A scale-out, object-based architecture is well suited to this use case because it’s flexible enough to start small, typically with a few nodes (or even a single node), and to support incremental capacity growth as the backup data set expands. Some systems can even add processing power as needed to maintain performance as capacity scales.
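
As a rough illustration of that scale-out model, the sketch below shows how usable capacity and aggregate ingest bandwidth grow together as nodes are added, instead of forcing a forklift replacement when an appliance fills up. The per-node figures are hypothetical, not drawn from any particular product.

    # Back-of-the-envelope scale-out math: every node added contributes both
    # capacity and performance, so the system grows instead of being replaced.
    def expanded_cluster(nodes, tb_per_node, gbps_per_node, added_nodes):
        total = nodes + added_nodes
        return {
            "nodes": total,
            "capacity_tb": total * tb_per_node,
            "ingest_gbps": total * gbps_per_node,
        }

    # Hypothetical example: start with 3 nodes, add 9 more over time.
    print(expanded_cluster(3, 96, 2.0, 9))
    # {'nodes': 12, 'capacity_tb': 1152, 'ingest_gbps': 24.0}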

As backups continue to grow, an object storage architecture can expand capacity well beyond what traditional backup appliances or NAS systems can reach. This is the storage technology employed by the largest cloud providers to give them almost unlimited scalability in a single namespace.

Object storage and erasure coding

As backups are retained for longer periods of time, these storage systems need to support non-disruptive upgrades so that the latest hardware and software generations can be implemented without manually migrating data. As these data sets get larger, keeping full second copies becomes less feasible. In these situations, erasure coding can provide the data resiliency needed while keeping capacity requirements to a minimum.
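
To see why erasure coding is attractive at this scale, here is a small sketch of the capacity arithmetic. The 10+4 layout is an assumed example, not a figure from the article: data is split into k data shards plus m parity shards, any m shards can be lost without losing data, and the raw-capacity overhead is (k+m)/k rather than the 2x or 3x of keeping full copies.

    # Raw capacity consumed per unit of usable data in a k+m erasure-coded layout.
    def raw_per_usable(data_shards, parity_shards):
        return (data_shards + parity_shards) / data_shards

    # Assumed figures for illustration, not from the article:
    print(raw_per_usable(10, 4))   # 1.4x raw capacity, survives any 4 lost shards
    print(raw_per_usable(1, 1))    # 2.0x -- equivalent to keeping a full second copy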

Of course, deduplication is still important, but it must be flexible enough to support both global and in-line operation. This ensures the greatest data reduction efficiency with the best possible performance.
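
As a rough sketch of what “global” and “in-line” mean here, the snippet below extends the chunking example above: the fingerprint is computed and checked before the data is written (in-line), and it is checked against an index shared by the whole system rather than by a single appliance (global). The global_index set and the commit_to_disk callback are hypothetical names used only for illustration.

    import hashlib

    # In-line: fingerprint and check the chunk before it ever hits disk.
    # Global: the index is shared cluster-wide, so a chunk stored anywhere
    # in the system is never written a second time.
    def inline_write(chunk, global_index, commit_to_disk):
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in global_index:      # first copy anywhere in the cluster
            commit_to_disk(fp, chunk)
            global_index.add(fp)
        return fp                       # duplicates cost only a reference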

It’s clear that backup storage system design needs a reboot. To learn more about the options available, tune in to the StorageSwiss webinar “Four Assumptions that are Killing Your Backup Storage”.

Watch On Demand

Eric is an Analyst with Storage Switzerland and has over 25 years of experience in high-technology industries. He’s held technical, management and marketing positions in the computer storage, instrumentation, digital imaging and test equipment fields. He has spent the past 15 years in the data storage field, with storage hardware manufacturers and as a national storage integrator, designing and implementing open systems storage solutions for companies in the Western United States. Eric earned degrees in electrical/computer engineering from the University of Colorado and marketing from California State University, Humboldt. He and his wife live in Colorado and have twins in college.

Posted in Blog
2 comments on “Is Dedupe Overrated?”
  1. 3ParDude says:

    True, deduplication isn’t the answer to every company’s ever-growing backup volume woes. But part of the answer is not a technological one but a behavioural and process-based change. Most companies struggle to classify their data based on importance and retention requirements, and so keep everything forever and back it all up in the same way. Not everything needs to be kept forever, and most data would benefit from being classified and having a data management policy applied to it, i.e. should it be on the fastest storage, archived, backed up hourly or not at all.

    Companies need to leverage an approach which utilises technological innovations such as dedupe hand in hand with good old-fashioned housekeeping and planning.

    http://www.3pardude.com

  2. Erik Ottem says:

    Some applications are great for deduplication, some aren’t. The best approach is to switch it on for good fits and switch it off for poor fits. You can check the blog “The problem with always-on deduplication” on Violin Memory’s site.

    http://www.violin-memory.com/blog/problem-with-always-on-deduplication/

