Organizations need to back up their data, but if you treat backup as a one-size-fits-all job in which all data is handled equally, you’re missing an opportunity for substantial cost savings.
Ideally, you could tier storage and create an “active archive” that protects “cold” data while removing it from the backup path to save time and money. After all, if the data is not changing, why keep making more and more copies of it? Doing so only inflates backup infrastructure costs and lengthens backup windows. Nor does it amount to a long-term data preservation strategy.
In the past, the legacy Hierarchical Storage Management (HSM) systems that performed this archiving task were difficult to deploy and too often degraded the user experience. As a result, backup processes have become the de facto archive system for most organizations.
Next-generation file management solutions now resolve the challenges of legacy HSM. These new solutions employ scale-out object storage and file management software to lower costs, simplify implementation, and deliver transparent data access to users regardless of data location.
Backup vs Archive and HSM
A surprising number of organizations today are under the false impression that their backups can also function as archives. This is not the case. A backup is the recurring, systematic copying of active data (data that is frequently accessed and modified) so that it can be restored in the event of a system failure, file overwrite, or file deletion, whether deliberate or accidental. These copies are made at regularly scheduled intervals, and in most cases the copy needed to satisfy a restore request is the most recently protected one.
An archive, however, is a static copy of logical groups of older inactive data not needed for daily operations. This “cold” data does not change and is only accessed occasionally for historical reference, if at all.
Before automation, administrators migrated and archived “cold” data either manually or with ad hoc software tools. First-generation HSM software automated the process of identifying cold data and then migrating it in logical groupings to the least expensive tier of storage available.
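The core HSM idea, identifying data that has not been touched in a long time so it can be moved to a cheaper tier, can be sketched in a few lines. The 180-day threshold and the modification-time criterion here are illustrative assumptions, not the policy of any particular product:

```python
import os
import time

def find_cold_files(root, max_idle_days=180):
    """Return paths under `root` whose last modification is older than
    `max_idle_days` -- a simple stand-in for an HSM "cold data" policy."""
    cutoff = time.time() - max_idle_days * 86400
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                cold.append(path)
    return cold
```

Real HSM policies are usually richer than a single age cutoff (last access time, file size, ownership, and path patterns are common criteria), but the shape of the scan is the same.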
Unfortunately, early HSM products had three primary shortcomings that made them unacceptable to organizations. First, the destination archive was typically tape-based, and tape is best written in large batches, not the slow trickle of data typical of archiving. Second, access to the tape device was slow, often causing an application to time out or a user to lose patience. Finally, integration of the HSM software with operating systems was kludgy at best and did not deliver a seamless experience.
Modernizing HSM – Step 1 – Moving to Object Storage
Many organizations now use cloud storage to address their increasing storage capacity needs while containing their costs. But the public cloud does not meet the needs of all use cases. Concerns include performance, access costs, security and data custody issues, to name a few. As an alternative, many businesses are looking to bring the cost and scalability benefits of cloud storage into their own data centers with on-premises object storage technologies.
The right object storage system can address the first two problems of legacy HSM solutions: it can handle data that trickles in, and it provides near instant access to data.
But object storage by itself does not address the third issue: creating a seamless experience for users and easing the management burden.
Step 2 – A Next-gen Data Management Solution
Next-gen data management platforms now complete the picture with storage analysis that identifies cold or dormant data, and automated data migration via scale-out data movers. These solutions employ user-defined policies to identify and move data to object storage, and then provide transparent data access to users when the need arises. Unlike legacy HSM products, next-gen HSM solutions use a dynamic-link technique rather than static stub files, ensuring uninterrupted data access.
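The “transparent access” idea can be approximated with standard filesystem tools: migrate the file to an archive location and leave a link behind so existing paths keep working. This sketch uses an ordinary local symlink purely to illustrate the concept; the next-gen products described here use their own dynamic-link mechanisms and object-storage back ends, and the function name is hypothetical:

```python
import os
import shutil

def migrate_with_link(src, archive_dir):
    """Move `src` into `archive_dir` and leave a symlink at the original
    path, so reads through the old path still succeed.
    Illustrative only: a local symlink standing in for the dynamic-link
    technique of a real data management platform."""
    os.makedirs(archive_dir, exist_ok=True)
    dest = os.path.join(archive_dir, os.path.basename(src))
    shutil.move(src, dest)          # relocate the cold file
    os.symlink(dest, src)           # preserve the original access path
    return dest
```

The point of the dynamic-link approach is exactly what this toy version shows: applications and users keep opening the same path, unaware that the bytes now live on a cheaper tier.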
The combination of object storage and next-gen data management software frees up valuable primary storage space while creating true long-term preservation of “cold” data. It also removes that cold data from the backup path, which reduces backup software costs and shrinks the backup load, making the process faster. The end result is that both “hot” and “cold” data are properly protected, while valuable primary storage is freed up and the need to purchase additional capacity is forestalled. The back-end object storage system scales out to handle very large quantities of data.
Sponsored By Cloudian