Snapshots are a popular and powerful way to protect data. They enable an organization to protect data rapidly without consuming disk capacity until additional changes occur. Most storage systems can also roll back snapshot images for rapid restores and even allow IT to “drill inside” the snapshot to recover specific files. But snapshots are not perfect, and they shouldn’t be an organization’s only data protection strategy any more than RAID should.
The Problems with Snapshots
There are several problems with using snapshots as the organization’s only data protection strategy. The first and most obvious is that the protection occurs on the same system as the production data, which is no different from a user copying a set of files from one directory to another. If the storage system itself fails, the snapshots are lost along with the production data.
Second, while snapshots consume no additional capacity when they are created, their capacity requirements grow as data in the production volume changes. That capacity is consumed on the organization’s most expensive tier of storage: production. As a result, while most organizations take snapshots multiple times per day, they won’t hang on to each snapshot for more than a week or so before offloading it to secondary storage. However, offloading to secondary storage is a separate process. Most storage system vendors offer no way to copy the data contained in snapshots directly to a cost-effective secondary storage tier.
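To make the capacity behavior concrete, here is a minimal sketch of a copy-on-write style snapshot. It is a toy model, not any vendor's implementation: the Volume class, its methods and the block layout are all hypothetical names chosen for illustration.

```python
class Volume:
    """Toy block volume with copy-on-write style snapshots (hypothetical model)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block number -> current production data
        self.snapshots = []         # one dict of preserved old blocks per snapshot

    def snapshot(self):
        # A new snapshot preserves nothing yet, so it costs no capacity.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block_no, data):
        # The first time a block changes after the latest snapshot,
        # its old contents are preserved in that snapshot.
        if self.snapshots:
            latest = self.snapshots[-1]
            if block_no not in latest:
                latest[block_no] = self.blocks.get(block_no)
        self.blocks[block_no] = data

    def snapshot_blocks_used(self):
        # Total preserved blocks across all snapshots.
        return sum(len(preserved) for preserved in self.snapshots)
```

In this model, rewriting the same block repeatedly between two snapshots costs one preserved block, but every additional snapshot that is retained starts a new set of preserved blocks. That is why long retention on production storage becomes expensive.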
Third, there is no cataloging of snapshots, so finding a specific version of a file requires knowing which snapshot holds the version the user wants, then mounting that snapshot and manually copying the file back to production storage. Snapshot technologies lack a search capability that lets IT search for a file by name or other metadata, then see a results page listing each version of that file and the snapshot that contains it.
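The kind of catalog described above can be sketched in a few lines. This is only an illustration of the idea, assuming a hypothetical SnapshotCatalog that is fed a file listing for each snapshot; the class, field names and sample values are not taken from any real product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileVersion:
    snapshot_id: str
    path: str
    modified: str  # ISO 8601 timestamp, so string order matches time order
    size: int


class SnapshotCatalog:
    """Hypothetical index of file versions across snapshots."""

    def __init__(self):
        self._by_name = defaultdict(list)

    def index_snapshot(self, snapshot_id, entries):
        # entries: iterable of (path, modified, size) tuples for one snapshot
        for path, modified, size in entries:
            name = path.rsplit("/", 1)[-1].lower()
            self._by_name[name].append(
                FileVersion(snapshot_id, path, modified, size)
            )

    def find(self, filename):
        # Return every known version of the file, newest first.
        versions = self._by_name.get(filename.lower(), [])
        return sorted(versions, key=lambda v: v.modified, reverse=True)
```

With an index like this, IT could ask for a file by name and see which snapshot holds each version before mounting anything.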
Fourth, each storage vendor has its own method of executing and maintaining snapshots, and since most data centers have anywhere from three to six different primary storage systems, counting on snapshots for data protection creates a system management challenge. IT needs training on each storage system’s snapshot interface, each of which requires a unique set of skills to manage. As a result, IT has to work with three to six GUIs just to monitor snapshot success.
Finally, while storage system vendors have improved their integration with operating systems to ensure quality, consistent copies, they often don’t integrate at the application layer to guarantee application-consistent copies.
Snapshots as a Complement to Backup
Snapshots should complement the backup process, not replace it. Snapshots should be the source that the backup application copies data from to avoid impacting production data. Some backup solutions will leverage snapshots through the use of scripts, but scripting introduces its own set of problems. For example, each storage software or application upgrade means that the script has to be re-tested and potentially re-written.
Backup should go further in its support of snapshots and not require scripts to get there. A modern data protection solution should actually manage the snapshot process, creating a centralized console for all data protection operations. Integrating snapshot management into the data protection software will enable the software to trigger the snapshots, ensure the application is in a ready state and allow IT administrators to take advantage of the data protection solution’s built-in cataloging capability, making finding recovery data much easier.
The challenge for integration is ensuring support for enough primary storage vendors to make the effort worthwhile. If the backup software supports only one or two primary storage vendors, then a single, centralized view is unlikely. The key is for the data protection software to take the lead and create a framework that makes integration an easy task for storage system vendors.
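One way to picture such a framework is a small provider interface that each storage vendor implements, with the data protection software driving every array through it. The sketch below is an assumption about what that could look like; SnapshotProvider, InMemoryProvider and ProtectionConsole are hypothetical names, and the in-memory provider stands in for a real array.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SnapshotProvider(ABC):
    """Hypothetical interface a storage vendor would implement."""

    @abstractmethod
    def create_snapshot(self, volume: str) -> str:
        """Take a snapshot of the volume and return its identifier."""

    @abstractmethod
    def list_snapshots(self, volume: str) -> List[str]:
        """Return the identifiers of existing snapshots for the volume."""


class InMemoryProvider(SnapshotProvider):
    """Stand-in for a real array, used only to exercise the interface."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self._snaps: Dict[str, List[str]] = {}

    def create_snapshot(self, volume: str) -> str:
        snaps = self._snaps.setdefault(volume, [])
        snap_id = f"{self.prefix}-{volume}-{len(snaps) + 1}"
        snaps.append(snap_id)
        return snap_id

    def list_snapshots(self, volume: str) -> List[str]:
        return list(self._snaps.get(volume, []))


class ProtectionConsole:
    """Single place to trigger and review snapshots across arrays."""

    def __init__(self):
        self._providers: Dict[str, SnapshotProvider] = {}

    def register(self, array_name: str, provider: SnapshotProvider) -> None:
        self._providers[array_name] = provider

    def snapshot_all(self, volumes: Dict[str, str]) -> Dict[str, str]:
        # volumes maps array name -> volume name on that array
        return {
            array: self._providers[array].create_snapshot(volume)
            for array, volume in volumes.items()
        }
```

The design choice here is that the console knows nothing about any particular array. Adding a vendor means adding one provider class, not another GUI for IT to learn.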
Conclusion
Snapshots are an ideal way to recover from application corruption or user mistakes. Snapshots, however, are not backups, but they can aid in the creation of protected copies of data on secondary storage. The challenge that organizations face is making the connection between the two. Data protection solutions should take the lead and create an easy way for vendors to interface with the backup software, in order to create a single data protection console that spans a variety of primary storage systems.



