Unstructured data is hard to protect. It is growing at an alarming rate, and it is a prime target for cyber-threats like ransomware. Even the makeup of the data is problematic: unstructured data often consists of millions, and in some cases billions, of files. In an attempt to overcome these challenges, many data centers now count on snapshots to protect unstructured data. The problem is that snapshots, while they have a role to play, are not well suited to meet the requirements of unstructured data protection.
The Single-Point-of-Failure Problem
The first problem when using snapshots to protect data is that the snapshot resides on the same storage system as the production copy of data. A failure of that storage system means not only the loss of production data but also the loss of all the “backup” copies. While snapshots are fine for making a quick copy of data, it is critical to use the snapshot to create a copy of that data on a secondary system.
The Search Problem
The second problem when using snapshots to protect data is search. While most storage systems on the market today can manage hundreds if not thousands of snapshots, almost none of them provide any form of granular search within those snapshots. Snapshots are well suited to restoring the most recent protected copy of the requested data, but not to fulfilling a request to find the fifth version of a file that is four weeks old.
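To make the search gap concrete, here is a minimal sketch of the kind of version index a storage system would need in order to answer such a request. Everything in it is an assumption for illustration: the catalog format (a list of snapshot times paired with path-to-hash mappings), the names `build_version_index` and `find_version`, and the reading of "the fifth version" as the fifth distinct version captured on or before a cutoff date. It does not describe how any particular product works.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileVersion:
    snapshot_time: datetime  # first snapshot in which this version appears
    content_hash: str        # fingerprint of the file contents


def build_version_index(snapshots):
    """Map each path to its distinct versions, oldest first.

    `snapshots` is a hypothetical catalog: a list of
    (snapshot_time, {path: content_hash}) tuples.
    """
    index = {}
    for snapshot_time, files in sorted(snapshots, key=lambda s: s[0]):
        for path, content_hash in files.items():
            versions = index.setdefault(path, [])
            # Record a new version only when the content actually changed.
            if not versions or versions[-1].content_hash != content_hash:
                versions.append(FileVersion(snapshot_time, content_hash))
    return index


def find_version(index, path, n, not_after):
    """Return the n-th (1-based) version captured on or before `not_after`."""
    candidates = [v for v in index.get(path, []) if v.snapshot_time <= not_after]
    return candidates[n - 1] if 0 < n <= len(candidates) else None


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    # Daily snapshots for ten days; the file changes every third day.
    catalog = [
        (start + timedelta(days=d), {"/docs/plan.txt": f"hash-{d // 3}"})
        for d in range(10)
    ]
    idx = build_version_index(catalog)
    print(find_version(idx, "/docs/plan.txt", 3, start + timedelta(days=9)))
```

Without an index like this, answering the request means mounting and walking each snapshot in turn, which is impractical across thousands of snapshots and millions of files.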
The Snapshot Capacity Problem
Snapshots, when first executed, take almost no additional storage capacity because the only thing copied is volume or filesystem metadata. As the snapshot ages, however, the system has to preserve, one way or another, the data that has since changed, so capacity consumption increases. Week- or month-old snapshots, especially on very active filesystems, can consume quite a bit of disk capacity. The problem is that the actual capacity consumption of an old snapshot is very hard to track and even harder to predict.
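The growth pattern described above can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not a model of any specific storage system: it assumes the snapshot must retain the original contents of every block overwritten after it was taken, and that writes land on uniformly random blocks. The function name and parameters are hypothetical.

```python
import random


def simulate_snapshot_growth(total_blocks, writes_per_day, days, seed=0):
    """Return the number of blocks one snapshot holds at the end of each day.

    Assumption: the first time a live block is overwritten after the
    snapshot, its old contents must be preserved for the snapshot.
    Later overwrites of the same block cost nothing extra.
    """
    rng = random.Random(seed)
    preserved = set()  # blocks whose pre-snapshot contents are retained
    history = []
    for _ in range(days):
        for _ in range(writes_per_day):
            preserved.add(rng.randrange(total_blocks))
        history.append(len(preserved))
    return history


if __name__ == "__main__":
    growth = simulate_snapshot_growth(total_blocks=100_000,
                                      writes_per_day=5_000, days=28)
    for day in (1, 7, 14, 28):
        print(f"day {day:2d}: snapshot holds {growth[day - 1]} blocks")
```

The snapshot starts at zero and grows with every first-time overwrite. How fast it grows depends entirely on how much of the write activity hits blocks that have not changed since the snapshot, which is why consumption is hard to predict in advance.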
The Snapshot Integration Problem
The final problem with snapshots is that their integration with other components of the data protection process is limited. While some backup software can trigger a snapshot, back up the data from the snapshot, and then delete the snapshot, the capability is rare and often found only in the most high-end backup solutions. Moreover, even these solutions integrate with only a handful of storage systems.
Use Snapshots for What They Were Designed For
Snapshots, as a technology, were never intended to be a long-term backup. Instead, the intent was to use them, as the name implies, as a short-term representation of data at a moment in time. The system should quickly copy them to another data protection system and then delete the original snapshot.
Snapshots should be more than an external integration between two separate solutions. Instead, the backup solution should “be” the snapshot and fully manage it. That snapshot should then feed the backup solution’s secondary storage, which should in turn tier to a cloud-based storage repository and set up the organization to archive on-premises production data.
To learn more about the new requirements of unstructured data protection, watch our latest on-demand webinar, “The Three New Requirements of Unstructured Data Protection”. Attendees get immediate access to Storage Switzerland’s exclusive eBook, “Modernizing Unstructured Data Protection and Management”.