In an upcoming webinar, Storage Switzerland will make the case for using snapshots as a primary component of data protection. For this strategy to work, several things are needed from the storage infrastructure. First, it must be able to keep an almost unlimited number of snapshots; second, it needs a replication process that can transfer the snapshot deltas (the changed blocks of data) to a safe place; and third, the entire storage infrastructure has to be very cost effective. In this column we will look at the first requirement: the ability to create and store a large number of snapshots without impacting performance.
Click to register for the upcoming webinar "How Snapshots CAN be Backups!"
Types of Snapshot Technology
Modern storage system software handles data at a very granular, often sub-block, level and understands the way data is stored on the underlying hardware. It uses this knowledge to provide a wide variety of features including thin provisioning, advanced RAID protection, data tiering/caching and, of course, snapshots.
In general, when a snapshot of a volume or LUN is taken, all of its data is essentially frozen in time, hence the term “snapshot”. The volume or LUN itself remains available to the connecting servers and applications. How the storage software tracks subsequent changes to that data is the critical differentiator between snapshot methods.
For years the most common way for a storage system to track updates to volumes under a snapshot was a technique called “copy-on-write”. When a data block is about to be changed or updated, the storage system software first copies the old block of data to a reserved section of storage, then writes the modified block into the original volume. The snapshot version of the volume or LUN is then updated so that it points to the preserved copy of the old block.
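The copy-on-write steps described above can be sketched in a few lines of Python. This is a toy model, not any vendor's implementation: the class name `CowVolume`, the in-memory dictionaries, and the `physical_writes` counter are all illustrative assumptions, and it assumes each snapshot keeps its own reserved area.

```python
class CowVolume:
    """Toy copy-on-write volume (hypothetical, for illustration only)."""

    def __init__(self, num_blocks):
        # Live volume: block number -> data.
        self.blocks = {i: b"\x00" for i in range(num_blocks)}
        # Assumption: one reserved area per snapshot holding preserved old blocks.
        self.snapshots = []
        # Counts every block written to storage, including preserved copies.
        self.physical_writes = 0

    def snapshot(self):
        # Taking a snapshot copies no data; it only opens an empty reserved area.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block_no, data):
        old = self.blocks[block_no]
        # Copy the old block aside for every snapshot that has not yet preserved it.
        for preserved in self.snapshots:
            if block_no not in preserved:
                preserved[block_no] = old
                self.physical_writes += 1
        # Only then overwrite the block in place on the original volume.
        self.blocks[block_no] = data
        self.physical_writes += 1

    def read_snapshot(self, snap_id, block_no):
        # Preserved copy if the block changed after the snapshot, else the live block.
        return self.snapshots[snap_id].get(block_no, self.blocks[block_no])


vol = CowVolume(num_blocks=4)
vol.write(0, b"A")          # no snapshots yet: one physical write
snap = vol.snapshot()
vol.write(0, b"B")          # one copy of the old block plus the new write
print(vol.read_snapshot(snap, 0), vol.blocks[0], vol.physical_writes)
```

In this sketch a single logical write after a snapshot costs two physical writes, and the cost grows with each additional snapshot that still needs its own copy of the old block.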
The challenge with this approach is that every write operation creates at least one additional write transaction, and more than likely an additional RAID parity calculation as well. The more snapshots there are, the more writes a one-block change can trigger. And of course, an active volume or LUN may have thousands of blocks changing every second. This means that storage systems using copy-on-write techniques typically support only a very small number of active snapshots (four to five), and can show a performance impact as soon as the first snapshot is taken.
For the purposes of giving primary storage a larger role in the data protection process, the copy-on-write technique is less than desirable. Four to five snapshots is not enough to provide a complete set of recovery points and having the data protection process impact storage performance is never acceptable.
The second form of snapshot is typically called a “re-directed” snapshot (often referred to as redirect-on-write). It starts out much like the copy-on-write method, but handles block updates very differently. When a block needs to be changed, the old block is not copied to a separate volume. Instead, the new data is written to a new location and the storage software merely updates metadata pointers. The active volume points to the newest version of the block and the snapshot version points to the old version of the block.
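The pointer handling described above can be sketched the same way. Again this is a toy model under stated assumptions: the class name `RedirectVolume`, the append-only `store` list, and the `physical_writes` counter are hypothetical, and a real system would share pointer metadata between snapshots rather than copy the whole table as this sketch does.

```python
class RedirectVolume:
    """Toy re-directed snapshot volume (hypothetical, for illustration only)."""

    def __init__(self, num_blocks):
        # Append-only physical store; location 0 is a shared zero-filled block.
        self.store = [b"\x00"]
        # Active volume metadata: block number -> location in the store.
        self.active = [0] * num_blocks
        # Each snapshot is a frozen copy of the pointer table, not of the data.
        self.snapshots = []
        # Counts data blocks written to storage.
        self.physical_writes = 0

    def snapshot(self):
        # Simplification: copy the whole pointer table; no data blocks move.
        self.snapshots.append(list(self.active))
        return len(self.snapshots) - 1

    def write(self, block_no, data):
        # Write the new data to a fresh location; the old block stays where it is.
        self.store.append(data)
        self.physical_writes += 1
        # Redirect only the active volume's pointer to the new location.
        self.active[block_no] = len(self.store) - 1

    def read(self, block_no):
        return self.store[self.active[block_no]]

    def read_snapshot(self, snap_id, block_no):
        # The snapshot's pointer still refers to the old version of the block.
        return self.store[self.snapshots[snap_id][block_no]]


vol = RedirectVolume(num_blocks=4)
vol.write(0, b"A")
snap = vol.snapshot()
vol.write(0, b"B")          # one physical write, however many snapshots exist
print(vol.read_snapshot(snap, 0), vol.read(0), vol.physical_writes)
```

In this sketch every logical write costs exactly one data write regardless of how many snapshots are retained; the price is metadata that has to be kept and looked up, which is where the processing power of the storage hardware comes in.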
Assuming the storage software is running on storage hardware with appropriate processing power, the users and applications should experience no real performance impact no matter how many snapshots are retained. This redirected form of snapshots is ideal for a data protection design that is going to count on the primary storage infrastructure to carry more of its own weight.
Retaining a high number of snapshots is just the first step in a primary-storage-based data protection strategy, but it’s a critical step. Copy-on-write snapshots bring no advantages to this design because they are too limited in number and typically impact performance.
Understanding snapshot types is only the beginning, though. In our webinar we will review this information and spend more time on the other two requirements: a replication process that can transfer snapshot deltas to a secure storage area, and making the entire storage infrastructure cost effective so that three storage systems can be bought for the price of one.