An application aware snapshot is one that is taken with the knowledge of the application that is storing its data on the volume. If you need a primer on snapshots, please see our previous blog on the subject. Snapshots taken without the knowledge of the application are crash consistent snapshots.
The danger of taking snapshots that are not aware of the application is not so much with user files or machine generated log data. The concern is with the files or blocks that a relational database is using. If you take a snapshot of a file system or volume that holds database data without interfacing with the application, you are asking for trouble. In snapshot parlance, what you are creating are called crash consistent snapshots, and it’s important to understand why they have that name.
Crash Consistent Snapshots
One reason crash consistent snapshots have that name is that they are as consistent as a crash. They are the exact equivalent of flipping the power switch on your storage array and then backing up the disks in that array. Another reason for the crash consistent moniker is that if you need to use a crash consistent snapshot in a recovery scenario, the database in question will have to go through what is called a crash recovery or media recovery process to remove any inconsistencies in the stored data.
Relational databases use what the database world calls write-ahead logging. That is, all changes are recorded in a transaction log first and written to the database files second. A transaction is considered committed once its log records are safely on disk; the database files may be updated later. If a server crashes after a transaction is written to the transaction log but before it has been completely written to the database files, the transaction log will show that this is the case, and the database will attempt to either roll back or roll forward the transaction as appropriate. This is what is referred to as the crash or media recovery process, and it works almost all of the time. Sometimes it does not work, which is why we have backups.
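To make the roll forward and roll back idea concrete, here is a minimal sketch in Python. It is a toy model, not how any particular database implements recovery: the record layout (`tx`, `op`, `key`, `old`, `new`), the `recover` function, and the sample data are all assumptions made up for illustration, and it assumes no two in-flight transactions touch the same key.

```python
def recover(log, data_files):
    """Return the data files as they should look after crash recovery.

    log        -- list of records in the order they were written (hypothetical format)
    data_files -- dict of key -> value as found on disk after the crash
    """
    # A transaction counts as committed only if its commit record reached the log.
    committed = {rec["tx"] for rec in log if rec["op"] == "commit"}
    state = dict(data_files)

    # Roll forward: reapply every change made by a committed transaction,
    # in log order, in case it never reached the data files.
    for rec in log:
        if rec["op"] == "write" and rec["tx"] in committed:
            state[rec["key"]] = rec["new"]

    # Roll back: undo changes from transactions that never committed,
    # newest first, in case they did reach the data files.
    for rec in reversed(log):
        if rec["op"] == "write" and rec["tx"] not in committed:
            state[rec["key"]] = rec["old"]

    return state


# Transaction 1 committed, but only one of its two changes reached the data files.
# Transaction 2 never committed, yet its change was already on disk at the crash.
log = [
    {"tx": 1, "op": "write", "key": "a", "old": 100, "new": 70},
    {"tx": 1, "op": "write", "key": "b", "old": 50, "new": 80},
    {"tx": 1, "op": "commit"},
    {"tx": 2, "op": "write", "key": "c", "old": 10, "new": 0},
]
on_disk = {"a": 70, "b": 50, "c": 0}
recovered = recover(log, on_disk)
```

In this sketch, recovery finishes transaction 1 (key `b` becomes 80) and undoes transaction 2 (key `c` returns to 10). A crash consistent snapshot depends on this kind of process succeeding against whatever state the snapshot happened to capture.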
Application Consistent Snapshots
Since the crash or media recovery process doesn’t always work, you can see why the idea of basing your backup system on crash consistent snapshots scares anyone familiar with this process. This is why, whenever possible, one should use application consistent snapshots, which have that name because they are created at a point in time when the application is aware of their creation. The application can then do things differently to make sure the snapshot will be something it can work with during a recovery.
For example, if you’re going to take a snapshot or backup of Oracle database files, you issue the command ALTER DATABASE BEGIN BACKUP, which tells the database instance to go into backup mode. Oracle continues writing to the database files, but it changes how it writes data to the redo logs (what Oracle calls its transaction logs). Instead of logging only the change vectors (e.g. add 7 to value X in record Y), it also logs complete images of the blocks being changed (e.g. block X now contains these bytes). This allows it to repair any blocks that were in the process of being changed at the exact moment the snapshot or backup occurred. Once the snapshot is complete, you issue ALTER DATABASE END BACKUP to return the database to normal operation.
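The difference between a change vector and a block image is easier to see with a toy example. The Python sketch below is only an illustration under simplifying assumptions: a "block" is two bytes, a change vector is an `(offset, delta)` pair, and the function names are hypothetical. It does not reflect Oracle's actual redo format.

```python
def apply_change_vectors(block, vectors):
    """Apply 'add delta to the byte at offset' changes to a block."""
    patched = bytearray(block)
    for offset, delta in vectors:
        patched[offset] = (patched[offset] + delta) % 256
    return bytes(patched)


def repair_with_block_image(block, image):
    """Ignore the damaged block's contents and replace it with the logged image."""
    return bytes(image)


old_block = bytes([10, 20])
vectors = [(0, 7), (1, 7)]  # add 7 to each value in the block
new_block = apply_change_vectors(old_block, vectors)

# The snapshot catches the block halfway through the write:
# the first byte is already new, the second byte is still old.
torn_block = new_block[:1] + old_block[1:]

# Replaying the change vectors on the torn block double-applies the first change.
replayed = apply_change_vectors(torn_block, vectors)

# Replaying the full block image does not depend on what the torn block contains.
repaired = repair_with_block_image(torn_block, new_block)
```

Here `replayed` ends up with a wrong first value because the change vector assumes it starts from the old block, while `repaired` matches the intended block regardless of where the write was interrupted.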
The Role of Backup Software and Application Aware Snapshots
Since making application aware snapshots can be a complicated process, backup software can play a vital role in it. Depending on the application, the backup software may be able to interface directly with the application to ensure that the snapshot is application aware. A perfect example of that would be interfacing with the Volume Shadow Copy Service (VSS) in Windows. If the backup software is unable to interface directly with the application, the process is usually handled with pre- and post-scripts, which the backup software can run for you before and after making a snapshot.
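The pre- and post-script pattern can be sketched in a few lines of Python. This is a hedged illustration, not any backup product's API: `application_quiesced`, `take_application_aware_snapshot`, and the three callables passed in are hypothetical names. In practice the pre-script would do something like put a database into backup mode and the post-script would take it back out.

```python
from contextlib import contextmanager


@contextmanager
def application_quiesced(pre_script, post_script):
    """Run pre_script before the body and post_script after it, even on failure."""
    pre_script()
    try:
        yield
    finally:
        # Always release the application, so a failed snapshot
        # does not leave it stuck in backup mode.
        post_script()


def take_application_aware_snapshot(pre_script, snapshot, post_script):
    """Wrap a snapshot call with the application's pre- and post-scripts."""
    with application_quiesced(pre_script, post_script):
        return snapshot()


# Usage with stand-in callables that only record the order of events.
steps = []
snapshot_id = take_application_aware_snapshot(
    pre_script=lambda: steps.append("pre-script"),
    snapshot=lambda: steps.append("snapshot") or "snap-1",
    post_script=lambda: steps.append("post-script"),
)
```

Running the post-script in a `finally` block is a deliberate choice in this sketch: if the snapshot fails, the application should still be returned to normal operation.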
Different database products prepare for snapshots differently, but each needs to be aware that you are creating a snapshot for that snapshot to be reliable during a recovery. Making snapshots without talking to your application will probably work most of the time, but will definitely fail at some point. Sure, you could fall back to a previous snapshot and use your transaction logs to roll forward to the current time, assuming all of your transaction logs are intact and the process works flawlessly. It just seems an unnecessary risk for such an important task.
Sponsored by Commvault