A recent blog post by B&L Associates brought the problem of data loss to light. The post cites surveys from Kroll Ontrack and EMC, which indicated that organizations see data loss as their single most significant risk and that the cost of losing data approaches two trillion dollars worldwide. Assuming that most of these organizations have some form of data protection process in place, and that those processes include some form of disaster recovery, why does data loss still occur and how can it be prevented?
Why Does Data Loss Occur?
A data protection process should include some form of backup application and a secondary storage device; most data centers have multiple applications and storage devices. Obviously, the primary purpose of this process is to make sure that data loss does not occur. If we’re to believe (and we see no reason not to) the Kroll Ontrack survey’s finding that 63% of IT professionals see data loss as their number one concern, then these processes are clearly falling short. There are three main reasons why.
Your Recovery is only as good as your Copy
The first cause of data loss from within a protected data set is that the copy was either no good in the first place or degraded over time. There is a marketing phrase that data protection vendors like to throw around: “It is not about backup, it is about recovery.” But this is totally inaccurate. Data protection IS all about backup. If the copied data is not perfect, then any form of recovery is a waste of time. This means that data not only needs to be copied in a static state, it also needs to be stored on a device that can maintain the integrity of that data over the course of time.
Getting good copies means selecting backup, replication, and snapshot software that interfaces directly with the applications it protects, to make sure their data is captured in a valid state. Integrity means that the data protection software and its associated storage can audit themselves and confirm that data has not degraded over time. Look for disk backup or tape backup hardware that can perform these media audits, either in conjunction with the software or independently.
For applications, the real verification of copy quality is an actual recovery. Thanks to virtualization and intelligent data protection applications, there is almost no excuse not to test the recoverability of a data center’s key applications on a regular basis – as often as once a week. Many of these applications can instantiate an isolated copy of all of an application’s components and data to confirm that a complete, valid copy was captured. This process also gives IT administrators the regular practice they need to execute a real recovery flawlessly when the time comes.
Frequency of Protection
The second form of data loss occurs when data is created or modified and then lost before the next data protection event. Today, since almost every project or task starts in digital form with no hardcopy ever made, data protection needs to occur much more frequently than the traditional once-per-night backup. Organizations should leverage snapshots, replication, and changed-block-tracked backups to perform lower-impact protection events more frequently, so that these intra-day changes are captured.
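The exposure here is easy to quantify: the worst-case amount of data at risk is the longest gap between consecutive protection events (what practitioners call the recovery point objective, or RPO). A small sketch, using sample timestamps of our own invention:

```python
from datetime import datetime, timedelta


def worst_case_loss_window(event_times):
    """Given timestamps of protection events (backups, snapshots,
    replication syncs), return the longest gap between consecutive
    events -- the worst-case window of changes that could be lost."""
    ordered = sorted(event_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps) if gaps else timedelta(0)


# Nightly-only backups leave a 24-hour exposure window...
nightly = [datetime(2015, 6, 1, 22), datetime(2015, 6, 2, 22)]

# ...while adding a single midday snapshot halves it.
with_snapshot = nightly + [datetime(2015, 6, 2, 10)]
```

Each additional low-impact event (snapshot, replication sync, changed-block backup) shrinks this window further, which is exactly why frequency matters as much as the nightly job itself.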
As the number of applications that protect data and the frequency of protection events both increase, it is important that IT document which applications are protecting which data sets and how often. This documentation should give an operator a map of which protected copy to use in each particular restore situation.
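That map can even live in machine-readable form so the right copy is chosen mechanically rather than from memory. The data sets, tool names, and retention figures below are purely hypothetical, a sketch of the idea:

```python
# Hypothetical protection map: which tool protects which data set,
# how often, and in what order copies should be considered for a restore
# (fastest / most recent tier first).
PROTECTION_MAP = {
    "erp-database": [
        {"tool": "array-snapshot", "frequency_hours": 1, "retention_days": 2},
        {"tool": "backup-app", "frequency_hours": 24, "retention_days": 90},
    ],
    "file-shares": [
        {"tool": "replication", "frequency_hours": 4, "retention_days": 7},
        {"tool": "backup-app", "frequency_hours": 24, "retention_days": 365},
    ],
}


def copy_for_restore(data_set, age_of_data_days):
    """Return the first protection tier whose retention still covers the
    requested point in time, or None if no copy reaches back that far."""
    for tier in PROTECTION_MAP[data_set]:
        if age_of_data_days <= tier["retention_days"]:
            return tier["tool"]
    return None
```

Even a simple table like this answers the two questions an operator faces under pressure: which copy is freshest, and which copy actually still exists for the point in time being restored.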
Disaster Recovery As a Checkbox
Disaster recovery (DR) is a requirement for almost any organization of any size. But many organizations just tick the DR checkbox, with no real thought given to what will happen in an actual disaster. For example, many companies simply ship or replicate data to a secondary site, but that site is not capable of sustaining the organization through a disaster: it lacks the power, cooling, networking, or in some cases the servers needed to run even a subset of the organization’s applications.
Another DR challenge is a secondary site that is not far enough from the primary site. Our advice is that if you can drive to the DR site, it is probably too close. In a natural disaster, a nearby DR site is likely to be affected by the same event as the primary data center.
Bonus Cause – Shadow IT
Another potential cause of data loss is shadow IT. With users self-selecting IT services, data can spend its entire lifecycle outside of IT oversight. We’ll discuss getting a handle on data loss caused by shadow IT in an upcoming column.
Data loss is a big problem, but solutions exist that should practically eliminate it. Those solutions need to be combined with well-documented IT processes to make sure that data integrity and application recovery are verified, and that operations knows exactly which copy to recover from and under what circumstances. In our Backup 2.0 Road Tour Workshops, done in conjunction with TechTarget’s Storage Decisions events, you can learn how to eliminate data loss as well as how to create processes and procedures to make sure that the right data is recovered at the right time under the right circumstances. The Backup 2.0 Road Tour is coming to a city near you; click here to find the upcoming schedule. Don’t see one close to you? Leave a comment about where else we should visit.