While the applications that protect data have vastly improved over the last 20 years, they still often struggle to keep up with the technical challenges of data growth, shrinking backup and recovery windows, and demands for greater disaster resilience. At the same time, user expectations have also risen.
Application owners and storage and virtual machine administrators are now insisting on an increased level of control and visibility into the data protection process. This often leads to these individual stakeholders deploying their own data protection infrastructures. Sometimes this is simply the outgrowth of a desire to use the native backup utilities that come with applications, but other times it’s a reflection of a loss of confidence that their individual recovery objectives can be met using the existing tools and processes offered by the backup team.
For the most part, it is not the backup applications that are to blame; rather, it is the backup process as a whole, which futilely attempts to meet those expectations. Consequently, the architecture of data protection needs to fundamentally change to deal with these new realities.
What is a Data Protection Architecture?
A data protection architecture does not mean selecting one enterprise backup application and forcing every production application and user to rely on its tools to protect their data. In fact, efforts at enforcing standardization are seldom successful, especially over the long term. Instead, data centers need an architecture that allows data protection to be delivered as a service, with the flexibility to allow different modules to be plugged in at the appropriate time to address specific user or application requirements.
With a data protection architecture there is no lock-in to a single application or storage device; instead, these individual tools can snap right into a broader framework. In short, it allows a mixture of data protection tools, from standalone software to storage systems to dedicated backup appliances, to be used as a service and managed as a single entity.
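To make the idea concrete, here is a minimal sketch of what such a pluggable framework might look like. All of the names (`ProtectionModule`, `DataProtectionService`, the example modules and the `sales_db` dataset) are hypothetical illustrations, not part of any real product: the point is simply that each tool implements a common interface and snaps into one centrally managed service.

```python
from abc import ABC, abstractmethod

class ProtectionModule(ABC):
    """Hypothetical interface a data protection tool implements to plug in."""
    @abstractmethod
    def protect(self, dataset: str) -> str: ...
    @abstractmethod
    def recover(self, dataset: str) -> str: ...

class SnapshotModule(ProtectionModule):
    """Example module: short-term protection via snapshots."""
    def protect(self, dataset): return f"snapshot taken of {dataset}"
    def recover(self, dataset): return f"{dataset} restored from snapshot"

class DedupApplianceModule(ProtectionModule):
    """Example module: long-term retention on a deduplicating appliance."""
    def protect(self, dataset): return f"{dataset} deduplicated to appliance"
    def recover(self, dataset): return f"{dataset} restored from appliance"

class DataProtectionService:
    """Central framework: modules register once and are managed as one entity."""
    def __init__(self):
        self._modules = {}

    def register(self, name: str, module: ProtectionModule):
        self._modules[name] = module

    def protect(self, dataset: str, policy: list[str]) -> list[str]:
        # The policy names which registered modules apply to this dataset,
        # so one dataset can be protected by multiple tools.
        return [self._modules[name].protect(dataset) for name in policy]

service = DataProtectionService()
service.register("snapshot", SnapshotModule())
service.register("appliance", DedupApplianceModule())
results = service.protect("sales_db", policy=["snapshot", "appliance"])
```

The design choice worth noting is that the service, not any individual tool, owns the policy: swapping one module for another changes a registration call, not the production application's protection workflow.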
The Requirements of a Data Protection Architecture
As discussed above, the first key requirement of a data protection architecture is flexibility. The focus for both the application owner and the backup administrator should be on meeting the data protection and recovery need, not on how to conform those needs to the capabilities of a single tool. Therefore, to best meet data protection requirements enterprise-wide, a variety of tools should be used.
For example, snapshots may be used for short-term protection and rapid recovery while a disk backup appliance is used for long-term retention and compliance. A data protection architecture allows the protection of the production applications to be addressed by multiple tools. The ultimate goal of the data protection architecture is to be the framework by which disparate protection tools can universally access backup resources, so that protection can be delivered as a service.
The second key requirement of a data protection architecture is that the individual data protection utilities need to be able to write directly to storage. Routing data through an intermediary location like a backup server takes time and creates extra trips across the network infrastructure, for both backups and recoveries. Direct access is essential for meeting the continual demands of ever-decreasing data protection and recovery windows.
Direct access also allows data protection tools to extend their usefulness. For example, as mentioned above, snapshots can write their most recent data to primary storage as they do today. As a snapshot ages, its data could then be moved to disk backup appliance storage with deduplication for more cost-effective retention of that information.
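The aging process described above can be sketched as a simple tiering policy. This is an illustrative model only, with assumed names (`Snapshot`, `age_out`) and an assumed seven-day threshold; a real implementation would be driven by the snapshot tool and appliance APIs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Snapshot:
    name: str
    created: datetime
    tier: str = "primary"  # recent snapshots live on primary storage

def age_out(snapshots, max_primary_age_days=7, now=None):
    """Move snapshots past the age threshold to deduplicated appliance storage."""
    now = now or datetime.utcnow()
    cutoff = timedelta(days=max_primary_age_days)
    for snap in snapshots:
        if snap.tier == "primary" and now - snap.created > cutoff:
            # In a real system this would trigger a direct copy from
            # primary storage to the backup appliance, then release
            # the primary-storage capacity.
            snap.tier = "dedup_appliance"
    return snapshots

snaps = [
    Snapshot("daily-01", created=datetime(2013, 6, 1)),
    Snapshot("daily-27", created=datetime(2013, 6, 27)),
]
age_out(snaps, max_primary_age_days=7, now=datetime(2013, 6, 28))
```

Here the 27-day-old snapshot is demoted to the appliance tier while the one-day-old snapshot stays on primary storage for rapid recovery.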
The third key requirement of a data protection architecture is for data to be kept in its native format. Since multiple protection utilities within the data protection architecture may be operating on this data, it should be in a common format. The native format is the common denominator needed to allow interoperability.
An increasing number of data protection tools have the ability to run an application directly from the backup storage location. If data first has to be exported out of a backup format prior to execution, the near instant restore time of direct recoveries is greatly hindered.
There is no longer a need to store information in a proprietary format. The justification for proprietary storage of backup data came from the days when tape was THE backup storage device. Proprietary formats were used so data could be packaged up and the tape drive could stream more efficiently. Now, thanks to disk backup appliances, prepackaging is no longer necessary.
The final requirement is that, similar to a loose federation of states, the data protection architecture needs a governing body to make sure all the required components work together to simplify management. A key component is a central catalog for indexing and searching for information when needed for recovery or compliance purposes. While many data protection utilities have these capabilities built in, there is no single, global view across all the disparate catalogs. The advantage of a global view is that from one interface, the backup manager would know all the locations of a relevant piece of data and have the visibility to determine which copy is most appropriate for the recovery request at hand.
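A global catalog view can be illustrated with a small sketch. The catalogs, fields and selection rule below are all assumptions made for illustration: each tool keeps its own catalog of copies, the framework merges them, and a simple policy (newest copy first, fastest restore as the tie-breaker) picks the copy to recover from.

```python
# Hypothetical per-tool catalogs; in practice these would be queried
# from each protection utility's own index.
snapshot_catalog = [
    {"dataset": "sales_db", "taken": "2013-06-01T02:00",
     "tier": "primary", "restore_cost": 1},
]
appliance_catalog = [
    {"dataset": "sales_db", "taken": "2013-05-01T02:00",
     "tier": "appliance", "restore_cost": 5},
]

def all_copies(dataset, *catalogs):
    """Global view: every known copy of a dataset across all catalogs."""
    return [entry for cat in catalogs for entry in cat
            if entry["dataset"] == dataset]

def best_copy(dataset, *catalogs):
    """Pick the newest copy; break ties with the cheapest (fastest) restore."""
    copies = all_copies(dataset, *catalogs)
    return max(copies, key=lambda e: (e["taken"], -e["restore_cost"]))
```

From one interface, the backup manager sees both the recent snapshot on primary storage and the older appliance copy, and the policy selects the snapshot for a rapid restore.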
In addition to a central catalog, central control is also needed so that each individual tool does not have to be managed manually on a continual basis. Further, as this architecture evolves, data protection tools and applications may simply stop shipping with a user interface. Instead, the application will simply embed into the central data protection architecture framework. This would eliminate a significant part of the usual application development effort and may lead to increasingly specialized tools for specific needs. Essentially, third-party developers won't need to reinvent the wheel (basic backup foundations) in order to deliver a high-quality data protection module.
Data Protection Architecture in Action
While in its early stages, companies like EMC are already starting to deliver solutions based on the foundations of a data protection architecture. EMC refers to it as a “Protection Storage Architecture”, but the point is that they are taking the approach of applying the best tools for the job, both software and hardware, and integrating them to deliver an architecture rather than a point product.
It is this type of cross-pollination of “best-of-breed” data protection modules between and across various data protection tools that will characterize the interchangeability and flexibility of the data protection architecture framework. You can read more about their approach here.
The Data Protection Architecture ROI
The ROI on a data protection architecture should be a significant reduction in the amount of time spent managing and supporting the backup process. It should also lead to less shelf-ware, where products are no longer used because the data protection needs have outgrown them. The use of storage resources should also be more efficient, leading to lower capacity investment costs. The real value, however, will be meeting the customized protection and recovery needs of the application owners and users while reducing the backup manager headaches that often result.
EMC Data Domain is a client of Storage Switzerland