The backup architecture will likely store 5 to 10 times the amount of data held on the primary storage system. At the same time, organizations want to use backup data for far more than an insurance copy; they want to leverage it for testing and development, reporting, and business analysis. To keep up with this growth, both the software and hardware components of the backup architecture need to scale, ideally in lock-step with each other. The problem is that most architectures don't scale, and those that do rarely scale in ways that complement each other.
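To see where a multiplier like that comes from, consider a simple retention calculation. The sketch below uses invented figures (a 100 TB primary system, six retained weekly fulls, a 5% daily change rate) purely to illustrate the arithmetic:

```python
# Rough estimate of the backup-to-primary capacity ratio under an
# assumed retention policy. All figures are illustrative, not taken
# from any specific environment.

primary_tb = 100            # primary storage in use
weekly_fulls_kept = 6       # assumption: retain 6 weekly full backups
daily_change_rate = 0.05    # assumption: 5% of primary changes daily
incrementals_kept = 42      # assumption: 6 weeks of daily incrementals

full_capacity = weekly_fulls_kept * primary_tb
incremental_capacity = incrementals_kept * primary_tb * daily_change_rate

total_tb = full_capacity + incremental_capacity
print(f"Protection capacity: {total_tb:.0f} TB "
      f"({total_tb / primary_tb:.1f}x primary)")
# -> Protection capacity: 810 TB (8.1x primary)
```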
Software Scaling Issues
Most backup solutions are built around a single backup server, which is the primary control point of the backup process. In many cases, all backup data must be sent to this one server, which also houses all the metadata, such as the backup catalog and media index. Some enterprise solutions also distribute backup data across multiple secondary servers, often called media servers or slave servers, which act as alternate targets for backup data. Media servers protect the primary server from data overload by directing incoming data to the storage arrays or tape libraries they control, freeing the primary backup server to focus on job management and on maintaining metadata and indexes.
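That division of labor can be sketched in a few lines of Python; the class and method names here are hypothetical and do not correspond to any vendor's product:

```python
# Minimal model of the primary-server / media-server split described
# above. All names are hypothetical.

class MediaServer:
    def __init__(self, name, capacity_tb):
        self.name = name
        self.capacity_tb = capacity_tb
        self.used_tb = 0.0

    def ingest(self, client, size_tb):
        # The media server, not the primary, receives the backup data
        # and writes it to the disk array or tape library it controls.
        self.used_tb += size_tb
        return f"{self.name}:/backups/{client}"

class PrimaryBackupServer:
    def __init__(self, media_servers):
        self.media_servers = media_servers
        self.catalog = []   # metadata only: no backup data lands here

    def run_backup(self, client, size_tb):
        # Job management: route the data stream to the least-loaded
        # media server, then record only the metadata locally.
        target = min(self.media_servers, key=lambda m: m.used_tb)
        location = target.ingest(client, size_tb)
        self.catalog.append({"client": client, "size_tb": size_tb,
                             "location": location})

servers = [MediaServer("media-1", 500), MediaServer("media-2", 500)]
primary = PrimaryBackupServer(servers)
primary.run_backup("db-01", 12.0)
primary.run_backup("fileserver-01", 8.0)
print(primary.catalog)
```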
The software faces two fundamental scaling problems. First, in most cases it is totally dependent on the underlying data protection storage hardware to scale capacity, and that hardware, as we will learn, has scaling issues of its own. Second, most solutions are very limited in their ability to distribute the metadata and indexing information the backup software must maintain. At some point, this data about the data becomes too large for the backup solution to manage, forcing IT either to implement another primary backup server or to prune history from the current system.
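To make the second problem concrete, picture the catalog as a store of per-backup entries that a single server can only manage up to some limit; the figures and the pruning policy below are assumptions chosen for illustration:

```python
# Illustrative check of catalog (metadata) growth. The threshold and
# the two escape hatches mirror the choice described above: prune
# history or stand up another primary backup server. Figures are
# assumptions.

CATALOG_LIMIT_GB = 500          # assumed practical limit per server
BYTES_PER_ENTRY = 2_000         # assumed average catalog entry size

def catalog_size_gb(num_entries):
    return num_entries * BYTES_PER_ENTRY / 1e9

def check_catalog(num_entries):
    size = catalog_size_gb(num_entries)
    if size < CATALOG_LIMIT_GB:
        return f"OK: catalog at {size:.0f} GB"
    # Past the limit, IT is forced into one of two disruptive options.
    return ("LIMIT REACHED: prune backup history or deploy an "
            "additional primary backup server")

print(check_catalog(100_000_000))   # OK: catalog at 200 GB
print(check_catalog(300_000_000))   # LIMIT REACHED: ...
```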
Scaling Data Protection Storage
Most data protection architectures today rely on hard disk-based storage for most, if not all, of their protection storage. Making sure the raw capacity of that storage can grow to meet the demands of the enterprise is critical. It is equally important, however, that protection storage grows to meet the performance demands of the process.
In the past, the primary performance concern was how quickly protection storage could ingest data from the various backup and media servers the enterprise had implemented. While ingest performance is still critical, a second measure of performance now matters: backup storage is increasingly expected to host recovered volumes and to support copy data tasks such as test-dev and reporting.
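The ingest side of that requirement lends itself to back-of-the-envelope arithmetic. The sketch below, with a hypothetical nightly volume and backup window, shows how quickly the sustained write requirement adds up:

```python
# Back-of-the-envelope ingest requirement: how fast must protection
# storage absorb data to finish inside the backup window? Values are
# hypothetical.

nightly_backup_tb = 200     # assumption: data sent per night
backup_window_hours = 8     # assumption: allowed backup window

required_gbps = nightly_backup_tb * 1000 / (backup_window_hours * 3600)
print(f"Required sustained ingest: {required_gbps:.1f} GB/s")
# -> Required sustained ingest: 6.9 GB/s
#
# Hosting recovered volumes or test-dev copies adds random-read I/O on
# top of this sequential-write load, a demand that scale-up systems
# were rarely sized for.
```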
Scale-up hardware solutions have a predefined wall in terms of capacity and performance. Once that wall is hit, the organization must either purchase an additional data protection storage system or perform a complex forklift upgrade, migrating old backup jobs to the new system.
The Cost Problem
The scaling limitations of both hardware and software lead to unpredictable costs. When either the backup software architecture or the data protection storage reaches its capacity or performance limits, the “solution” is to buy an additional system or to upgrade the existing one. Either way, that means purchasing new hardware, more than likely additional software, and additional support contracts. These additions are not small increments but major purchases that, in most cases, were never properly planned for.
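The step-function nature of that spending can be illustrated with a toy comparison; the prices and capacities below are invented solely to show the shape of the cost curve, not real quotes:

```python
# Toy illustration of cost growth: a scale-up system forces a large
# step purchase at its capacity wall, while an incrementally scalable
# system adds small nodes. All prices are invented for illustration.

def scale_up_cost(capacity_tb, wall_tb=500, system_price=400_000):
    # Each time the wall is hit, buy another whole system.
    systems_needed = -(-capacity_tb // wall_tb)   # ceiling division
    return systems_needed * system_price

def scale_out_cost(capacity_tb, node_tb=50, node_price=45_000):
    nodes_needed = -(-capacity_tb // node_tb)
    return nodes_needed * node_price

for tb in (450, 501, 950):
    print(f"{tb:>4} TB  scale-up ${scale_up_cost(tb):>9,}"
          f"  scale-out ${scale_out_cost(tb):>9,}")
# Crossing 500 TB doubles the scale-up spend in one unplanned jump,
# while the scale-out spend grows in small, plannable increments.
```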
Conclusion
When scale-up hardware or software hits the wall, the entire data protection process is at risk: data protection stops until IT addresses the limitation. In most cases these limitations appear without warning, and IT must scramble either to prune historical copies or to purchase additional hardware and software. IT should look for solutions that can incrementally scale both the software and hardware components of the data protection architecture as needed, and that provide some predictive analysis of resource utilization to help plan for the next increment.
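A basic version of that predictive analysis is a trend line fitted to historical utilization samples. The sketch below, assuming one capacity sample per day, projects how many days remain before the system hits its wall:

```python
# Minimal sketch of predictive capacity analysis: fit a linear trend
# to daily utilization samples and estimate days until the system
# hits its wall. Sample data is invented for illustration.

def days_until_full(samples_tb, limit_tb):
    """Least-squares slope over daily samples -> days until limit_tb."""
    n = len(samples_tb)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples_tb) / n
    slope = (sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(xs, samples_tb))
             / sum((x - mean_x) ** 2 for x in xs))
    if slope <= 0:
        return None                    # not growing; no projection
    return (limit_tb - samples_tb[-1]) / slope

usage = [410, 413, 417, 420, 424, 428, 431]   # TB used, one per day
remaining = days_until_full(usage, limit_tb=500)
print(f"At the current growth rate, ~{remaining:.0f} days of headroom")
```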



