Thanks to recent advancements in software and hardware, data centers have a unique opportunity to make their storage infrastructures more responsive, more cost-effective and easier to manage. For the past few years, primary storage has had this opportunity because of all-flash arrays and hybrid flash arrays. Flash-based systems have the potential to consolidate data center production storage onto a single system. Secondary storage, on the other hand, is a mess. Bringing the same simplification to the various protection and replication processes is a harder problem. Converged Data Management (CDM) solutions may be the best hope yet for solving it.
Defining Converged Data Management
Many IT professionals may mistake advanced disk backup appliances for CDM. While disk backup appliances are becoming more sophisticated, offering features like tighter integration with applications, they are not CDM solutions. A CDM solution is designed to support all types of secondary storage use cases. These use cases range from replication services for Tier 1 or Tier 2 apps to backup and archive. CDM solutions should also provide robust search, so that the proverbial needle of data can be found in the haystack. They can even provide basic file serving as well as file sync and share capabilities. Because of these new use cases, CDM solutions need I/O performance that is optimized both for fast ingest and for serving as a live data source.
CDM solutions, however, are more than just multi-purpose hardware. They can also perform many of the data protection and management functions of separate software solutions. CDM solutions will leverage available API sets to provide agentless integration with existing environments and protect them. VMware is an ideal example: its VADP API allows a CDM solution to interface with the hypervisor and perform backups of its virtual machines (VMs). The ideal CDM solution will also provide its own APIs that allow for integration into users’ existing management consoles and automation solutions.
The Five Components of a CDM Solution
CDM solutions will have five critical components. The foundational element is a storage software layer that will provide a scale-out architecture similar to the distributed computing models used in hyper-scale data centers. This allows the storage software to provide capabilities like non-disruptive scalability and true multi-node deduplication and compression.
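To make "multi-node deduplication" concrete, here is a minimal sketch of one common approach: data is split into chunks, each chunk is identified by a hash of its content, and the hash decides which node stores it, so identical chunks from any source are kept only once across the cluster. The class name `DedupCluster`, the tiny chunk size and the modulo placement are assumptions made for illustration and do not describe any particular vendor's implementation. A production system would more likely use variable-size chunking and a placement scheme such as consistent hashing so that nodes can be added without relocating most chunks.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the demo is easy to follow


class DedupCluster:
    """Hypothetical content-addressed chunk store spread over several nodes."""

    def __init__(self, node_count):
        # Each node is modeled as a dict mapping chunk digest -> chunk bytes.
        self.nodes = [dict() for _ in range(node_count)]

    def _node_for(self, digest):
        # Placement depends only on the chunk's content hash, so the same
        # chunk always lands on the same node regardless of who wrote it.
        return int(digest[:8], 16) % len(self.nodes)

    def write(self, data):
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # setdefault keeps the existing copy if the chunk is already stored.
            self.nodes[self._node_for(digest)].setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.nodes[self._node_for(d)][d] for d in recipe)

    def stored_bytes(self):
        """Total bytes physically stored across all nodes."""
        return sum(len(c) for node in self.nodes for c in node.values())


cluster = DedupCluster(node_count=3)
backup_a = cluster.write(b"AAAABBBBAAAA")  # 12 logical bytes
backup_b = cluster.write(b"BBBBCCCC")      # 8 logical bytes
print(cluster.read(backup_a), cluster.stored_bytes())
```

In this example the two backups hold 20 logical bytes but only three unique chunks, so the cluster stores 12 bytes in total.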
The second component is the hardware. Although a CDM solution is built on a software foundation, the speed at which most data centers operate requires that even software-focused vendors provide the equipment. IT professionals want the turnkey experience.
The third component is the software that interfaces with the application APIs mentioned above. Environments like VMware, Oracle, MS-SQL and Exchange all have well-defined APIs. These powerful APIs mean that agentless integration with these environments can deliver quick implementation while still providing application-consistent protection.
The fourth component is the ability to provide “live I/O”. Increasingly, IT professionals are looking to recover their applications directly on the data protection device. The ability to recover on the CDM eliminates recovery time lost to a network transfer or rehydration, and reduces the downtime associated with waiting for a production storage system to be fixed.
Finally, the fifth component of a CDM solution is a “cloud-out” capability. Cloud-out allows a data center to leverage the cloud to store a disaster recovery copy of data or to act as an archive to curtail on-premises data growth. Once that data is in the cloud, it should also be available to be “acted upon” by analytics processes or cloud-based disaster recovery.
The ROI of Converged Data Management
The return on investment (ROI) of a CDM solution should be significant. The first element of its ROI is the overall simplification of the data protection process. There are so many moving parts in today’s “stitched together” data protection architectures that it is a miracle they work at all. A CDM solution, over time, will be able to replace most of the data protection software in the environment and all of the secondary storage hardware. Collapsing the data protection architecture to a few software applications and one hardware platform will make management of that platform significantly easier.
The second ROI element of a CDM solution will be the ability to use less expensive hardware. CDM solutions will typically be built using a distributed computing model similar to those used by Google and Amazon. A distributed computing model enables the CDM vendor to leverage commodity servers and storage (albeit in custom configurations) to meet the requirements of the data center. While these systems will have the appropriate level of redundancy for the task at hand, they will still be less expensive than the traditional scale-up data protection hardware that needs to over-provision processing power and DRAM.
The third ROI element of a CDM solution is the hard cost savings from eliminating most of the data management software, and in some cases all of it. While this elimination is part of the simplification ROI described above, the hard cost savings of not having to license multiple data protection applications can’t be overstated.
Finally, because CDM solutions can provide “live I/O”, they also have the potential to replace secondary storage for test and development. In this use case the converged data management system’s snapshot capability becomes valuable, because test/dev teams can work on many near-production copies of data without fear of corrupting primary data sets.
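The following is a minimal sketch of why snapshot-based copies are safe for test/dev: each team gets a writable clone layered over a read-only snapshot, so writes land in the clone's private overlay and never reach the shared base. The `Snapshot` and `Clone` classes and the block-number model are hypothetical, chosen only to illustrate the idea, and are not taken from any specific CDM product.

```python
class Snapshot:
    """Hypothetical read-only, point-in-time image: block number -> bytes."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)  # private copy; never modified afterwards

    def read(self, block_no):
        return self._blocks.get(block_no)

    def clone(self):
        # Creating a clone copies no data; it only references this snapshot.
        return Clone(self)


class Clone:
    """Writable view that stores only the blocks changed since cloning."""

    def __init__(self, base):
        self._base = base
        self._delta = {}  # blocks overwritten by this clone

    def read(self, block_no):
        # Prefer the clone's own version, otherwise fall back to the snapshot.
        if block_no in self._delta:
            return self._delta[block_no]
        return self._base.read(block_no)

    def write(self, block_no, data):
        # Writes go to the private overlay, leaving the snapshot untouched.
        self._delta[block_no] = data


snap = Snapshot({0: b"prod-a", 1: b"prod-b"})
dev = snap.clone()
qa = snap.clone()
dev.write(0, b"dev-x")  # only the dev clone sees this change
print(dev.read(0), qa.read(0), snap.read(0))
```

Because a clone holds only its changed blocks, many near-production copies can coexist while consuming little additional capacity.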
Moving to a Converged Data Management Solution
The biggest challenge that IT faces when trying to migrate to a new technology is deciding where and how to start. CDM provides an easy crawl, walk, run path to full adoption. The “crawl” phase of adopting CDM should be as a backup solution for newer virtualized workloads. The CDM provides an intelligent, scalable distributed software architecture that runs on commodity hardware. It should provide better performance at scale with better data efficiency at lower costs.
The “walk” step is to look for opportunities to use the CDM as a standard backup solution for current virtualized and physical workloads. The ability to provide replication at the CDM level will be attractive for IT departments looking to conserve primary storage resources. File-level search and restore will also provide significant time and infrastructure savings.
The final step, the “run” phase, is to start leveraging CDM platforms for Test and Development use cases where the CDM platform will serve as a mountable storage device. Doing Test and Development off of a CDM platform avoids the possibility of resource contention on primary storage and provides a user-friendly search and mount experience at the file level.
While the production storage environment is becoming more streamlined, the backend data protection infrastructure is becoming more fractured. User expectations for recovery are becoming more demanding. To keep up, IT professionals are forced to use an increasing number of hardware and software solutions, hoping that one of them will help them meet their commitments. CDM promises to change all that. By delivering a converged platform based on a distributed computing model, CDM solutions should allow a single hardware/software platform to protect the entire production storage infrastructure.
Sponsored by Rubrik
Rubrik’s Mission: To transform complex IT into simple, beautiful and complete products. The goal is to liberate IT professionals from the limitations of legacy IT by bringing web-scale technology and consumer design to the enterprise. The team is made up of technologists and engineers who have built Google File System, Google Search, YouTube, Facebook Data Infrastructure, VMware Virtualization, Data Domain File System, and Amazon Infrastructure. Rubrik’s converged data management platform eliminates backup software by integrating data protection, instant recovery, and DevOps infrastructure into a single fabric.