With EMC’s acquisition of Spanning, the data protection arm of EMC’s new core technologies division (CTD) now has a tool for protecting data in cloud-based applications like Salesforce.com, Google Apps and Office 365. This was a needed acquisition that filled a hole in EMC’s data protection portfolio. However, as one of the analysts pointed out to Stephen Manley during last week’s analyst summit in Boston, it represents yet another backup data silo for EMC clients to manage. While on the surface these silos of backup data may seem like a challenge, they fit into a much broader strategy, metadata mastery, that EMC hopes to leverage to transform their business as well as their clients’.
Catalogue of Catalogues
Between NetWorker, Avamar, ProtectPoint, Source One and now Spanning, EMC clients could have up to five separate backup and archive catalogues to manage in order to track their protected business data. Manley acknowledged that there is a need for a global cataloguing service that can aggregate all the backup metadata across the enterprise, so that organizations can more easily manage and search their data sets to support key use cases like data analytics, e-discovery and general-purpose data recovery.
To be fair, this need for a “catalogue of catalogues” is something that Manley has been talking about for nearly 18 months. During the Data Protection and Availability Summit that took place in New York City in June of 2013, Manley stated that it would be critical to have a layer of intelligence that spans hybrid cloud environments and gives businesses the insight they need into their data assets, regardless of which storage or protection silo those assets reside in.
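To make the “catalogue of catalogues” idea concrete, here is a minimal sketch in Python. It is not EMC’s design or any product’s API: the `CatalogueEntry` and `GlobalCatalogue` names, the fields and the sample records are all hypothetical, chosen only to show how metadata from separate silos could be pooled and searched in one place.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CatalogueEntry:
    """One protected item as recorded by a single backup or archive silo."""
    source: str      # hypothetical silo label, e.g. "NetWorker"
    path: str        # logical name of the protected item
    owner: str       # business owner of the data
    backed_up: str   # ISO 8601 date, so plain string comparison orders correctly


class GlobalCatalogue:
    """Hypothetical aggregation layer over several per-silo catalogues."""

    def __init__(self) -> None:
        self._entries: List[CatalogueEntry] = []

    def ingest(self, entries: Iterable[CatalogueEntry]) -> None:
        """Pull the metadata records of one silo into the global view."""
        self._entries.extend(entries)

    def search(self, **criteria: str) -> List[CatalogueEntry]:
        """Return entries whose fields equal every given criterion."""
        return [
            entry for entry in self._entries
            if all(getattr(entry, field) == value
                   for field, value in criteria.items())
        ]

    def latest_copy(self, path: str) -> Optional[CatalogueEntry]:
        """Return the most recent copy of an item across all silos, if any."""
        return max(self.search(path=path),
                   key=lambda entry: entry.backed_up,
                   default=None)


# Invented sample records standing in for three separate catalogues.
networker = [CatalogueEntry("NetWorker", "/finance/q3-report.xlsx", "finance", "2014-10-01")]
avamar = [CatalogueEntry("Avamar", "/finance/q3-report.xlsx", "finance", "2014-10-20")]
spanning = [CatalogueEntry("Spanning", "salesforce/accounts", "sales", "2014-10-25")]

catalogue = GlobalCatalogue()
for silo in (networker, avamar, spanning):
    catalogue.ingest(silo)

finance_sources = [entry.source for entry in catalogue.search(owner="finance")]
newest = catalogue.latest_copy("/finance/q3-report.xlsx")
```

With a single aggregated view, a recovery or e-discovery request only has to ask one question (“where is the newest copy of this item?”) rather than query each silo in turn.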
Metadata Mantra
During this most recent Analyst Summit, Manley emphasized that a big part of EMC’s strategic direction, as it helps organizations gradually migrate from 2nd platform client/server computing environments to 3rd platform cloud computing infrastructures, is to be a premier player in the metadata access layer. Metadata, or data about the data, is what allows data mining and business analytics applications to make complex queries across disparate data stores within the data center and/or hybrid cloud environments.
Metadata Lake Shore Drive
Metadata will also be a key enabler for businesses building what is referred to as a “data lake”. A data lake is essentially a mass grouping of data sets, regardless of data format, into one enormous logical storage repository. Next generation or 3rd platform big data applications like Hadoop can reference data residing in a data lake by querying against the metadata of the files stored there. By building out an object storage infrastructure, for example, businesses can massively scale out their storage environment across private and/or hybrid clouds, utilizing commodity disk to create the basin for a data lake. The information in the lake can then potentially be monetized, because data analytics systems can access it more easily.
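As a rough illustration of querying a data lake through its metadata, the Python sketch below selects objects without opening any of them. The index layout, field names and object keys are assumptions made up for this example; they do not represent Hadoop’s interfaces or any particular object store.

```python
from typing import Any, Callable, Dict, List

ObjectMetadata = Dict[str, Any]

# Hypothetical metadata index: one record per object in the lake.
# Formats differ, but every object carries the same descriptive fields.
LAKE_INDEX: List[ObjectMetadata] = [
    {"key": "sensors/2014/10/vehicle-001.json", "format": "json",
     "size_bytes": 2048, "tags": {"telemetry"}},
    {"key": "crm/accounts.csv", "format": "csv",
     "size_bytes": 51200, "tags": {"sales"}},
    {"key": "sensors/2014/10/vehicle-002.avro", "format": "avro",
     "size_bytes": 4096, "tags": {"telemetry"}},
]


def find_objects(index: List[ObjectMetadata],
                 predicate: Callable[[ObjectMetadata], bool]) -> List[str]:
    """Return the keys of all objects whose metadata satisfies the predicate."""
    return [meta["key"] for meta in index if predicate(meta)]


# Select every telemetry object, whatever its file format.
telemetry_keys = find_objects(LAKE_INDEX, lambda meta: "telemetry" in meta["tags"])

# Combine several metadata conditions in a single query.
large_csv_keys = find_objects(
    LAKE_INDEX,
    lambda meta: meta["format"] == "csv" and meta["size_bytes"] > 10_000,
)
```

The point of the sketch is that the query touches only the small metadata records, so an analytics job can decide which objects to read before moving any bulk data.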
Vehicles Gone Viral
As an example of how 3rd platform technologies can leverage information in new ways to transform entire industries, David Goulden, President of EMC II (core technology), referred to electric vehicle manufacturer Tesla. By capturing machine sensor data on all their vehicles in real time, Tesla constantly analyzes how their vehicles are performing. This information gives them a huge competitive advantage, as they can make refinements to the software onboard their vehicles to iteratively improve the end-user experience. Compare this to traditional car manufacturers, which can only provide refinements to their vehicles from model year to model year. It is this focus on both IT and the end user that EMC believes will lead the way towards transforming IT and the industries it serves, and metadata management will be the cornerstone of that transformation.
More Choice Through Metadata
EMC’s strategic direction is to be a major player in the metadata management layer of this emerging 3rd platform. To be sure, EMC will continue to focus on driving revenue in the 2nd platform, as the lion’s share of their revenue is still derived from traditional client/server infrastructure. But EMC’s technology investments will increasingly be steered towards enabling their clients to bridge the gap between 2nd platform and 3rd platform infrastructure, as is evidenced by their recent acquisitions of Spanning, Maginatics and Cloudscaling. These acquisitions are also examples of EMC’s often-repeated mantra of giving customers “more choice”. In fact, EMC CEO Joe Tucci stated during the summit that there will be no single dominant player in the 3rd platform market. By focusing on the metadata access layer, EMC believes they will be in a position to transform both their clients’ businesses and their own for success in this brave new world of IT.