Our most-read article recently, by far, is “What is Data Profiling”. My colleagues Eric Slack and Colm Keegan also just hosted a well-attended webinar, “How To Attain Sustainable Storage Savings”, now available on demand. At the core of both the article and the webinar is the subject of data management. All of a sudden, data management is cool again.
What is Data Management?
Data management had become a bit of a forgotten skill in data centers. The practice operates under the assumption that all data has value, but that the value varies over time. Different types of storage are available to hold this data, and they vary in cost based on response time and capacity. The goal of data management is to match the present value of the data with the cost of the storage tier it resides on. Since the value of data changes, where it is stored should change too.
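To make the matching idea concrete, here is a minimal Python sketch. It assumes, purely for illustration, that "present value" can be approximated by how recently a data set was accessed. The tier names, the age thresholds, and the `DataSet`, `target_tier` and `plan_moves` names are all hypothetical; a real data management process would weigh many more factors.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical tiers, ordered from fastest/most expensive to slowest/cheapest.
# The age thresholds (in days) are illustrative assumptions, not recommendations.
TIERS = [
    ("flash", 30),   # accessed within the last 30 days
    ("disk", 365),   # accessed within the last year
]
ARCHIVE_TIER = "tape"  # anything older than the last threshold


@dataclass
class DataSet:
    name: str
    last_accessed: date
    current_tier: str


def target_tier(ds: DataSet, today: date) -> str:
    """Pick the cheapest tier whose age threshold still covers this data set."""
    age_days = (today - ds.last_accessed).days
    for tier, max_age_days in TIERS:
        if age_days <= max_age_days:
            return tier
    return ARCHIVE_TIER


def plan_moves(datasets: list[DataSet], today: date) -> list[tuple[str, str, str]]:
    """Return (name, from_tier, to_tier) for every data set on the wrong tier."""
    moves = []
    for ds in datasets:
        wanted = target_tier(ds, today)
        if wanted != ds.current_tier:
            moves.append((ds.name, ds.current_tier, wanted))
    return moves
```

Running `plan_moves` on a schedule, rather than once, reflects the point that data should move as its value changes.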
The payback from investing in a data management process is reduced storage and data protection costs. The downside is the risk of a mismatch: data that has been moved to a slower tier can unexpectedly become active again, and the requester feels the impact, usually in the form of a “wait”, which leads to user frustration.
In the data centers of the past, data management was a required skill because hard disk storage was expensive and its capacity limited. But as the cost of hard drives fell and technologies like scale-out storage came to market, the need to migrate data to less expensive tiers became less pressing. While always more expensive than tape, disk systems narrowed the price gap enough that the savings were no longer worth the user frustration described above. Instead of managing data, users added disk shelves to existing systems, which they viewed as a simpler solution to the problem.
Why is Data Management Cool Again?
Two key changes in the data center are making data management cool again. First, primary storage has once again become expensive. Data centers now face a demand more pressing than capacity: performance. In response, IT planners are implementing a variety of flash-assisted and all-flash storage systems. Despite various technological advances that drive down the cost of flash, these systems remain more expensive than disk on a cost-per-GB basis. Some data sets have performance demands that justify the expense of flash; others do not, and those still need to be on disk. Making those determinations is part of data management.
The second change making data management cool again is the rate at which data storage demands continue to grow. Organizations are storing more data than ever, and that rate is accelerating thanks to the devices, sensors and cameras that make up the Internet of Things. In other words, the issue is not simply that data is growing; it is how fast it is growing and what is causing that growth.
Scale-out NAS and object storage certainly provide the technological capability to store all this data on disk for a long time. The problem is that the ongoing cost of doing so is becoming daunting. IT is learning that the challenge is not the cost of the disk itself, but the cost of keeping thousands of disks active and available for the next several decades. The cost to power, cool and commit data center floor space to this data, especially at its growth rate, is overwhelming.
As a result, tape storage is once again becoming a popular option, and when it is augmented with the right amount of disk, the two technologies can be a perfect complement for each other. But again, knowing what data to place where and, maybe more importantly, finding that data when it is needed are key challenges. This, again, is the role of data management.
These two changes are part of the reason why data management is cool again. But to become truly cool, data management can’t be seen as a never-ending chore, the IT equivalent of organizing your sock drawer. It also has to evolve into a process instead of an event. A process allows for automation and improvements in ease of use. In my next column, I’ll discuss the technologies we are seeing that allow data management to become a cool process instead of a Herculean event.