Storage consolidation is a lost cause. Why? Because both the storage hardware and the use cases for that hardware vary too widely. I outlined the problems in detail in my blog The Need for Storage Fragmentation. Storage fragmentation is fine as long as IT has the time and tools to analyze data and make sure the right data is on the right type of storage at the right time.
The problem is that the one resource IT lacks most is time. To solve this problem, most data center managers will consider one of three solutions: re-consolidation around flash storage, storage virtualization or data virtualization.
Storage Consolidation – A Flash Fix?
A potential solution to the storage fragmentation problem is for the organization to try, once again, to consolidate storage. This brings to mind the old saying that insanity is doing the same thing over and over and expecting a different result. The "difference" this time is that the consolidated device is all-flash, which "solves" the problem by treating all data as high value forever. For data that is high value, flash is fine. For data that is not, storing it on flash is a waste of budget. That makes storage consolidation on flash too expensive for most organizations.
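The budget argument can be made concrete with a back-of-envelope calculation. All the figures below (the per-GB prices, the total capacity, and the 20% "hot data" ratio) are illustrative assumptions, not quoted vendor pricing:

```python
# Back-of-envelope illustration of the all-flash cost argument.
# All prices and the active-data ratio are illustrative assumptions.

capacity_tb = 500                         # total enterprise capacity (assumed)
flash_per_gb, disk_per_gb = 0.50, 0.05    # assumed $/GB figures
active_fraction = 0.20                    # assume 20% of data is "hot"

# Option 1: consolidate everything on all-flash
all_flash = capacity_tb * 1000 * flash_per_gb

# Option 2: tier hot data to flash, cold data to disk
tiered = capacity_tb * 1000 * (active_fraction * flash_per_gb
                               + (1 - active_fraction) * disk_per_gb)

print(f"all-flash: ${all_flash:,.0f}")   # flash for everything
print(f"tiered:    ${tiered:,.0f}")      # flash only where it earns its keep
```

Even with generous assumptions, paying flash prices for the cold 80% of the data dominates the bill, which is the waste the article describes.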
Storage Virtualization – Answering the Wrong Question
Storage virtualization attempts to solve the problem by consolidating various storage hardware systems under a single storage software platform. In doing so, it replaces the native features of each array, features the IT team may actually like and may have already integrated into their processes.
In addition, the entire storage system has to be assigned to the storage virtualization engine, often as a block device. If your application calls for NAS or object storage, you are out of luck. Finally, storage virtualization is typically delivered as an in-line appliance, meaning it sits in the path of every I/O. In the hard disk era, the extra latency such an appliance added was probably negligible. In the flash era, it can be a significant problem. The in-line design also rules out the use of server-side storage resources like PCIe flash or memory bus flash.
In many cases, storage virtualization answers a question no one is asking. The problem, again, is not the underlying storage hardware's software; it is making sure data is in the right place at the right time, preferably automatically. While some storage virtualization solutions will automatically move data from a flash storage system to a hard disk based system, that movement is not granular: the entire volume has to move, not just the specific subset of data.
Data Virtualization – A More Granular Approach
Data virtualization takes a more granular approach to moving data, and it is an out-of-band solution. It works much like a DNS server: applications ask the data virtualization engine where a certain data set is, and the engine points the application to that data. From that point forward, communication flows directly between the application and the storage.
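The DNS-like pattern can be sketched in a few lines. The class and method names below are hypothetical, invented for illustration, not the API of any actual data virtualization product:

```python
# Sketch of the out-of-band, DNS-like lookup pattern described above.
# DataVirtualizationEngine and its methods are hypothetical names.

class DataVirtualizationEngine:
    """Out-of-band catalog mapping data set names to storage locations."""

    def __init__(self):
        # Catalog entries: data set name -> (storage tier, location URI)
        self._catalog = {}

    def register(self, dataset, tier, uri):
        """Record where a data set currently lives."""
        self._catalog[dataset] = (tier, uri)

    def resolve(self, dataset):
        """Like a DNS lookup: return the location, then get out of the way."""
        return self._catalog[dataset]


engine = DataVirtualizationEngine()
engine.register("q3-financials", "all-flash", "nfs://flash-array/vol1/q3")

tier, uri = engine.resolve("q3-financials")
# From here on, the application reads and writes `uri` directly;
# the engine is never in the I/O path, so it adds no per-I/O latency.
```

The key design point is that the engine is consulted once per lookup, not once per I/O, which is what distinguishes it from the in-line storage virtualization appliance described earlier.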
Because it is out of band, data virtualization can also be more agnostic about location and storage type. It can point to data residing on a server, on a storage network or in the cloud. A key requirement for data virtualization is analysis and automation: these solutions should be able to analyze data and the available hardware, and automatically determine which storage system is best for which data set. This lays the framework for managing data by objectives instead of by which user is screaming the loudest.
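"Managing data by objectives" can be pictured as a simple placement policy. The tier names, thresholds, and inputs below are illustrative assumptions; a real product would use far richer telemetry:

```python
# Hypothetical sketch of objective-driven data placement.
# Tier names and thresholds are illustrative assumptions, not vendor specifics.

from datetime import datetime, timedelta

def choose_tier(last_accessed, reads_per_day, latency_target_ms):
    """Pick a storage tier from simple, configurable objectives."""
    age = datetime.now() - last_accessed
    if latency_target_ms <= 1 or reads_per_day > 1000:
        return "all-flash"       # hot, latency-sensitive data
    if age < timedelta(days=90):
        return "hybrid-disk"     # warm data on cheaper spinning disk
    return "cloud-object"        # cold data on the cheapest capacity tier

# A busy, latency-sensitive data set lands on flash...
print(choose_tier(datetime.now(), 5000, 10))                      # all-flash
# ...while an untouched archive drifts to cheap cloud capacity.
print(choose_tier(datetime.now() - timedelta(days=200), 1, 50))   # cloud-object
```

The point of the sketch is that placement is decided by stated objectives (latency targets, access frequency, age) rather than by whichever user escalates loudest.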
Storage fragmentation, at least from a hardware perspective, can be good for the enterprise, and in fact "fragmentation" accurately describes the current storage situation at most enterprises. The key to making a fragmented ecosystem work efficiently is ensuring that data can move seamlessly between the various storage systems, breaking down silos and allowing the characteristics (performance, capacity, price) of each system to be maximized. This is where data virtualization is emerging as an exciting new possibility.