Where and how a data management solution engages with data affects how much value the solution can deliver. The problem is that most data management solutions are passive; they wait until data reaches a certain state before taking action. Additionally, their execution is based on a sequential crawl of the file system. Network integrated data management is real-time. It can act on data as it crosses the network, putting it in the right storage location the first time instead of waiting.
The Types of Data Management
Most data management solutions promise to move inactive data, transparently, to another more cost-effective storage platform. The goal is to drive down the cost of storage without impacting user productivity. Other solutions analyze data to determine if it is active (hot) and move it to a faster, flash-based system. Finally, some solutions create a global file system so multiple storage systems act essentially as one, and data can flow between them.
Why Don’t Data Centers Manage Data?
Data management solutions have been around for a long time, but adoption has been slow. There are many reasons for the slow adoption. First, many organizations simply didn’t have enough data to justify the investment. Now most organizations are bursting at the seams.
Second, the data management targets are more varied. In the “old days,” the target was tape. While inexpensive, it was hard to query and index. Even as hard drive-based archives became available, there was not enough price delta between the two storage types to show significant savings. Now, though, there are plenty of very cost-effective private object storage solutions as well as public cloud solutions.
Third, the software that does the actual data management left much to be desired. It required changes to the applications or user profiles, making initial implementation very difficult.
To make the experience seamless for the user, the data management solution typically replaces the moved file with a stub file that points to the new location. Additionally, to determine which data qualifies for movement, the solution has to “walk” the file system to find files that match its criteria.
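To make the cost of that file walk concrete, here is a minimal Python sketch of a passive policy scan. It is an illustration only, not any vendor's implementation: the function name `find_cold_files` and the idle-days policy are assumptions made for the example. Note that every file in the tree must be visited and stat'ed on each pass, whether or not it qualifies.

```python
import os
import time

SECONDS_PER_DAY = 86400


def find_cold_files(root, max_idle_days, now=None):
    """Walk `root` and return files not accessed within `max_idle_days`.

    Hypothetical policy for illustration: "cold" means the last-access
    time is older than the cutoff.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    cold = []
    # os.walk visits every directory under root, one after another.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_atime < cutoff:
                cold.append(path)
    return sorted(cold)
```

The work grows with the total number of files rather than with the number of files that actually changed, which is why this kind of scan becomes slow on large file systems.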
Introducing Network Integrated Data Management
Infinite io is unique in terms of data management because of where it lives. Most solutions run on a stand-alone server and passively manage the data by crawling the file system to find data that matches various policies. Infinite io’s solution is essentially a layer 7 switch that sits inline, between the storage systems and the users/applications that are accessing that storage. After an initial scan, all actions are real-time, at the network, before the data reaches storage. No more file walks to determine where data should be.
The Infinite io controller maintains all the metadata, up to three billion objects per node. That gives Infinite io the ability to move data between storage systems without leaving stub files. It also means it can accelerate performance by terminating all metadata-only requests in the controller. Finally, there is no need to change the application or user IO path. Users and applications continue to access the data as they always have.
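The idea of holding the metadata inline can be sketched as a simple catalog that maps each client-visible path to its attributes and current storage location. This is a conceptual Python sketch under stated assumptions, not Infinite io's actual design: the class names, fields, and backend labels such as "primary-nas" and "object-store" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileMeta:
    size: int
    mtime: float
    location: str  # hypothetical backend label, e.g. "primary-nas"


class MetadataCatalog:
    """Hypothetical in-memory map from client-visible path to metadata."""

    def __init__(self):
        self._entries = {}

    def record(self, path, size, mtime, location):
        # Called when a file is first seen (e.g. during the initial scan).
        self._entries[path] = FileMeta(size, mtime, location)

    def stat(self, path):
        # Metadata-only request: answered from the catalog, no storage I/O.
        meta = self._entries[path]
        return {"size": meta.size, "mtime": meta.mtime}

    def locate(self, path):
        # Where a data request for this path should be forwarded.
        return self._entries[path].location

    def migrate(self, path, new_location):
        # After the data is copied, only the catalog entry changes.
        # The client-visible path stays the same and no stub is written.
        self._entries[path].location = new_location
```

Because clients only ever see the path, repointing the entry in `migrate` is enough to redirect later requests, which is the role a stub file plays in a traditional solution.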
The solution is sold as a single unit or as a cluster. In the single-unit use case, if the controller fails for some reason, a relay in the controller trips and the unit simply becomes a wire. Users do have to manually path to their data, though. The cluster configuration not only minimizes the chance of failure but also increases the number of files and devices the solution can support.
StorageSwiss Take
Data management is a must-have for the data center of the future. IT can no longer afford to keep all of its data on primary storage, but finding a solution that provides cost-effective data management across a variety of platforms is a big challenge. At the same time, many organizations are trying to create an active archive that is more responsive than archives of the past. Infinite io provides a real-time answer that doesn’t require stub files and supports both private object storage and public cloud storage.
