Fixing the Hybrid Cloud Storage Problem requires an understanding of how organizations intend to use hybrid cloud storage. First, most organizations continue to use on-premises storage to support their applications and users. Second, these organizations’ data centers contain multiple storage systems, and more than likely they want to continue using those systems. They’d like the option, however, to stop upgrading these on-premises storage systems and allocating additional data center floor space to them.
Hybrid cloud storage builds on the premise that, for most organizations, users and applications no longer access more than 80% of data once it is 90 days old. A simple plan is to move that inactive data to the cloud. The problem is identifying the data that belongs in that group, transferring it, and making sure it is easily recallable. This identify, move and recall process is what breaks most hybrid cloud storage solutions, as we outlined in our last blog.
At the heart of the problem is metadata. Metadata is the data about the data. It includes information like path, creation date, modification date, and owner. Metadata requests account for well over half of all IO, and as much as 90%. Typically, metadata stays with the file, so if a solution migrates the data to the cloud, the metadata goes with it, and even a simple directory listing can take long enough that users notice the performance degradation. The problem is that most solutions, if they address metadata at all, do so as an afterthought.
InfiniteIO – Fixing Hybrid Cloud with a Metadata First Strategy
Instead of running as an agent on the server or as a server appliance, InfiniteIO runs as an inline component within the network. It uses deep packet inspection to capture all metadata about all the files and stores a copy of metadata locally. There is sometimes a concern about inline devices impacting performance, but InfiniteIO improves performance because it stores metadata within its RAM. Performance is enhanced because InfiniteIO resolves all metadata requests within the network so they don’t need to travel completely across the network to their intended target. Resolving metadata requests within the network is especially valuable for Hybrid Cloud Storage use cases because metadata requests don’t have to go across an internet connection for resolution.
Because InfiniteIO installs like a network switch, it works with almost any file system on nearly every vendor’s storage without making changes to existing users, applications or storage mount points. It also only needs to “crawl” the environment once; all future updates are captured automatically by InfiniteIO as the changes traverse the network.
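To make the idea of resolving metadata requests from RAM concrete, here is a minimal sketch of an in-memory metadata cache. It is an illustration only and not InfiniteIO's implementation: the names FileMeta, MetadataCache, stat and observe_update are hypothetical, and the "storage" is just a callable standing in for the slow round trip to the real target.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record: the attributes the article lists."""
    path: str
    size: int
    mtime: float
    owner: str


class MetadataCache:
    """Answers metadata lookups from memory; only a miss reaches storage."""

    def __init__(self, fetch_from_storage: Callable[[str], FileMeta]):
        # fetch_from_storage is an assumed stand-in for the slow path
        # (a NAS or a cloud target on the far side of the network).
        self._fetch = fetch_from_storage
        self._entries: Dict[str, FileMeta] = {}
        self.hits = 0
        self.misses = 0

    def stat(self, path: str) -> FileMeta:
        meta = self._entries.get(path)
        if meta is not None:
            self.hits += 1          # resolved locally, no trip to storage
            return meta
        self.misses += 1
        meta = self._fetch(path)    # slow path, taken once per file
        self._entries[path] = meta
        return meta

    def observe_update(self, meta: FileMeta) -> None:
        # Models a change seen "on the wire": the cached copy is refreshed
        # without re-crawling the file system.
        self._entries[meta.path] = meta
```

In this sketch, the first stat call for a path pays the cost of going to storage and every later call is served from the dictionary. The observe_update method plays the role of the automatic updates described above, so the cache stays current after the one-time crawl.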
With metadata captured, automatically updated and centrally located, managing metadata is simpler. IT can set policies based on a variety of attributes including file type and last access date to manage data. For example, the organization can have a plan that moves all data not accessed during the previous 90 days to cloud storage, an on-premises object store or a high-capacity NAS. InfiniteIO can move data multiple times. For example, it can archive data not accessed in the last 90 days to a high-capacity, low-cost NAS and then after 180 days of continued no access, the solution can permanently archive it to the cloud.
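The two-stage policy in this example can be sketched in a few lines. This is a hypothetical illustration, not InfiniteIO's policy engine: the tier names and the choose_tier function are invented for the example, and it assumes both thresholds are measured from the file's last access time (so "180 days" means 180 days since last access in total, which the text leaves open).

```python
from datetime import datetime, timedelta

# Thresholds taken from the example in the text; the tier names are assumptions.
ARCHIVE_NAS_AFTER = timedelta(days=90)
CLOUD_AFTER = timedelta(days=180)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Return the tier a file belongs on, based only on how long it has been idle."""
    idle = now - last_access
    if idle >= CLOUD_AFTER:
        return "cloud"          # long-term archive
    if idle >= ARCHIVE_NAS_AFTER:
        return "archive_nas"    # high-capacity, low-cost NAS
    return "primary"            # still active, leave it where it is


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    for days_idle in (10, 120, 400):
        print(days_idle, choose_tier(now - timedelta(days=days_idle), now))
```

Because the decision depends only on metadata (the last access time), a policy like this can be evaluated against the centrally stored metadata without touching the files themselves.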
A metadata-first philosophy also makes data recalls seamless. One of IT’s hesitations toward aggressive use of Hybrid Cloud Storage is dealing with a recall. Typically, products either make recall a manual process or rely on stub files or symbolic links, which are fragile. InfiniteIO records the file’s location along with the metadata it stores. If InfiniteIO moves a file, it updates the associated metadata. To the user, the file “looks” to be precisely where it was before the move, but no stub files are required.
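The difference from stub files can be illustrated with a small location map: the path users see never changes, and only the recorded physical location does. This is a sketch under stated assumptions, not the product's actual data structure. LocationMap, its methods and the location strings are all hypothetical.

```python
from typing import Dict


class LocationMap:
    """Maps the path users see to where the file's data currently lives."""

    def __init__(self) -> None:
        self._where: Dict[str, str] = {}

    def register(self, logical_path: str, location: str) -> None:
        self._where[logical_path] = location

    def move(self, logical_path: str, new_location: str) -> None:
        # A migration only rewrites the recorded location. Nothing is left
        # behind on the original file system, so there is no stub to break.
        if logical_path not in self._where:
            raise KeyError(f"unknown file: {logical_path}")
        self._where[logical_path] = new_location

    def resolve(self, logical_path: str) -> str:
        # Called on access: tells the caller where to fetch the data from.
        return self._where[logical_path]


if __name__ == "__main__":
    locations = LocationMap()
    locations.register("/projects/q1/report.docx", "nas01:/vol3")
    locations.move("/projects/q1/report.docx", "cloud:archive-bucket")
    # The user still opens the same path; only the lookup result changed.
    print(locations.resolve("/projects/q1/report.docx"))
```

A recall in this model is just another lookup followed by a fetch from whatever location is recorded, which is why the user does not need to know that the file ever moved.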
Conclusion
Hybrid Cloud Storage and on-premises object storage promise reduced cost and almost infinite scalability. Organizations struggle to identify and move data to these platforms, so they cannot fully utilize them, and users struggle to access old data once it is there. Both Hybrid Cloud Storage and on-premises object storage also suffer from metadata performance issues. These capacity utilization and performance issues may limit user acceptance of a hybrid cloud storage strategy. InfiniteIO addresses these challenges by capturing metadata in the network and automatically identifying and migrating data based on policy. The solution also makes recalling an old file seamless.