Storage solutions that offer customers familiar, flexible interfaces (e.g., files and file attributes), while also offering advanced features like automated data movement and cloud bursting, provide a unique solution to many problems. The more seamless we can make things for the end user, and the simpler we can make things for the storage administrator, the better off everyone is. We call such solutions a data fabric.
For those unfamiliar with the term data fabric, consider the following explanation from a blog post by Storage Switzerland’s Lead Analyst George Crump. He says a data fabric “sews together data management, data placement, performance optimization, and access management to enable storage resources to be automatically provisioned to requesting users or applications in a self-service manner. This means data can move between storage systems within a data center and/or to the cloud without changing user processes.”
If a data fabric is a good thing, how should it manage the data? Should it offer the full capabilities of a file system? Or should the fabric just move blocks around and not support file system attributes? Let’s take a look.
File systems are the predominant way end users and applications interact with storage. In fact, there are very few applications (e.g., certain databases) that only know how to talk to block storage. For both end users and administrators, file systems are simply much easier to use than block devices. This is why, when given the choice, administrators often choose a file system interface.
Consider the server virtualization market, where virtualization administrators have the choice of using block volumes or file systems for their datastores. They are overwhelmingly choosing file systems as their preferred method. With block-based LUNs, you are stuck with the attributes of the volume as it was provisioned. With file systems you have a lot more flexibility.
While the virtualization market illustrates the preference of many people for file systems over block devices, it does not demonstrate the many advantages of file systems. Block devices simply do not have the ability to store the kinds of metadata that can go with files. Every single file can carry a significant amount of metadata that can indicate a variety of things about that file, including origin, purpose, type of data, and how that data should be treated over its life cycle.
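To make the point concrete, here is a minimal sketch of the per-file metadata any POSIX file system already exposes. The `describe` helper is hypothetical (not from any product discussed here); it simply prints the attributes a block device has no way to carry for an individual piece of data:

```python
import os
import stat
import time

def describe(path):
    """Print the kind of per-file metadata a raw block device cannot carry."""
    st = os.stat(path)
    print(f"size:          {st.st_size} bytes")
    print(f"owner uid/gid: {st.st_uid}/{st.st_gid}")
    print(f"permissions:   {stat.filemode(st.st_mode)}")
    print(f"last accessed: {time.ctime(st.st_atime)}")
    print(f"last modified: {time.ctime(st.st_mtime)}")
```

Ownership, permissions, and access times are only the baseline; many file systems add extended attributes on top, which is exactly the hook an intelligent data fabric can use to record origin, purpose, and life-cycle policy per file.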
File systems also offer an unlimited level of granularity. Files can range in size from a few bytes to many terabytes or even larger. If an application or user needs to create thousands of small files or a handful of huge files, the storage administrator and system administrator need to do nothing other than create the file system. This kind of flexibility simply isn’t possible on a block device without significant work on the part of an administrator.
File-level granularity and metadata can be combined with a number of other services, such as quality of service features, data protection functionality, and movement of data. Files can easily be given different levels of performance, data protection, and different geographic locations based on metadata created by the user or application. Metadata also contains information such as how often a file is being accessed, allowing an intelligent file system to automatically migrate less frequently used files to a less expensive location (e.g., automatically placing read-only file types, like MP4s, on flash devices with declining write endurance).
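A real data fabric applies this kind of policy transparently beneath the file system interface, but the idea can be illustrated with a small, hypothetical sketch: walk a "hot" directory, use each file's last-access time as the migration signal, and move anything untouched for 90 days to a cheaper tier. The threshold and function names are assumptions for illustration only:

```python
import os
import shutil
import time

# Assumed policy for this sketch: files untouched for 90 days are cold.
ARCHIVE_AGE = 90 * 24 * 3600

def tier_candidates(hot_dir, now=None):
    """Yield files whose last-access time makes them migration candidates."""
    now = now or time.time()
    for root, _dirs, files in os.walk(hot_dir):
        for name in files:
            path = os.path.join(root, name)
            if now - os.stat(path).st_atime > ARCHIVE_AGE:
                yield path

def migrate(path, cold_dir):
    """Move one file to the cold tier, keeping its name."""
    dest = os.path.join(cold_dir, os.path.basename(path))
    shutil.move(path, dest)
    return dest
```

The point is not the mechanics but the signal: because the file system already tracks per-file access times, a tiering decision can be made file by file, something a block device cannot offer.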
The one challenge with traditional file systems is that the applications and users that rely on them assume unlimited space, and obviously this is not the case. Historically, file systems reside on volumes that must be resized in order to accommodate the file system’s growth. While they can be resized to a certain extent, there are often limitations such as the number of files and the maximum size of a particular volume. Migrating data from one volume to another in order to reduce cost also creates difficulties, because the application expects the file to be in its original location.
This is where a file system-based data fabric comes in. If a file system is used as the foundation for a full-fledged data fabric, one can get the advantages of a traditional file system without the disadvantages. Volumes can automatically grow as capacity needs dictate, and files can automatically be migrated to other systems that are either less expensive or geographically closer to the application. Such a solution would also not suffer the file count limitations of traditional file systems. Finally, a file system-based data fabric would be able to automatically migrate files across multiple geographic locations, to/from the cloud, and between different underlying storage technologies, while still allowing the application to access the file via its original path.
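The "original path" property is the crucial one. A data fabric implements it inside the storage layer itself, but the idea can be approximated at the shell level with a hypothetical sketch: move the file to a cheaper tier and leave a symlink behind, so the application keeps opening the path it has always known. This is an illustration of the concept, not how any particular product works:

```python
import os
import shutil

def migrate_with_stub(path, cold_dir):
    """Move a file to a cheaper tier, leaving a symlink at the original
    path so applications can keep using the path they already know."""
    dest = os.path.join(cold_dir, os.path.basename(path))
    shutil.move(path, dest)
    os.symlink(dest, path)  # the original path now resolves to the new home
    return dest
```

After migration, opening the original path still returns the file's contents; the relocation is invisible to the application, which is exactly what a data fabric promises at scale.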
The file system is clearly the predominant interface to storage today, and it offers a lot of flexibility in a number of areas. The few limitations that it has can be solved by more modern file systems that blur the lines between multiple data centers and the public cloud, while still maintaining the familiar interface of a file system. A file system that combines QoS features with automatic migration between a variety of locations would be a valuable tool indeed.
Sponsored by Elastifile