In a recent article, Storage Switzerland introduced the concept of a data fabric. It is essentially a storage architecture that spans a variety of locations ranging from on-premises to the cloud. The goal is to create a data flow, where data moves to the right location at the right time. IT is struggling to find a storage solution that can service legacy applications, be the foundation for modernized IT, and enable an eventual (or occasional) move to the cloud. As IT evaluates potential solutions, on-premises use cases represent the first, and most critical, workflows requiring support.
Legacy Applications Matter
While modernized applications get all the headlines, most (if not all) enterprises still run legacy applications that are critical to the day-to-day operations of the business. These applications need a high-performance, scalable storage infrastructure that supports standard protocols like NFS. Organizations would love to leverage cloud economics and scalability as they refresh the storage architectures these applications run on. The problem is that most modern storage designs do not provide both backward compatibility and high performance. As a result, IT treats the legacy application stack separately from the modernized stack, leading toward what many call bimodal IT.
If organizations look at the storage architecture instead of the individual storage system, they can indeed find solutions that provide performance as well as backward protocol compatibility, all while leveraging cloud economics and scale. An all-flash data fabric, running on standard server nodes, can deliver millions of IOPS at a very affordable price point.
Containers Are Coming
IT is under pressure to become more agile. Part of this agility is the ability to respond very rapidly to new requests for application deployment or performance. Many organizations solve this by starting additional instances of an application as user counts or I/O requests increase. The problem is how to start those additional instances rapidly. Virtualization was the initial answer to agility, but virtual machines recreate, virtually, the entire physical server, which means each one takes time to create and deploy. Deploying hundreds of them, let alone thousands, takes far longer.
Cloud providers faced this challenge early on, and embraced container technology for their applications. Containers are much “thinner” than virtual machines, and a given server can hold many more containers than it could VMs.
The problem for enterprises, as they modernize their infrastructure and leverage containers, is that many applications need persistent data storage, a concept that was foreign to early containerization technologies. To achieve some form of statefulness, the data related to a container had to be copied, repeatedly, as each container came online. In other words, containers scale instantly; data does not. Recently, however, containerization technologies have evolved to support statefulness via externally connected storage. Now the question is simply which storage to choose.
A key component of a data fabric is its scalable multi-node architecture and its data sharing capabilities, which make it an ideal match for persistent container storage. A true data fabric lets containers share data across nodes, enabling seamless container migration or restart without data loss. And by providing a single, global namespace, a data fabric allows containers to share data with each other even when they run on different nodes. This data-level flexibility is a natural complement to the container-level flexibility delivered by container orchestration platforms like Kubernetes or Docker Swarm.
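In Kubernetes terms, "externally connected storage" typically surfaces as a PersistentVolumeClaim that pods mount. The manifest below is a minimal sketch, not an Elastifile-specific configuration: the storage class name (`data-fabric-nfs`), claim name, and image are hypothetical placeholders, and it assumes the cluster has a storage class backed by shared, NFS-style storage that supports the `ReadWriteMany` access mode.

```yaml
# Hypothetical example: names and storage class are illustrative only.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany              # pods on different nodes share the same data
  storageClassName: data-fabric-nfs  # assumed class backed by shared NFS storage
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: example/app:latest  # placeholder image
      volumeMounts:
        - mountPath: /data       # container sees the shared namespace here
          name: data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: shared-data
```

Because the claim uses `ReadWriteMany`, a replacement pod scheduled on another node can mount the same volume and pick up where the failed container left off, which is the migration/restart scenario described above.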
Parallel Computing Paradox
An increasing number of enterprises also now need to process millions of very small files created by machine logs or data collected from sensors. While this data is small in size, it is vast in number. Its discreteness makes it ideal for massively parallel, high-performance processing, but the storage system holding this data has to keep pace. Thanks to big data analytics, the storage architecture can no longer be just a cheap, deep storage system. It has to respond to potentially hundreds of compute nodes scraping it for data.
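The access pattern above can be sketched in a few lines of Python. This is an illustrative stand-in, not an analytics pipeline: it creates a directory of tiny "sensor" files and reads them with many concurrent workers. With millions of small files, the per-file open/read/close overhead and metadata operations dominate, so the storage system must sustain a high rate of small, parallel requests rather than raw streaming bandwidth.

```python
# Sketch of the small-file, massively parallel access pattern (file names
# and sizes are hypothetical stand-ins for sensor/log data).
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def process(path):
    # Each record is tiny; open/read/close overhead dominates the cost.
    with open(path, "rb") as f:
        return len(f.read())

# Stand-in for a directory of machine-generated files.
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(1000):
    p = os.path.join(tmpdir, f"sensor-{i}.log")
    with open(p, "wb") as f:
        f.write(b"x" * 64)  # one 64-byte record per file
    paths.append(p)

# Many parallel readers issue small I/Os against the storage at once.
with ThreadPoolExecutor(max_workers=100) as pool:
    total = sum(pool.map(process, paths))

print(total)  # prints 64000 (1000 files x 64 bytes)
```

Scaling this from one process with 100 threads to hundreds of compute nodes is what turns "cheap and deep" storage into a bottleneck: aggregate throughput is governed by how many small, concurrent requests the storage can service, which is the case the article makes for an all-flash data fabric.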
Once again, an all-flash data fabric can deliver not only the performance these analytics environments require, but cost-effective scale as well. These environments also need a mix of protocols: the devices sending data often require legacy protocols like NFS, while in some cases the analyzing applications require more modern RESTful APIs.
The enterprise is in a state of transition as it moves from legacy scale-up applications to more modern scale-out infrastructures. Bimodal IT is often drawn with a hard line between these two states. A data fabric creates a single storage architecture that makes the line between legacy and modern IT more permeable, enabling modernization to flow through the line instead of having to jump over it.
Sponsored by Elastifile
Elastifile is redefining the way data is stored and managed, enabling seamless deployment of private and hybrid cloud solutions. With enterprises and service providers increasingly seeking to support both on-premises and cloud workflows, Elastifile delivers a cross-cloud data fabric that unifies the data across these environments within the single global namespace of a global file/object system. The easy-to-manage, easy-to-deploy, elastic, scale-out architecture intelligently shares the resources across all environments, providing the optimal solution for enterprise data and related application services. For more information visit www.elastifile.com.