Organizations need to rethink their storage architectures for the data-driven economy. How an organization captures, stores, and analyzes data can dictate how successful it is in an economic model where data drives the business.
The data-driven economy is not limited to using artificial intelligence (AI) and deep learning (DL) to make better decisions. While those applications are undoubtedly critical, other key drivers include the availability of 5G, which increases mobility and creates new edge/core relationships, as well as rich media for training and monetization. Use cases for the data-driven economy include autonomous vehicles, connected cities, personalized medicine, customized media and entertainment, new financial models and markets, as well as the “everything-as-a-service” phenomenon.
The data-driven economy requires new storage and data management architectures. These architectures need to move beyond traditional scale-out storage solutions that only provide distributed capacity. The storage architecture for the data-driven economy needs to leverage scale-out to distribute all data management services like metadata storage, data indexing, and search as well as security and analytics processing. Also, the storage infrastructure that supports the data-driven economy won’t reside in a single location or data center. Its distributed nature requires a single global namespace that enables storing and accessing data and applications from anywhere.
Unstructured data from sources like sensors and devices is the foundation of the data-driven economy. Unlike in years past, that data is a mixture of large and small files. The number of files can measure in the billions, and they can consume multiple petabytes of capacity. The storage infrastructure that supports the data-driven economy needs to feed graphics processing unit (GPU) compute farms for deep learning workloads. These use cases require a storage infrastructure that can support massive volumes of unstructured data. The organization, in many cases, can only capture the data once, so that data needs long-term retention and assurance that it is not changed.
The combination of requirements for the storage infrastructure seems at odds with itself. For example, autonomous vehicle programs, as they move beyond level 2 autonomy, rely on development vehicles that generate over two petabytes of data per car per year. These capacity requirements mean that storage infrastructure for the data-driven economy can’t just be an all-flash array. The cost delta between flash and hard disk drives is still too high. More than likely, the data-driven economy’s storage infrastructure will have a scale-out, highly parallel, hard disk-based storage system at its core and high-speed flash at the edge.
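The scale of these numbers is easy to underestimate. A back-of-the-envelope calculation makes the flash-versus-disk economics concrete; the fleet size and per-terabyte costs below are illustrative assumptions, not vendor pricing, while the two-petabyte-per-car figure comes from the text above:

```python
# Back-of-the-envelope sizing for an autonomous-vehicle data lake.
# The 2 PB/car/year figure is from the article; fleet size and
# per-TB media costs are illustrative assumptions only.

PB_PER_CAR_PER_YEAR = 2
FLEET_SIZE = 50                 # assumed number of development vehicles
TB_PER_PB = 1000

total_pb = PB_PER_CAR_PER_YEAR * FLEET_SIZE   # yearly capture, in PB
total_tb = total_pb * TB_PER_PB

FLASH_COST_PER_TB = 100.0       # assumed $/TB for flash
HDD_COST_PER_TB = 20.0          # assumed $/TB for hard disk

flash_cost = total_tb * FLASH_COST_PER_TB
hdd_cost = total_tb * HDD_COST_PER_TB

print(f"Yearly capture: {total_pb} PB")
print(f"All-flash media cost: ${flash_cost:,.0f}")
print(f"All-HDD media cost:   ${hdd_cost:,.0f}")
```

Even with modest assumptions, the media-cost gap at 100 PB per year is measured in millions of dollars, which is why a disk-based core with flash at the edge is the likelier design.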
The storage infrastructure for the data-driven economy looks similar to high-performance computing infrastructures in that it needs to deliver high-bandwidth, highly parallel IO, but it also needs to store petabytes of data cost-effectively.
Introducing SwiftStack 7 – Storage Infrastructure Designed for the Data-Driven Economy
SwiftStack, at its core, has always been a leader at driving throughput performance at scale. As proof, the company often partners with NVIDIA on deep learning projects. Its software-only model enables an organization to build highly scalable, cost-effective storage solutions. Beyond the storage software, SwiftStack also provides 1space, a software capability that allows SwiftStack to move data seamlessly between the on-premises data center, the edge, and the public cloud.
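1space's policy mechanics are product-specific, but the underlying idea of policy-driven placement is straightforward: hot data stays close to the applications, cold data tiers out to the cloud. The sketch below illustrates that idea only; the names, thresholds, and functions are hypothetical and are not SwiftStack's actual API:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical sketch of policy-driven data placement in the spirit of
# 1space: recently used objects stay on premises, cold ones move to the
# cloud. None of these names come from SwiftStack's actual API.

@dataclass
class ObjectInfo:
    name: str
    last_accessed: datetime

def choose_tier(obj: ObjectInfo, now: datetime, cold_after: timedelta) -> str:
    """Return 'cloud' for objects idle longer than cold_after, else 'on-prem'."""
    return "cloud" if now - obj.last_accessed > cold_after else "on-prem"

now = datetime(2020, 1, 1, tzinfo=timezone.utc)
objs = [
    ObjectInfo("training-run-42.tar", now - timedelta(days=400)),
    ObjectInfo("sensor-feed-today.log", now - timedelta(hours=2)),
]
for o in objs:
    print(o.name, "->", choose_tier(o, now, timedelta(days=90)))
```

The value of keeping this logic in the storage layer, rather than in each application, is that every consumer of the namespace sees one consistent view regardless of where an object physically lives.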
In its latest release, SwiftStack 7, the company is focusing on enabling the data-driven economy. The new version further improves performance and now delivers over 100GB/s of bandwidth at scale.
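To put 100GB/s in context, a quick calculation shows how long one full sequential pass over a multi-petabyte training corpus would take at that rate; the corpus size is an illustrative assumption:

```python
# Time for one full sequential pass over a training corpus at 100 GB/s.
# The 100 GB/s figure is from the SwiftStack 7 announcement; the 4 PB
# corpus size is an illustrative assumption.

BANDWIDTH_GB_PER_S = 100
CORPUS_PB = 4
GB_PER_PB = 1_000_000

seconds = CORPUS_PB * GB_PER_PB / BANDWIDTH_GB_PER_S
hours = seconds / 3600
print(f"One pass over {CORPUS_PB} PB at {BANDWIDTH_GB_PER_S} GB/s: {hours:.1f} hours")
```

At that rate, a full epoch over four petabytes fits comfortably inside a working day, which is the kind of throughput needed to keep GPU farms from sitting idle.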
SwiftStack 7 adds distributed file services to support edge, core, and cloud with ProxyFS Edge. Instead of bottlenecking at the storage system with a gateway, SwiftStack deploys its ProxyFS agent in containers directly at the edge, providing almost unlimited scale. ProxyFS Edge also caches data at the edge to minimize latency and improve edge application performance. The solution leverages load-balanced, high-throughput, API-based communication back to the core for consistent, bottleneck-free performance.
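ProxyFS Edge's internals aren't public in detail, but conceptually its edge cache behaves like a read-through cache in front of the core: repeat reads are served locally, and only misses travel back over the wide-area link. A minimal sketch of that pattern, with an in-memory dict standing in for the edge cache and a function standing in for the core fetch:

```python
# Minimal read-through cache sketch illustrating the edge-caching idea:
# serve repeat reads locally, go back to the core only on a miss.
# This is a conceptual illustration, not ProxyFS Edge's implementation.

core_reads = 0

def fetch_from_core(key: str) -> bytes:
    """Stand-in for a high-latency request back to the core cluster."""
    global core_reads
    core_reads += 1
    return f"data-for-{key}".encode()

edge_cache: dict = {}

def read(key: str) -> bytes:
    if key not in edge_cache:          # miss: one round trip to the core
        edge_cache[key] = fetch_from_core(key)
    return edge_cache[key]             # hit: served at the edge

read("frame-0001"); read("frame-0001"); read("frame-0002")
print("core reads:", core_reads)
```

Three reads produce only two trips to the core; the more an edge workload re-reads its working set, the more the round trips back to the core disappear.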
SwiftStack counts on 1space to move data between edge, core, and cloud. Its new 1space file connector enables organizations to easily bring existing data from legacy file systems and NAS devices into the SwiftStack namespace and move that data only when needed. The communication is bidirectional, so modern cloud-native applications can access existing data directly on the NAS without migration.
The SwiftStack Data Platform enables the data-driven economy. Its shared-nothing distributed architecture enables cost-effective, high capacity storage with ultra-scale performance to keep GPU compute complexes busy. The SwiftStack Data Platform allows data ingest and access from edge to core to cloud instead of forcing organizations to install point storage systems at each location and for each use case.
Also crucial in the data-driven economy is data immutability. In deep learning workloads, humans do much of the initial data capture, which makes the data expensive to acquire. The data being captured and analyzed may need extremely long-term retention, and it often can’t be altered. The SwiftStack Data Platform provides both data durability and data immutability to enable secure long-term data retention.
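The source doesn't detail how SwiftStack implements immutability, but the general write-once (WORM) pattern is easy to illustrate: a store that refuses overwrites of an existing object and verifies a content hash on every read. The class below is a conceptual sketch of that pattern, not SwiftStack's retention API:

```python
import hashlib

# Conceptual write-once (WORM) store: each object may be written exactly
# once, and reads verify the stored SHA-256 digest. Illustrative only;
# this is not SwiftStack's actual retention mechanism.

class WormStore:
    def __init__(self):
        self._objects = {}  # name -> (data, sha256 hex digest)

    def put(self, name: str, data: bytes) -> None:
        if name in self._objects:
            raise PermissionError(f"{name} is immutable; overwrite refused")
        self._objects[name] = (data, hashlib.sha256(data).hexdigest())

    def get(self, name: str) -> bytes:
        data, digest = self._objects[name]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"{name} failed integrity check")
        return data

store = WormStore()
store.put("capture-001.bag", b"lidar frames")
try:
    store.put("capture-001.bag", b"tampered")
except PermissionError as e:
    print(e)
```

The key property for expensive, capture-once data is that immutability is enforced by the storage layer itself, so no application bug or misbehaving client can silently corrupt the training corpus.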
SwiftStack is carving out a role for itself in enabling the data-driven economy. Its ability to provide cost-effective, high-bandwidth performance and seamless edge-to-core-to-cloud data movement makes it a compelling infrastructure for IT to consider.