There are four storage trends driving data center modernization. These trends are enabling storage architectures that are both dense and massively scalable, while also providing rapid response to applications and analytics. At the same time, organizations are global and demand that data be available anywhere, at all times. Meeting these demands requires a data fabric to support the storage architecture.
The Trends Shaping Storage
There are four major drivers for data center modernization. The first is the software-defined data center. Aided by modern CPUs, software is able to decouple itself from proprietary hardware. The data center can then safely leverage standard servers. Software vendors, freed from the need to develop hardware, should be more innovative in their solutions.
The second trend driving modernization is operating at scale. Storage clusters are increasingly able to fulfill the promise of scaling to any size the organization needs, and thanks to software-defined solutions, the organization can afford to scale. The challenge is that as clusters scale to higher node counts, storage software needs to be more efficient at inter-node communication and at managing massive metadata indexes.
The third trend driving data center modernization is flash. Since the introduction of enterprise-quality flash modules, storage has gone from “worst to first”: storage media is now the fastest component of the architecture. Flash’s density allows for even greater scale per node, and its near-zero latency enables rapid answers to queries across large data sets.
The fourth trend is the cloud. But modernization is not just using the cloud for cloud’s sake. It is the organization’s understanding of the cloud as a tool and how to best leverage it. Modern data centers are looking to use the cloud as a more temporary workspace to handle peak loads, perform test/dev, and act as a bridge between multiple on-premises locations. The on-premises data center is for permanent use cases like data storage and production applications.
The challenge is unifying these trends so the organization can leverage data as a competitive advantage. Today, organizations have to stitch together the various types of data and locations before deriving any real benefit from them. The patchwork nature of data increases expenses, leaves actionable data out of consideration and slows the organization’s responsiveness.
Introducing MapR-XD
To help solve the patchwork data problem, MapR is introducing the MapR-XD Cloud-scale Data Store. For those already well down the path with big data analysis projects, MapR may be a household name. The company used that market as its initial beachhead while developing a more universally applicable, scalable data storage architecture.
Now MapR is introducing MapR-XD, a storage architecture that serves a much broader set of use cases while leveraging the company’s expertise in the analytics market. MapR-XD is designed to create a highly scalable storage fabric that unifies data across the organization’s data centers, edge locations and multiple clouds.
It is designed to support legacy operational applications, media and entertainment applications, image processing, analytics processing and container-based apps. It has the capabilities enterprises will expect, like multi-tier data storage, scale-out growth (trillions of files, exabytes of capacity), point-in-time snapshots, distributed mirroring and quality-of-service assurances for specific applications.
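In MapR clusters, data services such as snapshots are typically managed at the volume level through the `maprcli` tool. The volume and snapshot names below are hypothetical; a minimal sketch, assuming a running cluster and a volume named `projects`:

```shell
# Take a point-in-time snapshot of a volume (volume and snapshot names are hypothetical).
maprcli volume snapshot create -volume projects -snapshotname projects-snap-1

# List the snapshots that exist for that volume.
maprcli volume snapshot list -volume projects
```

Because snapshots are per-volume, an administrator can apply different protection and quality-of-service policies to different applications sharing the same cluster.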
In addition to covering the basics, MapR-XD eliminates data silos by establishing a global namespace that spans core, edge and cloud data placement. Data can move seamlessly between those locations while its reference point remains unchanged for users and applications. Users can access data via NFS, HDFS, and a special POSIX interface that provides 10X the performance of traditional NFS.
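To illustrate the multi-protocol access, once the cluster’s namespace is mounted over NFS, ordinary POSIX tools and Hadoop applications can reach the same file. The node name, cluster name and paths below are hypothetical; a sketch assuming a MapR cluster exporting its namespace over NFS:

```shell
# Mount the cluster's global namespace over NFS (node name and mount point are hypothetical).
sudo mount -t nfs mapr-node1:/mapr /mapr

# Ordinary POSIX tools now work against the data fabric.
echo "sensor reading" > /mapr/my.cluster.com/data/reading.txt
cat /mapr/my.cluster.com/data/reading.txt

# The same file is visible to Hadoop applications through the HDFS API.
hadoop fs -cat /data/reading.txt
```

The point of the global namespace is that the path stays stable: whether the bytes currently sit on-premises, at the edge or in a cloud, clients keep addressing the same location.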
While scale-out architectures are not unique, most node counts are artificially limited by poor software design. As node count grows, the east-west communication between nodes, as well as massive metadata management, starts to cripple performance. MapR-XD is flash-optimized to get the most out of the core components of the cluster. The cluster runs multiple instances of the NFS file server, each as a single process on a single node. Running multiple instances maximizes the performance of the underlying storage system, so the full benefit of configuring multiple NFS instances is realized on an all-flash platform.
Additionally, MapR-XD provides automated capabilities to enhance node interconnect and flash performance, such as logical partitioning, parallel processing for disparate workloads, and bottleneck avoidance with I/O shaping and optimizations, to ensure maximum performance across the cluster.
StorageSwiss Take
Data fabric solutions are still relatively rare, but they are becoming more common. What differentiates MapR-XD is its proven track record in the big data analytics market, a market that demands scale and performance. That experience shows through in MapR-XD, a data fabric solution that provides the critical cloud-scale data store features the enterprise expects, while continuing to enable the move to a modern data center that is data neutral and location independent.