Designing Storage for MongoDB, Spark, MySQL, and Cassandra – Pavilion Data Briefing Note

Modern applications like MongoDB, Spark, MySQL, and Cassandra are disrupting traditional storage architectures. To keep storage costs down and performance high, these applications use direct-attached PCIe flash storage. Many organizations are rapidly integrating NVMe. The problem is that direct-attached storage in modern architectures suffers from the same resource inefficiencies that direct-attached storage has had for years. Reports suggest that modern application clusters use less than 25% of the available flash capacity. The waste is more costly in modern application clusters because the primary media (NVMe flash) sells at premium prices.
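To make the cost of low utilization concrete, here is a minimal sketch of the arithmetic. The 25% figure is the utilization ceiling cited above; the price per terabyte is a placeholder chosen only for illustration, and the function names are hypothetical.

```python
def effective_cost_per_used_tb(price_per_raw_tb: float, utilization: float) -> float:
    """Return the cost of each terabyte that actually holds data.

    price_per_raw_tb: purchase price per raw terabyte (placeholder value).
    utilization: fraction of purchased capacity in use, in the range (0, 1].
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return price_per_raw_tb / utilization


def overspend_multiplier(utilization: float) -> float:
    """Return how many raw terabytes are bought per terabyte actually used."""
    return effective_cost_per_used_tb(1.0, utilization)


if __name__ == "__main__":
    # 0.25 is the utilization cited in the text; 400.0 is an assumed price.
    print(overspend_multiplier(0.25))
    print(effective_cost_per_used_tb(400.0, 0.25))
```

At 25% utilization, every terabyte in use is paid for four times over, whatever the actual price of the flash.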

The Modern Application Direct Attached Nightmare

The direct-attached problem in modern applications is about far more than just cost. These direct-attached architectures create complex silos of clusters, which force administrators to manage storage for each application separately, at both the node and cluster level.

The pace of compute node upgrades is faster than in more traditional server use cases. IT architects are quick to replace nodes as soon as new CPUs become available, enabling more computing in the same space. The lack of a shared solution also makes it time-consuming to upgrade application cluster nodes. Administrators must either open each node and pull out its flash media for reuse or waste money by disposing of the media along with the node.

Why Shared Storage Falls Short

Shared storage solves most of the direct-attached storage problems that modern architectures face, but it presents problems of its own. First, a shared storage system introduces latency in the form of network connectivity, which also limits bandwidth. While it is possible to upgrade network architectures, the cost to do so is high.

Second, even with advances in networking and drives, shared storage systems introduce latency from their own operating environment. The overhead associated with data management takes a significant share of the raw performance. The gap between raw and actual performance becomes more evident as technologies like NVMe flash drives and NVMe networking come to market.

Pavilion Data – Re-Thinking Storage for Modern Applications

Pavilion Data, founded in 2014, has recently entered the market with a storage solution designed to solve both sides of the modern application storage architecture problem. It is an all NVMe-oF storage array explicitly designed for modern applications. It is a shared storage system that delivers 120GBps of bandwidth with 100 microseconds of latency. The solution is agentless and requires no host presence. It can provide from 14TB to 1PB of NVMe flash in a 4U appliance. All the components are hot-pluggable and field-upgradable.

Since the unit is shareable, it delivers better storage utilization to modern applications and can be shared across application clusters. A single solution can support the organization’s full variety of applications. It also frees node-level CPU cycles from managing storage, allowing the application itself to run faster.

Architecturally, the unit looks more like a director-class switch than a storage array. It has ten dual-controller line cards. Each card has NVMe flash and four 100GbE ports. Designed to be highly available, the unit has four redundant power supplies and two management modules. It supports up to 72 NVMe flash drives, with a starting capacity of 14TB that is expandable up to 1PB. Each controller has access to any drive.
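As a rough sanity check, the port counts above can be compared with the quoted bandwidth. The sketch below is back-of-the-envelope arithmetic only: it assumes the 120GBps figure describes the whole chassis, treats every port as running at full line rate, and ignores Ethernet and NVMe-oF protocol overhead. The function name is hypothetical.

```python
LINE_CARDS = 10            # from the briefing
PORTS_PER_CARD = 4         # from the briefing
PORT_SPEED_GBIT = 100      # 100 Gigabit Ethernet, in gigabits per second
QUOTED_GBYTE_PER_S = 120   # quoted array bandwidth, in gigabytes per second


def aggregate_port_bandwidth_gbyte(cards: int, ports_per_card: int,
                                   port_speed_gbit: float) -> float:
    """Return the total line rate of all ports in gigabytes per second."""
    total_gbit = cards * ports_per_card * port_speed_gbit
    return total_gbit / 8  # 8 bits per byte


if __name__ == "__main__":
    line_rate = aggregate_port_bandwidth_gbyte(
        LINE_CARDS, PORTS_PER_CARD, PORT_SPEED_GBIT)
    print(f"ports: {LINE_CARDS * PORTS_PER_CARD}")
    print(f"aggregate line rate: {line_rate:.0f} GB/s")
    print(f"quoted / line rate: {QUOTED_GBYTE_PER_S / line_rate:.2f}")
```

Under these assumptions the 40 ports add up to far more line rate than the quoted throughput, which suggests the network ports are not the limiting factor in the design.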

It also provides complete, enterprise-grade data management. Resiliency comes from dual-parity erasure coding, non-disruptive field upgrades, and active-active multi-pathing. From a security perspective, the solution provides data-at-rest encryption, secure multi-tenancy, and workload isolation. The system provides the expected data management features, including thin provisioning, snapshots, and clones. It has a RESTful API for orchestration with OpenStack Cinder, Kubernetes, DMTF Redfish, and SNIA Swordfish.
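Dual-parity erasure coding spends two drives' worth of capacity per stripe on protection, so the usable share depends on stripe width. The sketch below shows that relationship. The briefing does not state the stripe width or whether the 14TB to 1PB range is raw or usable, so the width of 18 and the 1,000TB raw figure are purely hypothetical inputs.

```python
def usable_fraction(stripe_width: int, parity: int = 2) -> float:
    """Return the share of raw capacity left for data in one stripe.

    stripe_width: total drives per stripe, data plus parity (assumed value).
    parity: parity drives per stripe; 2 for dual parity.
    """
    if stripe_width <= parity:
        raise ValueError("stripe_width must exceed the parity count")
    return (stripe_width - parity) / stripe_width


def usable_capacity_tb(raw_tb: float, stripe_width: int, parity: int = 2) -> float:
    """Return usable terabytes for a given raw capacity and stripe layout."""
    return raw_tb * usable_fraction(stripe_width, parity)


if __name__ == "__main__":
    # Hypothetical layout: 18 drives per stripe, 1,000 TB of raw flash.
    print(f"{usable_fraction(18):.3f}")
    print(f"{usable_capacity_tb(1000.0, 18):.1f} TB")
```

Wider stripes lose less capacity to parity but involve more drives in each rebuild, which is the usual trade-off in choosing a layout.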

StorageSwiss Take

Change in the application or server tier almost always leads to disruption in the storage architecture, and such disruption has led to the birth of billion-dollar companies. It has also left companies that were once leaders scrambling for significance. The move to modern applications is another of these disruptions, and legacy storage vendors are less prepared than ever to manage the transition.

As a result, we’ve seen the emergence of several designs to address this market. The critical differentiator is where the storage application runs. Most run the storage application and the modern application on the same node at the same time. Pavilion Data takes the opposite approach: the storage component runs on a centralized, pooled storage system, and it overcomes theoretical latency problems by delivering a highly specialized storage hardware architecture.

Twelve years ago George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS and SAN, virtualization, cloud, and enterprise flash. Prior to founding Storage Switzerland he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.

