It wasn’t that long ago that most IT professionals would have considered 1 PB of data a gargantuan amount of information. Now, many data center environments are supporting large unstructured data repositories that are well in excess of this heady figure. And as data continues to proliferate, finding a way to economically store and protect this information, while still enabling very rapid access times for applications, is becoming increasingly challenging. Panasas believes their scale-out storage appliance may just provide the optimal balance of performance and economy that businesses need to efficiently manage their large and growing information stores.
HPC Gone Mainstream
As a 15-year-old technology company based in Sunnyvale, CA, Panasas has a pedigree in high performance computing (HPC) environments, such as large government agencies and higher education research departments. It helps them cost-effectively scale out their storage infrastructures while satisfying their need for very rapid storage I/O. HPC may have been a niche within the IT industry just a few years ago, but it is becoming commonplace as more businesses adopt Big Data analytics applications, data mining systems and other highly data-intensive platforms.
The Performance and Economy Trade-Off
The need for storage that is both fast at scale and economical is creating a dilemma for many organizations, because the choice is often an either/or proposition. Companies can either invest in a costly scale-out storage solution that provides very high performance for business applications or implement a solution that is designed to provide middling performance at a lower cost. Panasas claims that with its newest product release, ActiveStor 16, businesses can now have both without spending exorbitant sums on their underlying storage infrastructure.
Scale-Out Object NAS
Panasas’ scale-out NAS storage appliance provides NFS/CIFS file access to applications on the front end while storing data in an efficient object storage environment on the back end. End users see their files stored on a typical POSIX file system, but the data is actually broken into chunks and distributed across multiple low-cost SATA disk drives throughout the system. Data protection comes from erasure codes implemented in software rather than from traditional hardware RAID controllers. To provide high performance, the appliance uses dedicated server blades to process metadata and small files. These blades are configured with a layer of flash so that requesting applications get very high throughput to data.
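To make the chunk-and-protect idea concrete, here is a minimal sketch of software-based protection. It assumes the simplest possible erasure code, a single XOR parity chunk, which survives the loss of only one chunk; it is an illustration of the principle, not Panasas' actual code or layout. The function names (`split_chunks`, `encode`, `recover`) are hypothetical.

```python
from functools import reduce


def split_chunks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size chunks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Return k data chunks plus one XOR parity chunk.

    Each element would be written to a different drive (assumption).
    """
    chunks = split_chunks(data, k)
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]


def recover(stripe: list[bytes], lost: int) -> bytes:
    """Rebuild the chunk at index `lost` by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(stripe) if i != lost]
    return reduce(xor_bytes, survivors)


stripe = encode(b"hello world, object storage", k=4)
rebuilt = recover(stripe, lost=2)  # pretend the drive holding chunk 2 failed
```

Tolerating two or three simultaneous losses requires additional, independent parity chunks (for example, Reed-Solomon style codes), but the rebuild pattern is the same: read the survivors and recompute what is missing.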
Parity In Triplicate
With the latest release of its storage system software, version 6.0, Panasas adds a more resilient data protection scheme to the platform: RAID 6+ with triple parity protection. Panasas aptly points out that as the number of disk drives in a storage environment grows, the likelihood of suffering disk drive failures goes up. And with disk drive densities continuing to increase, drive rebuild times are getting inordinately longer. In fact, it isn’t uncommon for a 4TB drive to take in excess of 48 hours to rebuild on other systems using traditional RAID controllers.
If a system suffers multiple drive failures while drive rebuilds are taking place, it is possible to suffer data loss. Panasas claims that the key to reducing drive rebuild times is to design a system that dramatically reduces the need to ever perform an entire drive rebuild in the first place and to ensure that RAID rebuild speeds scale with the number of drives in the system.
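A quick back-of-the-envelope calculation shows why rebuild speed matters. The sketch below works out the sustained rate implied by a 48-hour rebuild of a 4TB drive, then shows how the time would shrink if rebuild work were spread evenly across many drives. The linear-scaling model and the function names are assumptions for illustration, not measurements of any Panasas system.

```python
def implied_rate_mb_s(capacity_tb: float, hours: float) -> float:
    """Sustained rate (MB/s, decimal units) needed to rebuild capacity_tb in `hours`."""
    return capacity_tb * 1e12 / (hours * 3600) / 1e6


def rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours to rebuild capacity_tb at a single sustained rate."""
    return capacity_tb * 1e12 / (rate_mb_s * 1e6) / 3600


def parallel_rebuild_hours(capacity_tb: float, rate_mb_s: float, drives: int) -> float:
    """Idealized case (assumption): `drives` drives each contribute rate_mb_s."""
    return rebuild_hours(capacity_tb, rate_mb_s * drives)


rate = implied_rate_mb_s(4, 48)               # roughly 23 MB/s
spread = parallel_rebuild_hours(4, rate, 10)  # roughly 4.8 hours under this model
```

A 48-hour rebuild of 4TB works out to only about 23 MB/s, well below what a modern SATA drive can stream, which is why spreading the rebuild across many drives is attractive. Real systems will not scale perfectly linearly.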
Self-Healing RAID
By creating an extra parity bit with each write operation, Panasas’ system can use this bit to correct bad drive sectors on the fly. This is, in effect, a proactive approach to correcting disk drive issues before they become an all-out drive failure. Even more important is how the system provides scalable RAID rebuild performance and how data is distributed across drives, allowing data reliability to increase with scale instead of decreasing, as is normally observed with other architectures. According to Panasas, this is key to enabling businesses to massively scale their storage infrastructures without suffering correspondingly more drive failures and longer drive rebuild times.
In addition, by protecting file system directory data, the system offers a new feature called Extended File System Availability that allows it to stay online and accessible to users even after three simultaneous drive failures, a scenario that would result in 100% data loss and storage system downtime for other systems. Because of how data is distributed across drives, the likelihood of incurring data loss as the result of multiple, simultaneous drive failures is minimized.
Panasas estimates that only 1 in 200 million files would likely be damaged by three simultaneous drive failures in a 1,000-drive system with RAID 6+. Because of the extra directory protection, the storage administrator could then restore only those specific affected files from backup and quickly return the system to 100% health. This availability and disaster recovery capability stands to benefit any customer looking to leverage a scale-out architecture, especially ones considering petascale deployments.
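A toy probability model helps explain why only a tiny fraction of files would be affected. The sketch below assumes that each file is striped across a fixed number of drives chosen uniformly at random, and that a file is damaged only if all three failed drives hold part of it. Both assumptions, along with the stripe width used, are hypothetical. The model is not meant to reproduce Panasas' 1-in-200-million estimate, which depends on details the company's figure does not spell out here.

```python
from math import comb


def fraction_of_files_hit(total_drives: int, stripe_width: int, failed: int = 3) -> float:
    """Probability that a randomly placed stripe includes every failed drive.

    Toy model (assumption): stripes are placed uniformly at random and a
    file is damaged only when all `failed` drives hold one of its chunks.
    """
    if stripe_width < failed:
        return 0.0
    # Count stripes that contain all failed drives, divided by all possible stripes.
    return comb(total_drives - failed, stripe_width - failed) / comb(total_drives, stripe_width)


# Hypothetical stripe width of 10 on systems of two different sizes.
small_system = fraction_of_files_hit(total_drives=1000, stripe_width=10)
large_system = fraction_of_files_hit(total_drives=2000, stripe_width=10)
```

In this model the affected fraction falls as drives are added, because any three failed drives are less likely to share a stripe. That is consistent with the claim that reliability can improve with scale when data is widely distributed.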
Conclusion
Many organizations are starting to take a close look at object storage systems due to their ability to scale out massively while using low-cost, commodity disk drive technology. But many of these object storage systems are not designed to deliver ultra-high throughput to the data they store. The Panasas offering is designed to give data center environments the best of both worlds: low-cost storage that can scale into multiple PBs in a standard POSIX file system environment and that can also provide very high performance for all types of application workloads.
Panasas is not a client of Storage Switzerland
