Texas Memory Systems (TMS) has been building high performance, solid state storage systems for over three decades, laying claim to the “fastest storage on the planet” with their World’s Fastest Storage® trademark. The RamSan-720 added to this legacy with hardware based, high availability features that don’t impact performance. The RamSan-720 opened up new SSD applications in critical environments where all infrastructure must be designed with no single point of failure. Now TMS has released the RamSan-820, which uses lower cost eMLC flash, providing twice the capacity of their SLC-based RamSan-720 system in a fully HA configuration.
The RamSan-820 combines the high availability enterprise architecture of the RamSan-720 with the capacity and storage density of eMLC flash. This new system provides 12 or 24TB of usable capacity (10 or 20TB after RAID overhead) in a 1U rack-mounted chassis. Each system has redundant power supplies and two slots for either dual-port 8 Gb/s Fibre Channel or dual-port 40 Gb/s QDR InfiniBand controllers.
Like the other RamSan products, the RamSan-820 doesn’t use drive form factor SSDs, but instead, puts eMLC flash chips on its own PCB-card modules in the chassis using its own controller technology. This provides a significant density and performance advantage over other architectures. Using an internal, non-blocking crossbar switch, each module has independent access to all I/O channels, greatly benefiting performance. The RamSan-820 can sustain 450K IOPS, 4K random reads or writes in any combination, and up to 4 GB/s of sustained throughput. Write latency is 25 µs and read latency is 110 µs.
Each flash module (called a Fault Tolerant Flash or FTF module) includes on-board Variable Stripe RAID*, chip-level ECC protection, and its own flash controller. When system-level RAID is enabled, one FTF module can be designated as a hot spare, and the system immediately migrates data from a failed module to it. And with redundant paths for power, data, and control, this system has no single point of failure that could lead to data loss.
Capacity, Density, Efficiency
As the capacity per rack of flash systems increases and power consumption per GB decreases, companies have started using large capacity flash arrays like the RamSan-820 to end the practice of solving performance problems by adding spindles. Watts/GB and watts/IO are becoming the deciding factors as metro data centers run into limits on available power. The complexity and management overhead of those high drive count solutions were always concerns, but now eMLC flash in these high density configurations is eliminating the economic rationale behind that strategy as well.
The RamSan-820 leverages an adaptive RAID 5 called Variable Stripe RAID within each flash module, as well as RAID 5 across modules, giving the system two independent layers of RAID protection. RAID in a flash system needs to be handled differently than RAID in a disk array since ‘rebuilds’ can consume a significant amount of limited flash capacity.
When a disk failure occurs in a RAID-protected array, the controller uses parity calculations to recreate or “rebuild” the failed drive’s data on a spare disk drive. Aside from the time required to recreate the failed drive’s capacity, the net ‘cost’ in terms of storage is only the capacity of that failed drive. The rest of the drives in the RAID group are still used when the rebuild is completed.
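To make the parity calculation concrete, here is a minimal Python sketch of a generic XOR-parity rebuild of the kind used in RAID 5. It is illustrative only and is not TMS's implementation; the function names and the tiny two-byte blocks are assumptions made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recreate the one missing member from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three hypothetical data members and their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Pretend member 1 failed; rebuild it from the other two and parity.
rebuilt = rebuild_missing([data[0], data[2]], parity)
print(rebuilt == data[1])
```

Because XOR is its own inverse, the same routine that generates parity also regenerates any single lost member, which is why only the failed drive's capacity has to be rewritten.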
When a failure occurs in a RAID-protected flash device, the controller copies the entire RAID group to a spare flash area maintained for this purpose, using parity to recreate the RAID member that actually failed. But unlike disk RAID, the other members of the original flash RAID group are not reused, making the net cost of flash RAID rebuilds much higher.
The 9+1 RAID groups on the RamSan-820 flash modules consist of 10 flash chips, each containing 16 planes (2 planes in each of 8 dies on a chip). In most flash systems, a single plane failure would require either a maintenance interval to replace a module or the inefficient loss of an entire RAID stripe’s worth of capacity. TMS’s Variable Stripe RAID technology addresses this problem by adjusting RAID stripe sizes (in this case, to an 8+1 configuration) to take advantage of all remaining working planes. The result is much less wasted capacity and a significant improvement in flash endurance.
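The capacity difference between retiring a whole stripe and shrinking it from 9+1 to 8+1 can be estimated with a rough Python sketch. The chip and plane counts come from the description above; the stripe layout (each stripe taking the same-numbered plane from each of the 10 chips) and the function name are assumptions made for illustration, not a description of how TMS actually lays out data.

```python
CHIPS = 10            # members of a 9+1 RAID group (from the article)
PLANES_PER_CHIP = 16  # 2 planes in each of 8 dies (from the article)


def usable_data_planes(failed, variable_stripe):
    """Count data planes left in one module.

    failed: set of (chip, plane) pairs that have failed.
    variable_stripe: if True, a damaged stripe shrinks (e.g. 9+1 -> 8+1);
    if False, a damaged stripe is retired entirely.
    Assumed layout: stripe p uses plane p on every chip.
    """
    total = 0
    for plane in range(PLANES_PER_CHIP):
        working = sum(1 for chip in range(CHIPS) if (chip, plane) not in failed)
        if working == CHIPS:
            total += CHIPS - 1      # intact stripe: 9 data + 1 parity
        elif variable_stripe and working >= 2:
            total += working - 1    # shrunken stripe: one member is parity
        # otherwise the whole stripe is lost
    return total


one_failure = {(3, 5)}  # hypothetical: plane 5 on chip 3 fails
print(usable_data_planes(set(), False))        # 144
print(usable_data_planes(one_failure, False))  # 135
print(usable_data_planes(one_failure, True))   # 143
```

Under these assumptions a single plane failure costs nine data planes when the stripe is retired but only one when the stripe is resized, which is the kind of saving the 8+1 adjustment is meant to deliver.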
Greater endurance translates to longer flash life and a lower overall cost per GB. This means that these all-flash systems can be made large enough to support more use cases, including those that have historically been tied to high end disk array systems. More capacity means entire applications can be put on flash, eliminating the complexity of caching and tiering. While the RamSan-720 is recommended for write-intensive workloads, the density of 20TB in a 1U chassis makes the RamSan-820 ideal for read-heavy environments.
Latency vs Performance
“Latency” refers to the time it takes to service a single data request; “performance” is an aggregate metric, how well a system can service all the requests it receives. Latency for flash storage is several orders of magnitude less than it is for even the fastest hard disk drives. While using a large number of spindles can improve aggregate performance, it does nothing to improve latency, the speed with which a storage system services a particular host’s request for data.
In applications that generate high numbers of transactions involving small data objects, latency can be a more important metric of a storage array’s effectiveness than aggregate system performance. If you can’t multiplex a request and stream data from multiple disk drives, then latency is the gating factor. This is one of the reasons that flash storage arrays, like the RamSan-820, are being adopted in high performance use cases, and companies are moving away from high spindle configurations.
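The relationship between latency and aggregate performance can be sketched with Little's Law, which ties throughput to the number of requests in flight divided by the time each one takes. The Python below is a back-of-the-envelope illustration: the 110 µs figure is the RamSan-820 read latency quoted earlier, while the 5 ms disk latency and the function names are assumptions chosen only to show the shape of the argument.

```python
def serial_iops(latency_seconds):
    """IOPS seen by a host that issues one request at a time."""
    return 1.0 / latency_seconds


def aggregate_iops(latency_seconds, outstanding_requests):
    """Little's Law: throughput = requests in flight / latency."""
    return outstanding_requests / latency_seconds


FLASH_READ_LATENCY = 110e-6  # 110 microseconds, from the article
DISK_LATENCY = 5e-3          # assumed round figure for a fast hard drive

# A single serialized stream is gated purely by latency.
print(round(serial_iops(FLASH_READ_LATENCY)))
print(round(serial_iops(DISK_LATENCY)))

# 100 spindles kept busy raise the aggregate number, but each
# individual request still waits the full disk latency.
print(round(aggregate_iops(DISK_LATENCY, 100)))
```

With these inputs a serialized workload gets roughly 9,000 IOPS from flash and about 200 from a single disk. Adding spindles only helps when enough independent requests are outstanding to keep them all busy.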
What 20TB of HA Flash can do
With these kinds of capacities, data centers can safely move entire databases or other high performance applications into flash and run them there. This eliminates the analysis required to figure out which data sets need to be in a flash tier, which can be left on disk, and when. It also eliminates the overhead of moving that data between tiers. A 20TB flash array likewise removes the complexity of caching implementations and the problems associated with a cache miss.
Storage Swiss Take
TMS is moving the concept of SSD appliances forward significantly with the RamSan-820. By leveraging the economics of eMLC flash and adding their proprietary technology for data protection and efficient controller design, they’re making a strong case to forget about high drive count solutions for performance problems. When power, cooling and storage density are factored in, the decision becomes that much clearer. For about the same power as a 1U server, TMS can now give you tens of terabytes of flash and hundreds of thousands of IOPS.
* Variable Stripe RAID is a trademark of Texas Memory Systems
Texas Memory Systems is a client of Storage Switzerland