Flash-based Solid State Devices (SSDs) are more than just a collection of memory cards, and they’re not all created equal. Components like the controller, backplane, and internal interconnect are all important differentiators when assessing Flash-based storage solutions. As these products have made their way into the enterprise, a component that has become increasingly important, yet is often overlooked, is the Flash Operating System (FOS). How the FOS is developed and how it interacts with the rest of the solid state system directly impacts performance, reliability, and efficiency.
Flash OS vs. Flash Controller
There are two key components to a Flash device. The first, as Storage Switzerland discussed in the article “Pay Attention to Flash Controllers when Comparing SSD Systems”, is the controller. The Flash controller operates on a group of NAND Flash modules to make sure that data is written correctly and evenly to the cells on those modules. It’s primarily designed to maximize the life expectancy of NAND cells and to ensure data reliability. The second is the FOS, which operates across a series of Flash controllers to increase performance, improve data reliability, and maintain the efficiency of the entire Flash system or appliance.
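To make the controller’s job of writing “evenly” concrete, here is a minimal sketch of wear leveling in Python. It is an illustration only, not how any particular controller works: the class name, the block count, and the “always pick the least-worn block” policy are all assumptions made for the example.

```python
class WearLevelingController:
    """Toy model of one Flash controller managing a few NAND blocks (hypothetical)."""

    def __init__(self, num_blocks):
        # erase_counts[i] tracks how many times block i has been written
        self.erase_counts = [0] * num_blocks

    def write(self):
        # Pick the least-worn block; the lowest index wins ties
        block = min(range(len(self.erase_counts)),
                    key=self.erase_counts.__getitem__)
        self.erase_counts[block] += 1
        return block


controller = WearLevelingController(num_blocks=4)
for _ in range(10):
    controller.write()

# Wear stays within one write of even across all blocks
print(controller.erase_counts)  # [3, 3, 2, 2]
```

Real controllers also have to handle error correction, bad blocks, and garbage collection, none of which is modeled here.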
What Does a Flash OS Do?
An FOS shares many similarities with an operating system that runs on a mechanical hard drive-based storage system. First and foremost, it has to provision and manage access to the device, typically by creating volumes for the attaching host to connect to. The FOS used to stop there, for fear that adding any more functions to the controller would hinder performance. This meant using the server-based operating system or application to provide other capabilities that were needed.
The current generation of FOSs goes significantly further than just provisioning and managing NAND cells. Modern Flash devices, like those designed by Texas Memory Systems, have multiple controllers driving multiple Flash memory areas. The operating system works in concert with these controllers, directing how data is written across them so that each controller is used evenly. The FOS also provides data protection functions. In a system with multiple controllers it will create either RAID or mirrored redundancy to make sure that data is always available to the connecting application.
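As a rough illustration of spreading data evenly across controllers while keeping it recoverable, the sketch below stripes a buffer across several controllers and adds one XOR parity chunk per stripe, rotating the parity position in the style of RAID 5. This is an assumption chosen for the example, not a description of any vendor’s scheme. The function names, chunk size, and controller count are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data, num_controllers, chunk_size):
    """Return one layout per stripe; layout[i] is the chunk sent to controller i."""
    data_slots = num_controllers - 1          # one slot per stripe holds parity
    stripe_bytes = data_slots * chunk_size
    num_stripes = -(-len(data) // stripe_bytes)  # ceiling division
    padded = data.ljust(num_stripes * stripe_bytes, b"\0")
    stripes = []
    for n in range(num_stripes):
        start = n * stripe_bytes
        chunks = [padded[start + i * chunk_size:start + (i + 1) * chunk_size]
                  for i in range(data_slots)]
        parity = reduce(xor_bytes, chunks)
        # Rotate the parity slot so no single controller holds all the parity
        parity_pos = n % num_controllers
        stripes.append(chunks[:parity_pos] + [parity] + chunks[parity_pos:])
    return stripes


def rebuild_chunk(layout, failed_controller):
    """Recover the chunk on a failed controller by XORing the survivors."""
    survivors = [c for i, c in enumerate(layout) if i != failed_controller]
    return reduce(xor_bytes, survivors)


stripes = stripe_with_parity(b"ABCDEFGHIJKL", num_controllers=4, chunk_size=2)
lost = rebuild_chunk(stripes[0], failed_controller=2)
print(lost == stripes[0][2])  # True
```

Mirroring, the other option mentioned above, would simply write each chunk to two controllers. It costs more capacity but avoids the parity calculation.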
If the SSD appliance supports it, the FOS will also handle the high availability (HA) features that products like the RamSan 720 & 820 provide. HA configurations require that the entire system be monitored so that if a failure occurs the correct component can be swapped out, allowing data access to continue for the application.
The FOS will also manage how the Flash appliance interfaces with the rest of the data center, including how these devices manage shared access. Many of the current SSD Appliance solutions are PCIe based architectures designed primarily for direct attached storage. These vendors will provide sharing via a gateway type of appliance, either built into the SSD appliance itself or an external direct attached server.
The gateway approach solves the problem of sharing these PCIe based systems, but at the expense of cost, efficiency, and performance. The processing power available to the gateway will directly impact performance. For some environments this overhead is worth the extra enterprise features, like snapshots; for others the added latency is too severe.
Where to Implement the FOS?
The FOS can be implemented in three areas. The first is essentially as a driver that rides along inside the operating system of the attaching server, something that’s often done with PCIe based Flash SSD. This approach keeps cost down by using some of the attaching server’s CPU and memory to run the Flash operating system. This method does make it hard to create a shared Flash storage pool and it makes the speed and available memory of the attaching server the bottleneck to higher performance.
More often the FOS is implemented via a CPU or custom silicon device in the Flash appliance itself. This approach has the advantage of not consuming the attached host’s resources, and it makes it easier for the Flash in an appliance to be shared. As with the driver approach used on PCIe SSDs, however, the single CPU or custom silicon device can become the performance bottleneck, since all I/O has to pass through it.
A less common approach is the one used by Texas Memory Systems. They spread the workload out across multiple field programmable gate arrays (FPGAs) inside the SSD appliance, which places the processing power closest to each specific function being performed. This distributed, highly parallel approach may be more challenging from a development standpoint, but it keeps any single component from becoming the bottleneck.
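The bottleneck argument can be shown with simple arithmetic. The sketch below is a deliberately simplified model: the IOPS figures are made-up numbers, not measurements of any product, and the function name is hypothetical. It assumes that when all I/O passes through one processor, the appliance can go no faster than that processor.

```python
def appliance_iops(controller_iops, central_processor_iops=None):
    """Return a rough IOPS ceiling for an appliance.

    controller_iops: per-controller ceilings (hypothetical numbers).
    central_processor_iops: ceiling of a single CPU or custom silicon device
        that all I/O must pass through, or None for a distributed design.
    """
    aggregate = sum(controller_iops)
    if central_processor_iops is None:
        # Distributed design: no single component caps the aggregate
        return aggregate
    # Centralized design: the slower of the two limits the appliance
    return min(aggregate, central_processor_iops)


controllers = [100_000] * 4  # four controllers, illustrative figures only
centralized = appliance_iops(controllers, central_processor_iops=250_000)
distributed = appliance_iops(controllers)
print(centralized, distributed)  # 250000 400000
```

In this model, adding a fifth controller to the centralized design changes nothing, while the distributed design gains the full capacity of the new controller.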
FOSs Need Balance
Storage systems based on hard disks, or hybrid systems with a mixture of hard disk and SSD, are designed to provide both capacity to store data and acceptable performance. Flash SSD Appliances are designed to provide performance as a first priority. This creates a significant challenge in the design of the OS that’s going to manage these devices. It has to provide enough services to the enterprise to be considered viable, but it has to provide those services without impacting performance. As discussed above, where to place the FOS is important, but which services to provide is also critical.
Custom environments, where the application can provide much of the data services umbrella, need a Flash system to only provide a bare minimum of capability. This is essentially the provisioning and data protection described above.
Other environments, like hypervisors and server operating systems, don’t have these advanced data services capabilities, so more is needed from the SSD appliance. In these situations the appliance needs to offer complete HA as well as basic provisioning and data protection.
The challenge from an FOS perspective is to supply only the capabilities that are needed for the situation. Providing too much just adds overhead for unused features; providing too little means these capabilities have to be added externally, which can increase latency.
This is where the distributed operating system design is ideal. Because the FOS is distributed across the various components, it is automatically balanced. For example, if HA is not needed, the components of the operating system that drive those features are not installed. If HA is needed, then the hardware is installed, along with the resources needed to drive those components of the operating system.
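One way to picture this “install only what is needed” idea is as a simple composition of services. The sketch below is purely illustrative. The service names and the function are hypothetical labels for the capabilities discussed in this article, not an actual FOS configuration interface.

```python
# Capabilities every deployment needs, per the discussion above
BASE_SERVICES = ["provisioning", "data_protection"]
# Extra pieces that only exist when HA hardware is installed (assumed names)
HA_SERVICES = ["component_monitoring", "failover"]


def fos_services(needs_ha):
    """Return the list of FOS services for a given deployment."""
    services = list(BASE_SERVICES)
    if needs_ha:
        services += HA_SERVICES
    return services


custom_app = fos_services(needs_ha=False)  # application supplies its own data services
hypervisor = fos_services(needs_ha=True)   # appliance must supply HA itself
print(custom_app)
print(hypervisor)
```

The custom application environment carries no HA overhead at all, while the hypervisor environment gets the additional services together with the hardware that runs them.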
The details are important when comparing Flash-based storage solutions, as is the use case. Along with the sophistication of the controller in correcting errors on the memory cells, the ability of the FOS to provide steady performance and reliability across the entire appliance is critical. What may be more important is how the FOS implementation will impact the overall performance of the Flash device. In the low-latency world of SSD, the ability to scale the Flash operating system to meet unique workload needs is critical for maximizing the value of the Flash investment.
Texas Memory Systems is a client of Storage Switzerland.