One of the more popular approaches to implementing solid state disks (SSDs) is to implement the flash memory on a PCIe card, which is installed directly into a server. This places the high performance storage directly at the point of the storage performance problem, and eliminates potential network and storage protocol bottlenecks that can get in the way of greatest SSD performance. But not all PCIe SSDs are created equal. In fact, there are some stark differences between the various architectures in the form factor. This article will explore some of the factors to consider when comparing these popular server-side flash storage devices.
Why Should PCIe SSD Performance Matter?
It is easy to assume that since PCIe SSDs tend to be some of the fastest storage devices available, even the slowest of the group should be more than enough for most applications. But that’s not the case, because there will always be a need to go faster. The sheer growth in the volume of data being stored and manipulated, both by consumers in the cloud and by enterprises running critical business applications, demands continued gains in performance. Software demands more and more storage performance and users demand faster and faster response times. Purchasing just enough to solve today’s storage problem will likely create a new performance problem in the future, as growing data volumes require ever greater performance.
Understand Your Environment
The first step in selecting the right PCIe SSD for the organization is to understand the environment in which the SSD will be deployed. Will the server or cluster have the types of applications and access requirements that are generating a large amount of parallel storage I/O? Will multiple users and/or applications need simultaneous, random access to large amounts of data? This is I/O where thousands of users or processes are accessing a small group of files, as in a database or file sharing environment. Parallel I/O can also be generated by a few users accessing thousands of files, as in a big data analytics environment or one with dozens of virtual machines on a single host.
Finally, some environments’ storage I/O needs are pure bandwidth and not random at all. These environments are typically looking for high data ingest rates. While not the original focus of SSD, bandwidth-demanding environments are turning to PCIe-based SSDs in particular because of their high-speed transfer channels (PCIe) and superb read/write performance.
Due to the nature of these environments, most of the benchmark data that’s available on SSDs won’t typically impress these users. It simply doesn’t fit their real-world use case. What matters most is to make sure that the PCIe SSD being reviewed delivers the performance that the environment needs. For parallel environments, 4k random read/write tests using mixed workloads are appropriate for the initial vetting. For high bandwidth environments, raw transfer rates are best. But, in both situations, nothing will be more accurate than installing the product into the specific production environment that it will be supporting.
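To make the 4k random-read vetting concrete, here is a minimal, illustrative microbenchmark in Python. It is a sketch only: it reads from an ordinary scratch file (standing in for the device under test), runs single-threaded at a queue depth of one, and does not bypass the operating system’s page cache, so the numbers it reports will be far rosier than a proper test with a purpose-built tool such as fio would show. The file size and operation count are arbitrary choices for illustration.

```python
import os
import random
import tempfile
import time

BLOCK = 4096                   # 4 KiB, the block size used in typical random I/O tests
FILE_SIZE = 8 * 1024 * 1024    # small scratch file; real tests span far more of the device
OPS = 2000                     # number of random reads to issue

# Create a scratch file to read from (stands in for the SSD under test).
fd, path = tempfile.mkstemp()
os.write(fd, os.urandom(FILE_SIZE))
os.close(fd)

fd = os.open(path, os.O_RDONLY)
offsets = [random.randrange(0, FILE_SIZE // BLOCK) * BLOCK for _ in range(OPS)]

start = time.perf_counter()
for off in offsets:
    data = os.pread(fd, BLOCK, off)   # one aligned 4k read at a random offset
elapsed = time.perf_counter() - start

os.close(fd)
os.remove(path)

iops = OPS / elapsed
print(f"{iops:.0f} random 4k read IOPS (cached, single-threaded)")
```

A production-grade vetting run would add direct I/O, a mixed read/write ratio, multiple workers, and a sustained runtime long enough to get past any write caches, which is exactly what the article’s advice about testing in the real production environment is guarding against.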
The PCIe SSD Controller Debate
PCIe, unlike any other form of SSD, offers options for how the flash controller function is deployed. The flash controller, at a basic level, manages such flash processes as wear leveling and garbage collection. In PCIe devices this controller function can be performed on the SSD board itself, similar to other form factors, or it can be designed to borrow resources from the host, like the system’s CPUs and RAM.
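The flash-management duties mentioned above can be sketched in a few lines. The toy model below shows the core idea of wear leveling (every class and method name here is invented for illustration, not any vendor’s actual firmware interface): the controller tracks how many times each erase block has been cycled and steers new allocations to the least-worn free block, so no single block wears out ahead of the rest.

```python
# Toy model of wear leveling: the controller tracks erase counts per block
# and steers new writes toward the least-worn free block.

class FlashController:
    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.free_blocks = set(range(num_blocks))

    def allocate_block(self):
        # Wear leveling: among free blocks, pick the one erased fewest times.
        block = min(self.free_blocks, key=lambda b: self.erase_counts[b])
        self.free_blocks.remove(block)
        return block

    def erase_block(self, block):
        # Erase cycles are what wear flash out; count each one and
        # return the block to the free pool.
        self.erase_counts[block] += 1
        self.free_blocks.add(block)

ctrl = FlashController(num_blocks=4)
for _ in range(8):            # repeated allocate/erase cycles
    b = ctrl.allocate_block()
    ctrl.erase_block(b)
print(ctrl.erase_counts)      # wear is spread evenly across blocks
```

Whether this bookkeeping runs on dedicated silicon on the card or in a host-side driver consuming server CPU and RAM is precisely the controller debate the following paragraphs explore.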
This ‘borrowed’ controller deployment method is often called “software controller” technology since it leverages the hardware from the host that it’s being installed in. Since software controller technology relies on the host for processing, PCIe cards may be the most appropriate deployment for a developer that needs complete control of the application.
The advantages to the software controller approach include easier implementation and no lost time programming hardware or creating silicon. It also means that as the host CPU scales so will the controller function. But that scaling will be throttled by the number of onboard paths to data. Each path will consume more CPU resources and more RAM, so there is a functional limit. There is also a physical limit to the number of PCIe cards that can be supported by the host.
The software based controller architecture may also impact performance under peak loads. Since CPU resources and storage I/O demands often spike in unison, the impact of a shared controller architecture could be significant. At a minimum, predictability of performance is lost. Raw PCIe SSD performance will fluctuate constantly depending on the non-storage demands being placed on the host CPU. At the same time, high storage I/O from a particular VM, for example, could impact the available CPU resources of all the other VMs on that host.
Dedicated hardware storage controllers, on the other hand, are the more traditional approach to flash management. PCIe SSDs with onboard controller hardware have much in common with other non-PCIe SSDs. These cards have the advantage of consistent performance no matter what the load on the host resources is or what particular host processes or VMs are doing.
Hardware based controller architectures can scale effectively since they are not dependent on a particular class of CPU to assure performance. This means that the number of PCIe SSDs can scale effectively with the system’s storage needs simply by adding more cards. Memory is also typically in short supply so the fact that hardware based controllers don’t need to consume RAM means all available RAM can be allocated to the applications.
Finally, since there is only a thin driver required for hardware based controllers, it’s more likely that they can be bootable. This may be ideal for the server environment as it eliminates the need for a separate device just to boot the server. The thin driver should also make testing for compatibility with other drivers being loaded go more smoothly.
Multiple Controller Architectures vs. Single Controllers
The hardware based controller architecture also lends itself to scalability. The ability to aggregate multiple controllers allows intensive workloads to be divvied up efficiently without impacting performance. Companies like OCZ are taking the multi-controller approach a step further by virtualizing the controller functions. This allows the demands of flash management to be spread evenly across multiple physical controllers, increasing performance and endurance.
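One simple way to picture how a multi-controller card divides up work is static striping, where each logical block address maps to one of several controllers. The sketch below is an assumption-laden toy (the class names and the modulo-based mapping are illustrative inventions, not how OCZ’s controller virtualization actually works); it shows only the basic point that spreading addresses across controllers spreads the flash-management load with them.

```python
# Toy sketch: striping writes across multiple controllers so that
# flash-management work is spread rather than funneled through one controller.

class Controller:
    def __init__(self, cid):
        self.cid = cid
        self.writes = 0

    def write(self, lba, data):
        # A real controller would also handle wear leveling, GC, and ECC here.
        self.writes += 1

class MultiControllerSSD:
    def __init__(self, num_controllers):
        self.controllers = [Controller(i) for i in range(num_controllers)]

    def write(self, lba, data):
        # Simple static striping: the logical block address picks the controller.
        target = self.controllers[lba % len(self.controllers)]
        target.write(lba, data)

ssd = MultiControllerSSD(num_controllers=4)
for lba in range(1000):
    ssd.write(lba, b"\x00" * 4096)
print([c.writes for c in ssd.controllers])   # → [250, 250, 250, 250]
```

Virtualizing the controller function, as described above, goes further than a fixed mapping like this by letting the card rebalance flash-management work across the physical controllers dynamically.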
Exploitation of the PCIe Bus
Compared to the abilities of SATA/SAS and even 10Gb Ethernet or 16Gb FC, the PCIe bus is the computing equivalent of the autobahn, making the other protocols seem like winding country roads. However, just like the autobahn, the super data highway has limited benefit if the board being installed doesn’t take advantage of it.
The key to maximizing the performance of the PCIe bus is to 1) make sure the end device can handle it; 2) use PCIe SSDs with multiple, dedicated controllers; and 3) saturate the PCIe bus with data. The first part of this saturation comes from the applications themselves and, given the description above, many environments should be able to deliver this. The next step, and one that’s often overlooked, is to make sure that the PCIe SSD can support a highly parallel environment by being able to maintain very high storage I/O queue depth. This is partly development intelligence and partly leveraging a multi-controller architecture.
The last area to explore is data protection. The fastest performing PCIe SSD in the world does the data center no good if it can’t protect itself from failure. The first step is to protect against block failure. OCZ does this by analyzing the NAND flash at the block level as data is being written to it. If an error occurs, that block is marked as bad and the data is rewritten to another part of the PCIe SSD.
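The write-verify-remap loop described above can be sketched as follows. This is a deliberately simplified model (the class and function names are invented for illustration, and real firmware verifies programs via ECC rather than a boolean flag): each write is checked, and a block that fails verification is retired while the data is retried on a fresh block.

```python
# Toy sketch of block-level protection: verify each write, and if the block
# fails verification, mark it bad and rewrite the data elsewhere.

class NandArray:
    def __init__(self, num_blocks, bad_blocks=()):
        self.store = {}
        self.bad = set(bad_blocks)          # blocks that will fail on write
        self.free = list(range(num_blocks))

    def program(self, block, data):
        if block in self.bad:
            return False                    # write/verify failed
        self.store[block] = data
        return True

def safe_write(nand, data):
    # Keep trying fresh blocks until one programs and verifies cleanly.
    while nand.free:
        block = nand.free.pop(0)
        if nand.program(block, data):
            return block
        nand.bad.add(block)                 # retire the failed block permanently
    raise IOError("out of spare blocks")

nand = NandArray(num_blocks=8, bad_blocks={0, 1})
loc = safe_write(nand, b"payload")
print(loc)    # the first two blocks fail verification; data lands in block 2
```

The spare blocks in the model play the role of the over-provisioned capacity the next paragraph describes: retiring a bad block costs a spare, not usable capacity.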
Not only does this protect against data loss, it also prolongs the life of the PCIe SSD. A block failure does not mean that the board needs to be replaced. In fact there typically is no loss in capacity since most PCIe SSD boards are over provisioned.
Most PCIe boards use DRAM to buffer inbound traffic so that the flash controllers can organize writes. In a multi-controller architecture this happens in microseconds but if power were lost at that moment there would be the potential for data loss. OCZ overcomes this by providing power management on the PCIe board in the form of a capacitor subsystem. It provides enough power to de-stage DRAM to flash before total shutdown.
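The de-staging behavior can be modeled in a few lines. In this hedged sketch (the class names and the batch size of four are arbitrary illustrative choices, not a description of OCZ’s actual firmware), writes are acknowledged from a volatile DRAM buffer, drained to flash in batches, and a power-loss handler, standing in for the capacitor subsystem, flushes whatever is still buffered before shutdown.

```python
# Toy sketch of power-loss protection: writes land in a DRAM buffer first,
# and a capacitor-backed "de-stage" flushes the buffer to flash on power loss.

class BufferedSSD:
    def __init__(self):
        self.dram_buffer = []      # volatile staging area
        self.flash = []            # persistent medium

    def write(self, data):
        self.dram_buffer.append(data)      # fast acknowledgement from DRAM
        if len(self.dram_buffer) >= 4:     # controller drains writes in batches
            self.flush()

    def flush(self):
        self.flash.extend(self.dram_buffer)
        self.dram_buffer.clear()

    def on_power_loss(self):
        # The capacitor holds power just long enough to de-stage DRAM to flash.
        self.flush()

ssd = BufferedSSD()
for i in range(6):
    ssd.write(f"write-{i}")
ssd.on_power_loss()
print(len(ssd.flash), len(ssd.dram_buffer))   # → 6 0
```

Without the `on_power_loss` flush, the two writes still sitting in the buffer would be lost, which is exactly the exposure window the capacitor subsystem closes.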
While it should be commonplace, another set of capabilities to confirm are SMART functionality and TRIM command support. SMART functionality provides board level feedback about flash health. TRIM command support makes sure that when the operating system deletes data it’s marked for release on the flash board. This allows for the garbage collection process to execute more accurately.
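Why TRIM makes garbage collection "execute more accurately" can be shown with a small model. In this illustrative sketch (the names are invented; real drives track validity per page via the flash translation layer), pages the OS has trimmed are marked invalid, so when a block is reclaimed the controller only has to relocate the pages that still hold live data.

```python
# Toy sketch of why TRIM helps garbage collection: trimmed pages don't need
# to be copied when a block is reclaimed, so GC moves far less data.

class Block:
    def __init__(self, pages):
        self.valid = set(range(pages))   # pages holding live data

def trim(block, page):
    # The OS deleted this data; mark the page invalid so GC can skip it.
    block.valid.discard(page)

def garbage_collect(block):
    # GC must relocate every still-valid page before erasing the block.
    relocated = len(block.valid)
    block.valid.clear()
    return relocated

blk = Block(pages=64)
for p in range(48):
    trim(blk, p)                  # the OS TRIMs 48 of the 64 pages
moved = garbage_collect(blk)
print(moved)                      # only 16 pages must be copied, not 64
```

Without TRIM, the controller would have no way to know those 48 pages were dead and would copy all 64, amplifying writes and consuming erase cycles needlessly.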
Performance matters in solid state devices. However, not all PCIe SSDs are equipped to handle the demands of workloads that are highly parallel or that require high bandwidth. And fewer still are able to handle both workload types from a single board. The key for maximum, multi-workload performance is the method used to implement the controller technology and the number of controllers that are available to manage I/O traffic into and out of the NAND flash that makes up those SSDs. A higher number of controllers also supports more flash and allows more data to be accepted via the PCIe bus.