Flash is the form of persistent memory that most IT professionals are aware of, and it has fundamentally changed the data center. But we’ve only scratched the surface of what persistent memory is capable of, and flash is not the be-all and end-all of persistent memory.
The Impact of Flash
If there is a performance problem, storage is almost always the culprit. While IT can optimize databases and upgrade storage networks, in the end an upgrade of the storage system almost always pays the most dividends. When a data center upgrades its storage infrastructure to include flash, in the form of either all-flash arrays or hybrid (flash + hard disk drives) arrays, it almost always reduces performance problems.
The biggest impact is the time IT operations gets back by no longer having to perform maintenance, such as trying to tune a hard disk array to perform better than is physically possible, or tuning a database, application or even operating system to work around the performance limitations of hard disk technology. A flash system, in most cases, eliminates the need for that tuning in one fell swoop.
What’s Next for Flash?
Performance, like nature, abhors a vacuum. While flash systems almost always solve the immediate performance problem, they also expose other problems, especially as the data center adds additional workloads. A storage system is only as fast as its slowest component. That component used to be the hard disk. Now it’s everything else: the storage software, the storage compute, the internal storage network that connects compute, software and media, and the external connection to servers and applications. Everything that occurs in the flash market in the next few years will address this reality.
Flash-Aware Storage Software
We can assume for now that the compute part of the equation is well in hand; Intel seems to be years ahead of what is required from a storage perspective. The key now is for the software to advance to take advantage of both the available compute and the capabilities of flash.
On the compute side, Intel is adding computing power by increasing the number of processing cores available per CPU. Storage software must become much more multi-threaded than it currently is. Most storage software supports multiple cores by assigning specific functions to specific cores. This approach leads to the over-utilization of some cores and the under-utilization of others. Storage software needs to spread the workload evenly across the cores so that the full power of each is available to it.
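To make that idea concrete, here is a minimal sketch in C with pthreads of one way to spread request processing evenly across cores rather than dedicating whole functions to single cores. It is not any vendor's implementation; the request structure and the handle_request() routine are hypothetical placeholders.

```c
/*
 * Hypothetical sketch: spread I/O request processing evenly across CPU
 * cores instead of pinning whole functions to single cores.
 * struct io_request and handle_request() are placeholders.
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

#define NUM_REQUESTS 1024

struct io_request { int id; /* ... offset, length, buffer ... */ };

static struct io_request requests[NUM_REQUESTS];

static void handle_request(struct io_request *r) {
    /* placeholder for checksumming, mapping, compression, etc. */
    (void)r;
}

struct worker_arg { int core; int ncores; };

static void *worker(void *p) {
    struct worker_arg *a = p;

    /* Pin this worker to its own core so the load stays balanced. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(a->core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    /* Stripe the request list across cores: core N takes every Nth request. */
    for (int i = a->core; i < NUM_REQUESTS; i += a->ncores)
        handle_request(&requests[i]);
    return NULL;
}

int main(void) {
    int ncores = (int)sysconf(_SC_NPROCESSORS_ONLN);
    pthread_t threads[ncores];
    struct worker_arg args[ncores];

    for (int i = 0; i < ncores; i++) {
        args[i].core = i;
        args[i].ncores = ncores;
        pthread_create(&threads[i], NULL, worker, &args[i]);
    }
    for (int i = 0; i < ncores; i++)
        pthread_join(threads[i], NULL);

    printf("processed %d requests on %d cores\n", NUM_REQUESTS, ncores);
    return 0;
}
```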
When communicating with the flash storage, the storage software needs to interface with flash as if it were persistent memory, because that is what it is. Today, most storage software communicates with flash as if it were a fast hard disk drive. Storage software should change the way it organizes data and manages metadata to take full advantage of flash performance.
A key problem for many storage systems is that their software runs on top of a general-purpose file system, such as a native Linux file system or ZFS. It is that file system that actually communicates with the flash media. Storage software vendors need to cut out the middle-man and communicate directly with the flash media.
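On Linux, one way to sidestep that middle-man today is to open the device directly and bypass the page cache with O_DIRECT. The sketch below is only an illustration of the idea; /dev/nvme0n1 is a placeholder path, and a production system would layer its own data layout and error handling on top.

```c
/*
 * Hypothetical sketch: talk to the flash media directly with O_DIRECT,
 * bypassing the page cache and any intermediate file system.
 * /dev/nvme0n1 is a placeholder; point this at a scratch device only
 * (reading a block device typically requires root).
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE 4096   /* O_DIRECT I/O must be aligned to the device block size */

int main(void) {
    int fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* The buffer must also be block-aligned for direct I/O. */
    void *buf;
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0) { close(fd); return 1; }

    ssize_t n = pread(fd, buf, BLOCK_SIZE, 0);   /* read the first block of the media */
    if (n < 0)
        perror("pread");
    else
        printf("read %zd bytes directly from the device\n", n);

    free(buf);
    close(fd);
    return 0;
}
```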
NVMe and NVMe Over Fabrics
The second area for improvement is internal and external communication. Storage systems have an internal network: the storage software communicates with the CPU, which then executes those commands against the flash media. That communication path between the CPU and the flash media has not changed much since the emergence of flash. Now that internal network is changing from Serial Attached SCSI (SAS) to NVM Express (NVMe), a logical device interface specification for accessing non-volatile storage media attached via a PCI Express (PCIe) bus.
As we detail in our article, “What is NVMe”, NVMe also streamlines the software I/O stack by reducing the unnecessary overhead introduced by the SCSI stack. It also supports far more queues than standard SCSI, increasing the number of queues to 64,000, up from the single queue that the legacy Advanced Host Controller Interface (AHCI) supports. In addition, each NVMe queue can hold 64,000 commands, up from the 32 commands that AHCI supports in its one queue.
The result is a storage system that can keep up with flash performance, which should lead to reduced latency, more IOPS and increased bandwidth.
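The payoff of deep queues is keeping many commands in flight at once rather than issuing them one at a time. As an illustration of that batching model only, the sketch below uses Linux's generic asynchronous I/O library, libaio (not the NVMe driver interface itself), to submit a batch of reads in a single call; the file name, queue depth and block size are arbitrary placeholders.

```c
/*
 * Illustrative sketch: keep many commands in flight at once, the way deep
 * NVMe queues allow, using Linux libaio (compile with -laio).
 * testfile.bin is a placeholder and should be at least 64 * 4 KB in size.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define QUEUE_DEPTH 64
#define BLOCK_SIZE  4096

int main(void) {
    int fd = open("testfile.bin", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    if (io_setup(QUEUE_DEPTH, &ctx) < 0) { perror("io_setup"); return 1; }

    struct iocb cbs[QUEUE_DEPTH];
    struct iocb *cbps[QUEUE_DEPTH];
    void *bufs[QUEUE_DEPTH];

    /* Prepare QUEUE_DEPTH reads, each against a different block. */
    for (int i = 0; i < QUEUE_DEPTH; i++) {
        posix_memalign(&bufs[i], BLOCK_SIZE, BLOCK_SIZE);
        io_prep_pread(&cbs[i], fd, bufs[i], BLOCK_SIZE, (long long)i * BLOCK_SIZE);
        cbps[i] = &cbs[i];
    }

    /* Submit the whole batch in one call; all 64 commands are now in flight. */
    int submitted = io_submit(ctx, QUEUE_DEPTH, cbps);
    if (submitted < 1) { fprintf(stderr, "io_submit failed\n"); return 1; }
    printf("submitted %d reads\n", submitted);

    /* Reap completions as the device finishes them. */
    struct io_event events[QUEUE_DEPTH];
    int done = io_getevents(ctx, submitted, QUEUE_DEPTH, events, NULL);
    printf("completed %d reads\n", done);

    for (int i = 0; i < QUEUE_DEPTH; i++)
        free(bufs[i]);
    io_destroy(ctx);
    close(fd);
    return 0;
}
```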
Of course, at some point the storage system needs to send some of its data to a connecting server. While today that network is plenty fast from a bandwidth perspective, it is not as efficient as it could be. For the most part, both Fibre Channel and IP-based storage networks are burdened by SCSI overhead.
NVMe Over Fabrics is designed to eliminate that SCSI overhead. It takes the NVMe protocol and enables it to execute across a network. Several standards efforts are underway for both Fibre Channel and IP storage networks to leverage the protocol. Within a couple of years, the storage network should be able to communicate with flash as quickly as if the flash were installed inside the server.
What’s After Flash?
The industry did not stop innovating once flash came to market; the next step in storage is non-volatile DRAM (NVRAM). DRAM, when compared with flash, has one big advantage, write I/O performance, but it also has one big shortcoming, a lack of persistence. When power is removed from DRAM, data is lost. Persistent DRAM technologies solve that problem; they deliver the performance of DRAM with the persistence of flash.
NVRAM is available today. The problem is cost; it is far more expensive than flash. Right now an increasing number of flash systems use a very small amount of NVRAM as a write cache to ensure data integrity. Over time the size of the NVRAM component will increase so it can be used to cache all inbound writes and coalesce them prior to writing to flash.
The result should be a significant improvement in a system's write performance and an increased lifespan for the flash media. We may also see NVRAM installed in servers. Imagine a server that, if power fails, merely goes to sleep like a laptop instead of going down.
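The coalescing idea is simple enough to sketch. The toy example below is not a real controller design and all of its names are hypothetical: writes land in a small "NVRAM" buffer, repeat writes to the same block are merged in place, and the buffer is destaged to flash as one batch, so ten incoming writes can become just two flash writes.

```c
/*
 * Toy illustration of NVRAM write coalescing (not a real controller design):
 * incoming writes land in a small persistent buffer, repeat writes to the
 * same block are merged in place, and the buffer is flushed to flash as one
 * batch. All types and functions here are hypothetical.
 */
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE   4096
#define CACHE_BLOCKS 8            /* pretend the NVRAM holds 8 blocks */

struct cache_entry {
    long block;                   /* logical block address */
    char data[BLOCK_SIZE];
};

static struct cache_entry nvram[CACHE_BLOCKS];
static int used;

static void flush_to_flash(void) {
    /* In a real system this would be one large sequential write to flash. */
    printf("flushing %d coalesced blocks to flash\n", used);
    used = 0;
}

/* Accept a write into the NVRAM cache, coalescing rewrites of the same block. */
static void cached_write(long block, const char *data) {
    for (int i = 0; i < used; i++) {
        if (nvram[i].block == block) {          /* overwrite in place: coalesced */
            memcpy(nvram[i].data, data, BLOCK_SIZE);
            return;
        }
    }
    if (used == CACHE_BLOCKS)                   /* cache full: destage to flash */
        flush_to_flash();
    nvram[used].block = block;
    memcpy(nvram[used].data, data, BLOCK_SIZE);
    used++;
}

int main(void) {
    char buf[BLOCK_SIZE] = {0};
    /* Ten writes, but only two distinct blocks reach flash when we flush. */
    for (int i = 0; i < 10; i++)
        cached_write(i % 2, buf);
    flush_to_flash();
    return 0;
}
```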
StorageSwiss Take
The focus of persistent memory has always been to improve the performance of the data center so it can process more data more quickly. For that progress to continue, storage systems now need to play catch-up so that flash and other forms of persistent memory can reach their full performance.
The place to learn about persistent memory is the Flash Memory Summit in Santa Clara, CA, August 8 to 10. It is one of the most educational events of its type. Whether you are developing the next great flash technology or you are a data center manager looking to understand how to best leverage flash, there are tracks for you.
Loved the piece, George.
The dream, of course, is that applications will read and write shared persistent storage in the same way they would a shared data structure in memory today…eliminating the storage stack entirely from the performance path.
This is a data plane / control plane split in storage much like the one networking went through roughly 20 years ago. In effect, I open the file (or object) and get a memory address pointer to it; I read and write inline from user space; when I close the file my address mapping and access rights are revoked.
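Today's closest analogue to that model is memory-mapping a file. The sketch below (the file name is a placeholder) opens a file, gets a pointer to it, updates it in place from user space, makes the change durable, and then tears the mapping down; on true persistent memory the msync step would become a CPU cache flush rather than an I/O.

```c
/*
 * Sketch of the "open, get a pointer, read/write inline, close" access model
 * using today's mmap interface. state.bin is a placeholder file name.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096

int main(void) {
    int fd = open("state.bin", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, REGION_SIZE) < 0) { perror("ftruncate"); return 1; }

    /* "Open the file and get a memory address pointer to it." */
    char *region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    if (region == MAP_FAILED) { perror("mmap"); return 1; }

    /* Read and write inline from user space, with no read()/write() calls. */
    strcpy(region, "hello, persistent memory");

    /* Make the update durable, then revoke the mapping on "close". */
    msync(region, REGION_SIZE, MS_SYNC);
    munmap(region, REGION_SIZE);
    close(fd);
    return 0;
}
```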
Reality is hard: this is not only a lot of storage software (and hardware tables sufficient to represent an interestingly large number of simultaneous objects/files/regions); sharing has some synchronization and atomicity behaviors which might puzzle an application writer used to using traditional storage stacks (see SNIA persistent memory papers); and of course applications must be ported and probably rewritten to achieve optimal performance.
We’re in for an interesting ride the next 20 years!