Most applications could benefit from a faster infrastructure. Storage performance, especially IOPS, has been routinely identified as a primary impediment to improving compute infrastructure speed. Not surprisingly, customers are turning to Flash Solid State Devices (SSDs) as a solution to this IOPS problem, and caching is emerging as an efficient method for implementing flash SSDs.
Host server-based caching is an excellent option, as it brings the cache closer to application processors and can simplify implementation, compared with network attached caching appliances or storage-based cache areas. While the use of flash as a host server-side cache is becoming more common, NAND flash as a storage medium does have some challenges when used in these environments, due mainly to the high turnover of data that caching produces.
Flash controllers must do a lot under the covers to make flash a viable technology for traditional storage implementations. This can mean overhead from aligning writes to flash blocks, managing wear leveling, minimizing write amplification, and running the ‘garbage collection’ process. When solid state storage is used as a static storage area, most enterprise flash controllers are able to keep up with these tasks. But the high data turnover that’s common in caching applications can adversely impact SSD performance and reduce life expectancy. Marvell DragonFly is a server-based caching solution (a turn-key appliance on a PCIe card) that has an answer to these challenges.
What Is Marvell DragonFly?
The DragonFly Virtual Storage Accelerator (VSA) is a full- or half-height, half-length PCIe 2.0 x8 card that utilizes a two-level cache consisting of DRAM and NAND flash storage. It contains up to 8GB of DDR3 DRAM, ultracapacitor-protected to back up to 32GB of on-board SLC flash. It supports 2.5- and 3.5-inch SATA or SAS drives, SLC or MLC, at both 6Gb/s and 3Gb/s. Host OS support includes major Linux distributions running XenServer or KVM, VMware ESX and ESXi, and MS Windows Server with Hyper-V.
This two-level intelligent cache architecture provides several key benefits. The 8GB of L1 DRAM cache can support many times that capacity of L2 NAND flash storage, allowing the cache to scale by adding more low-cost MLC SSDs to the server. This provides for a cost-effective, incremental expansion of IOPS performance as applications and workloads scale over time. And as the working data set becomes more active, spatial and temporal locality increase the chances of a cache hit.
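To make the two-tier idea concrete, here is a minimal Python sketch of a generic two-level LRU cache. It is not Marvell's implementation; the class name `TwoLevelCache`, its parameters, and the promote/demote policy are all assumptions chosen for illustration. A small fast tier stands in for DRAM, a larger tier stands in for flash SSD, and entries evicted from the first are demoted to the second rather than discarded.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Toy two-tier LRU cache (hypothetical, for illustration only)."""

    def __init__(self, l1_capacity, l2_capacity):
        self.l1 = OrderedDict()  # stands in for the DRAM tier
        self.l2 = OrderedDict()  # stands in for the flash SSD tier
        self.l1_capacity = l1_capacity
        self.l2_capacity = l2_capacity

    def _put_l1(self, key, value):
        self.l1[key] = value
        self.l1.move_to_end(key)
        if len(self.l1) > self.l1_capacity:
            # Demote the least recently used L1 entry to L2.
            old_key, old_value = self.l1.popitem(last=False)
            self.l2[old_key] = old_value
            self.l2.move_to_end(old_key)
            if len(self.l2) > self.l2_capacity:
                self.l2.popitem(last=False)  # drop the coldest L2 entry

    def get(self, key, backing_store):
        """Return (value, source), where source is 'l1', 'l2', or 'miss'."""
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key], "l1"
        if key in self.l2:
            value = self.l2.pop(key)  # promote a warm entry back to L1
            self._put_l1(key, value)
            return value, "l2"
        value = backing_store[key]  # slow path: fetch from primary storage
        self._put_l1(key, value)
        return value, "miss"
```

In this sketch, enlarging `l2_capacity` raises the share of requests answered from cache without touching the L1 tier, which mirrors the scaling argument made above.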
Marvell’s dual-level cache structure is also unique in that it operates across both reads and writes. While most conventional caches are read-only (write-through), Marvell offers the advantages of both an intelligent read cache and a write-back cache across a hybrid dual-stage (DRAM + SSD) cache. More importantly, DragonFly is a turn-key card solution with no additional cache software or network appliance to install. Just plug the DragonFly card into any commodity rack-mounted server and configure an ultra-thin filter driver.
The Challenges With Flash
The DRAM tier improves the overall performance of the read/write cache, delivering higher IOPS and much lower application response times than SSDs alone. In addition, a dual-stage approach addresses some of the shortcomings of flash technology in the write-back cache use case. Flash devices write in full block increments; they can’t simply overwrite data at the bit level as magnetic storage can. They also support a finite number of these write cycles (called program/erase or “P/E” cycles) before error rates increase significantly, similar to the way modern battery technologies degrade rapidly after they’ve reached their charge/recharge thresholds. These characteristics require special routines within the flash memory controllers.
If no empty flash cells are available, others must be erased in full blocks ahead of each write operation in a process called “garbage collection”. This pre-erasing of available flash cells so that active writes don’t have to wait for a full P/E cycle creates an overhead burden that reduces overall write performance. In another process called “wear leveling” the controller rotates the cells chosen for each write operation to ensure that the finite P/E cycle count that each NAND cell has is being consumed evenly. This process can result in another hidden cost called “write amplification” in which blocks of data are moved around (and rewritten) within these NAND cells.
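Write amplification is usually expressed as the ratio of data physically written to flash to data the host asked to write. The Python sketch below illustrates that ratio under a deliberately simple worst-case assumption: every block an update touches is rewritten in full, and updates are block-aligned. The function names and the block size used in the note that follows are hypothetical, not taken from any particular controller.

```python
def write_amplification(host_bytes, flash_bytes):
    """Ratio of bytes physically written to flash to bytes the host wrote."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return flash_bytes / host_bytes


def flash_bytes_for_update(update_bytes, block_bytes):
    """Worst-case model (an assumption): each touched block is rewritten whole."""
    blocks_touched = -(-update_bytes // block_bytes)  # ceiling division
    return blocks_touched * block_bytes
```

Under this model, a 4 KiB update landing in a 256 KiB block gives a write amplification of 64, while a write that fills the whole block gives 1. Real controllers fall somewhere in between, depending on how much free space and buffering they have to work with.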
The Write Cliff
Until SSDs have been completely filled for the first time, they won’t incur any of the garbage collection cycles mentioned above. Once they fill completely, garbage collection and wear leveling processes begin to run in conjunction with write operations. At this point the drives exhibit a drastic increase in latency and reduction in IOPS, referred to as a “write cliff”. In a recent test run with an Intel 320 160GB MLC SSD, DragonFly largely eliminated this write cliff issue, increasing average IOPS 22x and reducing average latency 33x (see graph below). This means that the performance impact of garbage collection and wear leveling can be largely removed with the Marvell caching solution.
Challenges Of A Flash Cache
Caching, even in read-only use cases, creates high levels of write activity. Data is moved into (write) and out of (erase) the cache storage area much more frequently than is the case with a more static SSD storage tier, increasing the data “turnover rate”. This high volume of erase/write traffic is not an issue for DRAM, since it doesn’t have the wear issues mentioned earlier, but it can be a major problem for flash SSDs, and in particular for consumer-grade MLC NAND SSDs.
Using flash in a caching environment can result in SSDs wearing out much sooner than expected, as the NAND flash substrate reaches its limit of program/erase cycles, even with effective wear leveling processes. Also, performance degradation can occur as flash controllers struggle to support constant garbage collection cycles for this increased write activity. With MLC NAND SSDs comprising more than 95% of the market and rapidly growing in enterprise adoption, this presents a conundrum for enterprise users – how to achieve consistently high performance for 3-5 years while using low-cost MLC SSDs?
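A back-of-the-envelope calculation shows why write volume and write amplification matter so much for the 3-5 year question. The Python sketch below assumes ideal wear leveling, so that every cell reaches its P/E limit at the same time. The function name and all input values mentioned afterward are hypothetical illustrations, not measurements or vendor specifications.

```python
def estimated_life_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                         write_amplification):
    """Rough SSD lifetime estimate, assuming perfectly even wear (an idealization)."""
    # Total data the flash can absorb before reaching its P/E limit.
    total_flash_writes_gb = capacity_gb * pe_cycles
    # Data actually written to flash per day, after amplification.
    flash_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_flash_writes_gb / flash_writes_gb_per_day / 365
```

With assumed inputs of a 160GB drive rated for 3,000 P/E cycles and a cache churning 500GB of writes per day, a write amplification of 4 yields roughly 0.66 years, while a write amplification of 1.2 yields roughly 2.2 years. The absolute figures are only as good as the assumptions, but the proportional effect of lowering write amplification holds.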
Marvell’s HyperScale embedded cache technology leverages an intelligent write caching mechanism to address the inherent shortcomings of NAND flash in a caching environment. Data is written first to DRAM, which then de-stages blocks to SSD using a proprietary log-structured buffering mechanism. This process assembles writes into ‘flash-aware’ block sizes, which reduces the wasteful overwriting that occurs when writes are not made in full block increments. It also leverages a partial store-order coalescing algorithm to re-order individual random writes and sequentialize them into larger stripes. This can dramatically reduce the frequency of garbage collection and the overhead of wear leveling on flash controllers. The result is a significant improvement in SSD write performance and endurance – making consumer-grade MLC SSDs behave like SLC in terms of performance and longevity.
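Marvell's buffering mechanism is proprietary, so the following Python sketch shows only the general idea of write coalescing, not the actual algorithm. The class name `CoalescingWriteBuffer`, the `stripe_blocks` parameter, and the `flush_fn` callback are assumptions made for illustration. Random writes accumulate in a DRAM-like buffer, overwrites of the same block are absorbed in place, and the buffer is de-staged as one address-ordered stripe once enough blocks have gathered.

```python
class CoalescingWriteBuffer:
    """Toy write buffer that de-stages full, sorted stripes (hypothetical)."""

    def __init__(self, stripe_blocks, flush_fn):
        self.stripe_blocks = stripe_blocks  # blocks per de-staged stripe
        self.flush_fn = flush_fn            # called with a list of (lba, data)
        self.pending = {}                   # lba -> most recent data

    def write(self, lba, data):
        # An overwrite of a buffered block replaces it in the buffer,
        # so only the latest version is ever sent to flash.
        self.pending[lba] = data
        if len(self.pending) >= self.stripe_blocks:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # Order by logical block address to turn random writes
        # into a single sequential stripe.
        stripe = sorted(self.pending.items())
        self.pending.clear()
        self.flush_fn(stripe)
```

Feeding this buffer five random writes, two of which target the same block, produces a single stripe of four blocks in ascending address order. The flash device sees one large sequential write instead of five small scattered ones.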
High Availability & Data Protection
To protect against a failure on the DragonFly card, in the SSDs, or at any other point in the host server, the VSA maintains a synchronous mirror on another application server in the environment. Using a low-latency protocol over TCP/IP, this peer-to-peer mirroring feature can provide the high-availability (HA) reliability needed to use the DragonFly in mission-critical applications. In addition, to protect data that has not been committed to flash in the event of a power failure, the built-in ‘supercapacitor’ can provide power during a flush of the DRAM L1 cache to non-volatile SSD.
Marvell’s write-through (read) cache can distinguish high-activity data sets at the sub-VM level, both block and file, to create a more granular I/O temperature map. Configured as an alternative to the default write-back mode, write-through operates as a read cache so that only random reads benefit, producing a 10x or higher improvement in random read performance.
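The difference between the two modes comes down to when a write reaches primary storage. The Python sketch below is a generic illustration of write-through versus write-back behavior, not DragonFly's driver logic; the class `PolicyCache` and its methods are hypothetical names.

```python
class PolicyCache:
    """Toy cache contrasting write-through and write-back (hypothetical)."""

    def __init__(self, backing_store, write_back):
        self.backing_store = backing_store  # stands in for primary storage
        self.write_back = write_back
        self.cache = {}
        self.dirty = set()  # keys written to cache but not yet to storage

    def write(self, key, value):
        self.cache[key] = value
        if self.write_back:
            # Acknowledge immediately; storage is updated later by flush().
            self.dirty.add(key)
        else:
            # Write-through: storage is updated before the write completes.
            self.backing_store[key] = value

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.backing_store[key]
        return self.cache[key]

    def flush(self):
        """De-stage all dirty entries to the backing store."""
        for key in self.dirty:
            self.backing_store[key] = self.cache[key]
        self.dirty.clear()
```

In write-through mode the backing store is never behind the cache, so only reads are accelerated. In write-back mode writes complete at cache speed, but dirty data exists only in the cache until it is flushed, which is why mirroring and power-loss protection matter for that mode.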
The DragonFly is an ‘appliance on a card’, a self-contained system that doesn’t use any appreciable host CPU resources and requires only a thin kernel filter driver. This means it doesn’t impact application performance, and it’s also non-disruptive: there are no application changes or complicated tiering software to integrate. Moreover, it’s easy to port across multiple operating systems and releases compared to cache software, which can require time-consuming application re-writes across many different OSs (Windows ’03, ’08, VMware ESX, ESXi, Linux (many distributions) with KVM or Xen, OpenSolaris, Solaris, FreeBSD, etc.). It’s also storage agnostic, meaning it can support all network-attached and locally-attached storage, including DAS (SCSI), SAN (iSCSI, FCP, FCoE), or NAS (NFS v3, v4) protocols.
Storage Swiss Take
Cache is king, at least in compute environments that need more storage performance and are turning to flash SSDs. But NAND flash has some characteristics that need to be accounted for, if this implementation is to live up to expectations. Caching creates potentially enormous amounts of I/O traffic, something that slows NAND flash devices down and can greatly impact their endurance.
Marvell’s two-tier cache design, using DRAM and SSD, enables a larger effective cache capacity and increases cache effectiveness. The use of a log-structured write buffer eliminates much of the overhead generated by high cache IOPS, reducing performance issues and improving flash wear life. With the DragonFly, Marvell has addressed the major challenges to implementing server-side caching with flash SSDs.
Marvell is a client of Storage Switzerland.