In an earlier column we talked about the world’s insatiable appetite for storage and the gap between projected demand and the supply of that capacity. In response, disk drive companies are thinking beyond the drive, to the array and system levels, to support the innovation needed to close this gap. Appropriately, that innovation still involves disk drives: products like Seagate’s Kinetic Open Storage Platform should help drive makers move innovation up the storage stack.
Seagate’s Kinetic Open Storage Platform consists of a new class of hard disk drives that communicate with applications over Ethernet, using a key-value architecture to access data objects directly, via RESTful APIs. This is the same way an object storage system works, a process that’s analogous to using a file tagging system to ‘jump’ to a file instead of scrolling through a directory tree to find it.
These key-value architectures are ideally suited for data that doesn’t change much. But when it does change, the process is to read the entire object, modify it and then overwrite the original object. In contrast, traditional POSIX-compliant file systems are designed to support a more granular modification process within files that only touches the bytes or blocks that actually change.
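To make the contrast concrete, here is a minimal sketch of the two update patterns. It is not the Kinetic API: ToyKeyValueStore, update_object and update_file_in_place are hypothetical names invented for illustration, the store is just an in-memory dictionary, and the object size and patch are made-up values. It only counts how many bytes each approach has to write to change four bytes in a 1,000-byte object.

```python
import os
import tempfile


class ToyKeyValueStore:
    """Hypothetical in-memory stand-in for a key-value device (not the Kinetic API)."""

    def __init__(self):
        self._objects = {}
        self.bytes_written = 0

    def put(self, key, value):
        # A PUT always replaces the whole object.
        self._objects[key] = bytes(value)
        self.bytes_written += len(value)

    def get(self, key):
        return self._objects[key]


def update_object(store, key, offset, patch):
    """Key-value update: read the entire object, modify it, overwrite the original."""
    data = bytearray(store.get(key))
    data[offset:offset + len(patch)] = patch
    store.put(key, data)


def update_file_in_place(path, offset, patch):
    """POSIX-style update: seek to the offset and write only the changed bytes."""
    with open(path, "r+b") as f:
        f.seek(offset)
        return f.write(patch)


original = b"A" * 1000  # illustrative object contents
patch = b"ZZZZ"         # four bytes to change at offset 500

store = ToyKeyValueStore()
store.put("report", original)
store.bytes_written = 0  # count only the update, not the initial load
update_object(store, "report", 500, patch)
kv_bytes = store.bytes_written

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(original)
    file_bytes = update_file_in_place(path, 500, patch)
    with open(path, "rb") as f:
        file_result = f.read()
finally:
    os.remove(path)

print(kv_bytes, file_bytes)  # 1000 4
```

Both paths end with identical contents, but the key-value path rewrites the whole object. That is a reasonable trade when objects rarely change, and a poor one for data that is edited in small pieces all the time.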
The net of this read/write (GET/PUT) access pattern is simplification. The key-value architecture generates less metadata for each storage system transaction, reducing processing overhead while simplifying the entire storage stack. The benefits for storage systems design are dramatic.
This approach fits well with the move towards object-based storage systems, which cloud providers have adopted to accommodate very large sets of digital content that look fundamentally different from the data traditionally stored on large file systems. When object storage systems can talk directly to disk drives, the storage stack can be simplified, driving out complexity in the system and sometimes driving out the storage controller or storage server altogether. This is exactly the kind of innovation that the cloud providers are pushing for because it drives down their costs on a number of levels.
Eliminating the storage server layer obviously reduces CAPEX, but it also increases storage density in the rack. This, in turn, reduces OPEX, since power, cooling and rack space requirements decrease as well. Administration overhead goes down too: there are fewer systems to manage, and these new drives also require less ‘care and feeding’.
Because the data on these drives is organized at the object level, data failures are handled very differently, and a localized fault is much less likely to take a whole drive out of service. With a traditional RAID system, when a section of a drive is corrupted, the only solution is to rebuild that entire drive on a spare. With object-based architectures, only the objects that actually contain the corrupted data need to be recreated, dramatically shortening the rebuild time and reducing the number of drives that need to be physically replaced.
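The difference in rebuild work can be sketched in a few lines. This is a toy model under stated assumptions, not how any real RAID controller or object store implements recovery: the function names are hypothetical, the "drive" and its healthy copy are plain dictionaries, and the object count and sizes are invented purely for illustration.

```python
def whole_drive_rebuild_bytes(drive):
    """RAID-style recovery as described above: everything on the drive is rebuilt."""
    return sum(len(value) for value in drive.values())


def object_level_repair(drive, replica, corrupted_keys):
    """Recreate only the corrupted objects from a healthy copy; return bytes copied."""
    copied = 0
    for key in corrupted_keys:
        drive[key] = replica[key]
        copied += len(replica[key])
    return copied


# Hypothetical data: 100 objects of 10 bytes each, with a healthy copy elsewhere.
replica = {f"obj-{i}": bytes([i]) * 10 for i in range(100)}
drive = dict(replica)

# Simulate corruption of two objects on the drive.
corrupted = ["obj-7", "obj-42"]
for key in corrupted:
    drive[key] = b"\x00" * 10

full_cost = whole_drive_rebuild_bytes(drive)
repair_cost = object_level_repair(drive, replica, corrupted)

print(full_cost, repair_cost)  # 1000 20
```

In this made-up case the object-level repair copies 20 bytes where a full rebuild would copy 1,000. The real ratio depends entirely on how much of a drive is affected, but the shape of the saving is the same.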
Between the file system, the storage system, the OS and the application, the amount of metadata created to do basic data handling can be extensive. Seagate claims that processing this metadata can consume many times the number of CPU cycles needed to actually move the ‘payload’ data itself. As described above, the key-value architecture used by Kinetic drives and object storage systems can reduce this metadata and its processing overhead, resulting in better performance.
With the ability to access drives directly, applications can also lay out data more intelligently, creating more sequential read operations that speed up throughput. While object storage systems aren’t necessarily designed for high performance use cases, improving efficiency is always a good thing, and can lead to lower cost operation.
The change occurring in the disk drive industry is dramatic. We’re seeing flash replace the high-performance drives in enterprise arrays, and object storage systems enable the enormous data sets that cloud providers are accumulating. These object systems could be set to replace NAS. With the bulk of drive vendors’ non-consumer volume going into these clouds, making the disk drive talk object storage seems like an excellent strategy. It also supports the storage stack consolidation that Seagate and other drive vendors are moving towards.