VM Aware vs. ZFS Storage

Server and desktop virtualization has given IT the flexibility to respond rapidly to the needs of the business while reducing costs and improving efficiency. These abilities, though, have created a greater challenge for the supporting storage infrastructure, making it more complex and costly. These increases threaten to eat away at the ROI created by virtualization.

In response, new storage systems have emerged claiming to be optimized for virtual environments. For some storage vendors this translates into adding solid state disks (SSDs) to a general purpose file system like ZFS. While adding SSDs can certainly improve the responsiveness of a storage system supporting virtualization, SSDs alone do not eliminate tuning complexity, nor do they maximize performance.

Alternatively, there are storage systems now available whose underlying architecture is designed specifically for the virtual environment. Most of these systems not only leverage SSDs for performance but also provide direct integration with virtualized infrastructures. These VM Aware storage systems are often compared to ZFS-based systems when data centers look to improve the responsiveness of the storage infrastructure while curtailing costs.
What is ZFS Storage?

ZFS is a combined logical volume manager and file system, originally developed by Sun Microsystems (now Oracle) and released as open source software. ZFS has an impressive array of features, including end-to-end data integrity checking, software RAID, volume management, snapshots, and copy-on-write clones.

It provides general purpose CIFS and NFS file share capability, while block storage protocol support, typically iSCSI and occasionally Fibre Channel, is layered on top through emulated block devices known as zvols. This is an important point of note. Block I/O has to pass through an additional translation layer before data is read or written: first the block protocol itself (iSCSI or Fibre Channel), and then the copy-on-write file system machinery underneath the zvol. Because ZFS was designed first and foremost as a file system serving file protocols, these block protocols tend to be less efficiently implemented and incur additional overhead.
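To make the distinction concrete, the following illustrative command sequence shows the two paths side by side (the pool name `tank` and dataset names are assumptions for the example, not from any particular product): a filesystem dataset served over NFS is ZFS's native path, while block storage requires creating a zvol, an emulated disk device, which an iSCSI or Fibre Channel target then exports.

```shell
# Native file path: a filesystem dataset shared directly over NFS
zfs create tank/shares
zfs set sharenfs=on tank/shares

# Block path: a 100GB zvol, an emulated block device that an
# iSCSI or Fibre Channel target must then export to hosts
zfs create -V 100G tank/vm_lun
```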

Important for a discussion of improving storage performance in a virtualized server or desktop environment, ZFS also includes the ability to leverage SSDs as a cache in front of hard disk storage. As data becomes active it is copied into the SSD cache, and as it becomes less active it is evicted, with the authoritative copy always remaining on hard disk. This means that SSDs can be installed in a storage system and leveraged with good results.
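The caching behavior described above can be sketched as a toy model. This is a simplified illustration of the general promote-on-read, evict-when-cold pattern, not ZFS's actual ARC/L2ARC implementation; the class and attribute names are invented for the example.

```python
from collections import OrderedDict

class TieredStore:
    """Toy model of an SSD read cache in front of hard disks.

    Hot blocks are copied into a fixed-size SSD cache on access; the
    least recently used block is evicted as the cache fills. As with a
    real read cache, the authoritative copy always stays on disk --
    the SSD holds only copies.
    """

    def __init__(self, ssd_capacity):
        self.disk = {}                # block id -> data (authoritative)
        self.ssd = OrderedDict()      # block id -> data (cached copies)
        self.ssd_capacity = ssd_capacity
        self.ssd_hits = 0
        self.disk_reads = 0

    def write(self, block, data):
        self.disk[block] = data       # writes land on disk
        self.ssd.pop(block, None)     # invalidate any stale cached copy

    def read(self, block):
        if block in self.ssd:         # SSD hit: the fast path
            self.ssd.move_to_end(block)
            self.ssd_hits += 1
            return self.ssd[block]
        self.disk_reads += 1          # miss: read from disk, then cache
        data = self.disk[block]
        self.ssd[block] = data
        if len(self.ssd) > self.ssd_capacity:
            self.ssd.popitem(last=False)  # evict the coldest block
        return data
```

Repeated reads of the same working set are served from the SSD side of the model, while cold blocks fall back to disk, which is the effect the vendors are counting on when they add flash to a ZFS system.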

The availability of an open source file system that can leverage SSD beyond just simple integration has made ZFS an attractive foundational building block for many storage system builders. These companies hope to capture the attention of data centers struggling with the storage supporting their virtual infrastructure. They are also hoping to be able to move into the market quickly by leveraging ZFS as the foundation of their platform.

Some ZFS-based storage vendors strive to add value to their systems, and interestingly the most common feature they lean on is data deduplication. While ZFS itself includes deduplication and compression, its deduplication is widely perceived as unsuitable for primary storage, and that is the gap these vendors try to fill. A few other vendors have optimized the ZFS caching technology, but they are still limited to the constraints of what ZFS provides access to.

When it comes to VM integration, most ZFS-based systems provide little, if any, specific support for virtualized environments. As stated earlier, the notion that these systems are ideal for the virtual environment is based primarily on their generic support for SSDs.

The key challenge is that ZFS is, at the end of the day, a general purpose file system designed to power a NAS device serving file shares, not designed for the highly random workloads common in virtual environments. For example, when most ZFS-based systems perform a clone or snapshot, the operation applies to the entire volume, not to individual VMs. In fact, without the addition of third party software or specific hardware network adapters, these systems typically provide little ability to customize performance at the individual VM level.
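The granularity difference can be shown with a toy model of a datastore volume holding several VM disk images. This is purely illustrative, not how either file system is implemented internally; the class and VM names are invented for the example.

```python
class Volume:
    """Toy datastore: a volume containing several VMs' disk blocks."""

    def __init__(self):
        self.vms = {}  # vm name -> {block id: data}

    def snapshot_volume(self):
        """Volume-level granularity: one snapshot spanning every VM in
        the volume, whether you wanted the others captured or not."""
        return {vm: dict(blocks) for vm, blocks in self.vms.items()}

    def snapshot_vm(self, vm):
        """VM-level granularity: a snapshot scoped to one VM's blocks,
        leaving the rest of the volume untouched."""
        return {vm: dict(self.vms[vm])}
```

With volume-level snapshots, recovering or cloning a single VM means carrying (or sifting through) the state of every other VM that happens to share the volume; a per-VM snapshot captures only the machine the administrator cares about.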

These systems depend on ZFS at their core and do not control its intellectual property. All a vendor can do is add higher level functions like deduplication, and as stated earlier many vendors make no enhancements to the ZFS file system at all; they simply package it with SSDs and claim they are ready for the virtual environment.
What is VM Aware Storage?

VM Aware storage is a storage design concept from companies like Tintri. These systems and their associated file systems are designed specifically to support a virtual infrastructure. They interact directly with both the physical server and, more importantly, the VMs on that server.

These systems also include support for solid state storage, but unlike ZFS, flash memory support was built into the design of VM Aware storage systems from day one; it was not a bolted-on capability. The combination of flash being integrated into the system design and the focus on virtualized platforms should lead to more efficient use of premium-priced SSD capacity. Essentially, the user should see their applications run almost entirely from flash, with very little I/O going to hard disks. For VDI this means ultrabook-level performance, and for databases, sub-millisecond latency for small random I/Os.

As stated earlier, support for SSDs is just one design element in creating a storage system whose goals are to improve overall virtualization performance, increase VM-to-host density and reduce storage footprint. A key component of VM Aware storage is a granular understanding of what is being stored on it. Essentially, the system understands the characteristics of each VM and knows which I/Os belong to which VMs. This allows it to take granular, per-VM snapshots and clones, which save space and reduce performance bottlenecks.

Beyond basic data services like snapshots and clones, VM Aware storage systems provide visibility into the specific VMs being hosted. The storage administrator can “see” the I/O workload of each VM and how the VMs are impacting each other. VM Aware systems like Tintri’s take this a step further by providing the ability to assign QoS on a per-VM basis, even going so far as to “pin” a VM into the SSD cache.
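A minimal sketch of what per-VM QoS looks like in practice: each VM gets its own I/O budget per scheduling interval, so a noisy neighbor is throttled without touching anyone else, and a VM can be flagged as pinned to flash. The class, method names and policy here are invented for illustration; they are not Tintri's actual API or scheduler.

```python
class VMQos:
    """Toy per-VM QoS admission control.

    Each VM may be given an IOPS budget per interval; VMs without a
    limit are unthrottled. A VM can also be 'pinned', marking its
    working set to be kept in the SSD cache.
    """

    def __init__(self):
        self.iops_limit = {}  # vm -> max I/Os admitted per interval
        self.pinned = set()   # vms whose blocks should stay in flash
        self.used = {}        # vm -> I/Os consumed this interval

    def set_limit(self, vm, iops):
        self.iops_limit[vm] = iops

    def pin(self, vm):
        self.pinned.add(vm)

    def admit(self, vm):
        """Admit an I/O only if this VM is under its own budget."""
        used = self.used.get(vm, 0)
        if used >= self.iops_limit.get(vm, float("inf")):
            return False      # throttle this VM, not its neighbors
        self.used[vm] = used + 1
        return True

    def new_interval(self):
        self.used.clear()     # budgets refill every interval
```

The point of the design is the keying: because the system knows which I/Os belong to which VM, the limit is enforced per VM rather than per volume, which is exactly what a volume-granular system cannot offer.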

The challenge with VM Aware storage is that it requires the developer to spend extra time building the software behind the system. While this is potentially a longer road to market, the benefit of the approach is that the developer is not shackled to a system designed and implemented long before virtualization and flash memory were widely adopted. The advantage to the user is a system that is easier to use and tune, and that as a result delivers better performance to application owners.

When selecting a storage system designed specifically to address the challenges of a virtualized server or desktop infrastructure, the new crop of storage systems deserves careful examination. Both legacy storage systems and systems from startups claim VM readiness, but in reality these capabilities are often simple generic upgrades, like SSDs and caching, or bolt-on features. In many cases these systems are still general purpose in nature and are not designed specifically for the virtualized environment.

IT professionals should compare these systems to systems that were designed from the ground up to support the virtual environment. They should find that VM Aware systems can do more with less, which leads to better performance at a lower total cost of ownership.

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.
