When effectively implemented, flash storage technologies can help remedy storage IO contention issues and deliver a performance benefit to virtualized applications. It is important, however, for IT planners to find ways of augmenting virtual machine (VM) performance without discarding their investment in existing storage systems.
Additionally, it is equally important to implement solutions that can seamlessly integrate into the virtualized infrastructure, are unobtrusive to application workloads and provide the high levels of data availability and resiliency that today’s 24x7xforever environments demand; all while significantly enhancing application read and write IO performance. The question is, are all of these storage performance and resiliency attributes attainable without breaking the bank?
Increasingly, businesses are turning to flash to solve what has become an acute pain point in many virtualized server environments – poor storage performance. Simply put, conventional storage systems cannot serve up IO requests fast enough when hundreds or thousands of VMs simultaneously contend for storage IO resources – a phenomenon often referred to as “the storage IO blender”.
Perhaps the most fundamental challenge with conventional storage, however, is that adding storage performance often requires adding more storage capacity. This drives up costs and, at times, delivers only a temporary performance benefit.
There are several different architectural approaches for addressing the IO demands of virtualized business applications. Let’s examine each of them and see how well they align with the storage objectives listed above.
Sweeping The Floor With Flash
One segment of the storage market that has received a lot of attention is the all-flash array market space. These suppliers promote the use of 100% flash storage capacity as a way to guarantee consistently strong application performance. Since all application data resides on flash storage, administrators don’t have to constantly tune and monitor critical virtualized applications for performance. But the strength of an all-flash deployment is also its Achilles heel. Inevitably, some of the flash capacity will be wasted on VMs that can’t exploit, or simply don’t need, that level of performance. Data centers that can afford all-flash may be well served by it, but from a budgetary perspective it is a less than optimal way to accelerate performance, since many all-flash architectures still require buying capacity to get performance; consequently, not all data centers can justify it.
Hybrid Enabled Performance
Hybrid storage arrays, on the other hand, utilize a mix of flash storage and conventional hard disk drives (HDD). The array typically has some form of software intelligence that performs analytics on the data and then uses this information to determine which data sets should be promoted into the flash tier. In this manner, a smaller amount of flash capacity can be deployed in the array, helping to drive down the cost of the solution.
The challenge with this approach, however, is that if a critical application experiences a “cache miss” (when data is no longer available in the flash tier), performance can drop through the floor, significantly impacting the application and user/customer satisfaction. The situation is made all the worse if the non-flash tier is configured with high-density, low-RPM hard disk drives. From an end-user perspective, it can be the equivalent of going from a Formula One racecar on an empty track to a horse and buggy on a bumpy country dirt road.
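The tiering behavior described above can be illustrated with a minimal sketch of a frequency-based promotion policy. All class and method names here are hypothetical; real hybrid arrays use far more sophisticated heat-map analytics, but the basic idea – promote hot blocks to flash, serve cold blocks from HDD – is the same:

```python
from collections import Counter

class HybridTier:
    """Toy model of a hybrid array's promotion logic (illustrative only)."""

    def __init__(self, flash_capacity_blocks):
        self.flash_capacity = flash_capacity_blocks
        self.access_counts = Counter()   # per-block access frequency
        self.flash_tier = set()          # blocks currently resident on flash

    def read(self, block):
        self.access_counts[block] += 1
        if block in self.flash_tier:
            return "flash"               # fast path: sub-millisecond latency
        self._maybe_promote(block)
        return "hdd"                     # cache miss: slow, seek-bound path

    def _maybe_promote(self, block):
        if len(self.flash_tier) < self.flash_capacity:
            self.flash_tier.add(block)
        else:
            # evict the coldest flash-resident block only if the new one is hotter
            coldest = min(self.flash_tier, key=lambda b: self.access_counts[b])
            if self.access_counts[block] > self.access_counts[coldest]:
                self.flash_tier.discard(coldest)
                self.flash_tier.add(block)
```

The “cache miss” penalty the article warns about is visible here: any block that falls out of `flash_tier` is served at HDD speed until it accumulates enough accesses to be promoted again.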
Not So Hidden Costs
Another challenge with implementing all-flash and hybrid arrays is that they require switching to that particular vendor’s storage management services – snapshots, replication, thin provisioning, etc. This means re-training storage administrators on the vendor’s storage management interface and getting used to its naming conventions and nomenclature.
Even when an existing storage array can be outfitted with flash, these add-ons are typically expensive and, because the arrays are often not optimized for flash, the full potential of the resource is never realized. Flash at the array level can also push the IO bottleneck into the storage fabric: as storage IOPS increase, fabric switches may need to be upgraded to accommodate the additional IO traffic. In short, the costs of deploying array-based flash can quickly add up.
Converged storage systems are yet another way to bolster the performance of virtual machines. By pre-integrating storage, networking resources, servers and VM software into the same rack enclosure, converged storage vendors effectively provide a “datacenter in a box” solution. The value of converged storage technologies is that they are typically optimized and tuned for virtual server environments, which should reduce time to production. The idea is to make infrastructure deployments simpler and bring some measure of predictability to VM performance.
Some of these offerings, however, can be quite costly; particularly if they are from some of the well-known, larger suppliers.
But fundamentally, like all-flash and hybrid arrays, adopting converged storage systems essentially means abandoning the storage that’s already on the floor. What’s more, all of these systems still tie storage performance to storage capacity. In addition to the expense of investing in a new storage platform, there is the cost of migrating data from the old storage system to the new one and of converting to new storage management services. In many environments, the existing storage assets still have value; it’s merely a question of finding the right way to augment performance and get additional life out of them.
This need for a cost-effective way to augment storage performance calls for a solution that can work with existing storage assets while improving the performance of virtualized applications. Ideally, the solution would allow for the selective deployment of flash resources so that performance can be accelerated at the hypervisor level, rather than via a costly “shotgun” approach that accelerates all application workloads indiscriminately.
Rather than deploying flash in the SAN, it may be more advantageous to implement these resources directly on the host where the VMs reside. In this manner, storage IO becomes highly localized and there is no added latency from traversing a storage network. The challenge here, however, is that the flash can remain captive on that particular host. This may be less of an issue if these resources are always fully utilized by the VMs on the server, but in instances where the flash is only thinly utilized, it drives up data center costs. There therefore needs to be a way of clustering and sharing flash resources across all the servers in the environment to improve flash utilization and enable better resource scaling in the data center.
Decoupling Performance From Capacity
One way to accomplish this is to implement software on the hypervisor that can analyze all the storage IO going between the VMs and the backend storage array and then promote the most frequently accessed data sets onto the local flash device. This methodology effectively separates storage performance (server-side flash) from storage capacity (HDD on the array). In this manner, performance-sensitive VMs can have the vast majority of their IO requests serviced by the local cache. This frees up the storage array to function as the “capacity tier”, enabling organizations to forego expensive storage array replacements.
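The decoupling described here can be sketched as a write-through, host-side read cache sitting in front of a backing array. The names below are illustrative, not any product’s actual API; the point is that reads are served from local flash after the first access, while the array remains the authoritative capacity tier:

```python
class HostCache:
    """Sketch of a write-through, server-side read cache (illustrative only)."""

    def __init__(self, backing_array):
        self.backing = backing_array   # dict-like stand-in for the "capacity tier"
        self.flash = {}                # local server-side flash cache

    def read(self, key):
        if key in self.flash:
            return self.flash[key]     # served from local flash, no SAN round trip
        value = self.backing[key]      # miss: fetch from the backend array
        self.flash[key] = value        # promote for future reads
        return value

    def write(self, key, value):
        # Write-through: the array stays authoritative, so a host or
        # flash-device failure never loses data -- but writes gain no speedup.
        self.backing[key] = value
        self.flash[key] = value
```

Note the limitation the next section addresses: in this write-through design, every write still travels to the backend array before it completes.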
Read and Write IO Acceleration
But this software also needs to ensure data resiliency and persistence. Should a hardware failure occur on one of the flash devices, or should the server itself fail, the data has to be immediately accessible from another source. Likewise, data needs to remain immediately accessible following vMotion activities. In short, it has to be available to all the nodes in the virtual cluster. Most flash acceleration software solutions account for this by only accelerating read traffic; that way, if a hardware failure occurs, the data can always be retrieved from the storage array. While this approach does ensure data protection, it doesn’t accelerate write IO – writes, in these architectures, still need to be serviced by the backend array. A more complete solution would enable both read and write IO to take advantage of solid-state transfer speeds while still ensuring that all data, read and write, remains protected on persistent storage.
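The write-acceleration approach described above can be sketched as a write-back cache that synchronously replicates each write to flash on a peer host before acknowledging it, then destages to the array in the background. This is a simplified illustration of the general technique, not PernixData’s actual implementation, and all names are hypothetical:

```python
class ReplicatedWriteBackCache:
    """Sketch of write-back caching with synchronous peer replication
    (illustrative model of the general technique, not a product's design)."""

    def __init__(self, backing_array, peer_flash):
        self.backing = backing_array   # persistent capacity tier (the array)
        self.flash = {}                # local server-side flash
        self.peer = peer_flash         # flash on another host: a separate failure domain
        self.dirty = set()             # blocks written to flash but not yet destaged

    def write(self, key, value):
        # Acknowledge only after BOTH the local and peer flash copies exist,
        # so a single host or flash-device failure cannot lose the write.
        self.flash[key] = value
        self.peer[key] = value
        self.dirty.add(key)

    def destage(self):
        # Flush dirty blocks to the backing array in the background,
        # keeping the array the long-term persistent copy of all data.
        for key in list(self.dirty):
            self.backing[key] = self.flash[key]
            self.dirty.discard(key)
```

The synchronous replica is what makes write acceleration safe: between `write()` and `destage()`, the data exists on two independent hosts, so it survives the loss of either one.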
Contention-less IO Acceleration
The way in which the software is implemented is also crucial. If the flash acceleration software is deployed as a virtual storage appliance (VSA), for example, it will essentially compete for the same CPU and memory resources as the other VMs on the hypervisor. It also adds management overhead, as these VSAs have to be patched from time to time with software updates; the same is true if the software is implemented at the individual VM guest operating system level. This can become particularly problematic in data centers where dozens or hundreds of VMs are in production. It is therefore essentially a prerequisite for the solution to integrate directly into the hypervisor, without compromising the hypervisor vendor’s support agreement.
By integrating directly into the VMware hypervisor, solutions like PernixData’s FVP software are designed to seamlessly accelerate VM storage IO performance without necessitating costly and highly disruptive storage array upgrades. FVP synchronously replicates data across server-side flash resources to create a shared pool of highly resilient, high-performance storage across the data center. By doing so, virtualized applications gain the benefit of accelerated read and write IO traffic, while data remains persistent on existing backend storage arrays.
FVP decouples storage performance from storage capacity by promoting the most active data sets into a server-side “data acceleration network”, while enabling existing hard disk storage systems to focus on protecting less frequently accessed data.
Since individual flash devices are not captive to any particular server, these resources can be utilized more efficiently across the data center. In addition, since storage arrays are essentially re-purposed as a capacity tier, businesses have the option to continue using existing storage assets or to implement new systems based on their capacity and data services needs. Moreover, storage administrators can continue using the array-based storage management services they are used to working with.
Another benefit of this approach is that it allows virtual server administrators to increase VM density on their hypervisors. When storage IO performance issues occur in virtualized environments, administrators typically have to refrain from adding more VMs; otherwise, performance problems could be exacerbated. By removing VM IO bottlenecks, however, businesses can increase VM density and get a higher return on their investments in virtualized infrastructure. As importantly, storage performance can now scale in lockstep as new servers and VMs are added to the environment, without requiring any disruptive changes to the backend storage array. This enables organizations to implement a true scale-out storage architecture.
PernixData’s FVP software can bridge the storage performance gap in VMware environments through a simple software deployment that is completely transparent to applications and workloads. As importantly, this can be achieved without impacting the supportability of the environment. Furthermore, by eliminating the need to rip and replace existing storage systems, FVP gives IT organizations a financially viable way to sustain high performance for their key business applications.
PernixData is a client of Storage Switzerland