Many organizations are considering flash caching as a cost-effective way to accelerate application workloads. The challenge is that virtual infrastructure planners must sift through the myriad flash caching options available on the market. One of the most popular is server-side caching. Accelerating performance at the server layer is a highly targeted approach for enhancing virtual machine performance, and a major ancillary benefit is that it enables organizations to extend the life of their existing shared storage assets.
Separating Performance From Capacity
By configuring server-side flash to function as the “storage performance tier”, shared storage resources, like an external SAN or NAS array, can be re-purposed as a “capacity tier”, potentially saving organizations the expense of purchasing a new storage array. The question is: which server-side caching architectures are best suited for accelerating virtualized application workloads? And more importantly, how can these solutions enable businesses to seamlessly scale out their virtualized infrastructure and get the best return on their investment?
Driving VM Density
Simply put, to maximize the business value of investments made in virtualized server infrastructure, IT organizations need to find ways to increase VM density. Server-side caching is an excellent way to accomplish this because it reduces storage IO latency, which in turn frees up CPU and memory resources on the host. In other words, since caching facilitates rapid storage IO, the CPU spends less time queuing storage IO requests. And with more computational resources available, more VMs can be configured on the server.
While server-side caching solutions are generally designed to utilize flash resources inside the host, it is important that they be capable of using cache resources that are external to the host as well. This means that in addition to server-based DRAM, PCI-e flash, RAID controllers and drive form factor SSD devices, the caching architecture should also be capable of leveraging flash in an external array on the SAN or in a NAS platform. This creates a single point of cache management whereby all performance decisions can be made from a single interface.
Through a single point of cache performance management, organizations can then build a highly stratified caching architecture that addresses the various performance requirements of the applications in the environment. In short, there would be multiple levels of cache resources that could be assigned to VMs based on their individual performance profiles, with the underlying caching software promoting the most active data up this cache “food chain”.
For example, a highly performance-sensitive, mission-critical application, like an online transaction-processing (OLTP) database, could have its most active data sets initially placed on a server-side PCI-e flash device. Then, as this data becomes increasingly in demand, the caching software might promote it into the local DRAM on the host for even higher performance. Likewise, as the data begins to cool, the caching intelligence would start to de-stage it off the DRAM (to make room for hotter data sets) and move it back into the flash storage area.
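The promotion and de-staging behavior described above can be sketched in a few lines. The following is a minimal illustration, not any vendor's actual algorithm: the tier names, the access-count "heat" metric and the promotion threshold are all invented for the example.

```python
class TieredCache:
    # Tiers from slowest to fastest; names and threshold are illustrative.
    TIERS = ["array-flash", "pcie-flash", "dram"]
    PROMOTE_AFTER = 3  # accesses needed to move a block up one tier

    def __init__(self):
        self.tier = {}  # block id -> index into TIERS
        self.heat = {}  # block id -> accesses since the last tier change

    def access(self, block):
        """Record a read and promote the block once it becomes hot enough."""
        level = self.tier.setdefault(block, 0)
        self.heat[block] = self.heat.get(block, 0) + 1
        if self.heat[block] >= self.PROMOTE_AFTER and level < len(self.TIERS) - 1:
            self.tier[block] = level + 1  # promote up the cache "food chain"
            self.heat[block] = 0
        return self.TIERS[self.tier[block]]

    def cool(self, block):
        """De-stage a cooling block one tier down to free the faster resource."""
        if self.tier.get(block, 0) > 0:
            self.tier[block] -= 1
            self.heat[block] = 0
        return self.TIERS[self.tier.get(block, 0)]

cache = TieredCache()
for _ in range(3):
    tier = cache.access("oltp-index-page")
print(tier)  # promoted to "pcie-flash" after repeated access
```

A real implementation would track heat per cache line and weigh recency as well as frequency, but the promote-on-heat, demote-on-cool structure is the same.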
Ideally, the caching architecture would also allow fine-grained control when promoting data sets into cache; for instance, designating specific files or even a single database table for promotion.
Most cache acceleration software technologies, however, require a certain period of time to elapse before critical application data can be promoted into the cache pool. Referred to as “cache warming”, this is the process whereby the caching software analyzes all the storage IO flowing between the requesting host or VM and the backend storage array. In some cases, this process can take several days or even weeks before data starts being promoted into the cache. The challenge is that it takes time before the value of the cache investment is realized.
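Conceptually, cache warming amounts to observing the IO stream and only admitting a block after it has proven itself hot. A minimal sketch, with an invented admission threshold standing in for the days or weeks of analysis a real product might perform:

```python
from collections import Counter

class WarmingCache:
    ADMIT_AFTER = 5  # observed accesses before a block is cached (illustrative)

    def __init__(self):
        self.observed = Counter()  # the IO analysis: accesses seen per block
        self.cached = set()

    def read(self, block):
        """Return True on a cache hit; otherwise record the access for analysis."""
        if block in self.cached:
            return True
        self.observed[block] += 1
        if self.observed[block] >= self.ADMIT_AFTER:
            self.cached.add(block)  # admitted only after the warm-up period
        return False

cache = WarmingCache()
hits = [cache.read("db-table-block") for _ in range(6)]
print(hits)  # five misses during warm-up, then a hit
```

Every miss during the warm-up window is served at backend-array latency, which is exactly the delayed-value problem the article describes.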
Pinning The Cache
Some caching architectures can bypass the cache warming process altogether by “pinning” known hot data sets, like database indexes or redo logs, directly into cache. The obvious benefit is that critical applications can be accelerated immediately, yielding a quicker return on cache investments. Cache pinning, therefore, is a highly desirable feature for ensuring that critical business systems gain near-instantaneous access to the fastest available cache resources in the environment.
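The contrast with warming is simple to express: pinned objects are resident from the start, with no admission analysis. A minimal sketch, assuming the administrator can name known-hot objects up front (the object names below are illustrative):

```python
class PinnedCache:
    def __init__(self, pinned=()):
        # Pinned blocks are placed in cache at startup - no warm-up period.
        self.cached = set(pinned)

    def pin(self, block):
        """Immediately promote a known-hot block, bypassing cache warming."""
        self.cached.add(block)

    def read(self, block):
        return block in self.cached  # True = served at cache speed

cache = PinnedCache(pinned={"oltp-index", "oltp-redo-log"})
print(cache.read("oltp-redo-log"))  # True on the very first access
```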
Of course, for any caching architecture to viably support virtualized environments, it would also need to be capable of maintaining a “hot” cache during server vMotion activities. This means that when VMs are migrated to an alternate server, there would be no need to evict data from cache prior to the vMotion operation and a subsequent requirement to “re-warm” or move data back into the cache after the VM migration.
In other words, VMs should be able to be dynamically migrated without any disruption to application performance. All active data sets would remain in an addressable cache space before, during and after the VM migration, enabling applications to maintain excellent quality-of-service (QoS). This could be accomplished if the caching architecture could mirror cached data, for example, from a host-based PCI-e flash device to a flash resource inside a storage array.
This type of “remote caching” could also protect critical VMs from a performance disruption resulting from a server-side flash device failure. By mirroring a copy of hot data to an external cache resource, virtualized applications could continue operating after the failure while incurring only a marginal increase in IO latency. Cached data could then be re-mirrored to a replacement device to restore normal service levels.
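The mirroring and failover path just described can be sketched as follows. This is an illustration of the concept only, assuming every cached write is duplicated to an array-side flash resource; the class and method names are invented:

```python
class MirroredCache:
    def __init__(self):
        self.local = {}   # server-side PCI-e flash (fast path)
        self.remote = {}  # array-side flash mirror (survives host device loss)
        self.local_ok = True

    def write(self, block, data):
        """Mirror every cached write to both the local and remote devices."""
        if self.local_ok:
            self.local[block] = data
        self.remote[block] = data

    def read(self, block):
        """Serve from local flash; fall back to the remote mirror on failure."""
        if self.local_ok and block in self.local:
            return self.local[block]
        return self.remote.get(block)  # slightly higher latency, but no outage

    def fail_local(self):
        """Simulate a server-side flash device failure."""
        self.local_ok = False
        self.local.clear()

cache = MirroredCache()
cache.write("hot-row", b"payload")
cache.fail_local()
print(cache.read("hot-row"))  # still served, from the remote mirror
```

The same structure explains the hot-vMotion case: because the mirror is addressable from any host, a migrated VM can keep reading its cached data without eviction or re-warming.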
Centralized Cache Management
As previously discussed, server-side caching software is essentially the glue that enables organizations to integrate all the caching resources in their environment, regardless of the source. By providing a single point of performance management, IT and virtual administrators can accelerate both physical and virtual workloads by leveraging flash resources inside the server, on the SAN or in a NAS system.
But cache resources are not cheap, and in most environments they represent a small percentage of overall storage capacity. Therefore, infrastructure planners may want to consider caching architectures that provide the flexibility to accelerate virtualization workloads at the hypervisor level, at the level of an individual guest OS VM, or even down to specific files within those VMs.
For cache-resource-constrained environments, the advantage of implementing cache acceleration at the guest OS level is that only those VMs specifically selected for flash performance will have their workloads accelerated. This helps economize on cache by ensuring that these resources are allocated only to applications that absolutely require the performance levels flash affords. The downside to this approach, however, is that it requires managing software updates for each individual VM.
For environments that support dozens of hypervisors or more, integrating cache acceleration software at the hypervisor level may be a better approach if low-touch management is a higher priority. This design choice accelerates all the VMs on the host, regardless of their actual performance requirements, and consequently results in a much higher consumption rate of cache resources.
Most importantly, IT architects can choose both types of cache implementations as the situation dictates. For example, if one server is hosting multiple VMs that need high performance, the cache acceleration software can be installed at the hypervisor layer. On the other hand, if a server has only a few VMs that require cache performance, the caching software can be integrated into the guest OS of those particular VMs. The key here is the flexibility of choice.
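As a rough illustration of that choice, the trade-off can be expressed as a simple heuristic. The function name and the 50% threshold below are invented for the sketch; a real deployment decision would also weigh licensing, update management and cache capacity:

```python
def choose_cache_scope(vms_needing_cache, total_vms):
    """Pick where to integrate caching software on a host (sketch heuristic)."""
    if total_vms == 0 or vms_needing_cache == 0:
        return "none"
    if vms_needing_cache / total_vms >= 0.5:
        return "hypervisor"  # accelerate every VM; lowest-touch management
    return "guest-os"        # per-VM install; economizes scarce cache

print(choose_cache_scope(10, 12))  # "hypervisor"
print(choose_cache_scope(2, 12))   # "guest-os"
```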
Given all the variable application workloads that today’s virtualized environments need to support, it is important for IT planners to choose a caching architecture that can customize application storage performance with fine-tuned precision. Features like cache pinning, for example, enable critical applications to have their most active data sets immediately promoted into the fastest available cache resources. Likewise, having the choice to implement caching at the hypervisor or guest OS layer gives IT architects the ability to accelerate workloads across the entire host or down to the individual VM level.
Furthermore, caching software intelligence should enable businesses to leverage all forms of cache in the environment, whether server-side or array-based flash. Multi-level caching, for example, allows for efficient cache resource utilization since the most active data sets are automatically promoted to the highest-performance cache resource available in the data center. Likewise, remote caching allows flash resources external to the server to be fully exploited. This enables fault tolerance and allows critical services like server vMotion to occur without requiring hot data to be evicted from cache, helping to ensure application QoS.
Intel’s Cache Acceleration Software (CAS) is a good example of a flexible caching architecture that allows businesses to tailor storage performance to the individual needs of their virtualized applications. By providing a single point of cache performance management, Intel’s CAS separates storage performance from storage capacity and allows all available flash resources in the datacenter to be efficiently and effectively leveraged. By implementing this type of caching architecture, businesses can seamlessly scale their virtualized infrastructure and get a strong return on their virtualization investments.
Intel is a client of Storage Switzerland