Briefing Note: Stopping Virtualization’s Double Tax

Virtualized Windows applications suffer from two I/O taxes. The first is the tax of the I/O blender: dozens of VMs mixed on a single server, with multiple servers connected to a shared storage array, produce a highly random I/O stream. The I/O blender affects the whole virtualized environment, regardless of guest operating system. The second tax is specific to the most commonly virtualized operating system, Windows, whose write behavior leaves the data store highly fragmented at the logical disk layer, multiplying the negative impact of the I/O blender. Fixing these issues drives data centers to extreme measures like all-flash and hybrid disk arrays. Condusiv’s V-locity 6.0 is designed to eliminate both taxes, lessening the need to move so quickly to a flash-heavy solution while ensuring organizations get the most from the flash they already have.

The Windows Tax

As Windows NTFS writes data in a SAN storage environment, it fragments file allocations at the logical disk layer across multiple addresses, because Windows merely looks for the next available space rather than the best space. The result is I/O that is smaller and more fractured than it needs to be, dampening throughput from VM to storage. This free-space allocation inefficiency adds as much as 25% I/O overhead on typical systems regardless of storage media, whether flash or disk, and more severely fragmented data sets inflate the IOPS required to process a given workload by 2-3X. V-locity’s IntelliWrite® technology is a filter-level driver that installs within the guest operating system and prevents I/O fracturing from occurring, so files are written in a more contiguous manner from the start, creating a large, sequential I/O stream from the VM. Storage Switzerland detailed IntelliWrite in its briefing note, “Physical Servers Matter”; V-locity brings that same capability to virtualized Windows servers.
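The 2-3X figure above follows from simple arithmetic: the same payload split into smaller, fractured transfers takes proportionally more operations. This sketch uses hypothetical I/O sizes (not Condusiv's measurements) to illustrate the effect.

```python
# Illustrative arithmetic (hypothetical I/O sizes, not measured data):
# moving the same 1 GB workload in small, fractured I/Os takes far more
# operations than moving it in healthy contiguous transfers.

def iops_required(workload_mb: int, avg_io_kb: float) -> int:
    """Number of I/O operations needed to move workload_mb at a given I/O size."""
    return int(workload_mb * 1024 / avg_io_kb)

contiguous = iops_required(1024, avg_io_kb=64)   # contiguous 64 KB transfers
fragmented = iops_required(1024, avg_io_kb=24)   # fractured into ~24 KB pieces

print(contiguous, fragmented, round(fragmented / contiguous, 1))
```

With these assumed sizes the fractured stream needs roughly 2.7 times as many operations for the same payload, which is exactly the kind of inflation the briefing describes.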

The I/O Blender Tax

Plenty of technologies attempt to address the I/O blender challenge created by virtualization: server-side flash, hybrid arrays and all-flash arrays. But all of these solutions require the purchase of additional hardware and, other than all-flash arrays, they all rely on some form of caching algorithm to keep the most relevant data in cache at the right time. Most use a first-in, first-out methodology with some user override; few do any real analysis of the actual I/O pattern itself.

Condusiv’s IntelliMemory is a read cache that leverages available DRAM in the server to respond rapidly to most read requests and clear the rest of the I/O path for writes, improving both sides of the I/O equation. Because the cache is dynamic, no DRAM needs to be permanently allocated for caching: IntelliMemory uses what is available and serves memory back to applications as needed, so there is never resource contention or memory starvation. And since it uses only available DRAM, there is no additional hardware to purchase, making it an ideal first step for data centers looking to improve performance without replacing the whole storage infrastructure. Condusiv has published several customer examples demonstrating 50-300% application performance gains with no more than 4GB of available DRAM per VM.
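The dynamic behavior described above can be illustrated with a minimal sketch: a read-through cache with a capacity cap that can shrink on demand, releasing memory back when applications need it. This is a simplified illustration of the concept, not IntelliMemory's actual implementation.

```python
from collections import OrderedDict

class DynamicReadCache:
    """Minimal sketch (not IntelliMemory's implementation) of a read cache
    that uses only spare memory and gives it back under pressure."""

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self._cache = OrderedDict()          # block_id -> data, in LRU order

    def read(self, block_id, fetch_from_storage):
        if block_id in self._cache:          # cache hit: no trip to the SAN
            self._cache.move_to_end(block_id)
            return self._cache[block_id]
        data = fetch_from_storage(block_id)  # cache miss: go to storage
        self._cache[block_id] = data
        while len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict least-recently-used block
        return data

    def shrink(self, new_capacity_blocks: int):
        """The application needs memory back: release cached blocks at once."""
        self.capacity = new_capacity_blocks
        while len(self._cache) > self.capacity:
            self._cache.popitem(last=False)
```

The key design point is `shrink`: because the cache holds only spare DRAM, it can return memory to applications immediately rather than forcing contention, which is the property the briefing highlights.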

For optimum performance, users can add extra memory to their servers for V-locity to leverage (V-locity’s maximum cache size is 128GB per VM). Adding a couple of extra sticks of DRAM is often far less expensive than adding a PCIe flash card or SSD, or investing in a shared flash array.

IntelliMemory does not use a standard first-in, first-out algorithm. Instead, it performs real-time analytics on the actual I/O, building a database of access patterns (essentially “little data” analytics). It then uses that data to ensure the cache stores only data truly worthy of being there, prioritizing the kind of I/O that dampens overall system performance the most: small, random I/O. The longer the server runs, the more accurate the cache becomes.
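An admission policy along these lines can be sketched as follows. This is a hypothetical illustration of analytics-driven admission, not Condusiv's actual algorithm: it tracks per-block access statistics and favors blocks that are read often in small I/Os, the pattern the briefing says hurts performance most.

```python
from collections import defaultdict

class AccessAnalytics:
    """Hypothetical sketch of analytics-driven cache admission (not
    Condusiv's algorithm): score blocks by how often they are read
    and how small the reads are, and admit only high scorers."""

    def __init__(self):
        self.reads = defaultdict(int)   # block_id -> observed read count
        self.sizes = {}                 # block_id -> last I/O size in KB

    def record(self, block_id, size_kb):
        self.reads[block_id] += 1
        self.sizes[block_id] = size_kb

    def score(self, block_id):
        """Higher score = more cache-worthy: read frequently, in small I/Os."""
        return self.reads[block_id] / max(self.sizes.get(block_id, 1), 1)

    def admit(self, block_id, threshold=0.5):
        return self.score(block_id) >= threshold
```

Under this policy a frequently read 4 KB block scores far higher than a rarely read 256 KB block, so the cache fills with the small, random reads that would otherwise dominate latency, rather than with whatever arrived first.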

The Results

By implementing V-locity with these two technologies, Condusiv claims a minimum 50% performance gain, again without the purchase of additional hardware. In one example, Condusiv showed that V-locity doubled MS-SQL performance while reducing I/O to the SAN by 61%.

StorageSwiss Take

Storage Switzerland consistently advises data centers not to rush into flash. Data centers are better off first looking at what can be done before investing in premium storage technology. We often talk about optimizing database queries and code, but optimizing the operating system and hypervisor with a solution like V-locity, smoothing out I/O patterns and leveraging unused resources, is a simpler step, and one applicable to more data centers. More importantly, even when the data center does invest in a flash-based array, this type of solution complements that purchase, allowing the flash-based array to perform more optimally.

Twelve years ago George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and a heavily sought-after public speaker. With over 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, virtualization, cloud and enterprise flash. Prior to founding Storage Switzerland he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.

