The power of server virtualization comes from the abstraction of hardware resources, which allows those resources to be delivered efficiently to virtual machines (VMs). Thanks to that abstraction, VMs are free to move across physical hosts to improve load balancing and maintain high availability. The abstraction also creates a new layer, the VM, for which the rest of the infrastructure needs to tune itself.
Hyper-V is becoming a popular alternative to VMware for server virtualization. It provides a reliable hypervisor with similar capabilities and allows tighter integration with Windows, the most common server operating-system platform. But as with VMware, the storage architecture must still be redefined so that it aligns better with this new VM layer.
Traditional storage architectures, as well as the first generation of software-defined storage (SDS) architectures, make it difficult to tune storage performance and services for this new VM layer. This is because they still typically have a host-to-storage relationship instead of a VM-to-storage relationship. In this respect, first-generation SDS solutions are similar to the legacy storage systems they claim to replace: both are designed for the one-application-per-host past, not the dozens-of-applications-per-host future.
This means that storage-management decisions – including performance, capacity, and data protection – all need to be made at the LUN or volume level, instead of at the more granular VM level. In essence, all VMs on a LUN are treated equally and the only way to change the storage I/O performance of a particular VM is to move it to a different LUN.
The Problem With Generation One SDS in Virtualized Environments
The problem with this lack of granularity is that the success of a Hyper-V project is ultimately determined by how specific business-critical applications perform after being virtualized. These applications must perform at least as well post-virtualization as they did before it. And they must provide this performance in a consistent, predictable manner, something that is particularly challenging in the virtual “shared everything” environment.
Most Generation One SDS solutions and legacy-storage solutions provide no method to apply granular performance, capacity, or data-protection settings. As a result, assuring the performance of business-critical applications translates into overprovisioning of storage resources (drive speed and capacity). It is the classic “throw more hardware at it” approach, which is popular with vendors but leads to increased storage-infrastructure costs. This reduces the overall return on investment (ROI) of the virtualization project while increasing its total cost of ownership (TCO).
The Dedicated Controller Problem
The dedicated controller architecture, in either legacy storage systems or first-generation SDS, can be a significant obstacle because, in a virtual infrastructure, a single LUN could house dozens, or even hundreds, of VMs. The lack of a lower-level understanding of the specific VMs on that LUN creates several challenges.
First, all VMs have to be treated equally from a performance perspective. While there are technologies available to tune individual VM performance at the network interface-card level and at the switch level, those efforts go to waste because mission-critical VMs have to wait in the same storage I/O queue as other VMs. Individual VMs can’t be identified, so specific performance, capacity, or data-protection attributes can’t be applied to them.
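The shared-queue effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the VM names, the priority values, and both helper functions are hypothetical, and a real storage controller schedules I/O in far more sophisticated ways. It compares how many requests are served ahead of a mission-critical VM in a single FIFO queue per LUN versus a queue that is aware of per-VM priority.

```python
from collections import deque
import heapq


def fifo_wait(requests, target_vm):
    """Count requests served before target_vm in a shared FIFO (LUN-level) queue."""
    queue = deque(requests)
    served = 0
    while queue:
        vm, _priority = queue.popleft()
        if vm == target_vm:
            return served
        served += 1
    raise ValueError(f"{target_vm} has no queued request")


def priority_wait(requests, target_vm):
    """Count requests served before target_vm when the queue honors per-VM priority.

    Lower priority number means more important; arrival order breaks ties.
    """
    heap = [(priority, order, vm) for order, (vm, priority) in enumerate(requests)]
    heapq.heapify(heap)
    served = 0
    while heap:
        _priority, _order, vm = heapq.heappop(heap)
        if vm == target_vm:
            return served
        served += 1
    raise ValueError(f"{target_vm} has no queued request")


# Hypothetical workload: 20 low-priority VMs arrive ahead of one critical VM.
requests = [(f"test-vm-{i}", 5) for i in range(20)] + [("sql-prod", 0)]

print(fifo_wait(requests, "sql-prod"))      # requests ahead of sql-prod, LUN-level queue
print(priority_wait(requests, "sql-prod"))  # requests ahead of sql-prod, VM-aware queue
```

In the FIFO case the critical VM waits behind every request that arrived earlier, no matter how it was prioritized at the network layer; in the VM-aware case it is served first.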
Second, data services such as snapshots, clones, and replication have to be applied to the entire LUN. The problem is that each of these data services consumes disk capacity and performance capability. While expending those resources to ensure that a mission-critical VM performs well or is adequately protected is worthwhile, it may not be a wise investment for the other VMs on that LUN. But again, without a way to identify individual VMs, the storage administrator is forced to provide the same level of data services across all the VMs on the LUN.
A potential workaround is to provide dedicated LUNs to those specific performance-critical VMs. The problem is that this greatly increases storage-architecture complexity by burdening the storage administrator with more points of management. And ultimately, a finite amount of controller resources has to be shared across all the LUNs in the environment.
This lack of granular understanding at the VM level results in the overprovisioning of resources, which in turn leads to higher costs, and greater management complexity in virtual environments. The reality is that today’s data centers have neither the excess budget, nor the staffing to manage these complex storage infrastructures.
The Future of SDS—Defined By the Hypervisor
The next step in the evolution of software-defined storage is to move beyond a model that simply creates a software representation of a dedicated storage controller to one with a fully abstracted storage controller. In this model, the controller itself is abstracted from the storage hardware. Since it does not have to be dedicated to a particular storage or server hardware device or virtual appliance, it can be distributed across the Hyper-V infrastructure. This provides not only a storage-hardware abstraction but also a data-flow abstraction. It allows for potentially unlimited controller scalability, since the controller can be distributed across all available hosts.
This complete abstraction of both data services and data flow is important because the data services (such as snapshots, clones, and replication) will eventually be delivered by the hypervisor itself. Both Microsoft and VMware are embedding more and more data services into their operating systems and hypervisors. In fact, Microsoft has even added deduplication and advanced caching to its capabilities.
For the Hyper-V optimized generation of SDS solutions to be effective, they need to break free of the architecture in which the controller functions are dedicated to a virtual appliance or physical hardware. To accomplish this, the controller itself must be optimized for the hypervisor, in this case Hyper-V, so that there is a one-to-one mapping between each VM and the controller that provides its access to the physical storage.
Defining Hyper-V Optimized SDS
In order for SDS to be truly effective in the Hyper-V environment, it needs to do four things:
- First, it needs to be optimized and customized per VM to accelerate and sustain storage I/O performance.
- Second, data services need to align with the hypervisor’s capability to provide those services on a per-VM basis.
- Third, it needs to virtualize a pool of hardware resources that can be efficiently delivered to the VMs.
- Lastly, it needs to be easy for the virtualization admin to manage.
The success of a virtualization project is measured by how well the virtualized applications perform over time. This means that they need to provide consistent performance that users can count on. As stated above, with generation one SDS solutions, the only way to provide this level of assurance is to make sure the entire LUN, rather than specific VMs, operates at the highest performance possible. This leads to very expensive storage architectures in which extra capacity and performance are added to make up for inefficient support of each VM’s applications. The more cost-effective solution is for the storage controller to adapt to the needs of each VM.
A Hyper-V Optimized SDS solution will allow the controller function to be abstracted into each hypervisor in the virtual cluster. As a result, the abstracted controllers gain VM-level visibility. This means that each VM can have performance and data-protection resources tuned to its specific needs. This should significantly improve the storage administrator’s ability to deliver the performance a specific application needs while controlling costs.
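To make the idea of per-VM tuning concrete, the following sketch models a cluster-wide default policy with per-VM overrides. Everything here is an illustrative assumption: the `StoragePolicy` fields, the default values, and the `effective_policy` helper are hypothetical and do not represent Hyper-V's or any vendor's actual API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical per-VM storage settings."""
    iops_limit: int
    replicas: int
    snapshot_interval_min: int


# Assumed baseline applied to any VM without specific requirements.
DEFAULT_POLICY = StoragePolicy(iops_limit=500, replicas=1, snapshot_interval_min=1440)


def effective_policy(vm_name, overrides, default=DEFAULT_POLICY):
    """Return the default policy with any overrides for vm_name applied."""
    return replace(default, **overrides.get(vm_name, {}))


# Only the mission-critical VM gets extra performance and protection.
overrides = {"sql-prod": {"iops_limit": 5000, "replicas": 2}}

print(effective_policy("sql-prod", overrides))
print(effective_policy("test-vm-1", overrides))
```

With a LUN-level model, the equivalent of `overrides` cannot exist: whatever settings the critical VM needs must be paid for by every VM sharing its LUN.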
The second attribute of a Hyper-V optimized SDS solution is that it does not reinvent what is already in place. Hyper-V provides an impressive array of data services. It makes sense, then, for the SDS solution to leverage those services instead of forcing the Hyper-V administrator to buy them a second time. By leveraging what is already in place, the SDS vendor can focus on providing new value-added services, like QoS and data tiering.
Finally, the Hyper-V Optimized SDS solution should provide virtualized access to a pool of storage resources based on the storage demands of the VM. This allows the SDS vendor to leverage quality commodity storage hardware while remaining cost efficient. This shared pool also enables VM migration capabilities. Since the controllers are abstracted from the storage pool, they should be able to scale independently of storage capacity. In other words, performance via the abstracted controllers and capacity via the shared pool can truly scale independently of each other.
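The independent scaling of performance and capacity can be sketched with a toy model. The per-controller IOPS and per-node capacity figures below are made-up placeholders, and the linear scaling is a simplifying assumption rather than a measured result.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Toy model: one abstracted controller per host, capacity in a shared pool."""
    hosts: int                         # each host runs a controller instance
    storage_nodes: int                 # nodes contributing to the shared pool
    iops_per_controller: int = 20_000  # hypothetical figure
    tb_per_node: int = 12              # hypothetical figure

    @property
    def total_iops(self):
        # Assumes performance grows linearly with the number of controllers.
        return self.hosts * self.iops_per_controller

    @property
    def total_tb(self):
        # Assumes capacity grows linearly with the number of storage nodes.
        return self.storage_nodes * self.tb_per_node


base = Cluster(hosts=4, storage_nodes=3)
more_performance = Cluster(hosts=8, storage_nodes=3)  # add hosts only
more_capacity = Cluster(hosts=4, storage_nodes=6)     # add storage nodes only

for name, c in [("base", base), ("more hosts", more_performance), ("more nodes", more_capacity)]:
    print(f"{name}: {c.total_iops} IOPS, {c.total_tb} TB")
```

Adding hosts changes only the performance figure and adding storage nodes changes only the capacity figure, which is the property the abstracted-controller design is meant to provide.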
The application owners’ primary concern is the performance of their application. While the cost efficiencies and increased availability of virtualization intrigue them, they will not sacrifice a consistent application experience for their users. The key for Hyper-V and storage administrators is to provide an infrastructure that can deliver this. They have two options: they can overprovision hardware and dedicate LUNs, which reduces ROI and increases TCO, or they can leverage a Hyper-V optimized SDS solution in which the SDS controller automatically tunes itself for each specific virtualized application.
Gridstore is a client of Storage Switzerland