Adding File Services to Hyperconverged Architectures

Posted on November 14, 2016 by George Crump

Hyperconverged architectures are gaining interest in organizations of all sizes. They collapse the compute, network and storage tiers into a single tier that promises an easier-to-use and more cost-effective solution for virtualized data centers. The problem is that these architectures are often islands unto themselves, ignoring existing storage and the cloud. They also don’t provide anything more than basic file services. The lack of robust file services breaks the hyperconverged story by forcing the use of a stand-alone network attached storage (NAS) system and its associated network requirements.

The Hyperconverged Architecture

Hyperconvergence is powered by software-defined storage that is designed to work within a hypervisor. These solutions typically leverage all of the nodes in the hypervisor cluster and distribute data across them. This methodology provides consistent performance and data protection thanks to its distributed nature. The server network, which is likely already in place, replaces a dedicated storage network and carries the inter-node communication that the storage software requires.

Most hyperconverged architectures today provide relatively complete data services, including snapshots, cloning and replication. But in most cases, the logical volume that the hyperconverged system creates is block-based, meaning that it is not usable as a file share. In addition, most hyperconverged architectures lack cloud connectivity, so the customer is unable to leverage the cloud for cost-effective scale, data distribution, archive or backup.

The File Services Workaround

Of course data centers, after implementing hyperconvergence, still need to provide file services to their users so they can store and share data. In most cases these organizations select one of two available choices.

The first is to continue to leverage and invest in a network attached storage (NAS) system. While that may seem like the path of least resistance, it actually lowers the return on investment of the hyperconverged architecture. Maintaining a separate NAS system means IT must keep maintaining a dedicated storage network for that NAS. And, since unstructured (file) data is where the growth is, the NAS will continue to cost the organization budget dollars as it is expanded to keep up with user capacity demands.

The second is to create a file-serving virtual machine that runs on the hyperconverged architecture, or to use the NAS functionality provided in the hyperconverged system, which is basically the same thing. For the majority of data centers the compute power that a virtual file server will require is minimal, and providing file services virtually does fit in with the collapsed tiers concept of hyperconvergence.

The first challenge with providing file services virtually is deciding which virtual NAS IT will use. The default choice is likely a Windows VM acting as a NAS. The problem with a Windows file server is that the operating system is overkill for file services, which means it consumes more resources than necessary.

The second challenge is offering complete services. For example, how will it support Linux or Mac clients? There is also the challenge that not all file servers require minimal compute. A busy file server can be very compute- and IO-intensive, which in the “shared everything” reality could impact critical application performance.

Also, how will modern services like data distribution, file sharing, and access from mobile devices be handled? Often these services will require an entirely different solution that is layered on top of the Windows file server.

Finally, there is the issue of data management. Most hyperconverged systems heavily leverage flash to help maintain consistent performance. Unstructured data, even user data, can benefit greatly from flash performance. But unstructured data, probably more so than any other type of data, has a very short active life span. Usually a file is created, modified heavily for a few days and then rarely accessed again. Having inactive data on flash storage not only wastes the flash investment, it also takes flash capacity away from applications that actually need it. Ideally the organization would like to archive this data to another tier of storage like the cloud.
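As a rough illustration of how such inactive data might be identified, the sketch below walks a directory tree and lists files that have not been modified within a cutoff period. The function name `find_inactive_files`, the 30-day default and the use of modification time as the activity signal are all assumptions made for this example; a real data management product would likely track access patterns in its own way.

```python
import os
import time


def find_inactive_files(root, max_idle_days=30, now=None):
    """Return (path, size_bytes) for files not modified within max_idle_days.

    Hypothetical helper: the threshold and the use of modification time
    as the "activity" signal are assumptions for illustration only.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    inactive = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                inactive.append((path, info.st_size))
    return inactive
```

Summing the sizes in the returned list gives an estimate of how much flash capacity is being held by data that could be archived to another tier.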

Of course the organization could implement a Linux-based software NAS solution. While this addresses the concerns about Linux compatibility, it re-introduces concerns about Windows compatibility. Also, all the other concerns about performance and data management still apply.

Maintaining Balance Critical to Hyperconverged Success

One of the reasons unstructured data management is so critical is that, without it, the hyperconverged architecture can get out of balance. Because storage and compute are distributed across nodes in the cluster, any capacity or performance problem requires the addition of another node. Adding that node brings both additional capacity and additional performance. The problem is that if the cluster is constantly expanding to meet unstructured data capacity demands, the hyperconverged architecture ends up with too much compute performance, which goes to waste. That further reduces the architecture’s ROI.
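The imbalance can be made concrete with some simple arithmetic. The sketch below is illustrative only: the function `nodes_required` and the per-node figures used in the example (20 TB and 32 cores per node) are hypothetical and do not describe any particular hyperconverged product.

```python
import math


def nodes_required(capacity_tb, compute_cores, node_capacity_tb, node_cores):
    """Estimate cluster size when every node adds both capacity and compute.

    All inputs are hypothetical planning figures. The cluster must be large
    enough for whichever demand needs more nodes; the other resource is
    then over-provisioned.
    """
    for_capacity = math.ceil(capacity_tb / node_capacity_tb)
    for_compute = math.ceil(compute_cores / node_cores)
    nodes = max(for_capacity, for_compute)
    return {
        "nodes_for_capacity": for_capacity,
        "nodes_for_compute": for_compute,
        "nodes": nodes,
        "idle_cores": nodes * node_cores - compute_cores,
    }


# 200 TB of data but only 128 cores of demand: capacity drives the node count.
unbalanced = nodes_required(200, 128, node_capacity_tb=20, node_cores=32)
# unbalanced["nodes"] == 10, unbalanced["idle_cores"] == 192

# Same compute demand after inactive files are archived, leaving 80 TB.
balanced = nodes_required(80, 128, node_capacity_tb=20, node_cores=32)
# balanced["nodes"] == 4, balanced["idle_cores"] == 0
```

In this made-up scenario, capacity growth alone forces a ten-node cluster in which 192 cores sit idle, while moving inactive data elsewhere lets a four-node cluster satisfy both demands.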

Bringing File Services to Hyperconverged Architectures

Both workarounds listed above have merit. A dedicated NAS assures consistent performance but does create a separate infrastructure. NAS as a virtual machine adheres to infrastructure consolidation but may deliver inconsistent performance. Both lack any type of data management and cloud integration, which means both can create a storage management and a cluster balance problem.

Most organizations will benefit from a NAS solution designed specifically to provide file services. Generally, these solutions provide better compatibility across operating systems and require the fewest resources. IT professionals should also look for a solution that can run either as a virtual machine or on a dedicated appliance, so that if the IO or CPU resource requirements of the NAS become a problem it can be offloaded from the hypervisor cluster.

Ideally, a virtualized NAS service would act as a cache to an alternative location like the cloud. Because it is a cache, all data is quickly replicated to a cloud provider, which provides a layer of data protection. A file server running as a virtual machine would not provide that protection unless it also replicated data to an alternate storage device, and in that case the organization has effectively doubled its file storage capacity requirements. In addition, a virtualized NAS could leverage the snapshot capabilities of cloud storage to provide virtually unlimited file versioning. Even with replication, a virtual file server still needs to be backed up in order to provide that complete level of version availability. Moving inactive data to the cloud also solves the biggest challenge facing hyperconverged architectures.
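The cache behavior described above can be sketched in a few lines. This is a minimal model under stated assumptions, not how any vendor implements it: the class name `CachedFileStore` is hypothetical, a plain dictionary stands in for cloud object storage, and least-recently-used eviction is just one possible policy for deciding which files stay local.

```python
from collections import OrderedDict


class CachedFileStore:
    """Toy write-through cache in front of a backing store (assumed design)."""

    def __init__(self, backing_store, max_cached_files):
        self.backing_store = backing_store  # stand-in for cloud storage
        self.max_cached_files = max_cached_files
        self.cache = OrderedDict()  # ordered from least to most recently used

    def _keep_local(self, name, data):
        self.cache[name] = data
        self.cache.move_to_end(name)  # mark as most recently used
        while len(self.cache) > self.max_cached_files:
            self.cache.popitem(last=False)  # evict the least recently used

    def write(self, name, data):
        # Write-through: the backing store always holds a copy, so evicting
        # a file from the local cache never loses data.
        self.backing_store[name] = data
        self._keep_local(name, data)

    def read(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)
            return self.cache[name]
        data = self.backing_store[name]  # cache miss: fetch from backing store
        self._keep_local(name, data)
        return data
```

The point of the design is that local storage only needs to hold the active data set, while the complete copy lives in the backing store.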

If the NAS is virtualized and uses the storage that the hyperconverged software supplies, then its capacity requirement is minimal because only the active data set needs that storage. Data management in this case keeps the hyperconverged architecture in balance, so that each additional node’s compute and capacity resources are consumed at about the same pace.

If the NAS is dedicated, it limits the amount of additional storage that needs to be added to that dedicated NAS. Limiting unstructured storage growth simplifies the dedicated NAS’ networking requirements and lowers costs.

Bringing It All Together

By putting together hyperconverged infrastructure and file services you can rid yourself of legacy infrastructure, including storage hardware, backup, DR, remote replication, WAN optimization, etc. This also allows you to collapse remote office/branch office (ROBO) requirements – for example, a file services VM can run on a hyperconverged box at each ROBO site to provide a true “office in a box.” Nasuni adds the complete set of file services to the storage stack that hyperconverged infrastructure already provides, making the combination a complete file service and eliminating the need for a stand-alone NAS.

Conclusion

Providing NAS services and maintaining cluster balance have always been challenges for hyperconverged architectures. A NAS that adds data management can solve both problems. NAS deployment can be based on the CPU/IO demands of the organization’s unstructured data, and node expansion can occur when both capacity and compute resources are needed. Adding a data management solution that moves inactive data to an off-premises store also resolves many of the data protection and retention issues that organizations face.

Sponsored by Nasuni

About Nasuni

Nasuni provides an integrated solution to store, protect, share and access all enterprise files. Powered by UniFS®, the first cloud-native file system, Nasuni transforms enterprise file infrastructure. With unlimited scale, continuous versioning and high-performance distributed file access, Nasuni delivers the complete suite of Enterprise File Services — with leading hyperconverged providers, including SimpliVity, Nutanix and Cisco.

About George Crump

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.
