“The biggest risk to success is perpetuating the status quo.” So said VMware CEO Pat Gelsinger during his keynote address at VMworld. He went on to say, “Today’s world is stuck in silos. Developers against IT, on-premise against off-premise, traditional apps against cloud-native apps.” In his “Brave new IT” presentation, Gelsinger emphasized that IT has to lead the way and be the agent of change that drives innovation and true business transformation via the software-defined data center (SDDC).
Unstructured Data Storm
One area where the limitations, expense and risk of siloed environments are particularly evident is how geographically dispersed end-users collaborate and share information. As we have covered in past articles, the unabated growth of unstructured data (user files, PDFs, rich multimedia, machine sensor data, etc.), and the need for users to seamlessly and securely share this data, is placing significant strain on legacy file-sharing environments.
As Wayne St. Amand at Nasuni said in response to Gelsinger’s presentation, “…the combination of exponential data growth and the expectations of corporate executives and employees to have anywhere, consumer-like, on-demand access to this data is creating unprecedented challenges for IT. In addition, with increasing threats to data from hackers (as was underscored by Google’s recent data breach), corporate data needs to be secured with strong encryption wherever it resides – in private data centers or in public cloud facilities.”
Silos of Management
Traditional approaches to managing this data “storm” are not a viable, long-term strategy. For example, many organizations have relied on deploying NAS systems in their distributed office locations to provide file-sharing services to their users; however, this creates a management nightmare, as multiple, discrete silos of storage have to be individually managed and backed up. Furthermore, as stand-alone silos, these systems cannot share or redistribute storage resources from location to location. Consequently, it is not unusual for some sites to be storage-constrained while others have excess capacity. The result is resource inefficiency and a higher total cost of ownership.
In recent years, many businesses have looked at alternatives to traditional “scale-up” NAS systems to alleviate these issues. Scale-out NAS is, architecturally, an improvement over legacy NAS systems, since it can be deployed in smaller configurations (a few TBs) yet has the capability of scaling out to support multiple PBs. This addresses scaling and performance limitations, but it doesn’t address the issue of having multiple points of storage management throughout the enterprise. And with IT staffs remaining flat or shrinking, this is no small concern.
Copy Data Contingency Plan
Other approaches include copy data management solutions. With this approach, businesses can consolidate multiple copies of production data onto a single protection storage system. No longer do there need to be separate primary and backup storage systems to provision and manage multiple copies of data for backup, development, data analytics, file sharing, etc. Instead, they can all be centrally managed by the copy data management array and served up to the users who need access to this information. While this reduces the proliferation of multiple, redundant copies in the data center, it still introduces yet another silo of storage that has to be purchased, installed and managed. In short, it is really an interim solution for addressing today’s burgeoning data growth issues.
Another Siloed Box
Many organizations have adopted cloud-based storage management solutions like Box and Dropbox to enhance user file sharing and to, in effect, “outsource” the management and protection of growing unstructured data stores by storing data in the cloud. These services allow users to access data from virtually anywhere, as long as they have an internet connection. Again, while this is a step forward from the issues of managing discrete silos of NAS systems, it too represents an interim approach. First, some of these public cloud file-sharing services don’t provide service level agreements; instead, their terms state that data services will be provided on a “best effort” basis. Second, businesses have no control over the data protection process. The assumption is that the provider is taking all the necessary steps to ensure that data is always protected.
Virtualized Cloud Data
To enable organizations to regain control of their data while making it easily accessible to their end-users, regardless of where they are located, data needs to be virtualized in the cloud. Just as server virtualization unlocked physical server resources so they could be shared out to multiple applications across private and public clouds, file data virtualization can enable businesses to unlock data from the confines of local physical storage silos so that it, too, can be made ubiquitously available in the cloud.
Software-Defined Data Access
To break down multiple NAS silos in the enterprise, businesses can maintain one universal copy of their unstructured data in the public cloud. Cloud storage continues to drop in price, so the opportunities for cost savings can be significant. But there can be latency when accessing data stored in the cloud; even when businesses have robust networking infrastructure, data often has to traverse multiple network hops to arrive at its destination. Therefore, to give end-users the same look and feel and the same application performance they would get from a local NAS, active data should ideally reside on a local cache that presents a CIFS/SMB mount point. This cache could be deployed as a dedicated appliance, or it could be software-defined and reside on a virtual machine.
Centralized Data Distribution
The benefit is that instead of deploying discrete silos of NAS storage in various office and data center locations, data can be served up via a local physical or virtual appliance outfitted with a layer of SSD that is specifically sized to the data storage requirements of that office. Importantly, only ACTIVE data would reside locally, while inactive data would be stored in the cloud. Active files would be retrieved from the cloud by the appliance and stored in the local cache to facilitate fast data access, and changes made to those files would then be replicated back to the “gold” copy in the cloud to ensure continuous data protection. This enables organizations to store the right data in the right place at the right time: low-cost cloud storage provides highly efficient capacity, while a right-sized local SSD cache delivers fast, local performance for business applications.
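The read-through/write-back pattern described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation; `CloudStore` and `EdgeCache` are hypothetical names, and a real appliance would add eviction, locking, deduplication and snapshotting.

```python
class CloudStore:
    """Hypothetical stand-in for the cloud object store holding the 'gold' copy."""
    def __init__(self):
        self._objects = {}

    def get(self, key):
        return self._objects[key]

    def put(self, key, data):
        self._objects[key] = data


class EdgeCache:
    """Local (e.g. SSD-backed) cache fronting the cloud gold copy."""
    def __init__(self, cloud):
        self.cloud = cloud
        self.cache = {}  # holds ACTIVE files only

    def read(self, path):
        if path not in self.cache:
            # Cache miss: retrieve the file from the cloud once...
            self.cache[path] = self.cloud.get(path)
        # ...then serve all subsequent reads at local-NAS speed.
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data       # fast local write
        self.cloud.put(path, data)    # replicate change back to the gold copy
```

Note the design point this illustrates: the cloud copy is always current, so there is no separate backup step, and the local footprint is bounded by the active working set rather than the total data set.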
Data would also be secured through strong encryption, but importantly, all key management would be handled entirely by the appliance, whether the data is stored locally or in the cloud. Maintaining local control of encryption keys is becoming a “must have” for many organizations, as it enables them to ensure that their data will remain secure regardless of where it resides. This insulates organizations from the exposure that might otherwise occur if a cloud provider is compelled to turn over data as part of a legal dragnet.
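To make the key-management point concrete: when the key is generated and held only on the local appliance, what lands in the cloud is opaque ciphertext. The sketch below uses a toy HMAC-based keystream purely for illustration; it is not a real cipher, and an actual appliance would use something like AES-GCM with proper key storage.

```python
import hashlib
import hmac
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudorandom keystream from the key (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in
               zip(plaintext, keystream(key, nonce, len(plaintext))))
    return nonce + ct  # the nonce travels with the ciphertext; the key never does

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ct = blob[:16], blob[16:]
    return bytes(a ^ b for a, b in zip(ct, keystream(key, nonce, len(ct))))

# The key is generated and retained by the local appliance.
local_key = os.urandom(32)
cloud_copy = encrypt(local_key, b"quarterly-report contents")
# The provider stores only cloud_copy; without local_key it cannot read it,
# and has nothing useful to hand over if compelled to disclose the data.
```

The operational consequence is the one the article draws: compliance and disclosure exposure follow the key, not the ciphertext's physical location.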
Bye Bye Backup Pain
This approach also mitigates network latency, since active data only has to traverse the internet once before landing in the local cache. Another significant benefit is that it eliminates the need to perform daily backups of this data, since the data is always protected in the cloud. This last point can’t be overstated. According to multiple industry sources, approximately 80-90% of all net-new data growth comes from unstructured data. By moving the bulk, if not all, of this information into the cloud, IT organizations can remove a major pain point from their operational environment – bloated backups. It is also an opportunity to drive significant cost out of the environment, since growing data stores otherwise require a perpetual increase in backup software licensing and the associated backup disk and tape infrastructure.
Perhaps one of the biggest advantages of cloud-based file data virtualization is that it essentially supports unlimited business data growth. Storage capacity in the cloud expands organically without requiring any administrator intervention and in turn, this totally eliminates the storage refresh or rip and replace cycle.
The bottom line for IT is that unending data growth is a fact of life, and it will only accelerate over time. Legacy storage technology, and some of the interim solutions that have succeeded it, are not viable methods for cost-effectively managing growing unstructured data stores. Freeing IT from the storage management game so it can focus on revenue-generating activities calls for a new approach to data management – file data virtualization. Solutions like those from Nasuni combine the best of local, high-performance storage resources with low-cost, ubiquitous cloud storage capacity.
By removing the costs associated with disparate silos of storage architecture, IT can make the CFO happy. And by giving the business a reliable and rapid way to access information, IT can make its end-user constituents happy. Armed with these “wins”, IT will assuredly become “more brave”.
Sponsored by Nasuni