Solving the Unstructured Data Protection Problem by Creating A Cloud Foundation

Posted on December 9, 2019 by George Crump

When organizations try to establish a cloud strategy, one of the challenges they face is where to begin. The problem is that the cloud can do so much: run applications, provide advanced services and store data, to name just a few. A crawl, walk, run approach is often best, and data protection is a great place to start. Narrowing the initial project down even further to protecting unstructured (file) data, with the cloud serving as a ransomware bunker, may provide the highest probability of initial success and a solid return on investment.

A key factor, though, when exploring solutions that copy file data to cloud storage is making sure those solutions lay the groundwork for other cloud use cases. Looking at the solutions from this point of view may weed out candidates that are too basic. Customers should look for solutions that make movement between cloud and on-premises storage seamless.

The Keys to Protecting Unstructured Data

One of the key requirements for protecting unstructured data is making sure that data is stored in a native format. A native format enables the solution to be future ready. If the software stores protected data natively, then, in the future, the organization can use cloud compute to process and extract additional value from that data.

Another key unstructured data protection requirement is understanding what is contained within the data set. The solution should provide data classification capabilities so that IT can run reports on what kind of data is contained in the backup. Knowing what data the organization is storing helps IT gauge the value of that data, so it can better manage the data and reduce costs.

If the first use case is unstructured data protection, then the solution also needs capabilities to recover accidentally deleted data. The length of time the solution retains data should be user defined. The solution also needs to snapshot those backup copies and set snapshots to be immutable for a period of time to protect data from a ransomware or malicious user attack.
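The immutability requirement can be illustrated with a minimal sketch. The `Snapshot` class, its field names and the 30-day lock period are assumptions made for this example, not part of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical model of a point-in-time copy of the protected data set."""
    name: str
    created_at: datetime
    immutable_for: timedelta  # lock period chosen by the administrator

    def locked_until(self) -> datetime:
        return self.created_at + self.immutable_for

    def can_delete(self, now: datetime) -> bool:
        # Deletion is refused until the immutability window has passed,
        # so ransomware or a malicious user cannot purge recent copies.
        return now >= self.locked_until()


snap = Snapshot("daily-2019-12-09", datetime(2019, 12, 9), timedelta(days=30))
print(snap.can_delete(datetime(2019, 12, 20)))  # False: still inside the lock period
print(snap.can_delete(datetime(2020, 1, 8)))    # True: the 30-day lock has expired
```

In a real system the lock would have to be enforced by the storage layer itself, since a check in application code alone can be bypassed by an attacker.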

The organization may also want to make sure that the data sent to the cloud is not readable by anyone outside the organization. The encryption solution should integrate with a Key Management Server (KMS) so that data is encrypted, prior to being sent to the cloud, with unique keys that the customer owns and stores.

Finally, the solution needs to, of course, copy data from one storage location to another. The faster and more frequently these copies are made, the better prepared the organization is to recover quickly. Ideally, the solution should replicate data to the cloud storage area as it is created or modified. Leveraging snapshots is an ideal way to replicate the unstructured data set to the cloud, so snapshot tiering is a key feature to look for.

The net result is an architecture resembling a global file system that deduplicates, compresses, encrypts and then replicates data from an on-premises storage area to cloud storage, which runs a similar cloud file system. The data is stored natively in that file system, so that any cloud compute process with the same file system software can access the data. The data is then snapshotted for protection from ransomware or user attack.
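As a rough illustration of the deduplicate-then-compress ordering described above, the following sketch replicates files into a stand-in "remote" store. The `replicate` and `restore` functions and the content-hash layout are assumptions for this example; they do not represent any vendor's on-disk format, and the encryption step is only marked by a comment because it would require a KMS-issued key.

```python
import hashlib
import zlib


def replicate(files: dict[str, bytes], remote_store: dict[str, bytes]) -> dict[str, str]:
    """Copy files into remote_store and return a path -> content-hash manifest."""
    manifest = {}
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in remote_store:
            # Deduplicate: identical content is stored (and sent) only once.
            # Compress before upload; encryption would follow at this point.
            remote_store[digest] = zlib.compress(data)
        manifest[path] = digest
    return manifest


def restore(path: str, manifest: dict[str, str], remote_store: dict[str, bytes]) -> bytes:
    """Fetch and decompress the stored content for one path."""
    return zlib.decompress(remote_store[manifest[path]])


local_files = {
    "docs/plan.txt": b"quarterly plan " * 100,
    "docs/plan-copy.txt": b"quarterly plan " * 100,  # duplicate content
    "docs/notes.txt": b"meeting notes",
}
cloud = {}
manifest = replicate(local_files, cloud)
print(len(local_files), "files replicated as", len(cloud), "stored objects")
```

Deduplicating and compressing before encryption matters because well-encrypted data looks random and no longer compresses or deduplicates effectively.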

The Benefits of Unstructured Data Protection Backup to the Cloud

The most obvious benefit of creating an unstructured data protection strategy that leverages the cloud is that the organization protects itself from ransomware. The combination of near real-time file copies plus immutable snapshots provides the levels of protection required to recover from such an attack.

Another benefit of this approach is that it sets the organization up to leverage the cloud for other use cases. Unlike most data protection solutions, data is stored in a file system similar to the file system used on-premises. As a result, any cloud application that is given access to that file system is able to directly interact with the data. Cloud applications could subsequently index that data.

Hammerspace – Universal Global Namespace

Hammerspace provides a universal global namespace that enables customers to seamlessly move live data and applications between on-premises and cloud storage without modification. Today, Hammerspace is available in both the Amazon AWS and Google GCP marketplaces. Customers of either of those services can go to the respective marketplace and have Hammerspace deployed in ten minutes. The solution also supports Azure and will be available in the Azure marketplace early in 2020. Also coming in 2020 is the ability to span multiple clouds. A Hammerspace customer could have an instance of the file system on-premises and in each of the supported cloud services and access all data as if the multiple instances were on one volume.

In its latest release, Hammerspace is focusing on the ideal initial use case, data protection. The solution can replicate data from an on-premises volume to a cloud volume in near real-time via its snapshot tiering capability. The new release also adds undelete functionality that keeps deleted user data recoverable for a specified period of time. The undelete policies are file granular, which means administrators can set longer-term undelete retention policies for more important data and provide limited or no protection for less critical data like temporary files.
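To make the idea of file-granular undelete policies concrete, here is a minimal sketch. The pattern table, the retention values and the `undelete_days` function are hypothetical and do not reflect Hammerspace's actual policy syntax.

```python
from fnmatch import fnmatch

# Hypothetical policy table: the first matching pattern wins.
# Each entry maps a file pattern to the number of days a deleted file stays recoverable.
UNDELETE_POLICIES = [
    ("*.tmp", 0),        # temporary files: no undelete protection
    ("finance/*", 365),  # important data: keep deleted files for a year
    ("*", 30),           # everything else: 30 days
]


def undelete_days(path: str) -> int:
    """Return how many days a deleted file at this path remains recoverable."""
    for pattern, days in UNDELETE_POLICIES:
        if fnmatch(path, pattern):
            return days
    return 0


for path in ("finance/q3-forecast.xlsx", "scratch/build.tmp", "home/alice/notes.txt"):
    print(path, "->", undelete_days(path), "days")
```

Ordering the table from most to least specific keeps the rules easy to audit, at the cost of requiring care when a file matches more than one pattern.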

The latest release also integrates support for key management servers (KMS) to enable customers to encrypt data with their keys before sending it to the cloud. Customer ownership of the encryption keys ensures that the cloud provider can’t “see” the organization’s data.

Lastly, this release adds automated data classification as data moves to cloud or object storage. Hammerspace can automatically classify data with Multipurpose Internet Mail Extensions (MIME) type information. MIME was originally designed to extend email to carry more than plain text. Today, MIME types are widely used to identify the kind of data a file is storing. MIME types are defined for a wide range of file formats, including DOCX, DOC, JPG, PNG, MP4, WAV, MOV and PDF. The capability enables a customer to learn more about the data they are storing than just the date created and the date last accessed. They can use the MIME information to optimize data storage and to better set data retention policies.
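A small sketch shows what MIME-based classification can look like, using Python's standard `mimetypes` module. The `classify` and `summarize` helpers and the sample file names are assumptions for this example; note that `mimetypes.guess_type` infers the type from the file name only, whereas a storage product may inspect file content instead.

```python
import mimetypes
from collections import Counter


def classify(path: str) -> str:
    """Guess a MIME type from the file name, with a generic fallback."""
    mime_type, _encoding = mimetypes.guess_type(path)
    # Fall back to the generic binary type when the extension is unknown.
    return mime_type or "application/octet-stream"


def summarize(paths: list[str]) -> Counter:
    """Count files by top-level MIME category (image, video, application, ...)."""
    return Counter(classify(p).split("/")[0] for p in paths)


sample = ["scans/invoice.pdf", "photos/site.jpg", "photos/logo.png", "media/demo.mp4"]
for path in sample:
    print(path, "->", classify(path))
print(dict(summarize(sample)))
```

A report like the one `summarize` produces is the kind of input an administrator could use to decide, for example, that video files belong on a cheaper storage tier.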

StorageSwiss Take

Hammerspace provides a foundation for organizations to seamlessly move data between on-premises file systems and cloud-based file systems. Its advancements in data protection capabilities provide organizations with an ideal on-ramp to the solution and prepare the organization to leverage cloud resources more fully. The data protection capabilities alone can solve a major IT challenge: protecting unstructured data. The advantage of the Hammerspace solution is that once the data is in place in the cloud, accessing the cloud-based copy is seamless and organizations can almost immediately begin leveraging cloud resources on that data.


About George Crump

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.
