Purpose-Built Backup for Unstructured Data

Posted on October 9, 2019 by George Crump

Unstructured data has grown dramatically in terms of total capacity, the number of files, and criticality to the organization. Home directories now represent the bulk of the organization’s creative output, and losing or not being able to find that data is unacceptable. In addition to users creating unstructured data from their use of office productivity applications, machines and devices now contribute significantly to the unstructured data store. All this data is unique and needs to be protected.

Why Unstructured Data is Unique

Most data protection solutions have some form of unstructured data protection. To these applications, however, unstructured data is just another data source. They don’t give it the individual consideration it deserves. Unstructured data needs frequent protection since it is the most vulnerable to a ransomware attack. IT also requires insight into the data set. Unstructured data needs to be classified and indexed since the file name only provides a tiny fraction of details about the actual data.

How Unstructured Data is Protected Today

The massive growth of unstructured data, both in raw size and especially in the number of files, leads most data protection vendors to use image-based backups. Image-based backups save the software from having to walk the file system’s directory structure file by file, looking for files that need protection. An image backup only needs to back up blocks of data, so determining what needs to be backed up, and backing that data up, is very fast.
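To make the cost of a file-by-file walk concrete, here is a minimal sketch of how an incremental, file-level scan might find changed files. It is an illustration only, not any vendor's implementation; the function name `files_changed_since` and the use of modification time as the change signal are assumptions.

```python
import os
from typing import List


def files_changed_since(root: str, last_backup_ts: float) -> List[str]:
    """Walk every directory under root and return files modified after last_backup_ts."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # One stat call per file: this per-file cost is what makes
            # scanning a share with millions of files slow.
            if os.stat(path).st_mtime > last_backup_ts:
                changed.append(path)
    return sorted(changed)
```

The work grows with the total number of files rather than with the amount of changed data, which is the overhead a block-level image backup avoids.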

Even though the backups are image-based, most backup solutions do provide individual file restores, but the requesting user needs to know the name of the file and which backup job, identified by date range, contains it. Additionally, an image-based technique provides limited granularity into the actual data itself. The software can only enforce retention policies at a job level, not an individual file level, which makes meeting increasingly complex compliance policies more challenging than ever.
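The difference between job-level and file-level retention can be sketched in a few lines. The example below assumes a catalog in which every file version carries its own classification, so each version can expire on its own schedule. The `FileVersion` type, the class labels, and the retention periods are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class FileVersion:
    path: str
    classification: str  # hypothetical label, e.g. "finance" or "general"
    backed_up: date


# Hypothetical retention periods in days, keyed by classification.
RETENTION_DAYS = {"finance": 7 * 365, "general": 90}


def expired_versions(versions: List[FileVersion], today: date) -> List[FileVersion]:
    """Return the versions whose class-specific retention period has passed."""
    expired = []
    for v in versions:
        keep_for = timedelta(days=RETENTION_DAYS[v.classification])
        if today - v.backed_up > keep_for:
            expired.append(v)
    return expired
```

With job-level retention, both files in a single backup job would share one expiration date. Here, a "general" file can age out after 90 days while a "finance" file captured in the same run is kept for years.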

Do-it-all backup solutions, for obvious reasons, focus on mission-critical database and virtual environment protection. Unstructured data protection is often an afterthought. That attitude shows in their products. Adding a solution that is purpose-built for unstructured data protection provides the organization with not only better protection but also improved insight. It also makes the do-it-all solution more reliable since IT can focus the software on its strengths.

Aparavi File Protect & Insight – Purpose Built Unstructured Data Protection

Aparavi’s File Protect & Insight (FPI) doesn’t aim to replace the organization’s data protection solution. Instead, it intends to complement it. With Aparavi, the organization continues to use its current backup solution to protect mission-critical databases and virtual environments but uses FPI to protect its unstructured data pools.

Instead of using a slow file system walk or an image-based backup, FPI uses a snapshot-like technology to provide rapid data protection of even the largest unstructured data pools while still providing detailed information about each file and its versions. As a result, Aparavi FPI provides the granularity and visibility that unstructured data management requires.

FPI deploys as a software appliance, which protects file servers and various name-brand network-attached storage (NAS) systems. It also protects end-user devices. While the software appliance is ingesting data from these multiple sources, it classifies, indexes, encrypts, and stores that data. The software appliance can store data locally and on almost any cloud provider or S3-compatible object store. Data remains encrypted during transmission to the cloud provider and while it is at rest.

Data flow is all set by policy. The organization can decide to keep the most recent backup on-premises in the software appliance, and later move older backups to cloud storage as they age. Because of its file granularity, this archival to cloud storage actually means a legitimate decrease in on-premises backup capacity consumption.
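An age-based data flow policy of this kind can be sketched as follows. This is an illustration under stated assumptions, not Aparavi's actual policy engine: the 30-day threshold, the tier names, and the `plan_tiers` function are hypothetical.

```python
from datetime import date
from typing import Dict, List, Tuple

LOCAL_DAYS = 30  # hypothetical threshold: newer versions stay on-premises


def plan_tiers(versions: List[Tuple[str, date, int]], today: date) -> Dict[str, int]:
    """Assign each (path, backed_up, size_bytes) version to a tier; return bytes per tier."""
    totals = {"on-premises": 0, "cloud": 0}
    for _path, backed_up, size_bytes in versions:
        age_days = (today - backed_up).days
        tier = "on-premises" if age_days <= LOCAL_DAYS else "cloud"
        totals[tier] += size_bytes
    return totals
```

Because the decision is made per file version rather than per job, every version that ages past the threshold frees its own share of on-premises capacity.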

FPI also provides an unstructured data version of “Instant Recovery,” a feature offered by some backup vendors to mount virtual machines quickly. However, those solutions don’t support NAS or file-server shares. Given the importance of unstructured data and its vulnerability to attacks like ransomware, the lack of this capability in other solutions leaves a gaping hole in their recovery strategy. Aparavi’s prioritization and development of this capability speaks to the value of having a purpose-built solution for unstructured data.

StorageSwiss Take

Like mission-critical databases or virtualized infrastructures, unstructured data needs frequent backups to protect against ransomware and other threats like hardware failure or user error. Unlike those environments, unstructured data also needs to be organized so that IT can find it in the future and so the organization can ensure that it is adhering to retention policies.

Aparavi, because it is purpose-built protection for unstructured data, is ahead of the more traditional, do-it-all providers of data protection solutions. Organizations don’t want to add products to their operations portfolios, but sometimes the need is too great to ignore. Traditional do-it-all products fall too far short of meeting unstructured data protection needs. Aparavi makes the process easier by providing an easy-to-install, easy-to-use solution that significantly improves unstructured data protection while lightening the load on those traditional applications.

Tagged with: Aparavi, Cloud, NAS, Purpose Built, Ransomware, Snapshots, Unstructured data, Virtual machine, Virtualize
Posted in Briefing Note