Organizations are drowning in unstructured data, and managing its growth is increasingly a high priority. The public cloud has a role to play, and vendors must carefully integrate it into their unstructured data protection and data management solutions.
Protecting Unstructured Data with Public Cloud Storage
Protecting unstructured data with public cloud storage does not necessarily mean backing up all data exclusively to the public cloud. Most organizations want a working set of backup data on-premises, and IT can use the public cloud to limit the growth of on-premises secondary storage. In most cases, the only data accessed from backup storage for recovery purposes is the most recent copy. Despite this reality, most organizations retain data for years, if not decades, just in case it is needed in the future. There are undoubtedly good reasons to retain data, both to meet external regulations and to satisfy internal corporate governance, but backup is not the place to perform that function.
Since the likelihood of the organization reaccessing backup data after a few days is small, this data is a candidate for archiving to public cloud storage, which limits the growth of on-premises backup storage. The problem is that few vendors support this use case, and it solves only one part of the problem: minimizing the growth of on-premises backup storage. Most of these solutions do nothing to reduce the growth of primary storage or to provide insight into what type of data is stored on the secondary storage tiers.
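The age-based policy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BackupCopy` record, the `select_for_cloud_archive` function, and the seven-day on-premises window are hypothetical and do not come from any particular backup product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupCopy:
    # Hypothetical catalog record for one backup copy.
    dataset: str
    created: datetime
    size_bytes: int


def select_for_cloud_archive(copies, now, keep_on_prem=timedelta(days=7)):
    """Return copies older than the on-premises window.

    The newest copy of each dataset always stays on-premises, because it is
    the copy most likely to be needed for recovery.
    """
    newest = {}
    for copy in copies:
        current = newest.get(copy.dataset)
        if current is None or copy.created > current.created:
            newest[copy.dataset] = copy

    cutoff = now - keep_on_prem
    return [
        copy
        for copy in copies
        if copy.created < cutoff and copy is not newest[copy.dataset]
    ]
```

Keeping the newest copy local regardless of its age reflects the point that recoveries almost always use the most recent copy. A real policy would also need to account for retention rules and cloud retrieval times.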
The Importance of Backups with Insight
The next generation of data protection solutions also needs to provide insight into the data it protects. Armed with this insight, organizations can make decisions based on details about the data. A simple example is archiving old data off primary storage as well as secondary storage, freeing up capacity on both tiers. That archive is another use case for public cloud storage, although IT needs to weigh the cost of storing data in the cloud perpetually against the cost of housing that data on-site.
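One rough way to frame the cloud-versus-on-site comparison is a break-even calculation. The sketch below is a simplification with hypothetical inputs: the function name and all parameters are assumptions, and it ignores egress and retrieval fees, hardware refresh cycles, and data growth.

```python
import math


def break_even_month(tb, cloud_per_tb_month, onprem_upfront, onprem_per_tb_month):
    """First whole month in which cumulative cloud spend exceeds on-site spend.

    Cloud cost after m months:   cloud_per_tb_month * tb * m
    On-site cost after m months: onprem_upfront + onprem_per_tb_month * tb * m
    Returns None when cloud storage never becomes the more expensive option.
    """
    monthly_gap = (cloud_per_tb_month - onprem_per_tb_month) * tb
    if monthly_gap <= 0:
        return None
    return math.floor(onprem_upfront / monthly_gap) + 1
```

The longer the data must be retained, the more the recurring cloud charge matters relative to the one-time on-site purchase, which is why perpetual retention deserves this kind of check.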
Another use case for backups with insight is complying with data privacy regulations like GDPR and the California Consumer Privacy Act (CCPA), which require organizations to remove a customer’s data at that customer’s request. Known as “the right to be forgotten,” these provisions threaten to break traditional backup solutions. A data management solution armed with insight may be the ultimate solution, since it can quickly identify data belonging to a particular user or customer and eliminate it without compromising the integrity of the data set.
The defensible removal of data is about more than fulfilling right-to-be-forgotten requests. There is also data that the organization can justifiably remove, but again the organization must identify that data and document its removal. Once again, insight is critical.
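The identify-remove-document sequence might look like the following minimal sketch. It assumes a hypothetical catalog of dictionaries with `owner` and `path` keys; the `remove_subject_data` function and the audit record layout are illustrative, not the behavior of any specific product.

```python
import hashlib
from datetime import datetime, timezone


def remove_subject_data(catalog, subject_id, reason, audit_log, now=None):
    """Drop every catalog entry owned by subject_id and document each removal.

    Returns the entries that remain. One record per removed entry is
    appended to audit_log.
    """
    now = now or datetime.now(timezone.utc)
    kept, removed = [], []
    for entry in catalog:
        (removed if entry["owner"] == subject_id else kept).append(entry)

    for entry in removed:
        audit_log.append({
            # Store a hash rather than the path itself, so the log does not
            # retain identifying details of the data that was removed.
            "path_sha256": hashlib.sha256(entry["path"].encode("utf-8")).hexdigest(),
            "reason": reason,
            "removed_at": now.isoformat(),
        })
    return kept
```

The same function covers both cases discussed above: a customer request and a removal the organization initiates itself differ only in the `reason` that is recorded.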
Public cloud storage can play a vital role in the protection and management of unstructured data. It has to be used correctly, however. Most solutions, if they support the public cloud at all, use cloud storage to mirror the entire backup or archive repository instead of using it intelligently. In our latest webinar, Storage Switzerland and Igneous Systems discuss how to evolve unstructured data management to handle both the growth of the data set and the increasing number of regulations that surround it. Watch “The Elephant in the Datacenter – Protect, Manage, & Leverage Unstructured Data” now.