All primary storage systems protect themselves in some way. Most provide protection from media failure by leveraging some form of RAID. They also provide snapshot technology to protect against data corruption or user error. Many also provide the ability to replicate data to a similar off-site storage system. Some, as we describe in our ChalkTalk Video, “Using the 3-2-1 Rule to Protect Against Ransomware,” integrate directly with third-party backup applications. But as any IT professional responsible for data protection knows, these are just the foundation components of a much larger data protection requirement. NAS 2.0 should collapse the external components and provide a complete data protection capability within itself.
The Data Protection Problem
There is a simple rule of thumb when it comes to data protection – the 3-2-1 rule. This means organizations should have at least three copies of data on two different types of media, with at least one copy off-site. While 3-2-1 sounds simple, implementation is challenging.
There is seldom a universal controlling application that ensures there are three copies of each piece of data. Most organizations tend to overcompensate by making too many copies. The problem with “too many” is knowing which copy is the most appropriate for each recovery situation.
“Two different types of media” is increasingly difficult to achieve as hard disk drives become the primary, and in many cases the only, form of data protection storage organizations use. For organizations without tape, a copy on a completely different type of storage system, such as cloud or object storage, is an acceptable alternative.
“One copy off-site” is the most achievable component of the 3-2-1 rule, but many organizations now have multiple sites or already have data off-site because it is in the cloud. Making sure each site has the right copy of data, and making sure there is a centralized, protected off-site storage area, is critical. When the cloud holds a unique copy or version of data, it is important that the data be copied either back on-premises or to an alternate cloud.
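To make the rule concrete, the three conditions above can be expressed as a short validation routine. This is a minimal sketch, not part of any NAS 2.0 product; the `Copy` type and the media and site names are our own illustrative choices:

```python
from dataclasses import dataclass

@dataclass
class Copy:
    media: str  # e.g. "disk", "tape", "cloud"
    site: str   # e.g. "primary-dc", "cloud-region-a"

def satisfies_3_2_1(copies: list[Copy], primary_site: str) -> bool:
    """Check a data set's copies against the 3-2-1 rule:
    at least 3 copies, on 2 media types, with 1 copy off-site."""
    enough_copies = len(copies) >= 3
    two_media_types = len({c.media for c in copies}) >= 2
    one_offsite = any(c.site != primary_site for c in copies)
    return enough_copies and two_media_types and one_offsite

copies = [
    Copy("disk", "primary-dc"),      # production copy
    Copy("disk", "primary-dc"),      # local snapshot
    Copy("cloud", "cloud-region-a"), # off-site replica
]
print(satisfies_3_2_1(copies, "primary-dc"))  # True
```

Note that the example passes because the cloud replica counts as both the second media type and the off-site copy, which is exactly the substitution described above for organizations without tape.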
The problem is that meeting these requirements today requires a multitude of tools that are not interconnected. That lack of interconnection means each process has to be independently managed and monitored. It also means there is likely significant overlap between processes. That overlap results in extra copies of data, which reduces overall capacity efficiency.
NAS 2.0 – The Self-Protecting NAS
A NAS 2.0 solution should provide a holistic means of protecting data, eliminating additional layers of data protection software. Eliminating these layers should not only ease the administrative burden of IT but also improve the quality of the protection process and increase the efficiency of data storage consumption.
The first order of business is not to replace the underlying technologies that are often already in place. For example, if as part of its management the NAS 2.0 system is storing data on a storage array with built-in RAID protection, it should certainly leverage that capability.
The primary means of improving protection is to make the process more granular. Most protection strategies are volume- or storage-system-based. A NAS 2.0 system should have a granular understanding of the data it stores at a directory or file level. This granularity allows data protection to become policy-based.
Policy Driven Data Protection
A NAS 2.0 system, as defined in other blogs in this series, becomes a centralized storage architecture for all the unstructured data within an organization. There will be vastly different types of data stored within the architecture, each with different degrees of value and therefore different data protection requirements. Policy-driven data protection enables protection to vary based on these different data value classifications.
A NAS 2.0 architecture will also consolidate multiple storage arrays and storage locations (on-premises and the cloud). A policy-based protection strategy enables protection to span across multiple storage hardware devices and even multiple locations.
Each policy can operate on specific attributes driven by the value and location of the data. For example, each data set within the architecture can have its own schedule for when protection occurs. Each can also dictate the number of copies, both while the data is active and as it becomes dormant. The number of protected copies can automatically shrink as the value of the data changes. Each policy can dictate where those copies should reside: disk and tape, disk and cloud, or cloud A and cloud B, for example. Finally, each policy can use a different form and level of checksum to confirm the validity of the protected copies.
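The attributes above can be sketched as a hypothetical policy object. The attribute names, cron-style schedule string, and `hashlib`-based checksum verification are our own illustrative assumptions, not the interface of any actual NAS 2.0 system:

```python
import hashlib
from dataclasses import dataclass

@dataclass
class ProtectionPolicy:
    name: str
    schedule: str        # hypothetical cron-style schedule, e.g. every 4 hours
    active_copies: int   # copies required while the data is in active use
    dormant_copies: int  # copies retained after the data goes dormant
    targets: tuple       # where copies reside, e.g. ("disk", "cloud")
    checksum: str        # hashlib algorithm used to validate copies

    def verify(self, original: bytes, replica: bytes) -> bool:
        """Confirm a protected copy matches the original via checksum."""
        digest = lambda data: hashlib.new(self.checksum, data).hexdigest()
        return digest(original) == digest(replica)

# A policy for a high-value data set: frequent protection, three active
# copies shrinking to two as the data's value declines.
finance = ProtectionPolicy(
    name="finance-shares",
    schedule="0 */4 * * *",
    active_copies=3,
    dormant_copies=2,
    targets=("disk", "cloud"),
    checksum="sha256",
)
print(finance.verify(b"ledger-block", b"ledger-block"))  # True
```

The point of the sketch is that every attribute the text lists, including schedule, copy counts, copy locations, and checksum strength, becomes a per-data-set setting rather than a system-wide one.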
A NAS 2.0 architecture simplifies the storing of unstructured data by consolidating all of its various types under a single umbrella. Part of the value of this consolidation should be the simplification of the data protection process. But for the data protection process to be effective, it needs to be granular at a sub-volume level, which enables a policy-based protection scheme that can span the various storage systems within the architecture and the various locations from which the organization might operate.