The case for implementing an archive strategy is compelling: move the inactive 80% of data that hasn’t been accessed in the past year off of production storage to an archive store. An archive solution is less expensive, consumes less power and less data center floor space, and does not need to be protected by the backup process. The cost savings of an archive solution are extraordinary, with many organizations seeing complete investment recovery within months. So what, then, is killing archive?
The only legitimate concern that could kill an archive project is how the solution will respond when a user or application needs data back from the archive. In reality, the chances of a recall are so minuscule that you could quite literally delete the data instead of archiving it and never feel the impact. Of course, no IT professional is going to delete 80% of their data, and organizations are increasingly asking IT to keep all data forever. Data retrieval is a legitimate concern, but what usually kills an archive project are the concerns that archive vendors themselves create in the delivery of their solutions. These solutions are either overly complex, proprietary or both.
The number one cause of death for an archive project is complexity. Some archive “solutions” are a hodgepodge of software and hardware from a variety of vendors, and their integration requires a team of archiving specialists. First, a software application has to be purchased to analyze the data for archive-worthy candidates. Second, another software application needs to be bought to manage writing to the archive storage target (disk, tape or cloud). While vendors argue over which target is best suited for archive, the truth is there is no “best”; what is best depends on the organization’s needs. The problem is that most “target managers” force the choice upon you. To some extent the target choice does not matter; the target is just one small piece of a very complicated puzzle.
Vendor Lock In
The number two killer of an archive project is vendor lock-in. An alternative to the complexity described above is a turnkey solution from a single vendor. The problem with this approach is that it forces you to use not only a specific archive target but also a specific software component for moving data to that archive. That software also often writes data in a proprietary format. This level of lock-in is particularly troubling since archive data, by definition, will often be retained for years if not decades.
The Archive Workarounds
Given the choice between a very complex but flexible archive and a proprietary archive solution, organizations have taken another path: expanding primary storage to keep up with capacity and retention demands. While this seems like the path of least resistance, it is also the most expensive path, not only in terms of storage costs but also in power and floor space costs. Additionally, these systems are not designed to retain information long term; they lack integrity checking and automated media migration.
Data Preservation – An Archive Resurrection
Data Preservation is the next step in archiving. At its heart is an appliance (physical or virtual) that simplifies the classification and movement of data and abstracts the archive targets. The data preservation appliance leverages operating system APIs to scan file systems for inactive data and then moves that data to the archive target of the organization’s choice. Data can also be manually copied to the system, since it presents the archive target as a CIFS, NFS or S3 (object) mount point.
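To make the scanning step concrete, here is a minimal sketch of how such an appliance might identify inactive files, using the standard last-access timestamp exposed by the operating system. This is an illustrative example, not vendor code; the one-year threshold and the function name are assumptions for the sketch.

```python
import os
import time
from pathlib import Path

# Assumed policy from the article: data untouched for a year is archive-worthy.
INACTIVITY_THRESHOLD_DAYS = 365

def find_archive_candidates(root: str, threshold_days: int = INACTIVITY_THRESHOLD_DAYS):
    """Yield files whose last-access time is older than the threshold."""
    cutoff = time.time() - threshold_days * 24 * 60 * 60
    for path in Path(root).rglob("*"):
        try:
            # st_atime is the last-access timestamp reported by the OS.
            if path.is_file() and path.stat().st_atime < cutoff:
                yield path
        except OSError:
            continue  # file vanished or is unreadable mid-scan; skip it
```

In practice a real scanner would also honor exclusion lists and note that some filesystems mount with access-time updates disabled (e.g. `noatime`), in which case modification time is a common fallback.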
Data preservation also stores that archive in a non-proprietary format so that the data preservation solution is not required to recover data. However, if the appliance is in place, it can retrieve data transparently for the requesting user or application, without having to involve IT. Finally, the data preservation solution can live up to its name by performing routine data verification and media migration to ensure the data it handles will be readable for decades to come.
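The routine data verification mentioned above is typically done by recording a cryptographic checksum for each file at ingest and periodically re-hashing the stored copies to catch silent corruption. The sketch below shows one plausible approach using SHA-256 and a JSON manifest; the manifest format and function names are assumptions for illustration, not the product's actual mechanism.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large archives don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_manifest(archive_root: str, manifest_path: str) -> None:
    """At ingest: record a checksum for every file in the archive."""
    manifest = {str(p): sha256_of(p)
                for p in Path(archive_root).rglob("*") if p.is_file()}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))

def verify_manifest(manifest_path: str) -> list:
    """Later: re-hash each file and report any that no longer match."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [path for path, expected in manifest.items()
            if not Path(path).is_file() or sha256_of(Path(path)) != expected]
```

A verification pass that returns an empty list means every archived file still matches its original checksum; any mismatch flags a copy that should be repaired from a replica before the media is migrated.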
If data preservation can simplify the classification and movement of data and shield the IT professional from directly interfacing with the target, then IT is free to select the target that makes the most sense for the organization. Today’s options include scale-out disk, tape and cloud. In our next column we’ll compare these technologies to see which makes the most sense for your organization.
Sponsored by FujiFilm Dternity Powered By Strongbox