What Killed Archive?

Posted on October 7, 2015 by George Crump

The case for implementing an archive strategy is impressive: move the inactive 80% of data that hasn’t been accessed in the last year off production storage and onto an archive store. An archive solution is less expensive, requires less power and less data center floor space, and does not need to be protected by the backup process. The cost savings are extraordinary, with many solutions showing complete investment recovery in months. So what, then, is killing archive?

The only legitimate concern that could kill an archive project is how the solution will respond when a user or application needs data from the archive. In reality, the chances of a recall are so minuscule that you could quite literally delete the data, instead of archiving it, and not be impacted. Of course, no IT professional is going to delete 80% of their data, and organizations are increasingly asking IT to keep all data forever. Data retrieval is a legitimate concern, but what usually kills an archive project are concerns that the archive vendors create in the delivery of their solutions. These solutions are either overly complex, proprietary or both.

Complexity

The number one cause of death for an archive project is complexity. Some archive “solutions” are a hodgepodge of software and hardware from a variety of vendors, and integrating them requires a team of archiving specialists. First, a software application has to be purchased to analyze the data for archive-worthy candidates. Second, another software application needs to be bought to manage writing to the archive storage target (disk, tape or cloud). While vendors argue over which target is best suited for archive, the truth is there is no “best”. What is best depends on the organization’s needs. The problem is that most “target managers” force the choice upon you. To some extent the target choice does not matter; the complexity of the target is just one small piece of a very complicated puzzle.

Vendor Lock-In

The number two killer of an archive project is vendor lock-in. An alternative to the above complexity is for vendors to deliver a turnkey solution. The problem with this approach is that it forces you to use not only a specific archive target but also a specific software component for moving data to that archive. That software also often writes data in a proprietary format. This level of lock-in is particularly troubling since archive data, by definition, will often be retained for years if not decades.

The Archive Workarounds

Given the choice between a very complex but flexible archive and a proprietary archive solution, organizations have taken another path: expanding primary storage to keep up with capacity and retention demands. While this seems like the path of least resistance, it is also the most expensive path, not only in terms of storage costs but also in power and floor space costs. Additionally, primary storage systems are not designed to retain information long term; they lack integrity checking and automated media migration.

Data Preservation – An Archive Resurrection

Data Preservation is the next step in archiving. At its heart is an appliance (physical or virtual) that simplifies both the classification and the movement of data and abstracts the archive targets. The data preservation appliance leverages operating system APIs to scan file systems for inactive data and then moves that data to the archive target of the organization’s choice. Data can also be manually copied to the system since it presents the archive target as a CIFS, NFS or S3 (object) mount point.
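To make the scanning step concrete, here is a minimal sketch of how inactive files might be identified using standard operating system metadata. It is an illustration only, not how any particular appliance works: the function name `find_inactive`, the one-year default and the choice to treat the later of access time and modification time as "last activity" are all assumptions. Access times can also be unreliable on file systems mounted with options such as `noatime`.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional

SECONDS_PER_DAY = 86_400


def find_inactive(
    root: str, max_idle_days: int = 365, now: Optional[float] = None
) -> Iterator[Path]:
    """Yield files under `root` with no activity in the last `max_idle_days` days.

    Hypothetical helper for illustration; `now` can be supplied to make
    the cutoff reproducible.
    """
    current = time.time() if now is None else now
    cutoff = current - max_idle_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Assumption: the later of atime and mtime approximates last activity.
            if max(info.st_atime, info.st_mtime) < cutoff:
                yield path
```

A scan such as `list(find_inactive("/data/projects"))` (the path is a placeholder) would produce the candidate list that a mover could then copy to the chosen archive target.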

Data preservation also stores that archive in a non-proprietary format so that the data preservation solution is not required to recover data. However, if the appliance is in place, it can retrieve data transparently for the requesting user or application, without having to involve IT. Finally, the data preservation solution can live up to its name by performing routine data verification and media migration to ensure the data it handles will be readable for decades to come.
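Routine data verification can be pictured as recording a checksum for every archived file and re-checking it on a schedule. The sketch below shows that idea with SHA-256 from the Python standard library. The function names and the in-memory manifest are assumptions made for illustration, not a description of the sponsor's product.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to `root`) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing or no longer match the manifest."""
    damaged = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(rel)
    return damaged
```

An empty list from `verify` means every file still matches its recorded checksum. Any path it returns is a candidate for repair from a second copy before the media degrades further.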

Targets Matter

If data preservation can simplify the classification and movement of data and shield IT professionals from interfacing directly with the target, then IT is free to select the target that makes the most sense for the organization. Today’s options include scale-out disk, tape or cloud. In our next column we’ll compare these technologies to see which makes the most sense for your organization.

Sponsored by FujiFilm Dternity Powered By Strongbox

About George Crump

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.

Tagged with: Archive, Backup, Complexity, Crossroads, Data Preservation, Fujifilm, Migration
Posted in Blog

2 comments on “What Killed Archive?”

Nathan Golden says:

October 7, 2015 at 11:31 am

Great article! I remember deploying SCSI Express software in the 90’s to perform HSM (Hierarchical storage management). I thought by now HSM (or tiered storage) would be the norm, yet for many of the reasons you pointed to, this is not the case.

From the backup side (something near and dear to me), archive becomes more and more essential as the data sizes grow. It just doesn’t make sense to treat 10 year old data the same as 10 minute old data. The quantity of data keeps going up, up, up, while the hours in the day stay the same. Archiving older data is the best way to tackle the problem instead of throwing faster technology at it.

[Storage Switzerland] What Killed Archive? | PBS – Primo Bonacina Services says:

October 10, 2015 at 3:28 am

[…] [to continue, click HERE] […]
