Creating a Crawl, Walk, Run Approach to Archive

In a recent entry, “What Killed Archive?”, I discussed why most data centers don’t move inactive data from expensive storage to less expensive storage, despite a very compelling return on investment (ROI). The number one killer of the archive process is the solutions themselves: they are too complex and have too many moving parts. There is, however, another killer: most archive solutions require that the organization jump in with both feet. Archiving inactive data is a large project, and while plenty of data qualifies, most solutions don’t allow the project to be broken into smaller, more manageable chunks.

Most archive solutions require that the IT professional first evaluate and implement an archiving software package. After installing the software package on a dedicated server, that server needs to either connect to storage or have storage installed inside of it. Then, if connectivity to tape, object storage or the cloud is part of the archive process, IT needs to set up and configure those components. All of the components need to come online at once, which makes a proof of concept difficult and expensive. Instead, IT needs a crawl, walk, run approach to archive: start small and gain confidence in the solution and its ROI potential before moving it into production.

Virtualizing Archive

If this “all-in” requirement is a top killer of archive projects, then virtualization of the process will be its savior. Almost every data center today invests in virtualization, and most virtualized hosts can easily support another virtual machine (VM). So, why not use these resources for a virtualized archive appliance? With an archive VM, IT administrators can begin an archive project by simply downloading software. Unfortunately, most archive appliances are physical, or their software can’t run within a VM. However, a few archiving vendors, such as Crossroads and FujiFilm, can deliver a virtualized archive appliance now.

The virtualized archive appliance enables a quick start for the archive process. Once the archive VM is in place, you can provision inexpensive disk capacity to the VM directly from the virtual cluster. Finally, for no cost at all the data center can see the impact of a real archive process.

Where to Start?

After installing the archiving VM, the next big question is where to start. Organizations should try to keep it simple, so the best idea is to have the software archive the oldest five percent of the company’s data. Depending on the software, you can find this data set either by selecting the oldest possible segment of data (everything not accessed in x years) or by specifically asking the software to archive the oldest five percent of data. Some archive solutions present the archive storage as a mount point, enabling manual selection and movement of data.
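For the manual, mount-point case, the two selection rules described above can be approximated with a short script. This is only a rough sketch, not how any particular archive product selects data: the function names `oldest_fraction` and `not_accessed_in` are hypothetical, "oldest" is assumed to mean least recently accessed, and the five percent is assumed to be measured by file count rather than by capacity. Access times are also only as reliable as the file system reports them; volumes mounted with options such as `noatime` will not reflect real usage.

```python
import time
from pathlib import Path


def _files_with_atime(root):
    """Yield (last_access_time, path) for every regular file under root."""
    for path in Path(root).rglob("*"):
        if path.is_file():
            yield path.stat().st_atime, path


def oldest_fraction(root, fraction=0.05):
    """Hypothetical helper: the least recently accessed `fraction` of files.

    Assumption: the fraction is taken by file count, not by bytes.
    """
    files = sorted(_files_with_atime(root), key=lambda item: item[0])
    count = int(len(files) * fraction)
    return [path for _, path in files[:count]]


def not_accessed_in(root, years, now=None):
    """Hypothetical helper: files whose last access is older than `years`."""
    now = time.time() if now is None else now
    cutoff = now - years * 365.25 * 24 * 60 * 60
    return [path for atime, path in _files_with_atime(root) if atime < cutoff]
```

Either list could then be reviewed by hand before anything is moved to the archive mount point, which keeps the first pass small and easy to reverse.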

Gaining Archive Confidence

Once the archive has begun, the next step is to gain confidence in the software. Archive confidence comes in two forms: first, when an archived file is recovered successfully, and second, when it becomes apparent that most of this data is never going to be asked for again. How recovery works depends on the design of the software and on the source of the data. In some cases, the archive software can trigger recovery automatically when the user tries to open the file; in other cases, the administrator will need to find the file and restore it.

Being able to recover a file and verify that the data is intact once you recover it is the acid test for an archiving solution. The real validation comes when the request for data occurs during a normal business day, not in a simulated recovery during testing. The problem with traditional archive is that by the time this real-world recovery occurs, the organization already has a large investment in archiving hardware.
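One common way to check that a recovered file is intact is to compare a checksum recorded before the file was archived with a checksum of the recovered copy. The sketch below assumes SHA-256 and assumes the administrator keeps the recorded digest somewhere outside the archive; the function names `sha256_of` and `is_intact` are hypothetical, and an archive product may well perform its own integrity checks internally.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(recovered_path, recorded_digest):
    """Hypothetical check: does the recovered file match the digest
    that was recorded before the file was archived?"""
    return sha256_of(recovered_path) == recorded_digest
```

Recording `sha256_of(path)` for a sample of files before the first archive run gives the team something concrete to compare against when a real restore request arrives.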

What makes the virtualized archive solution unique is that an organization can not only get to production quickly and cost-effectively, but also hold off on further investment until these ‘during the business day’ restore requests occur. Then, when it sees that the software does what the vendor says it will do, the organization can jump in with both feet and make investments in archive disk and tape as it sees fit.

Sponsored by Fujifilm Dternity, Powered by StrongBox

Twelve years ago, George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS and SAN, Virtualization, Cloud and Enterprise Flash. Prior to founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.
