Creating an Object Storage System that Bridges the gap to Tape

Not all data is equal. It needs to be stored on different types of media depending on the use case. Even within the archive data set, not all data is created equal. Just as there are tiers of storage for primary data, archive data should support multiple storage targets to achieve the best balance of cost and organizational efficiency. The problem is that archive storage is missing a middle tier of disk storage that bridges the gap to tape.

The Types of Archive Data

The reason multiple tiers of archive hardware are needed is that there are three types of data within the archive data set. First there is data that has just qualified for archiving because it has not been accessed for more than six months. This data has the highest probability of being recalled, and users will want access to that data almost as fast as if that data were still on primary storage. A scalable NAS, leveraging high-capacity hard disks, can service this class of archive data well.

There is also data within an archive that is multiple years old and may never be accessed. This is the long-term archive data set. Storing this data economically while making sure it remains accessible is critical. It does not need to be recovered instantly; a few hours or even days may be an acceptable recovery time. A tape library is an ideal solution for this category of archive data. It requires no power when not in use and can scale to very high capacities. In practice, if this data is needed, restoration from tape should take less than an hour.

In between the recently archived and long-term archived data sets is the middle archive. This intermediate data was last accessed somewhere between one and seven years ago. Object storage is an ideal target for this data set, but most object storage vendors have not optimized their solutions for this use case. These solutions have a four-year lifecycle similar to that of a NAS storage system. Spectra Logic recently announced ArcticBlue, which increases the endurance of archive disk systems and can be the bridge from NAS to tape.
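The three age bands described above can be sketched as a simple placement rule. This is only an illustration, not any vendor's policy engine: the function name `choose_archive_tier`, the tier labels, and the exact day counts (six months taken as 182 days, a year as 365 days) are assumptions made for the example.

```python
from datetime import date

# Thresholds follow the article's age bands; the day counts are approximations.
ARCHIVE_AFTER_DAYS = 182        # about six months without access
MIDDLE_AFTER_DAYS = 365         # one year
LONG_TERM_AFTER_DAYS = 7 * 365  # seven years


def choose_archive_tier(last_access: date, today: date) -> str:
    """Return a hypothetical tier label based on how long data has sat idle."""
    idle_days = (today - last_access).days
    if idle_days < ARCHIVE_AFTER_DAYS:
        return "primary"      # not yet eligible for archiving
    if idle_days < MIDDLE_AFTER_DAYS:
        return "nas"          # recently archived, most likely to be recalled
    if idle_days < LONG_TERM_AFTER_DAYS:
        return "object-disk"  # the middle archive
    return "tape"             # long-term archive
```

A real policy would likely weigh more than age, such as file type or project status, but age since last access is the criterion the three categories above are built on.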

What is ArcticBlue?

The goal of ArcticBlue is to deliver a storage solution that costs half as much as an archive NAS solution but is intended to store data twice as long. If it can achieve that goal, Spectra Logic may deliver an ideal bridge to tape.

Delivering on the Full Promise of SMR Drives

The key for Spectra Logic is to manage the lifecycle of the drive in the same way that the archive software manages the lifecycle of the data. At the heart of ArcticBlue are Seagate hard disk drives that leverage Shingled Magnetic Recording (SMR) technology to deliver 8TB of capacity.

Managing the SMR drive is an essential component of the ArcticBlue system. Unlike standard hard disk drives, which have independent tracks, the tracks on an SMR drive overlap, so updates must be handled intelligently to make sure data remains intact. For optimal use, SMR drives need to be written to sequentially in large blocks. They don’t respond well to data deletion or random I/O. The special handling of an SMR drive is very similar to that of tape media, which of course Spectra Logic has been doing for decades.
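To make the "sequential, large block" requirement concrete, here is a minimal sketch of the general idea: small incoming writes are buffered and only ever appended to the media in large blocks, much as they would be streamed to tape. The class name `SequentialBlockWriter` and the in-memory list standing in for the drive are assumptions for illustration; this is not a description of how ArcticBlue is implemented.

```python
class SequentialBlockWriter:
    """Buffers small writes and emits them as large, append-only blocks."""

    def __init__(self, block_size: int):
        self.block_size = block_size
        self._pending = bytearray()
        self.blocks = []  # stands in for the drive: blocks are only ever appended

    def write(self, data: bytes) -> None:
        self._pending.extend(data)
        # Emit full blocks only; any remainder waits for more data.
        while len(self._pending) >= self.block_size:
            self.blocks.append(bytes(self._pending[:self.block_size]))
            del self._pending[:self.block_size]

    def flush(self) -> None:
        """Write out whatever is left as a final, possibly short, block."""
        if self._pending:
            self.blocks.append(bytes(self._pending))
            self._pending.clear()
```

Nothing in the sketch overwrites or deletes an existing block, which mirrors why random I/O and deletions are a poor fit for overlapping tracks.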

Not Your Father’s MAID

Another critical capability in surpassing the typical life of a disk archive system is powering down the drives. Most disk archive systems keep their drives powered all the time. Doing so wastes power and shortens drive life. The concept of powering down drives is not new; laptops have done it for years, and MAID (Massive Array of Idle Disks) tried to provide an enterprise solution. While spin-down hard disk drive technology worked well in the laptop use case, MAID never actually worked for the enterprise.

Spectra Logic’s take on powering down hard disk drives has less to do with power efficiency, although there is some gain there, and more to do with prolonging the life of the disk drive itself. The system has four large bands of drives set in a 20+3 configuration for data protection. When not in use, groups can be independently powered down. But unlike a MAID storage system, ArcticBlue can power all the bands, making them all active at the same time. This power management reduces drive failure 1100% over a seven-year period.
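As a rough worked example of what a 20+3 layout implies, the arithmetic below assumes that "20+3" means 20 data drives plus 3 parity drives per protection group and that each drive is one of the 8TB SMR drives mentioned earlier. How many such groups make up a band is not stated, and file system overhead is ignored, so treat the figures as back-of-the-envelope only. The function name is hypothetical.

```python
def protection_group_summary(data_drives: int, parity_drives: int, drive_tb: float) -> dict:
    """Capacity arithmetic for one data+parity group (assumed layout, no overhead)."""
    total_drives = data_drives + parity_drives
    return {
        "raw_tb": total_drives * drive_tb,
        "usable_tb": data_drives * drive_tb,
        "usable_fraction": data_drives / total_drives,
        "drive_failures_tolerated": parity_drives,
    }


summary = protection_group_summary(data_drives=20, parity_drives=3, drive_tb=8)
# raw_tb = 184, usable_tb = 160, usable_fraction = 20/23 (about 87%),
# and the group survives the loss of any three drives under these assumptions.
```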

Of course, there will be times when data is requested from a powered-down band. The first differentiator for ArcticBlue stems from its ability to power all bands simultaneously. The system does not have to wait to power down one band so that another can be powered up. Another advantage that ArcticBlue has over legacy MAID systems is that it is built on an object foundation. It is driven by a RESTful API, which is perfectly capable of tolerating the latency of a power-up cycle.
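One way to picture why an object interface copes with spin-up delay: the client simply asks again after a pause instead of holding a file system call open. The sketch below is a generic retry loop with exponential backoff. The `fetch` callable, the `BandSpinningUp` exception, and the timing defaults are all hypothetical stand-ins, not part of any Spectra Logic API.

```python
import time


class BandSpinningUp(Exception):
    """Hypothetical signal that the drives holding an object are still powering up."""


def get_with_patience(fetch, key, max_wait_s=120.0, first_delay_s=1.0, sleep=time.sleep):
    """Call fetch(key), retrying with exponential backoff while drives spin up."""
    delay = first_delay_s
    waited = 0.0
    while True:
        try:
            return fetch(key)
        except BandSpinningUp:
            # Give up once the next pause would exceed the total wait budget.
            if waited + delay > max_wait_s:
                raise
            sleep(delay)
            waited += delay
            delay *= 2
```

An application built on a file protocol would more likely see a stalled mount or a timeout in the same situation, which is part of why legacy MAID was awkward to use.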

The Multi-Tier Archive

ArcticBlue provides the middle ground for an archive that requires large capacity and retention measured in decades. Organizations with this type of archive need it to be cost-effective while ensuring data integrity. To meet these requirements, an archive strategy will need to include three tiers of storage. The first is a front-end NAS tier that delivers reliable performance and high capacity. This tier will hold 1 to 2 years of data, and Spectra Logic’s Verde disk system can be the solution for it. The middle tier will hold data for 1 to 7 years and, for the reasons described above, ArcticBlue fills that requirement. Tape will be the final tier, containing data for seven years to decades, and Spectra Logic provides a suite of tape libraries that meet that requirement. Ideally, if both disk and tape are used in the archive, there should be the ability to dual copy, meaning that data is written to both disk and tape at the same time. This not only provides protection but also pre-populates the tape tier, so when it is time to remove the data from the disk tier it does not need to be copied yet again.
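The dual-copy idea can be sketched in a few lines. In this toy model two dictionaries stand in for the disk and tape tiers; the class and method names are invented for the illustration and do not correspond to any product interface.

```python
class DualCopyArchive:
    """Toy model of dual copy: every object lands on disk and tape at ingest."""

    def __init__(self):
        self.disk = {}
        self.tape = {}

    def put(self, key, data):
        # Dual copy: both tiers receive the object at the same time.
        self.disk[key] = data
        self.tape[key] = data

    def age_out_of_disk(self, key):
        # The tape copy already exists, so retiring the disk copy moves no data.
        if key not in self.tape:
            raise KeyError(f"{key} has no tape copy; refusing to drop the disk copy")
        self.disk.pop(key, None)

    def get(self, key):
        # Prefer the faster disk copy while it is still present.
        if key in self.disk:
            return self.disk[key]
        return self.tape[key]
```

The point of the design is visible in `age_out_of_disk`: expiring data from the disk tier is a delete, not a migration.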

StorageSwiss Take

While much of the hype in the industry is focused on all-flash arrays and production data, the significant challenge is storing the massive amount of secondary data that organizations are creating as well as meeting its unique demands. Spectra Logic has brought to market an impressive array of secondary storage solutions that promise to lower the cost of storing and retaining this data while preserving it and making it easier to access.

Twelve years ago George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS and SAN, Virtualization, Cloud and Enterprise Flash. Prior to founding Storage Switzerland he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.

Posted in Product Analysis
One comment on “Creating an Object Storage System that Bridges the gap to Tape”
  1. Tim Wessels says:

    Well, unless I missed something in the article, there was no mention of the Spectra Logic BlackPearl DS3 system, which sits at the heart of this architecture. BlackPearl DS3 adds some “extensions” to the AWS S3 API for dealing with the operation of tape drives, which would be installed in a Spectra Logic tape library. The tapes themselves are LTFS formatted and become the “last stop” for deep archive data. This architecture would work well with just three tiers…a flash array tier for “hot” or transactional data that is smart enough to migrate data based on policies to an AWS S3-compatible object-based storage tier for “warm” or “cold” data that needs quick retrieval. The AWS S3-compatible object-based storage tier would then use policies to tier data to the Spectra Logic BlackPearl DS3, which would then prepare the data objects for “deep archive” storage on a Spectra Logic tape library using LTFS formatted tapes.

    Although they claim to effectively utilize SMR HDDs in ArcticBlue, these drives have much higher error rates compared to PMR HDDs. While they are certainly not suitable for I/O intensive access, SMR HDDs strike me as not “trustworthy” HDD storage devices. Powering off scores or hundreds of SMR HDDs may conserve energy, and possibly extend the SMR HDD life, but it would be useful to get some hard data on this. Storiant is doing something similar with their “deep archive” storage servers, and both Spectra and Storiant rely on ZFS.

    On the whole, this is a good architecture. It uses currently available storage technology, along with an AWS S3-compatible object storage tier, to achieve what is probably the lowest total cost of ownership for life cycle data management. And it can accept upgrades and newer technology as they become available in the market.
