It comes as no surprise to IT professionals that data is growing, and that 90 percent of that growth is unstructured data. Users will not access 80 percent of that unstructured data again after 90 days. It simply does not make sense to store all of this data on primary flash or even hard disk-based systems. The answer is archiving, not backup. Backup makes repeated copies of data in case it is needed after a disaster. Archive creates one copy of data, one time, for long-term preservation and removes it from primary storage, lowering the cost of both primary storage and backup storage.
An archive system has to be effective. It has to be seamlessly accessible. It has to scale to petabytes, even hundreds of petabytes, of capacity. It has to be reliable enough for long-term data preservation. And of course, it has to be significantly less expensive than the primary storage the data is being moved from.
At the FujiFilm Global IT Executive Summit, HPE’s Chris Powers defined an archiving strategy that blends object storage and tape storage into an archive system. The architecture Powers defines creates a three-tier approach. The first tier, probably all-flash, is for the most active data set, which includes databases and recently created or modified user data. The second tier has two roles: first, it can be an initial staging area for data no longer accessed in the first tier; second, it can be an ingest point for new data, especially data from IoT devices, sensors and cameras.
Object storage, as we discuss at Storage Switzerland in many, many blogs and webinars, is the ideal candidate for this second tier of storage. Of course, scale-out NAS could be a candidate as well, but we continue to think that scale-out NAS is better suited as a short-term solution for more performance-sensitive unstructured data.
The third tier, according to Powers (and we agree), should be tape for most organizations. The problem with stopping at object storage is that it is still disk-based and requires all the things disk systems require: data center floor space, power and cooling. The key is that tape has to break the shackles of legacy tape concerns. Tape archive solutions now present tape as a network mount point, so moving data to a tape library is a simple drag and drop. More importantly, many of these archive solutions write data in the LTFS format, which provides portability between library vendors and even between archive software vendors.
The key is to merge these three tiers so that the latency of mounting a tape into a tape drive is hidden from the user. Primary storage, and certainly the secondary storage system, can act as a cache in front of the tape library.
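Tiering of this kind is typically driven by a simple age-based policy: files that have not been accessed within some window are moved from the disk cache tier down to the archive mount point. The sketch below is a minimal illustration of that idea in Python; the 90-day threshold, function name, and paths are assumptions for illustration, not HPE's implementation.

```python
import os
import shutil
import time

def archive_cold_files(cache_dir, archive_dir, max_idle_days=90):
    """Move files not accessed within max_idle_days from the disk
    cache tier to the archive tier (e.g., an LTFS tape mount point).
    Returns the list of file names that were moved."""
    cutoff = time.time() - max_idle_days * 86400
    moved = []
    for name in os.listdir(cache_dir):
        src = os.path.join(cache_dir, name)
        # Only demote regular files whose last access predates the cutoff.
        if os.path.isfile(src) and os.path.getatime(src) < cutoff:
            shutil.move(src, os.path.join(archive_dir, name))
            moved.append(name)
    return moved
```

Because the archive tier is presented as an ordinary mount point, the demotion is just a file move; a recall works the same way in reverse, with the mount hiding the tape-load latency.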
The other perceived tape negative is reliability. As a medium, tape is more reliable than hard disk drives. The difference is that a hard disk is sealed inside an array and seldom touched by human hands, while tape by definition is moved around all the time and potentially passes through many hands. Most library vendors, HPE being one of them, have the ability to monitor the tape environment and provide warning of conditions that may cause tape faults.
Powers ended his presentation with a video where an LTO tape cartridge was strapped to a hockey puck, slapped around for an hour and then successfully read again.
ROI of $300K+
The ROI is significant. As an example, Powers compared the cost of upgrading a storage system to add 200 more terabytes of capacity with the cost of creating the type of archive described earlier. The slide showed a potential savings of $335,000 by installing an archive system instead of upgrading the storage system.
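The savings arithmetic is straightforward: capacity times the per-terabyte price gap between the two tiers. The per-TB prices below are illustrative assumptions chosen to reproduce the slide's $335,000 outcome; they are not figures from Powers' presentation.

```python
def archive_savings(capacity_tb, primary_cost_per_tb, archive_cost_per_tb):
    """Savings from landing new capacity on an archive tier
    instead of expanding primary storage."""
    return capacity_tb * (primary_cost_per_tb - archive_cost_per_tb)

# Assumed prices (illustrative only): $2,000/TB for a primary storage
# upgrade vs. $325/TB for an object-plus-tape archive tier.
savings = archive_savings(200, 2000, 325)
print(f"${savings:,.0f}")  # $335,000
```

The exact per-TB numbers will vary by vendor and configuration, but the structure of the calculation is why the gap widens as unarchived cold data accumulates.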
In my opening blog for the summit, I wrote that tape has essentially won the economics argument; it is tape complexity that is the issue. The blended archive strategy that Powers describes does much to overcome the negatives of complexity and reliability while accentuating tape's cost advantages.