How can a cost-effective object storage system also provide fast recalls? These systems are not typically designed for performance, but IT professionals are told time and time again to archive to object storage because of its cost-effectiveness and its speedy recalls. In fact, the expectation is that it will recall a user-requested file almost as fast as high-performance flash.
Why Are Object Storage Systems Slow?
The first piece of the puzzle is to understand why object storage systems are slow.
First, slow is a relative term. Object storage is certainly slow compared to a high-performance, block-based flash array, but compared to tape or remotely connected cloud storage it is very fast, especially for single file picks.
Second, object storage systems keep a much more robust metadata catalog on the objects (files) they store than a traditional block or NAS system does. This metadata becomes very valuable after the data has been stored for a few years, because it gives the organization a whole new way to find the information it stores. Creating, storing and managing all this metadata does create extra work for the object storage system, which manifests as latency.
Why Are Object Storage Systems Fast Enough?
The reason most object storage systems can recall data as fast as performance-oriented storage systems is that they handle far less simultaneous traffic. When configured correctly, most object stores hold data that has not been accessed in a number of months, maybe longer. Recalls of archived data don’t tend to be random; they are made for a specific user or event. Also, recalls typically come in a batch. A typical request might be to give the user all files for John Smith accessed from March to April. Or it could be that a file was archived and the user recalled that file by clicking on a stub file. Finally, these requests don’t come repeatedly throughout the day.
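A batch recall like the John Smith example amounts to a filter over the object store's metadata catalog. The following is a minimal sketch of that idea, not any vendor's API: the `ObjectRecord` type, its fields, the `select_batch` helper and the sample data are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ObjectRecord:
    """Hypothetical catalog entry: one archived object plus two metadata fields."""
    key: str
    owner: str
    last_accessed: date


def select_batch(catalog, owner, start, end):
    """Return the keys of objects owned by `owner` and last accessed in [start, end]."""
    return [
        record.key
        for record in catalog
        if record.owner == owner and start <= record.last_accessed <= end
    ]


# Made-up sample catalog, for illustration only.
catalog = [
    ObjectRecord("reports/q1.pdf", "John Smith", date(2023, 3, 14)),
    ObjectRecord("reports/q2.pdf", "John Smith", date(2023, 6, 2)),
    ObjectRecord("plans/site.dwg", "Jane Doe", date(2023, 4, 1)),
]

# "All files for John Smith, accessed from March to April."
batch = select_batch(catalog, "John Smith", date(2023, 3, 1), date(2023, 4, 30))
print(batch)  # ['reports/q1.pdf']
```

A real object store would answer this kind of query from its own metadata index rather than by scanning a list, but the shape of the request is the same: one targeted selection that produces a small batch, not a stream of random IO.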
Most of the time the object store is idle, waiting for IO. It is not overwhelmed with thousands of simultaneous IO requests like primary storage. By definition, archived data is archived because it has not been accessed in a few months or years. Even in an aggressive archive, where data is archived 45 days after it was last accessed, recall rates don’t increase enough to impact performance.
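The 45-day rule mentioned above can be expressed as a simple age check on each file's last-access time. This is a rough sketch under stated assumptions: the function names, the `idle_days` parameter and the sample timestamps are hypothetical, and a real archiving product would read access times from the file system and apply its own policy engine.

```python
from datetime import datetime, timedelta


def is_archive_candidate(last_accessed, now, idle_days=45):
    """True if the file has gone at least `idle_days` days without being accessed."""
    return now - last_accessed >= timedelta(days=idle_days)


def archive_candidates(last_access_by_path, now, idle_days=45):
    """Return a sorted list of paths whose last access is old enough to archive."""
    return sorted(
        path
        for path, last_accessed in last_access_by_path.items()
        if is_archive_candidate(last_accessed, now, idle_days)
    )


# Made-up last-access times; `now` is fixed so the example is repeatable.
now = datetime(2024, 6, 30)
last_access = {
    "a.docx": datetime(2024, 6, 20),  # idle 10 days: stays on primary storage
    "b.xlsx": datetime(2024, 5, 1),   # idle 60 days: archive
    "c.pptx": datetime(2024, 5, 16),  # idle exactly 45 days: archive
}

candidates = archive_candidates(last_access, now)
print(candidates)  # ['b.xlsx', 'c.pptx']
```

Lowering `idle_days` makes the policy more aggressive, which is only practical when recalls stay fast enough that users do not notice the data has moved.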
Because of these factors, the user experiences a recall that is as fast, or almost as fast, as accessing that data on primary storage. This seamless performance is especially critical when the organization has implemented a transparent recall technique where a stub file is left in place of an archived file. The user clicks on that stub file and gets the real file almost as fast as if it had never been migrated.
Seamless archiving requires zero-impact recall performance. A modern object storage system should be able to deliver the file back to the user almost as fast as primary storage, at least from the user's perspective. When this seamless environment is achieved, IT can confidently and aggressively archive all inactive data. That will drive down the cost of storage and eliminate storage server upgrades.
Object storage is an ideal back-end for Windows file servers. To learn how to create a seamless archive for your Windows file server environment, watch our on-demand webinar “Capacity – Ransomware – Protection – Three Windows File Server Upgrades to Avoid”.
Sponsored by Caringo