For as long as there has been data, there has been a quest to consolidate it onto a single storage system, but that quest never seems to be satisfied. The problem is that there are essentially two types of data: active and archive. Active data typically needs fast I/O response time at a reasonable cost. Archive data needs to be very cost effective with reasonable response times. Storage systems that try to meet both of these needs in a single system often end up doing neither particularly well. This has led to the purchase of data- and/or environment-specific storage systems and, with it, storage system sprawl.
Enterprises cannot afford the storage system sprawl they are currently experiencing, but at the same time they can’t afford to compromise either performance or cost efficiency. It is time for enterprises to move to a two-tier storage architecture that provides a best-of-breed storage experience for both types of data, active and archive. The emergence of two storage technologies, flash and object storage, is making the two-tier storage architecture a reality. It is also becoming easier for IT to get these two storage environments to interact with each other.
Tier 1 – All-Flash
Flash storage has moved beyond being merely the go-to performance option for the enterprise. All-flash arrays’ safe use of MLC flash, combined with data efficiency technologies like thin provisioning, deduplication and compression, has allowed vendors to legitimately claim price parity with performance-focused hard disk arrays. These flash arrays eliminate the need to constantly fine-tune storage performance, since every workload on them has access to thousands of IOPS (Input/Output Operations per Second).
While all-flash arrays have achieved price parity with performance-focused hard disk arrays, they have not, and probably will not, achieve price parity with capacity-focused hard disk arrays. At the same time, not all data should be on all-flash all the time. Flash is best used and justified for active data sets, but as data ages it should be moved to a more cost-effective, more scalable storage tier. In fact, many data sets should never be on flash, even at the point of creation. Good examples are sensor data and video surveillance data.
Unlike early flash implementations that were cache oriented, all-flash arrays are not capacity limited. Several of these systems are scale-out architectures that can scale to single-digit petabytes. This means that administrators don’t have to aggressively move data off the flash array. Data management and movement to the second tier can be more of a scheduled maintenance operation.
The Object Option
There are several options available for this secondary tier of storage, and as we discuss in our article “Should Enterprises replace NAS with Object Storage?”, one well worth considering is object storage rather than high-capacity NAS. Archive data needs to be stored very cost effectively and with high durability.
Data could be on this tier for years, if not decades. IT needs to have confidence that the data on this tier will be readable years after originally being stored. This tier also needs to scale to tens, potentially even hundreds, of petabytes. Finally, this tier needs to receive input from a variety of sources. Data also needs to be occasionally migrated from the primary storage tier, which means the archive tier has to support traditional block and file protocols. But this tier also needs to support newer, more cloud-like protocols, such as object interfaces like S3, which Internet of Things devices may use to transmit data. Object storage checks all of these boxes.
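To make the object model concrete, here is a minimal in-memory sketch of the flat key/value interface that protocols like S3 expose. This is illustrative only: a real deployment would talk to the object store through an S3 client library, and the sensor-data key convention shown is a hypothetical example, not a standard.

```python
class ObjectStore:
    """Minimal in-memory stand-in for an object store's interface:
    a flat namespace mapping string keys to whole, immutable byte
    blobs plus optional metadata, much like S3's PUT/GET object calls."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # Objects are written whole; there is no partial, in-place update.
        # This simple contract is part of what lets object stores scale
        # to hundreds of petabytes while remaining highly durable.
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        return self._objects[key][0]


store = ObjectStore()
# A date/device-based key convention (illustrative) keeps billions of
# sensor readings addressable without any file-system hierarchy.
store.put("sensors/2016/03/device-42/reading-0001.json",
          b'{"temp": 21.5}',
          metadata={"content-type": "application/json"})
```

Because the namespace is flat, "directories" are just key prefixes, which is why object stores scale past the limits of traditional file systems.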
Making The Two Tiers Work Together
In the past, the challenge with a multi-tier architecture has been moving data between the storage systems. Simply reducing the number of tiers to two makes this process much more manageable. As mentioned above, several all-flash arrays can scale to petabytes, so moving data is not an urgent day-to-day task. It can be scheduled on a periodic basis, once a week or even once a month. Also, the data types on the active and archive tiers are so different that many data sets will never need to cross tiers. For example, databases and applications will likely stay on flash, and sensor data will likely always stay on object storage.
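A scheduled tiering job of this kind typically boils down to an age-based selection policy. The sketch below shows one plausible form such a policy could take; the 30-day threshold and the `archive_candidates` helper are hypothetical, and a real job would follow this selection step with a copy to the object tier.

```python
import time
from pathlib import Path

# Hypothetical policy: anything unmodified for 30 days becomes
# a candidate for the weekly or monthly archive run.
ARCHIVE_AGE_DAYS = 30


def archive_candidates(root, max_age_days=ARCHIVE_AGE_DAYS):
    """Return files under root whose last modification time is older
    than the threshold -- the set a scheduled job would then move
    from the flash tier to the object tier."""
    cutoff = time.time() - max_age_days * 86400
    return [p for p in Path(root).rglob("*")
            if p.is_file() and p.stat().st_mtime < cutoff]
```

Because the flash tier can hold petabytes, this job can run infrequently and off-hours rather than as a constant background migration.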
There are times when movement of data between tiers will make sense. Either a backup or an archive migration will typically move data from the flash tier to the object tier. Flash array vendors like SolidFire have added the capability to perform snapshot backups directly to an object storage system like Cloudian’s HyperStore solution. In addition, both vendors have integrated with OpenStack, so movement of data between the tiers can be automated through that framework.
A more straightforward use case in the enterprise is to simply copy, or even vMotion, older virtual machine data to the secondary storage device. In these use cases it is again critical that the object storage system support legacy protocols. Cloudian, for example, supports NFS. This allows administrators to simply copy older data using protocols with which they are already familiar.
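When the object store exposes an NFS export, this copy really is just an ordinary file operation against a mount point. The sketch below assumes the export is mounted somewhere like `/mnt/archive` (a hypothetical path), and the `copy_to_archive` helper is an illustrative wrapper, not a vendor tool.

```python
import shutil
from pathlib import Path


def copy_to_archive(src, archive_root):
    """Copy a file or a directory tree into the archive mount point,
    preserving timestamps and permissions (shutil.copy2 semantics).
    archive_root would typically be the NFS mount of the object store,
    e.g. /mnt/archive (hypothetical path)."""
    src = Path(src)
    dest = Path(archive_root) / src.name
    if src.is_dir():
        shutil.copytree(src, dest)
    else:
        shutil.copy2(src, dest)
    return dest
```

Since the archive tier looks like any other NFS file system to the administrator, no new tooling or protocol knowledge is required to retire older VM data onto it.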
A two-tier architecture that leverages flash for the active tier and object storage for the archive tier creates an almost ideal storage infrastructure for the data center. It provides high-performance flash for the most active data, and affordable object storage for unstructured and older data sets. The combination slows the growth of the all-flash storage tier while making sure that archived data is durably retained for decades.
Sponsored by Cloudian