Every few years the time comes to update and refresh a data center’s storage infrastructure. As part of this process, many organizations are considering all-flash arrays (AFAs). Vendors have flooded the market with AFA products, and for the IT professional, sorting through the available options can be daunting. Potentially more daunting is actually implementing the AFA in the existing infrastructure. When AFAs become part of a data-center-wide storage infrastructure, integrating them presents multiple challenges that storage administrators need to solve.
Is The All-Flash Data Center a Reality?
In theory, a simple way to solve the challenges of integrating AFAs is not to integrate them at all. In other words, consolidate all workloads onto a single all-flash system. This approach certainly solves the problem, as the data center ends up with a single point of storage management, and for most organizations the move to an AFA should eliminate performance problems as well.
Despite anecdotal reports of the all-flash data center, the reality is that most data centers can justify the total elimination of hard disk drives (HDDs) on neither cost nor performance grounds. The single biggest challenge is that all-flash data centers are an expensive proposition. While deduplication and compression help flash be more competitive, it is still more expensive than capacity disk. Additionally, something that all-flash vendors neglect to mention is that deduplication and compression can be applied to HDD-based systems as well, eliminating the theoretical advantage the AFA vendors claim thanks to data efficiency. As long as HDD-based systems are less expensive than AFAs, and as long as there are workloads whose performance demands don’t justify an all-flash investment, there will be HDDs in the data center.
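The cost argument above can be sketched as simple arithmetic. The following is a minimal illustration only: the prices and the 4:1 reduction ratio are hypothetical numbers chosen for the example, not figures from any vendor, and the function name is made up for this sketch.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per logical GB stored after deduplication and compression."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


# Hypothetical inputs, for illustration only.
flash_raw = 2.00   # dollars per raw GB of flash (assumed)
hdd_raw = 0.40     # dollars per raw GB of capacity disk (assumed)
ratio = 4.0        # assumed 4:1 data reduction

# The comparison often implied in AFA marketing: reduced flash vs. unreduced disk.
flash_reduced = effective_cost_per_gb(flash_raw, ratio)
hdd_unreduced = effective_cost_per_gb(hdd_raw, 1.0)

# The like-for-like comparison: the same reduction applied to both media.
hdd_reduced = effective_cost_per_gb(hdd_raw, ratio)

print(f"flash, reduced:  ${flash_reduced:.2f}/GB")
print(f"HDD, unreduced:  ${hdd_unreduced:.2f}/GB")
print(f"HDD, reduced:    ${hdd_reduced:.2f}/GB")
```

With these assumed inputs, data reduction narrows the gap only when it is credited to flash alone; once the same ratio is applied to both media, the original price ratio between flash and disk is unchanged.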
Finally, most AFAs come from start-up vendors that are increasingly using customized hardware to take full advantage of the differences between flash and hard disk, such as density and performance. Because these systems come from new vendors, they also arrive with new, vendor-specific methods of providing data services.
The Mixed System Reality
The data center will likely have a mix of AFAs, hybrid arrays and cost-effective, capacity-focused HDD-based systems. Since there will be more than just a single AFA in the environment, it is critical that the storage administrator develop a strategy to properly manage these discrete systems. The performance benefits of the AFA investment should not be nullified by yet another silo of storage that needs to be managed separately because it is incompatible with other storage systems in the environment. This applies even to an AFA from a legacy storage vendor, as many of these systems came to the vendor through acquisition and the only compatibility is that the logos match.
The organization can decide to manage multiple storage platforms manually, through brute force. But the administrators may also need to work around a missing or incomplete feature set. For example, many AFAs from startups still do not have the ability to replicate data off-site.
How To Integrate an AFA
Step One – Select the Right Software Defined Storage Solution
If the data center’s goal is to develop a consistent storage strategy where all features are universally applied across storage platforms, then it should leverage software defined storage (SDS) to deliver that common feature set. But the SDS solution has to be the right one for this use case.
First, the SDS solution has to provide integration and support of shared storage systems on existing network infrastructures like Fibre Channel (FC) and iSCSI. This also includes the support of existing storage in the environment like HDD arrays and hybrid systems. SDS should centralize the management of all these systems under a single set of services, so the management and execution of these services is identical across them. Many SDS solutions force a conversion to a hyper-converged architecture. While that may be a viable option for the data center in the future, leaving behind legacy storage assets and not being able to leverage well vetted shared all-flash technology is premature for many data centers.
Second, the SDS solution should support migration of data to the new AFA. Data can’t simply be copied, and even some block migration solutions won’t work. For example, almost all AFAs support thin provisioning, but a basic block migration tool will copy empty blocks and require the AFA to deal with them separately. In some cases the migration could break the thin provisioning feature altogether. The migration function should be thin-provisioning aware and not copy zero blocks. Additionally, this migration should occur privately on the storage network so as not to interfere with production applications. Finally, the migration should be executable while the application is online, which allows the AFA to be integrated with a limited amount of scheduled downtime.
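To make the idea of a thin-provisioning-aware copy concrete, here is a minimal sketch that skips all-zero blocks instead of writing them to the target. It is an illustration under stated assumptions, not how any particular SDS product works: the 4 KB block size, the function name and the use of in-memory streams in place of real volumes are all hypothetical.

```python
import io

BLOCK_SIZE = 4096  # assumed block size for this sketch


def thin_aware_copy(src, dst, block_size=BLOCK_SIZE):
    """Copy src to dst block by block, skipping blocks that contain only zeros.

    Skipped blocks are never written, so a thin-provisioned target would not
    have to allocate space for them. Returns (blocks_written, blocks_skipped).
    """
    zero_block = bytes(block_size)
    written = skipped = 0
    offset = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        if block == zero_block[:len(block)]:
            skipped += 1              # nothing to send for an empty block
        else:
            dst.seek(offset)          # keep data at its original offset
            dst.write(block)
            written += 1
        offset += len(block)
    return written, skipped


# Stand-in "volumes": one data block, two empty blocks, one data block.
source = io.BytesIO(b"A" * BLOCK_SIZE + bytes(2 * BLOCK_SIZE) + b"B" * BLOCK_SIZE)
target = io.BytesIO()
written, skipped = thin_aware_copy(source, target)
print(f"written={written} skipped={skipped}")
```

A naive copy would have written all four blocks. A real migration tool would also need to handle a volume that ends in empty blocks and to stay consistent while the application keeps writing, both of which this sketch leaves out.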
Third, the SDS solution should be tuned for the performance potential of the AFA. The reality is that the centralization of storage assets does add a layer of latency. The SDS developer needs to minimize this as much as possible. This is where the option to run the SDS solution on a dedicated set of clustered appliances or servers has significant value. Combining well written software with dedicated storage processing power and dedicated network resources can make sure that any latency and performance impact from the SDS architecture becomes unnoticeable to applications and users.
Fourth, the SDS solution needs to be able to provide data movement between the various storage systems within the data center. Not all data is the same, and its performance demands will change over time. And, as stated above, the data center will have multiple forms of storage. The SDS solution should enable data to be migrated to and from the AFA rapidly and non-disruptively. This allows data to be promoted or demoted without impact to applications or services. It also maximizes the investment in the AFA by allowing multiple applications to use it when they need the performance boost.
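The promote/demote decision described above can be sketched as a simple policy. This is only an illustration: the `Volume` type, the tier names and the read-rate thresholds are hypothetical, and a real SDS product would likely weigh many more signals than a single activity counter.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    tier: str            # "flash" or "hdd" (assumed tier names)
    reads_per_hour: int  # assumed activity metric


def plan_moves(volumes, promote_at=1000, demote_at=100):
    """Return (volume name, target tier) pairs for volumes that should move.

    Busy volumes on disk are promoted to flash; quiet volumes on flash are
    demoted to disk. Everything in between stays where it is.
    """
    moves = []
    for vol in volumes:
        if vol.tier == "hdd" and vol.reads_per_hour >= promote_at:
            moves.append((vol.name, "flash"))
        elif vol.tier == "flash" and vol.reads_per_hour <= demote_at:
            moves.append((vol.name, "hdd"))
    return moves


volumes = [
    Volume("month-end-reporting", "hdd", 5000),   # has become hot
    Volume("old-project-archive", "flash", 12),   # has gone cold
    Volume("order-database", "flash", 20000),     # hot and already on flash
    Volume("file-share", "hdd", 300),             # lukewarm, stays on disk
]
moves = plan_moves(volumes)
print(moves)
```

Keeping a gap between the two thresholds avoids moving a volume back and forth when its activity hovers near a single cutoff.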
Fifth, the SDS solution needs to provide built-in continuity and protection. While all AFAs have some form of RAID data protection built in, as mentioned above, many are lacking in disaster recovery features like synchronous and asynchronous replication. The SDS software should be able to monitor storage connections and volumes in order to sense a failure and immediately shift storage operations to another storage system or location. The replication should be WAN optimized via deduplication, compression and IP protocol optimization to minimize the investment in WAN bandwidth. At the destination, the secondary system should not need to be identical to the primary. An AFA is expensive; buying two, when one of those systems will basically sit idle until an actual disaster, simply does not make sense for most data centers.
Sixth, the SDS solution should be able to provide data efficiencies across the storage infrastructure, not just within a single storage system. An example of this is deduplication. If data is being migrated or replicated, only the unique blocks of data should be sent between storage systems. Further, if the organization decides, deduplication could be applied across storage systems so that there is no redundant data between them. In the same way, compression should be leveraged so that data being migrated or replicated does not need to be uncompressed, transmitted and then recompressed. It should be moved in its compressed state.
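As a rough sketch of sending only unique blocks, the example below identifies each block by a SHA-256 digest and transmits a block only when the target does not already hold one with the same digest. The function name, the dictionary standing in for the target system and the tiny block contents are assumptions made for illustration; the source does not describe how any product implements this.

```python
import hashlib


def replicate_unique(blocks, target_store):
    """Replicate blocks to a target, sending each unique block only once.

    target_store is a dict mapping digest -> block data, standing in for the
    remote system. Returns (recipe, bytes_sent), where recipe is the ordered
    list of digests needed to rebuild the data on the target.
    """
    recipe = []
    bytes_sent = 0
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in target_store:
            target_store[digest] = block   # only new blocks cross the wire
            bytes_sent += len(block)
        recipe.append(digest)
    return recipe, bytes_sent


blocks = [b"aaaa", b"bbbb", b"aaaa", b"aaaa"]   # toy data with repeats
target_store = {}
recipe, bytes_sent = replicate_unique(blocks, target_store)
rebuilt = b"".join(target_store[d] for d in recipe)
print(f"sent {bytes_sent} of {sum(len(b) for b in blocks)} bytes")
```

The same idea extends to compression: if blocks are stored compressed and the digest is computed over the stored form, they can be moved as they are rather than being uncompressed and recompressed in transit.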
Step Two – Select the Right All-Flash Array
Selecting the right SDS solution may alter the selection of the AFA. Since the SDS solution is providing all the data services, the IT planner can ignore most of the AFA software features and instead concentrate on hardware design. Hardware matters most in a flash storage system, and every aspect of the system should be tuned to deal with a near-zero-latency memory technology.
Focusing on flash hardware will allow the organization to select a dense and very high performance storage system, potentially at a lower price than a system overrun with features that the organization no longer needs. The ability to turn these features off, or to not have them “bundled” at all, may be a huge advantage in terms of both performance and price.
AFAs allow organizations to deliver applications and infrastructures that respond faster and scale further, but not all data needs to be on all-flash all the time, especially when the entire lifecycle of that data is considered. The ability to move data between storage systems is a critical element in the optimization of the AFA investment, and it is a capability that AFAs do not provide on their own. In addition, IT planners don’t need another silo of storage to manage with its own separate data services. SDS provides a logical way to eliminate these integration challenges. But as Storage Switzerland covered in its article “The Three Problems with Software Defined Storage,” SDS needs to be able to support legacy storage architectures, provide a complete and robust set of data services that the storage infrastructure can be centralized on, and not add latency that reduces the performance of the flash system.
Sponsored by FalconStor
FalconStor’s FreeStor is a software defined storage solution that provides a complete set of well-vetted storage management features. It can provide a single source of storage features and a single point of storage management. The solution has been optimized for all-flash devices while at the same time supporting a robust list of legacy storage platforms.