Why Data Movement Breaks Data Management and How to Fix It

Posted on June 18, 2018 by George Crump

The primary goal of a data management strategy is to reduce storage costs. Achieving that goal requires moving data, at some point, from primary storage to less expensive secondary storage. The movement of data between these two tiers, and perhaps more tiers as the data ages further, is often what breaks a data management strategy, because once data is moved, users are impacted. Fixing the data movement problem is critical to both the short-term and long-term success of a data management strategy.

The Problems with Data Movement

The first problem with data movement is that the traditional data management architecture requires copying data from one storage system to another, and that copy passes through one or more intermediary servers that are connected via a network to both the primary and the secondary storage system. The process is time-consuming and introduces potential for error.

The second problem is one of access. While it’s true that most data that has not been accessed in the last 90 days will never be accessed again, some of it will, especially data created within the last six to nine months. Users want access to that data to be seamless, both in how they find it and in how quickly they can open it.

The problem, especially in the all-flash era, is that the performance gap between primary storage and secondary storage is too great. Users are accustomed to accessing their files on a high-performance all-flash array, but with data management in place they now have to access that data on a high-capacity NAS or object storage system across a slower network. Additionally, in many cases that access is not direct; the data must first be copied back to its original location before it is accessible.

Flash to Flash to Cloud and Data Movement

Using a flash to flash to cloud architecture solves these problems. The heart of the solution is the flash to flash component: a single storage system contains both high-performance NVMe flash and high-capacity SAS flash. Intelligence within the storage system automatically moves data from high-performance flash to high-capacity flash. The movement is completely transparent to users and applications, with no changes needed to their workflows.

Data movement is also internal to the storage system; it copies or moves data directly from NVMe flash to high-capacity flash, without passing through an intermediary server or an external network. As a result, transfers are rapid and are highly unlikely to be noticeable to users. The difference in recall performance is also almost unnoticeable, since both tiers are flash-based and provide excellent read performance.

Rapid data movement between tiers and rapid recall make sense for data that was last accessed between 90 days and one year ago. This is the data most likely to be recalled. Once data has gone more than a year without being accessed, the likelihood of someone accessing it again is very low. At that point, it makes sense for the organization to move the data to cloud or object storage. The move to cloud storage further drives down cost and allows the organization to take advantage of object storage’s excellent data retention capabilities.
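The age thresholds described above can be sketched as a simple placement rule. This is a minimal illustration only, not how any particular storage system implements tiering: the function name `choose_tier`, the tier labels, and the exact cutoffs (90 and 365 days, taken from the text) are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Cutoffs taken from the article's description; a real system may use
# different values or additional signals (these names are hypothetical).
CAPACITY_FLASH_AFTER = timedelta(days=90)
CLOUD_AFTER = timedelta(days=365)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Return the tier a file would belong to, based only on idle time."""
    idle = now - last_access
    if idle > CLOUD_AFTER:
        return "cloud"            # not accessed in more than a year
    if idle > CAPACITY_FLASH_AFTER:
        return "capacity_flash"   # idle for more than 90 days, up to a year
    return "nvme_flash"           # recently active data stays on the fast tier


if __name__ == "__main__":
    now = datetime(2018, 6, 18)
    for days_idle in (10, 200, 400):
        tier = choose_tier(now - timedelta(days=days_idle), now)
        print(days_idle, "days idle:", tier)
```

The rule uses last-access time alone because that is the only criterion the article discusses; in practice the first two tiers would be managed inside the storage system rather than by an external script.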

The Data Management Imperative vs. IT Reality

The movement to the object storage tier can be manual, or it can be automated via one of the various data management solutions on the market today. The organization will need to decide how important transparent recall is. While the feature is compelling, it does come with some overhead, and IT needs to decide whether that overhead is worth it for files that users may never access again. Storage Switzerland finds that when most organizations need data that is more than a year old, the request is a known event, driven by a discovery request or a need to run analytics on a specific set of data.
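For organizations that choose the manual route, the first step is finding the files that have gone more than a year without access. The following is a rough sketch of that step using only the Python standard library. The function name `archive_candidates` and its parameters are hypothetical, and it assumes the file system records access times reliably, which is not the case on volumes mounted with access-time updates disabled.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def archive_candidates(
    root: str,
    now: Optional[float] = None,
    min_idle_seconds: float = ONE_YEAR_SECONDS,
) -> Iterator[Path]:
    """Yield files under root whose last access is older than the cutoff."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                idle = now - path.stat().st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if idle > min_idle_seconds:
                yield path
```

The sketch only lists candidates; copying them to a cloud or object store, and deciding whether to leave a stub behind for recall, is left to whatever tooling the organization already uses.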

Conclusion

The flash to flash to cloud architecture is a cost-optimized alternative to going flash-only in the data center. It enables internal data management for recently active data, which means rapid response for the requester, and it makes the use of cloud or object storage as the long-term repository practical. Movement between the flash tiers is handled by the storage system, and movement to the cloud can be done manually, so the organization does not have to implement a completely new file-system infrastructure.

In our next blog we will discuss how flash to flash to cloud helps solve the data protection problem in addition to the data management problem.

About George Crump

George Crump is the Chief Marketing Officer at VergeIO, the leader in Ultraconverged Infrastructure. Prior to VergeIO he was Chief Product Strategist at StorONE. Before assuming roles with innovative technology vendors, George spent almost 14 years as the founder and lead analyst at Storage Switzerland. In his spare time, he continues to write blogs on Storage Switzerland to educate IT professionals on all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 30 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS, SAN, Virtualization, Cloud, and Enterprise Flash. Before founding Storage Switzerland, he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration, and product selection.
