The speed at which the data center is evolving is forcing IT to sacrifice proper storage architecture design. IT is just trying to keep its head above water; it doesn’t have time to swim. This reality leads to a primary storage tier that stores ALL data, not just active data. Increasingly, the primary storage tier is all-flash, and some vendors are proposing all-flash for all tiers of storage.
The problem is that the math doesn’t justify an all-flash primary storage approach, and it certainly doesn’t justify putting all tiers on all-flash. Compounding the problem, most hybrid storage and archive storage vendors only look at the high-level math to explain and size their solutions, which in turn leads to a failure to meet expectations. IT planners need to go deeper into the math to understand how to create a storage architecture that better manages data and keeps costs down.
Math vs. The All-Flash Array
As data centers look to refresh their storage infrastructures, many are considering all-flash arrays while others are looking for their second-generation all-flash array. As IT planners begin this process, the temptation is to make the entire primary storage tier flash, where one or more all-flash arrays store all primary data. The decreasing cost of all-flash systems, plus their density, makes creating an all-flash primary storage tier seem reasonable. However, once IT considers the usage patterns of the data on primary storage, all-flash ends up with a math problem. The numbers don’t add up.
How Data is Used
There are exceptions, but generally, most data segments follow a similar lifecycle. After a segment is created, users typically access and modify it frequently for about 24 hours. Access and modification then occur sporadically over the next several weeks, and eventually the segment goes dormant, rarely to be accessed again.
This access pattern holds true for both database records and documents. For example, when a new patient is added to a hospital’s patient record system, that record is very active while the patient is receiving treatment and then sits dormant until the patient checks back into the hospital. A document follows a similar pattern: high activity when it is first created, occasional edits as it is reviewed, and then, once everyone agrees on a final version, dormancy. The file is reaccessed only when it is referenced or used as the starting point for a new document.
Deep Data Math
(Disclaimer: while the numbers below have remained surprisingly accurate over the years and across hundreds of data centers, your actual results will vary.)
Most storage assessments show that an organization creates, modifies, and accesses about 5% of its total data capacity on a daily basis. The contents of that 5% change slightly each day; compared week over week, the segments accessed are very different. If IT planners look at data patterns over a three-week sliding window, most data centers find that about 15% of their data capacity is created, accessed, or modified during that time. This drift between days and weeks is what made hybrid arrays less appealing: hybrid array vendors sold configurations that were only 5% flash, which led to frequent cache misses over time.
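To see the gap in concrete terms, the back-of-the-envelope sketch below applies those percentages to a hypothetical 100 TB primary tier (the capacity figure is an assumption chosen only for round numbers):

```python
# Back-of-the-envelope sizing: a flash tier sized to the daily 5% vs. the
# three-week working set. Capacity is a hypothetical 100 TB for round numbers.

total_capacity_tb = 100          # assumed primary storage capacity
daily_active_pct = 0.05          # ~5% of capacity touched on any given day
three_week_active_pct = 0.15     # ~15% of capacity touched over a three-week window

flash_tier_tb = total_capacity_tb * daily_active_pct        # how hybrids were typically sized
working_set_tb = total_capacity_tb * three_week_active_pct  # what actually stays warm

uncached_tb = working_set_tb - flash_tier_tb
print(f"Flash tier sized to the daily rate: {flash_tier_tb:.0f} TB")
print(f"Three-week working set: {working_set_tb:.0f} TB")
print(f"Warm data left outside flash: {uncached_tb:.0f} TB "
      f"({uncached_tb / working_set_tb:.0%} of the working set)")
```

On those assumptions, roughly two-thirds of the warm data sits outside a 5%-sized flash tier, which is exactly where the frequent cache misses come from.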
Once data gets four weeks beyond its original creation date, it typically becomes dormant and is rarely accessed again. The overwhelming majority of storage assessments show that at least 80% of an organization’s data isn’t accessed again after its first year.
A remaining 5% goes dormant but is later reaccessed, typically triggered by a legal e-Discovery request or an analytics query. When these subsequent accesses occur, the response time to the requestor needs to be quick; not as fast as an all-flash array, but fast enough not to impact the user experience. Processes like AI and data analytics, as well as increasing legal inquiries, are pushing the percentage of reaccessed data even higher.
The nuances of dormant data are what make traditional archiving solutions less appealing. Of the 85% of data that does go dormant, predicting which 5% will become active again is almost impossible. As a result, if the organization implements an archiving solution, it has to be prepared to recall any portion of that data and deliver it quickly to the requesting user or application.
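Put in rough numbers, here is what those shares imply for a hypothetical 1 PB data set (the capacity is an assumption used only to make the percentages concrete):

```python
# Rough breakdown of a hypothetical 1 PB data set using the assessment figures above.

total_pb = 1.0
active_share = 0.15        # still warm within the three-to-four-week window
never_again_share = 0.80   # dormant, not accessed again after the first year
reaccessed_share = 0.05    # dormant, but unpredictably recalled later

# Sanity check: the three shares account for all of the data.
assert abs(active_share + never_again_share + reaccessed_share - 1.0) < 1e-9

archive_eligible_pb = total_pb * (never_again_share + reaccessed_share)
recall_exposure_pb = total_pb * reaccessed_share

print(f"Archive-eligible (dormant) data: {archive_eligible_pb:.2f} PB")
print(f"Dormant data that will eventually be recalled: {recall_exposure_pb:.2f} PB")
# Because which 5% comes back cannot be predicted, the entire 0.85 PB archive tier
# has to be able to serve any request at user-acceptable latency.
```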
Why Hybrid and Archive Have Failed
Most hybrid array and archive vendors only talk to their prospective customers about the high-level math. An IT planner often hears “Only 5% of data is active on a daily basis, so you only need 5% of your capacity to be stored on flash media” or “85% of your data is inactive, so move it all to our archive.” It’s the deeper math that breaks hybrid systems and archive systems. Failing to account for the 15% of total data capacity that is active during a sliding three-week window, and for the 5% of dormant data that is reaccessed, causes these systems to fall short of expectations.
The answer is not to give up and go all-flash; rather, it is to fix the weaknesses in the strategy. First, a hybrid system needs either a much higher percentage of flash or flash as its secondary tier. A better hybrid system addresses the data modification drift seen in the first three to four weeks of a data segment’s life. It also puts less pressure on the archive tier, because only data with no activity over the last year needs to move to that tier.
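As a minimal illustration of what such a policy might look like, the sketch below classifies a data segment by the age of its last access. The tier names and the four-week and one-year thresholds are assumptions drawn from the access windows discussed above, not any particular product’s defaults:

```python
from datetime import datetime, timedelta

# Minimal tiering-policy sketch based on the access windows discussed above.
# Tier names and thresholds are illustrative assumptions, not a product's defaults.

FLASH_WINDOW = timedelta(weeks=4)        # the active three-to-four-week working set
ARCHIVE_THRESHOLD = timedelta(days=365)  # only data dormant for a year moves to archive

def place_segment(last_access: datetime, now: datetime) -> str:
    """Return the tier a data segment belongs on, given its last access time."""
    age = now - last_access
    if age <= FLASH_WINDOW:
        return "flash"          # active working set stays on the fast tier
    if age <= ARCHIVE_THRESHOLD:
        return "capacity"       # cooling data that is still too risky to archive
    return "object-archive"     # dormant for a year or more

# Example: a segment last touched 10 days ago stays on flash,
# one untouched for 400 days becomes an archive candidate.
now = datetime.now()
print(place_segment(now - timedelta(days=10), now))   # -> flash
print(place_segment(now - timedelta(days=400), now))  # -> object-archive
```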
In addition to making sure data is truly dormant before moving it to the archive tier, the IT planner should also leverage technology that supports fast access to that tier. The primary storage component of the archive tier should be an object storage system, paired with archiving software that not only moves data to the object store but also sets up transparent links back to primary storage.
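The sketch below shows one way such a transparent link could work in principle: the archiving software replaces a moved file with a small stub that records where the object now lives, so the original path still resolves and a recall can be serviced on demand. The stub format, bucket, and helper names here are hypothetical illustrations, not a specific vendor’s mechanism.

```python
import json
from pathlib import Path

# Sketch of the "transparent link" idea: after the archiving software has copied a
# file to the object store, it replaces the file on primary storage with a tiny stub
# recording where the full object now lives. The stub format, bucket, and helper
# names are hypothetical; they are not any specific product's mechanism.

def leave_stub(file_path: Path, bucket: str, object_key: str) -> None:
    """Replace an already-archived file with a stub pointing at its object-store copy."""
    stub = {
        "archived": True,
        "bucket": bucket,
        "object_key": object_key,
        "original_size": file_path.stat().st_size,
    }
    # The file keeps its original name and path, so users and applications still see
    # it; a recall agent can read the stub and fetch the object on demand.
    file_path.write_text(json.dumps(stub))

def locate_archived_copy(file_path: Path) -> dict:
    """Read a stub and return the object-store location to recall the data from."""
    return json.loads(file_path.read_text())
```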
In our next blog we’ll discuss how using deeper math allows a hybrid storage system to perform as well as an all-flash system, and how to choose between a hybrid system that mixes flash and HDD and one that uses multiple tiers of flash. In a later blog we’ll explain, again using deeper math, how an object storage system can not only meet all the response-time requirements the organization has but also position it to better protect data and maintain compliance. Finally, in the last blog of the series, we’ll discuss the ROI of these systems versus going with an all-flash array.