Driving down the cost of NAS is a top priority because these systems continue to require increasing amounts of capacity and need to meet new performance challenges. The increase in demand stems from organizations’ need to support more and more workloads. NAS systems are no longer a digital dumping ground; instead, they store mission-critical data and even host business-critical applications.
Driving down the cost of these systems is complex, and most vendors have no viable answer. To keep up with the new performance requirements, most vendors suggest that an all-flash NAS is the only answer, which significantly increases NAS costs. The other option is generic NAS solutions that leverage open-source file systems but can’t scale to meet the new requirements.
All-Flash NAS Can’t Drive Down NAS Costs
Despite what all-flash NAS vendors claim, they have not reached price parity with hard disk drives (HDDs). A solid-state drive (SSD) is typically ten times the cost of an HDD. For example, a 20TB hard disk drive costs about $400, whereas a 16TB flash drive costs almost $4,000. The mistake all-flash NAS vendors make is basing their calculations on a faulty assumption: that HDD price per TB will stay static while SSD pricing continues to decline. The reality is that HDD vendors are showing roadmaps where HDDs increase in density and decrease in price per TB continually over the next ten years, maintaining their price advantage over SSDs.
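As a quick sanity check on the math, here is a minimal sketch that works out the per-TB gap from the example prices above (illustrative street prices; actual pricing varies by vendor, model, and volume):

```python
# Price-per-TB comparison using the example prices above (illustrative only).
hdd_price_usd, hdd_capacity_tb = 400, 20      # 20TB HDD at roughly $400
ssd_price_usd, ssd_capacity_tb = 4_000, 16    # 16TB SSD at roughly $4,000

hdd_per_tb = hdd_price_usd / hdd_capacity_tb  # $20 per TB
ssd_per_tb = ssd_price_usd / ssd_capacity_tb  # $250 per TB

print(f"HDD: ${hdd_per_tb:.0f}/TB, SSD: ${ssd_per_tb:.0f}/TB, "
      f"gap: {ssd_per_tb / hdd_per_tb:.1f}x")
# -> HDD: $20/TB, SSD: $250/TB, gap: 12.5x
```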
Few Organizations Need an All-Flash NAS
NAS is not an ideal use case for an all-flash configuration. NAS systems tend to follow the 80/20 rule most closely: more than 80% of the data stored on a NAS system is not actively used by users or applications. We also find that as the capacity of the NAS increases, the percentage of active data continues to decline. Customers with 500TB of NAS storage often have less than 10TB of active data. Why store the 80% of your data that is cold, and which users may never access again, on premium-priced storage?
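Put into numbers, the 500TB example works out as follows (a quick illustration using the figures above, not a measurement):

```python
# Active-data fraction for the 500TB example above (illustrative figures).
total_tb, active_tb = 500, 10
print(f"{active_tb / total_tb:.0%} of the data is active")  # -> 2%
# Only that small slice benefits from premium flash; the rest can sit on HDD.
```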
Deduplication Can’t Drive Down NAS Costs
All-flash NAS vendors also claim that deduplication helps them drive down NAS costs, and they use it to support their claims of price parity with HDD. While there is some duplication in unstructured data sets, it is relatively rare. We frequently hear from all-flash customers that they are netting less than 3:1 effective capacity (far less than the vendors’ 5:1 claims), which means HDDs without deduplication are significantly less expensive. Deduplication, while not delivering price parity with HDD, does hurt performance, forcing all-flash NAS vendors to use more powerful, expensive CPUs in their controllers and more drives in their enclosures, which raises overall costs.
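To see why the ratios matter, here is a minimal sketch of cost per usable TB under different data-reduction ratios, reusing the example drive prices from earlier (illustrative only):

```python
# Cost per usable TB under a data-reduction ratio (illustrative figures only).
def cost_per_usable_tb(raw_cost_per_tb: float, reduction_ratio: float) -> float:
    """Raw $/TB divided by the effective data-reduction ratio."""
    return raw_cost_per_tb / reduction_ratio

ssd_raw = 250.0  # $/TB from the 16TB / $4,000 example
hdd_raw = 20.0   # $/TB from the 20TB / $400 example

print(cost_per_usable_tb(ssd_raw, 5.0))  # vendor-claimed 5:1 -> $50/TB
print(cost_per_usable_tb(ssd_raw, 3.0))  # observed ~3:1      -> ~$83/TB
print(cost_per_usable_tb(hdd_raw, 1.0))  # HDD, no dedup      -> $20/TB
```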
Cheap NAS Can’t Drive Down NAS Costs
An alternative to all-flash NAS is a category of NAS systems typically referred to as “cheap NAS.” These vendors build their systems on open-source file systems like ZFS and BFS. Because they invest little in software development, they can sell the combined solution at a lower cost than more traditional vendors. These low-cost NAS vendors also support hard disk drives, but because they build on a decades-old legacy code base they can’t optimize, they deliver inconsistent performance and a long recovery process after a drive failure.
Cheap NAS systems have limited scaling capabilities, often capping at 500TB or less. Customers frequently need to buy several NAS systems to meet their performance and capacity requirements. Some customers end up with a half-dozen of these systems, increasing both hardware acquisition costs and the staffing costs to manage them.
Learn more: Register for our webinar, “How To Build a Better NAS”
Fix The Storage Engine to Drive Down NAS Costs
The first area of optimization is below the file system. All NAS systems start life as block storage; that block layer is the engine that drives the file system. If the core storage I/O software is not optimized to use hardware efficiently, there is little the file system can do to compensate. A reimagined storage I/O code base yields a new storage engine that can extract the full performance potential of the underlying storage hardware, delivering better performance from fewer SSDs.
Furthermore, if the storage vendor now owns the core storage I/O path, it makes sense for the vendor to integrate data services at this level so they operate in sync with incoming I/O. IT can now apply features like snapshots and replication to extreme levels without impacting performance. This rethinking also extends to drive-failure protection technologies like RAID, delivering a more rapid return to a fully protected state after a drive failure. Since the time to recover from a drive failure is a top hesitation point for IT professionals contemplating HDDs, reducing that time from days to hours alleviates that concern.
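For a sense of scale, here is a back-of-the-envelope rebuild-time estimate; the throughput figures are assumptions for illustration, not measurements of any particular product:

```python
# Rough rebuild-time estimate from drive capacity and sustained rebuild rate.
# The MB/s figures below are assumptions for illustration, not vendor specs.
def rebuild_hours(drive_capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours to reconstruct one drive's worth of data at a sustained rate."""
    total_mb = drive_capacity_tb * 1_000_000  # TB -> MB (decimal units)
    return total_mb / rebuild_mb_per_s / 3600

# A 20TB HDD rebuilt at ~50 MB/s (legacy RAID under production load) takes days;
# a parallelized rebuild sustaining ~500 MB/s finishes the same drive in hours.
print(f"{rebuild_hours(20, 50):.0f} hours")   # ~111 hours, i.e. several days
print(f"{rebuild_hours(20, 500):.0f} hours")  # ~11 hours
```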
Finally, with control over the entire I/O path, the vendor can also optimize how data moves between the SSD tier and the HDD tier, eliminating performance inconsistencies. Legacy hybrid systems that attempt to utilize both SSDs and HDDs have a fatal flaw: they wait until the flash tier is full before moving data to the HDD tier. The next I/O burst then has to wait while data is moved to the HDD tier to make space for it on the SSD tier. A new storage engine needs a new auto-tiering algorithm that monitors the resource utilization of the storage infrastructure and, during “less busy times,” automatically moves data from the SSD tier to the HDD tier.
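A minimal sketch of the idea follows; the `storage` object and its methods are hypothetical placeholders, and the thresholds are assumptions, not any vendor’s actual algorithm:

```python
# Utilization-aware tiering loop (illustrative sketch; the storage object and
# its methods are hypothetical, and the thresholds are arbitrary assumptions).
import time

BUSY_THRESHOLD = 0.30      # assumed: demote only when controller utilization < 30%
SSD_HIGH_WATERMARK = 0.70  # assumed: keep the SSD tier below 70% full

def tiering_pass(storage) -> None:
    if storage.controller_utilization() >= BUSY_THRESHOLD:
        return  # the system is busy servicing I/O, so defer demotion
    while storage.ssd_fill_ratio() > SSD_HIGH_WATERMARK:
        extent = storage.coldest_ssd_extent()  # least recently accessed data
        storage.demote_to_hdd(extent)          # move it down while things are quiet

def run(storage, interval_s: int = 60) -> None:
    while True:          # check periodically so the SSD tier never fills and blocks bursts
        tiering_pass(storage)
        time.sleep(interval_s)
```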
Driving Down NAS Costs Requires Eliminating Migrations
One of the most expensive aspects of NAS systems is the painful migration process when the current NAS runs out of capacity or can’t keep up with performance demands. Moving from an old NAS to a new one is so complex that most organizations simply buy an additional NAS instead, which certainly does not help IT professionals interested in driving down the cost of NAS.
Fixing the storage engine and refactoring drive-failure protection help eliminate the need for NAS migration. With these two capabilities, customers can add new high-performance flash or high-capacity HDDs without replacing their NAS, adding another node, or even creating a new volume. The new drives are simply added to the existing media pools, and data automatically expands onto them, while customers enjoy their full capacity and performance. If a new class of drive is needed, for example to address a new performance requirement, customers create a new pool, and the solution automatically re-tiers data as appropriate.
One NAS, Not a Dozen
One theoretical workaround for NAS migrations is buying more NAS systems. This approach is prevalent with commodity NAS systems since they typically don’t scale beyond a few hundred terabytes before experiencing performance problems. The customer buys a NAS for each use case or at a set capacity point to avoid migration. Acquisition costs increase because the customer has to buy more than just additional drives every few hundred terabytes. They have to buy an additional NAS controller pair and additional network connections. Operational costs increase because they have to rebalance workloads and reposition data manually. They also need to update user profiles and reconfigure backup jobs and schedules to handle the new data sources.
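As a simple illustration of how a per-system ceiling multiplies those costs, consider the following sketch; the 3PB requirement is an assumed figure, and the 500TB ceiling is the one cited above for commodity NAS:

```python
# How a per-system capacity ceiling multiplies controller purchases
# (illustrative; the 3PB requirement is an assumed figure).
import math

total_capacity_tb = 3_000   # assumed organization-wide requirement (3PB)
per_system_cap_tb = 500     # typical commodity-NAS ceiling cited above

systems_needed = math.ceil(total_capacity_tb / per_system_cap_tb)
print(systems_needed)       # -> 6 controller pairs, network drops, and
                            #    management points instead of one system
```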
A better NAS approach is to have a NAS system that can scale capacity and performance independently without having to scale to half a dozen or more nodes in a scale-out cluster. A new efficient storage engine should drive a NAS to scale to double-digit petabytes of capacity and millions of IOPS in performance.
Balance Performance and Capacity
The most effective way to drive down the cost of NAS is for a modern NAS solution to efficiently utilize all types of media: storage-class memory, SSDs, and HDDs. All data on a NAS is not created equal, and vendors should not force IT to place all of it on premium-priced flash storage. A modern NAS solution, using an efficient auto-tiering algorithm, can deliver consistent performance across all types of workloads at the same time. With this advanced auto-tiering, a modern NAS solution can leverage HDDs for the 80% of data that is static and SSDs for the 20% that is operational, reducing upfront costs by a factor of five or six.
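Here is a minimal sketch of the blended-cost math, reusing the example drive prices from earlier; the actual savings multiple depends on how small the active set really is, which, as noted above, shrinks as total capacity grows:

```python
# Blended media cost for a hybrid tiered NAS vs. all-flash (illustrative only,
# using the earlier example prices of $20/TB for HDD and $250/TB for SSD).
HDD_PER_TB, SSD_PER_TB = 20.0, 250.0

def blended_cost_per_tb(ssd_fraction: float) -> float:
    """$/TB when only ssd_fraction of the data lives on flash."""
    return ssd_fraction * SSD_PER_TB + (1 - ssd_fraction) * HDD_PER_TB

for ssd_fraction in (0.20, 0.10, 0.02):  # 80/20 split, then smaller active sets
    cost = blended_cost_per_tb(ssd_fraction)
    print(f"{ssd_fraction:.0%} on SSD: ${cost:.0f}/TB, "
          f"{SSD_PER_TB / cost:.1f}x cheaper than all-flash")
# 20% on SSD -> $66/TB (3.8x); 10% -> $43/TB (5.8x); 2% -> $25/TB (10.2x)
```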
Conclusion
Most vendors try to drive down the cost of NAS with overstated claims, like flash reaching price parity with HDD, or with unrealistic deduplication ratios. Alternatively, some vendors pair open-source file systems with off-brand server hardware and deliver a sub-standard quality of service. A better approach is to do the hard work and rethink how I/O moves from a user or application, through the file system, and onto storage media. Combine this new engine with a more intelligent method of moving data between tiers, one that lowers storage controller resource utilization while delivering the maximum capabilities of modern storage media, and actual cost reduction becomes possible.
To learn more, register for our live webinar, “How To Build a Better NAS”