For most IT shops, flash storage has become an established component of the storage architecture. The conversation is less about whether the performance acceleration that flash technologies provide is needed, and more about which workloads require this “premium tier.” This is especially true with the advent of faster-performing and higher-priced NVMe solutions.
A number of factors play into the total cost of ownership (TCO) equation that storage buyers should consider as they make purchase decisions. Because data volumes are growing so rapidly, a key factor to evaluate is the impact that applying deduplication and compression for capacity efficiency has on the storage system’s performance.
Deduplication eliminates redundant segments within the data stored on an array, and compression reduces the size of the unique data that remains. Both free up available storage capacity, which is only increasing in value to the business in today’s data-driven economy.
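As a rough illustration only (not how any particular array implements it), a minimal sketch of block-level deduplication followed by compression might look like the following. The fixed 4 KB block size, SHA-256 fingerprints, and zlib compression are all assumptions made for the example:

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


def store(data: bytes):
    """Toy block store: deduplicate identical blocks, then compress the unique copies."""
    unique_blocks = {}   # fingerprint -> compressed block (one physical copy per fingerprint)
    block_map = []       # logical layout: ordered list of fingerprints
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()
        if fp not in unique_blocks:                    # deduplication: skip blocks already stored
            unique_blocks[fp] = zlib.compress(block)   # compression: shrink the unique copy
        block_map.append(fp)
    physical = sum(len(b) for b in unique_blocks.values())
    return block_map, unique_blocks, physical


# Highly redundant data benefits from both steps
data = b"ABCD" * 100_000
_, _, physical = store(data)
print(f"logical {len(data)} bytes -> physical {physical} bytes")
```

The hashing and compression work in this sketch is exactly the kind of extra CPU and memory load the next paragraph describes.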
At the same time, these data efficiency functions take valuable processing power and memory away from the other functions the storage software needs to perform (IO management, volume management, RAID, snapshots, replication, and clones). This is especially a concern for modern, performance- and data-intensive workloads such as artificial intelligence (AI), machine learning, and high-velocity analytics, which can push storage IO to its limits and are typical candidates for deployment on NVMe.
StorageSwiss Take
The new levels of performance required by this future-forward workload set, and enabled by NVMe, create new concerns and skepticism around when to use deduplication and compression. Most of the time, the improved capacity utilization that data reduction delivers outweighs the performance overhead, because the system will still perform faster than most traditional applications require. Users are unlikely to notice any impact on performance.
There are workloads, though, that can push the IO capabilities of a storage system to their limits, and in those cases it makes sense to turn deduplication and compression off selectively on a per-volume basis. For instance, very little space savings are achieved by applying deduplication to a database. Additionally, much of the Internet of Things (IoT) data that is powering AI, machine learning and analytics workloads is 100% unique and thus would not benefit from deduplication. However, it can often still be compressed, so the ability to turn off deduplication while leaving compression on makes sense for these workloads, as the sketch below illustrates.
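To make the IoT point concrete, here is a small, hypothetical experiment: telemetry records whose values are all unique, so block-level deduplication finds essentially nothing to eliminate, yet the repeated JSON field names still compress well. The record format and 4 KB block size are assumptions made purely for illustration:

```python
import hashlib
import json
import random
import zlib

# Hypothetical IoT telemetry: every record carries unique values,
# so no two blocks are identical, but the repeated field names compress well.
records = [
    json.dumps({"sensor_id": i, "temp_c": random.uniform(-20, 60), "ts": 1700000000 + i})
    for i in range(10_000)
]
data = "\n".join(records).encode()

BLOCK_SIZE = 4096
blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
unique = {hashlib.sha256(b).hexdigest() for b in blocks}

print(f"blocks: {len(blocks)}, unique after dedup: {len(unique)}")              # ~no dedup savings
print(f"raw: {len(data)} bytes, compressed: {len(zlib.compress(data))} bytes")  # real compression savings
```

Running this shows every block surviving deduplication while compression still shrinks the data substantially, which is why per-volume control over each feature matters.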
The key for the IT planner is flexibility. Storage buyers should look for storage systems with the ability to turn features on and off depending on the use case. For further discussion around how to integrate NVMe into your storage roadmap most effectively, access Storage Switzerland’s webinar with Western Digital, Myth Busting – The Four NVMe Myths, on demand.