Organizations ask the present-day NetApp Filer to do many things: host home directories, store machine-generated data, host virtual machine images and even run databases. What was once a single-purpose appliance focused on file server replacement is now a storage infrastructure in its own right. Today’s NetApp Filers attempt, as many enterprise storage systems do, to be the single storage solution for the entire data center. The problem is that the use cases are too varied, and today even using NetApp Filers for their intended task, storing file-based data, may be too much to ask. NetApp clearly has a role to play in the data center, but it might be time to offer its systems a little help.
How’d We Get Here?
In the late 1990s many data centers faced the challenge of file server sprawl. That sprawl was a response to the rise of the knowledge worker: employees who used Unix- and Windows-based computers to create data. These file servers ran general-purpose operating systems designed to perform a wide variety of functions beyond serving files. In those days, NetApp’s value proposition was simple: replace several expensive file servers with one dedicated appliance designed to serve files. That focus allowed NetApp to support more users and more data on less expensive hardware.
In the early 2000s the cost of compute power declined rapidly and core operating systems became more stable. As a result, standard hardware could handle file serving well and cost-effectively. NetApp, seeing the writing on the wall, expanded the Filer’s usefulness beyond file serving and focused on hosting databases and virtual machines. The company also added iSCSI support and then Fibre Channel support. The NetApp Filer became Fabric-Attached Storage (FAS), a mainstream storage system that could support all of the data center’s storage needs. It was NetApp’s golden era.
The End of an Era
Fast forward to 2016. Times have changed. Virtualization now dominates the data center, and machines (devices, cameras, sensors), not users, create most of the unstructured data. Flash or hybrid arrays now dominate the storage landscape, and most of the features that made NetApp unique (snapshots, data protection, replication) are now available for free in the hypervisor. But it is in unstructured data, NetApp’s roots, that the company is most vulnerable.
While knowledge workers are creating more data than ever, their efforts are easily outpaced by machines. In a recent survey, Storage Switzerland found that machine-generated data (data from sensors, devices and applications) accounted for more than 25% of an organization’s total data set. Like user-generated data, this data is very active when first created and then often goes dormant. Unlike user data, however, it is more likely to be accessed again in the future.
The likelihood of future access creates a problem for many IT professionals who use their NetApp systems as an archive by simply allowing data to remain on them indefinitely. As the system needs more capacity, it is expanded by adding shelves of storage, by adding another Filer, or both. Using any primary storage as a pseudo-archive was already a bad practice; continuing it in the era of machine-generated data is worse. Beyond the cost of storing inactive data on primary storage, these systems were not designed to store data indefinitely: they have neither the long-term scaling capabilities nor the ability to audit data to confirm that it is still readable.
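To make the idea of auditing data for readability concrete, here is a minimal sketch of what such a check might look like. It is an illustration only, not how any particular storage product implements it: the function names (file_digest, build_manifest, audit) are hypothetical, and the approach assumed here is simply to record a SHA-256 digest for every file once, then re-read the files later and flag anything that is missing, unreadable or changed.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its current digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def audit(root, manifest):
    """Return the relative paths that are missing, unreadable or changed."""
    failures = []
    for rel, expected in sorted(manifest.items()):
        try:
            actual = file_digest(os.path.join(root, rel))
        except OSError:
            failures.append(rel)  # missing or unreadable
            continue
        if actual != expected:
            failures.append(rel)  # content no longer matches the record
    return failures
```

In this sketch the manifest would be built when data is first written and audit() would run on a schedule. An empty result means every recorded file could still be read back intact.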
Thin Provision Your Filer
Data centers should refocus their NetApp investment on what it is best suited for, given its cost. Ideally these systems should store virtual machine images, database data and only the most active unstructured data, where high-performance response times matter. All other data, likely 80 percent of an organization’s capacity, should be stored on something else, such as an object storage system.
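One way to find out how much of a given file share is inactive is to scan it and total the bytes in files that have not changed for some time. The sketch below is a rough illustration under stated assumptions: the function name summarize_cold_data and the 180-day default are hypothetical choices, and modification time stands in for "activity" because access times are often disabled or unreliable on file servers.

```python
import os
import stat
import time


def summarize_cold_data(root, idle_days=180, now=None):
    """Return (total_bytes, cold_bytes) for regular files under root.

    A file counts as cold when its last-modified time is more than
    idle_days in the past. The threshold is an assumption to tune.
    """
    now = time.time() if now is None else now
    cutoff = now - idle_days * 86400
    total_bytes = 0
    cold_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(info.st_mode):
                continue  # ignore symlinks and other special files
            total_bytes += info.st_size
            if info.st_mtime < cutoff:
                cold_bytes += info.st_size
    return total_bytes, cold_bytes
```

Run against a mounted share, for example summarize_cold_data("/mnt/filer_share") with a hypothetical mount point, the ratio of cold_bytes to total_bytes gives a first estimate of how much capacity is a candidate for moving off primary storage.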
Sponsored by Caringo
Caringo was founded in 2005 to change the economics of storage by designing software from the ground up to solve the issues associated with data protection, management, organization and search at massive scale. Caringo’s flagship product, Swarm, eliminates the need to migrate data into disparate solutions for long-term preservation, delivery and analysis—radically reducing total cost of ownership. Today, Caringo software is the foundation for simple, bulletproof, limitless storage solutions for the Department of Defense, the Brazilian Federal Court System, City of Austin, Telefónica, British Telecom, Ask.com, Johns Hopkins University and hundreds more worldwide. Visit www.caringo.com to learn more.