Driven by demand for more capacity, more performance, or both, every storage system eventually has to scale. Broadly, there are two approaches: scale-up and scale-out. Scale-up systems grow by adding capabilities to the initial system. When you reach the limits of that system, you need to purchase a new one and migrate the old data. Scale-out systems expand by adding nodes to the original purchase. Each node adds capacity and raises aggregate performance, with no data migration and no additional point of management. For these reasons, scale-out storage tends to be the design of choice for data centers with rapidly growing stores of unstructured data.
The Scale-out Storage Challenges
While scale-out storage addresses the ever-growing capacity demands of data centers, it still has challenges. First, there is the physical process of adding a node: a physical appliance needs to be racked and integrated into the existing cluster. The good news is that most scale-out NAS systems are fairly adept at finding new physical nodes. The new node, however, may be part of a closed system, where you, the customer, must buy hardware from the vendor that supplied the software. While acceptable for initial setup, over time this lack of flexibility may become a problem. The organization may be able to get a better price from another vendor or may simply prefer another vendor’s solution.
The second challenge is rebalancing data onto the new node. When a node is added to the cluster, the data protection feature of the scale-out system will want to use its capacity to improve protection. Doing so requires a lot of data copying, so much that it may impact performance. It is important that the scale-out storage software control how quickly the cluster absorbs new nodes, so that performance does not suffer.
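To make the trade-off concrete, here is a minimal Python sketch of throttled rebalancing. It is an illustration only, not how any particular product works: the function name rebalance_round, the per-round budget, and the node sizes are all hypothetical. Each round moves data from the fullest node toward the emptiest one, but never more than the budget allows, so the copy traffic is spread over time.

```python
def rebalance_round(used_gb, budget_gb):
    """Move up to budget_gb from the fullest nodes toward the emptiest ones.

    used_gb maps a (hypothetical) node name to the GB it stores and is
    updated in place. Returns the list of (source, target, gb) moves made.
    """
    target = sum(used_gb.values()) / len(used_gb)  # even share per node
    moves = []
    remaining = budget_gb
    while remaining > 1e-9:
        src = max(used_gb, key=used_gb.get)  # fullest node
        dst = min(used_gb, key=used_gb.get)  # emptiest node
        surplus = used_gb[src] - target
        deficit = target - used_gb[dst]
        amount = min(surplus, deficit, remaining)
        if amount <= 1e-9:
            break  # cluster is already balanced
        used_gb[src] -= amount
        used_gb[dst] += amount
        moves.append((src, dst, amount))
        remaining -= amount
    return moves


# Two full nodes plus one newly added, empty node (illustrative sizes).
cluster = {"node-a": 300.0, "node-b": 300.0, "node-new": 0.0}
rounds = 0
while rebalance_round(cluster, budget_gb=50.0):
    rounds += 1
print(rounds, cluster)
```

A smaller budget means more rounds before the new node carries its share, but less copy traffic competing with production I/O in any one round. A real system would also have to respect its data protection rules when choosing what to move.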
A third concern is how effectively the storage software manages the cluster itself. Scale-out storage systems may scale to thousands of nodes. The software needs to manage internode communication and metadata efficiently to avoid performance bottlenecks.
Finally, there is, or at least should be, a concern over power. Many of these systems will store their data for years, if not decades. The problem with the traditional scale-out storage system is that every physical node has to be powered on 100 percent of the time. For decades. With hard disk drive prices reaching near parity with tape (or at least coming close enough), data centers will select a disk-based system over a tape-based one. But tape’s key advantage is power consumption, or lack thereof. Few scale-out storage systems do anything to address that.
While a few scale-out storage systems support power-managed drives, they still have to power the nodes. IT professionals should look for scale-out storage systems that can power down both drives and nodes.
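A rough back-of-the-envelope calculation shows why this matters over a multi-year retention period. The Python sketch below is illustrative only: the node count, wattage per node, and powered-on fraction are made-up inputs, not measurements from any system, and the function name energy_kwh is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def energy_kwh(node_count, watts_per_node, years, powered_fraction=1.0):
    """Estimate energy use in kWh for a cluster over a retention period.

    powered_fraction is the share of time nodes are actually powered on
    (1.0 = always on). All inputs are assumptions supplied by the caller.
    """
    watt_hours = node_count * watts_per_node * powered_fraction * HOURS_PER_YEAR * years
    return watt_hours / 1000


# Hypothetical cluster: 100 nodes at 500 W each, data kept for 10 years.
always_on = energy_kwh(100, 500, 10)
mostly_idle = energy_kwh(100, 500, 10, powered_fraction=0.25)
print(always_on, mostly_idle)
```

Under these assumed inputs, powering nodes only a quarter of the time cuts the estimate to a quarter. Substitute your own node counts, measured wattage, and access patterns to see what node-level power management would be worth in your environment.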
As you can see, there is more to scale-out storage than just adding a node. The storage software, be it object storage or NAS storage, needs to effectively manage the cluster, provide hardware flexibility and provide some mechanism to manage power.
Caringo was founded in 2005 to change the economics of storage by designing software from the ground up to solve the issues associated with data protection, management, organization and search at massive scale. Caringo’s flagship product, Swarm, eliminates the need to migrate data into disparate solutions for long-term preservation, delivery and analysis—radically reducing total cost of ownership. Today, Caringo software is the foundation for simple, bulletproof, limitless storage solutions for the Department of Defense, the Brazilian Federal Court System, City of Austin, Telefónica, British Telecom, Ask.com, Johns Hopkins University and hundreds more worldwide. Visit www.caringo.com to learn more.