In theory, scale out storage appeals because the data center can start small and add capacity and performance as needed. Do these theoretical advantages apply to the use cases in which All-Flash storage is most commonly deployed: databases and virtualization? In this article, we will analyze which scaling approach is best for flash storage.
Scale out storage has become one of those check box terms in the storage industry. After all, who wouldn’t want a storage system that can scale to infinity? The problem is that scale out storage systems are more expensive to build, implement and maintain. There are many use cases for scale out storage; it is best suited to situations where meeting a high capacity demand takes precedence over a performance demand. Current scale out storage architectures, however, may not be right for performance centric environments like All-Flash.
Defining Scale Up vs. Scale Out
In scale up architectures, all the performance and capacity potential of the storage system is provided in a single controller unit, typically upfront. Current scale out architectures provide performance and capacity as storage nodes (servers with internal capacity) are added to the infrastructure. Each architecture has its ideal use case depending on performance and capacity demands. As stated above, the appeal of a scale out storage system is that performance and capacity can be added incrementally as needed.
1. Starting Small?
One of the theoretical advantages of a scale out storage system is that IT can start small and then add storage capacity and performance at the same time. In reality, this is seldom the case. Scale out storage systems count on the cluster of nodes for availability as well as capacity and performance. This typically means either an initial purchase of at least three nodes or a vendor that builds high availability into each node. Either approach raises the initial entry cost of the solution.
The problem is that All-Flash systems are most commonly implemented in database and virtualization environments. While there are capacity needs in these environments, they are typically not extreme, unlike archive and backup, where scale out systems tend to make more sense. It is also important to remember that many All-Flash systems come with some form of data efficiency, such as deduplication or compression. This means less physical storage capacity actually needs to be purchased for these environments.
The result is that in scale out storage systems, the initial nodes required to form a quorum may far exceed the capacity needs of the environment in which they are being placed. This means wasted capacity, which, given the premium price of flash storage, is particularly troublesome. It also means that the cost of the initial implementation may be similar to the cost of a scale up storage system.
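The quorum arithmetic can be sketched in a few lines. This is only an illustration: the node size, the data reduction ratio, the three-node minimum and the capacity requirement are all hypothetical values, not figures for any particular product.

```python
import math

def stranded_capacity_tb(logical_need_tb, reduction_ratio, node_raw_tb, min_nodes):
    """Return (physical TB needed, TB purchased, TB left unused)."""
    # Deduplication/compression shrinks the physical footprint (assumed ratio).
    physical_need_tb = logical_need_tb / reduction_ratio
    # Nodes needed for capacity alone, but never fewer than the quorum minimum.
    nodes = max(min_nodes, math.ceil(physical_need_tb / node_raw_tb))
    purchased_tb = nodes * node_raw_tb
    return physical_need_tb, purchased_tb, purchased_tb - physical_need_tb

# Hypothetical environment: 20 TB of logical data, 4:1 data reduction,
# 10 TB of flash per node, three nodes required to form a quorum.
need, bought, unused = stranded_capacity_tb(
    logical_need_tb=20, reduction_ratio=4, node_raw_tb=10, min_nodes=3)
print(f"needed {need:.0f} TB, purchased {bought:.0f} TB, unused {unused:.0f} TB")
```

With these assumed numbers the environment needs 5 TB of physical flash but the minimum cluster delivers 30 TB, leaving 25 TB idle on day one.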
By comparison, a scale up All-Flash Storage system is designed to provide full high availability and performance in a single product. Capacity can start as small as is actually needed by the environment and be added to the storage system without having to buy and connect additional nodes.
2. Do You Want To Scale Performance and Capacity?
Performance and capacity operate on different vectors and are not necessarily linked together. Most environments that can take advantage of All-Flash storage will typically run out of performance long before they run out of capacity. In a scale out architecture, this means additional nodes will need to be purchased with capacity (flash capacity) in order to scale performance. Once again this more than likely wastes capacity.
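A rough sketch of this coupling, assuming each node contributes a fixed amount of both performance and capacity. The per-node IOPS and capacity figures and the workload targets below are hypothetical and exist only to show the shape of the problem.

```python
import math

def nodes_required(target_iops, target_tb, node_iops, node_tb):
    """Nodes a scale out cluster needs to satisfy both targets.

    Returns (nodes to buy, nodes needed for performance, nodes needed for capacity).
    """
    for_performance = math.ceil(target_iops / node_iops)
    for_capacity = math.ceil(target_tb / node_tb)
    # The larger of the two requirements dictates the purchase.
    return max(for_performance, for_capacity), for_performance, for_capacity

# Hypothetical workload: 500,000 IOPS and 15 TB, on nodes that each
# deliver 100,000 IOPS and 10 TB of flash.
nodes, perf_nodes, cap_nodes = nodes_required(
    target_iops=500_000, target_tb=15, node_iops=100_000, node_tb=10)
surplus_tb = nodes * 10 - 15
print(f"{nodes} nodes (performance needs {perf_nodes}, capacity needs {cap_nodes}); "
      f"{surplus_tb} TB of flash bought only to gain performance")
```

Under these assumptions performance dictates five nodes while capacity needs only two, so 35 TB of flash is purchased purely as a side effect of scaling performance.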
In a scale up architecture, all the performance is delivered with the unit upfront, while capacity is added to the system as needed. While performance can’t necessarily be scaled, it is delivered in its entirety up front and is essentially a fixed cost with no surprises.
Another side effect of scale out storage is that the nodes typically need to be homogeneous. Each node needs to have a similar processor chipset and must use SSDs of exactly the same size. A scale up system could intermix SSDs of different sizes and even different types as new flash technology becomes available.
3. Is Scale Up Performance Really An Issue?
While the lack of performance scaling in scale up systems is often cited by scale out advocates, the reality is that the overwhelming majority of applications can’t push current scale up flash-based systems to their limits. Additionally, some scale up systems allow a periodic controller unit upgrade. So as processing technology continues to advance, the head can be upgraded to offer more performance to the existing storage shelves. As a result, there actually is some performance scaling capability in scale up systems.
Finally, some scale up vendors have the ability to add a scale out design to their architecture if the need ever becomes relevant. It is hard to imagine that processing technology would fall behind storage I/O performance, but if it were to happen, this is the ideal way to scale: scale up completely first, then start scaling out if performance demands exceed the capabilities of the current processors.
4. Is Linear Performance A Reality?
Current scale out storage systems are, in effect, very sophisticated clustering applications. In fact, they can be just as complex to design as a clustered database application. While some of this complexity can be hidden from the storage administrator, there is a limit to how much can be hidden. Scale out systems also introduce a performance complication: internode communication.
The nodes within a scale out storage system need to stay in sync with each other to make sure the right nodes have the right data and the right nodes are accessing the right data. This is called internode communication and it typically requires a dedicated backend network. This communication, and the requirement of a network communications protocol, introduces latency. In hard disk based scale out architectures, this latency is not typically noticeable. In a scale out All-Flash storage system that does not incur HDD latency, network communication latency may very well be noticeable. As a result, the concept of linear performance growth may not play out when it comes time to scale.
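Why the same network hop matters so much more with flash can be shown with a back-of-the-envelope calculation. The latency values below are rough, assumed orders of magnitude chosen for illustration, not measurements of any system.

```python
def network_share(media_latency_us, network_latency_us):
    """Fraction of total I/O latency attributable to the internode hop."""
    return network_latency_us / (media_latency_us + network_latency_us)

# Assumed, illustrative latencies in microseconds.
NETWORK_HOP_US = 50   # internode round trip on the backend network
HDD_US = 5_000        # hard disk access
FLASH_US = 100        # flash access

for name, media in (("HDD", HDD_US), ("flash", FLASH_US)):
    share = network_share(media, NETWORK_HOP_US)
    print(f"{name}: network hop is {share:.0%} of total latency")
```

With these assumptions the hop accounts for about 1% of total latency behind a hard disk but about a third of it behind flash, which is why overhead that was invisible in disk-based clusters can become noticeable in All-Flash ones.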
Even if internode latency could, in theory, be hidden effectively enough for scale out storage to deliver its promised performance advantage, doing so would incur a potentially significant cost disadvantage. Hiding this latency would require a high-speed backend network, such as InfiniBand adapters and switches, something most mainstream data centers have no experience with and which only increases management complexity. It would also require more powerful processors, raising the cost per node to the point that it could exceed the cost of a scale up storage controller.
5. Is Scale Out Cheaper?
In storage there are two hard costs to be concerned with. The first is the initial purchase cost. In theory, this should favor a scale out storage system since it can start small. But again, current scale out designs need to have an initial cluster created or they need to deliver high availability in each node. Counting on the cluster for HA requires the purchase of potentially more performance and capacity than the customer needs because more nodes are needed initially. Building HA into each node requires added expense per node, probably equivalent to the scale up storage system.
A case could be made that a storage node could be delivered less expensively than a scale up controller unit. This would require choosing the first option: nodes delivered with no built-in HA that rely on a quorum to provide it. But buying multiple nodes eliminates that price advantage, and it leads to node sprawl because nodes have to be added to address performance issues, not capacity issues.
At a minimum, the initial cost difference between the scale up and scale out implementation types may be a wash. When implementation time, or time to data, is factored into that equation, scale up systems have a clear advantage. It simply takes longer to install more pieces and get those pieces working together.
The second cost, incremental cost, is an area where scale out storage should have an advantage. But again the limits of current scale out designs tell a different story. The only way a scale out All-Flash system would have a cost advantage is if the need for expansion is being driven by performance instead of capacity. But as mentioned earlier, the overwhelming majority of flash vendors and customers report that they can’t exceed the performance of a single box. So any scenario that would justify a scale out deployment will probably not happen in most data centers.
6. Is Scale Out Really Simpler?
Another theoretical advantage to scale out is how simple it is to expand. “Like adding Lego blocks” is the common analogy. But current scale out systems don’t actually “snap” together. They are a series of individual servers with clustering software that must be carefully networked together for maximum performance and availability. This combination makes initial implementation more complex and it makes ongoing upgrades something that needs to be carefully planned.
Scale up architectures are actually relatively simple. All the capabilities, at least from a performance perspective, are delivered upfront. There is nothing to “click” in. Capacity can be added incrementally either by inserting drives into the existing shelf or adding shelves to the existing storage controller. While adding shelves also requires planning, the capacity per shelf is high and as long as the scale up All-Flash array can do non-disruptive upgrades, no down time should result.
Scale out storage is one of those technologies that looks great on a whiteboard but, at least in the All-Flash instance, does not play out well in the reality of the data center. All-Flash is essentially too fast for current implementations of the scale out architecture. Most customers don’t yet need the performance that scale out systems provide. Furthermore, achieving that performance requires a significant investment in the system’s infrastructure, which for many puts its cost out of reach as well.
Scale up storage, while having the disadvantage of buying all the performance capabilities up front, has the dual advantage of more incremental capacity expansion and a less complex backend infrastructure. And data-in-place storage controller upgrades can largely eliminate the lack of performance scalability.
Pure Storage is a client of Storage Switzerland