The Problems With Server-Side Storage, Like VSAN

Posted on March 31, 2014 by Colm Keegan

The expression, “everything old is new again”, certainly applies to the renewed interest in server-side storage. Due to the widespread adoption of server virtualization technology, internal server storage is once again being hailed as a simple way to bring performance closer to where virtualized applications reside – on the hypervisor.

VMware’s recent launch of VSAN, which pools server-side storage capacity across a network of clustered hypervisor nodes, seems to add further validation to this approach. While there are some instances where server-side storage adds value, there are also some drawbacks that need to be considered.

Server-Side Tiered Storage

The appeal of PCIe flash and drive form factor SSD offerings is that they can easily be coupled with conventional hard disk drives to build a multi-tiered storage configuration directly inside the server chassis. This allows system planners to tailor performance according to the varied application needs of their physical and virtual server platforms. For example, flash and SSD can be configured for performance-demanding applications, like transactional databases, while HDD may be allocated for user files and home directories.

Less Scaling Flexibility

But the biggest drawback of deploying a server-side storage architecture is its inherent inability to scale computational and storage resources independently. The need to scale the two independently is why SANs appeared in the first place. As environments grow, compute and storage resources rarely need to be added at the same time.

Some servers may have excess compute but a lack of storage capacity, while other servers may have a lack of compute and an excess of storage resources. In either scenario, an additional server with discrete storage resources has to be purchased, racked and configured to scale the application environment. In fact, administrators will typically find that storage requirements scale at a much higher rate than computing; if you’re locked into a scenario where they scale linearly, you are almost certain to be out of step in one or more variables.

According to various industry sources, over the past decade, computational workloads have increased by 16% year over year. During this same timeframe, the annual growth rate of storage has been close to 50% or about 3X the rate of CPU growth. A server-side storage deployment that utilizes a 3-way replica (N+2 protection) could end up growing the server infrastructure 9X faster than needed.
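
The arithmetic behind the 9X figure can be reproduced with a short sketch. It uses only the growth rates and replica count quoted above; the function names and the fixed-capacity-per-node assumption are illustrative and hypothetical, and denser drives would change the picture.

```python
import math

COMPUTE_GROWTH = 0.16   # annual compute growth cited in the article
STORAGE_GROWTH = 0.50   # annual storage growth cited in the article
REPLICAS = 3            # three copies of all data, as the article assumes


def growth_multiplier(compute_growth, storage_growth, replicas):
    """Return (storage-to-compute growth ratio, ratio scaled by replica count)."""
    ratio = storage_growth / compute_growth
    return ratio, ratio * replicas


def nodes_needed(years, start_nodes, compute_growth, storage_growth):
    """Nodes needed for compute alone vs. for storage alone after `years`.

    Assumes (hypothetically) that per-node CPU and per-node capacity stay
    fixed, so node count simply tracks compound demand growth.
    """
    for_compute = math.ceil(start_nodes * (1 + compute_growth) ** years)
    for_storage = math.ceil(start_nodes * (1 + storage_growth) ** years)
    return for_compute, for_storage


ratio, multiplier = growth_multiplier(COMPUTE_GROWTH, STORAGE_GROWTH, REPLICAS)
print(f"storage grows {ratio:.1f}x as fast as compute")
print(f"with {REPLICAS} replicas the multiplier is {multiplier:.1f}x")
print(nodes_needed(5, 10, COMPUTE_GROWTH, STORAGE_GROWTH))
```

The ratio comes to roughly 3.1, which the article rounds to 3X, and multiplying by three replicas gives the 9X figure. The second function shows how far the two node counts drift apart when capacity per node cannot change.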

Considering that one of the greatest drivers of virtualization has been server consolidation, returning to the days of rapid server proliferation seems like a step backwards.

Taking A Step Back?

Reducing storage costs is a worthy goal. In fact, storage has become the most expensive element in the data center. The answer to what has become known as the I/O blender – caused by the architectural mismatch between the virtualized data center and traditional storage – has been to throw more storage, and more expensive storage, at the problem. But if you combine this with growing servers at the same time, the result is a very expensive proposition. So while driving better storage performance at lower cost is worthwhile, it should not come at the cost of growing servers.

The manner by which storage capacity scales out in a server-side storage environment runs contrary to how virtualized infrastructure is designed to work. Virtual servers are a software abstraction of hardware resources. This abstraction enables administrators to quickly and seamlessly provision virtual machines (VMs) on-demand without cracking open a server. Only occasionally do new servers have to be deployed to meet demands for new application computing power, and even then, deployment doesn’t disrupt existing applications.

The server-side storage scaling model, on the other hand, is too reliant on a physical “push” of hardware – whether it is adding storage devices inside a server chassis or dropping an entire server node on to the data center floor to increase the available storage pool. This is not a sustainable architectural model for either private data centers or service providers that need a “low touch” environment to help drive down operational costs.

Disruptive Upgrades

One obvious challenge with this approach, depending on whether your server can hot-add drives, is that a capacity or performance upgrade requires scheduling a maintenance window to populate a given server with the required storage resource. VMs would also have to be VMotioned to another hypervisor prior to the upgrade and then VMotioned back afterwards. This adds a fair amount of complexity, management overhead and risk to what should otherwise be a simple storage upgrade.

In the VSAN architecture specifically, upgrades such as those described above could force a data protection rebuild. This is because VSAN typically requires that three copies of all data exist. When the server is brought down for the upgrade, VSAN must scramble to re-create the third copy of data that was on that server.

The Cost of Scaling Servers

While the promise of VSAN or server-side storage is to reduce cost by using internal storage, the hardware saving may be just the tip of the iceberg. When you start to grow servers at 9X the rate needed, you need to add the cost of the license stack on each of these physical hosts. Ultimately, the efficiencies enterprises have gained through consolidation of physical servers will be wiped out if servers are forced to grow at 9X their normal growth rate.

Network-less Architecture?

Another perceived benefit of implementing server-side storage is that it eliminates the need to deploy costly and complex SAN infrastructure like host bus adapters (HBAs) and Fibre Channel switches. The fact is, while application data is stored locally on the host, a storage network is still required so that data can be replicated across server nodes for data sharing and resiliency. As with any clustered environment, network complexity rises as the number of nodes (hosts, in the case of VSAN) increases.

Distributed Storage Provisioning

To ease virtual infrastructure management, administrators need a solution that enables storage resources to be seamlessly provisioned out to virtual applications as they are needed. By decoupling the storage hardware from the physical server layer, organizations can more efficiently utilize storage capacity and actually increase VM density – increasing the return on their investment in virtual server infrastructure.

On-demand Storage

One way to accomplish this is by implementing a grid storage architecture alongside the virtual server environment. For example, independent storage nodes configured in a grid could be attached to an existing GbE or 10 GbE network so that physical and virtual servers could access storage resources on-demand. Commodity NICs and Ethernet switches would be utilized to build the storage network – saving organizations money and reducing storage networking management complexity.

Grid-Pooled Resources

More importantly, physical storage resources are not captive to any individual server; instead, they are managed on a shared grid. This is an important distinction: unlike server-side storage solutions like VSAN, which force servers offline to add capacity, with a grid you simply add nodes to your network and the new storage resources become available transparently and non-disruptively.

For server-side storage, adding capacity means scheduling a maintenance window, migrating VMs to another server, and so on. This is far more disruptive from an operational perspective and introduces undue risk to the environment as a whole. What happens if the upgrade fails or, worse, the upgraded hypervisor server fails to come back online?

SLA Driven Provisioning

Intelligent storage grid technologies, like those from Gridstore, don’t have any impact on the virtual server environment because the storage is a pooled resource (of physical storage nodes) segregated from the server infrastructure. By utilizing a virtual storage controller, these solutions optimize the I/O path between the VM and the backend storage resource. For example, Hyper-V administrators can set the specific service level agreement (SLA) performance policy for a particular virtualized application and the virtual storage controller will dynamically match the application with the appropriate storage resource on the grid – flash, SSD or HDD.

What’s critically important is that these storage performance SLAs remain bound to the VM regardless of whether it is migrated to another hypervisor. So if a business-critical application suddenly needs to be migrated to another hypervisor server to gain access to additional compute resources, its active hot data sets will not need to be evicted first from cache. Instead, the application’s storage SLA will migrate along with it to the new hypervisor and it will maintain access to its previously assigned storage resources.
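
The idea of a policy that travels with the VM rather than with the host can be sketched in a few lines. This is a conceptual illustration only, not Gridstore’s API: the SLA names, the tier mapping, and every class and function name below are hypothetical.

```python
from dataclasses import dataclass, replace

# Hypothetical SLA names and tier mapping, for illustration only.
TIER_BY_SLA = {"gold": "flash", "silver": "ssd", "bronze": "hdd"}


@dataclass(frozen=True)
class VirtualMachine:
    name: str
    sla: str    # performance policy bound to the VM itself
    host: str   # hypervisor the VM currently runs on


def storage_tier(vm: VirtualMachine) -> str:
    """Pick the grid tier from the VM's SLA; the host plays no part."""
    return TIER_BY_SLA[vm.sla]


def migrate(vm: VirtualMachine, new_host: str) -> VirtualMachine:
    """Move the VM to another hypervisor, leaving its SLA untouched."""
    return replace(vm, host=new_host)


db = VirtualMachine(name="oltp-db", sla="gold", host="hv-01")
moved = migrate(db, "hv-02")
print(storage_tier(db), storage_tier(moved))
```

Because the tier is looked up from the VM’s policy and never from its host, the lookup returns the same answer before and after the migration.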

Pay-As-You-Grow

One of the most beneficial aspects of storage grid architectures is that they can start small and then non-disruptively expand out over time as application performance and storage capacity needs grow. Unlike server-side storage architectures where one may be compelled to pre-emptively add storage capacity to avoid future service disruptions, there is no need to over-provision storage capacity up-front. It’s more of a true, pay-as-you-grow model. This is an especially important capability for service providers since they need to keep a tight rein on cash flow. Most importantly, this pooled resource can be scaled independently from the server infrastructure.

Conclusion

Server-side flash and SSD storage technologies can be beneficial when there is a sudden need to accelerate application performance on a select set of physical servers or hypervisors. The challenge is one of scale. Data centers that need the scalability and ease of management that virtualized infrastructure offers need storage solutions that are capable of integrating seamlessly into these environments.

The unintended consequence of trying to lower the cost of storage this way is that server-side storage drives both capital and operating costs in the wrong direction, while increasing risk. Server-side storage architectures are hampered by the need to physically install storage resources into the hypervisor itself whenever there is a need for additional storage capacity. And the frequency of this event is now multiplied by 9 due to the growth rate of storage and the server-side storage architecture. While infrastructure planners could pre-seed their virtualized server infrastructure with a complement of flash, SSD and HDD capacity, this could require a large up-front investment. Since the price of storage has typically dropped over time, it behooves IT planners to deploy resources only as they are needed, “just-in-time”.

Grid storage solutions, like those offered by Gridstore, give data centers the ability to economically scale out pooled storage resources in lock-step with, and yet independently of, their virtualized server environments. Since storage is managed across independent nodes that attach directly into the same network as the virtualized infrastructure, there is never a need to disrupt applications when it comes time to add storage capacity or additional storage I/O for performance-demanding applications. Instead, it’s simply a building-block or “Lego” style approach to adding storage resources. This allows organizations to quickly and easily provision storage resources as they are needed, helping to conserve capital and simplify operational management.

Gridstore is a client of Storage Switzerland

About Colm Keegan

As a 22-year IT veteran, Colm has worked in a variety of capacities ranging from technical support of critical OLTP environments to consultative sales and marketing for system integrators and manufacturers. His focus in the enterprise storage, backup and disaster recovery solutions space extends from mainframe and distributed computing environments across a wide range of industries.

Tagged with: Flash, Gridstore, HDD, PCIe, Server-side, SSD, Virtualization
Posted in Article

27 comments on “The Problems With Server-Side Storage, Like VSAN”

Noam Shendar says:

April 3, 2014 at 8:35 pm

Sincere but self-serving comment alert! 🙂

+1 from us here at Zadara Storage. Server-side storage (or Server SAN as some call it) is easier to understand than to use.

Colm has done a great job of capturing the real-world limitations of these products.
Captain VSAN says:

April 7, 2014 at 9:56 am

Clearly a guy who has no clue what he is writing about and has not done his homework researching it. And the fact that you are sponsored by Gridstore had nothing to do with your bias, I’m sure.
- George Crump says:
  
  April 7, 2014 at 2:33 pm
  
  Captain, if you would like to provide a professional response and indicate specific problems, we will be happy to discuss those with you. We have a track record of responding to professional comments. It becomes a learning experience for both our readers and us. If you can’t respond professionally, your comments will be deleted.
Michael Stenson says:

April 8, 2014 at 8:59 am

vSphere Giveth, VSAN Taketh away. True dat!

Joking aside, it’s a matter of time before people start looking beyond the marketing aspect of server-storage and take a closer look at the tradeoffs and the challenges it presents because, let’s face it, there’s no such thing as a free lunch.
Jason Nash says:

April 8, 2014 at 11:06 am

There are some interesting assumptions in this article. The big one being that if storage is growing 9 times as fast as compute you need to scale your VSAN nodes 9x as fast. Why? Hard drives are getting bigger by the day (6TB!) and the flash cost per GB and cost per IOP are dropping by the second. You don’t have to scale nodes to scale storage capacity and performance. Let the technology do that for you.

The scare tactic around maintenance windows and upgrade downtime is a bit misleading as well. Isn’t the beauty of virtualization and abstraction of underlying hardware the ability to take no downtime for hardware maintenance and upgrades? Not to mention buying the appropriate nodes up front with empty slots to allow for hot-add upgrades.
Chuck Hollis says:

April 8, 2014 at 11:31 am

Hi George

I did my best to create a professional response and indicate specific problems in this post:
http://chucksblog.typepad.com/chucks_blog/2014/04/the-problem-with-storage-swiss-analysis-on-vsan.html
Michael Stenson says:

April 8, 2014 at 1:03 pm

Well, it appears that Andy Warfield over at Coho has some similar concerns. http://www.cohodata.com/blog/2014/04/08/hyperconvergence-is-a-lazy-ideal/
- duncan@yellow-bricks says:
  
  April 14, 2014 at 3:09 am
  
  LOL, yeah sure, every startup has concerns about VMware moving into their market… they see it as a threat.
George Crump says:

April 9, 2014 at 6:23 am

Mr Hollis,

For people who are concerned about our ability to balance independence vs. sponsorship, I suggest that they do a search for “EMC” on our website. Make sure you scroll to the bottom of the search result and click on “older” several times.

You will see that our clients sometimes inspire thought leadership, are praised when they do something that we think is right, and are criticized when they don’t. There are countless examples where clients of ours have loved what we have written and not loved what we have written. Our job is to provide balance as best as we can. You will also see that there are almost as many non-sponsored articles and blogs as there are sponsored. We are not perfect, but we try very hard. We also allow commenting so that those who disagree can voice their concerns, assuming they can do so without name-calling.

Again, this article was not anti-VSAN; it was about the PROBLEMS with VSAN. I am not suggesting that anyone not install VSAN. If you want to install VSAN, more power to you; there are some things that it does well. But we want you to be aware of its potential problems. I have confidence in our readership to make their own determination of what is best for them. All products have weaknesses. Knowing those weaknesses and designing around them is the job of IT.

Blind adoption of any technology, which I believe that some would like, is not good for anybody. If they can’t stand the criticism without resorting to bullying then tough. Especially when their target this time is a firm that has been at this for eight years, is crystal clear on its business model and knows what it is talking about.

George Crump
Chief Steward, Storage Switzerland
- Chuck Hollis says:
  
  April 9, 2014 at 6:49 am
  
  Hi George
  
  You published incorrect conclusions based on made-up “facts” that fly in the face of both published documentation and widespread user experience. You did so apparently at the behest of your client, GridStore. I have an issue with that, as would others.
  
  You’re quite right that people need to be armed with the facts when making technology decisions, and no one should blindly adopt any technology.
  
  But you’re not helping matters by intentionally spreading misinformation.
  
  — Chuck
George Crump says:

April 9, 2014 at 7:27 am

Mr. Hollis,

The problems pointed out in the article are based on our research with many external sources and internal validation. Again, I am reviewing your rather lengthy response. If we got it wrong, I will correct it and admit mistakes. IF those mistakes exist, and at this point, I am not convinced they do, that is exactly what they were, mistakes. They were not “made-up facts at the behest of my client”. Nor is Storage Switzerland “intentionally spreading misinformation”.

We have never claimed to be perfect, but indeed very human. I only ask that you further investigate our business model and our track record of eight years of disclosure and provide the same consideration of admitting your repeated and mistaken claims about my company’s character and ethics.

George
- duncan@yellow-bricks says:
  
  April 14, 2014 at 3:13 am
  
  What I personally do not understand is that a company like Gridstore sponsors a post like this while, to my knowledge, they don’t even support the VMware platform. What are they trying to prove?
Michael Shea says:

April 9, 2014 at 10:02 am

Ultimately, VSAN leaves a lot of open questions. Marketing hyperbole and breathless rants shed more heat than light.

In the end, time and experience will decide who gets to say “I told you so”. Until then, Caveat Emptor.
Wim Provoost says:

April 9, 2014 at 10:42 am

I believe the truth is somewhere in between. Some of the above points are valid. I even believe you missed one of the most important points: it is VMware only. On the other hand, VMware has built a great product, which in time will only become better and more scalable. Hardware will never be able to beat software implementations in terms of cost and complexity for the same availability/reliability. The biggest ‘issue’ I see with both solutions is the lack of openness.

I’ve tried to summarize the valid points from both sides in a blog post: http://blog.openvstorage.com/2014/04/why-both-converged-storage-and-external-storage-make-sense/

Note: I work for an open-source project which tries to be ‘Swiss’ in between converged and external storage arrays.
Problems with Server-Side Storage, like VSAN says:

April 10, 2014 at 3:31 pm

[…] storage, “The Problems with Server-Side Storage, Like VSAN”,published on 3/31/2014 https://storageswiss.com/2014/03/31/the-problems-with-server-side-storage-like-vsan/ A post and link was placed on Spiceworks on April 4th. […]
Server-Side, Converged Storage vs. Shared Storage | Storage Swiss - Storage Switzerland says:

April 11, 2014 at 11:27 am

[…] Fundamentally, there is nothing wrong with this approach and I have personally suggested to storage managers that they at least consider server-side storage as a potential solution to a variety of storage challenges they have. That said, I do have concerns with these architectures and those storage managers should be aware of them. No infrastructure is perfect, especially as it scales. Some of those problems were pointed out by my colleague Colm Keegan, in his article, “The Problems With Server Side Storage, Like VSAN“. […]
Mark Kulacz says:

April 13, 2014 at 10:33 am

Mark Kulacz (NetApp) – VSAN sure does look as if it was inspired by Isilon’s OneFS.
Mark William Kulacz (@markkulacz) says:

April 13, 2014 at 2:26 pm

Just curious – Does (will?) VSAN support the new 6TB hard drives? It probably will, but looking for some confirmation.
- duncan@yellow-bricks says:
  
  April 14, 2014 at 3:07 am
  
  If someone goes through the certification process it will be supported.
FUD It! says:

April 14, 2014 at 6:31 am

[…] something stood out to me when it comes to the world of storage and virtualisation and that is animosity. What struck me personally is how aggressive some storage vendors have responded to Virtual SAN, […]
My favorite part about VSAN… | Virtual Data Blocks says:

April 28, 2014 at 5:17 am

[…] enabled in vCenter by just clicking a couple of check boxes. VSAN has seen its supporters and detractors but one thing that everyone agrees on is its […]
Why both converged storage and external storage make sense - Open vStorage BlogOpen vStorage Blog says:

April 6, 2015 at 9:30 am

[…] the peace in storageland has been disturbed by a post of Storage Swiss, The Problems With Server-Side Storage, Like VSAN. In this post they highlight the problems with VSAN. At the same time Coho Data, developer of […]
Brian Politis says:

July 30, 2015 at 7:44 pm

Interesting article, but there are quite a few holes. I’m a VSAN advocate, so I’ll admit my bias first. Now let’s jump into the details…

1. Regarding CPU and storage growth demands, you make some interesting points. However, you fail to mention that available storage per drive bay has escalated at far faster rates than the ratios you mention for CPU and storage growth in general. I’m currently designing a VSAN to support 60TB usable initially with growth to 120TB. The customer wants the option to scale up (storage) or out (compute) as needed over the next five years.

The easy answer here is very large SSD drives for the capacity tier on VSAN out of the gate. I can purchase 4TB SSD for the capacity tier at a street price of $4,500/SSD. Then front the capacity tier with an 800GB SSD for the cache tier at price of about $1500.

This kind of drive density completely alters the spindle counts required for traditional SAN to scale up. And when dealing with SSD like this for most purposes the IOPS/spindle are a non issue. Latency of less than 4ms is essentially instant for most applications.

Suddenly, unless you are dealing with petabyte capacities, designing to scale up is relatively easy.

In the deployment I’m designing right now each Dell 730 Host has 24 slots. Ultimate scale up limits result in 21 slots dedicated to the VSAN capacity tier. Peak capacity per host is 4TB x 21 slots = 84TB per host. 84TB per host x three hosts = 252TB. Divide that by 2 for my striping needs (resulting in stability greater than traditional RAID 6) to result in roughly 120TB usable across three hosts.

You are correct that once 120TB is utilized I will have to scale up, but for most deployments this tremendous density is more than adequate. To put it in perspective, my current customer took 15 years to reach the 40TB currently used. They routinely grow at 10TB a year. My deployment will easily last them 6 years if this growth rate is sustained. And they can always buy bigger, denser SSDs as they hit the market.
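
The sizing arithmetic in this comment can be checked with a short sketch. The drive size, slot count, host count, divide-by-two factor, current usage and growth rate are the figures quoted above; the function names are hypothetical, and the sketch makes no claim about how VSAN itself accounts for capacity.

```python
def usable_capacity_tb(drive_tb, capacity_slots, hosts, copies):
    """Return (raw TB, usable TB) for a cluster of identical hosts."""
    raw_tb = drive_tb * capacity_slots * hosts
    return raw_tb, raw_tb / copies


def runway_years(usable_tb, used_tb, growth_tb_per_year):
    """Years until usable capacity is consumed at a constant growth rate."""
    return (usable_tb - used_tb) / growth_tb_per_year


# 4TB drives, 21 capacity slots per host, three hosts, halved for protection.
raw_tb, usable_tb = usable_capacity_tb(4, 21, 3, 2)

# 40TB in use today, growing 10TB a year, against the 120TB design target.
years = runway_years(120, 40, 10)

print(raw_tb, usable_tb, years)
```

The raw total works out to 252TB and the halved figure to 126TB, consistent with the "roughly 120TB" target, and the runway at a constant 10TB a year is 8 years, which comfortably covers the 6 years claimed.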

Perhaps video streaming services for a significantly sized media company would be limited here, but not much else. And scaling OUT is actually pretty cost effective. The cost/overhead of a Dell 730 chassis isn’t much more than my traditional SAN vendors charge me for a disk shelf.

And I won’t even go into how when these servers are cabled with DAC cables the cost of interconnection between the SAN and the compute nodes nosedives. Not requiring Fiber or other expensive interconnects drives down cost significantly.

2. Your maintenance notes above are largely no longer true as well. Adding drives, rebuilding disks, and changing out disks for larger ones and rebuilding have all been handled dynamically by VSAN since the first production version shipped.

The second production version even moves these commands into the GUI, whereas in the first production version you did need to type them into a command line.

3. I used to think like you, but then read a VSAN development blog entry titled something like “Why we shipped VSAN without compression” and the guy who wrote the blog basically summed it up as – “with current drive densities and with greater drive densities hitting the market so quickly we see Compression as a solution to a problem that no longer exists. There are cheaper faster ways to skin the cat rather than develop compression algorithms”

This blog entry got me thinking and I soon realized they were right – even though it completely spun my head around.

I was still designing solutions based on constraints that no longer exist in the market place. SAN was created because 80MB disks were as big as you could purchase for 3.5″ drive bays, and usually there were only 6 slots in a server.

The rest of the SAN feature set developed later as people realized what they Could Do once storage was decoupled. The question now is, with 4TB of essentially unlimited IOPS performance per slot/spindle available to you in the local host at cost-effective dollar amounts, Should You continue in that vein?

Every customer is different, but for most enterprise customers these days I believe the answer is no.