Recently we hosted a very well-attended webinar with Avere Systems CEO Ron Bianchini, “Five Keys To A Successful Hybrid Cloud Storage Strategy”. The following excerpt from the Q&A portion of the webinar discusses how businesses can seamlessly implement an on-premises and off-premises hybrid cloud storage solution. The webinar is now available on demand; to view it, click here.
“Are you using RAID on the Edge filer?”
Ron Bianchini: We’re not using RAID on the Edge filer. Each individual FXT node is considered the failure domain. So when a disk fails, the node takes itself out of the cluster, and because of our N+1 failover, the system continues to run and does not lose any data. You can then reconfigure the node to rejoin the cluster with one less drive, or to stay out of the cluster until someone takes a look at it.
“You’re saying AWS, I assume that’s S3, correct?”
That’s absolutely correct. We use the S3 interface to talk to storage up in AWS. So when I refer to AWS, I simply mean the S3 interface to that storage.
“Does all access to core filers need to be via Edge filers?”
That’s mostly true. For cloud core filers, yes, everything needs to come through the Edge filer, because it’s the Edge filer that converts files into objects. For NFS-mounted core filers, we like to have all the data come through the Edge filer, because that’s how you get both read and write acceleration and the best performance. But if you want to allow users to access the core filer behind us, there’s a mode that supports that, called “Write Around Mode.” When you put the Edge filer in Write Around Mode, you’re effectively warning us that people will write around us. We then run periodic checks to make sure that the data in the Edge filer is consistent with what is in the core filer; if it isn’t, we throw out our copy and go get the newer version.
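The write-around behavior described here amounts to a periodic cache-revalidation loop. The following is a minimal sketch of that idea; all names (`EdgeCache`, the dictionary-based "core filer") are hypothetical illustrations, not Avere's actual implementation.

```python
# Toy model of write-around revalidation: the Edge cache periodically compares
# its cached modification times against the core filer and drops stale copies.
# Entirely hypothetical; Avere's real mechanism is not public.

class EdgeCache:
    def __init__(self, core):
        self.core = core          # backing "core filer": path -> (mtime, data)
        self.entries = {}         # cached copies on the Edge: path -> (mtime, data)

    def read(self, path):
        if path not in self.entries:
            self.entries[path] = self.core[path]   # cold miss: fetch from core
        return self.entries[path][1]

    def revalidate(self):
        """Periodic check: if someone wrote around us, fetch the newer version."""
        for path, (mtime, _) in list(self.entries.items()):
            core_mtime, core_data = self.core[path]
            if core_mtime > mtime:                 # core filer has a newer version
                self.entries[path] = (core_mtime, core_data)

core = {"/a.txt": (1, b"v1")}
cache = EdgeCache(core)
assert cache.read("/a.txt") == b"v1"
core["/a.txt"] = (2, b"v2")            # a client writes around the Edge filer
assert cache.read("/a.txt") == b"v1"   # cached copy is momentarily stale
cache.revalidate()
assert cache.read("/a.txt") == b"v2"   # consistent again after the periodic check
```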
“If you use the Avere software, flash mirror etc, can I realize additional cost savings by removing licensed products that do similar functions on the legacy storage arrays?”
That is absolutely true. A number of customers use our FlashMirror to handle the DR function, and they stop purchasing that function on the core filer behind us. Another interesting example is CIFS: depending on what you want to do with AWS, you may or may not need CIFS running on the core filer. Because we convert all CIFS requests into NFS and only talk to the core filer using NFS, CIFS may not be needed on the core filer at all.
“You can have up to 50 edge filers and 50 core filers, so up to a total of 100 such devices?”
That’s true. We scale from 2 to 50 Edge devices, and independently from 1 to 50 core filers. Because those scale independently, you can have a total of 100 devices.
“You kept talking about offload of 50 to 1, what do you mean by offload?”
Ron: Well, offload is very simple. Count the number of requests coming from the users to the Edge filer, then count the number of requests going from the Edge filer to the core filer; the ratio of the two is the offload. So when I say 50-to-1 offload, I mean that for every 50 IOPS that come in from the users to the Edge filer cluster, 49 are turned around locally, meaning the core filer never sees them, and only 1 results in a request to the core filer.
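The arithmetic behind the 50-to-1 figure can be made concrete with a few lines of Python (the function name is just an illustration):

```python
# Offload ratio = client-facing requests per request forwarded to the core filer.

def offload_ratio(client_ops, core_ops):
    return client_ops / core_ops

# 50 client IOPS hit the Edge cluster; only 1 is forwarded to the core filer.
ratio = offload_ratio(50, 1)
assert ratio == 50            # 50-to-1 offload
assert 49 / 50 == 0.98        # i.e. 98% of requests are serviced locally
```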
“If most of the IOPS are writes, can it still be only 2% of the overall IOPS? I’m assuming you don’t write on the Edge filer, but instead write through to the core filer(s).”
The 50-to-1 is very application dependent. Sometimes we do exactly 50 to 1; a good example is the SPECsfs result I shared in the talk. But there are plenty of cases where we do more: in the media and entertainment space, for example, customers typically run at hundreds to one. And some customers have already put a large infrastructure in the core filer and just need a small amount of offload to get their head above water.
The interesting thing is that in write-heavy workloads, if it’s all write-once data, that will drive the offload ratio down, so I think this is a really good question. But if the writes are concentrated on one file with multiple over-writes, you effectively get write compression: only the very last version is written to the core filer. So with shared data, or data that is written very actively, you can see very healthy compression in write offload. The other thing we do is coalesce very small writes: we batch them up and issue much larger IOPs to the backend. That’s another way of increasing offload in a write-intensive environment.
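The two write-offload effects just described, over-write compression and small-write batching, can be sketched as follows. This is a hypothetical model for illustration only; the function names and batch size are made up.

```python
# Effect 1: repeated over-writes of the same file collapse to a single write
# back to the core filer (write compression).

def coalesce(writes):
    """writes: list of (path, data). Keep only the last version per path."""
    last = {}
    for path, data in writes:
        last[path] = data
    return list(last.items())

writes = [("/f", b"v%d" % i) for i in range(100)]  # 100 over-writes of one file
assert len(coalesce(writes)) == 1                  # only the final version goes back

# Effect 2: many small writes are grouped into fewer, larger back-end IOPs.

def batch(small_writes, batch_size=32):
    """Group small writes into large back-end IOPs (batch size is illustrative)."""
    return [small_writes[i:i + batch_size]
            for i in range(0, len(small_writes), batch_size)]

assert len(batch(list(range(100)))) == 4           # 100 small writes -> 4 back-end IOPs
```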
“How do you address the security concern with respect to cloud storage?”
The way we address security is encryption, at a very high level of strength: 256-bit AES. The important thing about how we do encryption is that the keys never leave the premises; they stay in the Edge filer. The only thing that gets written back to the cloud is opaque objects. The cloud provider simply has no idea what’s inside that data, and because they don’t have the keys, there’s no way to crack open the object and see what the data is. So that’s really how we address security: a very high level of encryption, with keys that never leave the customer’s premises and always stay on the Edge filer.
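The key property here, keys stay local while the cloud only ever receives opaque ciphertext, can be illustrated in a few lines. Note the cipher below is a toy XOR keystream (SHA-256 in counter mode) standing in for AES-256, purely so the sketch is self-contained; do not use it for real encryption, and all names are hypothetical.

```python
# Conceptual sketch of "keys never leave the premises": the Edge filer holds
# the 256-bit key; the cloud store only ever sees opaque bytes.
import hashlib
import secrets

def keystream_xor(key, data):
    """Toy stand-in for AES-256: XOR with a SHA-256 counter-mode keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

edge_key = secrets.token_bytes(32)        # 256-bit key, never leaves the Edge filer
plaintext = b"customer file contents"
cloud_object = keystream_xor(edge_key, plaintext)   # only this is sent to the cloud

assert cloud_object != plaintext                           # provider sees opaque bytes
assert keystream_xor(edge_key, cloud_object) == plaintext  # Edge filer decrypts locally
```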
I know we didn’t discuss cost in the talk. We sell five different systems, which we recommend based on workload. For example, some systems are all-SSD for very random workloads, while most of our systems combine SSD and disk for more typical enterprise workloads. We offer both buy and lease options, but the total price is very much dependent on your needs.
“Who manages the encryption key for the data stored in Amazon?”
The answer is simple: the encryption keys are stored in the Edge filer. Currently, all encryption key management (the cycling of keys) is done on the Avere Edge filer using an Avere proprietary solution that comes with the product. One of the things we are going to do in a near-term release is open up support for external encryption key management, and then we’ll be able to support that standard.
“What happens if in the future I want to migrate my data out of Amazon and over to a different cloud provider? How do I move the information?”
This is a perfect application for our FlashMove software. With the FlashMove function, you simply go to the GUI and change the directory-to-repository mapping. So if you’re currently with one cloud provider and want to move that export to another, you go to the GUI, make a couple of mouse clicks, and we copy the data out. The beauty of FlashMove is that it operates independently of whether the repository is cloud or NAS: you can move from NFS to NFS, from NFS into the cloud, from cloud to cloud, or from cloud to NFS. The nice thing is that the move happens concurrently with users transacting against the data, so there’s never an outage for the migration.
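The directory-to-repository remapping described here can be modeled as a namespace table that points an export at a repository, with a move copying the data and flipping the pointer. This is a hypothetical sketch of the concept; the names, URIs, and dictionary-based "repositories" are invented for illustration, not Avere's API.

```python
# Toy model of FlashMove: the global namespace maps an export to a repository;
# a move copies the data and repoints the export, regardless of whether either
# side is NAS or cloud. All names here are hypothetical.

namespace = {"/projects": "nas://filer1/projects"}   # export -> repository
repos = {
    "nas://filer1/projects": {"a.txt": b"data"},
    "s3://bucket/projects": {},
}

def flash_move(export, dest_repo):
    src_repo = namespace[export]
    repos[dest_repo].update(repos[src_repo])   # copy the data to the new repository
    namespace[export] = dest_repo              # repoint the export; no client outage
                                               # in the real product, since users keep
                                               # transacting through the Edge filer

flash_move("/projects", "s3://bucket/projects")
assert namespace["/projects"] == "s3://bucket/projects"
assert repos["s3://bucket/projects"]["a.txt"] == b"data"
```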
“How expandable is cloud NAS? If we acquire another company, and our data needs suddenly surge, how easily can we increase the cache to accommodate the additional data?”
The beautiful thing about the Edge/Core architecture is that you can scale performance independently of capacity. When you bring in all this additional data, if it comes with no increase in performance requirements, you simply bring up another repository (or the repository of the acquired company), link it into the global namespace, and you’re done: you’ve just scaled capacity. If you now need to scale performance, you simply buy more Edge filers for the performance level you need. This is easily accomplished with the Edge/Core architecture; really, what Edge/Core enables is that independent scaling of performance from capacity.
“Which enterprises do you think are starting to integrate to the cloud? And for what use cases?”
Well, I would say a couple of the early use cases we’re seeing are for active archive, so maybe for regulatory reasons, they need to keep data around for a while. But every once in a while they need to actually go in and look at that data. We’re seeing a lot of use cases like that. So big bulk storage out in the cloud, and periodically the ability to go in and take a look at what that data is.
We’re also seeing use cases around Disaster Recovery. I think this was a point George made early in the talk, that a lot of the early use for cloud is for backup. With our FlashMirror, you can keep the data on-prem in a NAS filer while also keeping a copy synced to the cloud. So in a disastrous event where you lose your main data center, you always have a full backup in the cloud.
We’re also starting to see what I’ll call off-prem peak storage. If all of a sudden there’s a huge influx of data and you don’t have time to get your data center ready to store it all, you can easily use us to push that data into the cloud; then, as the gear shows up and your data center grows, you can bring it back in.
“Not sure why you say gateways don’t help with performance, they say they have onsite caches and thus provide a local access feel, I’m assuming you do the same – am I missing something?”
The point I was trying to make is that the typical gateway uses a very conventional NAS model: there’s one CPU, and that CPU has to drive all the IOPs. The only thing you can do to scale it is put more and more storage behind it, so storage capacity is easy to scale, but performance isn’t. The big point of the Edge filer is that you can scale performance by adding more boxes. Every time you add an Edge filer node to the cluster, you add not only CPU but also media capacity that becomes globally shared by all the nodes. So yes, a gateway gives you some level of performance, but it’s fixed by the head used to implement that gateway. Ours starts with the ability to use local flash, automatically keeping the hottest data in flash, and then adds the ability to scale as you add nodes to the cluster.
Register here to learn more about Hybrid Cloud storage strategies. Your registration allows you to watch the complete webinar featuring George Crump and Ron Bianchini, now available on demand.
Avere Systems is a client of Storage Switzerland