Backup and recovery are vitally important to what we do in the data center. We can’t lose data, and if something does happen, we must get it back, and we have to recover it fast. To help you succeed with backup and recovery, Storage Switzerland founder George Crump met with Amy Reed from Synnex and HP’s Scott Baker, HP’s Director of Enterprise Data Protection. This is a transcription of the question-and-answer portion of the webinar, with Scott and George answering questions from webinar attendees.
Question: “Earlier comments were made about throwing hardware and software at backup problems, can you give me some examples that HP is seeing from its customers?”
Scott: It’s pretty commonplace, and it breaks down like this: when a customer is dealing with a situation where they’re running out of capacity, it’s as simple as throwing more hardware at it. Typically the hardware that gets thrown will be the lowest-cost hardware, which introduces a number of complexities, because the personnel in the environment have to be specially trained to understand how to work with it.
If we hear things like “applications are running slow,” that’s typically treated as an opportunity to upgrade the infrastructure, and unless you have an endless wallet, doing that outside of budgetary planning is very difficult. If I hear “backup is taking too long,” it traditionally means they may be looking at opportunities to update the backup infrastructure, but it tends to be very point-based in nature. In other words, a lot of people will buy purpose-built technology to address a specific kind of problem, whether they’re attacking it from the hardware side or from the backup side, and in a worst-case scenario you see an introduction of five different backup solutions. If I hear “we need to be more compliant,” it typically means let’s implement some kind of archive device, some kind of document management system or regulatory compliance system.
The last two I would pass along: if we need to retain our information for longer periods, that typically translates into “we need to retain things forever.” Lastly is the need to retrieve historical information, which means they’re trying to find different ways to access information and store it on a tape device, or push that information into a cloud so they can offload the responsibility for owning that infrastructure, and almost use it as a sort of quasi-DR solution.
Question: “RPO and RTO seem the same, can you explain the difference?”
George: They are closely related; they’re more like brothers than twins, though. So first of all, RPO is recovery point objective, and RTO is recovery time objective. Recovery point is how often you’re capturing data in some sort of protected state. The lower your RPO, the more often you’re capturing that information. So it’s essentially a sensitivity to data loss or to rekeying information.
If you are Amazon and you lose an hour’s worth of data, that could be a catastrophe all by itself. If you are Storage Switzerland, it wouldn’t be great, but we probably have it all backed up.
RTO is how long it takes to get the application up and running. It’s more than just reinstalling Exchange; it’s getting the data restored, making sure you replay all your transaction logs if you have them, things like that. It’s the total time until users have things back up and running in the closest state to where they were before the outage. That’s really the basic difference.
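George’s distinction can be made concrete with a little arithmetic. Here is a minimal sketch (the timestamps and objectives are hypothetical, not from the webinar): the data-loss window is measured backward to the last protected copy, while downtime is measured forward to when users are working again.

```python
from datetime import datetime, timedelta

# Hypothetical objectives for illustration.
RPO = timedelta(hours=1)   # max tolerable data loss
RTO = timedelta(hours=4)   # max tolerable downtime

# A hypothetical outage timeline.
last_backup = datetime(2024, 1, 10, 13, 0)    # most recent protected copy
outage_start = datetime(2024, 1, 10, 13, 45)  # failure occurs
service_restored = datetime(2024, 1, 10, 16, 30)

data_loss = outage_start - last_backup      # work created since the last capture
downtime = service_restored - outage_start  # restore + log replay + verification

print(f"data loss window: {data_loss} (RPO met: {data_loss <= RPO})")
print(f"downtime: {downtime} (RTO met: {downtime <= RTO})")
```

Lowering RPO means capturing more often (shrinking `data_loss`); lowering RTO means speeding up the whole restore chain (shrinking `downtime`).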
Question: “What is the most difficult backup configuration you’ve seen?”
Scott: The most challenging conversation I had this week was with a company that wanted to deploy a multi-tiered e-commerce application. Their user interface runs on VMware, sitting on top of NAS-based storage that’s been virtualized; their WebLogic layer is a physical implementation installed on SAN disks coming off of a non-HP array; and their database layer is split across two different databases. One database is Oracle and the other is SQL, one of them on SAN storage and the other on iSCSI storage.
So here is the challenge: given all of this, I asked them to define their backup requirements. They want to back up to a separate physical location. They want a backup schedule that includes hourly snapshots, nightly fulls, differentials of the database every six hours, and differentials of the log files every 30 minutes. From a backup retention perspective, they wanted daily backups for the last seven days, weekly backups for the last four weeks, monthly backups for the last twelve months, and yearly backups capped. Then with recovery, they wanted point-in-time recovery at the object level, application extensions for self-service recovery by the individual business unit owners, and an application-consistent recovery approach.
As far as the retention periods, three of them (daily, weekly, monthly) had to be maintained off site, and each period had to be on a different media type. User access also occurs at over 50 different remote sites.
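The retention schedule Scott describes (seven dailies, four weeklies, twelve monthlies, plus yearly copies) amounts to a classic grandfather-father-son rotation. Here is a minimal sketch of such a policy; the choice of Sunday for weeklies and the first of the month for monthlies is an assumption for illustration, not something specified in the webinar.

```python
from datetime import date

def retained(backup_date, today):
    """Classify which retention tier, if any, keeps this backup.

    Rules sketched from the schedule above: 7 dailies, 4 weeklies,
    12 monthlies (~365 days), plus yearly copies. Sunday weeklies and
    first-of-month monthlies are illustrative assumptions.
    """
    age = (today - backup_date).days
    if 0 <= age < 7:
        return "daily"                                  # last seven days
    if age < 28 and backup_date.weekday() == 6:
        return "weekly"                                 # Sundays, last four weeks
    if age < 365 and backup_date.day == 1:
        return "monthly"                                # first of month, last year
    if backup_date.day == 1 and backup_date.month == 1:
        return "yearly"                                 # Jan 1 copies, kept (capped)
    return None                                         # expired: safe to prune

today = date(2024, 6, 15)
for d in [date(2024, 6, 12), date(2024, 6, 2), date(2024, 3, 1), date(2023, 1, 1)]:
    print(d, "->", retained(d, today))
```

A real scheduler would also enforce the yearly cap and the off-site/media-type constraints; this only shows the tiering logic.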
That is a challenge, but it’s also reminiscent of what I feel most organizations deal with to some degree: multi-tiered applications that have to work in concert with each other, and that can be destabilized during a recovery process if you’re using multiple different backup solutions, working with people with different levels of expectations or experience, or using different types of hardware whose varying performance characteristics may help or hurt the overall recovery process.
Question: “What are ways that I can reduce RPO and RTO?”
George: I think that is where the portfolio approach becomes important. You might start by using snapshots. The risk with snapshots is that if your storage system fails, so do your snapshots, so you want to make sure you have a really good storage system and also some way to get the snapshots off of it, so you’re not relying completely on them. Then from there you go through the process of categorizing your applications.
Not every application needs to use snapshots as its primary means of recovery, but for critical applications it probably makes a lot of sense to do so. Then you want to downstage; that might be backup to disk and then backup to tape, it just varies. The simplest way to reduce RPO and RTO would be with a snapshot, but again, that assumes the system stays up and running, so you may then want to replicate to a second storage system, or do frequent backups to a disk appliance.
Scott: I think the right approach is to pick backup storage that gives you tiers, and those tiers allow you to pick the one location that limits the amount of time it takes to get those applications up and running as fast as possible. As George mentioned, that means having near-instant access to those snapshots from a recovery perspective; having a second tier, before the data gets moved off into a traditional backup catalogue with a change in the binary format of the information, as a very important caching area that’s off of the primary array; and then having the ability to get into an archive or a cloud-based target.
But most importantly, being able to coordinate the overall recovery chain, regardless of where the information lies within that span of media, is the most important thing. So if you do lose the snapshots, the recovery chain is updated and you get the next fastest possible recovery point from there.
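The fall-through behavior Scott describes, always recovering from the fastest tier that survived the failure, can be sketched simply. The tier names and speed labels below are illustrative, not a description of any HP product.

```python
# Ordered recovery chain, fastest tier first. Names and speeds are
# illustrative assumptions, not product behavior.
RECOVERY_CHAIN = [
    ("array snapshot", "minutes"),
    ("disk backup appliance", "tens of minutes"),
    ("offsite tape/cloud archive", "hours"),
]

def next_recovery_point(chain, unavailable):
    """Return the fastest tier that survived the outage."""
    for tier, speed in chain:
        if tier not in unavailable:
            return tier, speed
    raise RuntimeError("no recovery point available")

# If the primary array is lost, its snapshots go with it, so the chain
# falls through to the disk appliance.
tier, speed = next_recovery_point(RECOVERY_CHAIN, unavailable={"array snapshot"})
print(f"recover from: {tier} ({speed})")
```

The coordination Scott emphasizes is exactly this: one catalogue that knows every copy’s location, so the fall-through happens automatically rather than by manual hunting.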
Question: “Are the visualizations and analytics you discussed part of a suite or are they sold separately?”
Scott: We debated for a long time whether we should make it a suite-based offering or individual products. In the end, what we decided to do was make them individual components. The reason we decided to do this was to allow us to provide them to users who maybe weren’t ready to move to the latest and greatest version of Data Protector.
So our Backup Navigator, which does the analytics, and the Data Protector management pack, which gives a topological view of the overall infrastructure, are actually backward compatible with our backup product back to version 7. We’re currently on version 9 now, so this lets us support those customers who are happy with the version of Data Protector they have and aren’t totally ready to upgrade. It gives them the ability to take full advantage of the analytics I discussed earlier.
Question: “Do you use deduplication during a backup?”
Scott: Deduplication is certainly going to be unique for each organization, but my overall recommendation is that you take full advantage of it.
With respect to Data Protector itself and the way HP approaches it, we use a common deduplication engine. The engine that’s built into our backup software is the exact same engine that’s running in the backup target hardware from HP. So you only have to deduplicate one time: at the application source to minimize traffic, at the backup server to push the responsibility straight to it, or, if you have plenty of bandwidth, you can push the deduplication responsibility to the backup target.
In the long run, what this allows you to do is take full advantage of a limited amount of capacity without degrading or destabilizing performance when you go through a recovery operation. It’s all pointer-based deduplication, so we still have access to the unique blocks; we just minimize the amount of traffic both on disk and over the network.
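Pointer-based deduplication as Scott describes it, hashing chunks, storing each unique chunk once, and keeping per-file pointer lists, can be sketched like this. The fixed 4-byte chunk size is purely illustrative; real engines use much larger and often variable-size chunks.

```python
import hashlib

CHUNK = 4  # illustrative chunk size; real engines use far larger chunks

store = {}  # hash -> unique chunk bytes (the "unique blocks")

def dedup_write(data):
    """Split data into chunks, store each unique chunk once, return pointers."""
    pointers = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        h = hashlib.sha256(chunk).hexdigest()
        store.setdefault(h, chunk)  # duplicate chunks cost nothing extra
        pointers.append(h)
    return pointers

def dedup_read(pointers):
    """Recovery still sees the full data: pointers resolve to unique chunks."""
    return b"".join(store[h] for h in pointers)

ptrs = dedup_write(b"AAAABBBBAAAACCCC")  # the "AAAA" chunk repeats
assert dedup_read(ptrs) == b"AAAABBBBAAAACCCC"
print(f"{len(ptrs)} pointers, {len(store)} unique chunks stored")
```

This is why the recovery path isn’t destabilized: reads follow pointers to intact unique chunks, while writes and network traffic shrink to only the chunks not already stored.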
George: I think that’s an important point too. My fear nowadays is that deduplication has become a pretty common word and people are treating it like a checkbox item. It’s really worth your time to explore different options with the vendors you’re talking to: what their deduplication strategy is, how it can be deployed throughout the environment, and how well it scales. Those are really important questions to have answers to.
Question: “Do you recommend different technologies for backup and archive and if so, what?”
Scott: I absolutely do recommend not just a different technology, but a different mentality. Backup, for me, is about having immediate access to the most recent information; how you define “recent” is up to your organization. Behind all of this should be a long-term retention strategy based on the referential value of the information, or the probability that you’re going to use it, so that you can move it into some kind of archive target, whatever that may be. It could be tape, a disk-based archive, an object store, the cloud, you name it. It has to match your requirements for cost, efficiency, and access to the information.
Most importantly, make sure that the data protection solution you choose has the capability to manage the movement of data from the production array, to the second point of recovery (a caching or staging area), to the third point (your backup target), and even to manage and work with the data on the archive side: understanding that the information is in the archive, and only retrieving it when necessary, not every time you scan a file system.
George: What we tell people is that they need to reevaluate how they think about archive, because archive has always been this desk drawer where you keep all the old dusty stuff you think you won’t need again. But we need to go after archives more and more frequently, obviously with legal issues, as well as with the emergence of analytics and things like that. We tell people to consider how often they would need to recover something and how fast they would need it back. Typically in an archive situation, you’ve got time; take advantage of that. The more time you have to recover something, the lower-cost media you can afford to put it on. It’s a cost-savings issue, and it’s a great set of questions to ask.