Recovery time reality (RTR) and recovery point reality (RPR) are relatively new terms that relate to recovery time objective (RTO) and recovery point objective (RPO). They refer to the difference between the actual capabilities of your recovery system and the objectives against which it will be measured. In a perfect world, the reality would match the objective. But the world, especially the world of IT, is not a perfect place.
What are RTO and RPO?
Recovery time objective (RTO) is the amount of time your company has determined the recovery of a given application should take. This should be a non-arbitrary value based on the cost of downtime. If you can afford to be down for long periods of time without affecting the bottom line, then you can accept a long RTO. If, however, your company loses tens of thousands or even millions for every hour that your system is down, you can easily justify a very small RTO.
Recovery point objective (RPO) is the amount of data that your company has determined a recovery can acceptably lose. RPO should be a non-arbitrary value based on the cost of replacing that data. If you can recreate any data in your system using manual purchase orders, for instance, you can afford to lose large segments of it because you can easily replace it. If, however, your ordering system is driven directly by a web interface that has no manual backup, every minute of lost data could result in lost orders. Since many companies now have front-facing systems like this, RPOs of only a few minutes are becoming quite common.
Getting Back To Reality
The question at hand, however, is what is your recovery reality? Suppose your company has deemed that your application has an RTO and RPO of one hour. That means that you should completely recover the application within one hour, and you should lose no more than one hour of data. Is your recovery system able to satisfy such an RTO? What is the actual recovery time? That is your RTR. Are you able to restore all but one hour's worth of data? This basically comes down to how often you are backing up. If you are backing up once a day, then your RPR is going to be at least 24 hours. If you’re taking a snapshot once an hour, your RPR is one hour. If you’re using continuous data protection, your RPR can be effectively zero.
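The backup-frequency arithmetic above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: the worst-case data loss is taken to equal the interval between backups, ignoring how long each backup takes to complete. The function names `worst_case_rpr` and `gap` are hypothetical, not part of any backup product.

```python
from datetime import timedelta


def worst_case_rpr(backup_interval: timedelta) -> timedelta:
    """Assumption: the most data you can lose equals the time between backups."""
    return backup_interval


def gap(reality: timedelta, objective: timedelta) -> timedelta:
    """Return how far reality exceeds the objective (zero if the objective is met)."""
    return max(reality - objective, timedelta(0))


rpo = timedelta(hours=1)  # the one-hour RPO from the example above

# Illustrative schedules taken from the paragraph above.
schedules = {
    "daily backup": timedelta(days=1),
    "hourly snapshot": timedelta(hours=1),
    "continuous data protection": timedelta(0),
}

for name, interval in schedules.items():
    rpr = worst_case_rpr(interval)
    shortfall = gap(rpr, rpo)
    if shortfall == timedelta(0):
        status = "meets the RPO"
    else:
        status = f"misses the RPO by {shortfall}"
    print(f"{name}: RPR {rpr}, {status}")
```

With a one-hour RPO, the daily backup is the only schedule of the three that falls short, and it does so by 23 hours.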
Unfortunately, many people operate in a world where both their RTR/RPR and RTO/RPO are unknown. My colleague George Crump calls this Mutual Mystification and wrote about it in a recent blog, “Is Mutual Mystification Part of your Disaster Recovery Plan?” Essentially the organization has not come to an agreement as to how fast an application should be recovered and how much data can be lost, and it has no idea how long its recovery system will actually take to restore an application or how much data will be lost. If any of these values is unknown in your environment, take the time to fix that.
Ending Mutual Mystification
The reason it is so important to clear up the mystification is that mismatched expectations are the surest way to end up with an unhappy customer, and they may even kill the business. If you have an undefined RTO/RPO, you can be assured that the RTO and RPO in the customer’s mind will be significantly smaller than what you had in mind. And if you have an unknown RPR and RTR, you can almost guarantee that they will not match the agreed-upon RTO and RPO.
Determine RTR and RPR FIRST
Before working with the business to define RTO and RPO, you should know your RTR and RPR. To establish these values, you need to do test recoveries with your current backup and recovery system in order to know how quickly it can recover an application and how much data you will lose. Only then will you know your RTR and RPR, and only then can you address any discrepancies between these values and your RTO and RPO. Better to know now than to find out during a recovery.
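One simple way to capture an RTR figure during a test recovery is to time the restore from start to finish and compare the result with the objective. The sketch below assumes you can wrap your restore procedure in a Python callable. `placeholder_restore` is a hypothetical stand-in that only sleeps, and you would replace it with whatever actually drives your backup product.

```python
import time
from datetime import timedelta
from typing import Callable


def measure_rtr(restore: Callable[[], None]) -> timedelta:
    """Run a test restore and return the wall-clock time it took."""
    start = time.monotonic()
    restore()
    return timedelta(seconds=time.monotonic() - start)


def placeholder_restore() -> None:
    """Hypothetical stand-in for a real restore; it only sleeps briefly."""
    time.sleep(0.1)


rto = timedelta(hours=1)  # the objective agreed with the business
rtr = measure_rtr(placeholder_restore)
print(f"RTR: {rtr}, RTO: {rto}, within objective: {rtr <= rto}")
```

A real test should time the full recovery, up to the point where the application is usable again, not just the data copy.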
I say this because I suffered this very same problem years ago when I began my career. I had a fancy new backup system that I knew was so much better than the old system. I didn’t test how long a restore would take until we lost a major file system. The first time I figured out how well the new system would perform was when I fired it in anger at that file system. It didn’t go well. It was a horrible day and resulted in weeks and months of meetings that I could’ve avoided if I had only done one thing: Test.
Knowledge is power. You must know your RPR and RTR, and the only way to know them is to test recoveries using your current backup and recovery system. Once you discover any discrepancies between your RPR and RTR and the objectives that have been set for your system, you can proactively solve the problem. Or you can wait until the defecation hits the rotary oscillator.