Recovery point objective
Recovery point objective definition
Recovery point objective (RPO) is defined as the maximum amount of data, measured in time, that can be lost after a recovery from a disaster, failure, or comparable event before the loss exceeds what is acceptable to an organization. An RPO determines the maximum age that the data or files in backup storage can have and still meet the objective, should a network or computer system failure occur.
An organization’s loss tolerance, or how much data it can lose without sustaining significant harm, is related to RPO and is set forth in the organization’s business continuity plan (BCP). The RPO also dictates procedures for disaster recovery planning, including the acceptable backup interval, because it refers to the last point when the organization’s data was preserved in a usable format. For example, an RPO of 60 minutes requires a system backup at least every 60 minutes.
What is a recovery point objective?
The recovery point objective (RPO) is a time-based measurement of the maximum amount of data loss that is tolerable to an organization. Also called the backup recovery point objective, the RPO is also important for determining whether the organization’s backup schedule is sufficient to recover after a disaster.
The recovery point objective is critical because at least some data loss is likely when a disaster strikes. Even real-time backups cannot entirely prevent data loss when large-scale failures occur.
RPOs can determine:
- How much data will be lost after a disaster or event
- How frequently you need to back up your data for disaster recovery purposes. RPO concerns only this backup frequency, not other IT needs
How does recovery point objective work?
Often, high-priority applications demand tighter RPOs, which will require more frequent backups. In these situations, the IT department must schedule backup systems that can satisfy such RPOs, such as the combination of snapshots and replication (also known as near-continuous data protection, or near-CDP). When RPO is near-zero, the team will combine failover services and continuous replication, or a continuous data protection system (CDP) to create nearly 100 percent availability for applications and data.
Recovery point objective vs recovery time objective
Recovery point objective and recovery time objective (RTO) are among a data protection or disaster recovery plan’s most important parameters. These objectives can guide the selection of an optimal data backup plan, as well as offer bases for identifying and analyzing viable strategies which could enable the enterprise to resume business processes within a timeframe at or near the RPO and RTO.
Although these two terms are related, it is important to understand the difference between them.
Every BCP sets forth a maximum allowable tolerance or threshold for data loss during a disruption. The recovery point objective (RPO) describes the amount of time that can pass during an event before data loss exceeds that tolerance.
Example: An outage occurs. If the RPO for this business is 12 hours and the last good copy of data available is from 10 hours ago, we are still within the RPO’s parameters for this business continuity plan.
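The check in this example is plain arithmetic on timestamps. Here is a minimal Python sketch of it; the function name `within_rpo` and the specific timestamps are illustrative assumptions, not part of any product or standard:

```python
from datetime import datetime, timedelta


def within_rpo(last_good_copy: datetime, outage: datetime, rpo: timedelta) -> bool:
    """Return True if the age of the last good copy does not exceed the RPO."""
    data_loss_window = outage - last_good_copy
    return data_loss_window <= rpo


# Hypothetical timestamps matching the example: the last good copy is 10 hours old.
outage = datetime(2024, 1, 1, 22, 0)
last_good_copy = outage - timedelta(hours=10)

print(within_rpo(last_good_copy, outage, rpo=timedelta(hours=12)))  # True
print(within_rpo(last_good_copy, outage, rpo=timedelta(hours=8)))   # False
```

With the same 10-hour-old copy, a stricter 8-hour RPO would have been missed.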
In other words, the recovery point objective of a recovery plan specifies how far back in time the most recent usable copy of the data may be for the IT team to still achieve a tolerable recovery, given how much data will be lost during that interval.
The recovery time objective (RTO) is the amount of real time a business has to restore its processes at an acceptable service level after a disaster to avoid intolerable consequences associated with the disruption. The RTO answers the question: “How much time after notification about the business process disruption should it take to resume normal operations?”
Another way to think about the difference between recovery time objective and recovery point objective is that RPO represents a changing amount of data that will require re-entry or may be lost during network downtime. RTO represents how much real time can pass before the interruption unacceptably impedes the flow of normal business operations.
Recovery time actual (RTA) and recovery point actual (RPA) are always the elapsed time and lost data of an actual recovery process and are often different from these objectives. Only business disruption and disaster rehearsals can expose these actuals.
As mentioned above, RPOs and RTOs will differ based on application and data priority. Near-zero RPO and RTO for all applications are very costly, as the only way to ensure no lost data and 100 percent uptime is by ensuring continuous data replication inside failover virtual environments.
Because near-zero objectives are costly, prioritize data and applications by purpose, risk, and cost so that the expense of achieving each RPO and RTO is justified. RTO is concerned with systems and applications, meaning its calculation deals more with time limitations on application downtime than with data recovery.
This is another way to express the difference between recovery point objective and recovery time objective: RPO is focused on how much data is lost after a failure. Bad user experience and irritated users are the realm of RTO, but RPO covers catastrophic issues such as the loss of hundreds of thousands of dollars in customer transactions.
Recovery point objective examples
Here are several examples of recovery point objectives in action:
In the case of a business that uses traditional tape backups, consider a backup plan that schedules backups twice a day, at 6 AM and 6 PM. After a primary site failure at 2 PM, the team can restore from the 6 AM backup, which gives an RPA of eight hours. The RTA will be driven by how long the restore takes, followed by any additional work necessary to return the system to full operation.
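The eight-hour figure can be reproduced with a short calculation. The sketch below assumes that every scheduled backup completes instantly and is restorable, which real tape backups do not guarantee; the helper name `last_backup_before` and the calendar date are hypothetical:

```python
from datetime import datetime, time, timedelta


def last_backup_before(failure: datetime, schedule: list[time]) -> datetime:
    """Return the most recent scheduled backup at or before the failure time."""
    candidates = []
    # Check today's and yesterday's runs so an early-morning failure still finds a backup.
    for day_offset in (0, 1):
        day = failure.date() - timedelta(days=day_offset)
        for run_time in schedule:
            run = datetime.combine(day, run_time)
            if run <= failure:
                candidates.append(run)
    return max(candidates)


schedule = [time(6, 0), time(18, 0)]   # backups at 6 AM and 6 PM
failure = datetime(2024, 1, 1, 14, 0)  # primary site fails at 2 PM

restore_point = last_backup_before(failure, schedule)
rpa = failure - restore_point
print(restore_point.time(), rpa)  # 06:00:00 8:00:00
```

With this schedule the worst case is a failure just before a backup runs, where the RPA approaches 12 hours.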
Continuous replication and continuous data protection (CDP) offer stronger RPO guarantees, since the target system holds a mirror image of the source. The RPA values change depending on whether the replication is synchronous or asynchronous and how fast the changes are applied. RTA depends on how quickly the application can access the data on the replicated site.
In some situations a business may require granular item recovery capabilities. For example, a user may delete important company files attached to email communications, and then empty the contents of their trash folder. Email is a business-critical application for many enterprises, so this is an application that IT might back up continuously, allowing for granular backup and recovery of a deleted file with an RTA of several minutes.
As another example, an e-commerce retail site likely uses multiple databases for different purposes. It stores its product catalog in a relational database, historical order data in a document database, and connects to its payment processor’s gateway via an API.
The RPO for the document database is within 24 hours because IT can reconstruct its data from other databases. For the relational database, RPO is not critical because the business only adds products periodically. But if that database goes down, revenue stops, so RTO is the more critical objective for it, and the RTO might actually be shorter than the RPO.
How to calculate recovery point objective
RPOs can be set based on the frequency at which files are updated. This helps ensure that your restored operations contain the most up-to-date version of your data following a service interruption. For example, frequently updated files need a short RPO of no more than a few minutes to ensure IT can restore operations with minimal data loss following a disruptive event.
Factors that can affect RPOs include:
- Maximum tolerable data loss for the specific organization
- Industry-specific factors: businesses dealing with sensitive information such as financial transactions or health records must back up more often
- Data storage options, such as physical files versus cloud storage, can affect speed of recovery
- The cost of data loss and lost operations
- Compliance schemes, which may include provisions for disaster recovery, data loss, and data availability that affect the business
- The cost of implementing disaster recovery solutions
Once defined, RPOs serve to detail the goals of the BCP, and each business unit should have distinct RPOs. For example, financial transactions and other mission-critical data processes demand shorter RPOs than less frequently updated files such as personnel records.
As you calculate RPOs for your business units, consider these sample intervals:
0 to 1 hour
This interval is for critical operations that cannot afford to lose more than an hour of data. They are dynamic, high volume, and difficult or impossible to recreate due to the number of variables involved. Patient records, banking transactions, and CRM systems all fall within this tier.
1 to 4 hours
This interval is for semi-critical business units that can afford to lose up to four hours’ worth of data, such as file servers and customer chat logs.
4 to 12 hours
Business units in this tier might include sales and marketing.
12 to 24 hours
These business units handle semi-important data, and their RPO should go back no more than 24 hours. This tier may include purchasing and human resources, for example.
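One way to apply these sample intervals is a simple lookup from an RPO to its tier. The following Python sketch is illustrative only: the function name `rpo_tier`, the choice to place a boundary value such as exactly 4 hours in the lower tier, and the example RPO values are all assumptions made here, not prescriptions.

```python
from datetime import timedelta

# Upper bounds of the sample intervals, in ascending order.
TIERS = [
    (timedelta(hours=1), "up to 1 hour"),
    (timedelta(hours=4), "up to 4 hours"),
    (timedelta(hours=12), "up to 12 hours"),
    (timedelta(hours=24), "up to 24 hours"),
]


def rpo_tier(rpo: timedelta) -> str:
    """Return the label of the first sample interval whose upper bound covers the RPO."""
    for upper_bound, label in TIERS:
        if rpo <= upper_bound:
            return label
    return "over 24 hours"


# Hypothetical RPO values chosen for illustration.
print(rpo_tier(timedelta(minutes=15)))  # up to 1 hour
print(rpo_tier(timedelta(hours=4)))     # up to 4 hours
print(rpo_tier(timedelta(hours=24)))    # up to 24 hours
```

Keeping the bounds in one ordered list makes it easy to adjust the tiers if the organization's BCP defines different intervals.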
Druva’s cloud-native disaster recovery solution and RPO
Druva’s cloud-native disaster recovery solutions offered as-a-service provide flexibility when it comes to enterprise RPO needs. With Druva, users also lower TCO by up to 60 percent and remove the burden of legacy architecture, unifying disaster recovery, backup, and archives in the cloud.
Druva customers can meet RPOs ranging from minutes to one hour, depending on the workload being protected. This is thanks to Druva’s use of source global deduplication, which allows backups to run faster while consuming fewer resources than traditional backups. As a result, backups can run much more frequently throughout the day, whereas traditional backups run only at night because of the resources they consume.