Recovery time objective

Recovery time objective definition

The recovery time objective (RTO) is the maximum acceptable time that an application, computer, network, or system can be down after an unexpected disaster, failure, or comparable event takes place. RTO captures the maximum allowable time between the unexpected failure or disaster and the restoration of normal service levels and resumption of typical operations. RTO defines a turning point, after which the consequences of interruption from a disaster or failure become unacceptable.

What is a recovery time objective?

The recovery time objective (RTO) is the amount of real time that can elapse during or after a disaster before a business must restore its services and processes to acceptable levels. Beyond that point, the business will experience intolerable consequences associated with the disruption. The RTO answers the question: “How much real time can the business take to recover after notification of the business process disruption?”

An RTO is a target rather than a calendar deadline; it does not mandate a specific date for recovery. Whether or not a given recovery meets the objective, the value of an RTO lies in the careful planning and analysis required to come close to it.

In practice, the RTO sets how much time the IT department has to recover critical data and systems after a disaster. In this sense, RTOs represent how long the enterprise can survive without IT services and infrastructure, and so they reflect the overall needs of the business.

Recovery time objective vs. recovery point objective

Recovery time objective and recovery point objective (RPO) are among the most important parameters of a data protection or disaster recovery plan. These objectives can guide the selection of an optimal data backup plan, and they offer a basis for identifying and analyzing viable strategies that could enable the enterprise to resume business processes within a timeframe at or near the RPO and RTO.

Although these two terms are related, it is important to understand the difference between them.

Every business continuity plan (BCP) sets forth a maximum allowable tolerance or threshold for data loss during a disruption. The recovery point objective (RPO) describes the amount of time that can pass during an event before data loss exceeds that tolerance.

Example: An outage occurs. If the RPO for this business is 12 hours and the last good copy of data available is from 10 hours ago, we are still within the RPO’s parameters for this business continuity plan.
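The check in this example can be written as a short calculation. The following is a minimal Python sketch, not a definitive implementation; the function name within_rpo and the outage timestamp are hypothetical, and only the 12-hour and 10-hour figures come from the example.

```python
from datetime import datetime, timedelta


def within_rpo(last_good_copy: datetime, outage_start: datetime, rpo: timedelta) -> bool:
    """Return True if the data-loss window does not exceed the RPO."""
    data_loss_window = outage_start - last_good_copy
    return data_loss_window <= rpo


outage = datetime(2024, 1, 15, 22, 0)        # hypothetical outage time
last_copy = outage - timedelta(hours=10)     # last good copy is 10 hours old
rpo = timedelta(hours=12)                    # RPO from the example

print(within_rpo(last_copy, outage, rpo))    # True: 10 hours is within 12 hours
```

If the last good copy were 13 hours old instead, the same check would return False, and the plan would have missed its RPO.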

In other words, the recovery point objective of a recovery plan specifies how far back in time the IT team can restore from and still achieve tolerable business recovery, given how much data will be lost during that interval.

The recovery time objective (RTO) is the amount of real time a business has to restore its processes at an acceptable service level after a disaster to avoid intolerable consequences associated with the disruption. The RTO answers the question: “How much time after notification about the business process disruption should it take to resume normal operations?”

Another way to think about the difference between recovery time objective and recovery point objective is that RPO represents a variable amount of data that will require re-entry or may be lost during network downtime. RTO represents how much real time can pass before the interruption impedes the flow of normal business operations unacceptably.

Recovery time actual (RTA) and recovery point actual (RPA) are the elapsed time and data loss measured in an actual recovery process, and they often differ from the objectives. Only real business disruptions and disaster rehearsals can expose these actuals.
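To make the distinction between objectives and actuals concrete, the following minimal Python sketch computes RTA and RPA from the timestamps of a rehearsal and compares them with the objectives. The DrillResult class, the compare function, and all timestamps and objective values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillResult:
    """Timestamps recorded during a disaster rehearsal (hypothetical structure)."""
    last_good_copy: datetime
    outage_start: datetime
    service_restored: datetime

    @property
    def rpa(self) -> timedelta:
        # Recovery point actual: how much data, measured in time, was lost.
        return self.outage_start - self.last_good_copy

    @property
    def rta(self) -> timedelta:
        # Recovery time actual: how long service was down.
        return self.service_restored - self.outage_start


def compare(drill: DrillResult, rto: timedelta, rpo: timedelta) -> dict:
    """Report whether the measured actuals met each objective."""
    return {"rto_met": drill.rta <= rto, "rpo_met": drill.rpa <= rpo}


drill = DrillResult(
    last_good_copy=datetime(2024, 1, 15, 8, 0),
    outage_start=datetime(2024, 1, 15, 9, 30),
    service_restored=datetime(2024, 1, 15, 12, 15),
)
# Hypothetical objectives: RTO of 2 hours, RPO of 4 hours.
print(compare(drill, rto=timedelta(hours=2), rpo=timedelta(hours=4)))
# {'rto_met': False, 'rpo_met': True}
```

In this made-up rehearsal the data loss stays within the RPO, but the recovery takes 2 hours 45 minutes and misses the 2-hour RTO, which is the kind of gap a rehearsal is meant to reveal.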

RPOs and RTOs differ based on application and data priority. Near-zero RPO and RTO for all applications are very costly, as the only way to approach no lost data and 100 percent uptime is continuous data replication inside failover virtual environments.

Because near-zero objectives are so costly, prioritize data and applications so that the expense of achieving each RPO and RTO matches its purpose and risk. RTO is concerned with systems and applications, meaning its calculation deals more with time limitations on application downtime than with data recovery.

This is another way to express the difference between recovery point objective and recovery time objective: RPO is focused on how much data is lost after a failure, while RTO is focused on how long service is unavailable. Bad user experience and irritated users are the realm of RTO, but RPO covers catastrophic issues such as the loss of hundreds of thousands of dollars in customer transactions.

How do recovery objectives work?

Many factors impact restore times, including the time of day and the day of the week when the disaster occurred. In fact, the entire operation affects both RTOs and RPOs.

Higher-priority applications often require more rigorous recovery objectives. In these cases, the IT department typically schedules continuous replication, snapshot replication, or both. When the recovery objectives are near zero, the team can approach 100 percent availability for data and applications by combining continuous replication and failover services.

Recovery time objective examples

Here are several examples of recovery time objectives:

A business creates a backup plan that uses traditional tape backups and runs scheduled backups twice a day, at 0800 hours and 2000 hours, for 1 hour each time. After a primary site failure at 1600 hours, the team restores from the 0800 backup. This means an RTO of one hour and, probably, an RTA of about one hour, because that is how long the system should take to get back online. The up to eight hours of data written since the 0800 backup is a matter for the RPO, not the RTO.
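The tape backup example can be worked through as a short calculation. This is a minimal Python sketch under stated assumptions: a backup captures data as of its start time, a restore takes about as long as a backup, and the calendar date and the function name last_usable_backup are hypothetical.

```python
from datetime import datetime, time, timedelta

BACKUP_STARTS = [time(8, 0), time(20, 0)]   # from the example: 0800 and 2000
BACKUP_DURATION = timedelta(hours=1)        # from the example: 1 hour each
RESTORE_DURATION = timedelta(hours=1)       # assumption: restore takes as long as backup


def last_usable_backup(failure: datetime) -> datetime:
    """Return the start time of the latest backup that finished before the failure."""
    candidates = []
    for days_back in (0, 1):  # today and yesterday cover every possible failure time
        day = failure.date() - timedelta(days=days_back)
        for start in BACKUP_STARTS:
            started = datetime.combine(day, start)
            if started + BACKUP_DURATION <= failure:
                candidates.append(started)
    return max(candidates)


failure = datetime(2024, 1, 15, 16, 0)      # hypothetical date, 1600 hours
backup = last_usable_backup(failure)
data_loss_window = failure - backup         # compare against the RPO
back_online = failure + RESTORE_DURATION    # compare against the RTO

print("restore from:", backup)
print("data loss window:", data_loss_window)
print("estimated time back online:", back_online)
```

Under these assumptions the team restores from the 0800 backup, the data loss window is eight hours, and service is expected back at 1700 hours, one hour after the failure.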

An organization could require more granular item recovery capabilities in some cases. For example, if a user deletes important email messages and then empties the trash folder, there should be an enterprise-level way to recover them. Email is a business-critical application for any enterprise, so IT will back it up continuously, allowing for granular backup and recovery and an RTO of several minutes.

Contrast this with an e-commerce retail site which deploys multiple databases for various purposes. For example, the site stores its historical order data in a document database, stores its product catalog in a relational database, and connects to the payment processor’s gateway via an API database.

Because IT can reconstruct its data from other databases, the RTO and RPO for the document database are within 24 hours. Because the business only adds products periodically, RPO is not critical for the relational database. But the RTO is critical for the API database, because revenue stops when it goes down.
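One way to act on per-database objectives like these is to record them in one place and restore the system with the tightest RTO first. The following Python sketch illustrates the idea only. The database names and every value except the 24-hour figure for the document database are hypothetical placeholders, not recommendations.

```python
from datetime import timedelta

# Hypothetical objectives. Only the 24-hour figure for the document
# database comes from the example; the other values are placeholders.
OBJECTIVES = {
    "orders_document_db": {"rto": timedelta(hours=24), "rpo": timedelta(hours=24)},
    "catalog_relational_db": {"rto": timedelta(hours=4), "rpo": timedelta(days=7)},
    "payment_api_db": {"rto": timedelta(minutes=5), "rpo": timedelta(minutes=5)},
}


def recovery_order(objectives: dict) -> list:
    """Return system names sorted so the tightest RTO is restored first."""
    return sorted(objectives, key=lambda name: objectives[name]["rto"])


print(recovery_order(OBJECTIVES))
# ['payment_api_db', 'catalog_relational_db', 'orders_document_db']
```

Sorting by RTO alone is a simplification; a real plan would also weigh dependencies between systems.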

Calculating recovery time objective

To calculate the RTO, determine how quickly the application and its information must be restored. In other words, how soon must normal access resume?

Consider the following example. An application monitors mission-critical hardware and collects sensor data. Insight into the performance of that hardware is essential, and without it the business could be seriously impacted; this means there is a very short RTO for bringing that system back online.

An RTO for a less mission-critical application could be considerably longer. For example, an app that doesn’t have a huge bearing on the business or that the business uses infrequently can tolerate a longer RTO.

Does Druva offer disaster recovery solutions?

Druva’s cloud-native disaster recovery solutions offered as-a-service provide flexibility when it comes to enterprise RTO needs. With Druva, users also lower TCO by up to 60 percent and remove the burden of legacy architecture, unifying disaster recovery, data backups, and archives in the cloud.

Druva customers can meet an RTO in minutes. This is thanks to Druva’s use of source global deduplication, which allows data backups to run quicker while consuming fewer resources than traditional backups.
