Failover

Failover definition

Failover is the ability to switch automatically and seamlessly to a reliable backup system. When a component or primary system fails, either a standby operational mode or redundancy should achieve failover and lessen or eliminate negative impact on users.

To provide redundancy when the active component fails or terminates abnormally, a standby database, system, server, network, or other hardware component must always stand ready to take over automatically. Because failover is critical to disaster recovery (DR), the standby systems themselves must be highly resistant to failure.

What is a failover?

Automated server failover relies on a pulse, or heartbeat, signal. Heartbeat cables connect two or more servers in a network, with the primary server always active. As long as the secondary server continues to perceive the heartbeat, it merely rests.

However, should the secondary server perceive any change in the pulse from the primary server, it will start its instances and take over the primary server’s operations. It will also message the technician or data center, requesting that they bring the primary server back online. Some systems, configured as automated with manual approval, simply alert the technician or data center instead, requesting that the switch to the secondary server be made manually.
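
The heartbeat logic described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: the class name `StandbyMonitor`, the `timeout_s` threshold, and the alert messages are all assumptions made for the example, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandbyMonitor:
    """Standby-side view of the primary's heartbeat (all names are illustrative)."""
    timeout_s: float                 # silence longer than this counts as a failure
    manual_approval: bool = False    # True: only alert, wait for an operator
    last_beat: float = 0.0           # timestamp of the most recent heartbeat
    active: bool = False             # True once the standby has taken over
    alerts: List[str] = field(default_factory=list)

    def record_heartbeat(self, now: float) -> None:
        self.last_beat = now

    def check(self, now: float) -> None:
        """Call periodically; promotes the standby or alerts on a missed pulse."""
        if self.active or now - self.last_beat <= self.timeout_s:
            return  # already active, or the primary is still heard from
        if self.manual_approval:
            self.alerts.append("primary silent: approve failover manually")
        else:
            self.active = True
            self.alerts.append("standby promoted: bring primary back online")


monitor = StandbyMonitor(timeout_s=3.0)
monitor.record_heartbeat(now=10.0)
monitor.check(now=12.0)   # within the timeout, the standby keeps resting
monitor.check(now=14.5)   # 4.5 s of silence, the standby takes over
print(monitor.active, monitor.alerts)
```

Setting `manual_approval=True` models the automated-with-manual-approval configuration: the monitor records an alert but leaves the switch to a person.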

Virtualization simulates a computer environment using a virtual machine or pseudo machine running host software. In this way, the failover process can be independent of the physical hardware components of computer server systems.

How does failover work?

Active-active and active-passive or active-standby are the most common configurations for high availability (HA). Each implementation technique achieves failover in a different way, although both improve reliability.

Typically, an active-active high availability cluster consists of at least two nodes that actively and simultaneously run the same kind of service. The active-active cluster achieves load balancing by distributing workloads more evenly across all the nodes, which prevents any single node from overloading. And because more nodes remain available, throughput and response times improve. To ensure the HA cluster operates seamlessly and achieves redundancy, the individual configurations and settings of the nodes should be identical.

In contrast, an active-passive cluster also needs at least two nodes, but not all of them are active. In a two-node system with the first node active, the second node remains passive, or on standby, as the failover server. In this standby operational mode, it stays ready to serve as a backup should the active, primary server stop functioning. However, unless there is a failure, clients only connect to the active server.

Just as in the active-active cluster, both servers in the active-standby cluster must be configured with the very same settings. This way, clients cannot perceive any change in service, even if the failover router or server must take over.
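
To make the difference between the two configurations concrete, here is a small sketch of how client requests might be routed in each. The function names, node names, and the round-robin policy are assumptions chosen for illustration; real clusters typically use a load balancer or cluster manager for this.

```python
from typing import Dict, List, Optional


def route_active_passive(priority: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Send every client to the first healthy node in priority order."""
    for node in priority:
        if healthy.get(node, False):
            return node
    return None  # no node is available


def route_active_active(nodes: List[str], healthy: Dict[str, bool],
                        request_id: int) -> Optional[str]:
    """Spread requests round-robin across all healthy nodes."""
    live = [node for node in nodes if healthy.get(node, False)]
    if not live:
        return None
    return live[request_id % len(live)]


nodes = ["node-a", "node-b"]
health = {"node-a": True, "node-b": True}
print(route_active_passive(nodes, health))                         # always node-a
print([route_active_active(nodes, health, i) for i in range(4)])   # alternates

health["node-a"] = False  # simulate a failure of the first node
print(route_active_passive(nodes, health))                         # standby takes over
```

In the active-passive case the standby only receives traffic after the failure, whereas in the active-active case both nodes serve traffic all along.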

In an active-standby cluster, the standby node is always running, yet its actual utilization approaches zero.

In an active-active cluster, each of two nodes carries roughly half of the load, although each should be able to handle the entire load alone. This also means that a node failure can degrade performance if a node consistently runs at more than half of its capacity, because the surviving node must then absorb more work than it can handle.
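
The capacity trade-off in an active-active cluster comes down to simple arithmetic. The sketch below assumes identical nodes and a load that is redistributed evenly among the survivors; the function name and the sample utilization figures are hypothetical.

```python
def utilization_after_failure(per_node_utilization: float, nodes: int) -> float:
    """Utilization of each surviving node after one of `nodes` equal nodes fails.

    Utilization is a fraction of a node's capacity, so 1.0 means fully loaded.
    """
    if nodes < 2:
        raise ValueError("need at least two nodes to survive a failure")
    return per_node_utilization * nodes / (nodes - 1)


for utilization in (0.4, 0.5, 0.6):
    after = utilization_after_failure(utilization, nodes=2)
    verdict = "ok" if after <= 1.0 else "overloaded"
    print(f"{utilization:.0%} per node -> {after:.0%} on the survivor ({verdict})")
```

With two nodes, anything above 50 percent utilization per node leaves the survivor with more than it can carry.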

Outage time during a failure is virtually zero with an active-active HA configuration, because both paths are active. With an active-passive configuration, outage time has the potential to be greater, as the system must switch from one node to the other, which requires time.

What is a failover cluster?

A failover cluster is a set of computer servers that provide fault tolerance (FT), continuous availability (CA), or high availability (HA) together. Failover cluster network configurations may use virtual machines (VMs), physical hardware only, or both.

If one of the servers in a failover cluster goes down, the failover process is triggered. The cluster immediately sends the failed component’s workload to another node, which prevents downtime.
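
As a rough illustration of that hand-off, the following sketch moves a failed node's workloads to the surviving nodes. The least-loaded placement rule, the function name `fail_over`, and the node and workload names are assumptions for the example; actual cluster managers apply their own placement policies.

```python
from typing import Dict, List


def fail_over(assignments: Dict[str, List[str]], failed: str) -> Dict[str, List[str]]:
    """Return new assignments with the failed node's workloads moved to survivors."""
    survivors = {node: list(work) for node, work in assignments.items() if node != failed}
    if not survivors:
        raise RuntimeError("no surviving node can take over")
    for workload in assignments.get(failed, []):
        # Place each orphaned workload on the node that currently runs the fewest.
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(workload)
    return survivors


cluster = {"n1": ["web", "db"], "n2": ["cache"], "n3": []}
print(fail_over(cluster, failed="n1"))
```

The original mapping is left untouched, so the same function can be reused to preview the effect of losing any node.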

Providing either HA or CA for applications and services is a failover cluster’s primary goal. Also known as fault tolerant (FT) clusters, CA clusters eliminate downtime when main or primary systems fail, enabling end users to keep using applications and services without interruptions or timeouts.

In contrast, despite a potential brief interruption in service, HA clusters offer minimal downtime, automatic recovery, and no data loss. The recovery process in HA clusters can be configured using failover cluster manager tools, which are included as part of most failover cluster solutions.

In a broader sense, a cluster is two or more nodes or servers, usually connected both physically with cables and via software. Additional clustering technologies such as parallel or concurrent processing, load balancing, and cloud storage solutions are included in some failover implementations.

Internet failover is essentially a redundant or secondary internet connection to be used as a failover link in case of a failure. This can be thought of as another piece of failover capability in servers.

What is an application server failover?

Application servers are simply servers that run applications. This means that application server failover is a failover strategy to protect these types of servers.

At a minimum, these application servers should have unique domain names, and ideally they should run on different servers. Failover cluster best practices typically include application server load balancing.

What is failover testing?

Failover testing validates a system’s capacity to allocate sufficient resources toward recovery during a server failure. In other words, failover testing assesses failover capability in servers.

The test determines whether, in the event of any kind of abnormal termination or failure, the system has the capacity to handle the necessary extra resources and move operations to backup systems. For instance, failover and recovery testing determines whether the system can manage and power an additional CPU or multiple servers once it reaches a performance threshold, one that is often breached during critical failures. This highlights the important relationship between failover testing, resilience, and security.
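
One simple form of such a test is a capacity drill that fails each node in turn and checks whether the rest could still carry the load. The sketch below works purely on numbers; the function name, node names, and capacity figures are hypothetical, and a real failover test would also exercise the actual switchover.

```python
from typing import Dict


def failover_drill(capacity: Dict[str, float], load: float) -> Dict[str, bool]:
    """For each node, report whether the others could carry `load` if it failed."""
    total = sum(capacity.values())
    return {node: total - cap >= load for node, cap in capacity.items()}


capacity = {"app-1": 100.0, "app-2": 100.0, "app-3": 50.0}  # e.g. requests per second
report = failover_drill(capacity, load=160.0)
print(report)
```

A `False` entry marks a node whose loss the remaining servers could not absorb at that load, which is the kind of gap failover testing is meant to expose.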

What is failover and failback?

In computing and related technologies such as networking, failover is the process of switching operations to a backup recovery facility. The backup site in failover is generally a standby or redundant computer network, hardware component, system, or server, often in a secondary disaster recovery (DR) location. Typically, failover involves using a failover tool or failover service of some type to temporarily halt and restart operations from a remote location.

A failback operation involves returning production to its original location after a scheduled maintenance period or a disaster. It is the return from standby to fully functional.
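
The relationship between the two operations can be pictured as a two-state cycle. The sketch below is a simplified model under stated assumptions: the class and method names are invented for the example, and the `primary_healthy` flag stands in for whatever checks an operator would perform before returning production.

```python
from enum import Enum
from typing import List


class Site(Enum):
    PRIMARY = "primary"
    STANDBY = "standby"


class Production:
    """Tracks where production currently runs (illustrative model only)."""

    def __init__(self) -> None:
        self.site = Site.PRIMARY
        self.history: List[str] = []

    def failover(self) -> None:
        """Switch operations from the primary to the standby site."""
        if self.site is Site.STANDBY:
            raise RuntimeError("already running on the standby site")
        self.site = Site.STANDBY
        self.history.append("failover")

    def failback(self, primary_healthy: bool) -> None:
        """Return production to the primary once it is confirmed healthy."""
        if self.site is Site.PRIMARY:
            raise RuntimeError("nothing to fail back from")
        if not primary_healthy:
            raise RuntimeError("primary site is not ready")
        self.site = Site.PRIMARY
        self.history.append("failback")


prod = Production()
prod.failover()                      # disaster or maintenance begins
prod.failback(primary_healthy=True)  # original location restored
print(prod.site, prod.history)
```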

Typically, systems designers offer failover capability in systems, servers, or networks that demand CA, HA, or a high level of reliability. Thanks to virtualization software, failover practices have also become less reliant on physical hardware and cause little or no disruption in service.

Does Druva offer a cloud failover strategy?

With single-click failback to the primary site for post-event mitigation, Druva offers something unique in the industry. The first step is a simple, identical configuration of your primary and failover VMs. Data transfer starts once virtual machine disks are attached, and once the transfer is complete, DNS connections are redirected and the primary VMs are rebooted.

Now more than ever, threats from remote worker data and cyber attackers are increasing. Data is the fuel your enterprise needs; protect it with a robust DR strategy. Empower your approach by leveraging the global reach and scalability of Druva, built on AWS.

Explore Druva DRaaS to find out how the cloud ensures your data is always on, always safe.

Related Terms

Now that you’ve learned about failover, brush up on these related terms with Druva’s glossary:

What is an RTO?
What is the 3-2-1 backup rule?
What is a disaster recovery plan?
