Innovation Series

Building a New Backup Agent for Virtual Workloads Using Go

Hrishikesh Pallod, Principal Engineer

Introduction

Druva uses agents to back up data from different data source types such as VMs, NAS, file servers, and more. Because each source produces a different kind of data, each agent was built from the ground up to cater to the requirements of its data source. This introduced some challenges in the development and maintenance of these agents.

That’s why we recently built a component to streamline the agent development process. We named these components data movers. They include generic business logic for backup and restore tasks and offer a set of interfaces that can be customized for specific workloads. This blog post explains what data movers are, and why and how we built their different variants.

Why we needed data movers

The motive behind building the data movers component was to reduce code duplication, facilitate better maintenance, provide clean and consistent interfaces for future development, and make it easier to incorporate new features. The data movers component also aimed to improve the efficiency of the agents. By integrating their workload-specific implementations with the data movers, developers can easily create and deploy backup and restore agents.

Previously, Druva agents were written in Python. As we were in the process of upgrading our architecture, we decided to write the data mover component in Go instead of Python. There were several reasons for this decision: Go is faster and more efficient, has built-in support for concurrency through goroutines, and its static typing makes maintenance easier. These benefits made Go a good fit for our use cases.

Workload variations

Druva protects data for a variety of workload categories, such as virtualization, databases, NAS boxes, file servers, and SaaS applications. These workloads vary in terms of the number and size of files being backed up, the nature of the data, the methods used to identify changed files or blocks, and the methods used to coordinate backup and restore tasks. To accommodate the wide range of requirements for these different workloads, we developed a few variations of the data mover component, each tailored to a workload category. By using these specialized data movers, Druva is able to provide comprehensive support for a wide range of workloads.

We have developed the following variants of data movers:

Data mover for Virtual Workloads

This data mover is designed to work with different types of virtualized workloads where data is read through specialized APIs. Virtual environments provide Changed Block Tracking (CBT) APIs to identify data blocks that have changed between two snapshots. Backup and restore tasks for these workloads are typically coordinated through a Druva agent installed on a proxy device. Examples of workloads in this category include VMware, AHV, Hyper-V, and Amazon EC2.
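
To make the CBT-driven flow concrete, here is a simplified sketch in Go. The Extent type and queryChangedExtents function are illustrative stand-ins, not an actual hypervisor or Druva API; the point is that only the regions changed between two snapshots need to be read for an incremental backup.

```go
package main

import "fmt"

// Extent describes a contiguous changed region of a virtual disk.
type Extent struct {
	Offset int64 // byte offset within the disk
	Length int64 // number of changed bytes
}

// queryChangedExtents is a stand-in for a hypervisor CBT call that returns
// the regions modified between two snapshots.
func queryChangedExtents(prevSnap, curSnap string) []Extent {
	// A real agent would call the hypervisor's CBT API here.
	return []Extent{
		{Offset: 0, Length: 1 << 20},
		{Offset: 8 << 20, Length: 4 << 20},
	}
}

func main() {
	var total int64
	for _, e := range queryChangedExtents("snap-1", "snap-2") {
		// Only changed regions are read and uploaded; unchanged regions
		// are referenced from the previous backup.
		total += e.Length
		fmt.Printf("read %d bytes at offset %d\n", e.Length, e.Offset)
	}
	fmt.Printf("incremental backup size: %d bytes\n", total)
}
```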

Data mover for Databases

This data mover is optimized for database workloads and can operate in one or more modes, such as full backups, change-log-based backups, and differential backups. These workloads typically provide push-based streams of changed blocks and require a Druva library to be plugged into the application for orchestration. Examples of workloads in this category include Oracle and SAP HANA.
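
As a rough illustration only, a push-based stream consumer could be modeled as an interface like the one below. The names here are assumptions made for the example and do not reflect the actual Druva library.

```go
package dbmover

// ChangeEvent is an illustrative record pushed by the database engine,
// for example a change-log entry or a changed block.
type ChangeEvent struct {
	Sequence uint64 // ordering key, e.g. a log sequence number
	Data     []byte
}

// StreamConsumer is a hypothetical interface the data mover could expose so
// that the database-side plug-in can push changes to it.
type StreamConsumer interface {
	// OnChange is called by the plug-in for every pushed event.
	OnChange(ev ChangeEvent) error
	// OnCheckpoint marks a consistent point that can serve as a recovery target.
	OnCheckpoint(sequence uint64) error
}
```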

Data mover for File Servers/NAS

This data mover is designed to work with file server data such as NAS shares or filesystem folders. It can use either traversal-based or more optimized, application-specific methods to identify changes, depending on the specific workload. Backup and restore tasks for these workloads are typically coordinated through a Druva agent residing on the source or on a proxy. Examples of workloads in this category include NAS servers and Linux/Windows-based file servers.
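
For the traversal-based approach, a stripped-down sketch might walk the share and compare modification times against the previous backup time, as below. Real agents track much more (deletes, renames, metadata-only changes), and the path used here is a placeholder, so treat this as illustrative only.

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
	"time"
)

// changedSince walks root and returns paths modified after lastBackup.
// This is a simplified traversal; production agents also handle deletes,
// renames, and metadata-only changes.
func changedSince(root string, lastBackup time.Time) ([]string, error) {
	var changed []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		if info.ModTime().After(lastBackup) {
			changed = append(changed, path)
		}
		return nil
	})
	return changed, err
}

func main() {
	// "/data/share" is a placeholder path for the example.
	files, err := changedSince("/data/share", time.Now().Add(-24*time.Hour))
	if err != nil {
		fmt.Println("walk error:", err)
		return
	}
	fmt.Printf("%d files changed since last backup\n", len(files))
}
```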

The rest of this article focuses on the first variant: the data mover for virtual workloads.

Responsibilities of the data mover for virtual workloads

  • Orchestration of the backup and restore jobs.
  • Traversal through the virtual machine disks for change identification.
  • Concurrency management for parallel file I/O and block I/O, tuned for maximum performance.
  • Local state management for detecting and correcting inconsistencies that can arise due to cases such as job failures.
  • Provision to store metadata separately to help with subsequent backup and restore jobs, such as the VM layout for restoration, information related to incremental backups, and the restore view used for representation, among others.
  • Handling for thin and thick provisioned disks.
  • Statistics management.
  • Error handling, with appropriate actions taken on a case-by-case basis.
  • Integration with the backup/restore pipeline for batching, compression/decompression, and checksums. (This is a high-performance library developed by Druva. Read more about this in our Achieving >1TB/hr backup speed by implementing the core client-side data pipeline in Rust blog.)
  • Provision of generic interfaces for application-specific implementations (a sketch of what these could look like follows this list), such as:
    • Snapshotting during backup
    • Creation of VMs during restore
    • Attaching of disks to VMs during restore
    • Identification of changed blocks
    • Reading of data blocks during backups
    • Writing of data blocks during restore
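
We don't reproduce the actual interface definitions here, but a hedged sketch of what such workload-facing interfaces could look like in Go is shown below. All names are illustrative, not the real data mover API.

```go
package datamover

import "context"

// Block is an illustrative unit of disk data exchanged with the workload.
type Block struct {
	DiskID string
	Offset int64
	Data   []byte
}

// BackupSource is a hypothetical interface a virtual workload (VMware, AHV,
// Hyper-V, EC2, ...) would implement for backups.
type BackupSource interface {
	// CreateSnapshot takes a point-in-time snapshot and returns its ID.
	CreateSnapshot(ctx context.Context) (string, error)
	// ChangedBlocks reports block ranges changed between two snapshots.
	ChangedBlocks(ctx context.Context, prevSnapID, curSnapID string) ([]Block, error)
	// ReadBlock fills in the data for a block during backup.
	ReadBlock(ctx context.Context, b *Block) error
}

// RestoreTarget is a hypothetical interface implemented for restores.
type RestoreTarget interface {
	// CreateVM provisions the target virtual machine from stored metadata.
	CreateVM(ctx context.Context, layout []byte) (vmID string, err error)
	// AttachDisk attaches a (thin or thick provisioned) disk to the VM.
	AttachDisk(ctx context.Context, vmID, diskID string) error
	// WriteBlock writes restored data back to the attached disk.
	WriteBlock(ctx context.Context, b Block) error
}
```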

Backup workflow

The central component depicted within the blue dotted lines in the diagram below represents the common data mover component that manages and coordinates backup tasks for virtual workloads. It is responsible for orchestrating the entire backup workflow, which includes preparing the source data, processing files and blocks for backup through multiple goroutine engines, completing post-upload processing, and creating the snapshot. It also maintains a local state to identify modifications to a VM, such as the addition or removal of disks, and to correct inconsistencies in the snapshot data. To perform workload-specific operations, such as snapshotting, traversing disks, and reading blocks, the data mover offers interfaces that the target workload must implement.

Backup workflow with the data movers component
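
As a rough sketch of the "multiple goroutine engines" idea, the snippet below uses a fixed worker pool to upload changed blocks concurrently. The upload function, worker count, and block sizes are placeholders rather than the real pipeline.

```go
package main

import (
	"fmt"
	"sync"
)

type block struct {
	offset int64
	data   []byte
}

// upload is a stand-in for the real pipeline (batching, compression,
// checksums) that ships a block to cloud storage.
func upload(b block) {
	fmt.Printf("uploaded %d bytes at offset %d\n", len(b.data), b.offset)
}

func main() {
	blocks := make(chan block)
	var wg sync.WaitGroup

	// A fixed pool of upload workers; the real data mover tunes this
	// concurrency for maximum throughput.
	const workers = 4
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for b := range blocks {
				upload(b)
			}
		}()
	}

	// Producer: in the real agent, blocks come from CBT-driven disk reads.
	for i := 0; i < 10; i++ {
		blocks <- block{offset: int64(i) * 4 << 20, data: make([]byte, 4<<20)}
	}
	close(blocks)
	wg.Wait()
}
```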

Restore workflow

The central component within the blue dotted lines in the diagram below represents the common data mover component that manages and coordinates restoration tasks for virtual workloads. It is responsible for organizing the entire restoration process, including preparing the target infrastructure, concurrently downloading and writing files and blocks in a way that is tuned for efficiency, and completing post-restore processing. To perform workload-specific operations, such as creating virtual machines and disks, attaching disks, and writing to disks, the data mover publishes interfaces that the target workload must implement.

Restore workflow with the data movers component
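
To connect the restore steps to the interface idea sketched earlier, here is a simplified, purely illustrative orchestration. Concurrency and most error handling are omitted, and the interface is a pared-down assumption rather than the actual data mover API.

```go
package restoreflow

import (
	"context"
	"fmt"
)

// restoreTarget is a pared-down, illustrative version of the workload-facing
// interface sketched earlier; the names are assumptions, not Druva's API.
type restoreTarget interface {
	CreateVM(ctx context.Context) (string, error)
	AttachDisk(ctx context.Context, vmID, diskID string) error
	WriteBlock(ctx context.Context, diskID string, offset int64, data []byte) error
}

// restoreVM shows the high-level order of operations the data mover drives:
// prepare the target VM, attach its disks, then stream blocks back.
func restoreVM(ctx context.Context, t restoreTarget, diskIDs []string) error {
	vmID, err := t.CreateVM(ctx)
	if err != nil {
		return fmt.Errorf("create VM: %w", err)
	}
	for _, d := range diskIDs {
		if err := t.AttachDisk(ctx, vmID, d); err != nil {
			return fmt.Errorf("attach disk %s: %w", d, err)
		}
		// In the real workflow, blocks are downloaded and written by
		// concurrent workers; a single write stands in here.
		if err := t.WriteBlock(ctx, d, 0, []byte("restored data")); err != nil {
			return fmt.Errorf("write disk %s: %w", d, err)
		}
	}
	return nil
}
```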

We use additional data mover variants at Druva, and I’ll write more about them in a future blog post.

If you want to read more about how we are building the 100% SaaS-based data resilience platform, visit our Tech/Engineering blog section.