Innovation Series

Achieving >1TB/hr backup speed by implementing the core client-side data pipeline in Rust

February 23, 2022 Kush Shukla, Principal Engineer

Druva backs up and restores data to and from a variety of data sources (cloud apps, file servers, NAS, SQL servers, etc.) that vary in size as well as complexity. To serve our customers better, we continuously look for technologies that can make our core backup and restore pipelines flexible, fast, and efficient.

Motivation and requirements for porting our core pipeline

An agent is an executable running on a customer’s device that is responsible for the backup and restore of data to and from the Druva Data Resiliency Cloud. Existing Druva agents are written in Python/Golang and use asynchronous programming (a form of concurrent programming) to speed up operations for their specific workloads.

However, one thing common across all agents is the complex data pipeline transporting data and uploading it to the Druva Cloud, and vice versa for restore.

We decided to carve out this common data pipeline and bundle it as a library that all of our agents can use. Because this is the core of our agents, we wanted the pipeline to meet the following requirements:

Performant in terms of network IOPS
Scale with the device resources (CPU/Mem)
Bindings for Golang/Python
Mature library support of asynchronous programming
Cross-platform support
Preferably no garbage collection as we want tighter control over resources

Reasons for selecting Rust

The Rust language caught our attention and we chose it for the following reasons:

Non-garbage collected (which means much tighter control over resources)
Close to C performance
Compile to native object code
Easily compiles down to a shared library, allowing source-less distribution of the SDK (Software Development Kit), along with C-API (C-based Application Programming Interface) support
Very strong safety guarantees (compared to C and C++) for both memory and highly concurrent code
Built-in support for asynchronous programming

Challenges with adopting the Rust ecosystem and their solutions

Buffer copying

Python and Golang are Garbage Collected (GC) languages, meaning the runtime automatically releases the memory of allocated objects that are no longer used by any part of the program. Rust is a non-GC language: there is no runtime collector, and the lifetime of each allocation is determined by the code itself.

The problem with interfacing GC and non-GC runtimes is that data crossing the language boundary needs to be copied. This does not matter for primitive data types, but it becomes expensive for large buffers such as those moving through our core data pipeline.

Solution:

To solve this problem, we made the Rust layer responsible for all allocation and deallocation of buffer memory. The Go/Python layer allocates a buffer by calling the Rust library API and then fills it with data through further Rust library API calls. Once the data is uploaded to the Druva Cloud, the buffer is freed by another call into the Rust library. In this way, memory management lies completely with the Rust layer, buffers need not be copied across the boundary, and the Python and Golang layers never have to garbage-collect this memory.
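
The ownership scheme described above can be sketched as a small C-compatible API on the Rust side. This is only an illustration under stated assumptions, not Druva's actual SDK: the function names pipeline_buf_alloc, pipeline_buf_write, and pipeline_buf_free and their signatures are hypothetical, and a real shared library would additionally export them under stable, unmangled symbol names.

```rust
use std::ptr;

// Hypothetical API: none of these names come from the Druva SDK.
// In a real shared library each function would also be exported with an
// unmangled symbol name so that Go or Python could look it up.

/// Allocates a zero-filled buffer of `len` bytes owned by the Rust side.
/// Returns a null pointer when `len` is 0.
extern "C" fn pipeline_buf_alloc(len: usize) -> *mut u8 {
    if len == 0 {
        return ptr::null_mut();
    }
    let boxed: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    // Give up Rust's automatic cleanup: the caller now holds the only
    // pointer and must hand it back to `pipeline_buf_free`.
    Box::into_raw(boxed) as *mut u8
}

/// Copies `n` bytes from `src` into the buffer, starting at `offset`.
/// Returns false, and writes nothing, if the range does not fit.
///
/// Safety: `buf` must come from `pipeline_buf_alloc(buf_len)` and `src`
/// must be valid for reading `n` bytes.
unsafe extern "C" fn pipeline_buf_write(
    buf: *mut u8,
    buf_len: usize,
    offset: usize,
    src: *const u8,
    n: usize,
) -> bool {
    if buf.is_null() || src.is_null() {
        return false;
    }
    match offset.checked_add(n) {
        Some(end) if end <= buf_len => {
            // SAFETY: the range was bounds-checked above and the caller
            // guarantees both pointers are valid.
            unsafe { ptr::copy(src, buf.add(offset), n) };
            true
        }
        _ => false,
    }
}

/// Releases a buffer previously returned by `pipeline_buf_alloc`.
///
/// Safety: `buf` and `len` must match one earlier allocation, and the
/// buffer must not be used after this call.
unsafe extern "C" fn pipeline_buf_free(buf: *mut u8, len: usize) {
    if buf.is_null() {
        return;
    }
    // SAFETY: rebuilds the same boxed slice that the allocator leaked,
    // so dropping it frees the memory exactly once.
    unsafe { drop(Box::from_raw(ptr::slice_from_raw_parts_mut(buf, len))) };
}
```

In this sketch the calling layer only ever holds a raw pointer and a length. The bytes live in Rust-owned memory from allocation until the explicit free, so the garbage collector on the other side never tracks them and handing the buffer to the pipeline requires no extra copy.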

Pressure on the OS due to system threads

This problem is specific to Golang, as Python typically runs on a single thread because of the Global Interpreter Lock (GIL). In Golang, a blocking call that leaves the Go runtime, such as a system call or a call into a foreign library, occupies a system thread for its duration. Golang mitigates this by reusing already spawned threads as much as possible. But if N requests are made to the Rust layer at the same time, up to N threads will be spawned, and N can exceed 500. Multiple such processes could exist on the same host, which might put pressure on the OS.

Solution:

To mitigate this issue, we treated the Rust layer as a Remote Procedure Call (RPC) server and the Golang layer as an RPC client. The interface is then modeled on RPC-style request/response. Each request carries a unique ID and is enqueued on a channel. A dispatcher at the other end of the channel is the only goroutine that actually calls the Rust library API, so a system thread is still needed, but only one per dispatcher rather than one per request.

Once the request is handled, the pipeline enqueues the response on a queue that a receiver goroutine polls. Using the request ID, the receiver sends the response to the goroutine that issued the request.

The effect of this is that the Go side can spawn many goroutines without incurring the system thread penalty. There can be multiple dispatcher and receiver goroutines to scale to a high number of requests.
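
The request/response flow described above can be modeled in a few dozen lines. The sketch below is written in Rust with standard-library threads and channels standing in for goroutines and Go channels, purely to keep it self-contained. In the setup described in this article the callers, dispatcher, and receiver live on the Go side. All names (Request, Response, Client, pipeline_handle) are hypothetical, and pipeline_handle is a stand-in for the real call into the pipeline library.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical message shapes; every request carries a unique ID.
struct Request { id: u64, payload: Vec<u8> }
struct Response { id: u64, bytes_handled: usize }

// Stand-in for the call that crosses into the pipeline library.
fn pipeline_handle(req: Request) -> Response {
    Response { id: req.id, bytes_handled: req.payload.len() }
}

// Maps a request ID to the channel its caller is waiting on.
type Waiters = Arc<Mutex<HashMap<u64, Sender<Response>>>>;

#[derive(Clone)]
struct Client {
    requests: Sender<Request>,
    waiters: Waiters,
    next_id: Arc<AtomicU64>,
}

impl Client {
    fn start() -> Client {
        let (req_tx, req_rx) = mpsc::channel::<Request>();
        let (resp_tx, resp_rx) = mpsc::channel::<Response>();
        let waiters: Waiters = Arc::new(Mutex::new(HashMap::new()));

        // Dispatcher: the only thread that calls into the pipeline.
        thread::spawn(move || {
            for req in req_rx {
                if resp_tx.send(pipeline_handle(req)).is_err() {
                    break;
                }
            }
        });

        // Receiver: routes each response to its caller by request ID.
        let routes = Arc::clone(&waiters);
        thread::spawn(move || {
            for resp in resp_rx {
                let waiter = routes.lock().unwrap().remove(&resp.id);
                if let Some(tx) = waiter {
                    let _ = tx.send(resp);
                }
            }
        });

        Client { requests: req_tx, waiters, next_id: Arc::new(AtomicU64::new(1)) }
    }

    // Called by any number of concurrent callers; blocks until the
    // response with the matching ID comes back.
    fn call(&self, payload: Vec<u8>) -> Response {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let (tx, rx) = mpsc::channel();
        // Register the waiter before sending so the receiver always finds it.
        self.waiters.lock().unwrap().insert(id, tx);
        self.requests.send(Request { id, payload }).expect("dispatcher stopped");
        rx.recv().expect("receiver stopped")
    }
}
```

Any number of callers can clone the Client and call it concurrently, yet only the single dispatcher thread ever enters pipeline_handle. Starting several dispatcher and receiver pairs would be the analogue of the multiple dispatcher and receiver goroutines mentioned above.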

Key takeaways

Rust helped us deliver a fast and robust common data pipeline for all workloads. Adopting the language enabled Druva to avoid a host of memory-related flaws. Porting the core data pipeline to Rust also delivered performance benefits over our existing agents in Golang/Python. We were able to achieve a little above 1 TB/hr of backup speed with the current architecture. For comparison, with Golang we clocked 800 GB/hr, and with Python around 500 GB/hr.

However, on top of the domain-related problems, Rust has a steep learning curve. Onboarding a new developer into our team takes 3-6 months. But given the velocity at which we are delivering new features, it’s worth the investment.

Next steps

Looking to learn more about the technical innovations and best practices powering cloud backup and data management? Visit the Innovation Series section of Druva’s blog archive.
