Tech/Engineering, Innovation Series

The Druva Edge: Supercharging Data Protection for Cloud-Native Workloads

Abdul Thag, Senior Staff Software Engineer; Arati Joshi, Sr. Principal Engineer; and Hrishikesh Pallod, Principal Engineer

Druva is a fully managed, 100% SaaS platform that delivers security, availability, and scalability for a broad range of workloads. As organizations shift to the cloud, data protection becomes more complex. In this blog, we'll explore how Druva tackles the challenges of protecting cloud-native workloads while reducing Total Cost of Ownership (TCO). With the ability to process up to 10TB of data per hour, Druva ensures strong security without compromising performance.

Understanding Cloud Native Workloads

Organizations are moving to the cloud for better scalability, flexibility, and cost savings, making cloud-native workloads perfect for deploying modern applications.

Protecting cloud-native workloads is crucial to ensure data security and business continuity in dynamic cloud environments. Cloud vendors provide native backup and recovery primitives, for example, EBS snapshots for EC2 instances, but these are often limited in functionality. Druva offers a more comprehensive solution, including cross-cloud backups and advanced ransomware recovery capabilities.

Druva currently protects EC2 VMs, AWS S3, Azure VMs, and Azure SQL, and its portfolio will continue to expand to cover a broader range of cloud-native workloads.

Several characteristics of cloud-native workloads can be challenging for data protection workflows:

  • Cloud Vendor Throttles and Limits: Different cloud providers impose throttles and resource limits at various levels, which can impact backup/restore performance and efficiency.

  • Deployment: Accessing data from managed services is more complex than from on-premises standalone services. For example, a Druva agent can be deployed on an on-prem SQL server to back up data directly. With a managed Azure SQL database, however, the same approach doesn't apply, requiring a different method for integrating with custom backup solutions.

  • Data Access APIs: Backup solutions must integrate with cloud providers’ APIs. This reliance on APIs adds to the Total Cost of Ownership (TCO) for customers, and backup solutions must also continuously adapt to evolving API changes. 

  • Large Data Volumes: Managing and backing up vast amounts of data can be resource-intensive and complex, and moving data across multiple clouds can incur significant cost.

Overcoming the Challenges with Druva's Advanced Technology 

  • Druva’s Cloud-First Architecture: Druva’s solutions are built from the ground up to operate natively in the cloud. With more than a decade of experience in SaaS, Druva has developed deep expertise in fully leveraging the cloud platform for enhanced scalability, resilience, and performance.

  • Efficient Resource Management: Druva leverages cloud resources like on-demand network bandwidth and scalable backup components (e.g., proxies) to provide efficient and adaptive backup operations. The majority of the resources in data protection workflows are deployed in Druva’s account, minimizing TCO and management overhead on the customer’s side.

Overview of Druva Architecture 

Druva has developed a scalable, SaaS-based architecture. Druva Cloud, the central engine of the Druva ecosystem, is built on multiple AWS services and orchestrates the data protection workflows. Druva Cloud hosts the “Druva File System”, a core component responsible for managing metadata, snapshots, lifecycle, deduplication, and other data-related tasks. Data is securely stored in cloud object stores like Azure Blob Storage or Amazon S3.

Druva Architecture
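To make the deduplication step concrete, here is a minimal sketch of hash-based dedupe: data is split into chunks, each chunk is fingerprinted, and only previously unseen fingerprints are uploaded. This is a generic illustration rather than the Druva File System's actual implementation; the 4MiB chunk size and the SHA-256 fingerprint are assumptions.

```rust
use sha2::{Digest, Sha256}; // sha2 = "0.10"
use std::collections::HashSet;

// Illustrative chunk size; real systems often use variable, content-defined chunks.
const CHUNK_SIZE: usize = 4 * 1024 * 1024;

/// Splits data into fixed-size chunks, fingerprints each chunk, and counts
/// how many bytes a backup would actually transfer after deduplication.
fn dedupe_and_upload(data: &[u8], seen: &mut HashSet<Vec<u8>>) -> usize {
    let mut uploaded = 0;
    for chunk in data.chunks(CHUNK_SIZE) {
        let fingerprint = Sha256::digest(chunk).to_vec();
        // Only chunks with a new fingerprint are transferred; duplicates are skipped.
        if seen.insert(fingerprint) {
            uploaded += chunk.len();
            // A real pipeline would upload the chunk to object storage here.
        }
    }
    uploaded
}

fn main() {
    let mut seen = HashSet::new();
    let block = vec![0u8; CHUNK_SIZE];
    // Two identical 4 MiB blocks: only the first is uploaded.
    let first = dedupe_and_upload(&block, &mut seen);
    let second = dedupe_and_upload(&block, &mut seen);
    println!("first pass uploaded {first} bytes, second pass {second} bytes");
}
```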

Each cloud-native workload requires a distinct method of data access for backup and restore workflows. Here are some examples:

| Workload  | Mechanism for Backup and Restore |
|-----------|----------------------------------|
| EC2       | EBS direct APIs                  |
| Azure VM  | Page Blob APIs                   |
| Azure SQL | Bulk Copy Program (BCP)          |
| AWS S3    | S3 APIs                          |

Depending on the method, a few workloads need a compute agent in the customer's account, which acts as a gateway for data transfer. Most heavy resources, however, are deployed in Druva's account, minimizing TCO and management overhead on the customer's side.

Workload data is routed through the Druva Agent, which communicates with Druva Cloud and handles data transfer to a cloud object store (AWS or Azure). The Druva Agent consists of workload-specific interfaces, a data mover, and a common data pipeline.
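Conceptually, the split between workload-specific interfaces and the common pipeline can be pictured as a trait boundary: every workload integration implements the same data-access contract, and the shared pipeline consumes blocks without knowing their source. The sketch below uses hypothetical names (`WorkloadReader`, `run_pipeline`); Druva's actual interfaces are not public.

```rust
/// Hypothetical contract that every workload integration implements.
trait WorkloadReader {
    /// Returns the next block of backup data, or None when the source is drained.
    fn next_block(&mut self) -> Option<Vec<u8>>;
}

/// An EC2-style reader; a real one would call changed-block APIs.
struct Ec2Reader {
    remaining: usize,
}

impl WorkloadReader for Ec2Reader {
    fn next_block(&mut self) -> Option<Vec<u8>> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(vec![0u8; 512 * 1024]) // stand-in for an EBS snapshot block
    }
}

/// The common pipeline only sees the trait, never the workload behind it.
fn run_pipeline(reader: &mut dyn WorkloadReader) -> usize {
    let mut total = 0;
    while let Some(block) = reader.next_block() {
        // Dedupe, compression, encryption, and upload would happen here.
        total += block.len();
    }
    total
}

fn main() {
    let mut ec2 = Ec2Reader { remaining: 4 };
    println!("moved {} bytes through the pipeline", run_pipeline(&mut ec2));
}
```

Plugging in a new workload then only requires a new `WorkloadReader` implementation; the pipeline itself stays untouched.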

Scale of the Druva Data Pipeline

Every workload has its own challenges, requirements, and nuances. To ensure the Druva Agent can scale effectively for any workload, it's essential to independently quantify the performance of the data pipeline.

Performance test setup

The test setup uses a dummy workload, which connects to the common pipeline via the data mover interface. Three types of workloads are simulated by creating test datasets with specific characteristics (a generation sketch follows the list):

  • Object store workload: Documents of size 2-8MB

  • Object store workload: Media files of size 50-100MB that are non-compressible

  • VM instance: A few large files (100s of GBs in size) representing VM disk snapshots
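As a rough picture of how such datasets might be generated: compressible documents can be filled with repetitive content, while media files get random bytes that defeat compression. The sizes, file counts, and use of the `rand` crate below are assumptions; Druva has not published its test harness.

```rust
use rand::RngCore; // rand = "0.8"
use std::fs;

/// Writes `count` files of `size` bytes each. Compressible files repeat a
/// single byte; non-compressible files are filled with random bytes.
fn make_dataset(dir: &str, count: usize, size: usize, compressible: bool) -> std::io::Result<()> {
    fs::create_dir_all(dir)?;
    let mut rng = rand::thread_rng();
    for i in 0..count {
        let mut buf = vec![b'A'; size]; // repetitive content compresses well
        if !compressible {
            rng.fill_bytes(&mut buf); // random bytes defeat compression
        }
        fs::write(format!("{dir}/file_{i}.bin"), &buf)?;
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    // Document-like objects: mid-range of 2-8MB, compressible.
    make_dataset("dataset/docs", 100, 4 * 1024 * 1024, true)?;
    // Media-like objects: mid-range of 50-100MB, non-compressible.
    make_dataset("dataset/media", 10, 64 * 1024 * 1024, false)?;
    Ok(())
}
```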

The setup was deployed on an EC2 instance with adequate resources, allowing for optimal performance during testing. Backup and restore rates were measured, and the results demonstrate strong performance, providing confidence in the scalability of the data pipeline. Druva's data pipeline can handle up to 10TB of data per hour, which works out to roughly 2.8GB/s of sustained throughput.

Backup rate

Driving force behind the scale

The exceptional performance is made possible by a well-designed data pipeline. The Druva Agent's common data pipeline is a critical component of the backup/restore data path, which is inherently compute-, network-, and I/O-intensive. Built with Rust, the pipeline is highly efficient and scalable (a minimal sketch of the staged design follows the list below).

  • The pipeline leverages native asynchronous programming, enabling it to support high concurrency.

  • Each component can scale independently based on workload and environment requirements. 

  • Latencies vary across individual stages; for example, compression typically takes less time than data upload, so workers at each stage can be tuned individually for optimal performance.

  • As the system scales, careful control over memory and CPU consumption is maintained with built-in backpressure propagation.

  • The pipeline can seamlessly integrate with any workload-specific components.
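These properties map naturally onto bounded asynchronous channels: each stage runs its own workers, and a full channel makes upstream senders wait, which is exactly how backpressure propagates. Below is a minimal tokio-based sketch with one worker per stage; the stage logic, channel depths, and worker counts are illustrative assumptions, not Druva's implementation.

```rust
use tokio::sync::mpsc; // tokio = { version = "1", features = ["full"] }

#[tokio::main]
async fn main() {
    // Bounded channels: when a stage falls behind, its queue fills and
    // upstream `send` calls wait, propagating backpressure automatically.
    let (raw_tx, mut raw_rx) = mpsc::channel::<Vec<u8>>(8);
    let (zip_tx, mut zip_rx) = mpsc::channel::<Vec<u8>>(8);

    // Stage 1: compression (fast stage; few workers needed).
    let compress = tokio::spawn(async move {
        while let Some(block) = raw_rx.recv().await {
            let compressed = block; // stand-in for a real compressor
            if zip_tx.send(compressed).await.is_err() {
                break;
            }
        }
    });

    // Stage 2: upload (slow stage; real pipelines tune more workers here).
    let upload = tokio::spawn(async move {
        while let Some(block) = zip_rx.recv().await {
            tokio::time::sleep(std::time::Duration::from_millis(10)).await;
            println!("uploaded {} bytes", block.len());
        }
    });

    // Producer: reads workload data and feeds the pipeline.
    for _ in 0..32 {
        raw_tx.send(vec![0u8; 1024]).await.expect("pipeline closed");
    }
    drop(raw_tx); // close the pipeline so the stages drain and exit

    compress.await.unwrap();
    upload.await.unwrap();
}
```

In a real pipeline each stage would fan out to a tuned number of workers, with the slowest stage (typically upload) getting the most.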

Limitations from the cloud environment

Cloud vendors enforce throttles at multiple levels (instance, subscription, region, account, and so on) that limit factors like requests per second, network usage, and vCPU quotas. While Druva's products are capable of achieving high rates, these cloud limitations may prevent customers from reaching the maximum potential.

Druva has optimized workflows wherever possible to manage these throttles. Druva's products are also future-ready: when cloud providers increase service quotas, Druva's data pipeline can seamlessly take advantage of these improvements without major design changes.
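A common way to operate within such limits is to retry throttled calls with exponential backoff (plus random jitter in practice). The sketch below assumes a hypothetical `send_request` that reports whether the provider throttled the call; Druva's actual throttle handling is not public.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical outcome of a cloud API call.
enum CallResult {
    Ok,
    Throttled,
}

/// Stand-in for a real cloud API call; throttles twice, then succeeds.
fn send_request(attempt: u32) -> CallResult {
    if attempt < 2 { CallResult::Throttled } else { CallResult::Ok }
}

/// Retries a throttled call with exponential backoff, capped at a maximum
/// delay. Real clients also add jitter to avoid synchronized retry storms.
fn call_with_backoff(max_attempts: u32) -> Result<(), &'static str> {
    let mut delay = Duration::from_millis(100);
    for attempt in 0..max_attempts {
        match send_request(attempt) {
            CallResult::Ok => return Ok(()),
            CallResult::Throttled => {
                eprintln!("throttled on attempt {attempt}, waiting {delay:?}");
                sleep(delay);
                delay = (delay * 2).min(Duration::from_secs(30)); // cap the growth
            }
        }
    }
    Err("gave up after repeated throttling")
}

fn main() {
    call_with_backoff(5).expect("request should eventually succeed");
}
```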

Conclusion

As more organizations transition their workloads to the cloud, the need for robust protection of cloud-native workloads has become increasingly important. Druva offers a seamless, comprehensive solution to ensure data security and recovery in the cloud. Its advanced data pipeline is designed to scale efficiently, ensuring that backup and restore processes are both reliable and fast, regardless of workload size.

Learn more about Druva’s cloud-native solutions for protecting public cloud data here.