Data protection-as-a-service begins with a resilient architecture

Stephen Manley, CTO

The evolution from “products” to “services” has disrupted the IT infrastructure industry in a way that new hardware and software never could. Since a service model shifts responsibility for delivering business outcomes from the customer to the vendor, as-a-service companies are adopting different architectures. The biggest shift has been a focus on global resiliency. While resiliency was always an important requirement for standalone products, it becomes an obsession when you manage thousands of companies’ data.

Of the infrastructure services, data protection-as-a-service (DPaaS) must be the most resilient because it is a customer’s last line of defense. As data infrastructure becomes even more important, DPaaS must deliver service levels that exceed what customers can cobble together with legacy products. DPaaS cannot simply run traditional backup solutions at scale but must be re-architected for comprehensive resiliency across storage, compute, network, geography, and management. When you’re personally responsible for your customers’ data protection, everything changes.

Why does data protection resiliency matter so much?

When data became mission critical, so did data protection. Businesses no longer accept weekly backups with 2-3% failure rates and no self-service because every step of the application lifecycle depends on creating and recovering data quickly.

DevOps and application teams create backups before any significant change, so they can recover quickly if something goes wrong. They need an always-on, self-service backup and recovery service. Even for static applications, administrators need reliable, frequent backups so they can truncate database logs without overrunning their storage capacity.

Extreme environments like edge/IoT, high-performance computing, and unstructured data lakes require reliable backups to keep pace with the influx of data. They often have limited bandwidth, small backup windows, and high rates of data change, so if even a single backup fails, the change rate can make it impossible for the backup process to ever “catch up” again.

Since businesses depend on data protection to be always-on, even in the face of hardware, software, and human errors, customers should be able to:

  • Create new backups anytime
  • Restore backups anytime
  • Meet their recovery point objective (RPO) and recovery time objective (RTO)

Legacy data protection approaches are not resilient

Today, backup teams lack the people, technology, and processes to deliver a truly resilient solution.

Traditional vendors sell siloed products that the backup team must then synthesize into a solution. Each backup team tries to create an internal service, but lacks the technology and processes to build a true “as-a-service” solution.

First, each product measures only its own resiliency. For example, a deduplication appliance can protect the data it stores from corruption or loss, but either the appliance or the backup application could still suffer an outage. During that downtime, backups and restores will fail. The product reports success even as the backup team fails, because no single product delivers the complete solution.

Second, even integrated on-premises products are constrained by their physical limitations. Clustered backup appliances are resilient to node failures, but data center failures (e.g. natural disasters, ransomware attacks) can compromise their availability. Even in the absence of a disaster, the most scalable cluster eventually hits physical limits that prevent it from meeting customers’ requirements.

Third, resiliency requires globally integrated management to maintain and upgrade the infrastructure. Today, to fix bugs, support new workloads, or integrate new appliances, backup teams run cascading upgrades of backup clients, servers, and appliances. Those upgrades may disable services for hours at a time, and because sites are interdependent (e.g. to replicate backups), the disruption may be global. With siloed products and distributed backup teams, it is almost impossible to provide a globally resilient data protection service.

How to architect a resilient DPaaS

A resilient DPaaS architecture must deliver an integrated solution for storage, compute, networking, geography, and management.

Data protection resilience begins by building a reliable, scalable storage layer. Not only must the backup data be protected from underlying disk hardware errors, but it must also have a resilient, immutable metadata layer to prevent data loss from deduplication errors and ransomware attacks. Simply “building on reliable storage” like Amazon S3 is insufficient because the protection storage layer (e.g. deduplication, encryption, and catalog) must be as reliable as the underlying object storage. Metadata and data both matter.
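
As a rough illustration of what a resilient, verifiable metadata layer implies, the sketch below shows a content-addressed catalog entry (hypothetical names, not any particular product’s design): because a chunk’s identity is derived from its contents, corruption in either the data or the metadata can be detected on every read.

```python
import hashlib

def make_catalog_entry(chunk: bytes, object_key: str) -> dict:
    """Content-address a backup chunk so its catalog entry is verifiable."""
    digest = hashlib.sha256(chunk).hexdigest()
    return {
        "chunk_id": digest,        # identity comes from the content, not the location
        "object_key": object_key,  # where the chunk lives in object storage
        "size": len(chunk),
    }

def verify_chunk(entry: dict, data: bytes) -> bool:
    """Detect silent corruption in either the data or its metadata entry."""
    return hashlib.sha256(data).hexdigest() == entry["chunk_id"]
```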

Since so many workloads cannot afford extended backup downtime, the backup/recovery process must be both modular and restartable. When processing millions of concurrent backups, compute resources will fail in the middle of operations. A resilient service should auto-detect failed processes, restart each operation from where it left off, and never affect any other backup process.
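
A minimal sketch of what “modular and restartable” can mean in practice, assuming a hypothetical per-job checkpoint store and an idempotent upload function:

```python
def run_backup(job_id, chunks, upload, checkpoints):
    """Resume a backup job from its last recorded checkpoint.

    checkpoints: a durable key/value store; a plain dict stands in here.
    upload:      a function that durably writes one chunk (assumed idempotent).
    """
    start = checkpoints.get(job_id, 0)       # 0 if the job has never run
    for index in range(start, len(chunks)):
        upload(chunks[index])                # safe to repeat after a crash
        checkpoints[job_id] = index + 1      # persist progress for this job only
    # A replacement worker that picks up job_id re-reads the checkpoint and
    # continues from there; other jobs are unaffected because state is per-job.
```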

The network is the most common cause of backup failures, so a DPaaS architecture should minimize its exposure to outages by transmitting as little data as possible. Source-based deduplication reduces the amount of data sent over the network. Furthermore, a modular, restartable backup architecture minimizes not only data reprocessing but also network retransmission.
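
Source-based deduplication itself is simple to sketch (a simplified illustration, not Druva’s actual protocol): the source fingerprints each chunk and ships full data only for fingerprints the service has not already stored.

```python
import hashlib

def backup_with_source_dedup(chunks, known_fingerprints, send_chunk, send_reference):
    """Send full data only for chunks the service has never seen.

    known_fingerprints: fingerprints already stored by the service
                        (in a real system this is a remote metadata lookup).
    """
    for chunk in chunks:
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint in known_fingerprints:
            send_reference(fingerprint)        # a few bytes over the network
        else:
            send_chunk(fingerprint, chunk)     # full payload, first time only
            known_fingerprints.add(fingerprint)
```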

DPaaS must be resilient not only to system outages but also to data-center failures. When there is a massive outage, customers need data protection more than ever. Therefore, the backup processing must not be tied to one data center. Most importantly, the backup service must store data across multiple regions, so customers’ backups are safe regardless of what happens.
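
One way to picture “not tied to one data center” is a write path that lands the same backup object in buckets in several regions. The sketch below uses hypothetical bucket names and writes synchronously for clarity; a production service would more likely replicate asynchronously.

```python
import boto3

# Hypothetical bucket names, one per region the service protects data in.
REGION_BUCKETS = {
    "us-east-1": "backups-us-east-1",
    "eu-west-1": "backups-eu-west-1",
}

def store_across_regions(object_key: str, payload: bytes) -> None:
    """Write the same backup object into buckets in multiple regions."""
    for region, bucket in REGION_BUCKETS.items():
        s3 = boto3.client("s3", region_name=region)
        s3.put_object(Bucket=bucket, Key=object_key, Body=payload)
```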

Finally, DPaaS cannot incur downtime for system or software upgrades. Features, bug fixes, and upgrades must be transparent to customers. A resilient service does not restrict users with “upgrade outages” or “garbage collection windows.” Instead, the restartable, modular system enables rolling upgrades, so the only way users know something has changed is when they enjoy new functionality.

Druva’s resilient DPaaS architecture

Druva built its DPaaS with resiliency woven through the architecture, so that the service is both simple and scalable.

For backup resiliency, Druva’s cloud-native file system splits the metadata and data. Metadata is stored in DynamoDB while the data resides in object storage. The split ensures that backups are immutable, isolated from ransomware, and verifiably correct. Druva stores backups in 15 AWS regions, so customers can be confident that their data will be preserved in the face of any geographical threat.
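
To make the metadata/data split concrete, here is a minimal sketch of a write path with that shape, using hypothetical bucket and table names rather than Druva’s actual schema:

```python
import hashlib
import boto3

s3 = boto3.client("s3")
dynamodb = boto3.client("dynamodb")

DATA_BUCKET = "dpaas-backup-data"         # hypothetical object storage bucket
METADATA_TABLE = "dpaas-backup-metadata"  # hypothetical DynamoDB table

def write_chunk(backup_id: str, chunk: bytes) -> None:
    """Store the chunk's bytes in object storage and its record in the metadata store."""
    chunk_id = hashlib.sha256(chunk).hexdigest()
    s3.put_object(Bucket=DATA_BUCKET, Key=chunk_id, Body=chunk)
    dynamodb.put_item(
        TableName=METADATA_TABLE,
        Item={
            "backup_id": {"S": backup_id},
            "chunk_id": {"S": chunk_id},
            "size": {"N": str(len(chunk))},
        },
    )
```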

To ensure that backup and restore processes succeed, both the data and processing layers are modular. Druva’s metadata and data layer scale independently in the cloud, so there is always capacity and performance available for customer backups, no matter how large or urgent. By using containers and serverless functions, Druva can start (or restart) backups at any time on any infrastructure. Those containers and functions also enable Druva to transparently upgrade the service, so customers automatically get the most up-to-date functionality. Druva’s global source-based deduplication uses the file system’s high-performance centralized metadata to transfer the minimum amount of data, so the service minimizes its dependency on the network.

Druva’s cloud platform runs over five million backups and restores a day because it was built to be resilient to errors and outages – not just at a component level, but as an integrated service. The storage, network, compute, and management work together globally to keep customers’ data safe, secure, and available.

Conclusion

Data protection has become a mission critical requirement for most organizations. Every part of the organization — developers, operations, compliance, and security — depends on data protection working 24×7. Unfortunately, it is nearly impossible for a team to build a resilient data protection service from legacy components. The architecture, deployment, and processes need to be designed from the ground up.

Data protection-as-a-service redefines the resiliency of cloud data protection. Instead of building “reliable” storage or backup appliance silos, it integrates storage, compute, networking, geography, and management. Since each of those elements can fail and cause outages, the system builds in global resiliency. Instead of measuring “storage uptime” or a “percentage of successful backups,” Druva users get what they need: protected data that is always on, whenever and wherever they need it. Druva redesigned every part of data protection. That’s what happens when you stop selling products and start taking responsibility for keeping your customers safe.

Learn more about the Druva Cloud Platform and how it can help improve your business resiliency while reducing IT cost and complexity.