
From Backup to Intelligence: Building a Security Data Lakehouse with Apache Iceberg

Anand Apte, Distinguished Engineer and Shubham Deshmukh, Sr. Principal Engineer

Every day, Druva processes tens of millions of backups. The result is hundreds of petabytes of data that most organizations treat as a dormant insurance policy, something they hope never to use. We see it differently.

When a security incident unfolds, this "insurance policy" becomes the most valuable asset in the room. It effectively acts as a forensic black box: a continuous, versioned record of every file and every change across every device leading up to the breach.

The challenge we faced was that traditional backup systems were never designed to be read this way. 

They were built to answer Recovery questions: “Give me this file from yesterday.” They stumble when asked Security questions: “Show me every device where this specific malicious hash has appeared in the last 90 days.” To bridge this gap, we had to stop thinking about backups as snapshots and start thinking about them as intelligence.

Why is traditional backup snapshot mounting too slow for incident response?

In the world of traditional backup vendors, performing a deep security search across historical data is a heavy lift. Because their data is stored in proprietary, static snapshots, they often require you to mount a snapshot to see what’s inside.

Mounting is infrastructure-heavy, slow, and increasingly hard to scale as data volumes grow. During a live attack, you don't have hours to wait for a snapshot to mount; you need answers in seconds.

The Shift: A Metadata-First Architecture

We realized that to find threats at scale, we shouldn't have to "open" the backup at all. We needed a way to query the metadata, the DNA of the files, directly.

We turned to Apache Iceberg on Amazon S3 Tables. By building a Security Data Lakehouse, we decoupled the intelligence from the storage. Instead of a series of disconnected snapshots, we created a continuously evolving, versioned timeline of file state. Together, this stack provided:

  • ACID Transactions: Ensuring data consistency as millions of backups land daily.

  • Native Time Travel: Allowing us to "rewind" the environment to its state as of any chosen timestamp and see when a file arrived.

  • Columnar Performance: Making metadata queries lightning-fast without ever needing to mount or re-hydrate the actual backup data.
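To make the time-travel idea concrete: Iceberg resolves a requested timestamp to the table snapshot that was current at that moment, so the effective granularity follows how often commits land. The sketch below illustrates only that lookup rule in plain Python. The `Snapshot` class, the `snapshot_as_of` function, and the sample history are hypothetical names invented for illustration; they are not Iceberg's API or our production code.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int        # hypothetical identifier for a committed table version
    committed_at: datetime  # time at which that version became current


def snapshot_as_of(history: list[Snapshot], ts: datetime) -> Optional[Snapshot]:
    """Return the latest snapshot committed at or before ts, or None.

    Assumes history is sorted by committed_at in ascending order.
    """
    commit_times = [s.committed_at for s in history]
    idx = bisect_right(commit_times, ts)  # count of snapshots committed <= ts
    return history[idx - 1] if idx > 0 else None


# Illustrative history: three commits over two days (made-up values).
UTC = timezone.utc
history = [
    Snapshot(1, datetime(2024, 1, 1, 0, 0, tzinfo=UTC)),
    Snapshot(2, datetime(2024, 1, 1, 6, 0, tzinfo=UTC)),
    Snapshot(3, datetime(2024, 1, 2, 0, 0, tzinfo=UTC)),
]
```

Asking for 07:00 on January 1 resolves to snapshot 2, the version that was current at that time; a timestamp earlier than the first commit resolves to nothing.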

The Foundation: A Versioned View of Reality

The key architectural shift was moving from independent snapshots to a continuously evolving timeline. Each file is modeled as a time-bounded lifecycle: when it appears, how it changes, and when it is no longer present.

What previously required stitching together multiple snapshots manually becomes a set of bounded queries:

  • What did this system look like before the incident?

  • When did this file first appear?

  • Is it still present?

These are no longer recovery operations; they are analytical queries over versioned data. This shift from snapshots to timelines is what makes large-scale forensic analysis practical.
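The three questions above can be sketched as filters over time-bounded rows. This is a minimal in-memory illustration, assuming each file version carries a `valid_from` and an optional `valid_to`; the `FileVersion` schema, the function names, and the sample rows are all hypothetical and chosen only to show the shape of the queries, not our actual table layout.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class FileVersion:
    device_id: str
    path: str
    sha256: str                   # placeholder hash values are used below
    valid_from: datetime          # when this version first appeared
    valid_to: Optional[datetime]  # None means the version is still present


def state_at(rows, device_id, ts):
    """What did this system look like at ts? Half-open interval [from, to)."""
    return [
        r for r in rows
        if r.device_id == device_id
        and r.valid_from <= ts
        and (r.valid_to is None or ts < r.valid_to)
    ]


def first_seen(rows, device_id, path):
    """When did this file first appear on the device? None if never seen."""
    times = [r.valid_from for r in rows
             if r.device_id == device_id and r.path == path]
    return min(times) if times else None


def is_present(rows, device_id, path):
    """Is the file still present, i.e. does it have an open interval?"""
    return any(r.device_id == device_id and r.path == path
               and r.valid_to is None for r in rows)


# Made-up sample timeline (naive datetimes, for brevity).
rows = [
    FileVersion("laptop-1", "/docs/a.txt", "h1",
                datetime(2024, 1, 1), datetime(2024, 1, 5)),
    FileVersion("laptop-1", "/docs/a.txt", "h2", datetime(2024, 1, 5), None),
    FileVersion("laptop-1", "/tmp/x.exe", "bad",
                datetime(2024, 1, 3), datetime(2024, 1, 4)),
]
```

Each question becomes one bounded scan over the timeline rather than a comparison of separately mounted snapshots.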

Ingestion: Continuous Change Processing

Rather than relying on periodic batch pipelines, we process data incrementally. Across multiple sources, including file activity streams and threat intelligence feeds, we continuously ingest and apply changes to maintain an up-to-date view of the system. This approach allows us to:

  • Avoid expensive full refreshes.

  • Keep ingestion latency low.

  • Maintain consistency between historical and current state.
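As a rough illustration of incremental change processing, the sketch below applies a batch of change events to a timeline by closing the currently open interval for a file and opening a new one, without rebuilding anything else. The `Interval` and `Change` types, the `"upsert"` and `"delete"` event kinds, and `apply_changes` are assumptions made for this example; a real pipeline would express the same idea as a merge against the table.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Interval:
    device_id: str
    path: str
    sha256: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means still present


@dataclass(frozen=True)
class Change:
    kind: str          # "upsert" or "delete" (hypothetical event kinds)
    device_id: str
    path: str
    at: datetime
    sha256: str = ""   # unused for deletes


def apply_changes(timeline: list[Interval], changes: list[Change]) -> None:
    """Apply changes in time order, mutating timeline in place."""
    # Index the open interval, if any, for each (device, path).
    open_rows = {(r.device_id, r.path): r
                 for r in timeline if r.valid_to is None}
    for c in sorted(changes, key=lambda c: c.at):
        if c.kind not in ("upsert", "delete"):
            raise ValueError(f"unknown change kind: {c.kind}")
        key = (c.device_id, c.path)
        current = open_rows.get(key)
        if c.kind == "upsert" and current is not None \
                and current.sha256 == c.sha256:
            continue  # content unchanged; nothing to record
        if current is not None:
            current.valid_to = c.at  # close the previous version
            del open_rows[key]
        if c.kind == "upsert":
            row = Interval(c.device_id, c.path, c.sha256, c.at)
            timeline.append(row)
            open_rows[key] = row
```

Only files that actually changed produce new rows, which is what keeps ingestion cheap relative to a full refresh while the historical intervals stay consistent with the current state.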

How does a Security Data Lakehouse accelerate threat hunting?

By treating our metadata as a high-performance analytical dataset, we transformed the backup process into a dual-purpose security engine.

1. Threat Watch (Proactive Detection)

Threat Watch is our "always-on" sentinel. As data is backed up, our Iceberg-based engine incrementally scans it in near real time against a live library of global threat signatures, or indicators of compromise (IOCs).

Impact: If a match is found, the system can Auto-Quarantine the affected snapshot, ensuring you don't accidentally invite the intruder back into your environment during a restore.
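A simplified sketch of that matching step, assuming IOCs are plain file hashes and each newly backed-up file record carries the snapshot it belongs to. The `BackedUpFile` type and `scan_batch` function are hypothetical; the example shows only how a batch of new metadata could be checked against an IOC set and grouped by snapshot for quarantine, not how Threat Watch is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackedUpFile:
    snapshot_id: str
    device_id: str
    path: str
    sha256: str


def scan_batch(batch: list[BackedUpFile],
               ioc_hashes: set[str]) -> dict[str, list[BackedUpFile]]:
    """Return matching files grouped by the snapshot that contains them.

    Only the newly arrived batch is scanned, not the full history.
    Hashes are compared case-insensitively.
    """
    normalized = {h.lower() for h in ioc_hashes}
    hits: dict[str, list[BackedUpFile]] = {}
    for f in batch:
        if f.sha256.lower() in normalized:
            hits.setdefault(f.snapshot_id, []).append(f)
    return hits


# Made-up batch with placeholder hashes.
batch = [
    BackedUpFile("snap-10", "laptop-1", "/docs/report.docx", "aa11"),
    BackedUpFile("snap-10", "laptop-1", "/tmp/x.exe", "DEADBEEF"),
    BackedUpFile("snap-11", "server-7", "/var/app.log", "bb22"),
]
hits = scan_batch(batch, {"deadbeef"})
to_quarantine = sorted(hits)  # snapshot IDs with at least one match
```

Here `to_quarantine` contains only `snap-10`, the one snapshot holding a file whose hash is in the IOC set.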

2. Threat Hunt (The Forensic Investigator)

When your SOC identifies a new "bad hash," Threat Hunt allows you to go on the offensive. Because our metadata is indexed and queryable, you can run a Global Search to find that signature across your entire history.

Impact: You can instantly map the Blast Radius, finding where a file landed first and how far it spread, turning days of manual forensic labor into a single analytical query.
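The blast-radius question can be sketched as a single pass over file sightings: keep the earliest sighting of the hash per device within the search window, then order devices by that time. The `Sighting` type and `blast_radius` function below are hypothetical and operate on an in-memory list; they illustrate the shape of the query rather than the engine behind Threat Hunt.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sighting:
    device_id: str
    path: str
    sha256: str
    seen_at: datetime  # when a backup first recorded this file version


def blast_radius(sightings: list[Sighting], bad_hash: str,
                 since: datetime) -> list[Sighting]:
    """Earliest sighting of bad_hash per device since `since`, oldest first.

    The first element is where the file landed first; the length of the
    result is the number of affected devices.
    """
    earliest: dict[str, Sighting] = {}
    for s in sightings:
        if s.sha256 != bad_hash or s.seen_at < since:
            continue
        prev = earliest.get(s.device_id)
        if prev is None or s.seen_at < prev.seen_at:
            earliest[s.device_id] = s
    return sorted(earliest.values(), key=lambda s: s.seen_at)


# Made-up sightings with a placeholder hash.
sightings = [
    Sighting("laptop-2", "/tmp/x.exe", "bad", datetime(2024, 3, 4)),
    Sighting("laptop-1", "/tmp/x.exe", "bad", datetime(2024, 3, 1)),
    Sighting("laptop-1", "/home/u/x.exe", "bad", datetime(2024, 3, 6)),
    Sighting("server-7", "/srv/ok.bin", "good", datetime(2024, 3, 2)),
]
spread = blast_radius(sightings, "bad", since=datetime(2024, 1, 1))
```

In this sample, `spread` lists `laptop-1` first (March 1) and `laptop-2` second (March 4); `server-7` never held the hash and is excluded.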

From Insurance to Intelligence

This architecture represents a fundamental shift in the value of a backup. We have moved from a passive recovery model to an active source of security truth.

  • Every backup contributes to a growing, searchable intelligence pool.

  • No infrastructure overhead: Search across time and thousands of workloads without mounting a single volume.

  • Clean Recovery: Transition from "I hope this works" to "I know this is clean."

At Druva’s scale, this means turning hundreds of petabytes of “insurance” data into actionable insights. Backup is no longer just the last line of defense; it is the most powerful tool in your security stack for detecting and responding to threats.

How Resilient is Your Backup?

Take our 5-minute Cyber Resilience Assessment to identify gaps in your current forensic and recovery workflows.


FAQs: Security Data Lakehouse Architecture

What is a Security Data Lakehouse in backup?

It is a centralized architecture that decouples metadata from storage, allowing you to query backup data like a database without mounting individual snapshots.

Why use Apache Iceberg for cyber resilience?

Iceberg provides ACID transactions and native time travel, enabling forensic teams to "rewind" to the state as of any chosen timestamp and see when a malicious file first appeared.

How does metadata-first search speed up threat hunting?

By indexing the "DNA" of files, you can search for malicious hashes across petabytes of history in seconds, rather than spending hours mounting traditional backups.
