
Tech/Engineering

Effective Data Recovery | File-Level Restores [Part 1]

July 14, 2023 Rakesh Sharma, Sr. Staff Software Engineer

Introduction

The ability to restore individual files is one of the most basic functions of a data backup and protection tool. As easy as it may seem, performing this function on a file located in a VM is a resource-intensive and time-consuming process. Even to restore a small file, users have to restore the contents of the entire VM and then pick the file out of the restored data set.

To get around this problem, we planned to use File-Level Restore, or simply FLR. Part 1 of this blog series explains what FLR is and how we implemented it. Part 2 will discuss the improvements we made to the FLR restore process to achieve faster execution speeds.

What is FLR?

FLR stands for File-Level Restore. It refers to the ability to restore individual files or folders from a virtual disk file that is stored in the cloud. This eliminates the need to download the entire virtual disk and attach it to a virtual machine. FLR provides a more efficient and granular way to recover specific files or folders, saving time and resources compared to traditional full-disk restores.

Significance of FLR

File-Level Restore (FLR) plays a vital role in providing customers enhanced control over the data restoration process. Without FLR, even for a file as small as 16 MB, users would have to restore the entire virtual machine, leading to the potential download of multiple terabytes of data from the cloud. FLR offers a streamlined and efficient solution for selectively restoring specific files from a virtual disk backup.

The high adoption rate of FLR (accounting for approximately 50% of all restores) underscores the importance of this feature and its frequent utilization by customers.

Before diving deep into the solution, here’s the tech stack that we used and a few key terms that we will use frequently to explain the details of the solution.

Technology stack that we used

Language: Python
FUSE: User module implemented with Python
Loop Devices

Terminology

FLR: File-Level Restore.
Virtual disk: A virtual disk is a file that appears as a physical disk drive to the guest operating system. Virtual hard disk files store information such as the operating system, program files, and data files.
Disk Offset: An offset into a disk is the byte position within that disk, usually counted from 0; thus "offset 240" is the 241st byte on the disk.
File Offset: An offset into a file is the byte position within that file, usually counted from 0. The important thing to note is that a File Offset has to be converted to a Disk Offset before the data can be read, and this conversion is done by the underlying filesystem.
Target/Target VM: Used interchangeably to refer to a target virtual machine where data has to be restored.
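
To make the File Offset to Disk Offset conversion concrete, here is a minimal sketch in Python. It assumes the filesystem describes a file as a sorted list of extents (contiguous runs of bytes on disk). The `Extent` type, the `file_to_disk_offset` helper, and the sample numbers are hypothetical and exist only for illustration; real filesystems keep this mapping in their own on-disk structures.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Extent(NamedTuple):
    file_offset: int  # where this run starts within the file
    disk_offset: int  # where the same run starts on the disk
    length: int       # length of the run in bytes


def file_to_disk_offset(extents: List[Extent], file_offset: int) -> int:
    """Translate a byte offset in a file to a byte offset on the disk.

    `extents` must be sorted by `file_offset`.
    """
    starts = [e.file_offset for e in extents]
    # Find the last extent that starts at or before the requested offset.
    i = bisect_right(starts, file_offset) - 1
    if i < 0:
        raise ValueError("offset lies before the first extent")
    extent = extents[i]
    if file_offset >= extent.file_offset + extent.length:
        raise ValueError("offset is past the end of the file or inside a hole")
    return extent.disk_offset + (file_offset - extent.file_offset)


# A hypothetical file stored as two runs that are far apart on the disk.
example_extents = [
    Extent(file_offset=0, disk_offset=1_048_576, length=4096),
    Extent(file_offset=4096, disk_offset=8_388_608, length=8192),
]
```

With these sample extents, file offset 240 falls in the first run and file offset 5000 falls in the second, so two nearby file offsets can map to disk offsets that are megabytes apart.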

Original Solution

To access a file stored within a virtual disk, it is necessary to determine the disk offsets of the data blocks corresponding to that file. In the best case, the blocks of the file are stored sequentially. In the worst case, the data blocks are scattered across the entire virtual disk.

On a Linux machine, it is possible to treat a regular file as a block device by utilizing a Loop Device. The Loop Device can treat the virtual disk as a block device on the local Linux machine and mount its volumes with the appropriate filesystem type. Once the volumes are successfully mounted, we gain the ability to read any desired file(s) or folder(s) stored within them. The mounted filesystem handles the conversion from file offsets to disk offsets seamlessly.
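
As a rough illustration of the loop device step, the sketch below builds the standard `losetup` and `mount` command lines from Python. The original post does not show the commands we ran, so the helper names, paths, and filesystem type here are assumptions. The commands are only printed, not executed, because running them requires root privileges.

```python
import shlex
from typing import List


def loop_attach_cmd(image_path: str) -> List[str]:
    """Command that attaches a disk image to the first free loop device.

    --show prints the chosen device (for example /dev/loop0), --partscan asks
    the kernel to expose partitions as /dev/loopNpM, and --read-only protects
    the backup image from accidental writes.
    """
    return ["losetup", "--find", "--show", "--partscan", "--read-only", image_path]


def mount_cmd(partition: str, mount_point: str, fs_type: str) -> List[str]:
    """Command that mounts one partition of the loop device read-only."""
    return ["mount", "-t", fs_type, "-o", "ro", partition, mount_point]


# Hypothetical paths, shown only to illustrate the order of operations.
steps = [
    loop_attach_cmd("/restore/disk0.img"),
    mount_cmd("/dev/loop0p1", "/mnt/flr/vol1", "ext4"),
]
for step in steps:
    print(shlex.join(step))
```

In a real restore the loop device name would be read from the output of `losetup` rather than hard-coded, and the filesystem type would be detected per volume.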

Fortunately, even if the virtual disk is not locally available, we can still mount its volumes on the local machine using Loop Devices and FUSE (Filesystem in Userspace). This enables us to access and work with the contents of the virtual disk without requiring its physical presence on the local system.

FUSE

FUSE (Filesystem in Userspace) is a software interface that allows non-privileged users to create their own file systems on Unix and Unix-like operating systems without modifying kernel code. It achieves this by running file system code in user space while providing a bridge to the kernel interfaces.

In our use case, FUSE plays a crucial role in redirecting read system calls to cloud storage instead of serving them locally. This allows us to simulate file system operations according to our requirements.
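
The core of such a redirect can be sketched without any FUSE bindings. The class below assumes a hypothetical `fetch_block(index)` callable that downloads one fixed-size block of the virtual disk from cloud storage; the class name, the block size, and the simple unbounded cache are illustrative choices, not a description of our production module. In a real FUSE module, the `read` callback for the exposed disk file would delegate to something like `RemoteDisk.read`.

```python
from typing import Callable, Dict


class RemoteDisk:
    """Serves byte-range reads of a virtual disk by fetching blocks on demand."""

    def __init__(self, fetch_block: Callable[[int], bytes], disk_size: int,
                 block_size: int = 1024 * 1024) -> None:
        self._fetch_block = fetch_block  # hypothetical cloud download function
        self.disk_size = disk_size
        self.block_size = block_size
        self._cache: Dict[int, bytes] = {}  # unbounded, for illustration only

    def _get_block(self, index: int) -> bytes:
        if index not in self._cache:
            self._cache[index] = self._fetch_block(index)
        return self._cache[index]

    def read(self, offset: int, size: int) -> bytes:
        """Return up to `size` bytes starting at disk offset `offset`."""
        if size <= 0 or offset >= self.disk_size:
            return b""
        end = min(offset + size, self.disk_size)
        first = offset // self.block_size
        last = (end - 1) // self.block_size
        out = bytearray()
        for index in range(first, last + 1):
            block_start = index * self.block_size
            block = self._get_block(index)
            # Slice out only the part of this block that the caller asked for.
            lo = max(offset, block_start) - block_start
            hi = min(end, block_start + self.block_size) - block_start
            out += block[lo:hi]
        return bytes(out)
```

Only the blocks that overlap a requested range are ever fetched, which is what lets the mounted filesystem read a small file without pulling down the rest of the disk.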

Each read of a single block of data is served by the agent process, a process that runs on the local machine and is responsible for serving the reads from the cloud.

By adopting this approach, we can selectively read the desired file(s) or folder(s) without the need to download the entire virtual disk. For instance, if we only need to restore a 16 MB file, there's no need to retrieve the entire 2 TB virtual disk. This approach significantly reduces the amount of data transferred.

However, this solution is limited in terms of download speed. Specifically, FLR throughput was approximately 20 GB per hour.
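
To put that throughput in perspective, here is a small back-of-the-envelope calculation. It assumes a constant 20 GB per hour, decimal units (1 GB = 1000 MB), and no fixed setup cost for mounting; the function name is hypothetical.

```python
def restore_hours(size_gb: float, throughput_gb_per_hour: float = 20.0) -> float:
    """Estimated restore time in hours at a constant throughput."""
    return size_gb / throughput_gb_per_hour


small_file_seconds = restore_hours(0.016) * 3600  # a 16 MB file
large_file_hours = restore_hours(100)             # a 100 GB file

print(f"16 MB file: about {small_file_seconds:.1f} seconds")
print(f"100 GB file: about {large_file_hours:.0f} hours")
```

Under these assumptions a 16 MB file takes only a few seconds, while a 100 GB file takes about five hours, which is why large file restores were the problem case.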

To address this limitation, we implemented a solution using duplicate loop devices per disk, which we’ll describe in the second part of this blog.

After downloading a block, the next step is to copy it to the target virtual machine. However, it is undesirable to download the entire file before initiating the write operation on the target VM.

Initially, we utilized a VMware tools API called InitiateFileTransferToGuest to copy files to the target VM. This API accepts a source_path and destination_path and handles both reading and writing data to the target VM. While this API sufficed for file transfers, it exhibited poor performance when dealing with large data transfers in the gigabytes range. To address this limitation, we implemented a custom Reader/Writer Pipeline, which significantly enhanced the efficiency and performance of transferring GBs of data to the target VM.
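
This post does not describe the internals of our Reader/Writer Pipeline, so the following is only a generic producer/consumer sketch of the idea: a reader thread pulls chunks from the source while the writer pushes earlier chunks to the destination, so writing starts before the whole file has been downloaded. The function name, chunk size, and queue depth are assumptions, and `src` and `dst` stand in for whatever reads from the mounted volume and writes to the target VM.

```python
import queue
import threading
from typing import BinaryIO

_DONE = object()  # sentinel that tells the writer there are no more chunks


def pipeline_copy(src: BinaryIO, dst: BinaryIO,
                  chunk_size: int = 1024 * 1024, max_pending: int = 8) -> int:
    """Copy src to dst with reads and writes overlapping. Returns bytes copied."""
    # The bounded queue caps memory use: the reader blocks when the writer lags.
    chunks: queue.Queue = queue.Queue(maxsize=max_pending)
    errors = []

    def reader() -> None:
        try:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                chunks.put(chunk)
        except Exception as exc:  # surface read failures to the caller
            errors.append(exc)
        finally:
            chunks.put(_DONE)

    # Daemon thread, so a failed write does not leave the process hanging.
    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    total = 0
    while True:
        item = chunks.get()
        if item is _DONE:
            break
        dst.write(item)
        total += len(item)

    thread.join()
    if errors:
        raise errors[0]
    return total
```

For example, `pipeline_copy(open(path_on_mounted_volume, "rb"), target_stream)` would stream a file chunk by chunk, where `target_stream` is a hypothetical file-like object that writes into the target VM.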

Next steps

Although FLR eliminated the need to download the entire virtual disk to restore a single file, its poor performance for large file restores became a hindrance.

Improving the performance of large file transfers was a key element of our FLR implementation. Stay tuned for Part 2 of this blog, where we talk about the improvements that we made.

To learn more about Druva’s technical innovations and how we deliver the best cloud-based backup and restore solution on the market, visit the tech/engineering section of the blog archive.
