A Beginner’s Guide to Self-Healing Storage

When thinking of a traditional filing system, what comes to mind? Perhaps a categorical or hierarchical set of descriptors used to organize and store information until it’s ready for use. For data stored on disks or in the cloud, a similar structure is applied but on a broader scale.

Not only is the process of organizing, storing, and managing unstructured data important, but so is the need to ensure the durability of that data when the time comes to retrieve it. While there are bound to be setbacks such as disk failures and server crashes, the key to a reliable system is its ability to combat these issues, or “self-heal,” without impacting the data or user productivity.

Using a hierarchical namespace to organize, store, and manage unstructured data is a common strategy used by applications and end users alike. The software layer that manages this namespace—on top of the physical storage—is conventionally known as a file system. There are many file systems in wide use today, with some of the more popular options being UFS, ext4, NTFS, XFS, HFS+, and VxFS.

A typical file system needs to provide basic functionality (illustrated in the short sketch after this list), including the ability to:

  • Store, modify, and retrieve data for a specified file, or a portion of it
  • List the contents of a folder, including the various attributes of its children, such as size, modification time-stamps, access control lists (ACLs), etc.
  • Manage the free space of the physical device
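
As a rough illustration of how an application exercises these basics, the short sketch below uses only Python's standard library; the path is hypothetical and the snippet is meant purely as an example of the three bullets above.

    import os
    import shutil

    path = "/tmp/example/notes.txt"          # hypothetical file
    folder = os.path.dirname(path)
    os.makedirs(folder, exist_ok=True)

    # Store and retrieve data for a specified file
    with open(path, "w") as f:
        f.write("hello, file system")
    with open(path) as f:
        data = f.read()

    # List the folder's children along with a few of their attributes
    for name in os.listdir(folder):
        st = os.stat(os.path.join(folder, name))
        print(name, st.st_size, st.st_mtime)

    # Query the free space of the underlying device
    print("free bytes:", shutil.disk_usage(folder).free)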

A single file system operation may involve modification not only of the data but also of information about the data itself, i.e., the metadata. While performing these operations, a good file system needs to ensure:

  • Durability of data, from the time it was written until the time (should it come) that it is permanently deleted
  • Consistency of metadata
  • Good performance and scalability, in terms of service times and throughput

While this may seem obvious, the fact is that software and hardware can and do crash, whether partially or totally, temporarily or permanently—making these requirements more than simply an afterthought.

In a conventional file system, a file is represented internally by an inode. An inode is not the data itself but is, in essence, a list of the disk blocks that contain the actual file data. In addition, it contains a file’s attributes, including modification time-stamp, size, and ACL. Similarly, a folder’s inode contains a list of its children and their inode numbers:

[Figure: an inode holding a file’s attributes and its list of data blocks; a folder’s inode lists child names and their inode numbers]
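
To make the idea concrete, here is a minimal, purely illustrative sketch of what an inode-like record might hold; the field names are invented for this example and do not reflect any particular file system's on-disk layout.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Inode:
        inode_no: int
        size: int = 0
        mtime: float = 0.0                                     # modification time-stamp
        acl: List[str] = field(default_factory=list)           # access control list
        data_blocks: List[int] = field(default_factory=list)   # disk blocks holding the file data

    @dataclass
    class DirInode(Inode):
        # A folder's inode maps each child's name to its inode number
        children: Dict[str, int] = field(default_factory=dict)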

Performing a single file system operation, such as creating a file, involves multiple metadata update operations; for example, creating a new inode and its directory entry in the parent folder:

[Figure: the metadata updates involved in creating a new file, i.e., a new inode plus a directory entry in the parent folder]

The danger in any multi-step operation is that a failure may occur at any point—the process may crash, the disk may lose the update, etc. These failures become apparent when the operating system does not show the created file, refuses to create a file with the same name, or exhibits a resource leak.
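
A toy sequence makes this failure window visible. Everything below is illustrative only: an in-memory list stands in for the disk, and the two appends stand in for the two metadata writes.

    # Toy model of the two metadata updates behind "create a file".
    persisted = []   # stands in for what has actually reached the disk

    def create_file(parent, name, inode_no):
        # Step 1: write the new file's inode
        persisted.append(("inode", inode_no))
        # <-- a crash at this point leaves an inode on disk that no directory
        #     entry references: a resource leak, and the file is invisible

        # Step 2: write the directory entry in the parent folder
        persisted.append(("dirent", parent, name, inode_no))
        # Only once both updates have landed is the operation fully consistent.

    create_file("/home/user", "notes.txt", 42)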

There are two standard approaches for dealing with such failures:

  1. Use transactional mechanisms to update multiple metadata objects. These typically adhere to ACID guarantees (Atomicity, Consistency, Isolation, Durability). In a transactional system, general failures arising out of a process or system crash are usually handled cleanly—the entire transaction is rolled back or rolled forward, using redo/undo transaction logging (a minimal sketch of this idea follows the list).
  2. Use ordered updates. Here, multiple updates are ordered in such a way that, at any point, a partial list of updates is safe (from an overall system behavior perspective). Periodically, though, these incomplete or partial updates need to be cleansed in order to free up space on the physical device. For more detail on ordered updates, read the seminal paper, Soft Updates, from the ACM Transactions on Computer Systems, Vol. 18, No. 2, May 2000.
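
To illustrate the first approach, the toy journal below groups the two metadata writes into one redo record, so that recovery after a crash can replay a committed operation in full or discard a partial one. This is only a sketch of write-ahead logging under invented names, not any real file system's format.

    import json

    journal = []   # stands in for an on-disk redo log

    def apply_ops(ops):
        # A real system would update inodes and directory entries in place here.
        for op in ops:
            pass

    def create_file_transactional(parent, name, inode_no):
        ops = [["write_inode", inode_no],
               ["write_dirent", parent, name, inode_no]]
        journal.append(json.dumps(ops))   # 1. log the full intent (redo record)
        journal.append("COMMIT")          # 2. commit marker; a crash before this means rollback
        apply_ops(ops)                    # 3. apply the in-place metadata updates

    def recover():
        # After a crash, replay only fully committed records and drop partial ones.
        for record, marker in zip(journal[::2], journal[1::2]):
            if marker == "COMMIT":
                apply_ops(json.loads(record))

    create_file_transactional("/home/user", "notes.txt", 42)

The second approach avoids the journal entirely by ordering the writes so that any prefix of them is safe, at the cost of the periodic cleanup described above.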

Traditionally, file systems have deployed offline utilities such as fsck (File System Consistency Check) or chkdsk (Check Disk) to fix such metadata inconsistencies and restore sanity. Because these tools run offline, they imply downtime or an outage for the file system. Depending on the circumstances, this may lead to an extended outage—adversely impacting the productivity of end users and creating frustration for IT admins.

For a cloud-scale file system, which is designed to be functional around the clock and accessible to millions of users across the globe, challenges like these must be kept to a minimum, if not eliminated entirely.

How Druva Uniquely Leverages AWS Storage

To combat these issues, Druva products make use of a custom file system. The key features of the Druva cloud file system are:

  • Source-side data deduplication (a.k.a. dedupe)
  • Continuous data protection
  • Compressed and encrypted data storage, both in-transit and at-rest
  • Policy-based data retention

The Druva cloud file system addresses two critical concerns regarding data reliability:

Durability

The Druva cloud file system is hosted on Amazon’s public cloud, utilizing AWS S3 for data storage. S3 is the industry leader, designed to provide 99.999999999% durability of objects and the ability to sustain the concurrent loss of data in two facilities.

Druva’s cloud file system also uses the AWS DynamoDB service to manage its metadata. Amazon DynamoDB synchronously replicates data across three facilities within an AWS Region, ensuring durability for the file system’s metadata as well.
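
This split between bulk data in S3 and metadata in DynamoDB can be pictured with a small boto3 sketch. The bucket name, table name, and item schema below are invented for illustration and are not Druva's actual layout.

    import boto3

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("file-metadata")   # hypothetical table name

    def store_block(block_id: str, payload: bytes, file_id: str, offset: int):
        # Bulk data goes to S3, which provides the durability figures cited above
        s3.put_object(Bucket="backup-blocks", Key=block_id, Body=payload)   # hypothetical bucket
        # Metadata (which block belongs to which file, and where) goes to DynamoDB
        table.put_item(Item={"file_id": file_id, "offset": offset, "block_id": block_id})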

When inSync is hosted on-premises, Druva’s cloud file system uses the local file system to store data and an embedded BerkeleyDB database engine to manage metadata. Data—and database—durability remains a top priority, so the system makes use of the reliability mechanisms of the underlying disk subsystem, i.e., RAID storage. Redundancy may also be achieved via the dual-destination backup feature in inSync.

Availability

Druva’s inSync cloud service is hosted on Amazon Elastic Compute Cloud (EC2) instances, is accessible over the WAN, and serves millions of devices and their backups across the globe. Operating at this scale means that extended outages for cleaning up inconsistencies anywhere in the system are simply not acceptable, and high availability is essential. Failover needs to be seamless, even when individual EC2 instances fail.

On-premises, Druva inSync runs inside customers’ data centers and allows for high availability via the dual-destination backup feature mentioned earlier. Availability is no less of a priority for on-premises deployments, as tens of thousands of devices are backed up regularly to Druva inSync.

Self-Healing Storage

Like any other file system, the Druva cloud file system may face crashes, in the form of process failures, network disconnects, etc. In addition, database entries can be lost due to disk corruption or other failures. Even well-intentioned anti-virus software can wreak havoc if it is misconfigured.

At these large scales, bringing down services to regularly detect and correct inconsistencies is simply not feasible. It is crucial for inSync that the Druva cloud file system continues to serve both backup and restore requests, despite any possible storage inconsistencies. After all, the last thing anyone wants to see is a restore failure!

To achieve this, a restore is simulated for the latest snapshot of each device as a regular inSync maintenance procedure. If an inconsistency is detected during the simulated restore, it is purged, ensuring that the snapshot remains restorable, even though it may be missing a few files. This guarantees that if a restore is attempted for the snapshot, it won’t fail due to metadata inconsistencies of any kind.

inSync then forces a full backup to confirm that the subsequent snapshot will be clean and fully restorable. In this way, Druva storage ensures restorable snapshots for all mobile devices or laptops being backed up.
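
Conceptually, that maintenance pass looks something like the sketch below. The structures and helper callbacks are hypothetical stand-ins for internal operations, not Druva's actual code.

    def latest_snapshot(device):
        # Hypothetical helper: the most recent snapshot for a device.
        return device["snapshots"][-1]

    def simulate_restore(snapshot):
        # Walk the snapshot's metadata exactly as a real restore would, without
        # writing anything, and collect the entries that cannot be resolved.
        return [entry for entry in snapshot["entries"] if entry.get("broken")]

    def maintenance_pass(devices, purge, schedule_full_backup):
        for device in devices:
            snapshot = latest_snapshot(device)
            bad = simulate_restore(snapshot)
            if bad:
                purge(snapshot, bad)            # drop unreadable entries; the snapshot stays restorable
                schedule_full_backup(device)    # the next snapshot is then clean and complete

    maintenance_pass(
        [{"snapshots": [{"entries": [{"name": "a.txt"}, {"name": "b.txt", "broken": True}]}]}],
        purge=lambda snapshot, bad: print("purging", [e["name"] for e in bad]),
        schedule_full_backup=lambda device: print("full backup scheduled"),
    )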

There are other possible inconsistencies which may not impact the restore process but may prevent compaction or incremental backups of the device. To detect and fix them, the Druva cloud file system has its own fsck functionality, with the ability to detect, report, and fix inconsistencies.
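
At its core, such a check cross-references metadata: every allocated block should be referenced by some file, and every reference should point at a block that exists. The minimal sketch below flags both kinds of mismatch, using invented structures rather than the actual storage format.

    def check_consistency(files, allocated_blocks):
        # files: mapping of file id -> list of block ids it references (illustrative)
        referenced = {b for blocks in files.values() for b in blocks}

        leaked = allocated_blocks - referenced    # allocated but unreachable: safe to reclaim
        dangling = referenced - allocated_blocks  # referenced but missing: report and repair

        return {"leaked": leaked, "dangling": dangling}

    report = check_consistency({"file-1": ["b1", "b2"], "file-2": ["b3", "b9"]},
                               {"b1", "b2", "b3", "b4"})
    # report == {"leaked": {"b4"}, "dangling": {"b9"}}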

Both of these mechanisms run in the background during off-peak hours, as a regular, scheduled maintenance procedure—allowing for minimal impact to end users.

Given the scale at which Druva storage operates, it would be almost impossible to manually detect and fix metadata inconsistencies. Making the process automated and self-healing is the only way Druva’s serviceability could scale at these levels and continue to provide the data durability our customers expect.

Interested in learning more? Sign up for a personal demo and discover how Druva can help your enterprise.

Shekhar Deshkar

Shekhar Deshkar leads Druva’s storage engineering team as Chief Architect. He has been associated with Druva for more than four years. Prior to Druva, he worked at Marvell Inc. and Symantec Corporation (formerly Veritas). Shekhar’s primary area of focus has been file systems and related storage technologies, including caching, transactional systems, snapshotting, clustering, distributed file system protocols, and flash/SSDs. Shekhar enjoys taking challenges head-on in areas of concurrency, scalability, and performance of distributed storage systems. Shekhar loves it most when his work simplifies day-to-day life for Druva customers.
