Tech/Engineering, Innovation Series

Deduplicating Oracle RMAN Image Copy and SBT Stream Backups

Sudhakar Paulzagade, Distinguished Engineer, and Santosh Patil, Senior Software Engineer

Introduction 

Druva’s Oracle data protection solutions (Phoenix Backup Store and Direct-to-Cloud) help customers protect their Oracle standalone as well as clustered (RAC) environments. DTC (Direct-to-Cloud) technology allows customers to stream backups of their Oracle databases, running either in the data center or in the cloud, directly to Druva’s deduplicated storage in AWS S3. The solution supports source-side deduplication and implements the Oracle SBT API. The Phoenix Backup Store (PBS) solution helps customers retain the latest copies of their backups on local storage, thereby improving RTO.

When it comes to SBT-based full and incremental stream backups, deduplication becomes quite challenging due to Oracle Recovery Manager (RMAN) multiplexing. In this blog, we describe how we dealt with this problem while developing the Druva Oracle DTC solution and achieved predictable deduplication rates.

RMAN Image Copy

Druva supports two protection solutions for Oracle: Oracle DTC and PBS. The PBS solution is a dump-and-sweep solution: we provide customers with RMAN template scripts to run on the Oracle production server. An NFS mount point is exported from the PBS and mounted on the Oracle server being protected. The template scripts that Druva provides use RMAN image copy along with incremental merge to write RMAN backups to the NFS mount point. After the RMAN image copy job completes, a file system-level snapshot is created on the PBS, and data from the snapshot is uploaded to Druva’s deduplicated cloud storage.
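
Conceptually, such template scripts follow Oracle’s standard image copy plus incremental merge pattern. Below is a minimal sketch, written in Python (as are the rest of the examples in this post) to show how a script of this shape could be generated; the mount point, tag, and channel settings are illustrative placeholders, not Druva’s actual template values.

```python
# Illustrative generator for a PBS-style RMAN template script. The RMAN
# commands follow Oracle's documented image-copy + incremental-merge
# strategy; the paths and tag below are hypothetical.

INCR_MERGE_TEMPLATE = """\
RUN {{
  ALLOCATE CHANNEL ch1 DEVICE TYPE DISK FORMAT '{nfs_mount}/%U';
  # Roll the previous level 1 incremental into the image copies on the mount.
  RECOVER COPY OF DATABASE WITH TAG '{tag}';
  # Take a fresh level 1 incremental to be merged on the next run. On the
  # very first run, RMAN creates the level 0 image copies automatically.
  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG '{tag}' DATABASE;
  RELEASE CHANNEL ch1;
}}
"""

def render_script(nfs_mount: str = "/mnt/pbs", tag: str = "druva_incr") -> str:
    """Fill in the NFS mount point exported by the PBS and a backup tag."""
    return INCR_MERGE_TEMPLATE.format(nfs_mount=nfs_mount, tag=tag)

if __name__ == "__main__":
    print(render_script())
```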

In the case of image copy backups, RMAN writes a copy of the Oracle data, control, and archived redo log files to the NFS mount. The copy is analogous to an OS-level copy: RMAN retains the native format of the data, control, and archived redo log files rather than repackaging them into a backup-set format.

Due to the nature of image copy backups, we get excellent deduplication when we upload this data to the cloud. The deduplication ratios are comparable to those of file backups using the Druva FS Agent.

RMAN Multiplexing Problem

Druva’s Oracle DTC solution implements the SBT stream API that Oracle provides. The block size Oracle RMAN uses to write backup pieces to storage is generally 256KB for data files and archived log files, while the deduplication block size Druva uses for Oracle is 1MB. During initial development, we saw unpredictable deduplication rates for the following reasons:

  • In the case of SBT stream backups, multiple Oracle slave processes concurrently read data from multiple files and send buffers on one or more streams. Because the RMAN write block size is 256KB and the Druva deduplication block size is 1MB, the order of the 256KB buffers within each 1MB block changes from backup to backup, which defeats deduplication (see the sketch after this list).
  • The problem is further aggravated when RMAN multi-section backups are used.
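
To make the effect concrete, here is a small, self-contained toy (not Druva code) that interleaves the 256KB buffers of two unchanged files in two different orders and compares the resulting 1MB fingerprints:

```python
# Toy demonstration of why multiplexing hurts fixed-block deduplication:
# the same source bytes, interleaved differently, produce an entirely
# different set of 1MB fingerprints.
import hashlib
import random

RMAN_BUF = 256 * 1024        # typical RMAN write size
DEDUP_BLOCK = 1024 * 1024    # Druva's deduplication block size for Oracle

def fingerprints(stream: bytes) -> set:
    """Chunk the stream at fixed 1MB boundaries and hash each block."""
    return {hashlib.sha256(stream[i:i + DEDUP_BLOCK]).hexdigest()
            for i in range(0, len(stream), DEDUP_BLOCK)}

random.seed(0)  # two unchanged "data files", eight 256KB buffers each
file_a = [random.randbytes(RMAN_BUF) for _ in range(8)]
file_b = [random.randbytes(RMAN_BUF) for _ in range(8)]

# Backup 1: the slave processes happen to interleave buffers A, B, A, B, ...
backup1 = b"".join(buf for pair in zip(file_a, file_b) for buf in pair)
# Backup 2: the same bytes, but this time B won the race: B, A, B, A, ...
backup2 = b"".join(buf for pair in zip(file_b, file_a) for buf in pair)

shared = fingerprints(backup1) & fingerprints(backup2)
print(f"shared fingerprints: {len(shared)} of {len(fingerprints(backup1))}")  # 0 of 4

# With one file per backup piece the byte order is stable, so every block dedupes.
print(fingerprints(b"".join(file_a)) == fingerprints(b"".join(file_a)))  # True
```

The last line previews the fix discussed in the next section: when each file travels in its own piece, the byte order, and therefore every fingerprint, repeats across backups.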

Avoiding the Impact of RMAN Multiplexing and Multi-Section Backups

FILESPERSET is the RMAN option that controls multiplexing: it determines how many data files or archived redo log files are written into a single backup piece. As explained in the previous section, because files are read concurrently and asynchronously, the order of blocks within the backup stream differs on every run. We use FILESPERSET = 1 in our auto-generated RMAN backup scripts. This setting ensures that exactly one data file or archived redo log file is written to each backup piece, which in turn keeps the blocks in the same order across backups. So far, we have not used multi-section backups because the data files in our customer environments are not particularly large; they generally range in the gigabytes.
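
For illustration, here is the shape such a generated script can take. The SBT library path and channel name are hypothetical placeholders, not Druva’s actual channel parameters.

```python
# Sketch of an auto-generated SBT backup script with FILESPERSET = 1.

SBT_BACKUP_TEMPLATE = """\
RUN {{
  ALLOCATE CHANNEL sbt1 DEVICE TYPE SBT PARMS 'SBT_LIBRARY={sbt_library}';
  # FILESPERSET 1: exactly one data file or archived redo log file per
  # backup piece, so block order inside each piece is stable across backups.
  BACKUP INCREMENTAL LEVEL {level} FILESPERSET 1 DATABASE;
  BACKUP FILESPERSET 1 ARCHIVELOG ALL;
  RELEASE CHANNEL sbt1;
}}
"""

# The library path below is a made-up example, not Druva's real SBT library.
print(SBT_BACKUP_TEMPLATE.format(sbt_library="/opt/druva/libdruvasbt.so", level=0))
```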

Local Fingerprint Cache

The second problem we dealt with involved the local fingerprint cache. Druva Phoenix uses a local fingerprint cache to avoid network round trips to the deduplication server. The fingerprint cache is file-name based. When a file is backed up to deduplication storage for the first time, it is chunked at block boundaries and a probe call is made to the deduplication engine running in the cloud to check whether each fingerprint already exists. If the fingerprint already exists, no data is sent to the deduplication server; only the reference count of the fingerprint is increased. If it is a fingerprint the deduplication engine has not seen before, the block is sent to the deduplication server. During this process, the fingerprints for all blocks of the file are also cached on the local system. On the subsequent backup, the probe first happens against the local fingerprint cache. If the fingerprint is found in the local cache, there is no need to send a probe to the deduplication server. This saves network round trips, increasing overall performance and reducing the computational cost of the deduplication index running in the cloud.
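
The mechanism is easy to see in a sketch. The following is a minimal illustration of a file-name-keyed cache in front of a remote index, not Druva’s implementation; remote_probe and _upload are hypothetical stand-ins for the cloud round trips.

```python
# Minimal sketch of a file-name-keyed local fingerprint cache sitting in
# front of a remote deduplication index.
import hashlib
from typing import Callable, Dict, Set

DEDUP_BLOCK = 1024 * 1024  # 1MB deduplication block size used for Oracle

class LocalFingerprintCache:
    def __init__(self, remote_probe: Callable[[str], bool]):
        self._seen: Dict[str, Set[str]] = {}  # file name -> cached fingerprints
        self._remote_probe = remote_probe

    def _upload(self, fp: str, block: bytes) -> None:
        pass  # stand-in for sending the block to the deduplication server

    def backup_file(self, name: str, data: bytes) -> int:
        """Back up one file; return how many probes went to the cloud."""
        cached = self._seen.setdefault(name, set())
        remote_probes = 0
        for i in range(0, len(data), DEDUP_BLOCK):
            block = data[i:i + DEDUP_BLOCK]
            fp = hashlib.sha256(block).hexdigest()
            if fp in cached:                 # local hit: no round trip at all
                continue
            remote_probes += 1               # local miss: probe the cloud index
            if not self._remote_probe(fp):   # unseen in the cloud as well ...
                self._upload(fp, block)      # ... so the block itself is sent
            cached.add(fp)                   # cache for the next backup
        return remote_probes

cache = LocalFingerprintCache(remote_probe=lambda fp: False)
data = b"".join(bytes([i]) * DEDUP_BLOCK for i in range(4))  # 4 distinct blocks
print(cache.backup_file("users01.dbf", data))  # 4 probes on the first backup
print(cache.backup_file("users01.dbf", data))  # 0: everything hits locally
print(cache.backup_file("o1_mf_x7q2", data))   # 4 again: new name, cold cache
```

The last call previews the problem described next: because the cache key is the file name, the same bytes under a new name defeat the cache entirely.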

But with RMAN, we faced a unique problem. Each time RMAN backs up a data file, it assigns the resulting backup piece a new, unique name. The piece therefore appears to the file-name-based fingerprint cache as a brand-new file, so every probe is sent to the deduplication server running in the cloud, hurting performance and increasing deduplication index lookup costs.

To address this problem, Druva Oracle DTC implements an RMAN stream handler that opens up the RMAN data file and archived log streams. The stream handler first identifies the block size, which in Oracle defaults to 8K for data files and 512 bytes for archived redo log files. It then inspects further stream blocks to establish the relationship between the backup piece and the data file contained within the backup stream. The local fingerprint cache then uses the data file’s file number and relative file number, rather than the backup piece name, as the unique identity.
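
The net effect is that the cache identity is derived from the contents of the stream rather than from the ever-changing backup piece name. Here is a short sketch of the idea; real Oracle block header parsing is considerably more involved, and parse_data_block_header below is a hypothetical stand-in for that inspection.

```python
# Sketch of a stable fingerprint-cache identity for RMAN streams.
from typing import NamedTuple, Optional

class DataFileIdentity(NamedTuple):
    file_no: int       # absolute data file number within the database
    rel_file_no: int   # relative file number within its tablespace

def parse_data_block_header(block: bytes) -> Optional[DataFileIdentity]:
    """Hypothetical: inspect an 8K Oracle block from the SBT stream and
    recover the file identifiers; returns None if not recognized."""
    ...

def cache_key(db_id: int, ident: DataFileIdentity) -> str:
    # The key comes from the data file itself, not the backup piece name,
    # so it stays stable even though RMAN generates a unique piece name
    # on every run.
    return f"{db_id}:{ident.file_no}:{ident.rel_file_no}"

# Illustrative values only: database id 12345, data file 7 / relative 7.
print(cache_key(12345, DataFileIdentity(file_no=7, rel_file_no=7)))
```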

Deduplication Ratios

With FILESPERSET = 1 and the Oracle RMAN SBT stream handler in place, we saw very good deduplication numbers for the Oracle DTC solution. The numbers cannot be compared directly with file and folder backups because, even after adding the stream handler, there are some optimizations RMAN performs that may affect deduplication.

Since Druva is a 100% SaaS-based solution, index lookup cost and network usage matter a lot. Druva’s Deduplication Storage Index uses Amazon DynamoDB, and every index lookup costs us in terms of COGS. Network usage matters because we move data over the WAN, as opposed to the LAN within a data center. Overall, these features reduced our network round trips to the deduplication server and our index lookup costs in the cloud, enabling us to achieve predictable deduplication ratios.

Next Steps

Visit the Oracle data protection page of the Druva website for an in-depth look at how Druva secures these critical workloads. Learn more about the technical innovations and best practices powering cloud backup and data management in the Innovation Series section of Druva’s blog archive.