Druva’s Oracle data protection solution (Phoenix Backup Store and Direct-to-Cloud) helps customers protect their Oracle standalone as well as clustered (RAC) environments. Direct-to-Cloud (DTC) technology allows customers to stream backups of their Oracle databases, running either in the data center or in the cloud, directly to Druva’s deduplicated storage in AWS S3. The solution supports source-side deduplication and implements the Oracle SBT API. The PBS solution helps customers retain the latest copies of their backups on local storage, thereby improving RTO.
When it comes to SBT-based full and incremental stream backups, deduplication becomes quite challenging due to Oracle Recovery Manager (RMAN) multiplexing. In this blog, we talk about how we dealt with this problem while developing the Druva Oracle DTC solution and achieved very predictable deduplication rates.
RMAN Image Copy
Druva supports two protection solutions for Oracle: Oracle DTC and PBS (Phoenix Backup Store). The PBS solution is a dump-and-sweep solution where we provide customers with RMAN template scripts to be run on the Oracle production server. The NFS mount point is exported from PBS and mounted on the Oracle server being protected. The template scripts that Druva provides use RMAN image copy along with incremental merge technology to write RMAN backups to the NFS mount point. After the RMAN image copy job completes, a file system-level snapshot is created on the PBS, and data from the snapshot is uploaded to the Druva cloud deduplicated storage.
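As an illustration, an incremental-merge template of this kind can be generated along the following lines. The generator function, tag name, and mount path are hypothetical examples, not Druva's actual generated scripts; the RMAN commands follow the standard image-copy incremental-merge pattern.

```python
# Hypothetical generator for an RMAN image-copy + incremental-merge
# template script; the tag and NFS mount path are illustrative only.
def rman_merge_script(nfs_mount: str, tag: str = "druva_merge") -> str:
    """Render an RMAN script that maintains an image copy on the NFS
    mount and rolls level-1 incrementals into it (incremental merge)."""
    return "\n".join([
        "RUN {",
        # Roll the previous level-1 incremental into the image copy.
        f"  RECOVER COPY OF DATABASE WITH TAG '{tag}';",
        # Take a new level-1 incremental for the next merge; the first
        # run implicitly creates the level-0 image copy.
        "  BACKUP INCREMENTAL LEVEL 1",
        f"    FOR RECOVER OF COPY WITH TAG '{tag}'",
        f"    DATABASE FORMAT '{nfs_mount}/%U';",
        "}",
    ])

print(rman_merge_script("/mnt/pbs_backup"))
```

Because the merged image copy on the NFS mount always reflects the last backup, the subsequent PBS snapshot captures files in their native format, which is what makes the cloud upload deduplicate so well.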
In the case of image copy backups, RMAN writes a copy of the Oracle data, control, and archived redo log files to the NFS mount, much like an OS-level copy command. RMAN retains the native file format: the data, control, and archived redo log files are not restructured during the copy.
Due to the nature of image copy backups, we get excellent deduplication when this data is uploaded to the cloud; the ratios are comparable to those of file backups using the Druva FS Agent.
RMAN Multiplexing Problem
Druva’s Oracle DTC solution uses the implementation of the SBT stream API provided by Oracle. The block size Oracle RMAN uses to write backup pieces to storage is generally 256KB for data files and archived log files, while the deduplication block size Druva uses for Oracle is 1MB. During initial development, we saw unpredictable deduplication rates for the following reasons:
- In the case of SBT stream backups, multiple Oracle slave processes concurrently read data from multiple files and send buffers on one or more streams. Because the RMAN write block size is 256KB and the Druva deduplication block size is 1MB, the order of blocks within a chunk changes from backup to backup, which hurts deduplication.
- The problem is further aggravated if RMAN multi-section backups are used.
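The effect of reordering can be seen in a few lines: with fixed-size chunking, swapping two 256KB buffers inside the same 1MB window changes that chunk's fingerprint even though the underlying data is identical. This is a simplified model, not Druva's actual chunking code.

```python
import hashlib

RMAN_BLOCK = 256 * 1024    # typical RMAN write size on the SBT channel
DEDUP_BLOCK = 1024 * 1024  # Druva deduplication block size for Oracle

def chunk_fingerprints(stream: bytes) -> list:
    """Fingerprint fixed-size 1MB chunks of a backup stream."""
    return [hashlib.sha256(stream[i:i + DEDUP_BLOCK]).hexdigest()
            for i in range(0, len(stream), DEDUP_BLOCK)]

# Sixteen distinct 256KB buffers stand in for multiplexed RMAN writes.
buffers = [bytes([n]) * RMAN_BLOCK for n in range(16)]

# Backup 1 and backup 2 carry identical data, but two buffers in the
# first 1MB window swap places (modeling slave-process timing).
backup1 = b"".join(buffers)
reordered = buffers[:]
reordered[0], reordered[1] = reordered[1], reordered[0]
backup2 = b"".join(reordered)

fp1, fp2 = chunk_fingerprints(backup1), chunk_fingerprints(backup2)
matches = sum(a == b for a, b in zip(fp1, fp2))
print(f"{matches} of {len(fp1)} chunks deduplicate")  # prints "3 of 4 chunks deduplicate"
```

One swapped pair of buffers costs a whole 1MB chunk; with many files multiplexed into one stream, most chunks can shift between backups, which is why the observed deduplication rates were so unpredictable.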
Avoiding the Impact of RMAN Multiplexing and Multi-Section Backups
FILESPERSET is the RMAN option that controls how many data files or archived redo log files are multiplexed into a single backup piece. As explained in the previous section, because the files are read concurrently and asynchronously, the order of blocks within the backup stream differs each time. We use FILESPERSET = 1 in our auto-generated RMAN backup scripts, which ensures that each backup piece contains exactly one data file or archived redo log file and that its blocks always appear in the same order. So far, we have not used multi-section backups because the data files in our customer environments are not especially large; they generally measure in gigabytes.
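The difference in ordering behavior can be modeled in a few lines. The explicit arrival orders below stand in for slave-process timing; this is a sketch of the idea, not RMAN's actual multiplexing logic.

```python
def multiplexed_piece(files, arrival_order):
    """FILESPERSET > 1 (modeled): buffers from several files are woven
    into one backup piece; `arrival_order` lists which file's next
    buffer lands in the piece, standing in for slave-process timing."""
    queues = [list(f) for f in files]
    return [queues[i].pop(0) for i in arrival_order]

def filesperset_one(files):
    """FILESPERSET = 1: each backup piece holds exactly one file, so
    blocks always appear in the file's own, repeatable order."""
    return [list(f) for f in files]

files = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]

# Two multiplexed "backups" of identical data differ because buffer
# arrival timing differs between runs.
run1 = multiplexed_piece(files, [0, 1, 0, 1, 0, 1])
run2 = multiplexed_piece(files, [1, 0, 1, 0, 0, 1])
print(run1 != run2)                                       # prints "True"
print(filesperset_one(files) == filesperset_one(files))   # prints "True"
```

With one file per piece, a block's position in the stream is a function of its position in the file alone, which is exactly the stability fixed-size deduplication needs.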
Local Fingerprint Cache
The second problem we dealt with involved the local fingerprint cache. Druva Phoenix uses a local fingerprint cache to avoid network round trips to the deduplication server. The fingerprint cache is file name based. When a file is backed up to deduplication storage for the first time, it is chunked at block boundaries and a probe call is made to the deduplication engine running in the cloud to check whether each fingerprint already exists. If the fingerprint already exists, no data is sent to the deduplication server; only the reference count of the fingerprint is increased. If it is a fingerprint the deduplication engine has not seen before, the block is sent to the deduplication server. During this process, the fingerprints for all blocks of the file are also cached on the local system. In subsequent backups, the probe first happens against the local fingerprint cache. If the fingerprint is found in the local cache, there is no need to send a probe to the deduplication server. This saves network round trips, improving overall performance and reducing computational costs on the deduplication index running in the cloud.
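The probe flow described above can be sketched as follows. The class name and the in-memory dictionary standing in for the cloud deduplication index are hypothetical, for illustration only.

```python
import hashlib

class FingerprintCache:
    """Sketch of a filename-keyed local fingerprint cache that
    short-circuits probes to the cloud deduplication server."""
    def __init__(self):
        self.cache = {}        # file key -> set of known fingerprints
        self.cloud_probes = 0  # round trips we could not avoid

    def backup(self, key, chunks, cloud_index):
        cached = self.cache.setdefault(key, set())
        sent = 0
        for chunk in chunks:
            fp = hashlib.sha256(chunk).hexdigest()
            if fp in cached:
                continue                 # local hit: no network round trip
            self.cloud_probes += 1       # probe the cloud deduplication index
            if fp not in cloud_index:
                cloud_index[fp] = chunk  # unseen block: upload it
                sent += 1
            # for an existing fingerprint only a reference count is bumped
            cached.add(fp)
        return sent

cloud = {}
fc = FingerprintCache()
chunks = [b"A" * 1024, b"B" * 1024]
fc.backup("users01.dbf", chunks, cloud)  # first backup: 2 probes, 2 uploads
fc.backup("users01.dbf", chunks, cloud)  # second backup: all local hits
print(fc.cloud_probes)  # prints "2"
```

The savings depend entirely on the cache key staying stable across backups, which is exactly what RMAN's per-backup piece naming breaks, as the next paragraph explains.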
But with RMAN, we faced a unique problem. When RMAN backs up a data file, it gives the resulting backup piece a unique name in every backup. This makes the piece appear as a new file to the fingerprint cache, so all probes are sent to the deduplication server running in the cloud, hurting performance and increasing deduplication index lookup costs.
To address this problem, Druva Oracle DTC implements an RMAN stream handler that opens up the RMAN data file and archived log streams. The stream handler identifies the block size of the file, which by default in Oracle is 8KB for data files and 512 bytes for archived redo log files. Once the block size is identified, further stream blocks are inspected to establish the relationship between the backup piece and the data file it contains. The local fingerprint cache then uses the file number and relative file number to create a stable identity for the file in the cache.
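The block-size detection idea can be illustrated with a toy format. The real RMAN backup-piece layout is Oracle-internal, so the header structure, candidate sizes, and helper names below are entirely hypothetical; only the general approach (probe candidate block sizes and validate against block headers) mirrors the description above.

```python
import struct

# Toy stand-in for the stream handler: each block carries a made-up
# little-endian header of (relative file number, block number).
HEADER = struct.Struct("<II")
CANDIDATE_SIZES = (512, 8192)  # archived-log vs data-file block sizes

def make_stream(block_size: int, file_no: int, nblocks: int) -> bytes:
    """Build a toy backup stream of fixed-size blocks for one file."""
    return b"".join(
        HEADER.pack(file_no, n) + b"\xff" * (block_size - HEADER.size)
        for n in range(nblocks))

def detect_block_size(stream: bytes) -> int:
    """Pick the candidate size under which the first few headers show a
    single file number and consecutive block numbers."""
    for size in CANDIDATE_SIZES:
        headers = [HEADER.unpack_from(stream, off)
                   for off in range(0, min(len(stream), size * 4), size)]
        files = {f for f, _ in headers}
        nums = [n for _, n in headers]
        if len(files) == 1 and nums == list(range(nums[0], nums[0] + len(nums))):
            return size
    raise ValueError("unrecognized stream format")

def stream_identity(stream: bytes) -> tuple:
    """Derive a stable cache identity (file number, block size) from the
    stream contents instead of the per-backup piece name."""
    size = detect_block_size(stream)
    file_no, _ = HEADER.unpack_from(stream, 0)
    return file_no, size

print(stream_identity(make_stream(8192, file_no=7, nblocks=8)))  # prints "(7, 8192)"
```

Keying the fingerprint cache on an identity derived from the stream contents, rather than the ever-changing backup piece name, is what lets local cache hits survive from one backup to the next.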
With FILESPERSET = 1 and the Oracle RMAN SBT stream handler in place, we saw very good deduplication numbers for the Oracle DTC solution. The numbers cannot be compared directly with file-and-folder backups because, even with the stream handler, RMAN performs some optimizations that may affect deduplication.
Since Druva is a 100% SaaS-based solution, index lookup cost and network usage matter a lot. Druva’s deduplication storage index uses AWS DynamoDB, and every index lookup adds to our COGS; network usage matters because we move data over the WAN rather than a data-center LAN. Overall, these features reduced our network round trips to the deduplication server and our index lookup costs in the cloud, enabling us to achieve predictable deduplication ratios.
Visit the Oracle data protection page of the Druva website for an in-depth look at how Druva secures these critical workloads. Learn more about the technical innovations and best practices powering cloud backup and data management in the Innovation Series section of Druva’s blog archive.