Innovation Series

Go Lang Memory Management: How to Overcome the Challenge

August 11, 2022 Sanjay Vora, Principal Engineer

Overview

Go processes that work with unstructured data can consume far more memory than expected. This post explains how we at Druva overcame that problem and now use Go in a memory-intensive application.

Background

Our backup process is written in Go, and it is a typical example of a workload that allocates memory frequently. It reads hundreds of files in parallel, processes the data (compression, for example), and sends the result over the network, handling gigabytes of data per minute. To read these files, it allocates buffers and processes their contents. However, buffers that were no longer in use did not appear to be freed (or at least that is how it looked at the start). The end result was a backup process that consumed huge amounts of memory.

How did we identify the root cause of the issue?

At first, it looked like a memory leak. We scanned our code multiple times to check whether we were holding on to object references that should have been released. Because Go is a garbage-collected language, that was unlikely. We also tried forcing more frequent GC by setting different config values, but it did not help.
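The post does not say which settings were tried. As a rough sketch, one common way to make the collector run more often is to lower the GC target percentage, either with the `GOGC` environment variable or with `debug.SetGCPercent`. The helper name `forceFrequentGC` and the value 20 are illustrative assumptions, not details from our process.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// forceFrequentGC lowers the GC target percentage so a collection starts
// after less heap growth. It returns the previous setting.
// (Hypothetical helper; the percent value is chosen for illustration.)
func forceFrequentGC(percent int) int {
	return debug.SetGCPercent(percent)
}

func main() {
	old := forceFrequentGC(20)
	defer debug.SetGCPercent(old)

	// Allocate short-lived buffers to give the collector some work.
	for i := 0; i < 1000; i++ {
		buf := make([]byte, 1<<20)
		buf[0] = 1
	}
	runtime.GC() // explicitly request a collection as well

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("GC cycles: %d, heap in use: %d KiB\n", m.NumGC, m.HeapInuse/1024)
}
```

Tuning like this changes how often garbage is collected, not how the freed memory is reused, which is consistent with it not helping here.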

We used pprof to profile memory and find where most of the allocation happened. As expected, it was at the point where we allocated buffers before reading data into them. But then we noticed something surprising. We took memory profiles at frequent intervals, and specifically at times when we were seeing high memory usage. The profiles showed a huge amount of allocated memory but very little in-use memory. What does this mean? It means the GC had released the memory. Had the GC not released it, it would have shown up in the in-use profile. The top command on Linux showed the same picture: very low RSS (resident set size) but high VSZ (virtual memory size).
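The gap between allocated and in-use memory can be reproduced in a few lines. The sketch below uses `runtime.ReadMemStats` rather than pprof, and compares `TotalAlloc` (cumulative bytes allocated) with `HeapAlloc` (bytes currently allocated on the heap). In pprof heap profiles, the corresponding views are the `alloc_space` and `inuse_space` sample types. The `snapshot` type, the helper names, and the workload sizes are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// snapshot holds the two figures being compared.
type snapshot struct {
	TotalAlloc uint64 // cumulative bytes allocated for heap objects
	HeapAlloc  uint64 // bytes of heap objects currently allocated
}

func takeSnapshot() snapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return snapshot{TotalAlloc: m.TotalAlloc, HeapAlloc: m.HeapAlloc}
}

// churn allocates count buffers of size bytes and drops each one at once,
// imitating a read loop that never reuses its buffers.
func churn(count, size int) {
	for i := 0; i < count; i++ {
		buf := make([]byte, size)
		buf[0] = 1 // touch the buffer so the allocation is used
	}
}

func main() {
	before := takeSnapshot()
	churn(200, 1<<20) // illustrative workload: 200 buffers of 1 MiB
	runtime.GC()
	after := takeSnapshot()

	fmt.Printf("allocated during run: %d MiB\n", (after.TotalAlloc-before.TotalAlloc)>>20)
	fmt.Printf("live heap after GC:   %d MiB\n", after.HeapAlloc>>20)
}
```

A large first number next to a small second number is the same signature we saw in our profiles: plenty of allocation, very little of it still live.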

What was happening here? Because we were allocating buffers of widely varying sizes, from 1 byte to 16 MB, Go's memory scavenger was not handling those requests efficiently. The runtime kept allocating new pages instead of reusing pages that were already allocated but unused. This was the result of fragmentation caused by the random buffer sizes.

How did we solve the problem?

We used a BufferPool to solve the memory fragmentation. The idea is to allocate only fixed-size buffers: even if the request is for 1 byte, allocate a 1 KB buffer. We kept four buffer sizes: 1K, 100K, 1M, and 16M. These sizes match the typical requests made by our process.

The process gets a buffer from the BufferPool and, once it is done, releases the buffer back to the BufferPool so it can be reused. This reduced the number of memory allocation calls.

BufferPool is implemented along the same lines as sync.Pool. The only difference is that objects held in a sync.Pool can be reclaimed by the GC, whereas buffers held in the BufferPool are not.
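For comparison, this is roughly what buffer reuse looks like with the standard library's `sync.Pool`. It is a minimal sketch: the names `bufSize`, `getBuf`, and `putBuf` are hypothetical, and the 1 KiB size is chosen only for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

const bufSize = 1 << 10 // hypothetical fixed buffer size (1 KiB)

// pool hands out *[]byte values. Storing pointers avoids an extra
// allocation each time a slice is placed in the pool.
var pool = sync.Pool{
	New: func() any {
		b := make([]byte, bufSize)
		return &b
	},
}

// getBuf returns a pooled buffer, or a fresh one if the pool is empty.
func getBuf() *[]byte { return pool.Get().(*[]byte) }

// putBuf hands the buffer back. The runtime may still drop it during a
// later garbage collection.
func putBuf(b *[]byte) { pool.Put(b) }

func main() {
	b := getBuf()
	n := copy(*b, "hello")
	fmt.Println(string((*b)[:n]))
	putBuf(b)
}
```

Because the runtime is free to discard pooled items, a `sync.Pool` gives no guarantee that a returned buffer will still be there for the next caller.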

Code
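A minimal sketch of such a pool, not Druva's actual implementation, might look like the following. It assumes the four size classes named above and uses buffered channels as free lists, so idle buffers stay referenced and are not reclaimed by the GC. The names `classSizes`, `NewBufferPool`, `classFor`, and the `perClass` limit are all assumptions made for this example.

```go
package main

import "fmt"

// Size classes named in the article: 1K, 100K, 1M and 16M.
var classSizes = []int{1 << 10, 100 << 10, 1 << 20, 16 << 20}

// BufferPool keeps one free list per size class. The free lists are
// buffered channels, so pooled buffers remain referenced while idle.
type BufferPool struct {
	free []chan []byte
}

// NewBufferPool creates a pool that retains up to perClass idle buffers
// of each size. perClass is a hypothetical tuning knob.
func NewBufferPool(perClass int) *BufferPool {
	p := &BufferPool{free: make([]chan []byte, len(classSizes))}
	for i := range p.free {
		p.free[i] = make(chan []byte, perClass)
	}
	return p
}

// classFor returns the index of the smallest class that fits n bytes,
// or -1 if n is larger than the largest class.
func classFor(n int) int {
	for i, size := range classSizes {
		if n <= size {
			return i
		}
	}
	return -1
}

// Get returns a buffer of length n (n >= 0) whose backing array has the
// fixed size of its class.
func (p *BufferPool) Get(n int) []byte {
	i := classFor(n)
	if i < 0 {
		return make([]byte, n) // oversized request: bypass the pool
	}
	select {
	case buf := <-p.free[i]:
		return buf[:n] // reuse an idle buffer
	default:
		return make([]byte, n, classSizes[i])
	}
}

// Put returns a buffer to its class. It drops the buffer if the class is
// already full or if the buffer did not come from the pool.
func (p *BufferPool) Put(buf []byte) {
	i := classFor(cap(buf))
	if i < 0 || cap(buf) != classSizes[i] {
		return
	}
	select {
	case p.free[i] <- buf[:cap(buf)]:
	default:
	}
}

func main() {
	pool := NewBufferPool(8)
	buf := pool.Get(1) // a 1-byte request still gets a 1 KiB backing array
	fmt.Println(len(buf), cap(buf))
	pool.Put(buf)
}
```

Reused buffers are not zeroed in this sketch, so callers should only read the bytes they have written. The channel-based free list is also safe to share across goroutines, which matters when hundreds of files are read in parallel.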

Result

The memory footprint dropped from ~20GB to ~2GB. Because there were also fewer memory allocation and deallocation requests, the load on the GC went down, which in turn resulted in higher throughput.

Key takeaways

Go may have some problems handling unstructured data, but with careful buffer management they can be eliminated. They should not become a blocker to realizing Go's many advantages, such as concurrency, fast execution, and a rich set of tools.

Next steps

Learn more about the technical innovations and best practices powering cloud backup and data management. Visit the Innovation Series section of Druva’s blog archive.

Join the team!

Looking for a career where you can shape the future of cloud data protection? Druva is the right place for you! Collaborate with talented, motivated, passionate individuals in a friendly, fast-paced environment; visit the careers page to learn more.

About the author

Sanjay is focused on making processes more efficient and performant while reducing resource utilization, whether they are client processes or microservices. His interests include learning new cloud and microservices technologies and helping implement best practices.
