Innovation Series

Achieving the Perfect Trifecta: Zero Vulnerabilities, Minimal Attack Surface, and Slimmed Container Images

Kush Shukla, Principal Engineer and Swapnil Kesur, Staff Software Engineer

Introduction

At Druva, we use containerization technology (Docker) extensively to deploy applications on AWS. Containers provide a lightweight and portable way to package and deploy software, which makes them a good fit for Druva’s 100% SaaS-based application. Containers also provide a high level of isolation between applications, which is critical for our multi-tenant applications. There is no such thing as a free lunch, however, and containers are no exception. One downside of containerized applications is the security risk inherited from the base layer of the image. This article shares the approach we followed to mitigate the security risks associated with our container images, which led us to achieve the perfect trifecta of zero vulnerabilities, minimal attack surface, and slimmed container images.

Druva is an industry leader in SaaS-based data protection, and we closely monitor our Infosec score, a metric we use to measure the security of container images. The score takes into account a variety of factors, such as known vulnerabilities in the software packages included in the image, the age of the image, and the overall size of the image. With the Infosec score as the single metric to improve, the goal was clear: all our container images should have zero vulnerabilities, be regularly updated, and be as small as possible. This work is imperative for ensuring a secure application experience for our customers.

State of Docker Containers in Druva Cloud 

The Druva Data Resiliency Cloud on AWS consists mostly of Docker containers hosting microservices written in Go and Python.

Go code is compiled into machine code, which is then packaged into a container. Go's static linking and the ability to build a single binary executable make it easy to create small, self-contained container images. This is because there's no need to include the runtime environment in the container image, as the binary executable contains all the necessary dependencies.

Python, on the other hand, is an interpreted language, so the container image must include the Python interpreter and all necessary dependencies along with the application code. This makes the image larger than that of a compiled language like Go, which only needs to include the binary executable.

The average size of our Go-based Docker containers was around 500 MB, while the Python-based containers averaged around 1.6 GB. The generated SBOM (Software Bill of Materials) of a Go-based image listed around 400 packages, and that of a Python-based image around 500. Most of the packages listed in the SBOM were introduced by the base OS layer of the container. A Snyk vulnerability scan detected around 125 vulnerabilities per container. These vulnerabilities came from packages that the application running inside the container does not even use and that only increase the attack surface.

Fig 1. Some of the packages that are listed in the SBOM of the container 
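
One way to see how much of an SBOM comes from the base OS layer is to group its entries by package ecosystem. The sketch below assumes a CycloneDX-style JSON document whose `components` entries carry a package URL (`purl`). The function name and the sample data are illustrative and are not taken from Druva's images.

```python
from collections import Counter


def count_by_ecosystem(sbom: dict) -> Counter:
    """Count SBOM components per package-URL type (deb, rpm, golang, pypi, ...)."""
    counts = Counter()
    for component in sbom.get("components", []):
        purl = component.get("purl", "")
        if purl.startswith("pkg:") and "/" in purl:
            # "pkg:deb/debian/openssl@1.1.1n" -> "deb"
            ecosystem = purl[len("pkg:"):].split("/", 1)[0]
        else:
            ecosystem = "unknown"
        counts[ecosystem] += 1
    return counts


# Illustrative sample only; a real SBOM would list hundreds of components.
sample_sbom = {
    "components": [
        {"name": "openssl", "purl": "pkg:deb/debian/openssl@1.1.1n"},
        {"name": "perl-base", "purl": "pkg:deb/debian/perl-base@5.32.1"},
        {"name": "golang.org/x/net", "purl": "pkg:golang/golang.org/x/net@v0.7.0"},
        {"name": "mystery"},
    ]
}

if __name__ == "__main__":
    for ecosystem, count in count_by_ecosystem(sample_sbom).most_common():
        print(ecosystem, count)
```

A large share of OS-level types such as `deb` or `rpm` relative to language-level types suggests that most of the listed packages come from the base layer rather than from the application.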

Approach, Solution, and Results

The state of the containers in the ecosystem revealed that the Docker images were bloated with unused packages. We wanted to get rid of such packages to minimize the vulnerability count, attack surface, and image size.

For Go-based containers, we already had statically linked, self-contained application binaries running inside the containers. We employed “docker-slim” (specifically the tool's build command) and executed it on the existing Go-based images using the command below:

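# Build a slimmed copy of $IMAGE and tag it "$IMAGE.slim".
# HTTP probing is off; the tool observes the container for 10 seconds before continuing.
docker-slim build --pull --target "$IMAGE" --tag "$IMAGE.slim" \
        --show-plogs --http-probe-off --show-clogs --show-blogs \
        --continue-after 10 --copy-meta-artifacts "./$META_FOLDER"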

Executing this command created slimmed versions of the input Go-based images. The slimmed images are 11x smaller than the original images: the tool brought the image size down from 500 MB to around 45 MB.
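
When this step runs for many images in a pipeline, it can help to assemble the same invocation programmatically. The following is a minimal sketch that only builds the argument list for the command shown above. The function name and the example image and folder names are hypothetical, and actually running the command requires Docker and docker-slim on the host.

```python
import shlex


def slim_build_command(image: str, meta_folder: str, continue_after: int = 10) -> list[str]:
    """Return the argv for the docker-slim build invocation shown above."""
    return [
        "docker-slim", "build",
        "--pull",
        "--target", image,
        "--tag", f"{image}.slim",
        "--show-plogs", "--http-probe-off", "--show-clogs", "--show-blogs",
        "--continue-after", str(continue_after),
        "--copy-meta-artifacts", f"./{meta_folder}",
    ]


if __name__ == "__main__":
    # Hypothetical image and folder names, for illustration only.
    argv = slim_build_command("example/service:1.0", "slim-meta")
    print(shlex.join(argv))
    # To execute it for real:
    #   import subprocess
    #   subprocess.run(argv, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when image names contain characters the shell would interpret.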

Next, we put the slimmed images to the test. We deployed them to the cloud in a testing environment and ran our test suites. The suites ran green, which validated the correctness of the slimmed images. We also ran the Snyk scan on the slimmed images, and the results showed that we were down to zero vulnerabilities. We then generated the SBOM of the slimmed image and found around 50 packages, all of them essential Go packages.

Fig 2. Some of the packages that are listed in the SBOM of the slimmed Go-based container 

The SBOM (as shown in Fig 2) explains the zero vulnerability count on the slimmed images. Slimming eliminated all the unused packages from the base OS layer, which were the sole source of the vulnerabilities. The slimmed image contains only Go packages, and the scan found no vulnerabilities in them.
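
A zero-vulnerability result like this can also be enforced automatically. The sketch below assumes a JSON scan report with a top-level `vulnerabilities` list whose entries carry a `severity` field. That shape is an assumption and may differ from what a given scanner emits, and the function names and sample findings are hypothetical.

```python
from collections import Counter


def severity_counts(report: dict) -> Counter:
    """Count findings per severity in a scan report (assumed shape, see above)."""
    return Counter(
        finding.get("severity", "unknown")
        for finding in report.get("vulnerabilities", [])
    )


def passes_gate(report: dict) -> bool:
    """The gate passes only when the report lists no vulnerabilities at all."""
    return sum(severity_counts(report).values()) == 0


# Hypothetical reports, for illustration only.
original_report = {
    "vulnerabilities": [
        {"id": "EXAMPLE-0001", "severity": "high"},
        {"id": "EXAMPLE-0002", "severity": "low"},
        {"id": "EXAMPLE-0003", "severity": "low"},
    ]
}
slim_report = {"vulnerabilities": []}

if __name__ == "__main__":
    print("original:", dict(severity_counts(original_report)), passes_gate(original_report))
    print("slim:", dict(severity_counts(slim_report)), passes_gate(slim_report))
```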

This approach solved all the targeted problems for the Go-based containers, but it failed for the Python-based containers. Python containers contain the application code as well as the runtime needed to execute it. When such a container was passed through docker-slim build, the tool removed most of the code. This happened because not every code path can be exercised while the tool observes the running container, so it detected the untraversed code as unused and removed it.
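
The failure mode can be illustrated with a toy model. This is not how docker-slim works internally; it only mimics the idea that a usage-based slimmer keeps the files it sees being touched during observation. All file and request names are hypothetical.

```python
# Toy model: the files each request type touches at run time (hypothetical names).
FILES_TOUCHED = {
    "backup": {"app/main.py", "app/backup.py"},
    "restore": {"app/main.py", "app/restore.py"},
    "report": {"app/main.py", "app/report.py"},
}
ALL_FILES = set().union(*FILES_TOUCHED.values())


def files_kept(observed_requests: list[str]) -> set[str]:
    """Files a usage-based slimmer would keep: only those touched while it watched."""
    kept: set[str] = set()
    for request in observed_requests:
        kept |= FILES_TOUCHED[request]
    return kept


def files_removed(observed_requests: list[str]) -> set[str]:
    """Files that look unused because no observed request exercised them."""
    return ALL_FILES - files_kept(observed_requests)


if __name__ == "__main__":
    # Only "backup" traffic arrives during the observation window.
    print(sorted(files_removed(["backup"])))
```

In this model, observing only `backup` requests drops the restore and report modules even though production traffic would need them. A single statically linked binary does not have this problem, because the whole application is one file that is touched as soon as the process starts.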

Inspired by many popular Python libraries, which are released pre-compiled, we employed the “PyInstaller” utility to package our Python application along with its dependencies. This enabled us to run the packaged application without installing the Python runtime or any modules inside the container. With this packaging in place, we could now apply the same solution to the Python-based containers that we'd developed for the Go-based containers.
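
As a rough illustration of such a packaging step, the sketch below assembles arguments for PyInstaller and hands them to its Python entry point. The use of `--onefile`, the optional `--hidden-import` entries, and all names here are assumptions made for the example; the article does not state which PyInstaller options were used.

```python
def pyinstaller_args(entry_script: str, name: str,
                     hidden_imports: tuple[str, ...] = ()) -> list[str]:
    """Build a PyInstaller argument list (options chosen for illustration)."""
    args = ["--onefile", "--name", name]
    for module in hidden_imports:
        # Modules imported dynamically may need to be listed explicitly.
        args += ["--hidden-import", module]
    args.append(entry_script)
    return args


def build(entry_script: str, name: str, hidden_imports: tuple[str, ...] = ()) -> None:
    """Run PyInstaller in-process; requires `pip install pyinstaller`."""
    import PyInstaller.__main__

    PyInstaller.__main__.run(pyinstaller_args(entry_script, name, hidden_imports))


if __name__ == "__main__":
    # Hypothetical entry script and output name; prints the arguments only.
    print(pyinstaller_args("app/main.py", "service"))
```

The resulting executable bundles the interpreter and the dependencies, so from the point of view of a usage-based slimmer the application behaves much like a self-contained binary.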

We passed the packaged Python-based container through the docker-slim build command, which created a slimmed, packaged version of the input Python-based images. The slimmed images are 8x smaller than the original images: the tool brought the image size down from 1.6 GB to around 200 MB. We ran the same tests and analysis on the slimmed packaged Python-based containers; they too passed for correctness and reported zero vulnerabilities, with only Python packages in the SBOM.

Fig 3. Some of the packages that are listed in the SBOM of the slimmed packaged Python-based container
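
The reduction figures quoted for both languages follow from simple arithmetic on the approximate image sizes. The sketch below reproduces them, assuming 1.6 GB is treated as 1600 MB; the function names are illustrative.

```python
def reduction_factor(original_mb: float, slim_mb: float) -> float:
    """How many times smaller the slimmed image is than the original."""
    return original_mb / slim_mb


def percent_saved(original_mb: float, slim_mb: float) -> float:
    """Share of the original size that slimming removed, in percent."""
    return 100.0 * (1 - slim_mb / original_mb)


if __name__ == "__main__":
    # Approximate sizes from the article: Go 500 MB -> 45 MB, Python 1.6 GB -> 200 MB.
    for label, original, slim in [("Go", 500, 45), ("Python", 1600, 200)]:
        print(f"{label}: {reduction_factor(original, slim):.1f}x smaller, "
              f"{percent_saved(original, slim):.1f}% saved")
```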

Cherry on the Cake

The slimmed container images not only improve the security posture of the deployment, but also make deployment smooth, fast, and cost-efficient. The slimmed images take less disk space, which reduces AWS ECR cost. Smaller images are faster to pull, which shortens deployment times. We observed up to 90% faster push and pull of slimmed container images compared to the original images. Druva spawns more than 6,000 container images in the production environment, so this adds great value for the organization.

On average, the Engineering team takes two weeks to fix a vulnerability reported in the Snyk scan and push the fix to production. A slimmed-down container image with no vulnerabilities saves considerable engineering effort, which can instead be directed towards implementing innovative features.

Key Takeaways

Slimmed container images that contain precisely what the application needs are a best practice that we highly recommend for organizations. A slimmed image with no vulnerabilities presents a minimal attack surface, which helps secure your applications against malicious activity. Beyond security, slimmed images make deployment faster and more cost-effective and save engineering effort. We saw increased business value in terms of security and efficiency by integrating this solution into our continuous delivery pipelines.

Next steps

Looking to learn more about the technical innovations and best practices powering cloud backup and data management? Visit the Innovation Series section of Druva’s blog archive.