Deceptive rm Command in Dockerfile

Mert Açıkportalı
Nov 16, 2021

Subject

There are already many Dockerfile best practices explained on various websites (I especially like this one: https://sysdig.com/blog/dockerfile-best-practices/), yet the subject of this post is still not emphasized enough, so I wanted to write a short, dedicated post. It’ll be about:

  • Keeping the size of Docker images small
  • Dangers of putting credentials into Docker images

Consider the following two Dockerfiles:

Dockerfile-samelayer

FROM alpine
RUN fallocate -l 256M dummy-file && rm -f dummy-file

Dockerfile-differentlayer

FROM alpine
RUN fallocate -l 256M dummy-file
RUN rm -f dummy-file

(fallocate is a Linux-specific command that preallocates space for a file, which makes it handy for generating dummy files of a given size.)

How do you think these two images differ, specifically in size and final filesystem? If Docker has been a part of your life for some time already, you most likely know the answer.

To be honest, if someone had asked me this question back when I started learning containerization, I’d have said the sizes and the filesystems would be identical for both Dockerfiles. After all, we are removing the dummy-file in both Dockerfiles, right?

Well, I’d be wrong…

Inspection

After building the images, let’s take a look at how big the images are:

Hmm… Even though we remove the dummy-file in both Dockerfiles, the image built from Dockerfile-differentlayer is huge (274 MB versus 5.6 MB), as if we hadn’t removed the file.

Using the magnificent dive tool (https://github.com/wagoodman/dive), we can understand why.

Dockerfile-samelayer

Take a look at the filesystem of the Dockerfile-samelayer first:

No trace of dummy-file. Nothing surprising here.

Let’s take a look at the image size now:

As expected, the file is actually removed from the image (the final image size is 5.6 MB); therefore, dummy-file does not contribute to the size of the image.

Dockerfile-differentlayer

Let’s see if the dummy-file is in the filesystem this time:

The filesystem is identical to that of Dockerfile-samelayer in every way. Again, dummy-file is not in the filesystem.

Let’s check the image size, where things get a bit complicated:

The rm command is not doing what we expect it to do. Its only effect is that there is no trace of dummy-file in the final filesystem. (This is also not entirely true; I’ll come back to that later.) The file still contributes to the final image size (274 MB).
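The 274 MB figure is consistent with simple addition. The following is a rough check, assuming image sizes are reported in decimal megabytes (1 MB = 1,000,000 bytes) and that the `M` suffix of `fallocate -l 256M` means mebibytes. The helper name `expected_size_mb` is made up for this sketch.

```python
# Rough size check. Assumptions: sizes are reported in decimal megabytes,
# and fallocate's "M" suffix means mebibytes (1024 * 1024 bytes).
MIB = 1024 * 1024
MB = 1000 * 1000

base_image_mb = 5.6              # size of the same-layer image, from this post
dummy_file_mb = 256 * MIB / MB   # size of dummy-file in decimal megabytes


def expected_size_mb(base_mb: float, extra_mb: float) -> float:
    """Layers only ever add to the total, so the sizes simply sum."""
    return round(base_mb + extra_mb, 1)


print(f"dummy-file: {dummy_file_mb:.1f} MB")
print(f"different-layer image: {expected_size_mb(base_image_mb, dummy_file_mb)} MB")
```

Under those assumptions the sum comes out at about 274 MB, which matches the size reported for the Dockerfile-differentlayer image.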

So, what happened?

Explanation

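Layers happened. In Docker, the image size is the sum of the sizes of all its layers, and a layer cannot have a negative size. That’s because Docker uses a union filesystem: each layer only records changes on top of the layers below it, so deleting a file in a later layer hides it rather than reclaiming its space.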
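To make the layer bookkeeping concrete, here is a toy model in Python. It is not how Docker is implemented: `Layer`, `image_size`, and `visible_files` are hypothetical names, the 1 MiB base layer is a stand-in, and a deletion is modeled as a zero-size marker, loosely mirroring the whiteout entries that union filesystems use to hide a file from lower layers.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """Toy layer: files added (name -> size in bytes) and names deleted."""
    added: dict[str, int] = field(default_factory=dict)
    deleted: set[str] = field(default_factory=set)

    @property
    def size(self) -> int:
        # A deletion is only a marker; it never subtracts bytes.
        return sum(self.added.values())


def image_size(layers: list[Layer]) -> int:
    """Image size is the sum of all layer sizes."""
    return sum(layer.size for layer in layers)


def visible_files(layers: list[Layer]) -> dict[str, int]:
    """Merge layers bottom to top, roughly as a union mount presents them."""
    merged: dict[str, int] = {}
    for layer in layers:
        for name in layer.deleted:
            merged.pop(name, None)
        merged.update(layer.added)
    return merged


MIB = 1024 * 1024
base = Layer(added={"bin/sh": 1 * MIB})  # stand-in for the alpine base layer

# One RUN: the file is created and removed before the layer is committed.
same_layer = [base, Layer()]

# Two RUNs: the file is committed in one layer and hidden by the next.
different_layer = [
    base,
    Layer(added={"dummy-file": 256 * MIB}),
    Layer(deleted={"dummy-file"}),
]

for name, layers in [("samelayer", same_layer), ("differentlayer", different_layer)]:
    print(name, sorted(visible_files(layers)), image_size(layers) // MIB, "MiB")
```

Both variants expose the same merged filesystem, but only the two-step variant carries the 256 MiB in its total.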

But, where is the dummy-file? We know it’s somewhere, but where?

In our example, we created the file in the second layer but removed it in the third layer, which means dummy-file is still in the second layer (see union filesystem). To prove this, let’s view the filesystem of the second layer of rm-test:differentlayer.

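Here we can see that the dummy-file is still accessible! This is also one of the reasons why sensitive data shouldn’t be put into Docker images: anyone who can pull the image can read every layer, so a credential deleted in a later layer is still exposed. It’s asking for all sorts of trouble.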
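The same check can be scripted by searching every layer of a saved image for a path. The sketch below assumes an archive produced by `docker save` (for example `docker save rm-test:differentlayer -o image.tar`) that contains a top-level `manifest.json` whose first entry lists its layer tarballs under a `Layers` key. The exact archive layout may differ between Docker versions, and `find_in_layers` is a hypothetical helper name, not part of any Docker tooling.

```python
import json
import posixpath
import tarfile


def find_in_layers(image_tar: str, wanted: str) -> list[tuple[str, str, int]]:
    """Return (layer path, kind, size) for each layer that mentions `wanted`.

    kind is "file" when the layer stores the file itself, or "whiteout"
    when it stores the ".wh.<name>" marker that hides the file.
    Assumes a `docker save` archive with a top-level manifest.json.
    """
    directory, base = posixpath.split(wanted)
    whiteout = posixpath.join(directory, ".wh." + base)
    hits = []
    with tarfile.open(image_tar) as image:
        manifest = json.load(image.extractfile("manifest.json"))
        for layer_path in manifest[0]["Layers"]:
            # Each layer is itself a tar archive (possibly compressed).
            layer_file = image.extractfile(layer_path)
            with tarfile.open(fileobj=layer_file, mode="r:*") as layer:
                for member in layer:
                    name = posixpath.normpath(member.name).lstrip("/")
                    if name == wanted:
                        hits.append((layer_path, "file", member.size))
                    elif name == whiteout:
                        hits.append((layer_path, "whiteout", 0))
    return hits
```

Calling `find_in_layers("image.tar", "dummy-file")` on the different-layer image would be expected to report the file in the layer created by the fallocate step and a whiteout in the layer created by rm. The same search works for a path such as a leaked credentials file.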

So, is it really impossible to actually remove a file after the image is created?

No, but I consider it a last-resort solution. I might give it a try only if I’m reaaaaally desperate. If you are still interested, see https://medium.com/@samhavens/how-to-make-a-docker-container-smaller-by-deleting-files-7354b5c6c8f1

Obligatory Meme Ending
