Deceptive rm Command in Dockerfile

Subject

  • Keeping the size of Docker images small
  • Dangers of putting credentials into Docker images

Consider the following two Dockerfiles:

Dockerfile-samelayer

FROM alpine
RUN fallocate -l 256M dummy-file && rm -f dummy-file

Dockerfile-differentlayer

FROM alpine
RUN fallocate -l 256M dummy-file
RUN rm -f dummy-file

(fallocate is a Linux-specific command to generate dummy files)

How do you think these two images differ, specifically in size and final filesystem? If Docker has been a part of your life for some time already, you most likely know the answer.

To be honest, if someone had asked me this question back when I started learning containerization, I’d have said the sizes and the filesystems would be identical for both Dockerfiles. After all, we remove dummy-file in both Dockerfiles, right?

Well, I’d be wrong…

Inspection

Hmm… Even though we remove dummy-file in both Dockerfiles, the image built from Dockerfile-differentlayer is huge, as if we hadn’t removed the file at all.
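The size comparison can be reproduced with a short script along these lines. Treat it as a sketch: the scratch build directory and the rm-test:samelayer tag are my assumptions (only rm-test:differentlayer is named later in this post), and the Docker steps are simply skipped when no Docker daemon is reachable.

```shell
#!/bin/sh
# Sketch: write both Dockerfiles to a scratch directory, build them, compare sizes.
ctx=$(mktemp -d)

cat > "$ctx/Dockerfile-samelayer" <<'EOF'
FROM alpine
RUN fallocate -l 256M dummy-file && rm -f dummy-file
EOF

cat > "$ctx/Dockerfile-differentlayer" <<'EOF'
FROM alpine
RUN fallocate -l 256M dummy-file
RUN rm -f dummy-file
EOF

# Only talk to Docker if a daemon is actually reachable.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  docker build -f "$ctx/Dockerfile-samelayer" -t rm-test:samelayer "$ctx" &&
  docker build -f "$ctx/Dockerfile-differentlayer" -t rm-test:differentlayer "$ctx" &&
  docker images rm-test   # lists every tag of the rm-test repository with its size
else
  echo "docker not available; skipping build" >&2
fi
```

The last command prints one row per tag, so the two sizes end up side by side.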

By using the magnificent https://github.com/wagoodman/dive tool, we can understand why.

Dockerfile-samelayer

No trace of dummy-file. Nothing surprising here.

Let’s take a look at the image size now:

As expected, the file is actually removed from the image (the final image size is 5.6 MB), so dummy-file does not contribute to the size of the image.

Dockerfile-differentlayer

The filesystem is identical to that of Dockerfile-samelayer in every way. Again, dummy-file is not in the filesystem.

Let’s check the image size, where things get a bit complicated:

The rm command is not doing what we expected it to do. Its only effect is that there is no trace of dummy-file in the final filesystem. (This is also not entirely true, I’ll come back to that later.) The file still contributes to the final image size, which is 274 MB: the 5.6 MB base image plus the 256 MiB (about 268 MB) file.
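If you don't have dive at hand, docker history gives a rough per-step view of the same thing. A minimal sketch, assuming the image was built and tagged rm-test:differentlayer; layer_sizes is just a helper name I made up for this example.

```shell
#!/bin/sh
# Sketch: print how much each build step added to an image.
layer_sizes() {
  # Size and CreatedBy are standard `docker history` format fields.
  docker history --format '{{.Size}}  {{.CreatedBy}}' "$1"
}

if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  layer_sizes rm-test:differentlayer || echo "image not found; build it first" >&2
else
  echo "docker not available; skipping" >&2
fi
```

I would expect the fallocate step to account for nearly all of the image size and the rm step to add next to nothing, which matches the totals above.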

So, what happened?

Explanation

But, where is the dummy-file? We know it’s somewhere, but where?

In our example, we created the file in the second layer but removed it in the third layer. This means dummy-file is still in the second layer: layers are immutable, so a later layer can only hide a file that an earlier layer contains, not erase it. (See Union filesystem.) To prove this, let’s view the filesystem of the second layer of rm-test:differentlayer.
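The same mechanics can be imitated without Docker at all. The following is a toy model, not how Docker stores images internally: each "layer" is a plain tar archive, the file is shrunk to 1 MiB to keep things quick, and the deletion is recorded as a .wh.-prefixed marker file, similar to the whiteout convention that OCI image layers use. All directory and variable names are made up for the example.

```shell
#!/bin/sh
# Toy model of image layers built from plain tar archives (no Docker involved).
set -eu

work=$(mktemp -d)
mkdir "$work/layer2" "$work/layer3" "$work/merged"

# "Layer 2": the fallocate step creates the file (1 MiB here to keep it quick).
head -c 1048576 /dev/zero > "$work/layer2/dummy-file"

# "Layer 3": the rm step records only a whiteout marker for the deleted path.
: > "$work/layer3/.wh.dummy-file"

# Every layer is archived on its own; earlier archives are never rewritten.
tar -cf "$work/layer2.tar" -C "$work/layer2" .
tar -cf "$work/layer3.tar" -C "$work/layer3" .

# Merged view: unpack the layers in order, then apply the whiteout markers.
tar -xf "$work/layer2.tar" -C "$work/merged"
tar -xf "$work/layer3.tar" -C "$work/merged"
for marker in "$work"/merged/.wh.*; do
  [ -e "$marker" ] || continue
  rm -rf "$work/merged/${marker##*/.wh.}" "$marker"
done

merged_listing=$(ls -A "$work/merged")
layer2_size=$(( $(wc -c < "$work/layer2.tar") ))
layer3_size=$(( $(wc -c < "$work/layer3.tar") ))
total_size=$((layer2_size + layer3_size))

echo "files in merged view: ${merged_listing:-<none>}"
echo "bytes shipped in layers: $total_size"
```

The merged view ends up empty, yet the archives that would be shipped still add up to more than 1 MiB, because the second archive was never touched by the "rm" step.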

Here we can see that dummy-file is still accessible! This is also one of the reasons why sensitive data shouldn't be put into Docker images: anyone who can pull the image can read every layer, including files that a later layer deleted. It's asking for all sorts of trouble.
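To check this from the command line, one option is to export the image with docker save and look inside the layer archives. The sketch below makes a few assumptions: the image is tagged rm-test:differentlayer, layers_containing is a helper name I invented, and since the directory layout inside the exported tarball differs between Docker versions, it simply tries every file as a tar archive instead of relying on particular paths. The file name is used as a regular expression, so it only suits simple names like this one.

```shell
#!/bin/sh
# Sketch: report which layer archives under a directory still contain a file.
# Usage: layers_containing <dir-with-extracted-image> <file-name>
layers_containing() {
  find "$1" -type f | while IFS= read -r blob; do
    # Non-archive blobs (JSON manifests, configs) make tar fail; ignore them.
    if tar -tf "$blob" 2>/dev/null | grep -Eq "^(\./)?$2\$"; then
      printf '%s\n' "$blob"
    fi
  done
}

if command -v docker >/dev/null 2>&1 &&
   docker image inspect rm-test:differentlayer >/dev/null 2>&1; then
  out=$(mktemp -d)
  docker save rm-test:differentlayer -o "$out/image.tar"
  mkdir "$out/image"
  tar -xf "$out/image.tar" -C "$out/image"
  layers_containing "$out/image" dummy-file
else
  echo "rm-test:differentlayer not available; skipping" >&2
fi
```

Any path it prints is a layer archive from which the "deleted" file can be extracted with plain tar. Swap dummy-file for a private key or a credentials file and the danger becomes obvious.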

So, is it really impossible to actually remove a file after the image is created?

No, but I consider it a last-resort solution. Only if I’m reaaaaally desperate might I give it a try. If you are still interested, see -> https://medium.com/@samhavens/how-to-make-a-docker-container-smaller-by-deleting-files-7354b5c6c8f1

Obligatory Meme Ending


From Vault 11, the Last Survivor 💫 I have a theoretical degree in Theoretical Physics 👨‍🎓

Mert Açıkportalı
