Should I minimize the number of docker layers?

The documentation doesn't say much on the topic. It says:

Minimize the number of layers

Prior to Docker 17.05, and even more, prior to Docker 1.10, it was important to minimize the number of layers in your image. The following improvements have mitigated this need:

In Docker 1.10 and higher, only RUN, COPY, and ADD instructions create layers. Other instructions create temporary intermediate images, and no longer directly increase the size of the build.

Docker 17.05 and higher add support for multi-stage builds, which allow you to copy only the artifacts you need into the final image. This allows you to include tools and debug information in your intermediate build stages without increasing the size of the final image.
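
For illustration, a minimal multi-stage build might look like this (the image tags, paths, and the Go project are placeholder assumptions, not something from the docs):

# build stage: compilers and build tools live only here
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# final stage: only the built artifact is copied in
FROM alpine:3.20
COPY --from=build /out/app /usr/local/bin/app
CMD ["app"]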

It looks like the latest Docker versions don't solve the problem of handling many layers; rather, they strive to reduce their number in the final image. Most importantly, the docs don't explain why many layers are bad.

I'm aware of the AUFS limit of 42 layers. It makes sense to keep the number of layers small for widely used images because it helps other images built on top of them fit the restriction. However, there are other storage drivers, and images built for other purposes.

It is also good to keep images small for an obvious reason: they take up disk space and network bandwidth. However, I don't think that chaining RUN statements and thus squashing many layers into one helps in general. If different RUN instructions update different parts of the filesystem, one layer and many layers should be approximately the same in total size.

On the other hand, many layers make it possible to use the build cache and rebuild images faster. They are also pulled in parallel.

I work in a small team with a private Docker registry. We won't ever meet the 42 layers restriction and care mostly about performance and development speed.

If so, should I minimize the number of docker layers?


I work in a small team with a private Docker registry. We won't ever meet the 42 layers restriction and care mostly about performance and development speed.

If so, should I minimize the number of docker layers?

In your case, no.
What needs to be minimized is the build time, which means:

  • making sure the most general and the longest steps come first; they will then be cached, allowing you to fiddle with the last lines of your Dockerfile (the most specific commands) while keeping rebuild times quick.
  • making sure the longest RUN commands come first and in their own layer (again, to be cached), instead of being chained with other RUN commands: if one of those fails, the long command will have to be re-executed. If that long command is isolated in its own (Dockerfile line)/layer, it will be cached. See the sketch just after this list.
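
A minimal sketch of that ordering (the base image, package names, and paths are just placeholders):

FROM debian:stable

# longest, most general step first, in its own layer, so it stays cached across rebuilds
RUN apt-get update && apt-get install -y build-essential

# most specific, frequently edited steps last; only these are re-run when they change
COPY ./src /app/src
RUN make -C /app/src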

That being said, the documentation you mention comes from docker/docker.github.io, specifically PR 4992 and PR 4854, right after the docker build LABEL section.
So this section follows a similar remark about LABEL, and simply emphasizes which commands create layers.
Again, in your case, that would not be important.


I just wanted to see what the differences were between two images, one built with multiple RUN instructions and the other built with a single RUN concatenating the same commands.

In the first test, both images perform trivial operations (creating and deleting files).

Content of the "single" layer image:

FROM busybox

RUN echo This is the 1 > 1 \
    && rm -f 1 \
    && echo This is the 2 > 2 \
    && rm -f 2 \
# ... for about 70 commands

Content of the multiple layers image:

FROM busybox

RUN echo This is the 1 > 1
RUN rm -f 1
RUN echo This is the 2 > 2
RUN rm -f 2
# ... for about 70 layers

The build times are very different (multiple: 0m34,973s, single: 0m0,568s). The container start-up times also differ, but less noticeably (multiple: 0m0,435s, single: 0m0,378s). I've run the images several times and the timings do not change much.

Concerning disk space, I deliberately looked for the worst case for the multi-layer variant, and as expected the multi-layer image is bigger than the single-layer one.

In another test, the commands only add content to the image (nothing is deleted). The build times do not change from the previous case, but the start-up times show something a little different: the multi-layer image starts up faster than the single-layer one. Concerning disk space, the results are the same.

I don't think this proves anything, but I had fun doing it :P


Reducing the number of layers is not much of a goal in itself. Rather, what you need to focus on is reducing build time and reducing image size.

Build time is reduced by keeping common layers that rarely change at the top of your Dockerfile, or in a base image. This allows the layer to be cached and reused in later builds. This is less about reducing the number of layers, and more about ordering your layers well.
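
As a sketch of that ordering (the base image and file names here are assumptions, not a prescription):

FROM node:20
WORKDIR /app

# rarely-changing layers first: the dependency install is cached as long as the
# manifests do not change
COPY package.json package-lock.json ./
RUN npm ci

# frequently-changing layer last: editing the source only invalidates this step
COPY . .
CMD ["node", "server.js"]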

Reducing image size helps reduce the disk usage on registry servers, which take a large hit when images are stored for every build on a CI system. It also reduces the network time to transfer the image. When one layer downloads a large temporary file and another layer deletes it, the file is left in the first layer, where it still gets sent over the network and stored on disk, even though it's not visible inside your container. Changing permissions on a file likewise copies the file into the current layer with the new permissions, doubling the disk space and network bandwidth for that file.
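
As a sketch of the temporary-file problem (the URL and paths are placeholders), the archive below is still carried in the first layer even though a later layer deletes it:

RUN curl -fsSL https://example.com/tool.tar.gz -o /tmp/tool.tar.gz
RUN tar -xzf /tmp/tool.tar.gz -C /usr/local
RUN rm /tmp/tool.tar.gz   # too late: the file already exists in the first layer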

The standard solution to reduce the image size in the above scenarios is to chain the RUN commands so that temporary files are never stored to an image layer. This has the side effect of reducing the number of image layers.
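
The chained form of the same steps never commits the temporary file into any layer:

# download, extract, and delete in a single layer
RUN curl -fsSL https://example.com/tool.tar.gz -o /tmp/tool.tar.gz \
 && tar -xzf /tmp/tool.tar.gz -C /usr/local \
 && rm /tmp/tool.tar.gz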

There's one last issue, which is excessive caching. This is commonly seen with the apt-get update and apt-get install ... commands in Debian images. If you do not chain these commands together, an update to the apt-get install command will reuse a possibly stale cache from the previous layer's apt-get update command, and will fail months later when it cannot find the needed packages. Therefore, you should chain these commands even though it increases build time, because the alternative is build failures in the future.
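
For the Debian case, the usual chained form looks something like this (the package names are placeholders); cleaning the apt lists in the same layer also keeps the image small:

RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*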

Therefore, it's more the side effects of reducing layers that you want, not necessarily reducing the layers for the sake of reducing layers.