How to flatten a Docker image?
I made a Docker container which is fairly large. When I commit the container to create an image, the image is about 7.8 GB big. But when I export
the container (not save
the image!) to a tarball and re-import it, the image is only 3 GB big. Of course the history is lost, but this OK for me, since the image is "done" in my opinion and ready for deployment.
How can I flatten an image/container without exporting it to the disk and importing it again? And: Is it a wise idea to do that or am I missing some important point?
Now that Docker has released the multi-stage builds in 17.05, you can reformat your build to look like this:
FROM buildimage as build
# your existing build steps here
FROM scratch
COPY --from=build / /
CMD ["/your/start/script"]
The result will be your build environment layers are cached on the build server, but only a flattened copy will exist in the resulting image that you tag and push.
Note, you would typically reformulate this to have a complex build environment and only copy over a few directories. Here's an example with Go to make a single binary image from source code and a single build command without installing Go on the host and compiling outside of docker:
$ cat Dockerfile
ARG GOLANG_VER=1.8
FROM golang:${GOLANG_VER} as builder
WORKDIR /go/src/app
COPY . .
RUN go-wrapper download
RUN go-wrapper install
FROM scratch
COPY --from=builder /go/bin/app /app
CMD ["/app"]
The go file is a simple hello world:
$ cat hello.go
package main
import "fmt"
func main() {
fmt.Printf("Hello, world.\n")
}
The build creates both environments, the build environment and the scratch one, and then tags the scratch one:
$ docker build -t test-multi-hello .
Sending build context to Docker daemon 4.096kB
Step 1/9 : ARG GOLANG_VER=1.8
--->
Step 2/9 : FROM golang:${GOLANG_VER} as builder
---> a0c61f0b0796
Step 3/9 : WORKDIR /go/src/app
---> Using cache
---> af5177aae437
Step 4/9 : COPY . .
---> Using cache
---> 976490d44468
Step 5/9 : RUN go-wrapper download
---> Using cache
---> e31ac3ce83c3
Step 6/9 : RUN go-wrapper install
---> Using cache
---> 2630f482fe78
Step 7/9 : FROM scratch
--->
Step 8/9 : COPY --from=builder /go/bin/app /app
---> Using cache
---> 5645db256412
Step 9/9 : CMD /app
---> Using cache
---> 8d428d6f7113
Successfully built 8d428d6f7113
Successfully tagged test-multi-hello:latest
Looking at the images, only the single binary is in the image being shipped, while the build environment is over 700MB:
$ docker images | grep 2630f482fe78
<none> <none> 2630f482fe78 6 days ago 700MB
$ docker images | grep 8d428d6f7113
test-multi-hello latest 8d428d6f7113 6 days ago 1.56MB
And yes, it runs:
$ docker run --rm test-multi-hello
Hello, world.
Up from Docker 1.13, you can use the --squash
flag.
Before version 1.13:
To my knowledge, you cannot using the Docker api. docker export
and docker import
are designed for this scenario, as you yourself already mention.
If you don't want to save to disk, you could probably pipe the outputstream of export into the input stream of import. I have not tested this, but try
docker export red_panda | docker import - exampleimagelocal:new
Build the image with the --squash
flag:
https://docs.docker.com/engine/reference/commandline/build/#squash-an-images-layers---squash-experimental
Also consider mopping up unneeded files, such as the apt cache:
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Take a look at docker-squash
Install with:
pip install docker-squash
Then, if you have a image, you can squash all layers into 1 with
docker-squash -f <nr_layers_to_squash> -t new_image:tag existing_image:tag
A quick 1-liner that is useful for me to squash all layers:
docker-squash -f $(($(docker history $IMAGE_NAME | wc -l | xargs)-1)) -t ${IMAGE_NAME}:squashed $IMAGE_NAME