What is the purpose of the VOLUME instruction in a Dockerfile?

In the documentation it says that the VOLUME instruction creates a mount point, but I created an image using

FROM alpine
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting

and I was able to mount /myvol, or any other path on the container's filesystem, using docker run -v vol:/myvol myimage, and I could see the data in /var/lib/docker/volumes/vol/_data on the host machine.

What difference would adding VOLUME /myvol to the Dockerfile make?


Solution 1:

I struggled to understand this and had to do some actual testing, as the documentation was a bit too vague for me.

With the VOLUME instruction in the Dockerfile, you declare a volume that every container created from that image will have, even if no volume is explicitly mounted at container creation time (e.g. with docker run -v <volume>:/data <image name>).

So instead of passing -v, I can put the instruction in the Dockerfile:

FROM alpine

RUN mkdir /data && echo "Some data" > /data/mydata
VOLUME /data

Start a container from the image built with the above Dockerfile (tagged voltest here, e.g. with docker build -t voltest .):

docker run -ti --rm --name volume-test voltest

Inspect the running container:

docker container inspect volume-test

...
        "Mounts": [
            {
                "Type": "volume",
                "Name": "c4d070456dfa65540bd5c75b958930837bbf4277f4a4169b791679127f29a73a",
                "Source": "/var/snap/docker/common/var-lib-docker/volumes/c4d070456dfa65540bd5c75b958930837bbf4277f4a4169b791679127f29a73a/_data",
                "Destination": "/data",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            }
        ]
...

As you can see, there is a volume mounted at the container's /data directory. This anonymous volume was created automatically during container creation because of the VOLUME instruction in the Dockerfile. Since the container was started with the --rm option, the volume will be removed automatically when the container stops (assuming nothing else is using it at that time). You can confirm this by running docker volume ls after stopping the container.
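If the full inspect output is too noisy, a Go template passed to --format can reduce it to just the volume name and mount point. This is only a sketch: list_mounts is a helper name I made up for illustration, and it assumes a running Docker daemon and an existing container.

```shell
# Print "<volume name> -> <mount point>" for every mount of a container.
# list_mounts is a hypothetical helper; pass the container name as $1.
list_mounts() {
    docker container inspect \
        --format '{{range .Mounts}}{{.Name}} -> {{.Destination}}{{println}}{{end}}' \
        "$1"
}

# Usage (needs a running Docker daemon and the container from above):
#   list_mounts volume-test
```

Running docker volume ls before and after stopping the container should then show the same anonymous volume name appearing and disappearing.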

Other containers can use such ad-hoc volumes too, for example by mounting them with --volumes-from:

docker run --rm -ti --name alpine-vol --volumes-from volume-test alpine sh

Check the /data directory in the newly started container; it contains the original container's data stored on the volume.

I definitely see a use for this when data needs to be shared between containers but does not need to persist after the original container has been removed (e.g. as part of a sidecar pattern). If data persistence is required, you can still explicitly mount a named volume at the same directory.

Solution 2:

The obvious use of VOLUME is to populate a new persistent volume with data supplied by your container. Regardless of what sort of application you are deploying, there's almost always some initial data you need to start with. The documentation makes clear that this copy only takes place when the volume is newly created.
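A rough way to see the copy-on-first-use behaviour for yourself is sketched below. It assumes the voltest image from Solution 1 (with /data/mydata containing "Some data"); seed_demo and the volume name seeded are made-up names for illustration only.

```shell
# seed_demo is a hypothetical helper; "seeded" is an arbitrary volume name
# and "voltest" is assumed to be the image built in Solution 1.
seed_demo() {
    # First use: the named volume is new and empty, so the image's /data
    # content is copied into it before the container starts.
    docker run --rm -v seeded:/data voltest cat /data/mydata

    # Overwrite the file; the change is stored in the volume, not the image.
    docker run --rm -v seeded:/data voltest sh -c 'echo "Changed" > /data/mydata'

    # Second use: the volume already has content, so nothing is copied again.
    docker run --rm -v seeded:/data voltest cat /data/mydata
}

# Usage (needs a running Docker daemon):
#   seed_demo
#   docker volume rm seeded   # clean up afterwards
```

With that image, the first cat should print Some data and the last one Changed, which shows that the image content is used only to seed an empty volume.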