What is the difference between an image and a repository?

I am brand new to Docker and following the Getting Started tutorial. At step 7 it says

type docker images command and press RETURN. The command lists all the images on your local system. You should see docker/whalesay in the list.

$ docker images
REPOSITORY           TAG         IMAGE ID            CREATED            VIRTUAL SIZE
docker/whalesay      latest      fb434121fc77        3 hours ago        247 MB
hello-world          latest      91c95931e552        5 weeks ago        910 B

but the first column clearly says "repository", not e.g. "image name". I have also noticed on other people's machines that, because an image can have multiple tags, this listing often contains duplicate entries - one for each tag. So is this a list of images, a list of repositories, a list of image-tag combinations or something else? What is the difference between an image and a repository?

Also, given that images and repositories are different things, how can I just list my repositories?

This is nothing to do with containers.


Solution 1:

Yes, this is very confusing terminology.

Simplest answer:

Image: a single image.

Repository: a collection of images.

Details:

Image: Uniquely referenced by the Image ID, the 12-digit hex code (e.g. 91c95931e552). [1]

Repository: Contains one or more images. So the hello-world repository could contain two different images: 91c95931e552 and 1234abcd5678.

Image alias - I'm going to define image alias to mean an alias that references a specific image. The format of an image alias is repository:tag. This way, you can use a human-friendly alias such as hello-world:latest instead of the 12-digit code.

Example:

Let's say I have these images:

REPOSITORY           TAG         IMAGE ID
docker/whalesay      latest      fb434121fc77
hello-world          latest      91c95931e552
hello-world          v1.1        91c95931e552
hello-world          v1.0        1234abcd5678

The repositories are: docker/whalesay, hello-world.

The images are fb434121fc77, 91c95931e552, 1234abcd5678. Notice that the 2nd and 3rd rows have the same Image ID, so they are the same image.

The image aliases are:

docker/whalesay:latest
hello-world:latest
hello-world:v1.1
hello-world:v1.0

So hello-world:latest and hello-world:v1.1 are simply two aliases for the same image.
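To make the grouping concrete, here is a minimal sketch that takes the listing above as plain text and prints each image ID once, followed by every alias that points to it. The function name aliases_by_image is made up for this example, and the sample data is hard-coded so no Docker daemon is needed.

```shell
# Sample listing copied from the example above: repository, tag, image ID.
listing='docker/whalesay latest fb434121fc77
hello-world latest 91c95931e552
hello-world v1.1 91c95931e552
hello-world v1.0 1234abcd5678'

# aliases_by_image (hypothetical helper): read "repository tag id" lines on
# stdin and print each image ID once with all repository:tag aliases for it.
aliases_by_image() {
  awk '{
    id = $3
    name = $1 ":" $2
    if (id in names) { names[id] = names[id] " " name }
    else { order[++n] = id; names[id] = name }
  }
  END { for (i = 1; i <= n; i++) print order[i] ": " names[order[i]] }'
}

printf '%s\n' "$listing" | aliases_by_image
```

Four aliases collapse to three images, because hello-world:latest and hello-world:v1.1 share an ID. Against a real engine, something like docker image ls --format '{{.Repository}} {{.Tag}} {{.ID}}' | aliases_by_image should give the same kind of grouping.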

Additional Details:

  • Repository name format can also prepend an optional user or namespace, which is useful when using a public registry like Docker Hub. E.g. docker/whalesay. Otherwise, you will have a lot of repository name conflicts.

  • If you leave out the tag when referencing an image alias, Docker will automatically add :latest. So when you specify hello-world, it will be interpreted as hello-world:latest. Warning: latest doesn't actually mean anything special, it's just a default tag.

  • [1] Actually, the full Image ID is a 64-digit hex code and the listing only shows its first 12 digits, but you don't need to care about that.
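The default-tag rule can be sketched in a few lines of shell. This is only an illustration of the rule as described above, not Docker's actual parser, and with_default_tag is a hypothetical helper name. It looks for a colon only in the last path component, so a registry port elsewhere in the name is not mistaken for a tag.

```shell
# with_default_tag REF (hypothetical helper): print REF unchanged if it
# already carries a tag, otherwise append ":latest".
with_default_tag() (
  ref=$1
  last=${ref##*/}   # final path component, e.g. "whalesay" or "hello-world:v1.1"
  case $last in
    *:*) printf '%s\n' "$ref" ;;          # tag already present
    *)   printf '%s:latest\n' "$ref" ;;   # no tag, fall back to the default
  esac
)

with_default_tag hello-world
with_default_tag docker/whalesay
with_default_tag hello-world:v1.0
```

The first two calls print the name with :latest appended; the third prints hello-world:v1.0 as-is.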

Solution 2:

Quoted from the official Docker documentation:

A repository potentially holds multiple variants of an image.

(see: https://docs.docker.com/userguide/dockerimages)

This means: A Docker image can belong to a repository, e.g. when it was pushed to a Docker registry (with docker push my/repository:version1). Conversely, a repository contains multiple versions of an image (= different tags). So when you build a new version of your image, you can give it a tag (docker tag 518a41981a6a my/repository:version2) and push it to your repository as the next version (docker push my/repository:version2).

Here's an example from the Docker documentation (see the link above). As you can see, it shows one repository called ouruser/sinatra which contains various versions (latest, devel, v2) of the same image:

$ docker images ouruser/sinatra
REPOSITORY          TAG     IMAGE ID      CREATED        VIRTUAL SIZE
ouruser/sinatra     latest  5db5f8471261  11 hours ago   446.7 MB
ouruser/sinatra     devel   5db5f8471261  11 hours ago   446.7 MB
ouruser/sinatra     v2      5db5f8471261  11 hours ago   446.7 MB

In your example, you have two repositories (docker/whalesay and hello-world), each of which contains only one tagged image. Both are tagged latest, which is just the default tag Docker uses when you don't specify one; it doesn't necessarily point to the most recently built image.

Solution 3:

It's easiest to define several terms here because they all interrelate:

Image: These are the filesystem layers and metadata that package an application so it can be run as a container. Each image must have an ID on a docker engine.

Reference: This is a pointer to an image. There are different types of references: it can be just the image ID, usually it is a repository and tag, and sometimes you will pin to a specific checksum using a sha256 hash instead of a changeable tag. The important part is that you can have multiple pointers to the same image, and that it is not necessary to have any references to an image other than the image ID. When you delete a reference, docker will just delete that pointer, unless it was the last pointer to that image ID, in which case the image itself is deleted.

Registry: This is a server that holds images. Similar to how a Git server holds source code, or an artifact server for binaries, a registry is where you push and pull images to and from.

Repository: The path to a directory of images on a registry server is the repository. This includes the registry hostname and port if you aren't using the default Docker Hub registry. In an image reference, this repository is the part before the final colon and tag.

Tag: A specific image within a repository. If you do not specify a tag, docker will default to the tag name "latest". This is the part after the final colon, and is often used for a version number.


To take an example reference:

registry-server:5000/team/service-a:build-42
  • "registry-server:5000" is the registry server name (and port) where you would push/pull this image.

  • "registry-server:5000/team/service-a" is the repository.

  • "build-42" is the tag.

  • "registry-server:5000/team/service-a:build-42" is a reference.
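The breakdown above can be reproduced with plain shell parameter expansion. This is a simplified sketch, not Docker's real reference parser: split_ref is a hypothetical helper, digest references (name@sha256:...) are not handled, and the registry check (first component contains "." or ":" or is "localhost") is an assumption made for this example.

```shell
# split_ref REF (hypothetical helper): print the registry, repository and
# tag of REF on separate lines. Digest references are not handled.
split_ref() (
  ref=$1
  last=${ref##*/}                  # final path component
  case $last in
    *:*) tag=${last##*:}; repo=${ref%:*} ;;   # strip ":tag" after the final colon
    *)   tag=latest;      repo=$ref ;;        # no tag given, default applies
  esac
  first=${repo%%/*}                # candidate registry host
  registry='(default)'
  case $repo in
    */*) case $first in
           *.*|*:*|localhost) registry=$first ;;   # assumed host heuristic
         esac ;;
  esac
  printf 'registry:   %s\nrepository: %s\ntag:        %s\n' "$registry" "$repo" "$tag"
)

split_ref registry-server:5000/team/service-a:build-42
```

For the example reference this prints registry-server:5000 as the registry, registry-server:5000/team/service-a as the repository and build-42 as the tag. The port's colon is ignored when looking for the tag because only the part after the last slash is inspected.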

Unlike other systems where you push and pull to a server and then specify what files to send there, pushing and pulling docker images to and from a registry server defines the destination and source of the image using a reference that includes the repository and tag in that name. So to push images to a different location, you create a new reference (using the docker tag command) to the same image with the new repository and tag, and then run your push command against that reference.
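As a rough sketch of that tag-then-push workflow: the source reference team/service-a:build-42 and the helper retag_and_push are made up for illustration. The DOCKER variable defaults to "echo docker", so by default the script only prints the commands it would run and needs no registry or daemon.

```shell
# Dry run by default: print the commands instead of executing them.
# Set DOCKER=docker in the environment to run them for real.
DOCKER=${DOCKER:-echo docker}

# retag_and_push SOURCE_REF TARGET_REF (hypothetical helper)
retag_and_push() {
  # Create a second reference to the same image; no data is copied locally.
  $DOCKER tag "$1" "$2" &&
  # The registry and repository to push to are read from the reference itself.
  $DOCKER push "$2"
}

retag_and_push team/service-a:build-42 registry-server:5000/team/service-a:build-42
```

Note that the push command takes no separate destination argument; the new reference is the destination.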

Typically when someone refers to an "image name" they are referring to either a repository name (if you want to specify a tag separately) or a complete reference that you can use to pull or push an image.


how can I just list my repositories?

docker image ls --format '{{.Repository}}' | sort -u

I included the sort -u to de-dup the output since you may have multiple images with the same repository and different tags.
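To see what the de-dup does without a Docker engine at hand, here is a small sketch run on hard-coded sample data. The repository names are assumptions chosen to resemble the listings earlier in this thread, one line per image-tag combination.

```shell
# Sample repository column, one line per image-tag row (made-up data).
repos='docker/whalesay
hello-world
hello-world
hello-world'

# Same pipeline tail as above: each repository is listed once.
printf '%s\n' "$repos" | sort -u

# Variation: count how many rows (tags) each repository has.
printf '%s\n' "$repos" | sort | uniq -c
```

The first pipeline prints two lines, docker/whalesay and hello-world; the second shows a count of 1 and 3 respectively.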