What is HEALTHCHECK really used for when running Docker in swarm mode?

I'm having a hard time figuring out what HEALTHCHECK really is used for when running Docker in swarm mode.

One place suggests that Docker will restart a task which is considered unhealthy. Another place explains that Docker will stop sending traffic to tasks that are unhealthy. The Docker documentation itself only explains what the HEALTHCHECK directive is, and how to configure it. It makes no attempt to explain what happens when a task goes unhealthy.

In other words, I'm struggling to find a clear and trustworthy explanation of what HEALTHCHECK really does.

Furthermore, looking at the Docker REST API, this particular piece of data (is a task healthy or not) is not even exposed for tasks (it is exposed for containers though). This makes it hard to use this metric for monitoring a Docker Swarm, so it doesn't seem to me that this is the primary purpose of the metric either.

What really happens when a task becomes unhealthy when running Docker in swarm mode?


You set up healthchecks in the same ways your first link suggests. Each of those ways tells Docker what command to run, how often to run it, how long to wait before timing out, and how many failures to tolerate.
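
As a rough illustration of the Dockerfile variant, here is a minimal sketch. The nginx:alpine image, the probed URL, and the timing values are assumptions picked for the example, not anything from your setup:

```dockerfile
# Hypothetical example image; substitute your own.
FROM nginx:alpine

# Probe the local web server every 30s, give each probe 3s to answer,
# allow 5s of startup time, and mark the container unhealthy after
# 3 consecutive failures. A non-zero exit code from CMD counts as a failure.
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget -q --spider http://127.0.0.1/ || exit 1
```

The same settings can also be given at run time or in a stack file, which overrides whatever the image defines.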

If you use docker run to start a container, the status in docker ps will show unhealthy when healthchecks fail, but Docker will do nothing to the container. It's up to you or some higher-level monitoring solution to act on it.

If you use docker service create (or docker stack deploy) to create a Swarm service and that healthcheck fails, Swarm will stop/kill the task (container) and schedule a new task to replace that replica of the service. It first tries to stop the container gracefully, then kills it after 10s by default, like all Docker containers. During the stop/kill, Swarm stops routing inbound overlay traffic to that task, as it does for all stopping tasks.
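
For the Swarm case, a stack file along these lines is one way to attach a healthcheck to a service. This is only a sketch: the service name web, the image, the replica count, and the timings are all made up for illustration.

```yaml
# Hypothetical stack file, deployed with: docker stack deploy -c <file> <stack-name>
version: "3.8"
services:
  web:
    image: nginx:alpine
    healthcheck:
      # Exec-form probe; a non-zero exit code counts as a failed check.
      test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1/"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 5s
    deploy:
      # Swarm keeps this many healthy tasks running, replacing any
      # task whose container is reported unhealthy.
      replicas: 2
```

With a setup like this, a task that fails three probes in a row should be shut down and replaced, which you can observe in the task history shown by docker service ps.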