Update Docker container without downtime
The ideal target scenario
Yes, you should use a load balancer and update one instance at a time. I'm not sure where inter-container communication comes in.
As an example, imagine you have a load balancer that serves your site, A. Users connect only to the load balancer and know it only as "A". The load balancer knows that there are two or more backends (B, C, etc.); whether they're VMs or containers doesn't matter.
Now suppose you want to upgrade the backends, which in this case are Apache instances. Starting with B:
- take B out of the eligible backends for the load balancer so it's no longer accepting any traffic.
- wait for the currently-live requests to be served and existing connections closed.
- update the container or underlying VM that serves B
- restart B, wait for it to load and start working
- test B to make sure it's serving new requests properly
- add B back to the load balancer backend pool to re-enable traffic
Then, do the same process for C, D, etc.
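The rolling procedure above can be sketched as a small loop. This is only an illustration under stated assumptions: `LoadBalancer` and `Backend` are hypothetical in-memory stand-ins, not the API of any real load balancer, and the `drain`, `upgrade`, and `is_healthy` callbacks are placeholders for whatever your own tooling does at each step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Backend:
    name: str
    version: str


@dataclass
class LoadBalancer:
    """Hypothetical in-memory stand-in for a real load balancer's admin API."""
    pool: List[Backend] = field(default_factory=list)

    def remove(self, backend: Backend) -> None:
        # Stop routing new traffic to this backend.
        self.pool = [b for b in self.pool if b.name != backend.name]

    def add(self, backend: Backend) -> None:
        # Make the backend eligible for traffic again (idempotent).
        if all(b.name != backend.name for b in self.pool):
            self.pool.append(backend)


def rolling_update(
    lb: LoadBalancer,
    new_version: str,
    drain: Callable[[Backend], None],
    upgrade: Callable[[Backend, str], None],
    is_healthy: Callable[[Backend], bool],
) -> None:
    """Upgrade every backend in the pool, one at a time."""
    if len(lb.pool) < 2:
        raise ValueError("need at least two backends to avoid downtime")
    for backend in list(lb.pool):  # copy: the pool changes while we iterate
        lb.remove(backend)             # take it out of the eligible backends
        drain(backend)                 # wait for in-flight requests to finish
        upgrade(backend, new_version)  # update and restart the container or VM
        if not is_healthy(backend):    # test before re-enabling traffic
            raise RuntimeError(
                f"{backend.name} failed its check; left out of the pool"
            )
        lb.add(backend)                # back into the pool
```

A backend that fails its check stays out of the pool and the loop stops, so the remaining backends keep serving the old version while you investigate.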
Note that there's an open request for in-place upgrades of Docker containers, from Nov 2013, but it doesn't appear to have made much progress, so the above approach is what you should use in the meantime.
What to do for an existing live site
Presumably, you're asking this because you're already running a live site in this model and you would like to upgrade it without downtime. So, we need to get to the ideal target state above, but incrementally.
Let's assume that:
- you have a DNS name pointing to your container
- your container runs on some IP address
- your users don't know the container's IP address and it's not hard-coded anywhere
If any of these assumptions is false, fix that first.
Then, follow these steps:
- create a load balancer at a new IP and point it at the existing container as its only backend
- change DNS to point to the load balancer rather than the container IP directly
- add an identical Apache backend with the same VM + container setup
- now you have a load balancer with two backends, B and C, so follow the directions in the "ideal target scenario" section to upgrade them one at a time
How to update a load balancer
The easy (hosted) way
The easiest option is to not run your own balancer. For example, if you're using a cloud platform that provides load balancing as a service, consider using it; maintaining and updating the load balancer is then no longer your concern.
The manual way
If you are running your own load balancer, adding another layer of indirection (i.e., DNS) will help. Let's assume the following:
- that we have a host name resolving to the IP of our load balancer A which we would like to update
- our load balancer has a backend pool of P1, P2, etc.
We proceed as follows:
- create a new load balancer B with the new software version
- add all backend pool instances P1, P2, etc. to our new load balancer B as backends
- add B's IP address to the DNS resolution along with A
  - now we're effectively using DNS as a load balancer
  - if the entries for A and B are unweighted, they're effectively 50-50
- now watch to see how B performs, whether there are any errors, etc.
- if anything is wrong with B, undo as follows:
  - remove B from the DNS config
  - wait for the B entry in the DNS to disappear (i.e., wait for TTL to expire)
  - turn down B
- assume you've done the "burn-in" test for B and everything is fine
- update the priority and weight for B in DNS gradually
- remove A from DNS entirely
- wait for DNS TTL to expire; A should not be getting any requests anymore
- turn down A
and you're done.
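As a rough sketch of the cutover above, the following plans the sequence of DNS changes, including the rollback path. Everything here is an assumption for illustration: the weights are abstract percentages (how they map to real records depends on your DNS provider or record type), `b_is_healthy` is a hypothetical callback standing in for your monitoring, and the function only returns a list of actions rather than talking to any DNS API.

```python
from typing import Callable, List, Tuple


def weight_schedule(steps: int) -> List[Tuple[int, int]]:
    """Return (weight_a, weight_b) pairs going from 50/50 to 0/100."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [
        (50 - (50 * i) // steps, 50 + (50 * i) // steps)
        for i in range(steps + 1)
    ]


def plan_cutover(steps: int, b_is_healthy: Callable[[int], bool]) -> List[str]:
    """List the DNS actions for moving traffic from balancer A to balancer B.

    b_is_healthy receives B's current weight and reports whether B looked
    fine while serving that share of traffic.
    """
    actions: List[str] = []
    for weight_a, weight_b in weight_schedule(steps):
        actions.append(f"set weights A={weight_a} B={weight_b}")
        if not b_is_healthy(weight_b):
            # Roll back: A keeps serving while B is withdrawn.
            actions += [
                "remove B from DNS",
                "wait for DNS TTL to expire",
                "turn down B",
            ]
            return actions
    # B now carries all the traffic, so A can be retired.
    actions += [
        "remove A from DNS",
        "wait for DNS TTL to expire",
        "turn down A",
    ]
    return actions
```

The first step is the 50-50 state described above; a larger `steps` value gives a gentler ramp at the cost of a longer rollout, since each change takes up to one TTL to be seen by clients.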
Details, diagrams, and tooling
These write-ups and tools can help you automate the process; the general idea is the same:
- Quay: Zero Downtime deployments
- Zero Downtime Frontend Deploys with Vulcand on CoreOS
The Moral
"All problems in computer science can be solved by another level of indirection, except of course for the problem of too many indirections." — David Wheeler