Docker daemon ignores daemon.json on boot
My Docker daemon seems to ignore /etc/docker/daemon.json on boot.
Similar to this question, I'm having trouble telling the Docker daemon that it should not use the default 172.17.* range. That range is already claimed by our VPN, so people connected through that VPN cannot reach the server Docker runs on.
The hugely annoying thing is that every time I reboot my server, Docker claims an IP from the VPN's range again, regardless of what I put in /etc/docker/daemon.json. I have to manually issue
# systemctl restart docker
directly after boot before people on the 172.17.* network can reach the server again.
This obviously gets forgotten quite often and leads to many problem tickets.
My /etc/docker/daemon.json looks like this:
{
  "default-address-pools": [
    {
      "base": "172.20.0.0/16",
      "size": 24
    }
  ]
}
and is permissioned like so:
-rw-r--r-- 1 root root 123 Dec 8 10:43 daemon.json
I have no idea how to even start diagnosing this problem; any ideas?
For completeness:
- Ubuntu 18.04.5 LTS
- Docker version 19.03.6, build 369ce74a3c
EDIT: output of systemctl cat docker:
# /lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket
Wants=containerd.service
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3
# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target
Output of sudo docker info (after systemctl restart docker):
Client:
Debug Mode: false
Server:
Containers: 34
Running: 19
Paused: 0
Stopped: 15
Images: 589
Server Version: 19.03.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version:
runc version:
init version:
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.15.0-140-generic
Operating System: Ubuntu 18.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 47.16GiB
Name: linuxsrv
ID: <redacted>
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: <redacted>
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Registry Mirrors:
http://172.16.30.33:6000/
Live Restore Enabled: false
WARNING: No swap limit support
Solution 1:
Docker uses several address pools. The default-address-pools setting applies to all new user-created bridge networks. You may need to delete and recreate those networks after changing this setting.
There's also bip, set in the daemon.json file with a line like:
"bip": "192.168.63.1/24"
The bip setting applies to the default bridge network named bridge, and it needs to be set to the gateway address on that bridge network in CIDR notation (so you can't set it to 192.168.63.0/24; the trailing .1 is important).
And if you are using swarm mode, overlay networks have their own address pools shared across nodes in the overlay network. Those need to be configured during docker swarm init with the --default-addr-pool flag.
Lastly, if you are running Docker via snap, the location of this file is /var/snap/docker/current/etc/docker/daemon.json, and it doesn't appear to be preserved across updates, so you'll need to replace this file again after an update.
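To make the two settings concrete, here is a minimal sketch of a daemon.json that sets both bip and default-address-pools, reusing the addresses mentioned in this thread. The scratch file and the variable name candidate are just for illustration, and the commented-out apply steps assume the non-snap install location /etc/docker/daemon.json.

```shell
# Write the candidate config to a scratch file first (illustrative),
# rather than editing /etc/docker/daemon.json in place.
candidate=$(mktemp)
cat > "$candidate" <<'EOF'
{
  "bip": "192.168.63.1/24",
  "default-address-pools": [
    { "base": "172.20.0.0/16", "size": 24 }
  ]
}
EOF

# Show what would be installed.
cat "$candidate"

# To apply it (needs root; assumes the non-snap config location):
#   sudo install -m 644 "$candidate" /etc/docker/daemon.json
#   sudo systemctl restart docker
# Then check which address docker0 actually got:
#   ip -4 -o addr show docker0
```

If docker0 still shows a 172.17.* address after this, the daemon that is running is probably not reading this file.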
Solution 2:
Although I thought I had resolved the problem using BMitch's answer, I was wrong: the docker0 address was still in the wrong 172.17.*.* range after boot.
After a lot more digging, it turned out that, somehow, I had multiple versions of dockerd installed:
- the one you get if you install as per the docs
- ...the one installed via Snap 🤦‍♂️
Apparently, the one from Snap was the one started at boot, while the other one was the one started by running sudo systemctl restart docker.
Uninstalling & purging the one from Snap that escaped (...evaded?) my attention finally solved this pesky problem.
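For anyone wanting to check for the same situation, here is a rough sketch that lists every running dockerd together with the binary it was started from. The helper name dockerd_origin is made up for this example, and treating a /snap/ path prefix as "installed via snap" is an assumption about where snap mounts its binaries; run it as root right after boot so /proc/<pid>/exe is readable.

```shell
# Classify a dockerd binary path (hypothetical helper).
# Assumption: snap-installed binaries live under /snap/.
dockerd_origin() {
  case "$1" in
    /snap/*) echo snap ;;
    *)       echo package ;;
  esac
}

# Print PID, executable path, and guessed origin for each running dockerd.
for pid in $(pgrep -x dockerd 2>/dev/null); do
  exe=$(readlink -f "/proc/$pid/exe" 2>/dev/null) || continue
  printf '%s\t%s\t%s\n' "$pid" "$exe" "$(dockerd_origin "$exe")"
done

# Other things worth checking by hand:
#   snap list docker                   # is a docker snap installed at all?
#   sudo snap remove --purge docker    # remove it, if it is the unwanted copy
```

Comparing the output right after boot with the output after sudo systemctl restart docker should show whether two different binaries are involved.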