CoreDNS failing due to a loop: how to feed kubelet with proper resolvConf?
This is where the investigation started: CoreDNS couldn't work for more than a couple of seconds, giving the following errors:
$ kubectl get pods --all-namespaces
NAMESPACE       NAME                                          READY   STATUS             RESTARTS      AGE
ingress-nginx   ingress-nginx-controller-8xcl9                1/1     Running            0             11h
ingress-nginx   ingress-nginx-controller-hwhvk                1/1     Running            0             11h
ingress-nginx   ingress-nginx-controller-xqdqx                1/1     Running            2 (10h ago)   11h
kube-system     calico-kube-controllers-684bcfdc59-cr7hr      1/1     Running            0             11h
kube-system     calico-node-62p58                             1/1     Running            2 (10h ago)   11h
kube-system     calico-node-btvdh                             1/1     Running            0             11h
kube-system     calico-node-q5bkr                             1/1     Running            0             11h
kube-system     coredns-8474476ff8-dnt6b                      0/1     CrashLoopBackOff   1 (3s ago)    5s
kube-system     coredns-8474476ff8-ftcbx                      0/1     Error              1 (2s ago)    5s
kube-system     dns-autoscaler-5ffdc7f89d-4tshm               1/1     Running            2 (10h ago)   11h
kube-system     kube-apiserver-hyzio                          1/1     Running            4 (10h ago)   11h
kube-system     kube-controller-manager-hyzio                 1/1     Running            4 (10h ago)   11h
kube-system     kube-proxy-2d8ls                              1/1     Running            0             11h
kube-system     kube-proxy-c6c4l                              1/1     Running            4 (10h ago)   11h
kube-system     kube-proxy-nzqdd                              1/1     Running            0             11h
kube-system     kube-scheduler-hyzio                          1/1     Running            5 (10h ago)   11h
kube-system     kubernetes-dashboard-548847967d-66dwz         1/1     Running            0             11h
kube-system     kubernetes-metrics-scraper-6d49f96c97-r6dz2   1/1     Running            0             11h
kube-system     nginx-proxy-dyzio                             1/1     Running            0             11h
kube-system     nginx-proxy-zyzio                             1/1     Running            0             11h
kube-system     nodelocaldns-g9wxh                            1/1     Running            0             11h
kube-system     nodelocaldns-j2qc9                            1/1     Running            4 (10h ago)   11h
kube-system     nodelocaldns-vk84j                            1/1     Running            0             11h
kube-system     registry-j5prk                                1/1     Running            0             11h
kube-system     registry-proxy-5wbhq                          1/1     Running            0             11h
kube-system     registry-proxy-77lqd                          1/1     Running            0             11h
kube-system     registry-proxy-s45p4                          1/1     Running            2 (10h ago)   11h
Running kubectl describe on that pod didn't bring much to the picture:
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  67s                default-scheduler  Successfully assigned kube-system/coredns-8474476ff8-dnt6b to zyzio
  Normal   Pulled     25s (x4 over 68s)  kubelet            Container image "k8s.gcr.io/coredns/coredns:v1.8.0" already present on machine
  Normal   Created    25s (x4 over 68s)  kubelet            Created container coredns
  Normal   Started    25s (x4 over 68s)  kubelet            Started container coredns
  Warning  BackOff    6s (x11 over 66s)  kubelet            Back-off restarting failed container
But viewing logs did:
$ kubectl logs coredns-8474476ff8-dnt6b -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 5b233a0166923d642fdbca0794b712ab
CoreDNS-1.8.0
linux/amd64, go1.15.3, 054c9ae
[FATAL] plugin/loop: Loop (127.0.0.1:49048 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 2906344495550081187.9117452939332601176."
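The loop itself is easy to explain. The default CoreDNS ConfigMap typically forwards everything it can't answer itself to whatever /etc/resolv.conf lists (the fragment below is illustrative, not my exact Corefile). Since kubelet hands pods the node's resolv.conf, inside the CoreDNS pod that upstream is 127.0.0.53, i.e. the pod's own loopback, so queries go straight back into CoreDNS:
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa
    forward . /etc/resolv.conf   # inherited from the node -> 127.0.0.53 -> back into CoreDNS itself
    cache 30
    loop
    reload
}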
It's great that the troubleshooting documentation was linked! I started browsing that page and discovered that, indeed, my /etc/resolv.conf contained the problematic local IP nameserver 127.0.0.53.
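For context, this is roughly what the stub file looked like (contents illustrative for a systemd-resolved host, not a verbatim copy):
$ cat /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad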
I also found the real DNS IPs in /run/systemd/resolve/resolv.conf, but the question now is how to perform the action described in the troubleshooting documentation, which says:
Add the following to your kubelet config yaml: resolvConf: (or via command line flag --resolv-conf deprecated in 1.10). Your “real” resolv.conf is the one that contains the actual IPs of your upstream servers, and no local/loopback address. This flag tells kubelet to pass an alternate resolv.conf to Pods. For systems using systemd-resolved, /run/systemd/resolve/resolv.conf is typically the location of the “real” resolv.conf, although this can be different depending on your distribution.
So, the questions are:
- how to find, or where to create, the mentioned kubelet config yaml,
- at what level should I specify the resolvConf value, and
- can it accept multiple values? I have two nameservers defined. Should they be given as separate entries or as an array?
Solution 1:
/etc/resolv.conf is located on each of your nodes. You can edit it by SSHing into the node.
Then you have to restart kubelet for the changes to take effect:
sudo systemctl restart kubelet
(If that does not work, restart your nodes with sudo reboot.)
The /home/kubernetes/kubelet-config.yaml file (also located on each of your nodes) contains kubelet's config. You can create a new resolv.conf file and point to it with the resolvConf field:
apiVersion: kubelet.config.k8s.io/v1beta1
...
kind: KubeletConfiguration
...
resolvConf: <location of the file>
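As a worked example, here is a minimal sketch of the whole change on one node. It assumes a systemd-resolved host and that kubelet reads its config from /var/lib/kubelet/config.yaml; both that path and the CoreDNS deployment name vary between installers, so first check the kubelet unit (for example with systemctl cat kubelet or ps aux | grep kubelet) for the actual --config location:
# On the node: make kubelet hand pods the resolv.conf written by systemd-resolved.
# Edit the kubelet config (path assumed, verify it first) and set:
#   resolvConf: /run/systemd/resolve/resolv.conf
sudo vi /var/lib/kubelet/config.yaml
sudo systemctl restart kubelet

# From a machine with kubectl access: recreate the CoreDNS pods so the new
# pods are created with the corrected resolv.conf
kubectl -n kube-system rollout restart deployment coredns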
Important: the new configuration will only be applied to pods created after the update. It's highly recommended to drain your node before changing the configuration.
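Draining and uncordoning are standard kubectl operations; a short example for the node from the events above (zyzio is just the node name from my cluster, and you may need extra flags such as --delete-emptydir-data depending on your workloads):
kubectl drain zyzio --ignore-daemonsets
# ...change the kubelet config and restart kubelet on that node...
kubectl uncordon zyzio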
can it accept multiple values? I have two nameservers defined. Should they be given as separate entries or an array?
The Kubelet Configuration documentation states that resolvConf is of type string, so probably only a single value is accepted.
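That said, resolvConf only names a file, and the file itself can list several nameservers, so two upstreams are not a problem. A sketch with placeholder IPs:
# /run/systemd/resolve/resolv.conf (or whatever file resolvConf points to)
nameserver 192.168.1.10
nameserver 192.168.1.11
search example.internal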