How to debug when Kubernetes nodes are in 'Not Ready' state
I initialized the master node and add 2 worker nodes, but only master and one of the worker node show up when I run the following command:
kubectl get nodes
also, both these nodes are in 'Not Ready' state. What are the steps should I take to understand what the problem could be?
- I can ping all the nodes from each of the other nodes.
- The version of Kubernetes is 1.8.
- OS is Cent OS 7
-
I used the following repo to install Kubernetes:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo [kubernetes] name=Kubernetes baseurl=http://yum.kubernetes.io/repos/kubernetes-el7-x86_64 enabled=1 gpgcheck=0 repo_gpgcheck=0 EOF yum install kubelet kubeadm kubectl kubernetes-cni
First, describe nodes and see if it reports anything:
$ kubectl describe nodes
Look for conditions, capacity and allocatable:
Conditions:
Type Status
---- ------
OutOfDisk False
MemoryPressure False
DiskPressure False
Ready True
Capacity:
cpu: 2
memory: 2052588Ki
pods: 110
Allocatable:
cpu: 2
memory: 1950188Ki
pods: 110
If everything is alright here, SSH into the node and observe kubelet
logs to see if it reports anything. Like certificate erros, authentication errors etc.
If kubelet
is running as a systemd service, you can use
$ journalctl -u kubelet
Steps to debug:-
In case you face any issue in kubernetes, first step is to check if kubernetes self applications are running fine or not.
Command to check:- kubectl get pods -n kube-system
If you see any pod is crashing, check it's logs
if getting NotReady
state error, verify network pod logs.
if not able to resolve with above, follow below steps:-
-
kubectl get nodes
# Check which node is not in ready state -
kubectl describe node nodename
#nodename which is not in readystate -
ssh to that node
-
execute
systemctl status kubelet
# Make sure kubelet is running -
systemctl status docker
# Make sure docker service is running -
journalctl -u kubelet
# To Check logs in depth
Most probably you will get to know about error here, After fixing it reset kubelet with below commands:-
systemctl daemon-reload
systemctl restart kubelet
In case you still didn't get the root cause, check below things:-
-
Make sure your node has enough space and memory. Check for
/var
directory space especially. command to check:-df
-kh
,free -m
-
Verify cpu utilization with top command. and make sure any process is not taking an unexpected memory.
I was having similar issue because of a different reason:
Error:
cord@node1:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready master 17h v1.13.5
node2 Ready <none> 17h v1.13.5
node3 NotReady <none> 9m48s v1.13.5
cord@node1:~$ kubectl describe node node3
Name: node3
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
Ready False Thu, 18 Apr 2019 01:15:46 -0400 Thu, 18 Apr 2019 01:03:48 -0400 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
InternalIP: 192.168.2.6
Hostname: node3
cord@node3:~$ journalctl -u kubelet
Apr 18 01:24:50 node3 kubelet[54132]: W0418 01:24:50.649047 54132 cni.go:149] Error loading CNI config list file /etc/cni/net.d/10-calico.conflist: error parsing configuration list: no 'plugins' key
Apr 18 01:24:50 node3 kubelet[54132]: W0418 01:24:50.649086 54132 cni.go:203] Unable to update cni config: No valid networks found in /etc/cni/net.d
Apr 18 01:24:50 node3 kubelet[54132]: E0418 01:24:50.649402 54132 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 18 01:24:55 node3 kubelet[54132]: W0418 01:24:55.650816 54132 cni.go:149] Error loading CNI config list file /etc/cni/net.d/10-calico.conflist: error parsing configuration list: no 'plugins' key
Apr 18 01:24:55 node3 kubelet[54132]: W0418 01:24:55.650845 54132 cni.go:203] Unable to update cni config: No valid networks found in /etc/cni/net.d
Apr 18 01:24:55 node3 kubelet[54132]: E0418 01:24:55.651056 54132 kubelet.go:2192] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 18 01:24:57 node3 kubelet[54132]: I0418 01:24:57.248519 54132 setters.go:72] Using node IP: "192.168.2.6"
Issue:
My file: 10-calico.conflist was incorrect. Verified it from a different node and from sample file in the same directory "calico.conflist.template".
Resolution:
Changing the file, "10-calico.conflist" and restarting the service using "systemctl restart kubelet", resolved my issue:
NAME STATUS ROLES AGE VERSION
node1 Ready master 18h v1.13.5
node2 Ready <none> 18h v1.13.5
node3 Ready <none> 48m v1.13.5