Kubernetes failing to start: failed to build map of initial containers
Starting today, k3s is failing to start with the following error: "Failed to start ContainerManager" err="failed to build map of initial containers from runtime: no PodsandBox found with Id '9f141a500138e081ae1a641d7d4c00c3029ecce87da6e2fc80f4a14bd0a965fd'"
After this log line, it crashes.
I can't find anything on the internet, so does anyone here have an idea how to solve this?
I'm running k3s version v1.21.5+k3s2 (724ef700).
Let me know if I need to provide additional details.
Log:
...
I1021 12:04:55.508161 78816 kuberuntime_manager.go:222] "Container runtime initialized" containerRuntime="containerd" version="v1.4.11-k3s1" apiVersion="v1alpha2"
I1021 12:04:55.508361 78816 server.go:1191] "Started kubelet"
E1021 12:04:55.509247 78816 cri_stats_provider.go:369] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs"
E1021 12:04:55.509273 78816 kubelet.go:1306] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
I1021 12:04:55.509255 78816 server.go:149] "Starting to listen" address="0.0.0.0" port=10250
I1021 12:04:55.509887 78816 server.go:409] "Adding debug handlers to kubelet server"
I1021 12:04:55.510952 78816 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
I1021 12:04:55.512769 78816 scope.go:111] "RemoveContainer" containerID="e5ce1c151a24558e69f544794a15bb6d1238139439a0c6174acf720a4f531a7c"
I1021 12:04:55.512865 78816 volume_manager.go:271] "Starting Kubelet Volume Manager"
I1021 12:04:55.512923 78816 desired_state_of_world_populator.go:141] "Desired state populator starts to run"
INFO[2021-10-21T12:04:55.516702675+02:00] RemoveContainer for "e5ce1c151a24558e69f544794a15bb6d1238139439a0c6174acf720a4f531a7c"
DEBU[2021-10-21T12:04:55.527595023+02:00] openat2 not available, falling back to securejoin
I1021 12:04:55.538886 78816 controller.go:611] quota admission added evaluator for: leases.coordination.k8s.io
I1021 12:04:55.545188 78816 kubelet_network_linux.go:56] "Initialized protocol iptables rules." protocol=IPv4
I1021 12:04:55.561242 78816 kubelet_network_linux.go:56] "Initialized protocol iptables rules." protocol=IPv6
I1021 12:04:55.561266 78816 status_manager.go:157] "Starting to sync pod status with apiserver"
I1021 12:04:55.561282 78816 kubelet.go:1846] "Starting kubelet main sync loop"
E1021 12:04:55.561318 78816 kubelet.go:1870] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
I1021 12:04:55.571567 78816 shared_informer.go:247] Caches are synced for endpoint slice config
I1021 12:04:55.571570 78816 shared_informer.go:247] Caches are synced for service config
INFO[2021-10-21T12:04:55.604476442+02:00] RemoveContainer for "e5ce1c151a24558e69f544794a15bb6d1238139439a0c6174acf720a4f531a7c" returns successfully
I1021 12:04:55.604584 78816 scope.go:111] "RemoveContainer" containerID="4d7578dd7f7574fd5deeae1ed53cf67d0a2fe64aa1d1214b1ba865622c05b4cd"
INFO[2021-10-21T12:04:55.604877204+02:00] labels have been set successfully on node: <node name>
INFO[2021-10-21T12:04:55.604936435+02:00] RemoveContainer for "4d7578dd7f7574fd5deeae1ed53cf67d0a2fe64aa1d1214b1ba865622c05b4cd"
I1021 12:04:55.612875 78816 kuberuntime_manager.go:1044] "Updating runtime config through cri with podcidr" CIDR="10.42.0.0/24"
INFO[2021-10-21T12:04:55.612967745+02:00] No cni config template is specified, wait for other system components to drop the config.
I1021 12:04:55.613044 78816 kubelet_network.go:76] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="10.42.0.0/24"
I1021 12:04:55.623215 78816 kubelet_node_status.go:71] "Attempting to register node" node="<node name>"
E1021 12:04:55.645403 78816 kubelet.go:1384] "Failed to start ContainerManager" err="failed to build map of initial containers from runtime: no PodsandBox found with Id '9f141a500138e081ae1a641d7d4c00c3029ecce87da6e2fc80f4a14bd0a965fd'"
With the help of https://github.com/kubernetes/kubelet/issues/21 I finally figured it out.
First I started containerd manually with the following command (which I found in the k3s logs):
containerd -c /var/lib/rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /var/lib/rancher/k3s/agent/containerd
With containerd running, I could search for the container that references the missing sandbox using crictl:
k3s crictl ps -a | grep 9f141a
This gave me a container ID. I then removed that container with:
k3s crictl rm <id>
After that I restarted k3s, and now it's working again.
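
For anyone who wants to script the lookup, here is a minimal sketch of the search step as a shell helper. The names SANDBOX_PREFIX and find_stale_containers are made up for illustration, and the helper assumes the container ID is the first column of the `crictl ps -a` output. It only prints IDs; the removal and restart are left as commented commands so nothing is deleted by accident.

```shell
#!/bin/sh
# Illustrative helper, not part of k3s or crictl.
# SANDBOX_PREFIX: first characters of the sandbox ID from the kubelet error.
SANDBOX_PREFIX="9f141a"

# Read `crictl ps -a` output on stdin and print the container ID
# (assumed to be the first column) of every line mentioning the prefix.
find_stale_containers() {
    grep -- "$1" | awk '{print $1}'
}

# Usage, with containerd started manually as described above:
#   k3s crictl ps -a | find_stale_containers "$SANDBOX_PREFIX"
#   k3s crictl rm <id>        # run for each ID printed
# Then restart k3s the way you normally do, for example
# `systemctl restart k3s` if it runs as a systemd service (assumption).
```

Check the printed IDs by hand before removing anything, since a short prefix could match more than one line.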