Kubernetes - How to debug Failed Scheduling "0 nodes are available"
I often find myself trying to spin up a new pod, only to get an error saying that no node is available. Something like:
0/9 nodes are available: 1 node(s) had no available volume zone, 8 node(s) didn't match node selector.
I'm always at a loss when I get those messages. How am I supposed to debug that?
Solution 1:
To begin with, my advice is to take a look at the Kubernetes Scheduler component:
Component on the master that watches newly created pods that have no node assigned, and selects a node for them to run on. Factors taken into account for scheduling decisions include individual and collective resource requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference and deadlines.
The scheduler watches for newly created Pods that have no Node assigned. For every Pod it discovers, the scheduler becomes responsible for finding the best Node for that Pod to run on. However, every container in a Pod has its own resource requirements, and every Pod has its own constraints as well, so the existing nodes need to be filtered according to the specific scheduling requirements.
As per documentation:
In a cluster, Nodes that meet the scheduling requirements for a Pod are called feasible nodes. If none of the nodes are suitable, the pod remains unscheduled until the scheduler is able to place it.
kube-scheduler selects a node for the Pod in a 2-step operation, based on its default policies:
- Filtering
- Scoring
Looking into those two steps gives you more insight into where the decision was made. For example, at the scoring stage the CalculateAntiAffinityPriorityMap policy helps implement pod anti-affinity.
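To see how the filtering step can leave a Pod unschedulable, here is a minimal sketch (the Pod name, image and request values are my own assumptions, not taken from the question): if no node has this much CPU free, the Pod stays Pending and the scheduler reports a message similar to the one above, e.g. Insufficient cpu.

```yaml
# Hypothetical Pod whose resource requests may fail the filtering step.
# If no node can satisfy them, kube-scheduler reports something like
# "0/9 nodes are available: 9 Insufficient cpu."
apiVersion: v1
kind: Pod
metadata:
  name: big-cpu-demo        # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25       # placeholder image
    resources:
      requests:
        cpu: "64"           # deliberately larger than a typical node
        memory: "4Gi"
```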
Below you can find a quick review based on Influencing Kubernetes Scheduler Decisions:
- Node name: by adding a node’s hostname to the .spec.nodeName parameter of the Pod definition, you force this Pod to run on that specific node. Any selection algorithm used by the scheduler is ignored. This method is the least recommended.
- Node selector: by placing meaningful labels on your nodes, a Pod can use the nodeSelector parameter to specify one or more key-value label maps that must exist on the target node for it to be selected to run that Pod. This approach is more recommended because it adds a lot of flexibility and establishes a loosely-coupled node-pod relationship (see the nodeSelector sketch after this list).
- Node affinity: this method adds even more flexibility when choosing which node should be considered for scheduling a particular Pod. Using Node Affinity, a Pod may strictly require to be scheduled on nodes with specific labels. It may also express some degree of preference towards particular nodes by influencing the scheduler to give them more weight.
- Pod affinity and anti-affinity: when Pod coexistence (or non-coexistence) with other Pods on the same node is essential, you can use this method. Pod affinity allows a Pod to require that it gets deployed on nodes that have Pods with specific labels running. Similarly, a Pod may force the scheduler not to place it on nodes having Pods with particular labels.
- Taints and tolerations: in this method, instead of deciding which nodes the Pod gets scheduled to, you decide which nodes should not accept any Pods at all, or only selected Pods. By tainting a node, you’re instructing the scheduler not to consider this node for any Pod placement unless the Pod tolerates the taint. The toleration consists of a key, a value, and the effect of the taint. Using an operator, you can decide whether the entire taint must match the toleration for a successful Pod placement or only a subset of the data must match (see the toleration sketch after this list).
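Since the error in the question mentions node(s) didn't match node selector, here is a minimal nodeSelector sketch (the label key/value, Pod name and image are assumptions): the Pod below is only feasible on nodes that carry the matching label, which you would first add with kubectl label nodes <node-name> disktype=ssd, and you can check which nodes would match with kubectl get nodes --show-labels.

```yaml
# Hypothetical example: this Pod is only feasible on nodes labeled disktype=ssd.
# If no node carries that label, the scheduler reports
# "node(s) didn't match node selector" and the Pod stays Pending.
apiVersion: v1
kind: Pod
metadata:
  name: nodeselector-demo    # hypothetical name
spec:
  nodeSelector:
    disktype: ssd            # must exist as a label on the target node
  containers:
  - name: app
    image: nginx:1.25        # placeholder image
```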
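And a short sketch of taints and tolerations (the taint key/value and effect are assumptions): after a node has been tainted with kubectl taint nodes <node-name> dedicated=gpu:NoSchedule, only Pods that tolerate that taint can be placed on it.

```yaml
# Hypothetical example: this Pod tolerates the taint dedicated=gpu:NoSchedule,
# so the scheduler may place it on a node carrying that taint; Pods without
# the toleration are filtered out on such nodes.
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo   # hypothetical name
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"     # value must match exactly; "Exists" would ignore the value
    value: "gpu"
    effect: "NoSchedule"
  containers:
  - name: app
    image: nginx:1.25     # placeholder image
```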
As per the k8s documentation:
1. NodeName is the simplest form of node selection constraint, but due to its limitations it is typically not used. Some of the limitations of using nodeName to select nodes are:
- If the named node does not exist, the pod will not be run, and in some cases may be automatically deleted.
- If the named node does not have the resources to accommodate the pod, the pod will fail and its reason will indicate why, e.g. OutOfmemory or OutOfcpu.
- Node names in cloud environments are not always predictable or stable.
2. The affinity/anti-affinity feature greatly expands the types of constraints you can express. The key enhancements are:
- the language is more expressive (not just “AND of exact match”)
- you can indicate that the rule is “soft”/“preference” rather than a hard requirement, so if the scheduler can’t satisfy it, the pod will still be scheduled
- you can constrain against labels on other pods running on the node (or other topological domain), rather than against labels on the node itself, which allows rules about which pods can and cannot be co-located
The affinity feature consists of two types of affinity: node affinity and inter-pod affinity/anti-affinity. Node affinity is like the existing nodeSelector (but with the first two benefits listed above).
There are currently two types of pod affinity and anti-affinity, called requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution, which denote “hard” vs. “soft” requirements.
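To make the “hard” vs. “soft” distinction concrete, here is a minimal sketch (the zone value, labels and topology key are assumptions): the required node affinity rule must be satisfied for the Pod to be scheduled at all, the preferred rule only adds weight during scoring, and the pod anti-affinity term keeps this Pod off nodes already running Pods labeled app=web.

```yaml
# Hypothetical example of node affinity (hard + soft rules) and pod anti-affinity.
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo              # hypothetical name
  labels:
    app: web
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: only nodes with this zone label are feasible.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["eu-west-1a"]  # assumed zone name
      # Soft preference: nodes with disktype=ssd score higher but are not mandatory.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
    podAntiAffinity:
      # Hard rule: do not co-locate with other Pods labeled app=web on the same node.
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values: ["web"]
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: nginx:1.25               # placeholder image
```

Note that topologyKey: kubernetes.io/hostname makes the anti-affinity rule apply per node; using a zone label there instead would spread the Pods across zones.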
Hope this helps.
Additional resources:
Affinity and anti-affinity
Scheduler Performance Tuning
Making Sense of Taints and Tolerations in Kubernetes
Kubernetes Taints and Tolerations PreferNoSchedule