Kubernetes node pool will not autoscale to 0 nodes

Solution 1:

On GKE 1.18, my experiments show that I had to add a node taint in order for the node pool to be able to shrink to zero:

$ gcloud container node-pools create ... \
      --enable-autoscaling \
      --min-nodes 0 \
      --max-nodes 2 \
      --node-taints=...  # Without a taint, my node pool somehow won't scale down to zero.
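
The taint keeps system Pods (which do not tolerate it) off the pool, so only workloads that explicitly tolerate it land there and the pool can drain to zero. A minimal sketch of such a workload, assuming the taint was created as sandbox=true:NoSchedule and the pool is named scale-to-zero-pool (both are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker          # hypothetical workload
spec:
  replicas: 1
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      # Pin the Pods to the autoscaled pool via the GKE-managed node label.
      nodeSelector:
        cloud.google.com/gke-nodepool: scale-to-zero-pool
      # Tolerate the taint so the Pods are allowed to schedule there.
      tolerations:
        - key: sandbox
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: worker
          image: busybox
          command: ["sleep", "3600"]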

Solution 2:

The autoscaler will not reduce your whole cluster to 0 nodes.

Note: If you specify a minimum of zero nodes, an idle node pool can scale down completely. However, at least one node must always be available in the cluster to run system Pods.

-- Google Cloud: Kubernetes engine cluster autoscaler

However, cluster autoscaler cannot completely scale down to zero a whole cluster. At least one node must always be available in the cluster to run system pods. So you need to keep at least one node. But this doesn’t mean you need to keep one expensive node running idle.

-- Medium.com: Scale your kubernetes cluster to almost zero with gke autoscaler
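
In practice that usually means keeping one small, inexpensive node pool for the system Pods while the workload pool scales between 0 and N. A sketch of such a pool (the pool name, machine type and cluster name are placeholders):

$ gcloud container node-pools create system-pool \
      --cluster CLUSTER_NAME \
      --machine-type e2-small \
      --num-nodes 1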

You can explicitly reduce your node pool to zero (0) with the command:

$ gcloud container clusters resize CLUSTER_NAME --node-pool NAME_OF_THE_POOL --num-nodes 0

But be aware that this approach has a drawback.

Imagine a situation where:

  • You scale the cluster down to zero nodes with the command above
  • You then create a workload on the cluster that has zero nodes

The autoscaler will not be able to increase the number of nodes from zero, because it has no way to tell that additional resources are required. The kube-system Pods that were running on those nodes were essential for determining whether another node is needed.
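
If you end up in that state, the cluster will not recover on its own; you have to resize the pool back up manually before the autoscaler can take over again, e.g.:

$ gcloud container clusters resize CLUSTER_NAME --node-pool NAME_OF_THE_POOL --num-nodes 1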

There is an article with a use case similar to yours. Please take a look: Medium.com: Scale your kubernetes cluster to almost zero with gke autoscaler

Another way to do it is with Pod disruption budgets. Please take a look at the resources below; a minimal example follows the list:

  • Kubernetes.io: Disruptions
  • Kubernetes.io: How disruption budgets work.
  • Kubernetes.io: Configure pod disruption budget
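
A minimal PodDisruptionBudget sketch, assuming a Deployment whose Pods carry the label app: my-app (the name and label are placeholders; on clusters older than 1.21 use the policy/v1beta1 apiVersion instead):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  # Allow the autoscaler (and other voluntary disruptions) to evict
  # at most one Pod of this application at a time.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app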

Possible reasons that can prevent cluster autoscaler from removing a node:

  • Pods with restrictive PodDisruptionBudget.
  • Kube-system pods that:
    • are not run on the node by default,
    • don't have a pod disruption budget set or their PDB is too restrictive (since CA 0.6).
  • Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc).
  • Pods with local storage.
  • Pods that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, matching anti-affinity, etc)
  • Pods that have the following annotation set: "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"

Unless the pod has the following annotation (supported in CA 1.0.3 or later):

"cluster-autoscaler.kubernetes.io/safe-to-evict": "true"

-- Github.com: Kubernetes autoscaler: what types of pods can prevent ca from removing a node
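
If you decide a Pod is safe to move, the annotation from the list above goes on the Pod template, not on the controller object. A sketch with a hypothetical Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Tells the cluster autoscaler it may evict these Pods when draining a node.
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      containers:
        - name: app
          image: nginx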

CA doesn't remove underutilized nodes if they are running pods that it shouldn't evict

Other possible reasons for not scaling down:

  • the node group already has the minimum size,
  • there was a failed attempt to remove this particular node, in which case Cluster Autoscaler will wait for extra 5 minutes before considering it for removal again,

-- Github.com: I have a couple of nodes with low utilization but they are not scaled-down why