How to gracefully remove a node from Kubernetes?

I want to scale up/down the number of machines to increase/decrease the number of nodes in my Kubernetes cluster. When I add one machine, I’m able to successfully register it with Kubernetes; therefore, a new node is created as expected. However, it is not clear to me how to smoothly shut down the machine later. A good workflow would be:

  1. Mark the node related to the machine that I am going to shut down as unschedulable;
  2. Start the pod(s) that are running on that node on other node(s);
  3. Gracefully delete the pod(s) that are running on the node;
  4. Delete the node.

If I understood correctly, even kubectl drain (discussion) doesn’t do what I expect, since it doesn’t start the pods elsewhere before deleting them (it relies on a replication controller to start the pods afterwards, which may cause downtime). Am I missing something?

How should I properly shut down a machine?


Solution 1:

List the nodes and note the <node-name> of the one you want to drain (or remove from the cluster)

kubectl get nodes

1) First drain the node

kubectl drain <node-name>

You might have to ignore DaemonSets and local data on the machine (in kubectl 1.20 and later, --delete-local-data is deprecated in favor of --delete-emptydir-data)

kubectl drain <node-name> --ignore-daemonsets --delete-local-data

2) Edit the instance group for nodes (only if you are using kops)

kops edit ig nodes

Reduce both the MIN and MAX size by 1 from their current values, then save the file (nothing else needs to be done at this step)

You might still see some pods on the drained node that belong to DaemonSets, such as the networking plugin, fluentd for logs, kubedns/coredns, etc.
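To check which pods are still on the node after the drain, a field selector on spec.nodeName can be used. The following is a minimal sketch, not part of the original answer: the function name pods_on_node and the KUBECTL override variable are hypothetical, the latter added only so the command can be dry-run with KUBECTL=echo.

```shell
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# List every pod, in all namespaces, that is scheduled on the given node.
pods_on_node() {
  if [ -z "$1" ]; then
    echo "usage: pods_on_node <node-name>" >&2
    return 1
  fi
  $KUBECTL get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}
```

Calling pods_on_node <node-name> after the drain should, under these assumptions, show only the DaemonSet-managed pods mentioned above.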

3) Finally delete the node

kubectl delete node <node-name>

4) Commit the state for kops in S3 (only if you are using kops):

kops update cluster --yes

OR, if you are using kubeadm and would like to reset the machine to the state it was in before running kubeadm join, run the following on that machine:

kubeadm reset

Solution 2:

  1. Find the node with kubectl get nodes. We’ll assume the name of the node to be removed is “mynode”; replace that with the actual node name going forward.
  2. Drain it with kubectl drain mynode
  3. Delete it with kubectl delete node mynode
  4. If using kubeadm, run kubeadm reset on “mynode” itself
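
The drain and delete steps above can be collected into a small shell function. This is only a sketch under stated assumptions: the name remove_node and the KUBECTL override variable are hypothetical (the override exists so the function can be dry-run with KUBECTL=echo), and --ignore-daemonsets is included because drain otherwise refuses to proceed when DaemonSet-managed pods are present.

```shell
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Drain a node and, only if the drain succeeded, delete the node object.
remove_node() {
  if [ -z "$1" ]; then
    echo "usage: remove_node <node-name>" >&2
    return 1
  fi
  # Cordon the node and evict its pods; DaemonSet-managed pods are left alone.
  $KUBECTL drain "$1" --ignore-daemonsets || return 1
  # Remove the node from the cluster.
  $KUBECTL delete node "$1"
}
```

For example, remove_node mynode runs both steps in order and stops early if the drain fails, so the node is never deleted while it still has evictable pods. The kubeadm reset step is left out because it has to run on the machine itself.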

Solution 3:

Rafael, kubectl drain does work as you describe. There is some downtime, just as if the machine had crashed.

Can you describe your setup? How many replicas do you have, and are you provisioned such that you can't handle any downtime of a single replica?
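
If the pods belong to a Deployment, one possible way to approximate the "start elsewhere first" workflow from the question is to cordon the node, trigger a rolling restart so replacement pods are scheduled on other nodes, and drain only afterwards. A PodDisruptionBudget can additionally make drain wait rather than evict below a minimum number of available pods. The sketch below is an assumption-laden illustration, not something confirmed in this thread: the function names, the KUBECTL override variable, and the example arguments are hypothetical, and whether new pods really start before old ones stop depends on the Deployment's rolling update settings. Note also that a rollout restart replaces all pods of the Deployment, not only those on the node.

```shell
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
KUBECTL="${KUBECTL:-kubectl}"

# Create a PodDisruptionBudget: <pdb-name> <label-selector> <min-available>.
guard_with_pdb() {
  $KUBECTL create poddisruptionbudget "$1" --selector="$2" --min-available="$3"
}

# Cordon <node>, restart <deployment> so its pods land elsewhere, then drain.
surge_then_drain() {
  # Stop new pods from being scheduled on the node.
  $KUBECTL cordon "$1" || return 1
  # Roll the Deployment; replacements cannot land on the cordoned node.
  $KUBECTL rollout restart "deployment/$2" || return 1
  # Block until the rollout has finished.
  $KUBECTL rollout status "deployment/$2" || return 1
  # Evict whatever is left; DaemonSet-managed pods are skipped.
  $KUBECTL drain "$1" --ignore-daemonsets
}
```

A call such as guard_with_pdb web-pdb app=web 1 followed by surge_then_drain mynode web would, under these assumptions, keep at least one "web" pod available throughout the drain.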