Kube-Reserved/System-Reserved vs Eviction Threshold
I'd like to properly prepare our self-managed clusters for resource pressure scenarios. From the docs, I cannot understand the need for configuring the --eviction-hard parameter when we can achieve the same effect by setting proper values via --kube-reserved for the kubelet and --system-reserved for system daemons.
Let me ask through an example. Why would I need to set the reservations for the kubelet and system daemons when, seemingly, it would suffice to configure --eviction-hard? Whenever there is resource pressure in general, this should be enough to trigger a pod eviction event. So what is the reason for the existence of the kubelet and system-daemon reservation options?
As per official documentation:
Node Allocatable
'Allocatable' on a Kubernetes node is defined as the amount of compute resources that are available for pods. The scheduler does not over-subscribe 'Allocatable'. 'CPU', 'memory' and 'ephemeral-storage' are supported as of now.
The node allocatable (the resources that the scheduler can use to place workloads) can be defined as:
Node allocatable = Node capacity - kube-reserved - system-reserved - hard eviction threshold
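As a rough worked example of this arithmetic, the sketch below computes the allocatable memory of a hypothetical node. The node size, the reservation values and the helper names (parse_mi, allocatable_memory) are all assumptions made for illustration. It also subtracts the hard eviction threshold for memory, since the kubelet takes that threshold into account when it reports Allocatable.

```python
# Hypothetical node and reservations; every number here is an illustrative assumption.
MIB = 1024 * 1024


def parse_mi(quantity: str) -> int:
    """Convert a quantity such as '500Mi' or '8Gi' to bytes (only Mi and Gi are handled)."""
    units = {"Mi": MIB, "Gi": 1024 * MIB}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported quantity: {quantity}")


def allocatable_memory(capacity: str, kube_reserved: str,
                       system_reserved: str, eviction_hard: str) -> int:
    """Allocatable = capacity - kube-reserved - system-reserved - hard eviction threshold."""
    return (
        parse_mi(capacity)
        - parse_mi(kube_reserved)
        - parse_mi(system_reserved)
        - parse_mi(eviction_hard)
    )


# 8Gi node, 500Mi for Kubernetes daemons, 500Mi for OS daemons,
# and an eviction threshold of memory.available<100Mi.
result = allocatable_memory("8Gi", "500Mi", "500Mi", "100Mi")
print(result // MIB, "Mi")  # 8192 - 500 - 500 - 100 = 7092 Mi
```

With these assumed values the scheduler would place at most 7092Mi of memory requests on the node, so the remaining 1100Mi stays out of reach of pods regardless of whether eviction ever triggers.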
Also, as for:
- kube-reserved:
kube-reserved is meant to capture resource reservation for kubernetes system daemons like the kubelet, container runtime, node problem detector, etc. It is not meant to reserve resources for system daemons that are run as pods. kube-reserved is typically a function of pod density on the nodes.
-- Kubernetes.io: Docs: Tasks: Administer cluster: Reserve compute resources: Kube reserved
- system-reserved:
system-reserved is meant to capture resource reservation for OS system daemons like sshd, udev, etc. system-reserved should reserve memory for the kernel too since kernel memory is not accounted to pods in Kubernetes at this time. Reserving resources for user login sessions is also recommended (user.slice in systemd world).
-- Kubernetes.io: Docs: Tasks: Administer cluster: Reserve compute resources: System reserved
In short, if you do not reserve enough resources for system components and the kubelet, pods can be scheduled against capacity that those components need, and the components end up competing with pods for it.
You can even reach a situation where the eviction handler never comes into play because the system has already become unstable.
It is also worth mentioning that:
One thing that you can do with --kube-reserved and --system-reserved is to reserve the CPU needed for those components, whereas --eviction-hard is based only on memory and ephemeral storage.
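To make that difference concrete, here is a minimal sketch that compares the resources named in a reservation flag with the resources watched by an --eviction-hard value. The flag values are made-up examples, the helper names are hypothetical, and the signal table is only a subset (memory and filesystem) of the kubelet's documented eviction signals.

```python
# Subset of kubelet eviction signals, mapped to the resource they watch.
# None of them measures CPU.
EVICTION_SIGNAL_RESOURCE = {
    "memory.available": "memory",
    "nodefs.available": "ephemeral-storage",
    "nodefs.inodesFree": "ephemeral-storage",
    "imagefs.available": "ephemeral-storage",
    "imagefs.inodesFree": "ephemeral-storage",
}


def parse_eviction_hard(value: str) -> dict:
    """Parse 'memory.available<100Mi,nodefs.available<10%' into {signal: threshold}."""
    thresholds = {}
    for item in value.split(","):
        signal, threshold = item.split("<", 1)
        thresholds[signal.strip()] = threshold.strip()
    return thresholds


def parse_reserved(value: str) -> dict:
    """Parse 'cpu=500m,memory=1Gi' into {resource: quantity}."""
    reserved = {}
    for item in value.split(","):
        resource, quantity = item.split("=", 1)
        reserved[resource.strip()] = quantity.strip()
    return reserved


def unprotected_by_eviction(reserved: dict, eviction: dict) -> list:
    """Return reserved resources that no configured eviction signal watches."""
    covered = {
        EVICTION_SIGNAL_RESOURCE[signal]
        for signal in eviction
        if signal in EVICTION_SIGNAL_RESOURCE
    }
    return sorted(set(reserved) - covered)


# Example values only; real clusters will use different numbers.
kube_reserved = parse_reserved("cpu=500m,memory=1Gi,ephemeral-storage=1Gi")
eviction_hard = parse_eviction_hard("memory.available<100Mi,nodefs.available<10%")
print(unprotected_by_eviction(kube_reserved, eviction_hard))  # ['cpu']
```

In this example only cpu is left over: the CPU set aside for the kubelet and the container runtime is protected by the reservation alone, because no eviction threshold reacts to CPU pressure.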