Multiple instances of /usr/bin/cloud-init using a lot of CPU
I have a home server running Ubuntu Server 20.04.3, I use it to host a Docker Swarm, and a couple of KVM guests. Yesterday I rebooted it (as I've done many times before) and since then I've noticed that the fans are quite loud.
Just now (about 24 hours later) I used top
to take a look, and I discovered 10 instances of the following command:
/usr/bin/python3 /usr/bin/cloud-init devel hotplug-hook -s net query
They are each using between 30.8% and 73.8% CPU, in total they use 466% CPU. Each has been running for less than a minute, but as soon as one terminates another takes its place.
Looking in /var/log/cloud-init.log
, the following lines are repeated over and over:
2021-11-19 10:19:58,268 - hotplug_hook.py[DEBUG]: hotplug-hook called with the following arguments: {hotplug_action: query, subsystem: net, udevaction: None, devpath: None}
2021-11-19 10:19:58,268 - handlers.py[DEBUG]: start: hotplug-hook: Handle reconfiguration on hotplug events
2021-11-19 10:19:58,268 - hotplug_hook.py[DEBUG]: Fetching datasource
2021-11-19 10:19:58,269 - handlers.py[DEBUG]: start: hotplug-hook/check-cache: attempting to read from cache [trust]
2021-11-19 10:19:58,269 - util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
2021-11-19 10:19:58,269 - util.py[DEBUG]: Read 14682 bytes from /var/lib/cloud/instance/obj.pkl
2021-11-19 10:19:58,272 - util.py[DEBUG]: Reading from /run/cloud-init/.instance-id (quiet=False)
2021-11-19 10:19:58,272 - util.py[DEBUG]: Read 37 bytes from /run/cloud-init/.instance-id
2021-11-19 10:19:58,272 - stages.py[DEBUG]: restored from cache with run check: DataSourceNoCloud [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net]
2021-11-19 10:19:58,272 - handlers.py[DEBUG]: finish: hotplug-hook/check-cache: SUCCESS: restored from cache with run check: DataSourceNoCloud [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net]
2021-11-19 10:19:58,272 - hotplug_hook.py[DEBUG]: hotplug not supported for event of type net
2021-11-19 10:19:58,272 - handlers.py[DEBUG]: finish: hotplug-hook: SUCCESS: Handle reconfiguration on hotplug events
2021-11-19 10:19:58,273 - hotplug_hook.py[DEBUG]: Exiting hotplug handler
2021-11-19 10:19:58,273 - util.py[DEBUG]: Reading from /proc/uptime (quiet=False)
2021-11-19 10:19:58,273 - util.py[DEBUG]: Read 21 bytes from /proc/uptime
2021-11-19 10:19:58,273 - util.py[DEBUG]: cloud-init mode 'hotplug-hook' took 0.042 seconds (0.04)
As far as I can tell, hotplug-hook is meant to respond to hardware being added/removed. What could cause it to repeatedly be triggered like this?
Solution 1:
This is a bug in cloud-init 21.3 and has been fixed in cloud-init 21.4:
https://bugs.launchpad.net/cloud-init/+bug/1946003
You can safely rm /etc/udev/rules.d/10-cloud-init-hook-hotplug.rules
as a workaround on affected instances.