CPU cores remain offline after hotplug
When my UPS triggers an "on-battery" event, I want all cores to switch off to conserve power. The PC has 8 cores on two chips, and the UPS batteries need replacement every 3 months because of high current peaks. To reduce my cost of ownership, the following commands are executed when the UPS fires such an event:
for c in /sys/devices/system/cpu/cpu*/online; do
    echo 0 > "$c"
done
Cores 1 to 7 are successfully set offline while core 0 remains up, as expected.
Then lscpu and atop confirm that only CPU 0 remains online and, as a further indicator, the CPU temperatures fall from 90°C to 60°C.
When grid power comes back, the inverse command is executed:
for c in /sys/devices/system/cpu/cpu*/online; do
    echo 1 > "$c"
done
but the cores don't come back online. At this point, cat /sys/devices/system/cpu/cpu*/online prints 1 for every CPU from 0 to 7, and lscpu reports that all CPUs are online again. However, all my threads apparently continue to run exclusively on core 0, atop still lists only one core, and the total CPU usage remains capped at about 100% rather than the usual 800%. The CPU temperatures also stay at 60°C.
Anomalously, while the per-process CPU percentages listed by top sum to only about 100%, the 1-minute load average reported by top is a steady 8.
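One way to narrow this down is to compare the CPUs the kernel reports as online with the CPUs a process is actually allowed to run on. The following is only a sketch: the helper names count_cpus and report_cpus are made up for illustration, and it assumes that /sys/devices/system/cpu/online and the Cpus_allowed_list field of /proc/self/status are available on the system.

```shell
#!/bin/sh
# Count the CPUs in a kernel-style list such as "0-3,5,7".
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,
    for range in $list; do
        case $range in
            *-*) lo=${range%-*}; hi=${range#*-}
                 total=$((total + hi - lo + 1)) ;;
            '')  ;;
            *)   total=$((total + 1)) ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

# Print the kernel's online CPU list next to the CPU list that the
# calling process (and its children) may be scheduled on.
report_cpus() {
    online=$(cat /sys/devices/system/cpu/online 2>/dev/null || true)
    allowed=$(sed -n 's/^Cpus_allowed_list:[[:space:]]*//p' /proc/self/status 2>/dev/null || true)
    echo "online:  ${online:-unknown} ($(count_cpus "$online") CPUs)"
    echo "allowed: ${allowed:-unknown} ($(count_cpus "$allowed") CPUs)"
}
```

Calling report_cpus after the cores have been switched back on would show whether the two lists differ. If "allowed" stays narrower than "online", the restriction comes from the process's CPU affinity or cpuset rather than from the hotplug state itself.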
Attributes:
- Linux 4.1.1
- Debian 8
- LXC in active use
- KVM module loaded, not in active use
- CPU constantly loaded with over 8 runnable threads
Update:
I updated the kernel from 4.1.1 to 4.5.4. After testing, the same defect is still present.
Solution 1:
This is due to a known bug in LXC regarding the cpuset cgroup.
A few workarounds are described here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=824519
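As a rough illustration of what such a workaround can look like, the sketch below copies the CPU list of the top-level cpuset into every child cpuset. It is not taken from the bug report and rests on several assumptions: that the cpuset controller is mounted as a cgroup v1 hierarchy at /sys/fs/cgroup/cpuset, that the child cpusets lost their CPUs when the cores went offline and were not widened again afterwards, and that no container is meant to be pinned to a subset of CPUs (any such limit would be overwritten). The function name restore_cpusets is made up for this example.

```shell
#!/bin/sh
# restore_cpusets [ROOT]
# Copy ROOT/cpuset.cpus into every cpuset directory below ROOT.
# ROOT defaults to /sys/fs/cgroup/cpuset (assumed cgroup v1 mount point).
restore_cpusets() {
    root=${1:-/sys/fs/cgroup/cpuset}
    all=$(cat "$root/cpuset.cpus") || return 1
    # A child cpuset cannot hold CPUs its parent lacks, so parents are
    # updated first: in the C locale a directory sorts before its children.
    find "$root" -type d | LC_ALL=C sort | while IFS= read -r dir; do
        [ "$dir" = "$root" ] && continue          # leave the root cpuset alone
        [ -e "$dir/cpuset.cpus" ] || continue     # skip directories without the file
        printf '%s\n' "$all" > "$dir/cpuset.cpus" ||
            echo "could not update $dir/cpuset.cpus" >&2
    done
}
```

The function would have to run as root, after the loop that writes 1 to the online files, for example from the same script that handles the UPS "power restored" event.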