System becomes completely unresponsive minutes after starting 7z - how to work around or fix this?

I've never encountered this before, but then I usually take care to put nice ionice -c3 in front of the command to be executed. This time, however, nice and ionice merely delayed the effect.

Either way, I am using Ubuntu 20.04 as my main system, i.e. as a desktop machine. It's fully patched and up-to-date, running the 5.4 kernel (i.e. none of the other available ones). I am running it with Cinnamon (and yes, it's Ubuntu proper, not Mint or so), but the system had to be installed via a server ISO when I last reinstalled it, because none of the desktop ISOs booted successfully. I'm mentioning this because I am not sure if this plays into it somehow.

When I start 7z to compress a file of dozens of GiB in size, the system becomes entirely unresponsive. No way to switch to the text console, no way to connect via SSH, no mouse cursor movement ...

The only way to recover from this situation is to power-cycle the system (in my case I long-press the power key).

After this happened I read a bit about the Linux scheduler (I haven't actively tinkered with it) and learned that CFS (the Completely Fair Scheduler) has been the default scheduler for quite a few versions. However, it is clearly starving all other processes in favor of a program started with nice ionice -c3, which doesn't seem even remotely fair.

The systemd journal shows nothing other than indications that the file system driver failed to write for the process hosting a VM, which was concurrently running.

How can I diagnose this further and eventually fix it, so that a system I intend to use as a desktop won't become totally unresponsive?

NB: I'd rather that the OOM killer steps in and snipes some process than the system becoming totally unresponsive. But as far as I can tell the OOM killer didn't mind.


The system has 64 GiB of RAM and no active swap file (I can live with the handful of instances where a program fails because I'm out of memory).

# sysctl -A | grep -v _domain | grep '\.sched'
kernel.sched_autogroup_enabled = 1
kernel.sched_cfs_bandwidth_slice_us = 5000
kernel.sched_child_runs_first = 0
kernel.sched_itmt_enabled = 1
kernel.sched_latency_ns = 24000000
kernel.sched_migration_cost_ns = 500000
kernel.sched_min_granularity_ns = 3000000
kernel.sched_nr_migrate = 32
kernel.sched_rr_timeslice_ms = 100
kernel.sched_rt_period_us = 1000000
kernel.sched_rt_runtime_us = 950000
kernel.sched_schedstats = 0
kernel.sched_tunable_scaling = 1
kernel.sched_util_clamp_max = 1024
kernel.sched_util_clamp_min = 1024
kernel.sched_wakeup_granularity_ns = 4000000

... and (UUID redacted) ...

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.4.0-73-generic root=UUID=xxx ro quiet loglevel=3 vga=current nosplash udev.log_priority=3 rd.systemd.show_status=auto rd.udev.log_priority=3 plymouth.enable=0

(the latter to show that I haven't told it to use a different scheduler)

And the overall configuration is (slightly redacted):

# inxi -b -C -G -m
System:    Host: XXX Kernel: 5.4.0-73-generic x86_64 bits: 64 Desktop: Cinnamon 4.4.8
           Distro: Ubuntu 20.04.2 LTS (Focal Fossa)
Machine:   Type: Desktop System: Dell product: Precision 5820 Tower X-Series v: N/A serial: XXX
           Mobo: Dell model: 02M8NY v: A01 serial: /XXX/XXX/ UEFI: Dell v: 2.8.0 date: 01/15/2021
Memory:    RAM: total: 62.52 GiB used: 3.56 GiB (5.7%)
           Array-1: capacity: 3 TiB note: check slots: 8 EC: None
           Device-1: DIMM3 size: 16 GiB speed: 2666 MT/s
           Device-2: DIMM7 size: No Module Installed
           Device-3: DIMM1 size: 16 GiB speed: 2666 MT/s
           Device-4: DIMM5 size: No Module Installed
           Device-5: DIMM4 size: 16 GiB speed: 2666 MT/s
           Device-6: DIMM8 size: No Module Installed
           Device-7: DIMM2 size: 16 GiB speed: 2666 MT/s
           Device-8: DIMM6 size: No Module Installed
CPU:       Topology: 10-Core model: Intel Core i9-9820X bits: 64 type: MT MCP L2 cache: 16.5 MiB
           Speed: 1200 MHz min/max: 1200/4200 MHz Core speeds (MHz): 1: 1200 2: 1200 3: 1200 4: 1200 5: 1200 6: 1200 7: 1201
           8: 1201 9: 1201 10: 1201 11: 1200 12: 1200 13: 1201 14: 1200 15: 1200 16: 1200 17: 1200 18: 1200 19: 1200 20: 1200
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon Pro WX 7100] driver: amdgpu v: 5.6.20.20.45
           Display: server: X.Org 1.20.9 driver: amdgpu unloaded: modesetting
           resolution: 1920x1080~60Hz, 1920x1080~60Hz, 1920x1080~60Hz
           OpenGL: renderer: AMD Radeon Pro WX 7100 Graphics v: 4.6.14756 Core Profile Context FireGL 20.45

Solution 1:

Ubuntu 20.04 uses kernel 5.4 (5.8 with HWE), which has only the mq-deadline I/O scheduler compiled in.

You can check this by viewing /sys/block/sda/queue/scheduler. The active scheduler is shown in square brackets. Any other schedulers available in the kernel are listed alongside it.

Example:

# cat /sys/block/sda/queue/scheduler
[mq-deadline] none

# uname -r
5.4.0-26-generic
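
To check every block device at once rather than only sda, a loop along these lines may help. It is only a sketch: SYSFS_BLOCK is a made-up override variable (not a kernel or udev setting) so the loop can be pointed at a test directory, and the function names are mine.

```shell
# Print the active I/O scheduler (the bracketed entry) of one scheduler file,
# e.g. "[mq-deadline] none" -> "mq-deadline".
active_scheduler() {
  sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Walk all block devices and print "device: scheduler".
# SYSFS_BLOCK is a hypothetical override of /sys/block for dry runs.
list_schedulers() {
  base="${SYSFS_BLOCK:-/sys/block}"
  for f in "$base"/*/queue/scheduler; do
    [ -r "$f" ] || continue          # skip if the glob matched nothing
    dev="${f#"$base"/}"              # strip the leading directory
    dev="${dev%%/*}"                 # keep only the device name
    printf '%s: %s\n' "$dev" "$(active_scheduler "$f")"
  done
}

list_schedulers
```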

At the moment, the mq-deadline scheduler does not support the mechanism used by ionice, see: https://unix.stackexchange.com/a/160081/27458

Solution: Switch to BFQ scheduler

The bfq scheduler does not need to be compiled into the kernel; it can be loaded afterwards as a kernel module.

Switch to the BFQ scheduler:

# modprobe bfq
# echo "bfq" > /sys/block/sda/queue/scheduler
# echo "bfq" > /etc/modules-load.d/bfq.conf
# echo 'ACTION=="add|change", KERNEL=="sd*[!0-9]|sr*", ATTR{queue/scheduler}="bfq"' > /etc/udev/rules.d/60-scheduler.rules

Check:

# cat /sys/block/sda/queue/scheduler
mq-deadline [bfq] none

It is probably a good idea to also do a reboot and check again.
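
If you have more than one disk, or are unsure whether the module actually loaded, a small guard around the echo can avoid writing a scheduler name the kernel does not offer. A minimal sketch, assuming a helper name (set_scheduler) and an override variable (SYSFS_BLOCK) that I made up for illustration:

```shell
# Select an I/O scheduler for one device, but only if the kernel offers it.
# SYSFS_BLOCK is a hypothetical override of /sys/block, handy for dry runs.
set_scheduler() {
  dev="$1"; sched="$2"
  file="${SYSFS_BLOCK:-/sys/block}/$dev/queue/scheduler"
  # Drop the brackets around the active entry so all names compare equally.
  offered=$(tr -d '[]' < "$file") || return 1
  case " $offered " in
    *" $sched "*) printf '%s\n' "$sched" > "$file" ;;
    *) echo "$sched is not offered for $dev (is the module loaded?)" >&2
       return 1 ;;
  esac
}

# Usage (as root): set_scheduler sda bfq
```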

BFQ not available on "virtual" kernel

If you are using a "virtual" kernel, the bfq kernel module will probably not be available, because the virtual kernel does not include the linux-modules-extra-5.xxx package.
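
To see whether the module is present for the kernel you are running, something like the following should do. It is a sketch only; check_bfq_module is a name I made up, and it relies on modinfo resolving the module for the running kernel.

```shell
# Report whether a bfq module can be found for the running kernel.
check_bfq_module() {
  if modinfo bfq >/dev/null 2>&1; then
    echo "bfq module found for $(uname -r)"
  else
    echo "bfq module missing for $(uname -r)"
  fi
}

check_bfq_module
```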

You can solve this by switching to the "generic-HWE" kernel:

# apt-get install linux-generic-hwe-20.04 linux-tools-generic-hwe-20.04
# reboot

After reboot you should be on kernel 5.8.0-xxx-generic. You can check this:

# uname -r
5.8.0-59-generic

Now you can apply the above solution.

Alternative solution: Systemd scope

If you don't want to switch the I/O scheduler, you can run the command in a systemd scope with a lower IO weight.

Create a file /usr/local/bin/mh_ionice with contents:

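#!/bin/bash
# Run a command in a transient systemd scope with the lowest CPU and IO priority.
# Root talks to the system manager; everyone else uses their per-user manager.
if (( EUID == 0 )); then usermode=(); else usermode=(--user); fi
systemd-run \
  --collect \
  --quiet \
  --scope \
  "${usermode[@]}" \
  --nice=19 \
  --property="IOAccounting=yes" \
  --property="IOWeight=1" \
  "$@"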

Make it executable:

chmod 755 /usr/local/bin/mh_ionice

Now you can run:

mh_ionice [heavy_command] [arg] [arg] [arg]

Solution 2:

If the system locks up and becomes completely unresponsive, it sounds like you may be running out of memory.
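
One way to test that guess is to watch MemAvailable in a second terminal while 7z runs; if it stays high right up to the hang, memory exhaustion becomes less likely. A rough sketch (the function name mem_available_gib is mine, and the optional file argument only exists so it can be tried on a sample file):

```shell
# Print MemAvailable in GiB from a meminfo-style file (default: /proc/meminfo).
# The kernel reports the value in kB, so divide by 1024*1024.
mem_available_gib() {
  awk '/^MemAvailable:/ { printf "%.1f GiB\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}

mem_available_gib
```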

Enabling more aggressive options in the OOM killer might help the system recover but doesn't help 7z finish.

You could use cgroups to limit RSS, or ulimit to restrict the memory 7z can use, which might prevent the lockup. Careful tweaking of memory parameters in cgroups might allow 7z to thrash while giving the rest of the system good performance.
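
As an illustration of the ulimit variant, here is a minimal sketch. Note that ulimit -v caps the virtual address space, not RSS, so the number needs some headroom; run_capped is a hypothetical helper name, and the 7z file names in the usage line are placeholders.

```shell
# Run a command with its virtual address space capped at the given number of GiB.
# The limit is set inside a subshell, so the calling shell stays unrestricted.
run_capped() {
  gib="$1"; shift
  (
    ulimit -v "$((gib * 1024 * 1024))"   # ulimit -v expects KiB
    "$@"
  )
}

# Usage: run_capped 16 7z a archive.7z bigfile
```

If 7z hits the cap it should fail with an allocation error instead of taking the whole machine down with it.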

Adding swap space might allow other programs to be pushed out of memory, freeing up more for 7z to run.

Adding too much swap space might replace OOM lockup with thrashing, which is only slightly better. Reducing swap space might allow an OOM killer to kill the job instead of thrashing.

Obviously, if 7z is running out of memory, adding more RAM would help the most.