Ubuntu 18.04 error on waking up from sleep : Read-error on swap device
Solution 1:
The Ubuntu 18.04 kernel you are currently using is missing a rather important bug fix.
The fix for this is already present in the upstream Linux kernel version 4.16.8. (The suspend bug effectively started happening in kernel version 4.15). Ubuntu only needs to cherry-pick this small patch from upstream. The bug frequently causes Xorg crashes immediately after suspend, i.e. it crashes the whole graphical login session.
Note that this bug often happens without showing "Read-error on swap device". Most of the time, there was no error in the kernel log. (A few times, it showed "EXT4-fs error" and "Buffer I/O error" instead.) Also, these error messages could be caused by a hardware failure instead. When diagnosing this problem, please focus on other, more distinct details.
A test kernel is available at the end of this Ubuntu bug, i.e. in this comment: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776887/comments/5
So far no-one has reported their results from suspending with the Ubuntu test kernel. It might be that if someone can report success, it will encourage the Ubuntu developer to finally include the bug fix. I could be wrong though, I'm not 100% sure what's holding this up.
There is also a known workaround. You can avoid the crash if you configure the kernel command line to include the option scsi_mod.scan=sync.
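For anyone unsure how to set that option, here is a rough sketch of one way to do it on a stock Ubuntu install that boots via GRUB. It assumes your settings live in /etc/default/grub and that the file has a GRUB_CMDLINE_LINUX_DEFAULT="..." line (the Ubuntu default). The helper name add_scan_sync is made up for this example; it is not part of any Ubuntu tool.

```shell
#!/bin/sh
# Sketch only: append scsi_mod.scan=sync to GRUB_CMDLINE_LINUX_DEFAULT.
# add_scan_sync is a hypothetical helper name used for illustration.
add_scan_sync() {
    grub_file="$1"

    # Already configured on the active line? Then leave the file alone.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*scsi_mod\.scan=sync' "$grub_file"; then
        return 0
    fi

    tmp_file="$(mktemp)" || return 1
    status=0
    # Insert the option just before the closing quote of the existing value,
    # then write the result back in place (keeps the file's owner and mode).
    sed 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 scsi_mod.scan=sync"/' \
        "$grub_file" > "$tmp_file" && cat "$tmp_file" > "$grub_file" || status=1
    rm -f "$tmp_file"
    return "$status"
}
```

To use it, back up /etc/default/grub first, run add_scan_sync /etc/default/grub as root, then run sudo update-grub and reboot. Editing the file by hand in a text editor does the same job.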
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776887
This upstream bug has been confirmed to affect Ubuntu users[1]. As per the fix commit (below), the most frequent symptom is a crash of Xorg/Xwayland, i.e. killing the entire GUI, when a laptop is woken from system sleep. Frequency of the bug is described as once every few days[2].
[1] E.g. this user confirms the bug & very specific workaround: https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/1760450/comments/11
[2] E.g. this log of crashes: https://bugzilla.redhat.com/show_bug.cgi?id=1553979#c23
This is a bug in blk-core.c. It is not specific to any one hardware driver. Technically the suspend bug is triggered by the SCSI core - which is used by all SATA devices.
The commit also includes a test which quickly and reliably proves the existence of a horrifying bug.
I guess you might avoid this bug only if you have root on NVMe. The other way to not hit the Xorg crash is if you don't use all your RAM, so there's no memory pressure that leads to cold pages of Xorg being swapped out. Also, you won't reproduce the Xorg crash if you suspend and then resume immediately. (This frustrated my tests at one point: it only triggered after I left the system suspended over lunch :)
Fix: "block: do not use interruptible wait anywhere"
in kernel 4.17: https://github.com/torvalds/linux/commit/1dc3039bc87ae7d19a990c3ee71cfd8a9068f428
in kernel 4.16.8: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.16.y&id=7859056bc73dea2c3714b00c83b253d4c22bf7b6
lack of fix in 4.15.0-24.26 (ubuntu 18.04): https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/block/blk-core.c?id=Ubuntu-4.15.0-24.26#n856
I.e., this bug is still present in Ubuntu source package linux-4.15.0-24.26 (and 4.15.0-23.25). I attach hardware details (lspci-vnvn.log) of a system where this bug is known to happen.
Regards Alan
WORKAROUND: Use kernel parameter: scsi_mod.scan=sync
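
If you want to confirm that the workaround is actually in effect after rebooting, a quick check of the running kernel's command line should do. This is just a sketch: has_scan_sync is a hypothetical helper name, and it reads /proc/cmdline unless you pass it another file.

```shell
#!/bin/sh
# Sketch only: succeed if the kernel command line contains scsi_mod.scan=sync.
# has_scan_sync is a hypothetical helper name used for illustration.
has_scan_sync() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each space-separated option on its own line and look for an
    # exact, whole-line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qx 'scsi_mod\.scan=sync'
}
```

Calling has_scan_sync with no arguments returns success (exit status 0) when the option was passed at boot, so something like has_scan_sync && echo "workaround active" works as a one-liner.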