Ubuntu desktop hangs occasionally during regular use
This is a bug in the Nouveau video driver (a kernel module). For details, check the bug reports at bugs.freedesktop.org or on GitLab, especially #93629, #99900 and #100567 (which are related to SCHED_ERROR/CTXSW_TIMEOUT).
To debug the freeze, you can use the Magic SysRq key, for example:
Note: Consider holding ⇧ Shift (depending on your keyboard).
- Alt-SysRq-9 (no ⇧ Shift) - Set the console log level to 9 to show more kernel messages
- Alt-SysRq-w - Display the list of blocked (D state) tasks
- Alt-SysRq-l - Show a stack backtrace for all active CPUs
- Alt-SysRq-t - Output a list of current tasks and their information to the console
- Alt-SysRq-p - Output the current registers and flags to the console
- Alt-SysRq-q - Display all active high-resolution timers and clock sources.
- Alt-SysRq-m - Output current memory information to the console
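If the keyboard itself is unresponsive but the machine still answers over SSH, the same commands can be sent by writing the key letter to /proc/sysrq-trigger as root. This is a minimal sketch, assuming a remote root shell is available; the function name sysrq_send and the SYSRQ_TRIGGER variable are made up for this example.

```shell
#!/bin/sh
# Sketch: send a SysRq command without the keyboard (needs root).
# SYSRQ_TRIGGER is a hypothetical override used for illustration;
# it defaults to the kernel's real trigger file.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

sysrq_send() {
    # Accept exactly one key character, such as w, l, t, p, q or m.
    case "$1" in
        ?) ;;
        *) echo "usage: sysrq_send <single key>" >&2; return 2 ;;
    esac
    # The kernel acts on the character written to the trigger file.
    printf '%s' "$1" > "$SYSRQ_TRIGGER"
}
```

For example, running `sysrq_send w` as root requests the list of blocked tasks, which can then be read with `dmesg`.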
Other things to try during freeze:
Note: Consider holding ⇧ Shift (depending on your keyboard).
- Reset the nice level of all high-priority and real-time tasks by hitting Alt-SysRq-n.
- Try forcing a return to a text console by hitting Control-Alt-F1 (from F1 to F12).
- Kill all processes on the current virtual console (can kill X) by hitting Alt-SysRq-k.
- Perform a system crash by hitting Alt-SysRq-c (a crash dump is taken if one is configured).
If nothing works, you should perform a safe reboot by Alt-SysRq-REISUB, which is:
- Alt-SysRq-R: UnRaw (take control of keyboard back from X).
- Alt-SysRq-E: tErminate (send SIGTERM to all processes).
- Alt-SysRq-I: kIll (send SIGKILL to all processes, forcing them to terminate immediately).
- Alt-SysRq-S: Sync all mounted filesystems (flush data to disk).
- Alt-SysRq-U: Unmount (remount all filesystems in read-only mode).
- Alt-SysRq-B: immediately reBoot the system.
Note: If the above hard reboot combination doesn't work, the freeze could be caused by defective hardware rather than the video driver.
Note: If some SysRq options don't work and report "This sysrq operation is disabled", enable them with:
echo 1 | sudo tee /proc/sys/kernel/sysrq
See: Configuring SysRq in Linux.
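The value in /proc/sys/kernel/sysrq is 0 (disabled), 1 (everything enabled), or a bitmask of allowed function groups, which explains why only some key combinations may work. The sketch below decodes such a value using the bit meanings from the kernel's SysRq documentation; the function name decode_sysrq is invented for this example.

```shell
#!/bin/sh
# Sketch: print which SysRq function groups a kernel.sysrq value allows.
decode_sysrq() {
    value=$1
    if [ "$value" -eq 0 ]; then echo "disabled"; return; fi
    if [ "$value" -eq 1 ]; then echo "all functions enabled"; return; fi
    # Each entry is "bit:description"; print the ones set in the value.
    for pair in "2:console log level" "4:keyboard control (SAK, unraw)" \
                "8:process debugging dumps" "16:sync" "32:remount read-only" \
                "64:signalling processes (term, kill, oom-kill)" \
                "128:reboot/poweroff" "256:nicing of real-time tasks"; do
        bit=${pair%%:*}
        name=${pair#*:}
        if [ $(( value & bit )) -ne 0 ]; then echo "$name"; fi
    done
}
```

To decode the current setting, run `decode_sysrq "$(cat /proc/sys/kernel/sysrq)"`. A value of 176, for instance, allows only sync, remount read-only and reboot, so the debugging keys listed earlier would be rejected until the value is changed.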
After rebooting, check your kern.log for details, especially the call traces generated by the SysRq commands above. They can help you find the matching bug report and its solution. Check the following kern.log example.
You can check the latest crash log by:
journalctl -b -1 # Then hit Shift-G to jump to the end.
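Scrolling through the whole previous boot can be slow, so it may help to keep only the lines that mention the driver or the error names from the bug reports above. This is a rough sketch; the function name filter_gpu_errors is made up here, and the pattern is only a starting point.

```shell
#!/bin/sh
# Sketch: keep only log lines that mention Nouveau or the error names
# cited in the bug reports (SCHED_ERROR, CTXSW_TIMEOUT).
filter_gpu_errors() {
    grep -iE 'nouveau|SCHED_ERROR|CTXSW_TIMEOUT'
}

# Usage (kernel messages of the previous boot only):
#   journalctl -k -b -1 --no-pager | filter_gpu_errors
```

The -k option limits the output to kernel messages, which is where the driver errors and the SysRq call traces are logged.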
Suggested solution:
- Upgrade your Ubuntu and kernel to the latest version.
- If the problem recurs, a workaround is to install the proprietary NVIDIA drivers, which replace the Nouveau video driver.
- If the same happens with the NVIDIA drivers, the cause may be a hardware issue or an overheating graphics card (try reducing any overclocking).
Enable persistent logging
sudo mkdir /var/log/journal
Reboot
Make sure persistent logging is enabled by browsing /var/log/journal and checking that it contains a directory with a random-looking name (the machine ID).
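The same check can be scripted. This is a small sketch, assuming the default journal location; the function name journal_is_persistent is invented for this example.

```shell
#!/bin/sh
# Sketch: succeed if the journal directory exists and is not empty.
# The optional argument is the directory to check; it defaults to the
# standard location used by systemd-journald.
journal_is_persistent() {
    dir=${1:-/var/log/journal}
    # journald keeps persistent logs in a subdirectory of this directory.
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

For example, `journal_is_persistent && echo "persistent logging is on"` prints the message only once journald has started writing to disk.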
After the incident
List system boots
sudo journalctl --list-boots
Extract the boot with the incident
sudo journalctl -b caf0524a1d394ce0bdbcff75b94444fe > /tmp/errorlog
or just
sudo journalctl -b caf0524a1d394ce0bdbcff75b94444fe
Inspect the log.