Suspending with root on USB
I have a laptop running Ubuntu 14.04 with its root file system on USB storage. This does not work well: after waking from suspend, ext4 frequently tries to write to the root file system before the USB device is ready.
Here is what I see in the kernel log when this happens. Notice the burst of I/O errors on sda1, followed about one second later by the kernel finally detecting the USB storage drive.
[ 2826.517419] wlan0: associated
[ 2826.517452] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 2827.575371] EXT4-fs warning (device sda1): ext4_end_bio:317: I/O error -5 writing to inode 1733735 (offset 0 size 0 starting block 12629950)
[ 2827.575380] Buffer I/O error on device sda1, logical block 12629694
[ 2827.575400] EXT4-fs warning (device sda1): ext4_end_bio:317: I/O error -5 writing to inode 3148603 (offset 0 size 8192 starting block 12844470)
[ 2827.575404] Buffer I/O error on device sda1, logical block 12844212
[ 2827.575411] Buffer I/O error on device sda1, logical block 12844213
[ 2827.575448] EXT4-fs warning (device sda1): ext4_end_bio:317: I/O error -5 writing to inode 3015015 (offset 0 size 90112 starting block 6588832)
[ 2827.575453] Buffer I/O error on device sda1, logical block 6588576
[ 2827.575461] Buffer I/O error on device sda1, logical block 6588577
[ 2827.575465] Buffer I/O error on device sda1, logical block 6588578
[ 2827.575469] Buffer I/O error on device sda1, logical block 6588579
[ 2827.575473] Buffer I/O error on device sda1, logical block 6588580
[ 2827.575477] Buffer I/O error on device sda1, logical block 6588581
[ 2827.575481] Buffer I/O error on device sda1, logical block 6588582
[ 2828.857284] sd 0:0:0:0: [sda] No Caching mode page found
[ 2828.857293] sd 0:0:0:0: [sda] Assuming drive cache: write through
At first there is no visible indication outside the kernel log that the problem has triggered. If I leave Ubuntu running beyond this point, however, the file system accumulates errors and eventually switches to read-only mode. At that point I have to reboot into recovery mode and run fsck.ext4 manually from a root shell to repair the file system.
Is there some setting I can change such that access to the root device after waking from suspend can be delayed until the USB drive is ready?
The reason this problem is only seen with USB devices and not with other devices is a combination of two factors:
- USB storage, unlike other storage media, relies on kernel threads for operation.
- When resuming from suspend, the kernel wakes up all threads simultaneously.
The outcome is a race during resume: the USB subsystem is trying to detect media while syslog is trying to write the log messages from suspend and resume to disk.
If syslog attempts a write before the USB device has been detected, ext4 gets an I/O error. For some reason that error isn't handled cleanly, and eventually the file system needs a manual fsck.
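The solution I found was to give kernel threads a 12-second head start before other threads are woken up. The patch below adds a new sysctl, kernel.resume_delay, which holds the delay in milliseconds; its default of 0 leaves the resume path unchanged. These are the changes I had to make to the kernel for that to work: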
--- linux-3.13.0/kernel/power/suspend.c.orig 2014-01-20 03:40:07.000000000 +0100
+++ linux-3.13.0/kernel/power/suspend.c 2014-08-04 00:57:43.847038640 +0200
@@ -299,6 +299,8 @@
goto Resume_devices;
}
+unsigned int resume_delay = 0;
+
/**
* suspend_finish - Clean up before finishing the suspend sequence.
*
@@ -307,6 +309,15 @@
*/
static void suspend_finish(void)
{
+ if (resume_delay) {
+ /* Give kernel threads a head start, such that usb-storage
+ * can detect devices before syslog attempts to write log
+ * messages from the suspend code.
+ */
+ thaw_kernel_threads();
+ pr_debug("PM: Sleeping for %u milliseconds.\n", resume_delay);
+ msleep(resume_delay);
+ }
suspend_thaw_processes();
pm_notifier_call_chain(PM_POST_SUSPEND);
pm_restore_console();
--- linux-3.13.0/kernel/sysctl.c.orig 2014-08-04 08:11:26.000000000 +0200
+++ linux-3.13.0/kernel/sysctl.c 2014-08-03 23:27:23.796278219 +0200
@@ -277,8 +277,17 @@
static int max_extfrag_threshold = 1000;
#endif
+extern unsigned int resume_delay;
+
static struct ctl_table kern_table[] = {
{
+ .procname = "resume_delay",
+ .data = &resume_delay,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
+ {
.procname = "sched_child_runs_first",
.data = &sysctl_sched_child_runs_first,
.maxlen = sizeof(unsigned int),
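With the patch applied, the delay still has to be set, because resume_delay defaults to 0. The following is a minimal sketch of doing that from user space, not part of the patch itself. It assumes the patched kernel exposes the value at /proc/sys/kernel/resume_delay (the usual location for an entry in kern_table) and that 12 seconds corresponds to a value of 12000, since the patch passes the value to msleep(). The function name write_resume_delay and the RESUME_DELAY_PATH macro are made up for illustration.

```c
#include <stdio.h>

/* Assumed location of the sysctl added by the patch above.
 * It only exists on a kernel built with that patch. */
#define RESUME_DELAY_PATH "/proc/sys/kernel/resume_delay"

/* Hypothetical helper: convert a delay in whole seconds to milliseconds
 * and write it to the given sysctl file.
 * Returns 0 on success and -1 if the file could not be written. */
int write_resume_delay(const char *path, unsigned int seconds)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;

    /* The patch hands the value to msleep(), so the unit is milliseconds. */
    ok = fprintf(f, "%u\n", seconds * 1000u) > 0;

    /* Buffered write errors may only show up when the file is closed. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling write_resume_delay(RESUME_DELAY_PATH, 12) as root should then have the same effect as running sysctl -w kernel.resume_delay=12000. The value does not survive a reboot, so it would need to be set again at boot, for example from a file in /etc/sysctl.d.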