Interrupted apt, now cannot use Ubuntu 18.04 LTS
My suspicions were correct; my system wanted to install new NVIDIA drivers (and, I think, to upgrade to a new kernel while it was at it, though I am less sure about that). When I interrupted the installation by forcing a shutdown, the installation was in the process of writing to initrd.img using update-initramfs. Hence the "freeze at login screen" and "stuck at initramfs" - they were both signs of what was wrong!
I managed to fix it and gain complete control of my original Ubuntu system including the original filesystem :) here's how I did it:
DISCLAIMER
Try this at your own risk! I did this while not caring if my original Ubuntu filesystem breaks. Having said that, this worked fine for me and I never actually felt as if my filesystem was at risk of breaking. And in the end it didn't break as far as I know - the filesystem was not broken before I did this (see EDIT 1 with fsck in the question) and is not broken after. You should make sure that the filesystem itself is not broken if you want to try this fix.
Fix
In summary, my fix involved booting an Ubuntu install USB and using its trial version to access the original filesystem and fix its broken apt installations. Please read the full fix before attempting it!
First, I got the Ubuntu bootable USB stick that I used to install Ubuntu in the first place (you can just re-download this from the Ubuntu website if you no longer have it, try to have the same Ubuntu-version though, 18.04 LTS in my case). I inserted it into the computer and, through BIOS, selected the USB stick as the boot option. Then, I was presented with GRUB from the USB stick, and selected the "Try Ubuntu" option. This starts a "trial version" of Ubuntu.
From here, I followed parts of this link to find the filesystem of my original Ubuntu system (the one I want to recover, not the trial version), mount it and prepare it so that I can update apt stuff on that filesystem.
I actually went through all instructions in that tutorial at first, but it did not fix my problem. However, the steps I outlined below are what I did to fix things and they worked for me.
For the sake of completeness, I will summarize the parts I followed below:
- Start a terminal using Ctrl+Alt+T. Go to the root directory by running
cd /
- Use
sudo fdisk -l
to find out what your filesystem path is (in my case it was /dev/nvme0n1p5, but it can also be /dev/sdaX for some number X).
- Use
sudo mount <path from step 2> /mnt
(again, in my case the path was /dev/nvme0n1p5). This mounts the filesystem of the original Ubuntu system so that it is accessible from the trial Ubuntu through /mnt.
- Use
sudo mount --bind /dev /mnt/dev
sudo mount --bind /proc /mnt/proc
sudo mount --bind /sys /mnt/sys
I did this to follow the tutorial and it worked for me. As far as I understand, these bind mounts make the trial system's device nodes, process information and sysfs visible inside the original filesystem, which tools such as update-initramfs and update-grub rely on once you change root.
- Use
sudo chroot /mnt
to create a temporary environment that has the original filesystem as the root. I would imagine this makes the reconfiguration happen on the original system and not on the trial system.
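The mount and chroot steps above can be sketched as a small script. This is only an illustration, not part of either tutorial: ROOT_PART, DRY_RUN, run and prepare_chroot are names I made up for the example, and the default partition is simply the one from my machine. By default it only prints the commands so you can check them first.

```shell
#!/bin/sh
# Sketch only: print (or run) the mount and chroot steps for one root partition.
# ROOT_PART and DRY_RUN are hypothetical variables introduced for this example.
ROOT_PART="${ROOT_PART:-/dev/nvme0n1p5}"   # replace with the path from fdisk -l
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print, 0 = really execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_chroot() {
    # Mount the original root filesystem under /mnt.
    run sudo mount "$ROOT_PART" /mnt
    # Bind-mount the kernel-provided directories into it.
    for dir in dev proc sys; do
        run sudo mount --bind "/$dir" "/mnt/$dir"
    done
    # Switch root into the original system.
    run sudo chroot /mnt
}

prepare_chroot
```

Running it as is prints the five commands; setting DRY_RUN=0 would execute them and leave you at the chroot prompt.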
Now, you should have a terminal prompt # that waits for further instructions. This means you are in the temporary environment with changed root.
From here, I followed parts of this tutorial to fix my messed up apt installations and initramfs. I did NOT do the stuff from that tutorial that is related to locks. To summarize, I did the following from this tutorial:
- Use
sudo dpkg --configure -a
- Use
sudo apt clean
- Use
sudo apt update --fix-missing
- Use
sudo apt install -f
- Use
sudo dpkg --configure -a
(yes, again according to their tutorial :p)
- Use
sudo apt upgrade
- Use
sudo apt dist-upgrade
- If you see any trouble with things not being upgraded or installed, look up what the error related to that trouble is, and how to fix that so you can successfully upgrade.
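For reference, the repair sequence above could be chained so that it stops at the first step that fails, which makes it easier to see which error to look up. This is a sketch under my own assumptions: RUNNER and repair_apt are made-up names, and sudo is left out because the # prompt inside the chroot is already a root shell (keep sudo if you prefer to match the steps exactly).

```shell
#!/bin/sh
# Sketch only. RUNNER is a hypothetical switch: "echo" previews each command,
# an empty value executes it. Previewing is the default so nothing runs by accident.
RUNNER="${RUNNER-echo}"

# Run the repair steps in order; && stops the chain at the first failure.
repair_apt() {
    $RUNNER dpkg --configure -a &&
    $RUNNER apt clean &&
    $RUNNER apt update --fix-missing &&
    $RUNNER apt install -f &&
    $RUNNER dpkg --configure -a &&
    $RUNNER apt upgrade &&
    $RUNNER apt dist-upgrade
}

if repair_apt; then
    echo "all steps completed"
else
    echo "a step failed; look up its error before continuing" >&2
fi
```

To actually execute the steps inside the chroot, it would be started with RUNNER set to an empty value.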
After doing this, you should be able to get your NVIDIA drivers upgraded and it should also generate a clean initrd.img file which is what caused the "stuck at initramfs" problem for me! Finally, to finish up and reboot, I did the following cleanup described in the first tutorial:
- Use
update-grub
- Use
exit
to exit the temporary environment with changed root; you should now have a "standard" terminal prompt and be out of the environment.
- Use
sudo umount /mnt/dev
- Use
sudo umount /mnt/proc
- Use
sudo umount /mnt/sys
- Use
sudo umount /mnt
This unmounts the original filesystem. I then turned off the computer, booted up GRUB for my original Ubuntu system (not the trial Ubuntu bootable USB), and selected the normal first option.
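Before turning the computer off, it may be worth confirming that nothing under /mnt is still mounted. The check below is a sketch I am adding for illustration, not something from the tutorials: still_mounted is a made-up helper that reads a mounts table (by default /proc/mounts, where the second field of each line is the mount point).

```shell
#!/bin/sh
# Sketch only. still_mounted is a hypothetical helper: it prints every mount
# point at or below /mnt found in a mounts table (default: /proc/mounts).
still_mounted() {
    awk '$2 == "/mnt" || index($2, "/mnt/") == 1 { print $2 }' "${1:-/proc/mounts}"
}

leftover="$(still_mounted)"
if [ -n "$leftover" ]; then
    echo "still mounted:"
    echo "$leftover"
else
    echo "nothing left mounted under /mnt"
fi
```

If it lists anything, unmount those paths (deepest first) before shutting down.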
And it worked! No freezes and no being stuck at initramfs. :)
Hope this helps someone, but remember the disclaimer!
I have also been having a lot of trouble with 4.15.0-151 for the past 2 days. My Ubuntu 18.04 LTS keeps hanging randomly. Each time I need to restart in maintenance mode, check the filesystems, fix the errors, then reboot normally. After 4 hangs within a few minutes, I reverted to 4.15.0-147, which works correctly. There's definitely something broken in build 151.
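To tell quickly whether a machine is running the build I had trouble with, something like the following could be used. It is only a sketch: is_suspect_kernel is a name made up for this example, and booting the older kernel from the GRUB menu is one common way to go back, not necessarily the only one.

```shell
#!/bin/sh
# Sketch only. is_suspect_kernel is a hypothetical helper: it succeeds when a
# kernel release string belongs to the 4.15.0-151 build described above.
is_suspect_kernel() {
    case "$1" in
        4.15.0-151-*) return 0 ;;
        *) return 1 ;;
    esac
}

running="$(uname -r)"
if is_suspect_kernel "$running"; then
    echo "Running $running: consider booting 4.15.0-147 from the GRUB menu"
else
    echo "Running $running: not the 4.15.0-151 build"
fi
```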
This is what I found in the journal after a reboot forced by a system hang. It never happened with 4.15.0-147:
lug 23 15:11:17 Lucifer kernel: BUG: unable to handle kernel paging request at ffff913c0fee15bc
lug 23 15:11:17 Lucifer kernel: IP: __kmalloc_node_track_caller+0x142/0x2b0
lug 23 15:11:17 Lucifer kernel: PGD 0 P4D 0
lug 23 15:11:17 Lucifer kernel: Oops: 0000 [#1] SMP PTI
lug 23 15:11:17 Lucifer kernel: Modules linked in: vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) ccm cmac rfcomm bnep zram binfmt_misc intel_rapl x86_pkg_tem
lug 23 15:11:17 Lucifer kernel: input_leds serio_raw fb_sys_fops snd_seq snd_seq_device syscopyarea snd_timer shpchp snd sysfillrect soundcore mei_me me
lug 23 15:11:17 Lucifer kernel: CPU: 0 PID: 10386 Comm: Socket Thread Tainted: P W OE 4.15.0-151-generic #157-Ubuntu
lug 23 15:11:17 Lucifer kernel: Hardware name: Acer Aspire E5-771G/EA70_HB, BIOS V1.07 06/04/2014
lug 23 15:11:17 Lucifer kernel: RIP: 0010:__kmalloc_node_track_caller+0x142/0x2b0
lug 23 15:11:17 Lucifer kernel: RSP: 0018:ffffad0500987bb8 EFLAGS: 00010282
lug 23 15:11:17 Lucifer kernel: RAX: ffff913c0fee15bc RBX: 00000000014102c0 RCX: ffffffff9d85a077
lug 23 15:11:17 Lucifer kernel: RDX: 000000000000a81a RSI: 0000000000000000 RDI: 0000000000026180
lug 23 15:11:17 Lucifer kernel: RBP: ffffad0500987bf8 R08: ffff91e25f226180 R09: ffff91e25a802d80
lug 23 15:11:17 Lucifer kernel: R10: 0000000000000000 R11: ffff91e25a802d80 R12: 00000000014102c0
lug 23 15:11:17 Lucifer kernel: R13: 0000000000000800 R14: 00000000ffffffff R15: ffff913c0fee15bc
lug 23 15:11:17 Lucifer kernel: FS: 00007f0e53755700(0000) GS:ffff91e25f200000(0000) knlGS:0000000000000000
lug 23 15:11:17 Lucifer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
lug 23 15:11:17 Lucifer kernel: CR2: ffff913c0fee15bc CR3: 000000015460e001 CR4: 00000000001606f0
lug 23 15:11:17 Lucifer kernel: Call Trace:
lug 23 15:11:17 Lucifer kernel: ? __alloc_skb+0x87/0x1d0
lug 23 15:11:17 Lucifer kernel: __kmalloc_reserve.isra.43+0x31/0x90
lug 23 15:11:17 Lucifer kernel: ? tcp_v4_md5_lookup+0x13/0x20
lug 23 15:11:17 Lucifer kernel: __alloc_skb+0x87/0x1d0
lug 23 15:11:17 Lucifer kernel: sk_stream_alloc_skb+0x56/0x1f0
lug 23 15:11:17 Lucifer kernel: tcp_sendmsg_locked+0x515/0xec0
lug 23 15:11:17 Lucifer kernel: tcp_sendmsg+0x2c/0x50
lug 23 15:11:17 Lucifer kernel: inet_sendmsg+0x2e/0xb0
lug 23 15:11:17 Lucifer kernel: sock_sendmsg+0x3e/0x50
lug 23 15:11:17 Lucifer kernel: SYSC_sendto+0x13f/0x180
lug 23 15:11:17 Lucifer kernel: ? SyS_futex+0x13b/0x180
lug 23 15:11:17 Lucifer kernel: SyS_sendto+0xe/0x10
lug 23 15:11:17 Lucifer kernel: do_syscall_64+0x73/0x130
lug 23 15:11:17 Lucifer kernel: entry_SYSCALL_64_after_hwframe+0x41/0xa6
lug 23 15:11:17 Lucifer kernel: RIP: 0033:0x7f0e53419a9e
lug 23 15:11:17 Lucifer kernel: RSP: 002b:00007f0e537540b0 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
lug 23 15:11:17 Lucifer kernel: RAX: ffffffffffffffda RBX: 00000000000000ac RCX: 00007f0e53419a9e
lug 23 15:11:17 Lucifer kernel: RDX: 0000000000000060 RSI: 00007f0e0f4a2000 RDI: 00000000000000ac
lug 23 15:11:17 Lucifer kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
lug 23 15:11:17 Lucifer kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0e0f4a2000
lug 23 15:11:17 Lucifer kernel: R13: 0000000000000060 R14: 0000000000000000 R15: 0000000000000000
lug 23 15:11:17 Lucifer kernel: Code: 4c 89 df 4c 89 5d c8 e8 bd ae 01 00 49 89 c1 4c 8b 5d c8 4d 85 c9 0f 85 35 ff ff ff 45 31 ff eb 4e 49 63 41 20 49 8
lug 23 15:11:17 Lucifer kernel: RIP: __kmalloc_node_track_caller+0x142/0x2b0 RSP: ffffad0500987bb8
lug 23 15:11:17 Lucifer kernel: CR2: ffff913c0fee15bc
lug 23 15:11:17 Lucifer kernel: ---[ end trace be1d19e661060db7 ]---
lug 23 15:11:17 Lucifer kernel: BUG: unable to handle kernel paging request at ffff913c0fee15bc
lug 23 15:11:17 Lucifer kernel: IP: __kmalloc_node_track_caller+0x142/0x2b0
lug 23 15:11:17 Lucifer kernel: PGD 0 P4D 0
lug 23 15:11:17 Lucifer kernel: Oops: 0000 [#2] SMP PTI
lug 23 15:11:17 Lucifer kernel: Modules linked in: vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) ccm cmac rfcomm bnep zram binfmt_misc intel_rapl x86_pkg_tem
lug 23 15:11:17 Lucifer kernel: input_leds serio_raw fb_sys_fops snd_seq snd_seq_device syscopyarea snd_timer shpchp snd sysfillrect soundcore mei_me me
lug 23 15:11:17 Lucifer kernel: CPU: 0 PID: 10382 Comm: IPC I/O Parent Tainted: P D W OE 4.15.0-151-generic #157-Ubuntu
lug 23 15:11:17 Lucifer kernel: Hardware name: Acer Aspire E5-771G/EA70_HB, BIOS V1.07 06/04/2014
lug 23 15:11:17 Lucifer kernel: RIP: 0010:__kmalloc_node_track_caller+0x142/0x2b0
lug 23 15:11:17 Lucifer kernel: RSP: 0018:ffffad0500753a98 EFLAGS: 00010282
lug 23 15:11:17 Lucifer kernel: RAX: ffff913c0fee15bc RBX: 00000000015102c0 RCX: ffffffff9d85a077
lug 23 15:11:17 Lucifer kernel: RDX: 000000000000a81a RSI: 0000000000000000 RDI: 0000000000026180
lug 23 15:11:17 Lucifer kernel: RBP: ffffad0500753ad8 R08: ffff91e25f226180 R09: ffff91e25a802d80
lug 23 15:11:17 Lucifer kernel: R10: ffffad0500753d58 R11: ffff91e25a802d80 R12: 00000000015102c0
lug 23 15:11:17 Lucifer kernel: R13: 0000000000000740 R14: 00000000ffffffff R15: ffff913c0fee15bc
lug 23 15:11:17 Lucifer kernel: FS: 00007f0e53818700(0000) GS:ffff91e25f200000(0000) knlGS:0000000000000000
lug 23 15:11:17 Lucifer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
lug 23 15:11:17 Lucifer kernel: CR2: ffff913c0fee15bc CR3: 000000015460e003 CR4: 00000000001606f0
lug 23 15:11:17 Lucifer kernel: Call Trace:
lug 23 15:11:17 Lucifer kernel: ? __alloc_skb+0x87/0x1d0
lug 23 15:11:17 Lucifer kernel: __kmalloc_reserve.isra.43+0x31/0x90
lug 23 15:11:17 Lucifer kernel: __alloc_skb+0x87/0x1d0
lug 23 15:11:17 Lucifer kernel: alloc_skb_with_frags+0x56/0x1b0
lug 23 15:11:17 Lucifer kernel: ? wait_woken+0x80/0x80
lug 23 15:11:17 Lucifer kernel: sock_alloc_send_pskb+0x1f2/0x220
lug 23 15:11:17 Lucifer kernel: ? _cond_resched+0x19/0x40
lug 23 15:11:17 Lucifer kernel: ? wait_for_unix_gc+0x37/0xb0
lug 23 15:11:17 Lucifer kernel: unix_stream_sendmsg+0x1b6/0x390
lug 23 15:11:17 Lucifer kernel: sock_sendmsg+0x3e/0x50
lug 23 15:11:17 Lucifer kernel: ___sys_sendmsg+0x2a0/0x2f0
lug 23 15:11:17 Lucifer kernel: ? get_futex_key+0x2f7/0x3b0
lug 23 15:11:17 Lucifer kernel: ? touch_atime+0x36/0xe0
lug 23 15:11:17 Lucifer kernel: ? futex_wake+0x8f/0x180
lug 23 15:11:17 Lucifer kernel: ? do_futex+0x18f/0x4e0
lug 23 15:11:17 Lucifer kernel: __sys_sendmsg+0x54/0x90
lug 23 15:11:17 Lucifer kernel: ? __sys_sendmsg+0x54/0x90
lug 23 15:11:17 Lucifer kernel: SyS_sendmsg+0x12/0x20
lug 23 15:11:17 Lucifer kernel: do_syscall_64+0x73/0x130
lug 23 15:11:17 Lucifer kernel: entry_SYSCALL_64_after_hwframe+0x41/0xa6
lug 23 15:11:17 Lucifer kernel: RIP: 0033:0x7f0e5341a6f7
lug 23 15:11:17 Lucifer kernel: RSP: 002b:00007f0e538136d0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
lug 23 15:11:17 Lucifer kernel: RAX: ffffffffffffffda RBX: 000000000000007b RCX: 00007f0e5341a6f7
lug 23 15:11:17 Lucifer kernel: RDX: 0000000000000040 RSI: 00007f0e53813770 RDI: 000000000000007b
lug 23 15:11:17 Lucifer kernel: RBP: 00007f0e53813770 R08: 0000000000000000 R09: 00007f0df2bda380
lug 23 15:11:17 Lucifer kernel: R10: 00007f0df723c59c R11: 0000000000000293 R12: 0000000000000040
lug 23 15:11:17 Lucifer kernel: R13: 00007f0e53813750 R14: 00007f0e2a4db1a0 R15: 0000000000000001
lug 23 15:11:17 Lucifer kernel: Code: 4c 89 df 4c 89 5d c8 e8 bd ae 01 00 49 89 c1 4c 8b 5d c8 4d 85 c9 0f 85 35 ff ff ff 45 31 ff eb 4e 49 63 41 20 49 8
lug 23 15:11:17 Lucifer kernel: RIP: __kmalloc_node_track_caller+0x142/0x2b0 RSP: ffffad0500753a98
lug 23 15:11:17 Lucifer kernel: CR2: ffff913c0fee15bc
lug 23 15:11:17 Lucifer kernel: ---[ end trace be1d19e661060db8 ]---
lug 23 15:11:17 Lucifer kernel: BUG: unable to handle kernel paging request at ffff913c0fee15bc
lug 23 15:11:17 Lucifer kernel: IP: __kmalloc_node_track_caller+0x142/0x2b0