Login loop after upgrading to the 4.4.0-116 kernel: graphical login screen -> black screen -> graphical login screen

Solution 1:

The cause is a gcc version that doesn't support retpoline (see "What is a retpoline and how does it work?"). See the Ubuntu bug: 4.4.0-116 Kernel update on 2/21 breaks Nvidia drivers (on 14.04 and 16.04).

In my case, purging ppa:ubuntu-toolchain-r/test to get the default gcc version back, and then rebuilding the nvidia module with DKMS (by reinstalling the 4.4.0-116 kernel), fixed the problem. See the instructions posted by @cjjefcoat on the bug tracker.

Restore default gcc by purging ppa:ubuntu-toolchain-r/test's version:

$ sudo apt-get install ppa-purge
$ sudo ppa-purge ppa:ubuntu-toolchain-r/test

gcc version (on Ubuntu 16.04) with retpoline support:

$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
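If your gcc version differs from the one above, a quick probe can show whether it accepts the retpoline-related compiler flags. This is a minimal sketch, not part of the original fix: the function name `gcc_supports_retpoline` is made up for illustration, and it assumes the relevant flags are `-mindirect-branch=thunk-extern` and `-mindirect-branch-register` (x86 only).

```shell
#!/bin/bash
# Sketch: report whether a compiler accepts the retpoline-related flags.
# Assumption: the flags that matter are -mindirect-branch=thunk-extern and
# -mindirect-branch-register (x86 only). The function name is hypothetical.
gcc_supports_retpoline() {
    local cc="${1:-gcc}" tmp status
    tmp="$(mktemp -d)" || return 2
    # Compile a trivial program from stdin; only the exit status matters.
    echo 'int main(void) { return 0; }' |
        "$cc" -mindirect-branch=thunk-extern -mindirect-branch-register \
            -x c -c -o "$tmp/probe.o" - 2>/dev/null
    status=$?
    rm -rf "$tmp"
    return "$status"
}

if gcc_supports_retpoline; then
    echo "gcc accepts the retpoline flags"
else
    echo "gcc rejects the retpoline flags (or gcc is not installed)"
fi
```

A compiler that accepts the flags is a good sign but not proof; the vermagic check further down is what actually confirms the rebuilt module.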

Reinstall kernel:

$ sudo apt-get purge linux-headers-4.4.0-116 linux-headers-4.4.0-116-generic linux-image-4.4.0-116-generic linux-image-extra-4.4.0-116-generic linux-signed-image-4.4.0-116-generic
$ sudo apt-get install linux-generic linux-signed-generic

Check nvidia module:

$ modinfo nvidia_xxx -k 4.4.0-116-generic | grep vermagic
vermagic:       4.4.0-116-generic SMP mod_unload modversions retpoline 

Replace _xxx with your driver version (just press TAB after typing modinfo nvidia).

retpoline should be in the output.
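To run the same check over every DKMS-built module rather than just nvidia, something like the following may help. It is a sketch under assumptions: the function names are made up, and it assumes DKMS modules live in /lib/modules/<kernel>/updates/dkms, which may differ on other setups.

```shell
#!/bin/bash
# Return success if a vermagic string contains the word "retpoline".
# (Hypothetical helper name.)
vermagic_has_retpoline() {
    grep -qw retpoline <<<"$1"
}

# Check every DKMS-built module of a kernel (defaults to the running one).
# Assumption: DKMS modules are installed under updates/dkms.
check_dkms_modules() {
    local kernel="${1:-$(uname -r)}" module
    for module in /lib/modules/"$kernel"/updates/dkms/*.ko; do
        [ -e "$module" ] || continue   # glob matched nothing
        if vermagic_has_retpoline "$(modinfo -F vermagic "$module")"; then
            echo "ok:      $module"
        else
            echo "MISSING: $module"
        fi
    done
}

check_dkms_modules
```

Pass a kernel release as the argument, e.g. `check_dkms_modules 4.4.0-116-generic`, to check a kernel other than the running one.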

After that, the reboot completed successfully and the login loop was gone.


If you already have a compatible gcc version, you can rebuild the nvidia module with the dkms command, without reinstalling the kernel:

# dkms remove nvidia-xxx/yyy.zzz -k 4.4.0-116-generic
# dkms install nvidia-xxx/yyy.zzz -k 4.4.0-116-generic

I decided to reinstall the kernel instead, so that every module DKMS had built with the wrong gcc version got rebuilt.
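For the dkms-only route, the nvidia-xxx/yyy.zzz argument can be derived from the output of dkms status. The sketch below assumes the "name, version, kernel, arch: state" line format; other dkms versions may print a different layout, so check the output by eye first. The function name and the sample line (including its version numbers) are made up for illustration.

```shell
#!/bin/bash
# Convert `dkms status` lines ("name, version, kernel, arch: state")
# into the "name/version" form passed to dkms remove / dkms install.
# Assumption: the comma-separated format shown above.
dkms_specs() {
    sed -r 's#^([^,]+), ([^,]+), .*$#\1/\2#'
}

# Made-up sample line for illustration; real input would come from:
#   dkms status -k 4.4.0-116-generic
echo 'nvidia-384, 384.111, 4.4.0-116-generic, x86_64: installed' | dkms_specs
# prints: nvidia-384/384.111
```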

Solution 2:

I don't know whether Ask Ubuntu is the right place for this, but since I need a newer g++ and still update the kernel periodically, I've written a bash script that (1) purges ppa:ubuntu-toolchain-r/test, (2) rebuilds all DKMS modules for the chosen kernels, and (3) installs g++-7 again, per this answer.

The script is provided "as is", without warranties of any kind.
Please don't use it unless you understand the meaning of every line.
It's intended to save time on things you could do manually, not to perform "magic" you don't understand.

The script:

#!/bin/bash -e

for list in /etc/apt/sources.list.d/ubuntu-toolchain-r*.list; do
    sudo cp -a "$list" "$list.backup"
    echo "Backed up $list to $list.backup"
done
sudo ppa-purge ppa:ubuntu-toolchain-r/test

readarray -t kernels < <(ls -1 /lib/modules)
echo "Kernels: ${kernels[*]}"
for kernel in "${kernels[@]}"; do
    dkms_modules=($(sudo dkms status -k "$kernel" | sed -r 's#^([^,]+), ([^,]+), .*$#\1/\2#'))
    while true; do
        echo
        read -r -p "Reinstall DKMS-modules (${dkms_modules[*]}) on kernel $kernel? [Y/n] " choice
        if [ "${choice^^}" = N ]; then continue 2; fi
        if [ "${choice^^}" = Y ] || [ -z "$choice" ]; then break; fi
        echo "Expected 'y', 'n' or '', but got '$choice'"
    done
    echo
    for dkms_module in "${dkms_modules[@]}"; do
        sudo dkms remove -k "$kernel" "$dkms_module"
        sudo dkms install -k "$kernel" "$dkms_module"
    done
    echo
    for module in /lib/modules/"$kernel"/updates/dkms/*.ko; do
        [ -e "$module" ] || continue  # skip if the glob matched nothing
        vermagic="$(modinfo -F vermagic "$module")"
        echo -n "Vermagic for $(basename "${module%.ko}"): $vermagic -- "
        grep -qF retpoline <<<"$vermagic" && echo ok || echo "'retpoline' is missing!!!"
    done
done

for list in /etc/apt/sources.list.d/ubuntu-toolchain-r*.list; do
    sudo mv "$list.backup" "$list"
    echo "Restored $list from $list.backup"
done
sudo apt update
sudo apt install g++-7
sudo apt dist-upgrade