r8168/r8169 ethernet fails to connect to Internet after resume from sleep

My r8168 Ethernet card (using the r8169 driver) won't connect to the Internet after resuming from sleep. It doesn't appear to get an IPv4 address.

sudo lshw -C network (cable not connected)

  *-network                 
       description: Ethernet interface
       product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
       vendor: Realtek Semiconductor Co., Ltd.
       physical id: 0
       bus info: pci@0000:02:00.0
       logical name: eth0
       version: 0c
       serial: xx:xx:xx:xx:xx:xx
       capacity: 1Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress msix vpd bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=r8169 firmware=rtl8168g-2_0.0.1 02/06/13 latency=0 link=no multicast=yes port=MII
       resources: irq:18 ioport:e000(size=256) memory:f7d00000-f7d00fff memory:f0000000-f0003fff

This question describes a very similar problem, but none of the fixes I found there fixed my broken network.

  • the network works fine from Windows 10
  • the network works fine in Ubuntu 19.10, until I resume from sleep
  • rebooting does not fix the broken network
  • powering off the computer does not fix the broken network
  • restarting NetworkManager does not fix the broken network
  • trying netplan, instead of NetworkManager, does not fix the broken network
  • unloading and reloading the r8169 driver does not fix the broken network
  • I tried adding pci=nomsi or pcie_pme=nomsi to the kernel line in GRUB, but it does not fix the broken network
  • replacing the laptop motherboard does not fix the broken network (it worked at first, then failed again days later)
  • an external USB->Ethernet adapter works fine

Update #1:

[   18.014660] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[   18.014680] WARNING: CPU: 1 PID: 1420 at net/sched/sch_generic.c:447 dev_watchdog+0x258/0x260
[   18.014680] Modules linked in: [modules list redacted]
[   18.014759] CPU: 1 PID: 1420 Comm: clamd Not tainted 5.3.0-23-generic #25-Ubuntu
[   18.014760] Hardware name: TOSHIBA Satellite E55-A/ZEMAA, BIOS 1.50 12/02/2013
[   18.014764] RIP: 0010:dev_watchdog+0x258/0x260
[   18.014767] Code: 85 c0 75 e5 eb 9f 4c 89 ff c6 05 ae 37 eb 00 01 e8 0d f9 fa ff 44 89 e9 4c 89 fe 48 c7 c7 28 d3 e0 92 48 89 c2 e8 03 20 74 ff <0f> 0b eb 80 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 57 49 89 d7
[   18.014768] RSP: 0000:ffffb71ac001ce30 EFLAGS: 00010286
[   18.014770] RAX: 0000000000000000 RBX: ffff90b1431ed000 RCX: 0000000000000006
[   18.014771] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff90b14f897440
[   18.014773] RBP: ffffb71ac001ce60 R08: 00000000000003c8 R09: 0000000000000004
[   18.014774] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
[   18.014775] R13: 0000000000000000 R14: ffff90b14ddb6480 R15: ffff90b14ddb6000
[   18.014776] FS:  00007f82ea445d80(0000) GS:ffff90b14f880000(0000) knlGS:0000000000000000
[   18.014778] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   18.014779] CR2: 00007f82d6205000 CR3: 00000004064d6001 CR4: 00000000001606e0
[   18.014780] Call Trace:
[   18.014782]  <IRQ>
[   18.014787]  ? pfifo_fast_enqueue+0x150/0x150
[   18.014792]  call_timer_fn+0x32/0x130
[   18.014795]  __run_timers.part.0+0x177/0x270
[   18.014799]  ? enqueue_hrtimer+0x3d/0x90
[   18.014803]  ? recalibrate_cpu_khz+0x10/0x10
[   18.014806]  ? ktime_get+0x42/0xa0
[   18.014809]  run_timer_softirq+0x2a/0x50
[   18.014812]  __do_softirq+0xe1/0x2d6
[   18.014815]  ? hrtimer_interrupt+0x13b/0x220
[   18.014818]  irq_exit+0xae/0xb0
[   18.014821]  smp_apic_timer_interrupt+0x7b/0x140
[   18.014823]  apic_timer_interrupt+0xf/0x20
[   18.014824]  </IRQ>
[   18.014827] RIP: 0033:0x7f82eceb91e7
[   18.014829] Code: 00 48 81 fa 80 00 00 00 0f 82 9c 02 00 00 c5 fe 6f 0e c5 f5 74 0f c5 fe 6f 56 20 c5 ed 74 57 20 c5 fe 6f 5e 40 c5 e5 74 5f 40 <c5> fe 6f 66 60 c5 dd 74 67 60 c5 ed db e9 c5 dd db f3 c5 cd db ed
[   18.014830] RSP: 002b:00007fff32fd9458 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[   18.014832] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff00000000ffff
[   18.014833] RDX: 000000000000008c RSI: 00007f82d6205c6a RDI: 00007f82da1c92b2
[   18.014834] RBP: 00007f82d6205bd8 R08: 0000000000000046 R09: 0000000000000003
[   18.014835] R10: 0000000000026d38 R11: 0000000000026d1d R12: 00007f82da1c9368
[   18.014836] R13: 0000000000000046 R14: 00007f82da1c9220 R15: 00007f82da1c8130
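
To check whether the same watchdog timeout appears in your own kernel log, a small filter like the one below may help. It is only a sketch: the function name watchdog_timeouts is made up for illustration, and the match strings are copied from the log above.

```shell
# Print NETDEV WATCHDOG transmit-timeout lines for one driver.
# Reads kernel log text from stdin, so it works with dmesg or a saved log.
# $1: driver name as it appears in the log (defaults to r8169)
# watchdog_timeouts is a hypothetical helper name, not an existing tool.
watchdog_timeouts() {
    driver="${1:-r8169}"
    grep -F "NETDEV WATCHDOG" | grep -F "($driver): transmit queue"
}
```

Typical use would be "dmesg | watchdog_timeouts" (dmesg may need sudo on some systems). No output means no matching timeout was logged.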

Solution 1:

After working reliably for years, my Ethernet connection was broken by a software update. I believe it was due to changes in the r8169 driver.

After days of troubleshooting, I even replaced my motherboard. Ethernet worked at first, and then days later it failed again.

I booted from an Ubuntu 19.10 Live USB, and Ethernet didn't work there either.

I booted into Windows 10 and it didn't work there either, although it had previously. So I used the Windows network troubleshooter to reset the Ethernet adapter, and it started to work again.

Back in Ubuntu, I retried the r8168-dkms driver, which I had tried before without luck, and this time it seemed to work. A reboot was required after installing it.
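
After switching between r8169 and r8168-dkms, it can be useful to confirm which driver is actually bound to the card. The following is a minimal sketch, assuming the interface is named eth0 as in the lshw output; the function name nic_driver is hypothetical, and it relies only on the standard sysfs layout under /sys/class/net.

```shell
# Print the kernel driver currently bound to a network interface.
# $1: interface name (defaults to eth0, the name shown by lshw in the question)
# $2: sysfs network class directory (defaults to /sys/class/net)
# nic_driver is a hypothetical helper name used for illustration only.
nic_driver() {
    iface="${1:-eth0}"
    net_dir="${2:-/sys/class/net}"
    link="$net_dir/$iface/device/driver"
    # The 'driver' entry is a symlink that only exists while a driver is bound.
    if [ ! -L "$link" ]; then
        echo "no driver bound to $iface" >&2
        return 1
    fi
    # The last component of the symlink target is the driver name.
    basename "$(readlink "$link")"
}
```

Running "nic_driver eth0" after the reboot should print r8168 if the dkms driver took over, or r8169 if the in-kernel driver is still in use.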

Update #1:

This finally fixed the problem. MSI interrupts were enabled for the r8168/r8169, and this script disables them for this card only. Follow the installation instructions at the beginning of the script.

#!/bin/sh

#
# https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1779817
#

# Attached is a work-around for the in-kernel driver that is as unhacky
# as I can make it.

# filename: r8169_disable_msi

# Drop it in /etc/initramfs-tools/scripts/init-top and chmod a+x it. 
# Add 'r8169_disable_msi' to your kernel command line
# (/etc/default/grub, GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" usually.) 

# Remember to update-initramfs and update-grub as necessary.

# sudo update-initramfs -c -k $(uname -r)
# sudo update-grub
# reboot

# For the moment it disables MSI on everything with the ID  
# 0x10ec:0x8168, as there seems to be no way to get the MAC version 
# from userspace - and certainly not before the driver is loaded. 
# Other PCI IDs may need adding..

PREREQ=""
prereqs()
{
    echo "$PREREQ"
}
case $1 in
# get pre-requisites
prereqs)
    prereqs
    exit 0
    ;;
esac

disable_msi () {
    for i in /sys/bus/pci/devices/*; do 
        # Quote the expansions so an empty or unreadable file cannot break the test.
        if [ "$(cat "$i/vendor")" = "0x10ec" ] && [ "$(cat "$i/device")" = "0x8168" ]; then
            echo 0 > "$i/msi_bus"
        fi
    done
}

# Only act when 'r8169_disable_msi' is present on the kernel command line.
for x in $(cat /proc/cmdline); do
    case ${x} in
    r8169_disable_msi)
        disable_msi
        break
        ;;
    esac
done
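
After rebooting, you may want to check that the script above actually ran. The sketch below reads back the same msi_bus attribute that the script writes, for the same 0x10ec:0x8168 ID. The function name show_msi_state and its optional arguments are assumptions made for illustration, not part of the original script.

```shell
# Print the msi_bus value of every PCI device matching a vendor:device ID.
# $1: sysfs PCI device directory (defaults to /sys/bus/pci/devices)
# $2, $3: vendor and device IDs (default to 0x10ec and 0x8168, as in the script)
# show_msi_state is a hypothetical helper name used for illustration only.
show_msi_state() {
    pci_dir="${1:-/sys/bus/pci/devices}"
    vendor="${2:-0x10ec}"
    device="${3:-0x8168}"
    for dev in "$pci_dir"/*; do
        # Skip entries that lack any of the attributes we need to read.
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] && [ -r "$dev/msi_bus" ] || continue
        if [ "$(cat "$dev/vendor")" = "$vendor" ] && [ "$(cat "$dev/device")" = "$device" ]; then
            # ${dev##*/} strips the directory part, leaving the PCI address.
            echo "${dev##*/} msi_bus=$(cat "$dev/msi_bus")"
        fi
    done
}
```

Calling "show_msi_state" with no arguments should print one line per matching card. A value of msi_bus=0 suggests the script took effect. If it prints msi_bus=1, check with "cat /proc/cmdline" that r8169_disable_msi is really on the kernel command line.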