Linux 5.9 VLAN interfaces don't receive traffic unless promiscuous mode is enabled

I've just upgraded a NAS running Debian Buster to Debian Bullseye. This included a kernel upgrade from 4.19.0 to 5.9.0, and a systemd upgrade from 241 to 247.1 (the system is using systemd-networkd for configuration).

The network configuration is moderately complex:

  • eno1/eno2: dual-port Intel I210 onboard Gigabit Ethernet (using the igb driver)

  • main: 802.3ad bonding interface using eno1/eno2 as physical links

  • vlan60/vlan63: VLAN subinterfaces using main as their base

main, vlan60, and vlan63 all have IPv4 and IPv6 addresses on them, but there are no addresses on eno1 or eno2.

All of the interfaces are using the default MAC address mode, which means all five are using the built-in MAC address of eno1. While attempting to solve this problem I configured locally-administered MAC addresses on vlan60 and vlan63 but that did not help (in fact it broke IPv6 support on main, but I never figured out why).

With the Buster configuration, this all worked fine. After upgrading to Bullseye, main works fine, but vlan60 and vlan63 do not send or receive any external traffic. Traffic to and from their addresses within the host itself works fine.

While attempting to troubleshoot this, I started tcpdump on vlan63, and immediately noticed that traffic began flowing. Stopping the packet capture caused traffic to stop flowing again.

At the moment I've got the system up and running by executing ip link set dev eno1 promisc on and ip link set dev eno2 promisc on, and all the interfaces are happily passing traffic. It is not necessary to enable promiscuous mode on the higher-level interfaces, only on the physical interfaces.
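
As a rough sketch of how to check the state of this workaround, the script below reads each physical interface's flags from sysfs and tests the IFF_PROMISC bit (0x100). It assumes the interface names eno1 and eno2 from this setup; is_promisc is a hypothetical helper name. It only prints the command to run and does not change any interface state.

```shell
#!/bin/sh
# IFF_PROMISC bit as defined in <linux/if.h>.
IFF_PROMISC=0x100

# is_promisc FLAGS: succeed if the hex flags value has IFF_PROMISC set.
# (Hypothetical helper name, for illustration only.)
is_promisc() {
    [ $(( $1 & $IFF_PROMISC )) -ne 0 ]
}

# Physical bond members from the question; adjust for other systems.
for dev in eno1 eno2; do
    flags_file="/sys/class/net/$dev/flags"
    # Skip interfaces that do not exist on this machine.
    [ -r "$flags_file" ] || continue
    if is_promisc "$(cat "$flags_file")"; then
        echo "$dev: promiscuous mode is on"
    else
        echo "$dev: promiscuous mode is off; to enable (as root): ip link set dev $dev promisc on"
    fi
done
```

Note that a setting made with ip link does not persist across reboots, so it would have to be reapplied at boot by whatever mechanism suits the system.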

Without promiscuous mode enabled, it appears that the VLAN subinterfaces don't receive any broadcast frames from the network, so this results in ARP and NDP being non-functional.
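
One way to observe this without the capture itself changing the result: tcpdump puts the interface into promiscuous mode by default, which is consistent with traffic starting to flow whenever a capture was running. The -p option tells tcpdump not to do that. A minimal sketch, where capture_no_promisc is a hypothetical helper name and vlan63 is the interface from this setup:

```shell
#!/bin/sh
# capture_no_promisc DEV: watch ARP and ICMPv6 (which carries NDP) on DEV
# without letting tcpdump switch the interface into promiscuous mode.
#   -p  do not put the interface into promiscuous mode
#   -n  do not resolve addresses to names
#   -e  print link-level headers (source/destination MAC addresses)
capture_no_promisc() {
    tcpdump -p -n -e -i "$1" 'arp or icmp6'
}

# Example (run as root):
#   capture_no_promisc vlan63
```

If broadcast ARP requests and multicast neighbour solicitations from other hosts show up only when -p is omitted, that would support the theory that the frames are being filtered before they reach the VLAN subinterface.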

Has there been some behavior change between these kernel versions (I know, that's a lot of kernel versions...) which could affect this?


Solution 1:

I'm not sure what changed in the kernel, but I had the same issue on Proxmox VE 7.0 (Debian 11) with kernel version 5.11.22-1. The issue went away when I upgraded the kernel to 5.11.22-3.

It may well be a kernel issue. Try upgrading to a newer kernel version and see whether the problem goes away.