Linux Traffic Control: How to prioritize traffic using bridge and qdisc?

I'm trying to prioritize traffic over the Linux-based software bridge in my network. When I generate traffic locally (on the machine hosting the bridge), the traffic is prioritized correctly. However, the "remote" traffic (from other nodes, passing through the bridge) is not prioritized: all senders get the same share of bandwidth. Does anyone know why?

The bridge is set up as follows on an I350 network adapter (Linux 5.1.8-1-MANJARO #1 SMP PREEMPT Sun Jun 9 20:44:14 UTC 2019 x86_64 GNU/Linux):

brctl addbr br0
ip link set dev enp1s0f0 promisc on
ip link set dev enp1s0f1 promisc on
ip link set dev enp1s0f2 promisc on
ip link set dev enp1s0f3 promisc on

brctl addif br0 enp1s0f0
brctl addif br0 enp1s0f1
brctl addif br0 enp1s0f2
brctl addif br0 enp1s0f3

ip link set dev br0 up

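# Attach a prio qdisc with an explicit handle (1:) to every bridge port,
# so that filters can refer to it as "parent 1:".
tc qdisc del dev enp1s0f0 root
tc qdisc add dev enp1s0f0 root handle 1: prio
tc qdisc del dev enp1s0f1 root
tc qdisc add dev enp1s0f1 root handle 1: prio
tc qdisc del dev enp1s0f2 root
tc qdisc add dev enp1s0f2 root handle 1: prio
tc qdisc del dev enp1s0f3 root
tc qdisc add dev enp1s0f3 root handle 1: prio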

ip addr add 192.168.1.1/24 dev br0

UDP traffic is generated with iperf3, with the TOS field set appropriately, e.g.:

Low-Prio Sender: iperf3 -c 192.168.1.140 -u -b 100m -S 0x2 -p 5201 -t 30
Hi-Prio Sender : iperf3 -c 192.168.1.140 -u -b 100m -S 0x0 -p 5202 -t 30

The priomap is left at its default setting: priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

Prioritization works for remote traffic if I explicitly classify the traffic:

tc filter add dev enp1s0f1 parent 1: protocol ip prio 10 u32 match ip dst 192.168.1.140 match ip dport 5201 0xffff flowid 1:1
tc filter add dev enp1s0f1 parent 1: protocol ip prio 20 u32 match ip dst 192.168.1.140 match ip dport 5202 0xffff flowid 1:2

but not with the default settings. Maybe it is a Layer 2 / Layer 3 issue?


I've read the source code of the bridge and of the prio queue scheduler, and here is what I found:

  • The prio qdisc uses the skb->priority field to classify packets with the priomap.
  • By default the skb->priority field isn't filled in for L2 transit frames.
  • So the right way is to add a classifier to every bridge port that classifies the frames by their ToS/DSCP field.
  • By default the priomap looks like 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1.
  • The same mapping from ToS to queue band can be done with a classifier. With the root qdisc handle 1: (i.e. 1:0), the child classes have classids 1:1 to 1:3, so band N of the priomap corresponds to class 1:(N+1):
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x00 classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x02 classid 1:3
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x04 classid 1:3
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x06 classid 1:3
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x08 classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x0a classid 1:3
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x0c classid 1:1
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x0e classid 1:1
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x10 classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x12 classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x14 classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x16 classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x18 classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x1a classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x1c classid 1:2
tc filter add dev eth0 parent 1: protocol ip flower ip_tos 0x1e classid 1:2
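
The sixteen filters above follow a regular pattern, so they could also be generated from the priomap. The following is only a sketch under the same convention as the list above (the i-th priomap entry is applied to ToS value 2*i, and band N is class 1:(N+1)). The function name gen_tos_filters is made up for illustration, and the script only prints the commands rather than running them:

```shell
#!/bin/sh
# Print (do not execute) one flower filter per ToS value.
# Convention taken from the list above: the i-th priomap entry is
# applied to ToS value 2*i, and band N is addressed as class 1:(N+1).
# gen_tos_filters is a hypothetical helper name.
gen_tos_filters() {
    dev=$1
    shift
    i=0
    for band in "$@"; do
        printf 'tc filter add dev %s parent 1: protocol ip flower ip_tos 0x%02x classid 1:%d\n' \
            "$dev" $((i * 2)) $((band + 1))
        i=$((i + 1))
    done
}

# Default priomap from the question, applied to eth0.
gen_tos_filters eth0 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
```

After checking the printed commands, the output can be piped to sh (as root), once per bridge port.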

In the meantime I have managed to find the solution :)

Linux bridge (brctl) works as a Layer 2 device.

The TOS marking belongs to the IP header, i.e. to Layer 3. It should not be confused with the IEEE 802.1p priority, which is carried in the IEEE 802.1Q VLAN tag (https://en.wikipedia.org/wiki/IEEE_802.1Q) and belongs to Layer 2 of the OSI model (https://en.wikipedia.org/wiki/OSI_model).

As the Linux bridge works on Layer 2, it seems to ignore the TOS field. Consequently, all packets were directed to the same qdisc class (in my setup class 1:2). I figured it out with the following command:

tc -s -s -d class ls dev enp1s0f1

which lets you observe the queues of the individual qdisc classes at runtime. Because of this, my "remote" traffic was scheduled in class 1:2 together with other streams (e.g. "local" streams from the machine hosting the bridge), which happened to give correct results in some use cases... so be careful! ;)
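
To compare the classes at a glance, the per-class counters can be reduced to one line per class. This is a rough sketch that assumes the usual layout of the tc statistics output (a "class ..." line followed by a " Sent N bytes M pkt ..." line). The function name class_counters is made up for illustration:

```shell
#!/bin/sh
# Reduce "tc -s class show" output (read from stdin) to lines of the
# form "<classid> <bytes> <packets>".
# Assumes each class is printed as a "class <kind> <classid> ..." line
# followed by a " Sent <bytes> bytes <pkts> pkt ..." line.
# class_counters is a hypothetical helper name.
class_counters() {
    awk '
        $1 == "class" { id = $3 }
        $1 == "Sent" && id != "" { print id, $2, $4; id = "" }
    '
}

# Usage (on the bridge host):
#   tc -s class show dev enp1s0f1 | class_counters
```

If all transit traffic ends up in one class, only that class's byte counter grows while the iperf3 streams are running.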

What worked for me was bridging the network connections with proxy ARP (i.e. enforcing Layer 3, https://wiki.debian.org/BridgeNetworkConnectionsProxyArp).

First, activate IP forwarding and proxy ARP:

echo 1 > /proc/sys/net/ipv4/conf/all/proxy_arp
echo 1 > /proc/sys/net/ipv4/ip_forward

Then add a route to each of your nodes:

ip ro add <node IP>/32 dev <local interface>

Example:

    ip ro add 192.168.1.12/32 dev enp1s0f0
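
With several nodes, the steps above can be collected in a small script. This is only a sketch: the function name gen_proxy_arp_setup and the address=interface argument format are made up for illustration, the pairing of 192.168.1.140 with enp1s0f1 is an assumption, and the sysctl -w calls are the equivalent of the two echo commands above. The script prints the commands instead of running them:

```shell
#!/bin/sh
# Print (do not execute) the proxy ARP setup for a list of nodes.
# Each argument has the form <node IP>=<local interface>.
# gen_proxy_arp_setup and this argument format are hypothetical.
gen_proxy_arp_setup() {
    # Same effect as writing 1 to the two /proc/sys files above.
    echo 'sysctl -w net.ipv4.conf.all.proxy_arp=1'
    echo 'sysctl -w net.ipv4.ip_forward=1'
    for pair in "$@"; do
        addr=${pair%%=*}   # part before the first "="
        dev=${pair#*=}     # part after the first "="
        printf 'ip route add %s/32 dev %s\n' "$addr" "$dev"
    done
}

# Example nodes; the second pairing is an assumption.
gen_proxy_arp_setup 192.168.1.12=enp1s0f0 192.168.1.140=enp1s0f1
```

After checking the printed commands, the output can be piped to sh as root.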