Transparent firewall with nftables and VLANs
I would like best-practice advice on building a transparent firewall.
I have two network segments and a CentOS server with two 10G interfaces. I want to filter/monitor/limit/drop traffic between the segments. The traffic is tagged. Should I untag the traffic for filtering and tag it back, or can nftables handle it tagged?
Currently the setup looks like this:
PCs--| |--PCs
PCs--|--untag--[Switch]--tag--[Switch]--untag--|--PCs
PCs--| |--PCs
I want:
PCs--| |--PCs
PCs--|--untag--[Switch]--tag--**[Firewall]**--tag--[Switch]--untag--|--PCs
PCs--| |--PCs
TL;DR: nftables, at the bridge level, can handle both tagged and untagged packets by using slightly different rules. All the tagging work can be done on the Linux side with a VLAN-aware bridge, so no configuration change is needed on the switches, whichever choice is made on the firewall for nftables.
A lot of interesting documentation about testing VLANs can be found in this blog series (especially part IV, even if some of the information might not be fully accurate):
Fun with veth-devices, Linux bridges and VLANs in unnamed Linux network namespaces I II III IV V VI VII VIII
Let's set up two minimalistic models of the firewall (in a network namespace). trunk100 and trunk200 are linked to the two switches, which send VLAN 100 tagged packets from the left computers and VLAN 200 tagged packets from the right computers. Note that here each side's VLAN tags are explicitly allowed to appear on the other side, either by creating a sub-interface with the other side's VLAN ID, or by directly adding the other side's VLAN ID to the trunk interface.
- vlan sub-interfaces putting untagged packets in the bridge
    ip link add fw0 type bridge vlan_filtering 1
    ip link set fw0 up
    for trunk in 100 200; do
        for vlan in 100 200; do
            ip link add link trunk$trunk name trunk$trunk.$vlan type vlan id $vlan
            ip link set trunk$trunk.$vlan master fw0
            bridge vlan add vid $vlan pvid untagged dev trunk$trunk.$vlan
            bridge vlan del vid 1 dev trunk$trunk.$vlan
            ip link set trunk$trunk.$vlan up
        done
    done
    bridge vlan del vid 1 dev fw0 self
In this case the tagged packets arriving through trunk100 and trunk200 are split into per-VLAN sub-interfaces and the packets are untagged. The bridge is still internally aware of the VLANs in use, and applies VLAN filtering on sources and destinations. nft will add its own restrictions. The outgoing packets are retagged once they arrive on the parent trunk interface.
- tagged packets directly into the bridge
    ip link add fw0 type bridge vlan_filtering 1
    ip link set fw0 up
    for trunk in 100 200; do
        ip link set trunk$trunk master fw0
        for vlan in 100 200; do
            bridge vlan add vid $vlan tagged dev trunk$trunk
        done
        bridge vlan del vid 1 dev trunk$trunk
        ip link set trunk$trunk up
    done
    bridge vlan del vid 1 dev fw0 self
For this simpler case, the tagged packets traverse the bridge while retaining their vlan tag.
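Both models assume that trunk100 and trunk200 already exist inside a network namespace. As a minimal sketch of one way to get there for testing, the helper below creates the namespace and two veth pairs. The namespace name fwlab, the peer names sw100/sw200, and the functions run and make_lab are assumptions made for illustration, not part of the setup described here. With DRY_RUN set, the commands are only printed; without it they need root.

```shell
#!/bin/sh
# Sketch only: hypothetical lab scaffolding for the two bridge models.
# Assumed names: namespace passed as $1, outside peers sw100 and sw200.

# Print the command when DRY_RUN is set, otherwise execute it (needs root).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

make_lab() {
    ns=$1
    run ip netns add "$ns"
    for trunk in 100 200; do
        # veth pair: trunkN is moved into the firewall namespace,
        # swN stays outside and stands in for the switch-facing side.
        run ip link add "trunk$trunk" type veth peer name "sw$trunk"
        run ip link set "trunk$trunk" netns "$ns"
        run ip link set "sw$trunk" up
    done
}

# Example (prints the commands without touching the system):
#   DRY_RUN=1; make_lab fwlab
```

Either of the two bridge setups can then be run inside the namespace, for instance from a shell started with ip netns exec fwlab sh.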
Here is a single nftables ruleset showing how both cases are handled. iifname was chosen here instead of iif so the same set of rules can work in both cases (without an error due to a missing interface). Normally iif should be preferred. There are additional counter entries just to check what exactly did or didn't match (with nft list ruleset -a):
    #!/usr/sbin/nft -f
    flush ruleset
    table bridge filter {
        chain input {
            type filter hook input priority -200; policy drop;
        }
        chain forward {
            type filter hook forward priority -200; policy drop;
            counter
            arp operation request counter
            arp operation reply counter
            vlan type arp arp operation request counter
            vlan type arp arp operation reply counter
            arp operation request counter accept
            arp operation reply counter accept
            vlan type arp arp operation request counter accept
            vlan type arp arp operation reply counter accept
            ip protocol icmp icmp type echo-request counter
            ip protocol icmp icmp type echo-reply counter
            vlan type ip icmp type echo-request counter
            vlan type ip icmp type echo-reply counter
            iifname trunk100.100 ip protocol icmp icmp type echo-request counter accept
            oifname trunk100.200 ip protocol icmp icmp type echo-reply counter accept
            vlan id 100 vlan type ip icmp type echo-request counter accept
            vlan id 200 vlan type ip icmp type echo-reply counter accept
        }
        chain output {
            type filter hook output priority 200; policy drop;
        }
    }
Note that these rules could have been written even more verbosely. Example:
iifname "trunk100.100" ether type ip ip protocol icmp icmp type echo-request
or
ether type vlan vlan id 200 vlan type ip ip protocol icmp icmp type echo-reply
When the first setup is in use (untagged packets through sub-interfaces), only the classical rules will match. When the second setup is in use, only the rules explicitly using vlan will match. So this set of dual rules, which allows basic ARP resolution and allows VLAN 100 to ping VLAN 200 but not the other way around, will work in both cases.
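To check which half of the dual rules is actually matching in a given setup, the counters in the listing can be filtered for non-zero packet counts. The following is a small sketch; matched_rules is a hypothetical helper name, and it assumes the "counter packets N bytes M # handle H" layout that nft list ruleset -a prints.

```shell
#!/bin/sh
# Sketch only: matched_rules is a hypothetical helper, not an nft feature.
# It reads "nft list ruleset -a" output on stdin and keeps only the lines
# whose counter has seen at least one packet.
matched_rules() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "packets" && $(i + 1) + 0 > 0) { print; next }
    }'
}

# Intended use (needs root):
#   nft list ruleset -a | matched_rules
```

The handle at the end of each surviving line identifies the rule, which helps when two rules look alike in the listing.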
This set of rules should work with CentOS' nftables v0.6 (not tested on CentOS' kernel) or with the current nftables v0.8.3.
Current known limitations:
As of v0.8.3, nftables cannot use conntrack at the bridge level the way the ebtables/iptables interaction allowed. There appear to be plans for it, see this PDF: bridge filtering with nftables. For now this makes stateful rules very difficult to implement.
Note also that nftables (as of v0.8.3) still has display issues: nft list ruleset -a will drop vlan from the "decompiled" rules if none of its options are used. For example, take these two rules:
nft add rule bridge filter forward ip protocol icmp counter
nft add rule bridge filter forward vlan type ip ip protocol icmp counter
When displayed back with nft list ruleset -a (v0.8.3), they look identical:
ip protocol icmp counter packets 0 bytes 0 # handle 23
ip protocol icmp counter packets 0 bytes 0 # handle 24
Only nft --debug=netlink list ruleset -a, which also dumps the bytecode, makes it clear that these are indeed two different rules (the values are shown here in little-endian):
    bridge filter forward 23 22
      [ payload load 2b @ link header + 12 => reg 1 ]
      [ cmp eq reg 1 0x00000008 ]
      [ payload load 1b @ network header + 9 => reg 1 ]
      [ cmp eq reg 1 0x00000001 ]
      [ counter pkts 0 bytes 0 ]
    bridge filter forward 24 23
      [ payload load 2b @ link header + 12 => reg 1 ]
      [ cmp eq reg 1 0x00000081 ]
      [ payload load 2b @ link header + 16 => reg 1 ]
      [ cmp eq reg 1 0x00000008 ]
      [ payload load 1b @ network header + 9 => reg 1 ]
      [ cmp eq reg 1 0x00000001 ]
      [ counter pkts 0 bytes 0 ]
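To read this dump: the 2-byte loads at link header + 12 and + 16 are the EtherType fields (the second one sits after the 4-byte 802.1Q tag), and the 1-byte load at network header + 9 is the IPv4 protocol field, where 1 is ICMP. Since the 16-bit values are printed in little-endian, swapping their two bytes gives the familiar EtherType numbers. A small sketch of that arithmetic follows; swap16 is a hypothetical helper name, not part of nft.

```shell
#!/bin/sh
# Sketch only: swap16 is a hypothetical helper.
# It swaps the two low bytes of a value as printed by
# "nft --debug=netlink" on a little-endian host, giving the value
# in network byte order.
swap16() {
    v=$(( $1 ))
    printf '0x%04x\n' $(( ((v & 0xff) << 8) | ((v >> 8) & 0xff) ))
}

swap16 0x00000008   # prints 0x0800, the IPv4 EtherType (rule 23)
swap16 0x00000081   # prints 0x8100, the 802.1Q VLAN EtherType (rule 24)
```

So rule 23 compares the outer EtherType with IPv4 directly, while rule 24 first requires an 802.1Q tag and then checks the inner EtherType for IPv4.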
CentOS' v0.6 (tested on kernel 4.15) also has its own, different "decompile" display problems:
ip protocol icmp icmp type echo-request
is displayed as:
icmp type echo-request counter
which is a syntax error if fed back as is to v0.6 (but is fine in v0.8.3).