Linux host randomly stops answering IPv6 neighbor solicitation requests
I'm at my wits' end, so any help is appreciated.
I have an IPv6 host (Linux 4.15.1-gentoo SMP x86_64) that randomly stops sending neighbour advertisements. Running tcpdump shows a lot of neighbour solicitation requests and almost zero reaction to those requests. Occasionally, the host will still send NA, but only after a couple dozen ignored NS requests. Obviously, this completely breaks IPv6 connectivity.
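To put a number on "almost zero reaction", the tcpdump output can be tallied. This is a minimal sketch: count_nd is a hypothetical helper name, and it relies on tcpdump printing "neighbor solicitation" and "neighbor advertisement" in its text output for ICMPv6 types 135 and 136.

```shell
# Count neighbour solicitations (NS) and neighbour advertisements (NA)
# in tcpdump text output read from stdin.
# count_nd is a hypothetical helper name, not part of any package.
count_nd() {
    awk '
        /neighbor solicitation/  { ns++ }
        /neighbor advertisement/ { na++ }
        END { printf "NS=%d NA=%d\n", ns, na }
    '
}

# Example (needs root; br0 is the bridge from the question).
# The filter assumes the ICMPv6 header directly follows the fixed
# 40-byte IPv6 header, i.e. no extension headers:
#   tcpdump -l -n -i br0 -c 200 \
#       'icmp6 and (ip6[40] == 135 or ip6[40] == 136)' | count_nd
```

A healthy host should show an NA count roughly in line with the NS requests addressed to it; a large gap matches the behaviour described above.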
I don't know if it's relevant, but IPv6 is configured on a bridge interface (a couple of LXC containers are running on that bridge as well). The bridge is a typical brctl bridge with STP off.
IPv6 is configured statically (both host and gateway).
Manually flooding the network with unsolicited neighbour advertisements (using ndsend from vzctl, for example) can mitigate the problem a little, but it's obviously not a solution.
What's even weirder, disabling and re-enabling IPv6 on the interface via procfs (/proc/sys/net/ipv6/conf/br0/disable_ipv6) and reconfiguring it (ip -6 addr add, etc.) temporarily "fixes" the problem. It happens again in a day or two, though.
For the sake of completeness, there's an nftables firewall running on the host, but it explicitly allows all ICMPv6 traffic (via ip6 nexthdr ipv6-icmp accept everywhere). Disabling the firewall when the problem manifests doesn't change anything.
So, here's the question: what can I do to pinpoint the underlying issue?
UPDATE: For me, the problem disappeared after a few kernel updates, but there are reports of similar problems on later kernel versions, particularly with large routing tables and/or a large number of neighbours.
Reportedly, one possible culprit here is the small limit on the IPv6 route/neighbour cache size in the kernel. If you're having similar issues, try raising the net.ipv6.route.max_size sysctl parameter to a relatively large value (e.g. 1048576), for instance by running sysctl -w net.ipv6.route.max_size=1048576 and/or by editing /etc/sysctl.conf. You will also likely want to raise net.ipv6.route.gc_thresh to avoid running the garbage collector too often. Also, check net.ipv6.neigh.default.gc_thresh1, net.ipv6.neigh.default.gc_thresh2 and net.ipv6.neigh.default.gc_thresh3 if you have a particularly large number of entries in the neighbour cache. See https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt for what all those options do.
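To judge whether the neighbour cache is anywhere near those limits, the entry count can be compared against the three thresholds. This is only a sketch: neigh_pressure is a hypothetical helper name and the labels it prints are my own shorthand for the threshold semantics in the kernel documentation linked above.

```shell
# Classify a neighbour-cache entry count against the gc_thresh limits.
# Arguments: <entry count> <gc_thresh1> <gc_thresh2> <gc_thresh3>
# neigh_pressure is a hypothetical helper name; the labels are shorthand:
#   low    - below gc_thresh1, the garbage collector leaves the table alone
#   normal - between gc_thresh1 and gc_thresh2
#   high   - at or above gc_thresh2, collection becomes more aggressive
#   full   - at or above gc_thresh3, the hard limit on entries
neigh_pressure() {
    count=$1; t1=$2; t2=$3; t3=$4
    if   [ "$count" -ge "$t3" ]; then echo "full"
    elif [ "$count" -ge "$t2" ]; then echo "high"
    elif [ "$count" -ge "$t1" ]; then echo "normal"
    else                              echo "low"
    fi
}

# Example with live values (the count from ip is only approximate,
# since ip does not list every entry state by default):
#   neigh_pressure "$(ip -6 neigh show | wc -l)" \
#       "$(sysctl -n net.ipv6.neigh.default.gc_thresh1)" \
#       "$(sysctl -n net.ipv6.neigh.default.gc_thresh2)" \
#       "$(sysctl -n net.ipv6.neigh.default.gc_thresh3)"
```

If this reports "high" or "full" around the time the problem appears, raising the thresholds is worth trying; if it stays "low", the cache limits are probably not the cause.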
Solution 1:
I just discovered there is a bug in multicast_snooping on Linux VLAN-aware bridges. It does not touch router advertisements, but it blocks neighbor discovery even if multicast_flooding is on. What happens is that when a system boots it performs DAD (duplicate address detection), and the resulting entry stays in the multicast forwarding database. That entry expires after 200 or 300 s, though. After that, any neighbor discovery multicast packet destined for that port is dropped. This only happens with neighbor discovery, not with router advertisements. You can check for it by running:
bridge mdb show
If it shows entries, multicast_snooping is turned on and you may experience the bug. In my case, about 80% of the systems I set up started blocking neighbor discovery multicast only. Any other multicast is flooded or correctly snooped.
The solution for now is to turn off multicast_snooping:
echo 0 > /sys/class/net/devicename/bridge/multicast_snooping
When I have time I will build a test setup. This bug has been biting me for 2 years now, and I finally had the time during emergency maintenance to fully grasp the problem.
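Before changing anything, it may help to confirm what the bridge is currently set to. The sketch below reads the setting from sysfs; snooping_state is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a different directory for testing.

```shell
# Print the multicast_snooping setting (0 or 1) of a bridge.
# $1: bridge name
# $2: sysfs net directory (optional, defaults to /sys/class/net)
# snooping_state is a hypothetical helper name.
snooping_state() {
    file="${2:-/sys/class/net}/$1/bridge/multicast_snooping"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "no bridge settings found for $1" >&2
        return 1
    fi
}

# Example (br0 is an assumed bridge name):
#   snooping_state br0
# The setting can also be changed with iproute2 instead of sysfs:
#   ip link set dev br0 type bridge mcast_snooping 0
```

Neither the sysfs write nor the ip command persists across reboots, so the change has to be repeated from whatever configures the bridge at boot.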
Solution 2:
I was having the same issue with a 4.16.2-gentoo kernel. But in my case it turned out to be completely unrelated to the kernel.
The box in question served as an IPv6 VPN gateway and had a stable connection itself. Even the subnet router behind it was perfectly fine; only the routed subnet itself was constantly losing its connection.
TL;DR:
firewalld was the culprit in my case. The IPv6 rpfilter setting filtered the neighbor solicitations of my subnet router.
I found out about it by enabling logging of denied packets in /etc/firewalld/firewalld.conf:
LogDenied=all
which resulted in log lines like this (MAC and SRC shortened and obfuscated):
kernel: rpfilter_DROP: IN=enp6s0.100 OUT= MAC=XX:…:XX SRC=fe80:…:beaf DST=ff02:0000:0000:0000:0000:0001:ff00:0001 LEN=72 TC=0 HOPLIMIT=255 FLOWLBL=0 PROTO=ICMPv6 TYPE=135 CODE=0
I've just disabled the IPv6 rpfilter until I'm able to find out why this is happening. The setup is quite simple and everything looks fine to me, but maybe it's an issue with the interface being a VLAN...
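To check whether the same thing is happening on another box once LogDenied=all is set, the kernel log can be filtered for dropped neighbor solicitations. This is a sketch that assumes the log lines look like the one quoted above; dropped_ns is a hypothetical helper name.

```shell
# Print only rpfilter_DROP log lines that are neighbour solicitations
# (ICMPv6 type 135), read from stdin.
# dropped_ns is a hypothetical helper name; the patterns are taken from
# the log line format quoted in this answer.
dropped_ns() {
    awk '/rpfilter_DROP/ && /PROTO=ICMPv6 TYPE=135( |$)/'
}

# Example:
#   journalctl -k | dropped_ns
#   dmesg | dropped_ns
```

Any output here means the reverse-path filter is dropping neighbor solicitations, which would explain hosts behind the gateway losing connectivity.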