How broken is a routing strategy that causes a martian packet (so far only) during tracepath?

Solution 1:

If you receive the martian packet, Wireshark should be able to show it.

I also see you've disabled loopback by setting an unreachable route for 127.0.0.0/8. This isn't standards-compliant, and probably isn't that useful to do, but I doubt it has much to do with this problem.

The documentation paragraph simply means that you're likely to see RFC1918 addresses or other unreachable addresses in the traceroute output. Routers can use such addresses on the links between them in many cases (e.g. within one AS), and a router reports one of its own addresses as the source when a packet's TTL expires there. It doesn't mean you should expect martians. I also doubt it has anything to do with this particular packet.
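To make the RFC1918 point concrete, here is a minimal sketch that flags which hop addresses in a trace fall inside the three RFC 1918 ranges. The hop list is hypothetical and not taken from your output; `is_rfc1918` is just an illustrative helper name.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the IPv4 address lies inside an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


# Hypothetical hop addresses, made up for illustration.
hops = ["192.168.3.1", "10.20.30.1", "203.0.113.7", "172.16.5.9"]
private_hops = [hop for hop in hops if is_rfc1918(hop)]
print(private_hops)  # ['192.168.3.1', '10.20.30.1', '172.16.5.9']
```

Seeing private addresses like these as hops is normal. It only becomes a martian when the kernel decides the source could not legitimately have arrived on that interface.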

The martian packet may have nothing to do with the traceroute, but it might. Martians are often caused by a gateway not doing source NAT when it ought to. It's also possible that you have a broken NAT rule somewhere that translates the destination address of packets outbound from eth1 to the IP of eth0; given the source of the packet, this seems most likely. It could also mean that your own gateway is forgetting to source-NAT your outbound packets.

You should run a Wireshark capture on both eth0 and eth1, try to find the packet on eth0, and see whether you can correlate it with one on eth1. Also check your NAT rules.
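For the NAT check, one rough approach is to dump the nat table and pull out only the rules that rewrite addresses. The sketch below assumes the box uses iptables and that you feed it text in `iptables-save -t nat` format; with nftables the output looks different. The sample ruleset and the name `nat_rewrite_rules` are made up for illustration.

```python
# Targets in the nat table that rewrite source or destination addresses.
NAT_TARGETS = {"DNAT", "SNAT", "MASQUERADE", "REDIRECT"}


def nat_rewrite_rules(ruleset):
    """Return (chain, target, rule) for each rule that jumps to a NAT target."""
    found = []
    for line in ruleset.splitlines():
        tokens = line.split()
        # Rule lines look like: -A CHAIN ... -j TARGET ...
        if len(tokens) < 4 or tokens[0] != "-A" or "-j" not in tokens:
            continue
        jump = tokens.index("-j")
        if jump + 1 < len(tokens) and tokens[jump + 1] in NAT_TARGETS:
            found.append((tokens[1], tokens[jump + 1], line.strip()))
    return found


# Hypothetical ruleset, not the asker's actual configuration.
SAMPLE = """\
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
-A PREROUTING -i eth1 -p tcp --dport 80 -j DNAT --to-destination 198.51.100.10
-A POSTROUTING -o eth0 -j MASQUERADE
COMMIT
"""

for chain, target, rule in nat_rewrite_rules(SAMPLE):
    print(chain, target, rule)
```

Any DNAT rule whose destination is the address of eth0, or a POSTROUTING chain with no SNAT or MASQUERADE rule for the outbound interface, would be the first thing to look at.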

Solution 2:

I believe that the "problem" is that both your interfaces are connected to the same network. At some point you're getting packets with source IP 192.168.3.20 on your eth0 interface, which causes the log_martians config entry to kick in.

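I'm pretty sure that you have rp_filter enabled and that this will go away if you disable it (e.g. via /proc/sys/net/ipv4/conf/all/rp_filter; the kernel uses the higher of the "all" and per-interface values, so the per-interface entry may need changing too), but read below: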

This can be because of two reasons:

  1. You're receiving legitimate packets from your own network on this (i.e. the wrong) interface. In your case, in theory, all packets from subnet 192.168.3.0/27 should be arriving on eth1 unless you have multipath routing, in which case you need to disable rp_filter.
  2. During your tracepath, one of the intermediate links uses IP addresses from the subnet 192.168.3.0/27. That is, imagine that one of the intermediate ISPs between you and the destination uses IP addresses from that subnet on its routers. Because of the way traceroute works, you'll be receiving ICMP TTL EXCEEDED packets with that source IP (since the router will be sending them). And since your default route is via eth0, the kernel will complain, because rp_filter makes it expect everything from 192.168.3.0/27 to arrive on eth1.
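Both cases come down to the same check. Here is a simplified model of strict reverse-path filtering, only a sketch and not how the kernel implements it: look up the route back to the packet's source address and compare its interface with the one the packet arrived on. The routing table below assumes just what's described above (default route via eth0, 192.168.3.0/27 on eth1); the function names are hypothetical.

```python
import ipaddress


def best_route(routes, address):
    """Longest-prefix match: return the interface of the most specific route."""
    ip = ipaddress.ip_address(address)
    matches = [(net, iface) for net, iface in routes if ip in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


def is_martian_source(routes, source, arrived_on):
    """Strict check: the route back to `source` must use the arrival interface."""
    return best_route(routes, source) != arrived_on


# Assumed routing table, simplified from the setup described in this answer.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),       # default route
    (ipaddress.ip_network("192.168.3.0/27"), "eth1"),  # directly connected subnet
]

print(is_martian_source(ROUTES, "192.168.3.20", "eth0"))  # True
print(is_martian_source(ROUTES, "192.168.3.20", "eth1"))  # False
print(is_martian_source(ROUTES, "203.0.113.7", "eth0"))   # False
```

The first call is your situation: the only route back to 192.168.3.20 points at eth1, so a packet with that source showing up on eth0 fails the check and gets logged.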

There are multiple ways to troubleshoot this, and yes, you can use tcpdump for that (e.g. tcpdump -ni eth0 'host 192.168.3.20').
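Before changing anything, it can help to confirm which rp_filter value is actually in effect on each interface. This is a minimal sketch that reads the standard /proc/sys/net/ipv4/conf entries; the helper names are made up, and the interface names eth0 and eth1 are simply the ones from the question.

```python
from pathlib import Path

CONF_DIR = Path("/proc/sys/net/ipv4/conf")


def read_setting(iface, name, conf_dir=CONF_DIR):
    """Read one integer setting, e.g. conf/eth0/rp_filter."""
    return int((Path(conf_dir) / iface / name).read_text().strip())


def effective_rp_filter(iface, conf_dir=CONF_DIR):
    """The kernel uses the higher of the 'all' and per-interface values
    (0 = off, 1 = strict, 2 = loose)."""
    return max(
        read_setting("all", "rp_filter", conf_dir),
        read_setting(iface, "rp_filter", conf_dir),
    )


def martians_logged(iface, conf_dir=CONF_DIR):
    """Logging is on if either the 'all' or the per-interface flag is set."""
    return bool(
        read_setting("all", "log_martians", conf_dir)
        or read_setting(iface, "log_martians", conf_dir)
    )


def report(ifaces=("eth0", "eth1"), conf_dir=CONF_DIR):
    """Print the effective settings for each interface."""
    for iface in ifaces:
        print(
            iface,
            "rp_filter =", effective_rp_filter(iface, conf_dir),
            "log_martians =", martians_logged(iface, conf_dir),
        )
```

Calling `report()` on the affected host should show whether eth0 is in strict mode (1). The `conf_dir` parameter is only there so the functions can be pointed at a copy of the directory for testing.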