Where does tcpdump get the source IP from for a TCP packet?

I've been trying to solve a routing problem (multiple interfaces connected to a single Docker container, ensuring response packets go out the right interface) and have come across an interesting observation: logging packets with TRACE only shows the source IP as that of the Docker network interface, whereas tcpdump shows the actual source IP address of the attached interface. See below. Can someone tell me where this source address comes from? And a bonus question, if someone has an idea: how would I match this source address in an iptables rule (if that is possible at all)?

Oct 23 09:54:43 <hostname> kernel: [145206.331674] TRACE: raw:PREROUTING:policy:3 IN=br-55939cd46cf5 OUT= PHYSIN=<phys> MAC=<mac> SRC=172.23.0.2 DST=<ext ip> LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80 DPT=63742 SEQ=515334190 ACK=1161940855 WINDOW=28960 RES=0x00 ACK SYN URGP=0 OPT (020405B40402080A0228C591A136C09A01030307)

10.112.0.103.80 > <external ip>.64710: Flags [S.], cksum 0x839a (incorrect -> 0x3647), seq 3129672596, ack 2031230462, win 28960, options [mss 1460,sackOK,TS val 36559662 ecr 2706000943,nop,wscale 7], length 0

This is not my actual problem, but it will at least help me understand it. Thanks in advance!


Solution 1:

There are many places where a network packet (e.g. a TCP/IP packet) can be inspected on a Linux system. When you mention TRACE, I'll assume you mean the TRACE target from iptables. tcpdump and iptables look at packets at different points in the packet's flow through the system. So, as Michael Hampton commented, "One is before NAT and one is after NAT".

There are a number of useful diagrams depicting packet flow through a Linux system (search for "linux network packet flow" on Google). For a more detailed answer, have a look at the diagram in the following Unix & Linux Stack Exchange question:

https://unix.stackexchange.com/questions/281108/understanding-bridge-check-hop-in-packet-flow-in-linux-kernel

Also available in SVG here: https://en.wikipedia.org/wiki/Netfilter#/media/File:Netfilter-packet-flow.svg

In that diagram, I believe that tcpdump (via libpcap) inspects the packet at the step labeled "taps (e.g. AF_PACKET)". A TRACE rule, on the other hand, logs the packet as it passes through the netfilter chains, so depending upon where you inserted your TRACE, you might see a different source address than tcpdump does. Where did you insert your TRACE (e.g. in the base host or in the Docker container)? I should ask that in a comment, but I still don't have enough reputation to add a comment to a question on Server Fault.
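To make the "taps" step a bit more concrete, here is a rough Python sketch of what a tap-style reader does: it opens an AF_PACKET socket and reads the source address straight out of the IPv4 header of each frame it is handed. This is only an illustration under some assumptions, not how libpcap is actually implemented: the names `ipv4_addresses` and `sniff` are made up for this example, it assumes untagged Ethernet frames (no VLAN header), and `sniff` only works on Linux when run as root.

```python
import socket
import struct

ETH_P_ALL = 0x0003   # "every protocol" (value from <linux/if_ether.h>)
ETH_P_IP = 0x0800    # EtherType for IPv4
ETH_HEADER_LEN = 14  # dst MAC (6) + src MAC (6) + EtherType (2)


def ipv4_addresses(frame: bytes):
    """Return (src, dst) for an untagged Ethernet frame carrying IPv4, else None."""
    if len(frame) < ETH_HEADER_LEN + 20:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETH_P_IP:
        return None
    ip = frame[ETH_HEADER_LEN:]
    if ip[0] >> 4 != 4:  # high nibble of the first byte is the IP version
        return None
    # The source address is bytes 12..15 of the IPv4 header, destination 16..19.
    return socket.inet_ntoa(ip[12:16]), socket.inet_ntoa(ip[16:20])


def sniff(interface: str, count: int = 10) -> None:
    """Print the addresses of `count` frames seen on `interface` (Linux, root only)."""
    proto = socket.htons(ETH_P_ALL)
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW, proto) as tap:
        tap.bind((interface, 0))
        for _ in range(count):
            frame, _addr = tap.recvfrom(65535)
            addrs = ipv4_addresses(frame)
            if addrs:
                print("%s > %s" % addrs)
```

Calling `sniff` on different interfaces (the Docker bridge versus the outgoing interface, for example) should show whatever addresses the packet carries at that particular point, which is why a reader like this and a TRACE rule can disagree about the source address.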

The Stack Exchange Super User site also has a similar question with a nice answer: https://superuser.com/questions/925286/does-tcpdump-bypass-iptables