POST of MTU + ~798 bytes gets lost
I have a very weird issue with certain packets not arriving at the destination host. It happens when we transmit a POST that is somewhat larger than the MTU. We can reproduce it with this script:
#!/usr/bin/python
# Python 2 script (uses urllib2).
import urllib2

magic_length = 2297  # URL length + POST body length that triggers the loss
logurl = 'http://www.example.nl/'
data = (magic_length - len(logurl)) * 'X'  # filler body
headers = {'content-type': 'application/x-www-form-urlencoded', 'User-Agent': 'Fake'}
request = urllib2.Request(logurl, data, headers)
handler = urllib2.build_opener(urllib2.HTTPHandler())
answer = handler.open(request, timeout=5)
The sending party never gets ACKs and keeps retransmitting. The receiving party never sees the packet.
It depends on where you run the script and where you POST to. My home connection is one that fails (and incidentally, I've had problems with AJAX POSTs not getting through for a few months, ever since I got a new modem).
If I reduce the MTU of the sending machine by 100, it works again. But if I also reduce magic_length by 100, it fails again. My first theory was that a layer of my ADSL (like PPPoA) adds headers and causes packets to be split erroneously, but that doesn't seem to be it.
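One way to read the MTU experiment above: if the request goes out as two TCP segments, the first fills the MTU and the second carries the remainder. Lowering the MTU by 100 makes that last packet 100 bytes bigger; lowering the payload by 100 as well shrinks it back to its original size. A minimal sketch of that arithmetic follows. It assumes 20-byte IP and TCP headers with no options (TCP timestamps would change the numbers), and `total_len` is a made-up figure for request headers plus body, since the real header length isn't known here.

```python
def ip_packet_sizes(total_len, mtu, ip_header=20, tcp_header=20):
    """Split total_len bytes of TCP payload into IP packet sizes for a given MTU."""
    mss = mtu - ip_header - tcp_header  # largest TCP payload per packet
    sizes = []
    remaining = total_len
    while remaining > 0:
        chunk = min(mss, remaining)
        sizes.append(chunk + ip_header + tcp_header)
        remaining -= chunk
    return sizes


total_len = 2500  # hypothetical: HTTP request headers + POST body

print(ip_packet_sizes(total_len, 1500))        # original MTU
print(ip_packet_sizes(total_len, 1400))        # MTU lowered by 100
print(ip_packet_sizes(total_len - 100, 1400))  # MTU and payload both lowered by 100
```

Under these assumptions the first and third calls end with a last packet of the same size, which would fit the idea that the size of that trailing packet is what matters, not the MTU as such.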
Perhaps something goes wrong with path MTU discovery. Is some hop down the line blocking all ICMP, perhaps? This is the first part of a traceroute to google from my home:
traceroute to google.com (74.125.133.102), 30 hops max, 60 byte packets
1 dsldevice.lan (192.168.2.254) 0.453 ms 0.547 ms 0.636 ms
2 195.190.243.7 (195.190.243.7) 29.836 ms 29.947 ms 29.986 ms
3 nl-zl-dc2-git-cr02.kpn.net (213.75.64.237) 37.004 ms 37.153 ms 37.204 ms
4 nl-rt-dc2-ice-ir02.kpn.net (213.75.64.236) 37.261 ms 37.300 ms 37.339 ms
5 72.14.198.161 (72.14.198.161) 38.351 ms 38.395 ms 38.405 ms
6 209.85.254.92 (209.85.254.92) 37.976 ms 38.103 ms 37.972 ms
7 209.85.253.247 (209.85.253.247) 38.612 ms 72.14.238.153 (72.14.238.153) 33.709 ms 209.85.253.249 (209.85.253.249) 36.890 ms
8 209.85.240.158 (209.85.240.158) 41.052 ms 41.104 ms 209.85.244.102 (209.85.244.102) 41.164 ms
9 209.85.249.12 (209.85.249.12) 38.392 ms 209.85.249.14 (209.85.249.14) 38.247 ms 38.851 ms^C
If I ping 213.75.64.237, I get (I've never actually seen 'packet filtered' as a response on STDOUT...):
PING 213.75.64.237 (213.75.64.237) 56(84) bytes of data.
From 213.75.64.237 icmp_seq=1 Packet filtered
The rest I can ping.
This answer seems similar. However, my script doesn't set the DF (don't fragment) flag (edit: correction, the tcpdump does show that flag is set on the POST request), nor can I see ICMP messages coming back to me when I run the script on a host that does work. Plus, the packets are already split up by the sender, and it is sending the second packet that fails.
How do I proceed? ISP NOCs are hard enough to reach as it is, so I need to have proof of what's going on. They're not going to help me figure it out...
Edit: to confirm or deny the ICMP type 3, code 4 (fragmentation needed) hypothesis, I did this:
$ ping -c 1 -M do -s 1472 host
PING host (1.2.3.4) 1472(1500) bytes of data.
1480 bytes from host (1.2.3.4): icmp_req=1 ttl=50 time=33.8 ms
This works, but I'm a bit confused. Does the "(1500)" mean the total fragment size? I assume so, because 1480 bytes + 20 bytes IP header is 1500 bytes.
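For reference, the numbers in the ping output line up if you add the 8-byte ICMP echo header and a 20-byte IPv4 header (no IP options assumed) to the `-s` payload. A small sketch of that sum, with `ping_sizes` being just an illustrative helper name:

```python
ICMP_HEADER = 8   # bytes, ICMP echo header
IP_HEADER = 20    # bytes, IPv4 header without options


def ping_sizes(payload):
    """Return (icmp_len, ip_len) for a ping sent with `-s payload`."""
    icmp_len = payload + ICMP_HEADER
    return icmp_len, icmp_len + IP_HEADER


print(ping_sizes(1472))  # (1480, 1500)
print(ping_sizes(1473))  # (1481, 1501)
```

So the first number in the PING line is the ICMP payload, the number in parentheses is the whole IP packet, and the "bytes from" figure in a reply is the ICMP message without the IP header.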
If I increase the size of the ping by one:
$ ping -c 1 -M do -s 1473 host
PING host (1.2.3.4) 1473(1501) bytes of data.
From pannekoek.lan (192.168.2.5) icmp_seq=1 Frag needed and DF set (mtu = 1500)
So, this would mean the path between the two hosts does allow packets of 1500 bytes and no fragmentation issues occur. It seems I'm back to square one.
Edit again: I have found something significant. The problem is simply that packets of certain sizes don't arrive. It happens between my modem and the ISP's first gateway:
$ for i in `seq 1025 1030`; do ping -c 1 -M do -s $i 195.190.243.7; done
PING 195.190.243.7 (195.190.243.7) 1025(1053) bytes of data.
1033 bytes from 195.190.243.7: icmp_req=1 ttl=254 time=31.2 ms <- works
--- 195.190.243.7 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 31.273/31.273/31.273/0.000 ms
==========================
PING 195.190.243.7 (195.190.243.7) 1026(1054) bytes of data.
--- 195.190.243.7 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms <- packet loss
==========================
PING 195.190.243.7 (195.190.243.7) 1027(1055) bytes of data.
--- 195.190.243.7 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms <- packet loss
==========================
PING 195.190.243.7 (195.190.243.7) 1028(1056) bytes of data.
--- 195.190.243.7 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms <- packet loss
==========================
PING 195.190.243.7 (195.190.243.7) 1029(1057) bytes of data.
--- 195.190.243.7 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms <- packet loss
==========================
PING 195.190.243.7 (195.190.243.7) 1030(1058) bytes of data.
1038 bytes from 195.190.243.7: icmp_req=1 ttl=254 time=31.1 ms <- works
--- 195.190.243.7 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 31.177/31.177/31.177/0.000 ms
I guess I have to convince them it's their problem.
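To gather that kind of evidence over a wider range of sizes, the loop above could be scripted roughly like this. It is only a sketch: it assumes Linux iputils `ping` (the `-M do` and `-W` flags) and Python 3, and the helper names `ping_once`, `sweep` and `failing_ranges` are made up for this example.

```python
import subprocess


def ping_once(host, size, timeout=2):
    """Send one ping with DF set; True if a reply came back (Linux iputils ping)."""
    cmd = ["ping", "-c", "1", "-M", "do", "-W", str(timeout), "-s", str(size), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def sweep(host, sizes, probe=ping_once):
    """Map each payload size to True (reply) or False (lost)."""
    return {size: probe(host, size) for size in sizes}


def failing_ranges(results):
    """Collapse a {size: ok} mapping into (first, last) runs of consecutive lost sizes."""
    ranges = []
    start = prev = None
    for size in sorted(results):
        if not results[size]:
            if start is None:
                start = size
            elif size != prev + 1:
                # Gap in the tested sizes: close the current run, start a new one.
                ranges.append((start, prev))
                start = size
            prev = size
        elif start is not None:
            ranges.append((start, prev))
            start = prev = None
    if start is not None:
        ranges.append((start, prev))
    return ranges
```

Something like `failing_ranges(sweep("195.190.243.7", range(1000, 1100)))` would then list the lost sizes as ranges. Fed the six results shown above, it reports a single range, (1026, 1029).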
Solution 1:
Somewhere along the line from point A to point B, a router has been configured with a lower MTU and that is what is breaking things. Have you tried doing a trace to see where exactly the ICMP packets are getting lost?
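One rough way to narrow down the hop is to ping each address from the traceroute twice, once with a small size and once with a size that is known to get lost. This is a sketch only, assuming Linux iputils `ping` and Python 3; `ping_once` and `locate_loss` are hypothetical helper names. Hops that don't answer the small ping either are skipped, since a router that filters ping tells you nothing about packet size.

```python
import subprocess


def ping_once(host, size, timeout=2):
    """Send one ping with DF set; True if a reply came back (Linux iputils ping)."""
    cmd = ["ping", "-c", "1", "-M", "do", "-W", str(timeout), "-s", str(size), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def locate_loss(hops, bad_size, good_size=56, probe=ping_once):
    """Return the first hop that answers good_size but drops bad_size, or None."""
    for hop in hops:
        if not probe(hop, good_size):
            continue  # hop ignores or filters ping entirely; no information
        if not probe(hop, bad_size):
            return hop
    return None
```

Called as `locate_loss(["192.168.2.254", "195.190.243.7"], bad_size=1026)`, it returns the first address where only the larger ping goes missing. The problem then sits somewhere between that hop and the one before it.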