Server not sending a SYN/ACK packet in response to a SYN packet

Using iptraf, tcpdump and Wireshark, I can see a SYN packet coming in, but only the ACK flag is set in the reply packet.

I'm running Debian 5 with kernel 2.6.36

I've turned off tcp_window_scaling, tcp_timestamps, tcp_tw_recycle and tcp_tw_reuse:

cat /etc/sysctl.conf

net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_timestamps = 0

I've attached an image of the wireshark output.

http://imgur.com/pECG0.png

Output of netstat:

netstat -natu | grep '72.23.130.104'

tcp        0      0 97.107.134.212:18000    72.23.130.104:42905     SYN_RECV

I've been doing everything possible to find a solution and have yet to figure out the problem, so any help/suggestions are much appreciated.

UPDATE 1: I've set tcp_syncookies = 0 and noticed I am now replying with 1 SYN+ACK for every 50 SYN requests. The host trying to connect is sending a SYN request about once every second.

PCAP FILE


After running into the same issue, I finally found the root cause.

On Linux, when a socket is in TIME_WAIT and a new SYN arrives (for the same source IP/port and destination IP/port pair), the kernel checks whether the SEQ number of the SYN is lower or higher than the last SEQ received for this socket.

(PS: in the image of the Wireshark output attached to the question, SEQ numbers are shown as relative. Unless you set them to absolute you can't see the issue. The capture would also have to include the old session so that the SEQ numbers can be compared.)

  • if the SEQ number of the SYN is higher than the SEQ number of the previous packet, a new connection is created and everything works
  • if the SEQ number of the SYN is lower than the SEQ number of the previous packet, the kernel sends an ACK related to the previous socket, because it thinks the SYN it received is a delayed packet belonging to the previous socket.
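The comparison described in the two bullets above can be sketched in Python. This is a simplified illustration, not the kernel's actual code: the names seq_after and classify_syn are hypothetical, the sequence numbers in the usage note are made up, and the sketch assumes TCP timestamps are disabled (as in the question), so only sequence numbers are compared. Because sequence numbers are 32 bits wide and wrap around, the comparison is assumed to be done modulo 2^32.

```python
MASK32 = 0xFFFFFFFF  # TCP sequence numbers are 32-bit and wrap around


def seq_after(a: int, b: int) -> bool:
    """Return True if sequence number a is 'after' b, allowing for wrap-around.

    Equivalent to checking that (b - a), viewed as a signed 32-bit
    integer, is negative.
    """
    return ((b - a) & MASK32) >= 0x80000000


def classify_syn(syn_seq: int, last_seq: int) -> str:
    """Decide how a SYN hitting a TIME_WAIT socket is treated (simplified).

    syn_seq  -- absolute SEQ number of the incoming SYN
    last_seq -- last SEQ number seen on the old connection (assumption:
                this stands in for whatever value the kernel keeps)
    """
    if seq_after(syn_seq, last_seq):
        return "new connection"
    return "ACK for old connection"
```

With made-up values, classify_syn(2000, 1000) gives "new connection", while classify_syn(1000, 2000) gives "ACK for old connection", which is the failing case described in this answer.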

The behaviour is like this because, in the early days of TCP, the SEQ numbers generated by computers were incremental, so it was almost impossible to receive a SEQ number lower than the SEQ number of a previous socket still in TIME_WAIT.

Increased bandwidth has turned this from almost impossible into merely rare. But the most important point is that most systems now use a random ISN (initial SEQ number) to improve security. So nothing guarantees that the SEQ number of a new socket will be higher than the SEQ number of the previous one.

Each OS uses different algorithms, more or less safe, to avoid this particular issue. http://www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf gives a good presentation of the issue.

There is one last tricky point: the kernel sends an ACK related to the old session, and then what? The client OS should receive this ACK (for the previous session), fail to match it to anything because on the client side that session is closed, and send a RST. When the server receives this RST it immediately clears the socket (so it is no longer in TIME_WAIT). Meanwhile the client is still waiting for a SYN/ACK; since it doesn't get one, it sends a new SYN. By then the RST has been sent and the session cleared on the server, so this second SYN works: the server replies with a SYN/ACK and the handshake continues as usual.

So the normal behavior is that the connection works but is delayed by a second (until the second SYN is sent). In Jeff's case, he said in a comment that he uses a Fortinet firewall. These firewalls (by default) drop the ACK related to the old session (because the firewall sees no open session matching the ACK), so the client never sends a RST and the server can't clear the session from the TIME_WAIT state (except, of course, when the TIME_WAIT timer expires). The "set anti-replay loose" command on Fortinet can allow this ACK packet to be forwarded instead of dropped.
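To make the difference between the two outcomes concrete, here is a toy Python model of the exchange described above. It is only a sketch under stated assumptions: the function name syns_until_synack and its parameters are hypothetical, the model ignores the TIME_WAIT timer expiring, and it treats each retransmitted SYN as carrying the same (too low) SEQ number.

```python
from typing import Optional


def syns_until_synack(syn_seq_is_lower: bool,
                      firewall_drops_old_ack: bool,
                      max_syns: int = 5) -> Optional[int]:
    """Count how many SYNs the client sends before the server answers SYN/ACK.

    Returns None if no SYN/ACK is seen within max_syns attempts.
    """
    server_in_time_wait = True
    for attempt in range(1, max_syns + 1):
        # A SYN is accepted if the old socket is gone or its SEQ is high enough.
        if not server_in_time_wait or not syn_seq_is_lower:
            return attempt
        # Otherwise the server replies with an ACK for the old session.
        if not firewall_drops_old_ack:
            # The client receives the stray ACK and answers with a RST,
            # which makes the server drop the TIME_WAIT socket.
            server_in_time_wait = False
        # If the firewall drops the ACK, no RST is ever sent and nothing changes.
    return None
```

In this model, syns_until_synack(True, False) returns 2 (the connection succeeds on the retransmitted SYN), while syns_until_synack(True, True) returns None, matching the case where the firewall swallows the ACK.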


It appears that 97.107.134.212 already believes there is a connection (72.23.130.104:42905, 97.107.134.212:18000).

When 72.23.130.104:42905 sends its SYN packet, its sequence number is 246811966. Next should be a SYN/ACK packet with its own SEQ number and an ACK value of 246811967.

But it's sending an ACK with SEQ=1736793629 and ACK=172352206. Those are probably values from an earlier connection.
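The check being applied here can be written down as a short Python sketch. The helper names expected_synack_ack and is_valid_synack are hypothetical, and flags are modelled as a plain set of strings for illustration. The numbers in the usage note are the ones quoted in this answer.

```python
MASK32 = 0xFFFFFFFF  # sequence numbers wrap at 2**32


def expected_synack_ack(syn_seq: int) -> int:
    """The ACK value a correct SYN/ACK must carry: the SYN's SEQ plus one."""
    return (syn_seq + 1) & MASK32


def is_valid_synack(syn_seq: int, reply_flags: set, reply_ack: int) -> bool:
    """True if the reply has both SYN and ACK set and acknowledges our SYN."""
    has_both_flags = {"SYN", "ACK"} <= set(reply_flags)
    return has_both_flags and reply_ack == expected_synack_ack(syn_seq)
```

For the failing exchange, is_valid_synack(246811966, {"ACK"}, 172352206) is False: the SYN flag is missing and the ACK value is not 246811967. For the working handshake quoted in this answer, is_valid_synack(809402803, {"SYN", "ACK"}, 809402804) is True.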

Any new connection attempts should be coming FROM a different port number... is that happening? Wireshark points this out in pkt#11: "TCP Port numbers reused".

Looks like the problem is on the sender.

FWIW, I can connect just fine:

1   0.000000    192.168.0.135   97.107.134.212  TCP 45883 > biimenu [SYN] Seq=809402803 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSV=2319725 TSER=0 WS=7
2   0.022525    97.107.134.212  192.168.0.135   TCP biimenu > 45883 [SYN, ACK] Seq=4293896301 Ack=809402804 Win=14600 Len=0 MSS=1360 SACK_PERM=1
3   0.022553    192.168.0.135   97.107.134.212  TCP 45883 > biimenu [ACK] Seq=809402804 Ack=4293896302 Win=14600 Len=0

The one time I've seen this before it was because the outbound and inbound packets were taking different routes on the network, and there was a stateful connection-tracking device on the inbound leg. Since that device (a load-balancer in my case, but it could just as easily be a firewall) never saw the initial SYN, the SYN-ACK was dropped on the floor as spurious.
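A minimal Python sketch of why such a device drops the reply, assuming a deliberately simplified tracker that only forwards non-SYN packets for flows whose SYN it has already seen. The class name StatefulTracker and its forward method are hypothetical and do not model any particular product.

```python
class StatefulTracker:
    """Toy connection tracker: remembers flows for which it has seen a SYN."""

    def __init__(self) -> None:
        self.flows = set()

    def forward(self, src: str, dst: str, flags: set) -> bool:
        """Return True if the packet is forwarded, False if it is dropped."""
        if set(flags) == {"SYN"}:
            # A bare SYN opens a new tracked flow.
            self.flows.add((src, dst))
            return True
        # Anything else must belong to a flow seen in either direction.
        return (src, dst) in self.flows or (dst, src) in self.flows


# Asymmetric routing: the SYN bypasses the tracker, the SYN-ACK does not.
tracker = StatefulTracker()
dropped = not tracker.forward("server:18000", "client:42905", {"SYN", "ACK"})
```

Here dropped ends up True because the tracker never saw the client's SYN. Had the SYN passed through the same device first, the SYN-ACK would have been forwarded.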