Application control of TCP retransmission on Linux

Looks like this was added in Kernel 2.6.37. Commit diff from kernel Git and Excerpt from change log below;

commit dca43c75e7e545694a9dd6288553f55c53e2a3a3 Author: Jerry Chu Date: Fri Aug 27 19:13:28 2010 +0000

tcp: Add TCP_USER_TIMEOUT socket option.

This patch provides a "user timeout" support as described in RFC793. The
socket option is also needed for the the local half of RFC5482 "TCP User
Timeout Option".

TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int,
when > 0, to specify the maximum amount of time in ms that transmitted
data may remain unacknowledged before TCP will forcefully close the
corresponding connection and return ETIMEDOUT to the application. If 
0 is given, TCP will continue to use the system default.

Increasing the user timeouts allows a TCP connection to survive extended
periods without end-to-end connectivity. Decreasing the user timeouts
allows applications to "fail fast" if so desired. Otherwise it may take
upto 20 minutes with the current system defaults in a normal WAN
environment.

The socket option can be made during any state of a TCP connection, but
is only effective during the synchronized states of a connection
(ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, or LAST-ACK).
Moreover, when used with the TCP keepalive (SO_KEEPALIVE) option,
TCP_USER_TIMEOUT will overtake keepalive to determine when to close a
connection due to keepalive failure.

The option does not change in anyway when TCP retransmits a packet, nor
when a keepalive probe will be sent.

This option, like many others, will be inherited by an acceptor from its
listener.

Signed-off-by: H.K. Jerry Chu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

I suggest that if the TCP_USER_TIMEOUT socket option described by Kimvais is available, you use that. On older kernels where that socket option is not present, you could repeatedly call the SIOCOUTQ ioctl() to determine the size of the socket send queue - if the send queue doesn't decrease over your timeout period, that indicates that no ACKs have been received and you can close the socket.


After some thinking (and googling), I came to the conclusion that you can't change tcp_retries1 and tcp_retries2 value for a single socket unless you apply some sort of patch to the kernel. Is that feasible for you?

Otherways, you could use TCP_KEEPALIVE socket option whose purpose is to check if a connection is still active (it seems to me that that's exactly what you are trying to achieve, so it has sense). Pay attention to the fact that you need to tweak its default parameter a little, because the default is to disconnect after about 2 hrs!!!