How does server-side `TIME_WAIT` really work?

I know there are quite a few SE questions on this, and I believe I have read as many of them as matter before asking.

By "server-side TIME_WAIT" I mean the state of a server-side socket pair that had its close() initiated on the server side.

I often see these statements that sound contradictory to me:

  1. Server-side TIME_WAIT is harmless
  2. You should design your network apps so that clients initiate close(), so that the client bears the TIME_WAIT

I find this contradictory because TIME_WAIT on the client can be a problem: the client can run out of available ports. In essence, the advice above moves the burden of TIME_WAIT from the server side, where it's not a problem, to the client side, where it can be one.

Client-side TIME_WAIT is of course only a problem for a limited number of use cases. Most client-server solutions involve one server and many clients, and clients usually don't deal with a high enough volume of connections for it to be a problem. Even if they do, there are a number of recommendations to "sanely" (as opposed to SO_LINGER with 0 timeout, or meddling with tcp_tw sysctls) combat client-side TIME_WAIT by avoiding creating too many connections too quickly. But that's not always feasible, for example for classes of applications like:

  • monitoring systems
  • load generators
  • proxies
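For reference, the "SO_LINGER with 0 timeout" shortcut mentioned above looks roughly like the sketch below. It is shown only to make the term concrete, not as a recommendation: with linger enabled and a zero timeout, close() aborts the connection with a RST instead of a normal FIN exchange, so the socket never enters TIME_WAIT. The helper names `enable_abortive_close` and `read_linger` are made up for this illustration, and the two-int layout of `struct linger` is an assumption that holds on Linux.

```python
import socket
import struct


def enable_abortive_close(sock):
    """Turn on SO_LINGER with a zero timeout (hypothetical helper name).

    Assumes struct linger is two C ints: l_onoff, l_linger.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))


def read_linger(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize("ii"))
    return struct.unpack("ii", raw)
```

The price of skipping TIME_WAIT this way is that the protection TIME_WAIT provides is lost too, and any data not yet delivered is discarded, which is why it is usually listed among the approaches to avoid.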

On the other hand, I don't understand how server-side TIME_WAIT is helpful at all. TIME_WAIT exists to prevent stale TCP segments from being injected into streams they no longer belong to. For client-side TIME_WAIT this is accomplished by making it impossible to create a connection with the same ip:port pairs that the stale connection could have had (the used pairs are locked out by TIME_WAIT). On the server side this can't be prevented: the local address has the accepting port, which is always the same, and the server can't (AFAIK; I only have empirical proof) refuse a connection simply because the incoming peer would create an address pair that already exists in the socket table.

I wrote a program that shows that server-side TIME-WAIT sockets are ignored. Moreover, because the test was done on 127.0.0.1, the kernel must have a special bit that tells it whether an entry is server side or client side (since otherwise the tuple would be the same).

Source: http://pastebin.com/5PWjkjEf, tested on Fedora 22, default net config.

$ gcc -o rtest rtest.c -lpthread
$ ./rtest 44400 s # will do server-side close
Will initiate server close
... iterates ~20 times successfully
^C
$ ss -a|grep 44400
tcp    TIME-WAIT  0      0            127.0.0.1:44400         127.0.0.1:44401   
$ ./rtest 44500 c # will do client-side close
Will initiate client close
... runs once and then
connecting...
connect: Cannot assign requested address

So, for server-side TIME_WAIT, connections on the exact same port pair could be re-established immediately and successfully, while for client-side TIME-WAIT, connect() rightly failed on the second iteration.
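In case the pastebin link stops working, the experiment can be approximated with the single-threaded Python sketch below. This is not the original rtest.c: the function names (`make_listener`, `run_once`, `demo`), the 5-second timeout and the use of SO_REUSEADDR on the client socket are all assumptions made for this illustration. Each round binds the client to a fixed port, connects, and then lets either the server side or the client side close first.

```python
import errno
import socket

HOST = "127.0.0.1"


def make_listener(port=0):
    """Listening socket on HOST; port 0 lets the kernel pick a port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((HOST, port))
    srv.listen(1)
    return srv


def run_once(listener, client_port, server_closes_first, timeout=5.0):
    """One connect/accept/close cycle from a fixed client port.

    Returns 0 on success, or the errno raised by bind()/connect().
    """
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    client.settimeout(timeout)
    try:
        client.bind((HOST, client_port))
        client.connect(listener.getsockname())
    except OSError as exc:
        client.close()
        return exc.errno or errno.ETIMEDOUT
    conn, _ = listener.accept()
    conn.settimeout(timeout)
    # The end that closes first is the one left in TIME_WAIT.
    first, second = (conn, client) if server_closes_first else (client, conn)
    first.close()
    try:
        second.recv(1)  # returns b"" once the peer's FIN has arrived
    finally:
        second.close()
    return 0


def demo(server_port, server_closes_first, rounds=5):
    """Repeat run_once on the same port pair and print each outcome."""
    listener = make_listener(server_port)
    try:
        for i in range(rounds):
            err = run_once(listener, server_port + 1, server_closes_first)
            print(i, "ok" if err == 0 else errno.errorcode.get(err, err))
            if err:
                break
    finally:
        listener.close()
```

Calling `demo(44400, True)` corresponds to `./rtest 44400 s` and `demo(44500, False)` to `./rtest 44500 c`. The outcome of the second and later rounds depends on the kernel version and its TCP settings, so the results may differ from the Fedora 22 run shown above.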

To summarize, the question is twofold:

  • Does server-side TIME_WAIT really not do anything, and is it just left that way because the RFC requires it?
  • Is the reason the recommendation is for client to initiate close() because the server TIME_WAIT is useless?

In TCP terms, "server side" here means the host that has the socket in LISTEN state.

RFC 1122 allows a socket in TIME-WAIT state to accept a new connection under some conditions:

        When a connection is closed actively, it MUST linger in
        TIME-WAIT state for a time 2xMSL (Maximum Segment Lifetime).
        However, it MAY accept a new SYN from the remote TCP to
        reopen the connection directly from TIME-WAIT state, if it:

For the exact conditions, see RFC 1122. I'd expect that there must also be a matching passive OPEN on the socket (a socket in LISTEN state).

An active OPEN (the client-side connect call) has no such exception and must return an error when the socket is in TIME-WAIT, as per RFC 793.
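The asymmetry between the two paths can be illustrated with a toy model. This is only a sketch of the rules as described above, not how any kernel implements them: the connection table is a plain dict, the function names are made up, and the RFC 1122 conditions are reduced to a single `conditions_met` flag rather than spelled out. Entries are keyed as ordered (local, remote) pairs, so on loopback the server's entry and the client's entry for one connection are distinct keys.

```python
import errno

TIME_WAIT = "TIME_WAIT"
ESTABLISHED = "ESTABLISHED"


def active_open(table, local, remote):
    """Model of connect(): fail if this exact (local, remote) pair exists."""
    if (local, remote) in table:
        raise OSError(errno.EADDRNOTAVAIL, "Cannot assign requested address")
    table[(local, remote)] = ESTABLISHED


def incoming_syn(table, listening, local, remote, conditions_met):
    """Model of the passive side: True if the SYN opens or reopens a connection.

    `listening` is the set of local endpoints with a passive OPEN.
    `conditions_met` stands in for the RFC 1122 conditions (not modeled).
    """
    if local not in listening:
        return False  # assumption from the text: a matching LISTEN is required
    state = table.get((local, remote))
    if state is None or (state == TIME_WAIT and conditions_met):
        table[(local, remote)] = ESTABLISHED
        return True
    return False
```

With a TIME_WAIT entry for (server, client), `incoming_syn` can reopen the connection, whereas a TIME_WAIT entry for (client, server) makes `active_open` fail with the same error the test program printed.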

My guess about the recommendation for client-initiated close (in TCP terms, the client is the host performing the active OPEN, i.e. connect) is much the same as yours: in the common case it spreads the TIME-WAIT sockets across more hosts, where socket resources are abundant. In the common case clients do not send a SYN that would reuse TIME-WAIT sockets on the server. I agree that whether to apply the recommendation still depends on the use case.