Why are connections in FIN_WAIT2 state not closed by the Linux kernel?

I have an issue with kube-proxy, a long-lived process that is part of Kubernetes.

The problem is that from time to time a connection is left in FIN_WAIT2 state.

$ sudo netstat -tpn | grep FIN_WAIT2
tcp6       0      0 10.244.0.1:33132        10.244.0.35:48936       FIN_WAIT2   14125/kube-proxy
tcp6       0      0 10.244.0.1:48340        10.244.0.35:56339       FIN_WAIT2   14125/kube-proxy
tcp6       0      0 10.244.0.1:52619        10.244.0.35:57859       FIN_WAIT2   14125/kube-proxy
tcp6       0      0 10.244.0.1:33132        10.244.0.50:36466       FIN_WAIT2   14125/kube-proxy

These connections pile up over time and make the process misbehave. I have already reported the issue on the Kubernetes bug tracker, but I'd like to understand why the Linux kernel does not close such connections.

According to the kernel documentation (search for tcp_fin_timeout), a connection in FIN_WAIT2 state should be closed by the kernel after X seconds, where X can be read from /proc. On my machine it is set to 60:

$ cat /proc/sys/net/ipv4/tcp_fin_timeout
60

So, if I understand correctly, such connections should be closed within 60 seconds. But that is not the case: they stay in this state for hours.

While I understand that FIN_WAIT2 connections are fairly unusual (the host has had its own FIN acknowledged and is waiting for a FIN from the remote end, which might already be gone), I don't get why the system does not close them.

Is there anything I could do about it?

Note that restarting the related process is a last resort.


The kernel timeout only applies if the connection is orphaned. If the connection is still attached to a socket, the program that owns that socket is responsible for timing out the shutdown of the connection. Likely it has called shutdown and is waiting for the connection to shut down cleanly. The application can wait as long as it likes for the shutdown to complete.

The typical clean shutdown flow goes like this:

  1. The application decides to shut down the connection and shuts down the write side of the connection.

  2. The application waits for the other side to shut down its half of the connection.

  3. The application detects the other side's shutdown of the connection and closes its socket.

The application can wait at step 2 for as long as it likes.

It sounds like the application needs a timeout. Once it decides to shut the connection down, it should give up waiting for the other side to do a clean shutdown after some reasonable amount of time.
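As a rough illustration of such a timeout, here is a minimal Python sketch of the three-step flow above with a deadline on step 2. The function name `close_with_timeout` and the 5-second default are assumptions made for this example. They are not taken from kube-proxy, which is written in Go and would need the equivalent logic there.

```python
import socket
import time


def close_with_timeout(sock, timeout=5.0):
    """Half-close sock, wait at most `timeout` seconds for the peer's
    shutdown, then close the socket regardless.

    Returns True if the peer shut down cleanly, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    clean = False
    try:
        # Step 1: shut down the write side (the kernel sends our FIN).
        sock.shutdown(socket.SHUT_WR)
        # Step 2: wait for the peer's shutdown, but only until the deadline.
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            if not sock.recv(4096):  # b"" means the peer sent its FIN
                clean = True
                break
    except OSError:
        # Covers socket.timeout (an OSError subclass) and connection errors.
        pass
    finally:
        # Step 3: always release the descriptor, even if the peer never answered.
        sock.close()
    return clean
```

Once close() has run, the connection is no longer attached to a socket, so it becomes an orphan and the kernel's own tcp_fin_timeout applies to it.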


If the application has called shutdown() on the socket but has not yet called close(), the socket stays in FIN_WAIT2 state. Because the application still owns the file descriptor, the kernel does not clean the connection up.
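This can be reproduced on a loopback connection. The following Linux-only Python sketch half-closes a client socket while the peer keeps its end open, then asks the kernel for the connection state through the TCP_INFO socket option. The helper names `tcp_state` and `half_closed_state` are made up for this example. The code assumes the Linux layout of struct tcp_info, where the first byte is the state, and the kernel's numeric value 5 for FIN_WAIT2.

```python
import socket
import time

TCP_FIN_WAIT2 = 5  # Linux kernel's numeric value for the FIN_WAIT2 state


def tcp_state(sock):
    """Return the kernel's TCP state number for sock (Linux only)."""
    info = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    return info[0]  # tcpi_state is the first byte of struct tcp_info


def half_closed_state(wait=1.0):
    """Half-close a loopback connection and report the client's TCP state."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()
    try:
        # shutdown() sends our FIN, but the descriptor stays open.
        client.shutdown(socket.SHUT_WR)
        # The peer's ACK arrives asynchronously, so poll briefly.
        deadline = time.monotonic() + wait
        state = tcp_state(client)
        while state != TCP_FIN_WAIT2 and time.monotonic() < deadline:
            time.sleep(0.01)
            state = tcp_state(client)
        return state
    finally:
        client.close()
        peer.close()
        listener.close()
```

As long as the client descriptor is open and the peer has not sent its own FIN, the reported state should remain FIN_WAIT2. Only after close() does the connection become an orphan that tcp_fin_timeout can reap.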