Network error with 65k TIME_WAIT connections
We had some trouble with one of our image servers last week and need some help. See our munin monitoring graph:
We are running Debian Squeeze, and we get lots of requests because this is one of our image servers. We do not use keep-alive (maybe we should, but that's another topic).
These numbers are request counts per minute from our log files:
- 17:19: 66516
- 17:20: 64627
- 17:21: 123365
- 17:22: 111207
- 17:23: 58257
- 17:24: 17710
- ... and so on
So you see, we get lots of requests per minute, but as most requests are served in 0-1 ms, everything usually runs fine.
Now, as you can see in our munin graph, munin did not manage to connect to this server on the munin port and ask for the relevant numbers; the connection simply failed. As the server is not overloaded in any way (CPU, memory, network), it must have something to do with our firewall/TCP stack. At the time the munin plugin failed to connect, we had only 17 Mbit of incoming and outgoing traffic on a 100 Mbit connection.
You often hear about a limit of 65k TCP connections, but this is usually misleading: it comes from the 16-bit port field in the TCP header, so the limit is 65k connections per IP/port combination.
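As a rough check of whether we are actually approaching that per-IP limit, I plan to count TIME_WAIT sockets per client address at the next peak and compare this with the ephemeral port range (just a sketch with standard tools, nothing munin-specific):

sysctl net.ipv4.ip_local_port_range
netstat -ant | awk '$6 == "TIME_WAIT" {print $5}' | cut -d: -f1 | sort | uniq -c | sort -rn | head

The second command lists the remote IPs with the most sockets in TIME_WAIT; only if a single IP (e.g. a NAT gateway) gets close to 65k would the port limit itself be the problem.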
Our TIME_WAIT timeout is set to:
net.ipv4.tcp_fin_timeout = 60
We could lower this to drop TIME_WAIT connections earlier, but first I want to know what is preventing the server from being reachable.
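For reference, checking and lowering that value would just be a matter of sysctl (a sketch only; we have not changed anything yet, and 30 is an arbitrary example value):

sysctl net.ipv4.tcp_fin_timeout
sysctl -w net.ipv4.tcp_fin_timeout=30

A permanent change would go into /etc/sysctl.conf.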
We are using iptables with the state module, but we have already raised the conntrack maximum:
net.ipv4.netfilter.ip_conntrack_max = 524288
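To see how close we actually get to that limit, I want to compare the current entry count with the maximum (a sketch; on newer kernels the files are named nf_conntrack_* instead):

cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count
cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max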
Does anybody know which kernel parameters to look at, or how to diagnose this problem next week when we have our next peak?
FIN_WAIT (the timeout for the FIN request acknowledgement) is not the same as TIME_WAIT (the time to ensure that the socket is really not used anymore). And yes, with 65k ports in TIME_WAIT state, you will only be running out of TCP ports if all requests come from a single requester IP - as might be the case if all your clients are behind a NAT device. You also might be running out of resources due to an overly populated transmission control block table - see this excellent, if somewhat dated, paper for possible performance implications.
If you are really concerned about your sockets in TIME_WAIT state and do not have stateful firewalls between your clients and your server, you might consider setting /proc/sys/net/ipv4/tcp_tw_recycle, but you would sacrifice RFC compliance and might see interesting side effects in the future because of it.
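If you do go down that road, it is a single sysctl (shown only as a sketch - test carefully, since clients behind NAT are the usual victims of tcp_tw_recycle):

sysctl -w net.ipv4.tcp_tw_recycle=1
# or equivalently: echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle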
OK, I have found the answer myself. The munin plugin runs quite slowly and hits its own timeout value: when the conntrack table is full, reading from /proc/net/ip_conntrack gets very, very slow. The server seems to be responsive while the munin plugin is not.
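This is easy to reproduce during a peak by timing the read the plugin does and comparing it with the plugin's timeout (a rough sketch):

time cat /proc/net/ip_conntrack > /dev/null

If that takes longer than the plugin's timeout, munin reports the failure even though the server itself is fine.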