MySQL lowering wait_timeout value to lower number of open connections

I run a rather busy site, and during peek hours I see over 10.000 open connections to my database server on my webserver when a run a netstat command. 99% of the connections are in the TIME_WAIT state.

I learned about this mysql variable: wait_timeout http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_wait_timeout today. Mine is still set at the default 28.800 seconds.

Is lowering this value safe?

Non of my queries usually takes over a second. So it seems silly to keep a connection open for 480 minutes.

I also heard about using mysql_pconnect instead of mysql_connect, but i've been reading nothing but horror stories about it, so I think i'll stay away from that.


Solution 1:

Lowering the value is pretty trivial without a mysql restart

Let's say you want to lower timeouts to 30 seconds

First, add this to my.cnf

[mysqld]
interactive_timeout=30
wait_timeout=30

Then, you can do something like this

mysql -uroot -ppassword -e"SET GLOBAL wait_timeout=30; SET GLOBAL interactive_timeout=30"

All DB Connections after this will timeout in 30 seconds

WARNING

Make sure to use explicitly use mysql_close. I do not trust Apache as most developers do. If not, sometimes, there is a race condition where Apache closes a DB Connection but does not inform mysqld and mysqld hold that connection open until it times out. Even worse, you may see TIME_WAITs more often. Choose your timeout values wisely.

UPDATE 2012-11-12 10:10 EDT

CAVEAT

After applying my posted suggestions, create a script called /root/show_mysql_netstat.sh with the following lines:

netstat | grep mysql > /root/mysql_netstat.txt
cat /root/mysql_netstat.txt | awk '{print $5}' | sed 's/:/ /g' | awk '{print $2}' | sort -u > /root/mysql_netstat_iplist.txt
for IP in `cat /root/mysql_netstat_iplist.txt`
do
        ESCOUNT=`cat /root/mysql_netstat.txt | grep ESTABLISHED | awk '{print $5}' | grep -c "${IP}"`
        TWCOUNT=`cat /root/mysql_netstat.txt | grep TIME_WAIT   | awk '{print $5}' | grep -c "${IP}"`
        IPPAD=`echo "${IP}..................................." | cut -b -35`
        (( ESCOUNT += 1000000 ))
        (( TWCOUNT += 1000000 ))
        ES=`echo ${ESCOUNT} | cut -b 3-`
        TW=`echo ${TWCOUNT} | cut -b 3-`
        echo ${IPPAD} : ESTABLISHED:${ES} TIME_WAIT:${TW}
done
echo ; echo
netstat -nat | awk '{print $6}' | sort | uniq -c | sort -n | sed 's/d)/d/'

When you run this, you should see something like this:

[root@*** ~]# /root/ShowConnProfiles.sh
10.48.22.4......................... : ESTABLISHED:00002 TIME_WAIT:00008
10.48.22.8......................... : ESTABLISHED:00000 TIME_WAIT:00002
10.64.51.130....................... : ESTABLISHED:00001 TIME_WAIT:00000
10.64.51.133....................... : ESTABLISHED:00000 TIME_WAIT:00079
10.64.51.134....................... : ESTABLISHED:00002 TIME_WAIT:00001
10.64.51.17........................ : ESTABLISHED:00003 TIME_WAIT:01160
10.64.51.171....................... : ESTABLISHED:00002 TIME_WAIT:00000
10.64.51.174....................... : ESTABLISHED:00000 TIME_WAIT:00589
10.64.51.176....................... : ESTABLISHED:00001 TIME_WAIT:00570


      1 established
      1 Foreign
     11 LISTEN
     25 ESTABLISHED
   1301 TIME_WAIT

If you still see a lot of mysql TIME_WAITs for any given web server, here are two escalation steps to take:

ESCALATION #1

Login to the offending web server and restart apache as follows:

service httpd stop
sleep 30
service httpd start

If necessary, do this to all the web servers

service httpd stop (on all web servers)
service mysql stop
sleep 120
service mysql start
service httpd start (on all web servers)

ESCALATION #2

You can force the OS to kill TIME_WAITs for mysql or any other app with the following:

SEC_TO_TIMEWAIT=1
echo ${SEC_TO_TIMEWAIT} > /proc/sys/net/ipv4/tcp_tw_recycle
echo ${SEC_TO_TIMEWAIT} > /proc/sys/net/ipv4/tcp_tw_reuse

This will make TIME_WAITs time out in 1 second.

To give credit where credit is due...

  • I got this idea from this post: How to forcibly close a socket in TIME_WAIT?
  • The accepted answer has a pictorial representation of when a TIME_WAIT comes into existence.
  • The answer with the idea that I liked is the one I am now suggesting.

Solution 2:

If you are getting a lot of TIME_WAIT connections on the MySQL Server then that means that MySQL server is closing the connection. The most likely case in this instance would be that a host or several hosts got on a block list. You can clear this by running:

mysqladmin flush-hosts

To get a list of the number of connections you have per IP run:

 netstat -nat | awk {'print $5'} | cut -d ":" -f1 | sort | uniq -c | sort -n

You can also confirm this is happening by going to one of your clients that is having trouble connecting and telnet to port 3306. It will show a message with something like:

telnet mysqlserver 3306
Trying 192.168.1.102...
Connected to mysqlserver.
Escape character is '^]'.
sHost 'clienthost.local' is blocked because of many connection errors; unblock with 'mysqladmin flush-hosts'Connection closed by foreign host.

Solution 3:

If you have lots of TIME_WAIT connections to your MySQL server, it means you code is running lots of queries against your DB, and opening/closing a connection for each query.

In this case, you should use persistent connectivity to your DB server, using the MySQLi extension.

http://php.net/manual/en/mysqli.persistconns.php

If you can't use MySQLi, you should instead use the thread_cache_size parameter in your MySQL configuration.