HAProxy -- pause/queue all traffic without losing requests

I basically have the same problem as mentioned in this thread -- I would like to temporarily suspend all requests to all servers of a certain backend so that I can upgrade the backend and the database it uses. Since this is a live system, I would like to queue up requests and send them to the backend servers once they've been upgraded. Because the database upgrade goes hand in hand with the code change, I have to upgrade all backend servers simultaneously, so I can't just bring one down at a time.

I tried using the tcp-request options combined with removing the static healthcheck file as mentioned in that thread, but had no luck. Setting the default "maxconn" value to 0 seems to pause and queue connections as desired, but then there seems to be no way to increase the value back to a positive number without restarting HAProxy, which kills all requests that had been queued up until that point. (The "hot-reconfiguration" options using -sf and -st start a new process, which doesn't seem to do what I want).
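For reference, the hot reload I tried was roughly the following (the config and pid file paths are just the ones my setup happens to use):

    haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)

That brings up a fresh process and tells the old one to finish serving its existing connections, but as far as I can tell nothing the old process had queued up carries over to the new one.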

Is what I'm trying to do possible?


I eventually ended up asking this question of Willy Tarreau, the author of HAProxy. He was intrigued by my suggestion and committed a small change to HAProxy that allows setting maxconn down to zero via the admin socket (this wasn't possible at the time I asked), which solved my problem. Quoting from the followup email I sent him:

Hi there. That solves my problem pretty well. I issued "set maxconn frontend my_frontend 0", waited a few seconds for connections to drain, and then all subsequent connections were paused. I restarted the server, issued "set maxconn frontend my_frontend 3000", and connections resumed properly, without erroring out on existing requests.
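For anyone finding this later, the mechanics are simple. You need an admin-level stats socket declared in the global section of your config (the socket path here is just an example):

    global
        stats socket /var/run/haproxy.sock mode 600 level admin

and then the whole dance is two commands over that socket. The frontend name and the 3000 limit are from my setup, and socat is just one of several ways to talk to a UNIX socket:

    # stop accepting new connections on the frontend; in-flight requests finish,
    # new connections simply wait until maxconn is raised again
    echo "set maxconn frontend my_frontend 0" | socat stdio /var/run/haproxy.sock

    # ... upgrade the backend servers and the database ...

    # restore the normal limit; the waiting connections get accepted and served
    echo "set maxconn frontend my_frontend 3000" | socat stdio /var/run/haproxy.sock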

In response to JesseP's answer -- absolutely, most of the time I would never want to do this. We generally try to stage our DB migrations in exactly the way you mention, because suspending traffic is risky at best. Some of our users set ridiculously low client-side timeouts, so we generally don't want traffic suspended for more than 15 seconds. But for a recent migration where we had a complex set of code and data migrations to perform simultaneously, having this option available was a lifesaver.

So, to sum up -- I'm not recommending this for everyday use, but the option is there should the need arise.


Even if you accomplish what you're asking, sooner or later you'll run into longer-running data migrations, too many servers to update, and so on, and clients will end up timing out while they wait for things to come back online.

You should really try to build your updates so that the existing servers can keep running. This may mean deploying updated code that still works with the existing database, then migrating the database, then deploying the final version of the code.

In the end, you should be able to deploy one server at a time without breaking things.