POST request is repeated with nginx loadbalanced server (status 499)

Solution 1:

Short answer: try this for your location block:

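location / {
  # wait up to 120 seconds for the backend to respond (default: 60)
  proxy_read_timeout 120;
  # pass the request to another node only on errors, not on timeouts
  proxy_next_upstream error;
  proxy_pass http://$backend;
}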

Longer explanation:

I think I have just encountered exactly the problem you described:

  • I use nginx reverse proxy as a load balancer
  • for long-running requests, the backend receives the same request multiple times
  • the nginx access logs of the upstream nodes show 499 status for these requests, and the same request appears in different upstream nodes

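It turns out that this is actually the default behaviour for nginx as a reverse proxy, so it is not a bug that a routine upgrade would fix. An upgrade was given as a possible solution here, but that addresses a different issue. (One exception: from nginx 1.9.13 onwards, requests with non-idempotent methods such as POST are no longer passed to the next node by default once they have been sent upstream, unless proxy_next_upstream includes non_idempotent.)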

It happens because nginx as a load balancer chooses an upstream node in round-robin fashion. When the chosen node fails, the request is sent to the next node. The important thing to note here is that node failure is by default classed as error or timeout (proxy_next_upstream defaults to error timeout). Since you did not set proxy_read_timeout, its default of 60 seconds applies. So after 60 seconds of waiting without a response from the backend, nginx picks the next node and sends the same request again, while the first node is still processing it.
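
To make the implicit behaviour visible, here is a minimal sketch that writes out the defaults described above. The upstream name backend_pool and the server addresses are placeholders, not taken from the question; the two directive values are the documented nginx defaults. The fragment belongs inside the http context.

```nginx
# Hypothetical upstream group; name and addresses are placeholders.
upstream backend_pool {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

server {
    listen 80;

    location / {
        # These two lines only spell out the defaults that apply
        # when neither directive is configured.
        proxy_read_timeout  60s;
        proxy_next_upstream error timeout;

        proxy_pass http://backend_pool;
    }
}
```

With this configuration, a request that gets no response for 60 seconds counts as a timeout on the first node and is passed to the second one, which is how the duplicate arises.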

So one solution is to increase this timeout so that your long-running operation can complete, e.g. by setting proxy_read_timeout 120; (or whatever limit suits your needs).
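
If only some requests are slow, the longer timeout can probably be limited to them rather than applied everywhere. The following is a sketch under that assumption; the path /reports/ and the upstream name backend_pool are hypothetical, and the fragment goes inside a server block.

```nginx
# Hypothetical long-running endpoint: give it more time.
location /reports/ {
    proxy_read_timeout 300s;
    proxy_pass http://backend_pool;
}

# Everything else keeps the default 60-second read timeout.
location / {
    proxy_pass http://backend_pool;
}
```

Keep in mind that proxy_read_timeout is measured between two successive read operations from the backend, not over the whole response, so it only needs to cover the longest period in which the backend sends nothing.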

Another option is to stop the reverse proxy from trying the next node on timeout by setting proxy_next_upstream error;. Or you could set both options, as suggested in the short answer above.
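
Note that proxy_next_upstream error; still allows a second attempt when an error occurs while talking to the first node. If any repetition is unacceptable for your requests, failover can be switched off entirely with the off value. This is a sketch, not something tested against the setup in the question; backend_pool is again a hypothetical upstream name.

```nginx
location / {
    proxy_read_timeout 120s;

    # Never pass a request to another node; a failure on the
    # chosen node is reported to the client instead.
    proxy_next_upstream off;

    proxy_pass http://backend_pool;
}
```

The trade-off is that a single failing node now produces errors for the client instead of being covered by another node.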

Solution 2:

From this forum topic we learned that the culprit might be SPDY. Disabling it appears to have solved the problem for that user, and we have not had double posts since disabling it either.

The exact cause, other than "SPDY did it", is unknown at this point. The side effect of the proposed solution (disabling SPDY) is obviously "no more SPDY", but we can live with that.
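
For reference, in nginx versions that ship the SPDY module, disabling SPDY normally means removing the spdy parameter from the listen directive. The sketch below assumes a typical TLS server block; the certificate paths are placeholders.

```nginx
server {
    # Before: listen 443 ssl spdy;
    # After: the same listener without SPDY.
    listen 443 ssl;

    # Placeholder paths; use your own certificate and key.
    ssl_certificate     /etc/nginx/ssl/example.crt;
    ssl_certificate_key /etc/nginx/ssl/example.key;
}
```

This only applies to older releases: the SPDY module was superseded by the HTTP/2 module in nginx 1.9.5.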

Until the bug jumps up again I call this a 'fix' (or at least: a solution to the problem).

edit: As of 25-02-2014 we have not seen this problem come up again, so this seems like a lasting solution indeed.