How many reverse proxies (nginx, haproxy) is too many?

Solution 1:

From a purely performance perspective, let benchmarking make these decisions for you rather than assuming; a tool like httperf is invaluable when making architecture changes.
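For example, a quick run against a single tier might look like the sketch below (the hostname, URI, and rates are placeholders to tune for your own traffic):

    # rough sketch: 100 new connections/second, 5000 connections in total,
    # 10 requests per connection, against a hypothetical app server
    httperf --server app01.example.com --port 80 --uri /index.html \
            --num-conns 5000 --num-calls 10 --rate 100 --timeout 5

Run it before and after each change so you're comparing real numbers rather than intuition.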

From an architectural philosophy perspective, I'm a little curious why you have both nginx and apache on the application servers. Nginx blazes at static content and efficiently handles most backend frameworks/technologies (Rails, PHP via FastCGI, etc.), so I would drop the final Apache layer. Once again, this comes from a limited understanding of the technologies you're using, so you may have a need for it that I'm not anticipating (but if that's the case, you could always drop nginx on the app servers and just use apache -- it's not THAT bad at static content when configured properly).
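To illustrate what dropping that layer looks like, here is a minimal nginx server block that serves static files itself and hands everything else to a FastCGI backend; the hostname, document root, and PHP-FPM address are assumptions, not your actual setup:

    server {
        listen 80;
        server_name app.example.com;          # placeholder hostname
        root /var/www/app/public;             # assumed document root

        # serve static files directly; fall back to the app otherwise
        location / {
            try_files $uri $uri/ @app;
        }

        # hand dynamic requests to a FastCGI backend such as PHP-FPM
        location @app {
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root/index.php;
            fastcgi_pass 127.0.0.1:9000;      # assumed PHP-FPM address
        }
    }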

Currently, I use nginx -> haproxy on the load-balancing servers and nginx on the app servers, with much success. As Willy Tarreau stated, nginx and haproxy are a very fast combination, so I wouldn't worry about the speed of having both on the front end, but keep in mind that each additional layer adds complexity and another potential point of failure.
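On the load-balancing tier, the nginx half of that pairing can be as small as the sketch below; the static root and the local haproxy address are assumptions:

    server {
        listen 80;
        root /var/www/static;                  # assumed static root

        # answer static requests directly
        location / {
            try_files $uri @haproxy;
        }

        # pass everything else to haproxy listening locally
        location @haproxy {
            proxy_pass http://127.0.0.1:8080;  # assumed haproxy address
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }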

Solution 2:

Your setup is more and more common. You don't have to worry. Both nginx and haproxy are extremely fast at processing and forwarding HTTP requests. They combine very well and each does its job very well. There is no need to choose: install them both and be happy. That way you will deliver static files very quickly and also ensure smooth scaling of your dynamic servers.
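As a rough sketch of the haproxy half of that pairing, something like the following sits behind the front-end nginx and spreads the dynamic traffic over the app servers; the names, addresses, timeouts, and health-check URL are all placeholders:

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    # take dynamic traffic from the front-end nginx
    frontend http-in
        bind 127.0.0.1:8080
        default_backend app_servers

    # balance it over the application servers
    backend app_servers
        balance roundrobin
        option httpchk GET /health            # assumed health-check URL
        server app1 10.0.0.11:80 check
        server app2 10.0.0.12:80 check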

Don't worry about the number of proxies. The problem is usually "can I use a proxy at all"; sometimes it's not practical. If you can have one, you can have two or three. Many complex architectures involve up to 5-6 levels of proxies and still scale very well. You should just be careful about one thing: do not run more of these proxies on a single machine than that machine has CPU cores, or the proxies will have to share their CPU time under high load, which will increase response times. But for that to happen with nginx and haproxy on one machine, you would need loads in the tens of thousands of requests per second, which is not everyone's problem of the day.
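In concrete terms, keeping the proxy count at or below the core count is just a matter of the worker settings; the values below assume a small two-core front-end box shared by both daemons, and recent haproxy versions scale with threads (nbthread) rather than extra processes:

    # nginx.conf: one worker per core you want nginx to use
    worker_processes 1;

    # haproxy.cfg: haproxy runs a single process by default; on recent
    # versions you can add threads instead of processes if needed
    global
        nbthread 1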

Also, avoid mixing single-threaded, event-driven proxies with massively multi-threaded/multi-process software such as apache or java on the same system: under load, the heavyweight processes will compete with the proxies for CPU and hurt their response times.

Once you take these rules into account, simply draw the architecture that suits your needs, put names on the boxes, combine them in a sane way and install them.

Solution 3:

Remember that complexity can be just as much of an impediment to scaling as code/design, if not more. As you scatter your implementation details across more and more services and config files, you create something that is harder to scale out, has more of a learning curve for anyone new to the team, requires more software/packages to manage, complicates troubleshooting with more potential failure points, and so on. Setting up a 4-proxy-layer stack for a site that would have been fine with just-apache or just-nginx is basically the sysadmin version of "premature optimization".