How well do nginx and memcached work together?

We have a Java EE-based web application running on a Glassfish app server cluster. The incoming traffic will mainly be RESTful requests for XML-based representations of our application resources, but perhaps 5% of the traffic might be for JSON- or XHTML/CSS-based representations.

We're now investigating load-balancing solutions to distribute incoming traffic across the Glassfish instances in the cluster. We're also looking into how to offload the cluster using memcached, an in-memory distributed hash map whose keys would be the REST resource names (eg, "/user/bob", "/group/jazzlovers") and whose values are the corresponding XML representations.

One approach that sounds promising is to kill both birds with one stone and use the lightweight, fast nginx HTTP server/reverse proxy. Nginx would handle each incoming request by first looking its URI up in memcached to see if there's an unexpired XML representation already there. If not, nginx sends the request on to one of the Glassfish instances. The nginx memcached module is described in this short writeup.
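
To make the idea concrete, here's a minimal sketch of what that nginx configuration might look like (the upstream name, hosts, ports, and memcached address are assumptions, not from the writeup):

    upstream glassfish {
        # Placeholder Glassfish instances; substitute your real hosts
        server gf1.example.com:8080;
        server gf2.example.com:8080;
    }

    server {
        listen 80;

        location / {
            set $memcached_key $uri;       # e.g. "/user/bob"
            memcached_pass 127.0.0.1:11211;
            default_type text/xml;         # cached values are XML representations
            # On a cache miss (404) or memcached failure, fall through to Glassfish
            error_page 404 502 504 = @glassfish;
        }

        location @glassfish {
            proxy_pass http://glassfish;
        }
    }

One thing worth noting: nginx's memcached module only ever reads from the cache, so the Glassfish application would still be responsible for writing the XML representations into memcached under those same keys.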

What is your overall impression with nginx and memcached used this way, how happy are you with them? What resources did you find most helpful for learning about them? If you tried them and they didn't suit your purposes, why not, and what did you use instead?

Note: here's a related question. Before I knew about ServerFault I asked this on StackOverflow.

Edit: All the answers here so far have been quite helpful, though none drew on direct experience with this exact setup. This answer did eventually show up over on StackOverflow, and it was quite bullish on the nginx/memcached combination.


Solution 1:

You really should use a cache server in front of your web servers. I recommend Varnish Cache. We use it at work with the largest and busiest website in Scandinavia. We replaced 13 highly loaded Squid boxes with one Varnish box, plus one spare.

I benchmarked a simple app on my private website, and it went from 9 requests a sec to over 2000.

You decide how long Varnish keeps things in memory; you can set the TTL to effectively forever and then just send an HTTP PURGE request to the cache server when the data changes.
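
That pattern looks roughly like this in VCL (a sketch in the newer VCL 4.0 syntax; the backend address and the purgers ACL are assumptions):

    vcl 4.0;

    backend default {
        .host = "127.0.0.1";    # assumed app-server address
        .port = "8080";
    }

    # Assumed: only these hosts are allowed to purge
    acl purgers {
        "localhost";
    }

    sub vcl_recv {
        if (req.method == "PURGE") {
            if (!client.ip ~ purgers) {
                return (synth(405, "Not allowed"));
            }
            # Evict the cached object for this URL
            return (purge);
        }
    }

When, say, /user/bob changes, the app would issue PURGE /user/bob against the Varnish box and the stale copy is dropped immediately.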

Solution 2:

My personal opinion, from experience, is that if you're using a load balancer, you want to limit that box entirely to load balancing functions. Having your load balancer serve content, even from a cache, degrades the load balancing functionality under high load situations (more connections stay active for longer, reducing overall capacity and throughput).

I'd advise having the app itself do the lookup and serve the cached content, and letting the load balancer do its job. Having said that, nginx isn't perfect when it comes to load balancing - it only offers a very basic round-robin algorithm. I'd recommend haproxy instead. If you need SSL decryption services out front, nginx works well sitting in front of haproxy, in my experience.
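
For a sense of what that buys you, here is a minimal haproxy sketch (backend name, addresses, and health-check path are made up) using leastconn, one of several algorithms beyond plain round-robin:

    frontend http-in
        mode http
        bind *:8080
        default_backend glassfish

    backend glassfish
        mode http
        balance leastconn                # other options: roundrobin, source, uri, ...
        option httpchk GET /ping         # hypothetical health-check path
        server gf1 10.0.0.1:8080 check
        server gf2 10.0.0.2:8080 check

The check keyword makes haproxy actively probe each Glassfish instance and stop routing to one that goes down, which is a large part of why a dedicated balancer is worth having.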

Solution 3:

I think you will hit a dead end if you need things like load balancing, high availability, and so on.

Also, consider this situation: when a user is authenticated, the page looks different, with additional features available and content individualized for each user, while the URLs stay the same for convenience of linking. For example, a site where an authenticated user does not need to enter a name or captcha for comments, or a site that displays your username at the top when you are logged in (like ServerFault). In such cases nginx will be unusable, because you cannot distinguish an authenticated user from an unauthenticated one.

If you don't need SSL, I would suggest running Varnish. It was designed as an HTTP accelerator, not as a web server or proxy. If you need SSL, run nginx in front as an SSL terminator and Varnish behind it as a plain HTTP accelerator, because Varnish cannot deal with SSL.
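
A bare-bones sketch of that layering, assuming Varnish listens on its default port 6081 on the same box (the hostname and certificate paths are placeholders):

    server {
        listen 443 ssl;
        server_name example.com;

        # Placeholder certificate paths
        ssl_certificate     /etc/nginx/ssl/example.com.crt;
        ssl_certificate_key /etc/nginx/ssl/example.com.key;

        location / {
            # Hand the decrypted traffic to Varnish
            proxy_pass http://127.0.0.1:6081;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $remote_addr;
        }
    }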

I think the choice of caching server is application-specific, and you cannot make generalized comments about it without an in-depth analysis of the app.