Which reverse-proxies support HTTP/1.1 ETag and If-None-Match headers?
I'm developing a caching system for an ecommerce platform that will use a reverse proxy for caching. I plan to handle invalidation by using proper HTTP/1.1 headers. That is, I will set an ETag when the content is first generated and cache that ETag value in the application. The Cache-Control header will specify "must-revalidate", so the proxy should send an If-None-Match header carrying the ETag on subsequent requests. The application will look up the cached ETag value; if it matches, it will send a 304 response, otherwise it will generate a full 200 response.
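To make the intended flow concrete, here is a minimal sketch of the application side in Python. The function name `handle_request` and its parameters are hypothetical, and the `max-age=0` in the Cache-Control value is my own assumption (added so the proxy treats the entry as stale right away); it only handles the single-ETag case.

```python
from typing import Callable, Dict, Tuple


def handle_request(
    request_headers: Dict[str, str],
    current_etag: str,
    render_body: Callable[[], bytes],
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honouring If-None-Match.

    current_etag is the ETag the application has cached for this resource.
    render_body is only called when a full response is needed.
    """
    response_headers = {
        "ETag": current_etag,
        # Assumption: max-age=0 makes the cached copy stale immediately,
        # so must-revalidate forces a conditional request every time.
        "Cache-Control": "max-age=0, must-revalidate",
    }
    if request_headers.get("If-None-Match") == current_etag:
        # The proxy's copy is still good: no body, no page generation.
        return 304, response_headers, b""
    return 200, response_headers, render_body()
```

The point of the design is that the expensive `render_body` call is skipped whenever the proxy already holds the current version.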
I hoped to use nginx, but I can't tell for sure that it supports ETags (the docs indicate it doesn't, but maybe they are out of date?). Varnish is another option, but I'm not sure about it either.
Which reverse proxy servers out there have full support for ETags? I'd like it to actually cache multiple versions so I can do things like split testing without having to disable the cache. That is, HTTP/1.1 specifies that a client can send If-None-Match with multiple ETag values and the server should respond with which ETag matched (if any). If the reverse proxy kept multiple copies rather than just the last-seen value and let the server specify on each request which to use, that would be ideal.
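As a rough illustration of the multi-ETag case, the sketch below shows how the application could decide whether the variant it wants to serve is one the proxy already holds. The name `matching_etag` is hypothetical, and the comma split is a simplification (it assumes no commas inside the quoted tags). Weak tags (`W/"..."`) are compared by their opaque part, which I believe is how If-None-Match comparison is meant to work.

```python
from typing import Optional


def _opaque(tag: str) -> str:
    """Strip the weak-validator prefix so W/"x" and "x" compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def matching_etag(if_none_match: str, chosen_etag: str) -> Optional[str]:
    """Return chosen_etag if the If-None-Match header lists it, else None.

    chosen_etag is the ETag of the variant the application wants to serve
    for this request (for example, the split-test bucket for this user).
    """
    if if_none_match.strip() == "*":
        # "*" matches any current representation.
        return chosen_etag
    # Simplification: assumes the quoted tags themselves contain no commas.
    candidates = {
        _opaque(tag.strip())
        for tag in if_none_match.split(",")
        if tag.strip()
    }
    return chosen_etag if _opaque(chosen_etag) in candidates else None
```

If the function returns an ETag, the application would answer 304 with that ETag so the proxy knows which stored copy to use; if it returns `None`, a full 200 response is needed.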
Solution 1:
I just checked the Varnish source code: although it supports the If-Modified-Since and If-None-Match headers, it does not support must-revalidate in Cache-Control. The only supported attributes in Cache-Control are max-age and s-maxage.
References:
- bin/varnishd/cache/cache_rfc2616.c in RFC2616_Do_Cond()
- bin/varnishd/cache/cache_rfc2616.c in RFC2616_Ttl()
- include/tbl/http_headers.h
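For readers unfamiliar with how a shared cache turns those two Cache-Control attributes into a TTL, here is a small Python sketch. It is not Varnish's code; `shared_cache_ttl` is a hypothetical name, and the only rule it illustrates is that a shared cache prefers s-maxage over max-age. Other directives, including must-revalidate, are simply ignored.

```python
from typing import Dict, Optional


def shared_cache_ttl(cache_control: str) -> Optional[int]:
    """Return the TTL in seconds a shared cache would derive, or None.

    Only max-age and s-maxage are considered; every other directive
    is parsed and then ignored.
    """
    directives: Dict[str, str] = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # s-maxage applies to shared caches and takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name, "")
        if value.isdigit():
            return int(value)
    return None
```

With a header such as `max-age=60, must-revalidate`, this sketch yields a TTL of 60 seconds and takes no notice of must-revalidate.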
Solution 2:
nginx requires third-party modules to support ETag. There are two of them:
- Static ETags for caching of static content
- Dynamic ETags for caching of dynamic content
Solution 3:
Correct me if I'm wrong, and I know this is an old post - but I'd like to comment for new passers-by. I believe a Reverse Proxy cache doesn't help as much as you'd like when using ETags.
Validation caching mechanisms rely on the origin server to check whether the ETag (or Last-Modified date) in the request is still valid: whether it matches or doesn't match the resource's ETag, depending on which header is used, or whether the resource has been modified since the date given in the request.
This means a reverse proxy cache such as Varnish will still pass that request through to the origin server. It may deliver the response itself rather than have the server generate it, but you didn't save the round trip to the origin server.
Browsers can cache responses and handle a 304 response in any case, so the user's private cache may be better suited to handle this than using a reverse proxy (YMMV, especially at scale, and depending on your use case of course. I don't want to make assumptions about your apps).
From section 13.3 of the spec (RFC 2616):
When a cache has a stale entry that it would like to use as a response to a client's request, it first has to check with the origin server (or possibly an intermediate cache with a fresh response) to see if its cached entry is still usable. We call this "validating" the cache entry. Since we do not want to have to pay the overhead of retransmitting the full response if the cached entry is good, and we do not want to pay the overhead of an extra round trip if the cached entry is invalid, the HTTP/1.1 protocol supports the use of conditional methods.
and then note 13.3.4:
An HTTP/1.1 caching proxy, upon receiving a conditional request that includes both a Last-Modified date and one or more entity tags as cache validators, MUST NOT return a locally cached response to the client unless that cached response is consistent with all of the conditional header fields in the request.
So, Varnish can return a response for you, but you still have a round trip to the server. If you can use an application cache such as APC or memcache, that might still be worth it to you. However, validation caching is generally better for saving bandwidth than for saving server resources.
Validation caching might best be left to the client (browser or api code).
Using the Expiration model for caching is where a reverse-proxy cache really shines. This lets you skip hitting the origin server altogether. Using Expires, Cache-Control, Date, etc. is the best (again, IMO) mechanism for a reverse proxy cache, as the cache can return the response, assuming it's not stale, without ever hitting the origin server.
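A simplified Python sketch of the freshness check behind the Expiration model follows. The function name `is_fresh` is hypothetical, and the logic is deliberately reduced: it uses max-age if present, falls back to Expires minus Date, and ignores the Age header, s-maxage, and heuristic freshness.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def is_fresh(stored_headers: Dict[str, str], now: datetime) -> bool:
    """Return True if a stored response may be served without revalidation.

    stored_headers are the headers of the cached response; now must be a
    timezone-aware datetime.
    """
    date = parsedate_to_datetime(stored_headers["Date"])
    age = (now - date).total_seconds()

    lifetime: Optional[float] = None
    for part in stored_headers.get("Cache-Control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            lifetime = int(value)

    # Fall back to Expires only when no max-age was given.
    if lifetime is None and "Expires" in stored_headers:
        expires = parsedate_to_datetime(stored_headers["Expires"])
        lifetime = (expires - date).total_seconds()

    return lifetime is not None and age < lifetime
```

While this returns `True`, the proxy can answer from its cache and the origin server never sees the request.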
Solution 4:
You can look at Apache Traffic Server, which seems to have what you need.