Forcing CloudFront to pass through the latest HTML file from S3

Solution 1:

First, the point of CloudFront is to serve cached content. Serving uncached content through CloudFront is slower than serving it directly from S3 in almost all cases (streaming content would be one exception). Consider what has to happen on a cache miss: the edge location close to the user must first retrieve the object from the origin server, so the request pays for an extra hop and the user receives the content more slowly. Only once the content is held at the edge location are subsequent requests faster.

The best approach to this problem is to change your filenames when you update a page, which forces CloudFront to retrieve the new content. Keep in mind that CloudFront is typically used for media files (including images) and stylesheets/JavaScript, and not so much for HTML. Essentially, you would keep your HTML on S3 and your images on CloudFront; whenever you change a file, you give it a new name on CloudFront (e.g. file-v1.jpg, file-v2.jpg, etc.). Another common approach is to include a query string with version information.
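As a rough illustration of the renaming idea, here is a minimal sketch that derives the new name from the file's content rather than a hand-maintained counter. The function name `versioned_name` and the example key are made up for illustration, and hashing is a swapped-in variant of the file-v1/file-v2 scheme above, not something the answer prescribes.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(key: str, content: bytes, digest_len: int = 8) -> str:
    """Return the object key with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(key)
    # "img/file.jpg" becomes "img/file-<hash>.jpg"
    return str(path.with_name(f"{path.stem}-{digest}{path.suffix}"))


# Hypothetical key and contents: different bytes give a different name,
# so the edge cache sees a new object and has to fetch it.
old = versioned_name("img/file.jpg", b"first version")
new = versioned_name("img/file.jpg", b"second version")
print(old)
print(new)
```

The HTML that references the file has to be updated to the new name at the same time, which is why this works best for assets rather than for the HTML page itself.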

Also, keep in mind that CloudFront does not serve gzipped content, which may result in a slower response than from a regular server (although, in your case, S3 doesn't detect gzip-capable browsers either).

Finally, if you want to, you can use invalidation to force CloudFront to discard its existing copy and fetch a new one from the origin server. Note, however, that CloudFront gives you only 1,000 free invalidations per month, after which the cost is $0.005 per invalidation.
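If you do go the invalidation route, a sketch of what the request might look like with boto3 follows. The helper names, the distribution ID you would pass in, and the `deploy-42` caller reference are assumptions for illustration; only `create_invalidation` and its `DistributionId`/`InvalidationBatch` parameters come from boto3 itself, and the call needs AWS credentials, so it is defined here but not executed.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure passed to CloudFront's CreateInvalidation."""
    # Invalidation paths are expected to start with a slash.
    items = [p if p.startswith("/") else "/" + p for p in paths]
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # Must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or str(time.time()),
    }


def invalidate(distribution_id, paths):
    """Submit the invalidation. Requires boto3 and AWS credentials (not run here)."""
    import boto3  # imported lazily so the helper above works without it

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


batch = build_invalidation_batch(["index.html", "/about.html"], caller_reference="deploy-42")
print(batch)
```

Each path in the batch counts towards the monthly allowance, so invalidating on every deploy adds up quickly.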

The shortest time CloudFront will keep content is 1 hour, although the default is 24 hours. I'd therefore set max-age to at least 3600. Consider also an s-maxage directive (for shared, i.e. proxy, caches). Amazon recommends this caching tutorial.
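To make the header concrete, here is a small sketch that assembles a Cache-Control value with both directives. The helper name `cache_control` and the 86400 figure for s-maxage are illustrative assumptions; on S3 the resulting string would be stored as the object's Cache-Control metadata.

```python
from typing import Optional


def cache_control(max_age: int, s_maxage: Optional[int] = None) -> str:
    """Assemble a Cache-Control value; s-maxage applies to shared caches only."""
    directives = ["public", f"max-age={max_age}"]
    if s_maxage is not None:
        directives.append(f"s-maxage={s_maxage}")
    return ", ".join(directives)


# Browsers keep the file for an hour, shared caches for a day.
header = cache_control(max_age=3600, s_maxage=86400)
print(header)  # public, max-age=3600, s-maxage=86400
```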

There was a recent problem with this, rectified a few days ago.

Solution 2:

I believe the answers so far, while correct at the time, are now out of date: CloudFront now supports a minimum TTL of 0, so the OP's original attempt to use max-age=0 should now work.

You should check whether those other Cache-Control directives actually produce the result you want; you may only need max-age. What you probably want is for CloudFront to check with S3 whether the HTML file has changed. If it has, CloudFront fetches and returns the new file. If not, it serves the client from its existing cache (conserving S3 bandwidth and serving the client faster and from a closer location).
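The check-then-serve behaviour described above can be sketched as a toy model of a conditional request. This is not CloudFront's implementation: the `Origin` and `Edge` classes are invented stand-ins, and the ETag is simply a SHA-256 of the body. It only shows why revalidating on every request is cheap when nothing has changed.

```python
import hashlib


class Origin:
    """Toy stand-in for S3: stores one object and answers conditional requests."""

    def __init__(self, body: bytes):
        self.body = body

    def etag(self) -> str:
        return hashlib.sha256(self.body).hexdigest()

    def get(self, if_none_match=None):
        # 304 with no body if the caller's copy is current, else 200 with the full body.
        if if_none_match == self.etag():
            return 304, None, self.etag()
        return 200, self.body, self.etag()


class Edge:
    """Toy stand-in for an edge cache that revalidates on every request (TTL of 0)."""

    def __init__(self, origin: Origin):
        self.origin = origin
        self.body = None
        self.etag = None
        self.full_fetches = 0

    def serve(self) -> bytes:
        status, body, etag = self.origin.get(if_none_match=self.etag)
        if status == 200:
            # Origin sent a new copy: replace the cached one.
            self.body, self.etag = body, etag
            self.full_fetches += 1
        return self.body


origin = Origin(b"<h1>v1</h1>")
edge = Edge(origin)
first = edge.serve()          # nothing cached yet: full fetch
second = edge.serve()         # unchanged: 304, served from the cached copy
origin.body = b"<h1>v2</h1>"  # the HTML file is updated on the origin
third = edge.serve()          # changed: full fetch of the new file
print(edge.full_fetches)      # 2
```

Three requests cost only two full transfers, and the client still gets the new file on the first request after the update.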

The point of CloudFront is to serve cached content, yes, but that now includes content that changes occasionally and can be served from cache as long as it has not changed.

P.S. Query strings also work with CloudFront now (if you configure a 'behaviour' for the relevant origin, another new feature); however, some proxies may still fail to cache files with query strings.
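For the query-string variant, a minimal sketch of stamping a version onto an asset URL is shown below. The `with_version` helper, the `v` parameter name and the `cdn.example.com` host are all hypothetical; any parameter name works as long as the behaviour is configured to forward query strings.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: str, param: str = "v") -> str:
    """Return the URL with its version parameter set, replacing any earlier value."""
    parts = urlsplit(url)
    # Keep every other parameter, drop a stale version if one is present.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    query.append((param, version))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_version("https://cdn.example.com/css/site.css", "2"))
# https://cdn.example.com/css/site.css?v=2
```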

Amazon Developer Guide: Expiration