No Cache-Control Header for files from AWS CloudFront with S3 Origin

We just migrated to Amazon AWS. We currently have an EC2 instance running Nginx in front and Apache on the back end, and it's working well. All sites launch properly and include the Cache-Control header for files that are served from the EC2 instance.

The problem is with ALL static files we placed in Amazon S3 that are accessed through the CloudFront CDN. We can access the files fine (and there's no issue with CORS), but apparently CloudFront doesn't serve the files with a Cache-Control header. We want to leverage browser caching.

The way I see it, the EC2 instance doesn't play a role here: the static files are served directly by S3+CloudFront, so the request never reaches the web server on EC2.

I'm at a complete loss.

Questions: 1) How do I set the Cache-Control header in this case? 2) Is it even possible to set it, from S3 or from CloudFront?

Note: I've hit a few pages in Google where you can set the header in S3 for individual objects. That's really not a productive way to do it, especially since in my case we are talking about several objects.

Thanks!


Solution 1:

I've hit a few pages in Google where you can set the header in S3 for individual objects. That's really not a productive way to do it, especially since in my case we are talking about several objects.

Well, "productive" or not, that is how it actually is designed to work.

CloudFront does not add Cache-Control: headers.

CloudFront passes through (and also respects, unless otherwise configured) the Cache-Control: headers provided by the origin server, which in this case is S3.

To get Cache-Control: headers provided by S3 when an object is fetched, they must be provided when the object is uploaded into S3, or added to the object's metadata by a subsequent PUT+Copy operation. That operation can copy an object onto itself inside S3, modifying the metadata in the process. This is what the console does, behind the scenes, if you edit object metadata.
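As a rough illustration of the copy-onto-itself approach, here is a minimal sketch that builds the parameters for an S3 CopyObject request. The helper name buildCopyInPlaceParams, the bucket name, and the key are hypothetical placeholders, and the commented-out call assumes the AWS SDK for JavaScript v3 is installed.

```javascript
'use strict';

// Hypothetical helper: builds the parameters for an S3 CopyObject request that
// copies an object onto itself, replacing its metadata so that S3 returns a
// Cache-Control header on later GETs.
function buildCopyInPlaceParams(bucket, key, cacheControl, contentType) {
    return {
        Bucket: bucket,
        Key: key,
        // CopySource is "bucket/key"; the key is URL-encoded segment by segment.
        CopySource: bucket + '/' + key.split('/').map(encodeURIComponent).join('/'),
        // REPLACE discards the existing metadata, so Content-Type has to be
        // supplied again along with the new Cache-Control value.
        MetadataDirective: 'REPLACE',
        CacheControl: cacheControl,
        ContentType: contentType
    };
}

// Example values only; the bucket and key are placeholders.
const params = buildCopyInPlaceParams(
    'example-bucket', 'css/site styles.css', 'public, max-age=86400', 'text/css');
console.log(params);

// With the AWS SDK for JavaScript v3 installed, the request could be sent as:
//   const { S3Client, CopyObjectCommand } = require('@aws-sdk/client-s3');
//   await new S3Client({}).send(new CopyObjectCommand(params));
```

For many objects, the same parameters could be generated in a loop over the bucket listing, one request per key, since the attribute remains per-object.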

There is also (in case you are wondering) no global setting in S3 to force all objects in a bucket to return these headers -- it's a per-object attribute.


Update: Lambda@Edge is a new feature in CloudFront that allows you to fire triggers against requests and/or responses, between viewer and cache and/or cache and origin, running code written in Node.js against a simple request/response object structure exposed by CloudFront.

One of the main applications for this feature is manipulating headers... so while the above is still accurate -- CloudFront itself does not add Cache-Control -- it is now possible for a Lambda function to add them to the response that is returned from CloudFront.

This example adds Cache-Control: public, max-age=86400 only if there is no Cache-Control header already present on the response.

Using this code in an Origin Response trigger would cause it to fire every time CloudFront fetches an object from the origin, and modify the response before CloudFront caches it.

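'use strict';

exports.handler = (event, context, callback) => {
    // The origin response that CloudFront is about to cache.
    const response = event.Records[0].cf.response;

    // Header names are lowercase keys; add a default only if the origin sent none.
    if (!response.headers['cache-control']) {
        response.headers['cache-control'] = [{
            key:   'Cache-Control',
            value: 'public, max-age=86400'
        }];
    }

    // Hand the (possibly modified) response back to CloudFront.
    callback(null, response);
};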

Update (2018-06-20): Recently, I submitted a feature request to the CloudFront team to allow configuration of static origin response headers as origin attributes, similar to the way static request headers can already be added, but with a twist: each header could be configured to be added conditionally (only if the origin didn't provide that header in the response) or unconditionally (adding the header and overwriting the header from the origin, if present).
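The conditional/unconditional behavior described in that request can be approximated today in an Origin Response trigger. The following is only a sketch: the STATIC_HEADERS list, its overwrite flag, and the X-Content-Type-Options entry are hypothetical illustrations, not anything CloudFront itself provides.

```javascript
'use strict';

// Hypothetical configuration: each entry names a response header, its value,
// and whether it should overwrite a value supplied by the origin.
const STATIC_HEADERS = [
    { key: 'Cache-Control', value: 'public, max-age=86400', overwrite: false },
    { key: 'X-Content-Type-Options', value: 'nosniff', overwrite: true }
];

const handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;

    for (const header of STATIC_HEADERS) {
        // CloudFront indexes response headers by lowercase name.
        const name = header.key.toLowerCase();

        // Unconditional entries always win; conditional entries only fill gaps.
        if (header.overwrite || !response.headers[name]) {
            response.headers[name] = [{ key: header.key, value: header.value }];
        }
    }

    callback(null, response);
};

exports.handler = handler;
```

With this configuration, an origin response that already carries Cache-Control keeps its own value, while X-Content-Type-Options is replaced regardless of what the origin sent.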

With feature requests, you typically don't receive any confirmation of whether they are actually considering implementing the new feature, or even whether they might already have been working on it. It's just announced when they are done. So, I have no idea if this will be implemented. There is an argument to be made that since this capability is already available via Lambda@Edge, there's no need for it in the base functionality. My counter-argument is that the base functionality is not feature-complete without the ability to do simple, static response header manipulation, and that if this is the only reason a trigger is needed, then requiring Lambda triggers is an unnecessary cost, both financially and in added latency (even though neither is necessarily an outlandish cost).