How can I mount an S3 bucket to an EC2 instance and write to it with PHP?
Solution 1:
Have you ever mounted an S3 bucket on a local linux filesystem?
No. It's fun for testing, but I wouldn't let it near a production system. It's much better to use a library to communicate with S3. Here's why:
- It won't hide errors. A filesystem only has a few errors codes it can send you to indicate a problem. An S3 library will give you the exact error message from Amazon so you understand what's going on, log it, handle corner cases, etc.
- A library will use less memory. Filesystems layers will cache lots of random stuff that you many never use again. A library puts you in control to decide what to cache and not to cache.
- Expansion. If you ever need to do anything fancy (set an ACL on a file, generate a signed link, versioning, lifecycle, change durability, etc), then you'll have to dump your filesystem abstraction and use a library anyway.
- Timing and retries. Some fraction of requests randomly error out and can be retried. Sometimes you may want to retry a lot, sometimes you would rather error out quickly. A filesystem doesn't give you granular control, but a library will.
The bottom line is that S3 under FUSE is a leaky abstraction. S3 doesn't have (or need) directories. Filesystems weren't built for billions of files. Their permissions models are incompatible. You are wasting a lot of the power of S3 by trying to shoehorn it into a filesystem.
Two random PHP libraries for talking to S3:
https://github.com/KnpLabs/Gaufrette
https://aws.amazon.com/sdkforphp/ - this one is useful if you expand beyond just using S3, or if you need to do any of the fancy requests mentioned above.
Solution 2:
Quite often, it is advantageous to write files to the EBS volume, then force subsequent public requests for the file(s) to route through CloudFront CDN.
In that way, if the app must do any transformations to the file, it's much easier to do on the local drive & system, then force requests for the transformed files to pull from the origin via CloudFront.
e.g. if your user is uploading an image for an avatar, and the avatar image needs several iterations for size & crop, your app can create these on the local volume, but all public requests for the file will take place through a cloudfront origin-pull request. In that way, you have maximum flexibility to keep the original file (or an optimized version of the file), and any subsequent user requests can either pull an existing version from cloud front edge, or cloud front will route the request back to the app and create any necessary iterations.
An elementary example of the above would be WordPress, which creates multiple sized/cropped versions of any graphic image uploaded, in addition to keeping the original (subject to file size restrictions, and/or plugin transformations). CDN-capable WordPress plugins such as W3 Total Cache rewrite requests to pull through CDN, so the app only needs to create unique first-request iterations. Adding browser caching URL versioning (http://domain.tld/file.php?x123) further refines and leverages CDN functionality.
If you are concerned about rapid expansion of EBS volume file size or inodes, you can automate a pruning process for seldom-requested files, or aged files.