How can I set up an SFTP server backed by S3 (or similar)

I answered this same question on Stack Overflow.

s3fs is indeed a reasonable solution, and in my case, I've coupled it with ProFTPD with excellent results, in spite of the theoretical/potential problems.
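
For context, here is a minimal sketch of what the ProFTPD side of such a setup can look like, using mod_sftp and chrooting users into the s3fs mount point; the port, paths, and key locations are assumptions for illustration, not my actual configuration:

# Hypothetical /etc/proftpd/conf.d/sftp.conf
<IfModule mod_sftp.c>
  SFTPEngine on
  Port 2222                                        # leave 22 for the system sshd
  SFTPHostKey /etc/ssh/ssh_host_rsa_key
  SFTPAuthMethods publickey
  SFTPAuthorizedUserKeys file:/etc/proftpd/authorized_keys/%u
  DefaultRoot /srv/s3fs/example-bucket             # chroot users into the s3fs mount
</IfModule>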

At the time I wrote the answer, I had only set this up for one of my consulting clients... but since then, I've also started drinking my own kool-aid and am using it in production at my day job. Companies we exchange data with upload and download files all day long on my SFTP server, which stores everything directly on S3. As a bonus, my report-exporting system (which writes Excel spreadsheets directly to S3) can export reports "to the FTP server" by simply putting them directly into the FTP server's bucket, with appropriate metadata to record the uid, gid, and mode of each file. (s3fs uses the x-amz-meta-uid, -gid, and -mode headers to emulate filesystem permissions.) When the client logs on to the server, the report files are just... there.
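
For illustration, here is a hedged sketch of writing such a report straight into the bucket with that metadata, using the AWS CLI (the bucket, key, and numeric IDs are placeholders; 33188 is the decimal encoding of a regular file with 0644 permissions, which is the convention my s3fs version uses, so verify it against yours):

aws s3 cp report.xlsx s3://example-bucket/reports/report.xlsx \
    --metadata uid=1000,gid=1000,mode=33188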

I do think the ideal solution would probably be an sftp to S3 gateway service, but I still haven't gotten around to designing one, since this solution works really well... with some caveats, of course:

Not all of the default values for s3fs are sane. You will probably want to specify these options:

-o enable_noobj_cache   # s3fs has a huge performance hit for large directories without this enabled
-o stat_cache_expire=30 # the ideal time will vary according to your usage
-o enable_content_md5   # it's beyond me why this safety check is disabled by default
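
Putting those together, a typical mount command looks roughly like this (the bucket name and mount point are placeholders; allow_other is only needed if other local users, such as your SFTP users, must be able to see the mount):

s3fs example-bucket /srv/s3fs/example-bucket \
    -o enable_noobj_cache \
    -o stat_cache_expire=30 \
    -o enable_content_md5 \
    -o allow_other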

It's probably best to use a region other than US-Standard, because that's the only region that doesn't offer read-after-write consistency on new objects. (Or, if you need to use US-Standard, you can use the almost undocumented hostname your-bucket.s3-external-1.amazonaws.com from the us-east-1 region to prevent your requests from being geo-routed, which may improve consistency.)
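
If you do end up on US-Standard, here is a hedged sketch of pointing s3fs at that endpoint, assuming your s3fs build supports the url option:

s3fs example-bucket /srv/s3fs/example-bucket \
    -o url=https://s3-external-1.amazonaws.com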

I have object versioning enabled on the bucket, which s3fs is completely unaware of. The benefit of this is that even if a file should get "stomped," I can always go to bucket versioning to recover the "overwritten" file. Object versioning in S3 was brilliantly designed in such a way that S3 clients that are unaware of versioning are in no way disabled or confused, because if you don't make versioning-aware REST calls, the responses S3 returns are compatible with clients that have no concept of versioning.
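
Enabling versioning is a one-time call against the bucket; for example, with the AWS CLI (the bucket name is a placeholder):

aws s3api put-bucket-versioning --bucket example-bucket \
    --versioning-configuration Status=Enabled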

Note also that transferring data into S3 is free of data transfer charges. You pay only the per-request pricing. Transferring data out of S3 into EC2 within a region is also free of data transfer charges. It's only when you transfer out of S3 to the Internet, to CloudFront, or to another AWS region that you pay transfer charges. If you want to use the lower-priced reduced-redundancy storage, s3fs supports that with -o use_rrs.

As an amusing aside, you'll always get a warm fuzzy feeling when you see the 256 terabytes of free space (and 0 used, since a real calculation of sizes is impractical because S3 is an object store, not a filesystem).

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      7.9G  1.4G  6.2G  18% /
s3fs            256T     0  256T   0% /srv/s3fs/example-bucket

Of course, you can mount the bucket anywhere. I just happen to have it in /srv/s3fs.


Check out the SFTP Gateway on the AWS Marketplace.

We experienced reliability issues with s3fs, so we developed a custom solution specifically for this purpose. We've been using it in production for several years without issue and have recently released it to the AWS Marketplace.


There are two options. You can use a native managed SFTP service recently added by Amazon (which is easier to set up), or you can mount the bucket to a file system on a Linux server and access the files over SFTP like any other files on the server (which gives you greater control).

Managed SFTP Service

  • In your Amazon AWS Console, go to AWS Transfer for SFTP and create a new server.

  • On the SFTP server page, add a new SFTP user (or users).

    • User permissions are governed by an associated AWS IAM role (for a quick start, you can use the AmazonS3FullAccess policy).

    • The role must have a trust relationship to transfer.amazonaws.com (a command-line sketch follows this list).
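
If you prefer the command line to the console, here is a hedged sketch of the same setup with the AWS CLI; the role name, server ID, account number, user name, and public key are placeholders:

# Create a role that AWS Transfer can assume, and grant it S3 access
aws iam create-role --role-name sftp-transfer-role \
    --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"transfer.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name sftp-transfer-role \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess

# Create the SFTP server and a user mapped to your bucket
aws transfer create-server --identity-provider-type SERVICE_MANAGED
aws transfer create-user --server-id s-xxxxxxxxxxxxxxxxx \
    --user-name example-user \
    --role arn:aws:iam::123456789012:role/sftp-transfer-role \
    --home-directory /example-bucket \
    --ssh-public-key-body "ssh-rsa AAAA... user@example"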

For details, see my guide Setting up an SFTP access to Amazon S3.

Mounting Bucket to Linux Server

As @Michael already answered, just mount the bucket on a Linux server (such as an Amazon EC2 instance) using the s3fs file system (or similar) and use the server's built-in SFTP server to access the bucket.

Here are the basic instructions (a command-line sketch follows the list):

  • Install s3fs

  • Add your security credentials in the form access-key-id:secret-access-key to /etc/passwd-s3fs

  • Add a bucket mount entry to /etc/fstab:

      <bucket> /mnt/<bucket> fuse.s3fs rw,nosuid,nodev,allow_other 0 0
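
A hedged command-line sketch of those steps; the package name, credentials, bucket, and mount point below are placeholders, and the exact install command depends on your distribution:

# Install s3fs (Debian/Ubuntu package name shown; on older systems, build from source)
sudo apt-get install s3fs

# Store credentials as access-key-id:secret-access-key; the file must not be world-readable
echo 'AKIAEXAMPLEKEYID:example-secret-access-key' | sudo tee /etc/passwd-s3fs
sudo chmod 600 /etc/passwd-s3fs

# Create the mount point and mount it using the fstab entry above
sudo mkdir -p /mnt/example-bucket
sudo mount /mnt/example-bucket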
    

For details, see my guide Setting up an SFTP access to Amazon S3.

Use S3 Client

Alternatively, use any free "FTP/SFTP client" that is also an "S3 client", and you do not have to set up anything on the server side. For example, my WinSCP or Cyberduck.