Restore a versioned S3 bucket to a particular point in time
Let's say I've got S3 versioning enabled for my bucket: http://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html
Then, let's say someone (for example, junior employee) messes up the S3 bucket (deletes some files accidentally, etc.)
How can I then restore the entire versioned bucket to a particular point in time? I believe this should be possible given S3's API, but I'd rather not have to write such a script myself, for fear of missing something (I'm not an AWS expert).
Are there are good solution to this problem? I'm using the S3 bucket as an image store for my Rails app, so something Ruby-based that I could use as a rake task would be ideal.
You can use s3-pit-restore
S3 Point in Time Restore is a tool you can use exactly to restore a bucket or a subset of a bucket to a given point in time, like this:
s3-pit-restore --bucket my-bucket --dest my-restored-bucket --timestamp "06-17-2016 23:59:50 +2"
What s3-pit-restore actually offers:
- Restore of all files with timestamp less than the given one
- Restore of a whole bucket or a bucket prefix
- Parallel download of multiple files with a great overall speed
- Customization of parallel workers count to optimize bandwidth usage
- Restore from s3 bucket versions or from glacier if enabled
If I understand the documentation correctly, when you have versioning enabled deleting the file simply reverts the "latest" version back one version number. This however does not give the ability to restore an entire bucket. This makes the previous versions in S3 not suitable for your needs (i.e, recovery from deletion).
Keep a backup someplace else as well just in case. Stack Overflow has a question/answer on this using s3cmd
. I'm sure you could find a Ruby-based script somewhere or ask on that site for help if you need it.