How do I back up an AWS S3 Bucket without versioning the source bucket [closed]

Another approach is to enable S3 versioning on your bucket. You can then restore deleted or overwritten files. See the S3 documentation for how to enable this.

Third-party tools like BucketExplorer make working with versioning pretty trivial (compared with calling the API directly yourself).

You can also enable multi-factor authentication delete for your S3 buckets - which makes "accidental deletion" that little bit harder ;)

More on Multi Factor Authentication Delete
More on Deleting Objects
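
For anyone who would rather script this than use a GUI tool, here is a minimal sketch using the AWS CLI (my choice of tool, not something the answer above mentions). The function names and the `AWS` override variable are made up for illustration. As far as I know, MFA delete can only be turned on with the root account's credentials and its MFA device, so treat the second function as an outline rather than something to paste blindly.

```shell
# Command used to reach AWS; override it (e.g. AWS=echo) to preview the calls.
AWS=${AWS:-aws}

# Turn on versioning for a bucket.
# $1 = bucket name
enable_versioning() {
    $AWS s3api put-bucket-versioning \
        --bucket "$1" \
        --versioning-configuration Status=Enabled
}

# Turn on versioning together with MFA delete.
# $1 = bucket name, $2 = MFA device serial number or ARN, $3 = current MFA code
enable_mfa_delete() {
    $AWS s3api put-bucket-versioning \
        --bucket "$1" \
        --versioning-configuration Status=Enabled,MFADelete=Enabled \
        --mfa "$2 $3"
}
```

Running `AWS=echo` first and then `enable_versioning mybucket` prints the command instead of executing it, which is a cheap way to check the arguments before touching a real bucket.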

You could use s3cmd (http://s3tools.org/s3cmd).

So, to back up a bucket called mybucket:

s3cmd mb s3://mybucket-backup
s3cmd --recursive cp s3://mybucket s3://mybucket-backup
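
If you plan to repeat the backup, a variant built on `s3cmd sync` may be cheaper than a full recursive copy, since sync is meant to skip objects that already match at the destination. This is only a sketch: the function name and the `S3CMD` override variable are hypothetical, and it assumes your s3cmd version supports bucket-to-bucket sync.

```shell
# Command used to reach S3; override it (e.g. S3CMD=echo) to preview the call.
S3CMD=${S3CMD:-s3cmd}

# Copy new or changed objects from one bucket to another.
# $1 = source bucket, $2 = backup bucket (must already exist)
# No --delete-removed flag is passed, so objects deleted from the source
# are left in place in the backup bucket.
sync_to_backup() {
    $S3CMD sync "s3://$1/" "s3://$2/"
}
```

Usage would be something like `sync_to_backup mybucket mybucket-backup`, with the second argument being whatever you named the backup bucket.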

One possible solution is to create a "backup bucket" and duplicate your sensitive data there. In theory your data is safer in S3 than on your hard drive.

Also, I'm not sure accidental deletions of a whole bucket are a real problem: S3 won't delete a bucket that still contains objects, so you would have to accidentally delete every key in the bucket first.

Another possible solution is to replicate your bucket to the Europe region in S3. The copy there may survive an accidental deletion long enough for you to recover the data.

This isn't a cheap solution, but if your buckets really are critical, here's how you do it: boot an Amazon EC2 instance and sync the content there periodically.

Amazon EC2 is Amazon's virtual server hosting service. You can spin up instances of Linux, Windows, etc. and run anything you want. You pay by the hour, and each server comes with a fairly large amount of local storage. For example, I use the "large" size instance, which comes with 850GB of local disk space.

The cool part is that it's on the same network as S3, and you get unlimited transfers between S3 and EC2. I use the $20 Jungle Disk software on a Windows EC2 instance, which lets me access my S3 buckets as if they were local disk folders. Then I can run scheduled batch files to copy stuff out of S3 and onto my local EC2 disk space. You can automate that to keep hourly backups, or, if you want to gamble, set up Jungle Disk (or its Linux equivalents) to sync once an hour or so. With a sync, if someone deletes a file, you've got at least a few minutes to get it back from EC2 before the next run. I'd recommend the regular scripted backups though: it's easy to keep a few days of backups if you're compressing them onto an 850GB volume.
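
On a Linux instance, the scripted backup described above could look roughly like this. It is a sketch under assumptions, not what the answer actually runs: it swaps Jungle Disk and batch files for s3cmd and tar, and the function names, the `S3CMD` override variable, and the archive naming scheme are all made up for illustration.

```shell
# Command used to reach S3; override it to stub the download step.
S3CMD=${S3CMD:-s3cmd}

# Mirror a bucket to local disk, then snapshot the mirror as a compressed,
# timestamped archive.
# $1 = bucket name, $2 = local backup directory
backup_bucket() {
    bucket=$1
    backup_root=$2
    stamp=$(date +%Y-%m-%dT%H%M%S)
    workdir="$backup_root/current"
    mkdir -p "$workdir"
    # No --delete-removed flag, so files deleted from S3 stay in the mirror.
    $S3CMD sync "s3://$bucket/" "$workdir/" || return 1
    tar -czf "$backup_root/$bucket-$stamp.tar.gz" -C "$workdir" .
}

# Delete the oldest archives so that at most $3 remain.
# $1 = local backup directory, $2 = bucket name, $3 = number of archives to keep
prune_backups() {
    backup_root=$1
    bucket=$2
    keep=$3
    total=$(find "$backup_root" -maxdepth 1 -type f -name "$bucket-*.tar.gz" | wc -l)
    excess=$((total - keep))
    [ "$excess" -gt 0 ] || return 0
    # The timestamps sort lexically, so the first $excess names are the oldest.
    find "$backup_root" -maxdepth 1 -type f -name "$bucket-*.tar.gz" | sort |
        head -n "$excess" |
        while IFS= read -r archive; do
            rm -f -- "$archive"
        done
}
```

Calling `backup_bucket mybucket /mnt/backups` followed by `prune_backups /mnt/backups mybucket 72` from an hourly cron job would keep about three days of hourly archives. The path and the count are placeholders to adjust to your disk space.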

This is really useful for SQL Server log shipping, but I can see how it'd accomplish your objective too.
