Automatically delete old items from s3 bucket
Is there an easy way to set up a bucket in s3 to automatically delete files older than x days?
Amazon now has the ability to set lifecycle rules on a bucket to automatically expire content:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html
Amazon has meanwhile introduced S3 lifecycles (see the introductory blog post Amazon S3 - Object Expiration), where you can specify a maximum age in days for objects in a bucket; see Object Expiration for details on configuring it via the S3 API or the AWS Management Console.
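For example, a rule that expires every object 30 days after it was created can be applied with the AWS CLI. This is a minimal sketch, assuming the bucket is called my-bucket-name and that a 30-day window and the rule ID expire-old-objects suit you. Save this as lifecycle.json:
{
    "Rules": [
        {
            "ID": "expire-old-objects",
            "Filter": { "Prefix": "" },
            "Status": "Enabled",
            "Expiration": { "Days": 30 }
        }
    ]
}
Then apply and verify it:
# attach the lifecycle configuration to the bucket
aws s3api put-bucket-lifecycle-configuration --bucket my-bucket-name --lifecycle-configuration file://lifecycle.json
# read it back to confirm the rule is in place
aws s3api get-bucket-lifecycle-configuration --bucket my-bucket-name
S3 evaluates lifecycle rules roughly once a day and removes expired objects asynchronously, so nothing has to run on your side.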
You can use s3cmd in a script to run through your bucket and delete files that match a condition.
You'll need to write some code (bash, python) on top of it.
You can download s3cmd from http://s3tools.org/s3cmd
Shell script to delete old files using the s3cmd utility.
Source: http://shout.setfive.com/2011/12/05/deleting-files-older-than-specified-time-with-s3cmd-and-bash/
#!/bin/bash
# Usage: ./deleteOld "bucketname" "30 days"
# Relies on GNU date for the -d option.
s3cmd ls "s3://$1" | while read -r line; do
    # the first two columns of `s3cmd ls` are the object's date and time
    createDate=$(echo "$line" | awk '{print $1" "$2}')
    createDate=$(date -d "$createDate" +%s)
    olderThan=$(date -d "-$2" +%s)
    if [[ $createDate -lt $olderThan ]]; then
        # everything after the first three columns is the object URL
        fileName=$(echo "$line" | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//')
        echo "$fileName"
        if [[ $fileName != "" ]]; then
            s3cmd del "$fileName"
        fi
    fi
done
WINDOWS / POWERSHELL
If the lifecycle approach does not suit you, then on Windows Server this can be done with a simple PowerShell script:
#set a bucket name
$bucket = "my-bucket-name"
#set the cut-off date; files older than this will be deleted
$limit_date = (Get-Date).AddDays(-30)
#get all the files
$files = aws s3 ls "$($bucket)"
#extract the file name and date
$parsed = $files | ForEach-Object { @{ date = $_.split(' ')[0] ; fname = $_.split(' ')[-1] } }
#keep only the files older than $limit_date
$filtered = $parsed | Where-Object { ![string]::IsNullOrEmpty($_.date) -and [datetime]::ParseExact($_.date, 'yyyy-MM-dd', $null) -lt $limit_date }
#remove the filtered files
$filtered | ForEach-Object { aws s3 rm "s3://$($bucket)/$($_.fname)" }
This also fits into a single command; just replace my-bucket-name with the name of your bucket (the 30-day cut-off is inlined here).
aws s3 ls my-bucket-name | ForEach-Object { @{ date = $_.split(' ')[0] ; fname = $_.split(' ')[-1] } } | Where-Object { ![string]::IsNullOrEmpty($_.date) -and [datetime]::ParseExact($_.date, 'yyyy-MM-dd', $null) -lt (Get-Date).AddDays(-30) } | ForEach-Object { aws s3 rm "s3://my-bucket-name/$($_.fname)" }
Note that this command only deletes files from the root of the bucket, not recursively. If you need to remove data from a subdirectory, list that prefix and include it before $($_.fname) in the rm path, as in the sketch below.
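For illustration, a variant for a single subdirectory; backups/ is an assumed placeholder prefix, everything else mirrors the one-liner above:
#cut-off date, as before
$limit_date = (Get-Date).AddDays(-30)
#list only the backups/ prefix and delete its old objects
aws s3 ls s3://my-bucket-name/backups/ | ForEach-Object { @{ date = $_.split(' ')[0] ; fname = $_.split(' ')[-1] } } | Where-Object { ![string]::IsNullOrEmpty($_.date) -and [datetime]::ParseExact($_.date, 'yyyy-MM-dd', $null) -lt $limit_date } | ForEach-Object { aws s3 rm "s3://my-bucket-name/backups/$($_.fname)" }
The date filter also skips the "PRE" rows that aws s3 ls prints for nested prefixes, so only objects directly under backups/ are removed.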