Automatically delete old items from s3 bucket

Is there an easy way to set up a bucket in s3 to automatically delete files older than x days?


Amazon now lets you set lifecycle rules on a bucket to automatically expire content:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html


Amazon has meanwhile introduced S3 lifecycles (see the introductory blog post Amazon S3 - Object Expiration), which let you specify a maximum age in days for objects in a bucket - see Object Expiration for details on usage via the S3 API or the AWS Management Console.
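For reference, such a lifecycle rule can also be applied from the AWS CLI with `aws s3api put-bucket-lifecycle-configuration`. This is a sketch, assuming a configured AWS CLI with permission to set lifecycle configuration; the bucket name and rule ID are placeholders:

```shell
#!/bin/sh
# A lifecycle rule that expires every object 30 days after creation.
# "my-bucket-name" and the rule ID are placeholders - use your own.
lifecycle_config='{
  "Rules": [
    {
      "ID": "expire-after-30-days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 30 }
    }
  ]
}'

# Apply the rule (skipped here if the AWS CLI is not installed).
if command -v aws >/dev/null 2>&1; then
  aws s3api put-bucket-lifecycle-configuration \
    --bucket my-bucket-name \
    --lifecycle-configuration "$lifecycle_config"
fi
```

An empty `Prefix` applies the rule to the whole bucket; set a prefix to expire only part of it.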


You can use s3cmd to write a script that runs through your bucket and deletes files based on their age.

You'll need to write some code (bash, Python) on top of it.

You can download s3cmd from http://s3tools.org/s3cmd


Shell script to delete old files using the s3cmd utility.
Source: http://shout.setfive.com/2011/12/05/deleting-files-older-than-specified-time-with-s3cmd-and-bash/

#!/bin/bash
# Usage: ./deleteOld "bucketname" "30 days"
s3cmd ls "s3://$1" | while read -r line; do
  # s3cmd ls prints: date time size s3://bucket/key
  createDate=$(echo "$line" | awk '{print $1" "$2}')
  createDate=$(date -d "$createDate" +%s)
  olderThan=$(date -d "-$2" +%s)
  if [[ $createDate -lt $olderThan ]]; then
    fileName=$(echo "$line" | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//')
    echo "$fileName"
    if [[ $fileName != "" ]]; then
      s3cmd del "$fileName"
    fi
  fi
done

WINDOWS / POWERSHELL

If lifecycle rules do not suit you, on Windows Server this can be done with a simple PowerShell script:

#set a bucket name
$bucket = "my-bucket-name"

#set the expiration date of files
$limit_date = (Get-Date).AddDays(-30)

#get all the files
$files = aws s3 ls "$($bucket)"

#extract the file name and date
$parsed = $files | ForEach-Object { @{ date = $_.split(' ')[0] ; fname = $_.split(' ')[-1] } }

#keep only files older than $limit_date
$filtered = $parsed | Where-Object { ![string]::IsNullOrEmpty($_.date) -and [datetime]::ParseExact($_.date, 'yyyy-MM-dd', $null) -lt $limit_date }

#remove the filtered files
$filtered | ForEach-Object { aws s3 rm "s3://$($bucket)/$($_.fname)" }

This script can fit into one pipeline. Just replace my-bucket-name with the name of your bucket.

$limit_date = (Get-Date).AddDays(-30); aws s3 ls my-bucket-name | ForEach-Object { @{ date = $_.split(' ')[0] ; fname = $_.split(' ')[-1] } } | Where-Object { ![string]::IsNullOrEmpty($_.date) -and [datetime]::ParseExact($_.date, 'yyyy-MM-dd', $null) -lt $limit_date } | ForEach-Object { aws s3 rm "s3://my-bucket-name/$($_.fname)" }

Note that this script only deletes files from the root of the bucket, not recursively. If you need to remove data from a subdirectory, specify its prefix before /$_.fname
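If you need a recursive, script-based cleanup without lifecycle rules, the same date filtering can be done in shell against the output of `aws s3 ls --recursive`. This is a sketch assuming GNU `date` and a placeholder bucket name; runs of spaces inside key names get collapsed by `awk`, so keep `--dryrun` until you have verified the output:

```shell
#!/bin/bash
# Print keys from "aws s3 ls --recursive" output (date time size key)
# whose date field is older than a YYYY-MM-DD cutoff. Dates in this
# format compare correctly as plain strings.
filter_older_than() {
  awk -v cutoff="$1" '$1 != "" && $1 < cutoff { $1=$2=$3=""; sub(/^ +/, ""); print }'
}

bucket="my-bucket-name"                  # placeholder bucket name
cutoff=$(date -d "-30 days" +%Y-%m-%d)   # GNU date; use "date -v-30d" on BSD/macOS

# Dry run first; remove --dryrun to actually delete
# (skipped here if the AWS CLI is not installed).
if command -v aws >/dev/null 2>&1; then
  aws s3 ls "s3://$bucket" --recursive \
    | filter_older_than "$cutoff" \
    | while IFS= read -r key; do
        aws s3 rm --dryrun "s3://$bucket/$key"
      done
fi
```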