Quick way to list all files in Amazon S3 bucket?

I have an Amazon S3 bucket that contains tens of thousands of files. What's the easiest way to get a text file that lists all the filenames in the bucket?


I'd recommend using boto. Then it's just a couple of lines of Python:

from boto.s3.connection import S3Connection

conn = S3Connection('access-key','secret-access-key')
bucket = conn.get_bucket('bucket')
for key in bucket.list():
    print(key.name.encode('utf-8'))

Save this as list.py, open a terminal, and then run:

$ python list.py > results.txt

AWS CLI

Documentation for aws s3 ls

AWS has recently released its command line tools. They work much like boto and can be installed using sudo easy_install awscli or sudo pip install awscli

Once it is installed, you can simply run

aws s3 ls

which will show you all of your available buckets:

       CreationTime Bucket
       ------------ ------
2013-07-11 17:08:50 mybucket
2013-07-24 14:55:44 mybucket2

You can then query a specific bucket for files.

Command:

aws s3 ls s3://mybucket

Output:

Bucket: mybucket
Prefix:

      LastWriteTime     Length Name
      -------------     ------ ----
                           PRE somePrefix/
2013-07-25 17:06:27         88 test.txt

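This will show you the files and prefixes at the top level of the bucket. Files under a prefix such as somePrefix/ are not listed individually.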


s3cmd is invaluable for this kind of thing

$ s3cmd ls -r s3://yourbucket/ | awk '{print $4}' > objects_in_bucket


Be careful: an Amazon S3 list request returns at most 1000 files. If you want to iterate over all files, you have to paginate the results using markers.

In Ruby, using aws-s3:

require 'aws/s3'

bucket_name = 'yourBucket'
marker = ""

AWS::S3::Base.establish_connection!(
  :access_key_id => 'your_access_key_id',
  :secret_access_key => 'your_secret_access_key'
)

loop do
  # Fetch the next page of up to 1000 keys, starting after the marker
  objects = AWS::S3::Bucket.objects(bucket_name, :marker => marker, :max_keys => 1000)
  break if objects.size == 0
  # Remember the last key so the next request resumes after it
  marker = objects.last.key

  objects.each do |obj|
    puts obj.key
  end
end
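
For anyone who prefers Python, the same marker loop can be sketched roughly like this. It is not part of the original answer: it assumes a boto3-style client (for example boto3.client("s3")) whose list_objects call accepts Bucket, Marker and MaxKeys, and iter_all_keys is just a name I made up for illustration.

```python
def iter_all_keys(client, bucket_name, page_size=1000):
    """Yield every object key in the bucket, following markers across pages.

    `client` is assumed to behave like a boto3 S3 client (an assumption,
    not something the original answer uses).
    """
    marker = None
    while True:
        kwargs = {"Bucket": bucket_name, "MaxKeys": page_size}
        if marker is not None:
            # Resume the listing after the last key already seen
            kwargs["Marker"] = marker
        page = client.list_objects(**kwargs)
        contents = page.get("Contents", [])
        if not contents:
            break
        for obj in contents:
            yield obj["Key"]
        if not page.get("IsTruncated"):
            # The service reports that there are no further pages
            break
        marker = contents[-1]["Key"]


# Hypothetical usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   for key in iter_all_keys(boto3.client("s3"), "yourBucket"):
#       print(key)
```

Passing the client in as an argument keeps the loop itself free of credentials handling, so it can be tried against a stub before pointing it at a real bucket.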

Hope this helps, vincent


Update 15-02-2019:

This command will give you a list of all buckets in AWS S3:

aws s3 ls

This command will give you a list of all top-level objects inside an AWS S3 bucket:

aws s3 ls bucket-name

This command will give you a list of ALL objects inside an AWS S3 bucket:

aws s3 ls bucket-name --recursive

This command will write a list of ALL objects inside an AWS S3 bucket to a text file in your current directory:

aws s3 ls bucket-name --recursive | cat >> file-name.txt
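
Note that each line of the listing carries a date, a time and a size in front of the name, as in the test.txt example output earlier in this thread. If you only want the filenames, a small filter along these lines may help. This is a rough sketch that assumes every object line has exactly those three leading columns; the function names are made up for illustration.

```python
def key_from_ls_line(line):
    """Return the object name from one listing line, or None if the line
    does not have the date, time, size, name layout (e.g. a PRE line)."""
    # Split on whitespace at most 3 times so names containing spaces stay whole
    parts = line.rstrip("\n").split(None, 3)
    if len(parts) != 4:
        return None
    return parts[3]


def keys_from_listing(lines):
    """Yield the object name from every line that looks like an object entry."""
    for line in lines:
        key = key_from_ls_line(line)
        if key is not None:
            yield key
```

For example, a short script could pass sys.stdin to keys_from_listing, print each name, and be placed at the end of the pipe before redirecting to a file. Unlike awk '{print $4}', this keeps names that contain spaces intact.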