Listing files in a specific "folder" of a AWS S3 bucket

Solution 1:

While everybody say that there are no directories and files in s3, but only objects (and buckets), which is absolutely true, I would suggest to take advantage of CommonPrefixes, described in this answer. So, you can do following to get list of "folders" (commonPrefixes) and "files" (objectSummaries):

ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucket.getName()).withPrefix(prefix).withDelimiter(DELIMITER);
ListObjectsV2Result listing = s3Client.listObjectsV2(req);
for (String commonPrefix : listing.getCommonPrefixes()) {
        System.out.println(commonPrefix);
}
for (S3ObjectSummary summary: listing.getObjectSummaries()) {
    System.out.println(summary.getKey());
}

In your case, for objectSummaries (files) it should return (in case of correct prefix):
users/user-id/contacts/contact-id/file1.txt
users/user-id/contacts/contact-id/file2.txt

for commonPrefixes:
users/user-id/contacts/contact-id/

Reference: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html

Solution 2:

Everything in S3 is an object. To you, it may be files and folders. But to S3, they're just objects.

Objects that end with the delimiter (/ in most cases) are usually perceived as a folder, but it's not always the case. It depends on the application. Again, in your case, you're interpretting it as a folder. S3 is not. It's just another object.

In your case above, the object users/<user-id>/contacts/<contact-id>/ exists in S3 as a distinct object, but the object users/<user-id>/ does not. That's the difference in your responses. Why they're like that, we cannot tell you, but someone made the object in one case, and didn't in the other. You don't see it in the AWS Management Console because the console is interpreting it as a folder and hiding it from you.

Since S3 just sees these things as objects, it won't "exclude" certain things for you. It's up to the client to deal with the objects as they should be dealt with.

Your Solution

Since you're the one that doesn't want the folder objects, you can exclude it yourself by checking the last character for a /. If it is, then ignore the object from the response.

Solution 3:

you can check the type. s3 has a special application/x-directory

bucket.objects({:delimiter=>"/", :prefix=>"f1/"}).each { |obj| p obj.object.content_type }

Solution 4:

If your goal is only to take the files and not the folder, the approach I made was to use the file size as a filter. This property is the current size of the file hosted by AWS. All the folders return 0 in that property. The following is a C# code using linq but it shouldn't be hard to translate to Java.

var amazonClient = new AmazonS3Client(key, secretKey, region);
var listObjectsRequest= new ListObjectsRequest
            {
                BucketName = 'someBucketName',
                Delimiter = 'someDelimiter',
                Prefix = 'somePrefix'
            };
var objects = amazonClient.ListObjects(listObjectsRequest);
var objectsInFolder = objects.S3Objects.Where(file => file.Size > 0).ToList();