How can I ensure that an S3 upload triggers a Lambda function, but copying data within the same bucket does not trigger it again?

Required procedure:

  1. Someone does an upload to an S3 bucket.
  2. This triggers a Lambda function that does some processing on the uploaded file(s).
  3. Processed objects are now copied into a "processed" folder within the same bucket.

The copy-operation in Step 3 should never re-trigger the initial Lambda function itself.
I know that the general guidance is to use a different bucket for storing the processed objects in a situation like this (but this is not possible in this case).

So my approach was to set up the S3 trigger to listen only to the PUT and POST event types and to exclude the COPY event type. The Lambda function itself uses python-boto (S3_CLIENT.copy_object(..)). The approach seems to work: the Lambda function does not appear to be retriggered by the copy operation.
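For reference, here is a minimal sketch of what such a trigger configuration could look like when set up with boto3. The function names, the bucket name, and the ARN in the usage note are hypothetical placeholders. The event type strings and `put_bucket_notification_configuration` are the standard S3 ones.

```python
def build_notification_config(lambda_arn):
    """Build an S3 notification configuration that reacts to uploads only."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                # s3:ObjectCreated:Copy is deliberately left out, so that
                # copy_object calls do not invoke the function again.
                "Events": ["s3:ObjectCreated:Put", "s3:ObjectCreated:Post"],
            }
        ]
    }


def apply_notification_config(bucket, config):
    """Apply the configuration to a bucket (hypothetical helper).

    Note that this call replaces the bucket's whole notification
    configuration, not just the Lambda part.
    """
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=config
    )
```

Usage would be something like `apply_notification_config("my-bucket", build_notification_config(my_lambda_arn))`. One caveat with this setup: multipart uploads are reported as `s3:ObjectCreated:CompleteMultipartUpload`, so large files uploaded that way would not trigger the function unless that event type is added as well.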

However, I wanted to ask whether this approach is really reliable.


Solution 1:

You can filter which events trigger the S3 notification.

In general, there are two ways to trigger a Lambda function from an S3 event: bucket notifications and EventBridge.

Notifications: https://docs.aws.amazon.com/AmazonS3/latest/userguide/notification-how-to-filtering.html

EventBridge: https://aws.amazon.com/blogs/aws/new-use-amazon-s3-event-notifications-with-amazon-eventbridge/

In your case, a quick search doesn't show me a way to set up a "negative" rule such as "everything that doesn't have the processed prefix". But you can rework your bucket structure a bit: put unprocessed items under an "unprocessed" prefix and set up the filter on that prefix only.
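A rough sketch of the prefix-based setup, assuming the prefixes are named `unprocessed/` and `processed/`; all function names here are made up for illustration. The first function builds a notification configuration restricted to the `unprocessed/` prefix. The other two show an optional extra guard inside the Lambda handler, in case an event for another prefix ever reaches it.

```python
from urllib.parse import unquote_plus

# Assumed folder layout; adjust to the real bucket structure.
UNPROCESSED_PREFIX = "unprocessed/"
PROCESSED_PREFIX = "processed/"


def build_prefix_filtered_config(lambda_arn):
    """Notification configuration that only fires for keys under unprocessed/."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": UNPROCESSED_PREFIX}
                        ]
                    }
                },
            }
        ]
    }


def keys_to_process(event):
    """Return the decoded object keys from an S3 event that should be handled."""
    keys = []
    for record in event.get("Records", []):
        # Keys in S3 event records are URL-encoded (a space arrives as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        # Defensive guard: ignore anything outside the unprocessed/ prefix.
        if key.startswith(UNPROCESSED_PREFIX):
            keys.append(key)
    return keys


def target_key(source_key):
    """Map unprocessed/<name> to processed/<name>."""
    return PROCESSED_PREFIX + source_key[len(UNPROCESSED_PREFIX):]
```

With this layout, the copy into `processed/` never matches the prefix filter, so it no longer matters which event type the copy operation produces.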