Importing data from AWS Athena to RDS instance

Currently I’m listening events from AWS Kinesis and writing them to S3. Then I query them using AWS Glue and Athena.

Is there a way to import that data, possibly with some transformation, to an RDS instance?


There are several general approaches to take with regards to that task.

  1. Read data from and Athena query into a custom ETL script (using a JDBC connection) and load into the database
  2. Mount the S3 bucket holding the data to a file system (perhaps using s3fs-fuse), read the data using a custom ETL script, and push it to the RDS instance(s)
  3. Download the data to be uploaded to the RDS instance to a filesystem using the AWS CLI or the SDK, process locally, and then push to RDS
  4. As you suggest, use AWS Glue to import the data to from Athena to the RDS instance. If you are building an application that is tightly coupled with AWS, and if you are using Kinesis and Athena you are, then such a solution makes sense.

When connecting GLUE to RDS a couple of things to keep in mind (mostly on the networking side:

  1. Ensure that DNS Hostnames are enabled the VPC hosting the target RDS instance
  2. You'll need to setup a self-referencing rule in the Security Group associated with the target RDS instance

For some examples of code targetting a relational database, see the following tutorials


One approach for Postgres:

  1. Install the S3 extension in Postgres: psql=> CREATE EXTENSION aws_s3 CASCADE;

  2. Run the query in Athena and find the CSV result file location in S3 (S3 output location is in Athena settings) (You can also inspect the "Download results" button to get the S3 path)

  3. Create your table in Postgres

  4. Import from S3:

SELECT aws_s3.table_import_from_s3(
    'newtable', '', '(format csv, header true)',
    aws_commons.create_s3_uri('bucketname', 'reports/Unsaved/2021/05/10/aa9f04b0-d082-328g-5c9d-27982d345484.csv', 'us-east-1')
);

If you want to convert empty values to null, you can use this: (format csv, FORCE_NULL (columnname), header true)

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html