Export data from DynamoDB

This will export all items as JSON documents:

aws dynamodb scan --table-name TABLE_NAME > export.json

This script reads from a remote DynamoDB table and imports the full table into a local DynamoDB instance.

TABLE=YOURTABLE
maxItems=25
index=0

# Copy the first page. 25 items per page matches the BatchWriteItem limit.
DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems")
((index+=1))
# Build one request object holding every item of the page.
echo "$DATA" | jq "{\"$TABLE\": [.Items[] | {PutRequest: {Item: .}}]}" > inserts.jsons
aws dynamodb batch-write-item --request-items file://inserts.jsons --endpoint-url http://localhost:8000

# -r drops the JSON quotes; "// empty" yields an empty string when there is no NextToken.
nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
while [[ -n "${nextToken}" ]]
do
  DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems" --starting-token "$nextToken")
  ((index+=1))
  echo "$DATA" | jq "{\"$TABLE\": [.Items[] | {PutRequest: {Item: .}}]}" > inserts.jsons
  aws dynamodb batch-write-item --request-items file://inserts.jsons --endpoint-url http://localhost:8000
  nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
done
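BatchWriteItem can accept a request and still report some of its items back under UnprocessedItems, for example when the table is throttled. The sketch below resubmits those items. The function name write_with_retry, the attempt limit and the linear backoff are assumptions made for illustration, not part of the script above.

```shell
# Send one request file and resubmit whatever comes back as unprocessed.
# Usage: write_with_retry <request-file> [max-attempts]
write_with_retry() {
  local request_file=$1 max_attempts=${2:-5} attempt=1 response pending
  pending=$(cat "$request_file")
  while (( attempt <= max_attempts )); do
    response=$(aws dynamodb batch-write-item \
      --request-items "$pending" \
      --endpoint-url http://localhost:8000 \
      --output json) || return 1
    # UnprocessedItems has the same shape as the request, so it can be resent as-is.
    pending=$(echo "$response" | jq -c '.UnprocessedItems // {}')
    [[ "$pending" == "{}" ]] && return 0
    sleep "$attempt"   # simple linear backoff between attempts (an assumption)
    ((attempt+=1))
  done
  echo "gave up with unprocessed items: $pending" >&2
  return 1
}
```

In the loop above, the batch-write-item line could then be replaced by: write_with_retry inserts.jsons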

Here is a version of the script that uses files to keep the exported data on disk.

TABLE=YOURTABLE
maxItems=25
index=0
# Save each page of the scan to its own numbered file.
DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems")
((index+=1))
echo "$DATA" > "$TABLE-$index.json"

# -r drops the JSON quotes; "// empty" yields an empty string when there is no NextToken.
nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
while [[ -n "${nextToken}" ]]
do
  DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems" --starting-token "$nextToken")
  ((index+=1))
  echo "$DATA" > "$TABLE-$index.json"
  nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
done

# Import every saved page into the local table.
for x in "$TABLE"-*.json; do
  jq "{\"$TABLE\": [.Items[] | {PutRequest: {Item: .}}]}" "$x" > inserts.jsons
  aws dynamodb batch-write-item --request-items file://inserts.jsons --endpoint-url http://localhost:8000
done

There is a tool named DynamoDBtoCSV that can be used to export all the data to a CSV file. However, for the other way around you will have to build your own tool. My suggestion is that you add this functionality to the tool and contribute it to the Git repository.

Another way is to use AWS Data Pipeline for this task (you will save all the costs of reading the data from outside the AWS infrastructure). The approach is similar:

Build the pipeline for output.
Download the file.
Parse it with a custom reader.

Here is a way to export some data (oftentimes we just want a sample of our prod data locally) from a table using the AWS CLI and jq. Let's assume we have a prod table called, unsurprisingly, my-prod-table and a local table called my-local-table.

To export the data run the following:

aws dynamodb scan --table-name my-prod-table \
| jq '{"my-local-table": [.Items[] | {PutRequest: {Item: .}}]}' > data.json

Basically, we scan the prod table, reshape the scan output into the request format that BatchWriteItem expects, and dump the result into a file.
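To make the reshaping concrete, here is the same jq filter applied to a tiny hand-written scan result. The two sample items and the helper name to_batch_request are made up for illustration; a real scan returns whatever attributes the table holds.

```shell
# Hypothetical two-item scan result, standing in for real `aws dynamodb scan` output.
sample='{"Items":[{"id":{"S":"1"}},{"id":{"S":"2"}}],"Count":2,"ScannedCount":2}'

# Same filter as in the export command; -c prints the result on one line.
to_batch_request() {
  jq -c '{"my-local-table": [.Items[] | {PutRequest: {Item: .}}]}'
}

echo "$sample" | to_batch_request
# {"my-local-table":[{"PutRequest":{"Item":{"id":{"S":"1"}}}},{"PutRequest":{"Item":{"id":{"S":"2"}}}}]}
```

Each item is wrapped in its own PutRequest, and all of them are collected in a single list keyed by the target table name.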

To import the data in your local table run:

aws dynamodb batch-write-item \
--request-items file://data.json \
--endpoint-url http://localhost:8000

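Note: There are some restrictions on the batch-write-item request. A BatchWriteItem operation can contain up to 25 individual PutItem and DeleteItem requests and can write up to 16 MB of data (the maximum size of an individual item is 400 KB). The export above therefore imports as-is only when the scan returns at most 25 items; for a small sample, add --max-items 25 to the scan command.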
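When the sample is larger than 25 items, one option is to cut the scan output into several request objects before importing. The following is a minimal sketch under that assumption. The function name split_batches and the file name scan.json are hypothetical, and it expects the raw scan output (with its Items array) rather than the already reshaped data.json.

```shell
# Split a raw scan result into BatchWriteItem payloads of at most 25 items each.
# Prints one compact JSON request per line.
# Usage: split_batches <scan.json> <target-table> [batch-size]
split_batches() {
  local scan_file=$1 table=$2 size=${3:-25}
  jq -c --arg t "$table" --argjson n "$size" '
    .Items as $items
    | range(0; $items | length; $n) as $i
    | {($t): [$items[$i:($i + $n)][] | {PutRequest: {Item: .}}]}
  ' "$scan_file"
}

# Possible usage (not run here):
#   aws dynamodb scan --table-name my-prod-table > scan.json
#   split_batches scan.json my-local-table | while IFS= read -r batch; do
#     aws dynamodb batch-write-item --request-items "$batch" \
#       --endpoint-url http://localhost:8000
#   done
```

Splitting by item count does not guard against the 16 MB request limit, so tables with very large items may need a smaller batch size.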

Export it from the DynamoDB interface to S3.

Then convert it to JSON using sed:

sed -e 's/$/}/' -e $'s/\x02/,"/g' -e $'s/\x03/":/g' -e 's/^/{"/' <exported_table> > <exported_table>.json
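The sed expression implies that each exported line separates attributes with the byte \x02 and separates an attribute name from its value with \x03. Assuming that layout, the sketch below runs the same substitutions on one made-up record. The helper name export_to_json and the sample attributes are hypothetical.

```shell
# Convert delimiter-separated export lines on stdin to one JSON object per line.
export_to_json() {
  sed -e 's/$/}/' -e $'s/\x02/,"/g' -e $'s/\x03/":/g' -e 's/^/{"/'
}

# Hypothetical record with two attributes; printf makes the control bytes explicit.
printf 'id\x03{"s":"1"}\x02name\x03{"s":"bob"}\n' | export_to_json
# {"id":{"s":"1"},"name":{"s":"bob"}}
```

The first and last expressions add the opening `{"` and the closing `}`, while the two middle ones turn the control bytes into the `,"` and `":` that JSON expects between keys and values.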
