Export data from DynamoDB
This exports all items as JSON documents:
aws dynamodb scan --table-name TABLE_NAME > export.json
This script reads from a remote DynamoDB table and imports the full table into a local DynamoDB instance.
TABLE=YOURTABLE
maxItems=25
index=0
DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems")
((index+=1))
echo "$DATA" | jq ".Items | {\"$TABLE\": [{\"PutRequest\": { \"Item\": .[]}}]}" > inserts.jsons
aws dynamodb batch-write-item --request-items file://inserts.jsons --endpoint-url http://localhost:8000
# -r drops the JSON quotes; "// empty" gives an empty string instead of "null" on the last page
nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
# keep scanning while the previous page returned a token
while [[ -n "${nextToken}" ]]
do
DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems" --starting-token "$nextToken")
((index+=1))
echo "$DATA" | jq ".Items | {\"$TABLE\": [{\"PutRequest\": { \"Item\": .[]}}]}" > inserts.jsons
aws dynamodb batch-write-item --request-items file://inserts.jsons --endpoint-url http://localhost:8000
nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
done
Here is a version of the script that keeps the exported data on disk as files.
TABLE=YOURTABLE
maxItems=25
index=0
DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems")
((index+=1))
echo "$DATA" > "$TABLE-$index.json"
# -r drops the JSON quotes; "// empty" gives an empty string instead of "null" on the last page
nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
while [[ -n "${nextToken}" ]]
do
DATA=$(aws dynamodb scan --table-name "$TABLE" --max-items "$maxItems" --starting-token "$nextToken")
((index+=1))
echo "$DATA" > "$TABLE-$index.json"
nextToken=$(echo "$DATA" | jq -r '.NextToken // empty')
done
# import each saved page; a glob avoids parsing the output of ls
for x in "$TABLE"-*.json; do
jq ".Items | {\"$TABLE\": [{\"PutRequest\": { \"Item\": .[]}}]}" "$x" > inserts.jsons
aws dynamodb batch-write-item --request-items file://inserts.jsons --endpoint-url http://localhost:8000
done
There is a tool named DynamoDBtoCSV that can be used to export all the data to a CSV file. However, for the other direction you will have to build your own tool. My suggestion is that you add this functionality to the tool and contribute it to the Git repository.
Another way is to use AWS Data Pipeline for this task (you will save the cost of reading the data from outside the AWS infrastructure). The approach is similar:
- Build the pipeline for output.
- Download the file.
- Parse it with a custom reader.
Here is a way to export some data from a table using the AWS CLI and jq (often we just want a sample of our prod data locally).
Let's assume we have a prod table called, unsurprisingly, my-prod-table and a local table called my-local-table.
To export the data, run the following:
aws dynamodb scan --table-name my-prod-table \
| jq '{"my-local-table": [.Items[] | {PutRequest: {Item: .}}]}' > data.json
Basically, we scan the prod table, reshape the scan output into the format that BatchWriteItem expects, and dump the result into a file.
To import the data into your local table, run:
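To make the reshaping concrete, here is a small sketch that runs an equivalent jq filter on a hand-written scan response instead of a real table. The sample items and the function name to_batch_request are made up for illustration; the table name is passed in with --arg rather than hard-coded in the filter.

```shell
#!/usr/bin/env bash
# Reshape a scan response (read from stdin) into a batch-write-item request body.
# $1 is the destination table name, handed to jq as the variable $table.
to_batch_request() {
  jq --arg table "$1" '{($table): [.Items[] | {PutRequest: {Item: .}}]}'
}

# Hypothetical scan output containing two items.
sample='{"Items":[{"id":{"S":"1"}},{"id":{"S":"2"}}],"Count":2,"ScannedCount":2}'

# Prints one object keyed by the table name, holding one PutRequest per item.
printf '%s\n' "$sample" | to_batch_request my-local-table
```

Each element of Items ends up wrapped as {"PutRequest": {"Item": ...}}, and the whole list sits under the destination table name.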
aws dynamodb batch-write-item \
--request-items file://data.json \
--endpoint-url http://localhost:8000
Note: There are some restrictions on the batch-write-item request. The BatchWriteItem operation can contain up to 25 individual PutItem and DeleteItem requests and can write up to 16 MB of data. (The maximum size of an individual item is 400 KB.)
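Because of the 25-request limit, a data.json produced from a scan of more than 25 items cannot be sent in one call. One possible workaround, sketched below, is to slice the request file into chunks of at most 25 requests and import them one by one. The function names split_batches and import_batches and the batch-N.json file naming are assumptions made for this example, not part of the AWS CLI.

```shell
#!/usr/bin/env bash
# Split a batch-write-item request file into files of at most 25 requests each.
# Usage: split_batches data.json my-local-table [prefix]
# Writes prefix-1.json, prefix-2.json, ... and prints the number of files written.
split_batches() {
  local file=$1 table=$2 prefix=${3:-batch}
  local total start i=0
  total=$(jq --arg t "$table" '.[$t] | length' "$file")
  for ((start = 0; start < total; start += 25)); do
    i=$((i + 1))
    # Take the slice [start, start+25) of the request list for this table.
    jq --arg t "$table" --argjson s "$start" \
      '{($t): .[$t][$s:($s + 25)]}' "$file" > "$prefix-$i.json"
  done
  echo "$i"
}

# Send every chunk to the local endpoint used elsewhere in this answer.
import_batches() {
  local prefix=${1:-batch} f
  for f in "$prefix"-*.json; do
    aws dynamodb batch-write-item \
      --request-items "file://$f" \
      --endpoint-url http://localhost:8000
  done
}
```

A typical run would be split_batches data.json my-local-table followed by import_batches. This sketch only checks the request count, not the 16 MB size limit, and it does not retry anything the service returns under UnprocessedItems.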
Export it from the DynamoDB interface to S3.
Then convert it to JSON using sed:
sed -e 's/$/}/' -e $'s/\x02/,"/g' -e $'s/\x03/":/g' -e 's/^/{"/' <exported_table> > <exported_table>.json
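To see what those sed expressions do, here is a small worked example on a single made-up record. It assumes, as the expressions themselves imply, that each exported line separates attributes with the byte \x02 and separates an attribute name from its value with the byte \x03; the sample record and file names are invented for illustration.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# One made-up exported line: name\x03value pairs joined by \x02.
# printf takes octal escapes, so \003 and \002 are the same bytes as \x03 and \x02.
printf 'id\003{"s":"1"}\002name\003{"s":"alice"}\n' > "$workdir/sample_export"

# Same four expressions as above: close the object, turn \x02 into ',"',
# turn \x03 into '":', and open the object with '{"'.
sed -e 's/$/}/' -e $'s/\x02/,"/g' -e $'s/\x03/":/g' -e 's/^/{"/' \
  "$workdir/sample_export" > "$workdir/sample_export.json"

cat "$workdir/sample_export.json"
# prints {"id":{"s":"1"},"name":{"s":"alice"}}
```

The $'...' quoting is a bash feature, so the command needs bash (or another shell that supports it) rather than plain sh.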