What is the best AWS infrastructure for migrating data from SQL Server to MongoDB?

I have several terabytes of data in our legacy system, which runs on SQL Server. Our newer version runs on MongoDB, and we are migrating the data to it. We have written and verified Python scripts for this, and all the data moves across correctly.

We did this on a smaller machine with 4 cores; running it on a bigger machine is going to be very expensive. AWS Lambda has a 15-minute execution limit, and one iteration of this job takes more than 24 hours to finish. AWS Step Functions looks promising, but I am not sure it is the right fit.


Solution 1:

Can you not do "mongoexport" locally, export to S3 (or a physical AWS Snowcone device), use an EC2 instance to "mongoimport", then run your script to do any updates since the dump?

As for how to run it, you would probably get away with using a spot EC2 instance, particularly if you run it outside peak hours for the region, perhaps on a weekend. If your job can't be interrupted, use an on-demand EC2 instance instead. An m5.xlarge with 4 vCPUs and 16 GiB of RAM costs about $0.20 per hour on-demand, so a couple of days of that comes to about $10.
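If the job can be made resumable, a spot interruption costs only the batch in flight. Here is a minimal sketch of that idea. It assumes the source table has an increasing positive integer key called `id`, and that writes to MongoDB are idempotent (for example, upserts keyed on `id`). The names `fetch_batch`, `write_batch`, and the checkpoint file are hypothetical placeholders for your existing script's read and write steps, not part of any AWS or MongoDB API.

```python
import json
import os


def load_checkpoint(path):
    """Return the last migrated key, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as fh:
        return json.load(fh)["last_id"]


def save_checkpoint(path, last_id):
    """Write the checkpoint via a temp file so a crash cannot leave it half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump({"last_id": last_id}, fh)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def migrate(fetch_batch, write_batch, checkpoint_path, batch_size=1000):
    """Copy rows in key order, resuming after the last checkpointed key.

    fetch_batch(last_id, n) -> up to n rows with id > last_id, ordered by id
    write_batch(rows)       -> writes the rows to the target (assumed idempotent)
    Returns the number of rows copied in this run.
    """
    last_id = load_checkpoint(checkpoint_path)
    copied = 0
    while True:
        rows = fetch_batch(last_id, batch_size)
        if not rows:
            return copied
        write_batch(rows)
        # If the instance dies between the write and the checkpoint, this batch
        # is written again on resume, hence the idempotent-write assumption.
        last_id = rows[-1]["id"]
        save_checkpoint(checkpoint_path, last_id)
        copied += len(rows)
```

The checkpoint file needs to live somewhere that survives the instance, such as an attached EBS volume or S3; a local path is used here only to keep the sketch short.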

I'll also point out that sending, say, 3 TB at 100 Mbps will take about 2.8 days, while at 800 Mbps it will take about 8.3 hours, though sustaining that bandwidth may be difficult without Direct Connect. You might be best off using an AWS Snowcone, which is a physical device that you copy data to and then ship to AWS.
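The transfer-time estimate can be checked with a quick calculation. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a perfectly sustained link with no protocol overhead, so real transfers will take somewhat longer.

```python
def transfer_hours(size_tb, mbps):
    """Hours needed to send size_tb terabytes at a sustained rate of mbps megabits/s."""
    bits = size_tb * 1e12 * 8          # terabytes -> bits
    seconds = bits / (mbps * 1e6)      # bits / (bits per second)
    return seconds / 3600


for mbps in (100, 800):
    hours = transfer_hours(3, mbps)
    print(f"{mbps} Mbps: {hours:.1f} h ({hours / 24:.1f} days)")
# 100 Mbps: 66.7 h (2.8 days)
# 800 Mbps: 8.3 h (0.3 days)
```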

Alternatively, I would suggest using AWS Database Migration Service (DMS) to migrate from MongoDB to Amazon DocumentDB, which is AWS's MongoDB-compatible database service. DMS will migrate the data; then you just point your application at the new instance and turn the old one off.