Production deployment to EC2 with minimal downtime

I have a simple web application deployed on a large EC2 instance. I now want to deploy the latest code to this server, but I want to do it in a way that minimizes downtime and is as smooth as possible for the end user. Here is my plan:

  1. Fire up another large instance
  2. Install all the software layers on that instance
  3. Restore and attach an EBS drive to the instance
  4. Deploy our latest production ready code on the new instance
  5. Run all tests (including manual testing of the application)
  6. (If tests pass) Put a "Site Under Maintenance" notice on the live site.
  7. Back up (snapshot) the EBS volume on the live site
  8. Detach the EBS volume from the new server and replace it with the latest backup
  9. Use ec2-associate-address to move the IP address to the new instance
  10. Sit back and wait for traffic to start flowing through the new instance
  11. Terminate the old instance
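
For concreteness, steps 7 to 9 could be scripted roughly as follows. This is only a sketch under assumptions the plan does not state: it uses the boto3 Python SDK rather than the ec2-* command-line tools, the Elastic IP is identified by an allocation ID, and every ID, zone, and device name passed in is a placeholder.

```python
def swap_data_volume(ec2, live_volume_id, new_instance_id, test_volume_id,
                     availability_zone, device, allocation_id):
    """Sketch of steps 7 to 9. `ec2` is assumed to be a boto3 EC2 client,
    e.g. boto3.client("ec2"); all IDs are supplied by the caller."""
    # Step 7: snapshot the live data volume. Assumes writes have already
    # stopped because the maintenance notice from step 6 is up.
    snapshot_id = ec2.create_snapshot(
        VolumeId=live_volume_id,
        Description="pre-deploy backup",
    )["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # Step 8: detach the volume that was used for testing ...
    ec2.detach_volume(VolumeId=test_volume_id, InstanceId=new_instance_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[test_volume_id])

    # ... and replace it with a volume restored from the fresh snapshot.
    # The new volume must be in the same availability zone as the instance.
    fresh_volume_id = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[fresh_volume_id])
    ec2.attach_volume(
        VolumeId=fresh_volume_id,
        InstanceId=new_instance_id,
        Device=device,
    )
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[fresh_volume_id])

    # Step 9: move the Elastic IP to the new instance.
    ec2.associate_address(
        InstanceId=new_instance_id,
        AllocationId=allocation_id,
        AllowReassociation=True,
    )
    return fresh_volume_id
```

The sketch does not mount the filesystem or restart the application on the new instance; those would still need to happen between steps 8 and 9.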

Does this seem like a good strategy? Are there any tutorials or books that might cover this topic? I have already read Cloud Application Architectures by George Reese, which is an excellent book but does not cover deployment. Additionally, I know that there are tools that can help with this, like RightScale or enStratus, which I will use when I start using more than one instance.


This looks like a good overall approach. You could cut out step 2, and thus bring down the launch time, by creating a custom AMI that includes all the software layers you need; having said that, I would still update all of the packages at startup to make sure that you get all the latest security updates.
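
As a rough illustration of that idea, the sketch below bakes an AMI from an instance that already has the software stack installed, then launches new instances from it with a boot script that applies package updates. It assumes the boto3 Python SDK and a yum-based distribution whose AMI runs user-data scripts at first boot; neither is stated above, and the function names and arguments are made up for the example.

```python
# User-data script run at first boot (assumes cloud-init or similar on the AMI).
# Assumes a yum-based distro; swap in apt-get or another tool as appropriate.
UPDATE_ON_BOOT = """#!/bin/bash
yum -y update
"""


def bake_ami(ec2, instance_id, name):
    """Create a custom AMI from a fully configured instance.

    `ec2` is assumed to be a boto3 EC2 client. Note that create_image
    reboots the source instance by default, so run this against a
    build instance rather than the live server.
    """
    image_id = ec2.create_image(InstanceId=instance_id, Name=name)["ImageId"]
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id


def launch_from_ami(ec2, image_id, instance_type):
    """Launch one instance from the custom AMI and update packages on boot."""
    reservation = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=UPDATE_ON_BOOT,
    )
    return reservation["Instances"][0]["InstanceId"]
```

With this in place, step 2 of the plan shrinks to a single `launch_from_ami` call, at the cost of re-baking the AMI whenever the software stack changes.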

You might also want to think about using an EBS-backed instance - that way you could have the boot volume, the software stack and your application all on EBS, which would cut out a few of the steps above.


OK, this was asked a while ago, but I'll chime in with my 2 cents anyways. I think you're missing out on the benefits of cloud computing.

First off, you should separate your application code and your persistent data onto 2 different virtual machines. This will cost you a little in inter-VM communication latency, but should make your administration much simpler. Remember, having 2 small VMs instead of 1 large VM isn't more expensive, so choose the number of hosts that best matches your needs.

If possible you want your application servers to be "stateless" in the sense that they shouldn't have persistent data, and you should be able to spawn a new instance with a minimum of manual work.

Second, you should consider whether some of the Amazon managed services, like SimpleDB or Relational Database Service (hosted MySQL), are a good fit for your persistent data store.

The ideal flow looks something like this:

  1. Change the "rearmost" backend system first. For example, if your change requires adding a column to a database table, then add this using normal MySQL tools on a running RDS instance. (This assumes that your architecture allows your data store to change while keeping backwards compatibility, or that you first update your app server code so that it's forward compatible.)
  2. Bring up a new application server instance, using a customized ready-to-use AMI that you have prepared in advance.
  3. Install your updated code on the new app server, i.e. the new code that uses the new column and has the new functionality.
  4. Test.
  5. Bring over some or all traffic, i.e. move the IP address over / switch Elastic Load Balancing over to the new app server. (In an ideal world you would only move over a small percentage, say 5% of your traffic, at first, and then watch for any problems. AFAIK Elastic Load Balancing does not support weighted sticky routing yet, so a gradual switchover like this probably isn't practical with ELB alone. It can also be achieved by having 2 execution paths in your code, but that's time-consuming and annoying to do; weighted sticky load balancing would be simpler.)
  6. Keep the old app server instance around for a few days, in case the new code has regressions and you need to roll back.
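
Steps 5 and 6 could be scripted along these lines. This is a minimal sketch, assuming a Classic Elastic Load Balancer and the boto3 Python SDK (the steps above don't require either); the load balancer name, instance IDs, and timing values are placeholders. It moves all traffic at once, since weighted routing is not assumed to be available.

```python
import time


def cut_over(elb, load_balancer, old_instance_id, new_instance_id,
             timeout_s=300, poll_s=5, sleep=time.sleep):
    """Move traffic from the old app server to the new one.

    `elb` is assumed to be a boto3 Classic ELB client, e.g.
    boto3.client("elb"). The old instance is only deregistered,
    never terminated, so it stays available for a rollback.
    """
    new = [{"InstanceId": new_instance_id}]
    old = [{"InstanceId": old_instance_id}]

    # Add the new server first so there is no gap in capacity.
    elb.register_instances_with_load_balancer(
        LoadBalancerName=load_balancer, Instances=new)

    # Wait until the load balancer's health check accepts the new server.
    waited = 0
    while True:
        states = elb.describe_instance_health(
            LoadBalancerName=load_balancer, Instances=new)["InstanceStates"]
        if states and all(s["State"] == "InService" for s in states):
            break
        if waited >= timeout_s:
            # Back out: leave the old server as the only one in rotation.
            elb.deregister_instances_from_load_balancer(
                LoadBalancerName=load_balancer, Instances=new)
            raise TimeoutError("new instance never became healthy")
        sleep(poll_s)
        waited += poll_s

    # Only now take the old server out of rotation (but keep it running).
    elb.deregister_instances_from_load_balancer(
        LoadBalancerName=load_balancer, Instances=old)


def roll_back(elb, load_balancer, old_instance_id, new_instance_id, **kwargs):
    """Step 6 in reverse: put the old server back and pull the new one."""
    cut_over(elb, load_balancer, new_instance_id, old_instance_id, **kwargs)
```

Because the old instance is merely taken out of rotation, rolling back a few days later is the same operation with the two instance IDs swapped, provided any schema change from step 1 is still compatible with the old code.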