What are common strategies for configuration management on EC2?
Without using a third-party cloud configuration service, what are the common patterns, strategies, or whitepapers on configuration management with EC2? Specifically: how to configure and provision new instances at startup, code deployment strategies, etc.
Solution 1:
I've pieced together our method for bootstrapping EC2 from dozens of blog posts. It's still a work in progress, but we use cloud-init to connect the instance to our Puppet master, Puppet to configure all packages for the role the instance will play, and Webistrano (a GUI for Capistrano) to deploy our code to the servers.
If you build your own machine images you can pretty much build whatever system you desire, but we wanted to go with the publicly available official Ubuntu images, which don't have configuration management software installed.
So, we use cloud-init to bootstrap an instance. Cloud-init is a package that is present on the Ubuntu and Amazon Linux AMIs. It allows various kinds of data to be passed to an instance as it is created, via EC2's 'user-data' option. The data passed in via user-data is processed by cloud-init when the instance boots and can take several forms, such as shell scripts, cloud-config YAML, etc.
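To make the "several forms" point concrete: cloud-init decides how to treat user-data by looking at its first line. The sketch below is only an illustration of that idea in Python, not cloud-init's actual code; the function name and the table of prefixes are hypothetical and cover just a subset of the formats cloud-init understands.

```python
# Simplified illustration of how the first line of user-data signals its type.
# This table is an assumption-level subset, not cloud-init's real handler list.
USER_DATA_PREFIXES = {
    "#!": "shell script",
    "#cloud-config": "cloud-config",
    "#include": "include file",
    "#cloud-boothook": "boothook",
}


def classify_user_data(user_data):
    """Return the kind of user-data, judged only by its first line."""
    lines = user_data.splitlines()
    first_line = lines[0] if lines else ""
    for prefix, kind in USER_DATA_PREFIXES.items():
        if first_line.startswith(prefix):
            return kind
    return "unknown"
```

This is why the cloud-config file has to start with the `#cloud-config` line: without it, the YAML that follows would not be recognised as cloud-config.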
This post shows an example of using cloud-init, similar to the way we do it: http://www.atlanticdynamic.com/you-should-be-using-cloud-init/
And here's our version:
#cloud-config
apt_update: true
apt_upgrade: true
packages:
 - puppet
puppet:
 conf:
   agent:
     server: "puppet.example.com"
     certname: "%i.web.cluster1.eu-west-1.ec2"
As soon as the instance boots, it installs Puppet and connects to our Puppet master. Once you've permitted it to connect to the master (by signing its certificate), the instance automatically starts configuring itself. The master uses a regex in the nodes.pp file to match the instance's certname, thereby assigning it a role. The master can then send a catalog to the Puppet agent, which applies it to configure the instance.
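The role assignment is just pattern matching on the certname. The real matching happens in Puppet's nodes.pp, which isn't shown here; the following is a rough Python sketch of the same logic, assuming the `%i` in the certname expands to the EC2 instance ID. The patterns, the "db" role, and the `role_for` function are made up for illustration.

```python
import re

# Hypothetical certname patterns mapped to roles. The actual regexes live in
# nodes.pp on the Puppet master and may look different.
ROLE_PATTERNS = [
    (re.compile(r"^i-[0-9a-f]+\.web\.cluster1\.eu-west-1\.ec2$"), "web"),
    (re.compile(r"^i-[0-9a-f]+\.db\.cluster1\.eu-west-1\.ec2$"), "db"),
]


def role_for(certname):
    """Return the role of the first pattern that matches, or None."""
    for pattern, role in ROLE_PATTERNS:
        if pattern.match(certname):
            return role
    return None
```

Encoding the role, cluster, and region in the certname means the master needs no per-instance configuration: any new instance whose name fits a pattern picks up the matching role.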
A few minutes after booting, the instance is ready for use. If we need to deploy any code to the node, we don't use Puppet for that but Webistrano. For the moment we manually add the node to Webistrano's config but we intend to use MCollective to do it automatically based on node metadata.
Solution 2:
Boto and Fabric are great for this. Here's an article about it at SEOmoz. My current strategy is similar, but uses Fabric to run the Amazon shell commands locally instead.
EC2 instances are also able to query metadata that was assigned to them during deployment, which in combination with a custom AMI can be very powerful.
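As a rough sketch of what querying metadata looks like from inside an instance: the metadata service is reachable over HTTP at the link-local address 169.254.169.254. The helper names below are hypothetical, and the sketch assumes the plain unauthenticated request style; instances set up to require session tokens would need an extra token request first. The `opener` parameter exists only so the function can be tried outside EC2.

```python
from urllib.request import urlopen

# Base URL of the EC2 instance metadata service (only reachable from an instance).
METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def metadata_url(path):
    """Build the full metadata URL for a key such as 'instance-id'."""
    return METADATA_BASE + path.lstrip("/")


def fetch_metadata(path, opener=urlopen, timeout=2):
    """Fetch one metadata value as text. Pass a fake opener when off EC2."""
    response = opener(metadata_url(path), timeout=timeout)
    try:
        return response.read().decode("utf-8")
    finally:
        response.close()
```

On an instance, `fetch_metadata("instance-id")` would return that instance's ID, which a boot script baked into a custom AMI could use to decide how to configure itself.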
(I'm limited on the number of links I can post; Boto is simple enough to google)