How do you administer your EC2 Linux boxes?

I have a few EC2 Linux images that do nightly processing jobs for one of my projects. From time to time, I'll need to get in, make some code changes, configure some things, and re-bundle the image.

My toolset for these operations is painfully sparse (SSH into the box, edit files in VIM, WGET remote files that I need), and I suspect there is a much better way to do it. I'm curious to hear what other people in my position are doing.

  • Are you using some form of Windowing system and remote-desktop equivalent to access the box, or is it all command line? Managing EC2 Windows boxes is trivial, since you can simply remote desktop in and transfer files over the network. Is there an equivalent to this in the Linux world?

  • Are you doing your config changes/script tweaks directly on the machine? Or do you have something set up on your local box to edit these files remotely? Or are you simply editing them remotely then transferring them at each save?

  • How are you moving files back and forth between EC2 and your local environment? FTP? Some sort of Mapped Drive via VPN?

I really need to get some best practices in place for administering these boxes. Any suggestions to remove some of the pain would be most welcome!

EDIT: Evidently, I wasn't clear above, since the first two responses revolved around managing and configuring EC2 instances. I just want to know how to remote desktop into a running Linux server so that moving files around and editing them is less painful.


Solution 1:

I don't do much manual system administration anymore. I view my infrastructure as a programmable entity and treat it as such, configuring systems with tools that automate configuration management, EC2 node maintenance, and the like. Tools in my toolbox:

  • Ruby (my favorite scripting/tool language)
  • Git (version control)
  • Opscode's Chef (written in Ruby) (1)
  • Capistrano (ad hoc mass-maintenance)
  • Amazon's EC2 API tools for instance and image maintenance.
  • RightScale's AWS gem (Ruby bindings for EC2)

(1) - Disclosure: I work for Opscode. Other tools, such as Reductive Labs' Puppet, fill this space.

I do bundle up an AMI when I've got a node built the way I need for a specific function. For example, if I'm building a Rails app server, I'll get all the prerequisite packages installed to save time on build.
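
For what it's worth, the bundling itself is just a run of the standard AMI/API tools; on an instance-store image it looks roughly like this (the key, certificate, bucket name, account ID, and credentials are placeholders):

    # on the instance: bundle the root volume, upload it to S3, then register the AMI
    ec2-bundle-vol -d /mnt -k pk.pem -c cert.pem -u <account-id>
    ec2-upload-bundle -b my-ami-bucket -m /mnt/image.manifest.xml -a <access-key> -s <secret-key>
    ec2-register my-ami-bucket/image.manifest.xml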

When all else fails, I log into systems with SSH. I did manual system administration for many, many years, so this is old hat.

Are you using some form of Windowing system and remote-desktop equivalent to access the box, or is it all command line?

I don't install any GUI on servers unless a package has a dependency and one gets auto-installed.

Is there an equivalent to this in the Linux world? (transferring files)

I normally do two types of file transfer/file maintenance.

  • Package installation
  • Configuration files

For packages native to the platform, I use the standard package management tool, such as APT or YUM. For source installs (something.tar.gz), I generally download via wget.
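
For example (the package and tarball names here are purely illustrative):

    sudo apt-get install nginx          # Debian/Ubuntu
    sudo yum install nginx              # Red Hat/CentOS/Fedora

    # source install: fetch and unpack
    wget http://example.com/something-1.0.tar.gz
    tar xzf something-1.0.tar.gz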

Configuration files are typically ERB templates managed by Chef.

I use SSH and SCP/SFTP to transfer files manually.
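
For example (the hostname and key file are placeholders):

    scp -i mykey.pem local-file.tar.gz root@ec2-host:/tmp/    # push a file up
    scp -i mykey.pem root@ec2-host:/var/log/app.log .         # pull a file down
    sftp -oIdentityFile=mykey.pem root@ec2-host                # interactive session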

Are you doing your config changes/script tweaks directly on the machine? Or do you have something set up on your local box to edit these files remotely? Or are you simply editing them remotely then transferring them at each save?

I keep everything related to managing systems in a source control repository. Here's my typical workflow when updating configuration on one or more systems; the command-level version is sketched after the list. I start from my local workstation.

  • Pull from master Git repository for others' changes.
  • Edit file(s) locally (like, update a configuration file).
  • Commit the change, push to master.
  • On the Chef server (logged in via SSH), pull the latest change I just committed.
  • Deploy the configuration to the appropriate place on the Chef server (I use Rake for this).
  • Chef clients run on an interval, so they will pick up changes every 30 minutes. If I need something immediately, I run chef-client manually.
  • Verify the change!
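
In commands, that loop looks roughly like this (the cookbook path, commit message, and Rake task name are placeholders for my setup):

    # on my workstation
    git pull origin master
    vi cookbooks/myapp/templates/default/myapp.conf.erb
    git commit -am "Tune myapp configuration"
    git push origin master

    # on the Chef server, over SSH
    git pull origin master
    rake install                  # my Rake task that deploys cookbooks into place

    # on a node, when I can't wait for the next 30-minute interval
    sudo chef-client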

How are you moving files back and forth between EC2 and your local environment? FTP? Some sort of Mapped Drive via VPN?

There are a few locations where files I use on EC2 nodes might be stored; fetch examples follow the list.

  • Chef server. Configuration templates mainly, some small packages too.
  • GitHub. We store our code (open source projects) on GitHub. EC2 nodes can get to this easily (such as for a checkout of the latest version of something).
  • Amazon S3 buckets. Some things get stored in a bucket.
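
For example, a node might pull things down like this (the repository and bucket names are placeholders, and the S3 example assumes s3cmd is installed and configured):

    # check out the latest version of a project from GitHub
    git clone git://github.com/myorg/myproject.git

    # pull an object out of an S3 bucket
    s3cmd get s3://my-bucket/packages/something-1.0.tar.gz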

I do a lot of work in EC2, primarily testing environments and changes. As a result of my tools and workflow, I spend more time working on things I actually care about and less on dealing with individual files and thinking about specific configurations.

Solution 2:

All of our software is deployed via RPM. Each EC2 instance type is described by a kickstart file (which lists the RPMs to be installed...). The kickstart setup means a working machine of each instance type can be built from scratch in about 10 minutes.

We then have a program that invokes anaconda (the Red Hat installer) to take a kickstart file, install the system into a directory, then bundle up the directory and push it to S3 as an Amazon Machine Image. This is all one step, so I just type:

kickstart2ami webserver.ks

Since a machine can be completely rebuilt, uploaded, and running in about 40 minutes, it's easier to build new machine images than to perform sysadmin on the actual (throwaway) EC2 instances. Therefore, no sysadmin is actually performed on EC2 instances.

Solution 3:

I like NX for remote GUI access. It's very well documented, too.

Solution 4:

I use Nautilus for quite a bit of file management and SSH for commands. It plugs right into your system as if you were physically at it in the data center. If you're doing this from a Windows box, that sort of connectivity won't work, since virtual filesystem support in Windows is limited.
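
From a GNOME desktop, for instance, you can point Nautilus at an SFTP location directly, or, as an alternative not mentioned above, mount the remote filesystem with sshfs (host and paths are placeholders):

    # open the instance's filesystem in Nautilus over SFTP
    nautilus sftp://ec2-user@ec2-host/home/ec2-user

    # or mount it with sshfs and browse it like a local directory
    mkdir -p ~/ec2-mount
    sshfs ec2-user@ec2-host:/ ~/ec2-mount
    fusermount -u ~/ec2-mount     # unmount when finished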