Managing a Linux cluster

I'm interested in learning about tools and techniques used to manage many Linux machines. (That is, deploying and maintaining updates.)

One way I thought of to do this is to write a Bash script that uploads another script to each server and executes it on each server sequentially. For example:

for server in "${servers[@]}"; do
    scp update_script.sh "user@$server:~/scripts/"
    ssh "user@$server" 'sh ~/scripts/update_script.sh'
done

And update_script would use apt-get/aptitude or yum, or whatever to update packages on the server.
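For concreteness, here is a minimal sketch of what that update_script.sh might look like. The package-manager detection and the dry-run wrapper are illustrative assumptions, not a prescribed implementation; drop the wrapper to actually run the commands.

```shell
#!/bin/sh
# update_script.sh -- a minimal sketch of the per-server update script.
# The detection logic and dry-run wrapper below are assumptions for illustration.

detect_pkg_mgr() {
    # Report whichever supported package manager this host has on its PATH.
    if command -v apt-get >/dev/null 2>&1; then echo apt-get
    elif command -v yum >/dev/null 2>&1; then echo yum
    else echo unknown
    fi
}

run() { echo "would run: $*"; }  # dry run; change the body to "$@" to really execute

case "$(detect_pkg_mgr)" in
    apt-get) run apt-get update && run apt-get -y upgrade ;;
    yum)     run yum -y update ;;
    *)       echo "no supported package manager found" >&2 ;;
esac
```

Running this on a Debian-style host would print the apt-get commands it would execute; on a Red Hat-style host, the yum command.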

Are there better ways to do things like this?


Solution 1:

Try Puppet.

Another excellent (truly excellent) tool is Webmin. If you register several Webmin-running servers together (in the Webmin interface), you can push updates and view package configurations from its cluster pages.

An alternative, which is more geared to rolling out images, is SystemImager

Solution 2:

ClusterSSH is what you're looking for. It provides a way to broadcast commands to all nodes in a cluster. Think of it like BashReduce sans Reduce.
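For reference, ClusterSSH can read named cluster definitions from /etc/clusters (or a per-user clusters file); a minimal sketch, with hypothetical host names:

```shell
# /etc/clusters -- one cluster per line: <tag> <host> [host ...]
# web web1.example.com web2.example.com web3.example.com
#
# Then open a synchronized terminal session to every node in that cluster:
# cssh web
```

Anything you type into the master console window is broadcast to all connected nodes at once.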

Solution 3:

Someone else already mentioned Puppet.

In the same vein, I can recommend Cfengine. The learning curve can be a little steep, but once you get the hang of it, it's great. I use it to manage about 50 servers and can't believe I ever got along without it.

Solution 4:

Try Capistrano. It works much like your bash foreach loop above, but it is based on Ruby instead of bash. Capistrano is well suited to operational tasks (a la: put a server into maintenance mode, take it back out of maintenance mode).

+1 for Puppet. It's a good fit for idempotent operations that leave a system in a known state.