What's the best practice for mass Linux system deployment?

If you are trying to install 500 Linux systems over the network at the same time, the bottleneck will be the NFS/HTTP/FTP server, or whatever server holds the files you need for the installation.

IMO, this can only be solved by adding more installation servers and round-robining between them.

Is there any better solution to this problem? Something like "P2P Linux installation"?

UPDATE: I need to describe my situation more specifically. Currently I'm deploying RHEL using kickstart+NFS. When I try to deploy 500 RHEL systems concurrently, the NFS server gets a huge amount of traffic, which makes every install slow. Setting up more NFS servers is a solution, but I don't think it's a good one.


Solution 1:

This is usually where multicast imaging comes in. Tools like Clonezilla or Ghost support sending the data via multicast, which lets you push the image out to all 500 systems at once at basically the same speed as pushing it out to one system.

Solution 2:

The Avalanche installer of the Rocks Linux cluster distro is BitTorrent-based and scales nicely. It also takes you from PXE boot to a running system. However, you're tied to using Rocks (CentOS-based) and doing things the Rocks way.

Solution 3:

SystemImager can also use BitTorrent for faster mass deployment.

Solution 4:

I would not use multicast because it makes things more complicated. First, try to minimise NFS traffic: fetch the packages you need to install via HTTP instead. If the web server for your package repository gets overloaded, use two of them and distribute the load by assigning a different server to each client (for example, by IP address modulo 2).
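The "IP address modulo 2" idea can be sketched in a few lines of Python. This is only an illustration: the mirror URLs and the `pick_mirror` helper are made up, and in practice you would plug something like this into whatever generates your per-client kickstart or FAI configuration.

```python
import ipaddress

# Hypothetical mirror URLs; replace them with your own package servers.
MIRRORS = [
    "http://repo0.example.com/packages",
    "http://repo1.example.com/packages",
]


def pick_mirror(client_ip: str, mirrors=MIRRORS) -> str:
    """Map a client to a mirror: integer value of its IP modulo the mirror count."""
    index = int(ipaddress.ip_address(client_ip)) % len(mirrors)
    return mirrors[index]


if __name__ == "__main__":
    for ip in ("10.0.0.10", "10.0.0.11", "10.0.0.12"):
        print(ip, "->", pick_mirror(ip))
```

Because the index is taken modulo `len(mirrors)`, adding a third or fourth server only means extending the list.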

Your NFS server may be faster if more nfsd daemons are started. Often only 8 of them are started.
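To check how many nfsd threads are actually running, a small sketch like the following may help. It assumes the Linux kernel NFS server, which exposes its statistics in /proc/net/rpc/nfsd with the thread count as the first number on the `th` line; the helper name `nfsd_thread_count` is made up for this example. On RHEL the count is typically raised through `RPCNFSDCOUNT` in /etc/sysconfig/nfs, but check the documentation for your release.

```python
from pathlib import Path


def nfsd_thread_count(stats_text: str) -> int:
    """Return the thread count from the 'th' line of /proc/net/rpc/nfsd contents."""
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "th":
            return int(fields[1])
    raise ValueError("no 'th' line found in nfsd statistics")


if __name__ == "__main__":
    # Assumption: this path only exists on a host running the kernel NFS server.
    stats_path = Path("/proc/net/rpc/nfsd")
    if stats_path.exists():
        print("nfsd threads:", nfsd_thread_count(stats_path.read_text()))
    else:
        print("no NFS server statistics found on this host")
```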

I just measured the traffic of a Debian installation (via PXE, NFS, HTTP) using FAI. When installing 4.2 GB of software, 1.3 GB of HTTP traffic (all the packages) and 100 MB of NFS traffic (the nfsroot during installation) were sent over the network. This was for one install client. So I guess reducing the NFS traffic and distributing the HTTP traffic will help a lot.
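To get a feel for what these numbers mean at scale, here is a rough back-of-the-envelope calculation. It assumes that each of 500 clients generates the same traffic as the single Debian client measured above (a RHEL install will differ), that the server link is fully used, and that protocol overhead is ignored. The function name and the link speeds are illustrative.

```python
HTTP_GB_PER_CLIENT = 1.3  # measured above: package downloads via HTTP
NFS_GB_PER_CLIENT = 0.1   # measured above: 100 MB of nfsroot traffic
CLIENTS = 500             # assumption: every client behaves like the measured one


def transfer_minutes(total_gb: float, link_gbit_per_s: float) -> float:
    """Minutes needed to move total_gb gigabytes over one link, ignoring overhead."""
    seconds = total_gb * 8 / link_gbit_per_s
    return seconds / 60


total_http_gb = HTTP_GB_PER_CLIENT * CLIENTS  # about 650 GB
total_nfs_gb = NFS_GB_PER_CLIENT * CLIENTS    # about 50 GB

if __name__ == "__main__":
    for link in (1, 10):  # link speed in Gbit/s
        print(f"{link} Gbit/s: HTTP {transfer_minutes(total_http_gb, link):.0f} min, "
              f"NFS {transfer_minutes(total_nfs_gb, link):.0f} min")
```

Under these assumptions, a single 1 Gbit/s server needs about 87 minutes just for the HTTP traffic, while the NFS part is small by comparison.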

A 10 Gbit NIC in your server, or bonding several NICs, would also help. Also, I don't think you need to install all the machines at exactly the same time, just within a short time frame.

But anyway, first you have to analyse what your bottleneck will be. So run some tests using, for example, 20 machines.