SAN + MySQL replication: is that what I want for my load-balanced Drupal cluster?

Solution 1:

I don't know Drupal specifically, but you can generally lump things into three datastore categories when approaching this problem.

1.) the database - this one is "easy" in that MySQL replication is a mature, well-documented solution. Replication or DRBD can offer HA, but to take advantage of multiple servers at once for performance scaling, your application needs the built-in ability to split read (SELECT) and write (INSERT/UPDATE/DELETE) queries between the master and the slaves: writes go to the master, and reads can go to the slaves.

2.) the filesystem - this one is trickier. "Normal" filesystems (ext3/NTFS) aren't designed for multi-host access, and the ones that are (GFS/OCFS) are almost always more complicated than they're worth (especially if you have to ask here). The most common solution is a NAS-based approach (NFS on Unix, CIFS on Windows), but that introduces a single point of failure, so it's not an availability solution. It's usually not even a performance solution, as you're reliant on the performance of the one fileserver. Its main value is in providing coherent read-write access from multiple hosts. If your application is CPU-bottlenecked, then a NAS will still let you scale, because your servers will spend their time waiting for the CPU to finish, not for the files to load.

3.) code & configuration - usually this is stored on the filesystem, in the database, or both. I separate it here because it's usually much smaller in scope and more the sysadmin's problem than the more content-oriented datastores of #1 and #2. Often you can get away with just manual (or scripted) copying of the files to each web server.
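
As a rough illustration of the "scripted copying" in #3, here is a minimal sketch that pushes one directory to each web server with rsync over SSH. The hostnames (web1, web2) and the path /var/www/drupal are made up for the example, and it assumes rsync and key-based SSH are already set up on every box. It defaults to a dry run that only prints the commands.

```python
import subprocess


def build_sync_commands(source_dir, hosts, dest_dir):
    """Return one rsync command (as an argument list) per target host."""
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    source = source_dir.rstrip("/") + "/"
    # -a preserves permissions/times/symlinks; --delete removes files on the
    # target that no longer exist in the source.
    return [["rsync", "-a", "--delete", source, f"{host}:{dest_dir}"]
            for host in hosts]


def push(source_dir, hosts, dest_dir, dry_run=True):
    """Print the commands (dry run) or actually run them, stopping on error."""
    for cmd in build_sync_commands(source_dir, hosts, dest_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical hosts and path, for illustration only.
    push("/var/www/drupal", ["web1", "web2"], "/var/www/drupal")
```

Be careful with --delete: if uploaded content lives under the same tree as the code, it would be wiped on the targets, which is one more reason to keep #2 and #3 separate.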

So with all that in mind, you need to evaluate how Drupal handles those three categories, and how you can replicate them. Odds are you'll start with just a NAS and a load balancer. It's very unlikely you want a SAN yet.
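
To make the read/write split from #1 concrete, here is a minimal sketch of the routing decision an application has to make. It is not how Drupal does it (I don't know that); the class name and the hostnames db-master, db-slave1 and db-slave2 are invented for the example. It only picks a server name and leaves the actual connection handling out.

```python
import itertools


class QueryRouter:
    """Send SELECTs to the slaves in round-robin order, everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # With no slaves configured, reads simply fall back to the master.
        self._readers = itertools.cycle(slaves or [master])

    def pick(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._readers)
        # Anything that is not clearly a read goes to the master, the safe default.
        return self.master


if __name__ == "__main__":
    router = QueryRouter("db-master", ["db-slave1", "db-slave2"])
    for statement in ("INSERT INTO node (title) VALUES ('x')",
                      "SELECT title FROM node",
                      "SELECT title FROM node"):
        print(router.pick(statement), "<-", statement)
```

The check on the first word is deliberately naive. Standard MySQL replication is asynchronous, so a SELECT sent to a slave right after a write may not see that write yet; an application that needs read-your-own-writes has to send those reads to the master too.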