How to create a bootable redundant Debian system with a 3 or 4 (or more) disk software raid10?

I'll describe the odd one here, that is, a software raid10 built from 3 disks (knowledgeable people have been known to be in disbelief about the 3 disk raid10). Let's say you have a 1U server with 4 drive bays and you'd like to keep one bay free, for a storage disk or as a hot spare. The disks are all of the same size, although that's not strictly necessary as long as you create the partitions according to the smallest disk.

You stick in a Debian CD or USB stick and begin installing the system. When you arrive at the part where you start partitioning the disks, do the following...

Each disk that is part of the raid should have a bootable partition of about 1 GB that is NOT part of the raid. Create these partitions as normal; they have to be exactly the same size. Mark them as bootable. The mountpoint on one of the disks should be /boot; you can leave the others unmounted.

/dev/sda1 - /boot
/dev/sdb1 - not mounted
/dev/sdc1 - not mounted

If you prefer (I do) to create separate partitions for the usual locations you can do this:

/dev/sd[abc]2 - swap  (Yes we have redundant swap, why not, it ought to be faster than swap outside the raid10)
/dev/sd[abc]3 - /
/dev/sd[abc]4 - /usr
/dev/sd[abc]5 - /tmp
/dev/sd[abc]6 - /var
/dev/sd[abc]7 - /opt
/dev/sd[abc]8 - /home

Otherwise just create one partition for swap and one large partition on each disk. Note: you cannot partition a softraid (mdadm), which is why you create the partitions first. (Edit: Since kernel 2.6.28 it is possible to partition a raid like any other block device, though I prefer the above method.)
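
If you partition outside the installer, an easy way to give every disk an identical layout is to copy the partition table from the first disk to the others. A minimal sketch, assuming MBR-style partition tables and the device names used above (double-check the target devices before running it):

sfdisk --dump /dev/sda > table.dump
sfdisk /dev/sdb < table.dump
sfdisk /dev/sdc < table.dump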

Create raids out of each partition except the first. For example:

mdadm --create /dev/md0 --level=10 --raid-devices=3 /dev/sd[abc]2

and so forth.

In the Debian installation you will use the appropriate menu options instead of the mdadm command; it was just to illustrate. If you have a 4th disk in the system, either add it as a hot spare, as the 4th member of the raid, or use it as storage. Unless you do the latter, make sure it shares the same partition table and bootable properties as the other disks. I'll leave that up to you.
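
For the hot spare option, this is roughly what it looks like with mdadm once the system is installed, assuming the 4th disk is /dev/sdd and partitioned like the others; adding a device to an array that already has all of its active members makes it a spare:

mdadm --add /dev/md0 /dev/sdd2

and again for the other arrays.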

By the way, the installation menu can be a bit confusing with regard to creating the partitions and raids; just start over from scratch if you get lost or the menu system starts cursing at you. :-)

Just install Debian as usual. Once you arrive at the grub install stage you have to do a bit more than usual.

We assume /dev/sda1 is mounted at /boot. Make sure the MBR is written to all of /dev/sda, /dev/sdb and /dev/sdc. In other words, we tell grub that all 3 disks are boot disks.
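
If the installer only offers to write the MBR to one disk, you can take care of the others yourself from the installed system; a minimal sketch, assuming the device names above:

grub-install /dev/sda
grub-install /dev/sdb
grub-install /dev/sdc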

Once the whole system has been installed you should be able to boot it, and you'll have a working bootable Debian system on a 3 disk raid10. However it is not yet fully redundant in case a disk fails, meaning it won't magically boot from another disk. To accomplish that you have to make exact copies of the boot partition /dev/sda1 on the other disks.

Use dd for that (bs=500M will speed up dd a lot, adjust 500M to about 2/3 of your system's memory):

dd bs=500M if=/dev/sda1 of=/dev/sdb1
dd bs=500M if=/dev/sda1 of=/dev/sdc1

Now make sure that your bios is configured to attempt to boot from all 3 disks; the order doesn't matter. As long as the bios will try any of the disks, then if one of them fails the system will automagically boot from another one, because the UUIDs are exactly the same.
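
You can check that the copies really do share the same filesystem UUID, assuming the device names above:

blkid /dev/sda1 /dev/sdb1 /dev/sdc1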

There is a small catch: don't forget to repeat the dd commands whenever /boot has changed, say after a kernel upgrade. Make it a weekly cron job if you like.
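
A minimal sketch of such a cron job, assuming the device names above; drop it in /etc/cron.weekly/ under whatever name you like and make it executable:

#!/bin/sh
# copy the /boot partition to the spare boot partitions on the other disks
dd bs=500M if=/dev/sda1 of=/dev/sdb1
dd bs=500M if=/dev/sda1 of=/dev/sdc1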

This is always fun: test your redundant system by changing the bios boot priority, and if you feel lucky, test it by yanking out one disk while it's running. :-) Actually I think you have to do that to be absolutely sure it's fully redundant; why else go through the trouble? It's a fun exercise regardless. If you have done everything correctly (and I wrote it down correctly) your system WILL still boot when the raid becomes degraded, just as if you were using a hardware raid. I tested it on various 1U and 2U servers with 2, 3, 4 and more disks.
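
While testing, you can keep an eye on the state of the arrays (degraded, rebuilding, and so on) with:

cat /proc/mdstat
mdadm --detail /dev/md0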

This will also work with a raid1.

By the way, you have to use a boot partition that's not part of the raid because otherwise your system can't boot: there has to be a way for the raid to be started, and since it is a softraid the kernel first has to be loaded in order for the raid to be recognised.


Late reply, but couldn't you do what I have been doing for some time?

I use 2-drive RAID1 installs for most of my servers.

The way they are set up: md0 is mounted as /boot and is a roughly 250MB raid1, while md1 is mounted as / and takes the remainder of each drive's capacity, excluding a swap area on each drive.
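
As a sketch, assuming a small first partition and a large second partition on each drive (with swap in its own partition elsewhere), the two arrays would be created roughly like this:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2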

This way any changes to /boot are mirrored to both drives, even though at boot time the system will use whichever one the bios told it to.

You just have to remember to re-run grub-install /dev/sd* for each drive so that each one gets a valid boot loader; since /boot lives on the raid1, the OS automatically keeps the copies in sync with one another.