RAID (mdadm) - What happens if drives are mismatched in size?
I came across this question by accident, but in case anyone is curious, here's an answer supported by experiments.
The Short Version
Bonus Question: can I create an md(4) RAID array out of block devices of unequal size? Yes, but the RAID array will have the size of the smallest block device (minus some overhead for its own housekeeping). If device sizes aren't within 1% of each other, you get a warning.
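A rough sketch of that 1% rule of thumb, for checking two device sizes before creating an array. The function name is made up, and the comparison (difference measured against the smaller device) is my own reading of "within 1%", not necessarily the exact test mdadm applies internally.

```shell
#!/bin/sh
# Returns success if two sizes (in KiB) differ by more than 1% of the smaller one.
# ASSUMPTION: this approximates mdadm's warning condition; it is not copied from mdadm.
sizes_differ_by_more_than_1pct() {
    if [ "$1" -lt "$2" ]; then small=$1; large=$2; else small=$2; large=$1; fi
    [ $(( (large - small) * 100 )) -gt "$small" ]
}

# 100M vs 99M, the pair used in the experiments further down
if sizes_differ_by_more_than_1pct 102400 101376; then
    echo "expect a warning"
else
    echo "no warning expected"
fi
```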
Question 1: can I add to an existing md(4) RAID array a device smaller than the smallest current member? Nope, sorry. mdadm will flat out refuse to do that to protect your data.
Question 2: can you resize an existing md array? Yes (read the mdadm manpage!), but it may not be worth the effort. You'll have to back everything up, then resize the contents of the RAID device, then resize the device itself. All of this is quite prone to errors, miscalculations, and other things that'll cost you your data (painful experience talking).
It's not worth the risk and effort. If you have a new, blank disk, here's how to resize it and also keep between one and two copies of all your data intact at all times (assuming you have 2-disk RAID1):
- Create a new md(4) array on it (with one disk missing).
- Recreate the structure of the array contents (crypto, LVM, partition tables, any combination thereof, whatever floats your boat).
- Copy the data from the existing disk to the new one.
- Reboot, using the new disk.
- Wipe the old disk's partition table (or zero the md(4) superblock). If necessary, create the required partitions to match the scheme on the new disk.
- Add the old disk to the new array.
- Wait for the array members to sync. Have some coffee. Fly to Latin America and pick your own coffee beans, for that matter. :) (If you live in Latin America, fly to Africa instead).
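The mdadm side of those steps can be sketched as follows. This only prints the commands, it does not run them. All device names are hypothetical placeholders, and the step that recreates the contents is left as a comment because it depends entirely on your setup.

```shell
#!/bin/sh
# Prints the mdadm commands for the migration steps above; nothing is executed.
# ASSUMPTIONS (hypothetical names, adjust to your system):
OLD_MD=/dev/md0      # the existing array
OLD_PART=/dev/sda1   # member of the existing array
NEW_MD=/dev/md1      # array to be created on the new disk
NEW_PART=/dev/sdb1   # partition on the new, blank disk

print_plan() {
    cat <<EOF
mdadm --create $NEW_MD -e 1.2 -l 1 -n 2 $NEW_PART missing
# recreate crypto/LVM/file systems on $NEW_MD, copy the data, reboot from the new disk
mdadm --stop $OLD_MD
mdadm --zero-superblock $OLD_PART
mdadm --add $NEW_MD $OLD_PART
mdadm --wait $NEW_MD
EOF
}

print_plan
```

The last command, mdadm --wait, blocks until the resync has finished, so it can stand in for the coffee break.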
Note: yes, this is the same technique 0xC0000022L described in his answer.
Question 3. What if the drive is 1G short? :) Don't worry about it. Chances are your replacement drive will be bigger. In fact, with a strategy like above it pays to get cheaper larger drives whenever one fails (or for a cheaper upgrade). You can get a progressive upgrade.
Experimental Proof
Experimental Setup
First, let's fake some block devices. We'll use /tmp/sdx and /tmp/sdy (each 100M), and /tmp/sdz (99M).
cd /tmp
dd if=/dev/zero of=sdx bs=1M count=100
sudo losetup -f sdx
dd if=/dev/zero of=sdy bs=1M count=100
sudo losetup -f sdy
dd if=/dev/zero of=sdz bs=1M count=99 # Here's a smaller one!
sudo losetup -f sdz
This sets up three files as three loopback block devices: /dev/loop0, /dev/loop1 and /dev/loop2, mapping to sdx, sdy and sdz respectively. Let's check the sizes:
sudo grep 'loop[012]' /proc/partitions
7 0 102400 loop0
7 1 102400 loop1
7 2 101376 loop2
As expected, we have two loop devices of exactly 100M (102400 KiB = 100 MiB) and one of 99M (exactly 99×1024 1K blocks).
Making a RAID Array out of Identically-Sized Devices
Here goes:
sudo mdadm --create -e 1.2 -n 2 -l 1 /dev/md100 /dev/loop0 /dev/loop1
mdadm: array /dev/md100 started.
Check the size:
sudo grep md100 /proc/partitions
9 100 102272 md100
This is precisely what we expect: one look at the mdadm manual reminds us that version 1.2 metadata takes up 128K: 128 + 102272 = 102400. Now let's destroy it in preparation for the second experiment.
sudo mdadm --stop /dev/md100
sudo mdadm --misc --zero-superblock /dev/loop0
sudo mdadm --misc --zero-superblock /dev/loop1
Making a RAID Array out of Unequally Sized Devices
This time we'll use the small block device.
sudo mdadm --create -e 1.2 -n 2 -l 1 /dev/md100 /dev/loop0 /dev/loop2
mdadm: largest drive (/dev/loop0) exceeds size (101248K) by more than 1%
Continue creating array? y
mdadm: array /dev/md100 started.
Well, we got warned, but the array was made. Let's check the size:
sudo grep md100 /proc/partitions
9 100 101248 md100
What we get here is 101,248 blocks. 101248 + 128 = 101376 = 99 × 1024. The usable space is that of the smallest device minus the 128K RAID metadata. Let's bring it all down again for our last experiment:
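The arithmetic from both experiments fits in a few lines of shell. This is a minimal sketch: the function name is made up, and the fixed 128K overhead is simply what these experiments showed for version 1.2 metadata on small loop devices. mdadm may reserve more on larger devices.

```shell
#!/bin/sh
# Predict the size (in KiB) of a RAID1 array from its members' sizes (in KiB).
# ASSUMPTION: a fixed 128 KiB overhead, as observed in the experiments above.
expected_md_size_kib() {
    overhead_kib=128
    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then min=$size; fi
    done
    echo $((min - overhead_kib))
}

expected_md_size_kib 102400 102400   # two 100M devices   -> 102272
expected_md_size_kib 102400 101376   # 100M + 99M devices -> 101248
```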
sudo mdadm --stop /dev/md100
sudo mdadm --misc --zero-superblock /dev/loop0
sudo mdadm --misc --zero-superblock /dev/loop2
And Finally: Adding a smaller Device to a Running Array
First, let's make a RAID1 array with just one of the 100M disks. The array will be degraded, but we don't really care. We just want a started array. The missing keyword is a placeholder that says ‘I don't have a device for you yet, start the array now and I'll add one later’.
sudo mdadm --create -e 1.2 -n 2 -l 1 /dev/md100 /dev/loop0 missing
Again, let's check the size:
sudo grep md100 /proc/partitions
9 100 102272 md100
Sure enough, it's 128K short of 102400 blocks. Adding the smaller disk:
sudo mdadm --add /dev/md100 /dev/loop2
mdadm: /dev/loop2 not large enough to join array
Boom! It won't let us, and the error is very clear.
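If you'd rather know before running --add, the same numbers can be compared up front. A rough sketch, with a made-up function name, again assuming the 128K overhead seen above (mdadm's own check remains the authority):

```shell
#!/bin/sh
# Succeeds if a candidate device (raw size in KiB) looks large enough to join
# an array of the given size (in KiB, as shown in /proc/partitions).
# ASSUMPTION: the candidate loses 128 KiB to metadata, as in the experiments above.
can_join_kib() {
    array_kib=$1
    candidate_kib=$2
    overhead_kib=128
    [ $((candidate_kib - overhead_kib)) -ge "$array_kib" ]
}

# The degraded 100M array versus the 99M device
if can_join_kib 102272 101376; then
    echo "large enough"
else
    echo "not large enough"
fi
```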
There are several ways to set up mdX devices. One method would be to use gdisk (or sgdisk if you prefer the command-line only version) to partition the disk as GPT. If you want to boot from the array, create a "BIOS Boot Partition", type code ef02. This is only necessary if you want to boot off this array; otherwise there's no need to care. Then, create a partition the same size as or smaller than the smallest disk to be added to the array. Last but not least, copy the GPT data over to the other disk (expert menu in gdisk, using x, and then u and specify the target device). This is a destructive process.
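For the sgdisk route, the equivalent commands would look roughly like this. It's a sketch that only prints the commands. The device names and the partition size are hypothetical placeholders, and fd00 is the gdisk type code for Linux RAID. Double-check source and target before running any of it for real, since replicating a partition table overwrites the one on the target.

```shell
#!/bin/sh
# Prints (does not run) an sgdisk equivalent of the gdisk steps above.
# ASSUMPTIONS (hypothetical): /dev/sdb is the disk being partitioned, /dev/sda is
# the disk that later receives a copy of the table, +900G is the RAID partition size.
SRC=/dev/sdb
DST=/dev/sda
RAID_SIZE=+900G

print_gpt_plan() {
    cat <<EOF
sgdisk -n 1:0:+1M -t 1:ef02 $SRC
sgdisk -n 2:0:$RAID_SIZE -t 2:fd00 $SRC
sgdisk -R $DST $SRC
sgdisk -G $DST
EOF
}

print_gpt_plan
```

The first line creates the BIOS Boot Partition (only needed for booting), the second the RAID partition. -R replicates the table from the source disk onto the target, and -G then gives the target fresh GUIDs so the two disks don't share identifiers.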
It should be possible, if the file system allows for it, to resize an existing partition to something smaller and then use the same method to copy the GPT data. However, this gets you into a bit of a kerfuffle, because now you have two disks, but still no mdX device. One of them has to be prepared as mdX, either partition-wise (which I implied above) or disk-wise, and then the data must be moved from the existing disk to that.
So:
- big disk (/dev/sda) contains data, data is smaller than 3001 GB, partitions are not
- smaller disk /dev/sdb gets added to the system
- you partition /dev/sdb with gdisk
- you create an array from each respective partition (mdadm -C /dev/md2 -l 1 -n 2 /dev/sdb2 missing)
- you create file systems on the new arrays
- you copy all data over, making sure that your system will be prepared to run off a GPT disk and making GRUB2 understand the implications (see below)
- you copy the GPT partitioning data over from /dev/sdb to /dev/sda
- you add the "raw" partitions from /dev/sda into the existing arrays
- you wait for /proc/mdstat to show you that the syncing is done
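For the last step, watching /proc/mdstat by hand works, but a small helper can do the polling. A minimal sketch, assuming that an ongoing sync shows up in /proc/mdstat as a line containing "resync", "recovery" or "reshape". The function name is made up.

```shell
#!/bin/sh
# Succeeds while any md array is still resyncing, recovering or reshaping.
# $1: optional path to an mdstat-style file (defaults to /proc/mdstat).
# ASSUMPTION: sync activity is identified purely by these keywords, so a pending
# resync (e.g. on an auto-read-only array) also counts as active.
md_sync_active() {
    grep -Eq 'resync|recovery|reshape' "${1:-/proc/mdstat}" 2>/dev/null
}

# Usage (not executed here):
#   while md_sync_active; do sleep 60; done
#   echo "all arrays in sync"
```

Alternatively, mdadm --wait /dev/mdX blocks until the given array has finished syncing.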
If you followed all steps you should now be able to boot into the new system off the mdX arrays. However, keep a rescue CD or a PXE boot option handy, just in case.
GRUB2 will not be able to recognize the setup off hand. So you need some "magic". Here's a one-liner:
for i in /dev/disk/by-id/md-uuid-*; do DEV=$(readlink "$i"); echo "(${DEV##*/}) $i"; done | sort | sudo tee /boot/grub/devicemap
Or let's be more verbose:
# Map each md array's by-id symlink to its GRUB device name, e.g. (md0)
for i in /dev/disk/by-id/md-uuid-*
do
    DEV=$(readlink "$i")
    echo "(${DEV##*/}) $i"
done | sort | sudo tee /boot/grub/devicemap
This creates (or overwrites) the default /boot/grub/devicemap with one that tells GRUB2 where to find each respective disk. The result would be something like this list:
(md0) /dev/disk/by-id/md-uuid-...
(md2) /dev/disk/by-id/md-uuid-...
(md3) /dev/disk/by-id/md-uuid-...
(md4) /dev/disk/by-id/md-uuid-...
If you use legacy GRUB, you also need to create the "BIOS Boot Partition" with meta-data version 0.9, using mdadm -e 0 ... and the process will differ. I haven't done that, though.