Can I "atomically" swap a raid5 drive in Linux software raid?

Solution 1:

Individually, those commands will not do what you desire.

mdadm /dev/md0 -a /dev/sdd1 
cat /proc/mdstat; #(you should now have a spare drive in raid5)
mdadm /dev/md0 -f /dev/sdc1
cat /proc/mdstat; #(you should now see a rebuild occurring to sdd1)

A test of the actual command does indeed cause a rebuild to occur.
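
If you want to script the add/fail sequence, it helps to know when the rebuild has finished. Here is a minimal sketch, assuming /proc/mdstat shows an in-progress rebuild as a "recovery = ..." (or "resync = ...") progress line below the array. The function name md_busy and its optional file argument are made up for this example.

```shell
# md_busy ARRAY [MDSTAT_FILE]
# Succeeds (exit 0) while ARRAY shows a recovery/resync progress line.
# MDSTAT_FILE defaults to /proc/mdstat; the parameter only exists so the
# parser can be tried against a saved copy of the file.
md_busy() {
    awk -v md="$1" '
        $1 == md && $2 == ":"              { in_array = 1; next }  # our array starts here
        /^[^ ]/                            { in_array = 0 }        # next array or section
        in_array && /(recovery|resync) *=/ { busy = 1 }
        END { exit (busy ? 0 : 1) }
    ' "${2:-/proc/mdstat}"
}

# Usage sketch (needs root and a real array, so it is only shown as comments):
#   mdadm /dev/md0 -a /dev/sdd1
#   mdadm /dev/md0 -f /dev/sdc1
#   while md_busy md0; do sleep 10; done
#   mdadm /dev/md0 -r /dev/sdc1
```

Note that the array runs degraded during the whole loop, which is exactly the window the question wants to avoid.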

Alas, I don't believe you can do what you desire right now.

As an aside, I often reference the linux raid wiki, and experiment on what I see there using loopback files.

dd if=/dev/zero of=loopbackfile.0 bs=1024k count=100
losetup /dev/loop0 loopbackfile.0

That gives you a 100 MB file that is available as /dev/loop0. Create another couple of them, and you can use mdadm (e.g. "mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/loop0 /dev/loop1 /dev/loop2") without affecting real drives or data.
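
Creating several of these files is easy to wrap in a loop. The following is only a sketch: make_images is a made-up helper name, and it creates sparse files (dd with count=0 and seek=) instead of writing zeros, which is a swapped-in shortcut and not what the dd command above does. The file creation works as a normal user. The losetup step needs root and is therefore only shown as a comment.

```shell
# make_images DIR COUNT SIZE_MB
# Creates COUNT sparse files named loopbackfile.0, loopbackfile.1, ... in DIR.
make_images() (
    dir=$1 count=$2 size_mb=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # count=0 writes no data; seek= sets the file length in 1 MiB blocks
        dd if=/dev/zero of="$dir/loopbackfile.$i" bs=1048576 count=0 seek="$size_mb" 2>/dev/null
        i=$((i + 1))
    done
)

workdir=$(mktemp -d)
make_images "$workdir" 3 100
ls -l "$workdir"

# Attaching the files needs root:
#   losetup /dev/loop0 "$workdir/loopbackfile.0"   # likewise for .1 and .2
```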


Note: I previously said that

mdadm /dev/md0 -a /dev/sdd1
mdadm --grow /dev/md0 --raid-disks=4

would grow your array to a raid6. This is false. It simply adds a fourth disk to your raid5 array, which leaves you no better off than you are now.
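
To see why a fourth raid5 disk does not help here, compare the numbers. This is back-of-the-envelope arithmetic only, ignoring metadata overhead: raid5 spends one disk's worth of space on parity and survives one failed disk, raid6 spends two and survives two. The helper name raid_usable is made up for illustration.

```shell
# raid_usable LEVEL NDISKS DISK_MB
# Prints the approximate usable capacity in MB (metadata overhead ignored).
raid_usable() {
    case $1 in
        5) parity=1 ;;   # one disk of parity, tolerates one failure
        6) parity=2 ;;   # two disks of parity, tolerates two failures
        *) echo "unsupported level: $1" >&2; return 1 ;;
    esac
    echo $(( ($2 - parity) * $3 ))
}

raid_usable 5 3 100   # prints 200
raid_usable 5 4 100   # prints 300: more space, still only one failure tolerated
raid_usable 6 4 100   # prints 200: same space as before, two failures tolerated
```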


Solution 2:

Test software raid in a sandbox!

I would suggest you play around in a sandbox.
mdadm can work with image files, not just with device files like
e.g. /dev/sda or /dev/mapper/vg00/lv_home, so why don't you test your migration
in a second software RAID on your machine? :)

Linux OS

I'm doing this under debian/lenny and bash:

# cat /etc/debian_version && uname -r && bash --version
5.0.2
2.6.26-2-amd64
GNU bash, version 3.2.39(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2007 Free Software Foundation, Inc.

Step 1

As root, create four 128 MB disk images like this (you need 512 MB of free disk space on /):

sudo su 
mkdir -p ~/raidtest/{root,home} && cd ~/raidtest
for i in sd{a,b,c,d} ; do
  dd if=/dev/zero bs=1M count=128 of=$i
done

Let's see what happened:

# ls -hon --time-style=+
total 512M
drwxr-xr-x 2 0 4,0K  home
drwxr-xr-x 2 0 4,0K  root
-rw-r--r-- 1 0 128M  sda
-rw-r--r-- 1 0 128M  sdb
-rw-r--r-- 1 0 128M  sdc
-rw-r--r-- 1 0 128M  sdd

Step 2

Partitioning the files

I create 3 partitions (20MB, 40MB and 56MB) for swap, / and /home on sda through a loop device:

# losetup /dev/loop0 sda
# ! echo "n
p
1

+20M
t
fd
n
p
2

+40M
t
2
fd
n
p
3


t
3
fd
w" | fdisk /dev/loop0

OK, look at what happened:

# fdisk -l /dev/loop0
Disk /dev/loop0: 134 MB, 134217728 bytes
255 heads, 63 sectors/track, 16 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xe90aaf21

      Device Boot      Start         End      Blocks   Id  System
/dev/loop0p1               1           3       24066   fd  Linux raid autodetect
/dev/loop0p2               4           9       48195   fd  Linux raid autodetect
/dev/loop0p3              10          16       56227+  fd  Linux raid autodetect
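
As an alternative to piping keystrokes into fdisk, the same layout can be described declaratively. This sketch assumes a reasonably recent sfdisk (the named-field script format from util-linux 2.26 or later). The sfdisk shipped with lenny is older and will most likely not accept it. The function name raid_layout is made up, and the resulting partition boundaries may differ slightly from the cylinder-aligned fdisk result above.

```shell
# raid_layout: print an sfdisk script for three "Linux raid autodetect"
# partitions: 20 MiB, 40 MiB and the rest of the disk.
raid_layout() {
    cat <<'EOF'
label: dos
size=20MiB, type=fd
size=40MiB, type=fd
type=fd
EOF
}

raid_layout

# To apply it (as root; this overwrites the partition table of the target):
#   raid_layout | sfdisk /dev/loop0
```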

Copy this partition scheme to loop{1,2,3} (corresponding to sd{b,c,d}):

# losetup /dev/loop1 sdb
# sfdisk -d /dev/loop0 | sfdisk /dev/loop1
# losetup /dev/loop2 sdc
# sfdisk -d /dev/loop0 | sfdisk /dev/loop2
# losetup /dev/loop3 sdd
# sfdisk -d /dev/loop0 | sfdisk /dev/loop3

Optional: If you have parted installed, run partprobe on the devices to update the kernel's partition table:

# partprobe /dev/loop0
# partprobe /dev/loop1
# partprobe /dev/loop2
# partprobe /dev/loop3

Step 3

Use kpartx to create the per-partition devices under /dev/mapper/

# aptitude install kpartx dmsetup
# kpartx -av /dev/loop0
add map loop0p1 (254:3): 0 48132 linear /dev/loop0 63
add map loop0p2 (254:4): 0 96390 linear /dev/loop0 48195
add map loop0p3 (254:5): 0 112455 linear /dev/loop0 144585
# kpartx -av /dev/loop1
add map loop1p1 (254:6): 0 48132 linear /dev/loop1 63
add map loop1p2 (254:7): 0 96390 linear /dev/loop1 48195
add map loop1p3 (254:8): 0 112455 linear /dev/loop1 144585
# kpartx -av /dev/loop2
add map loop2p1 (254:9): 0 48132 linear /dev/loop2 63
add map loop2p2 (254:10): 0 96390 linear /dev/loop2 48195
add map loop2p3 (254:11): 0 112455 linear /dev/loop2 144585
# kpartx -av /dev/loop3
add map loop3p1 (254:12): 0 48132 linear /dev/loop3 63
add map loop3p2 (254:13): 0 96390 linear /dev/loop3 48195
add map loop3p3 (254:14): 0 112455 linear /dev/loop3 144585

Step 4

Create your raid5 and watch the status
We are still root! On my workstation I don't use raid, just LVM, so I have to load the kernel module and install the mdadm package.

# modprobe raid5
# aptitude install mdadm
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
unused devices: <none>

I use md{10,11,12} for this test. Make sure they are not already in use on your system (that would be unusual)!
--force and -x 0 are used because otherwise mdadm sets up one partition as a spare:

## the 20MB Partition
# mdadm --create --force -l 5 -n3 -x 0 /dev/md10 /dev/mapper/loop0p1 /dev/mapper/loop1p1 /dev/mapper/loop2p1
mdadm: array /dev/md10 started.
## the 40MB Partition
# mdadm --create --force -l 5 -n3 -x 0 /dev/md11 /dev/mapper/loop0p2 /dev/mapper/loop1p2 /dev/mapper/loop2p2
mdadm: array /dev/md11 started.
## the 56MB Partition
# mdadm --create --force -l 5 -n3 -x 0 /dev/md12 /dev/mapper/loop0p3 /dev/mapper/loop1p3 /dev/mapper/loop2p3
mdadm: array /dev/md12 started.

How it looks now:

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md12 : active raid5 dm-11[2] dm-8[1] dm-5[0]
      112256 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md11 : active raid5 dm-10[2] dm-7[1] dm-4[0]
      96256 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md10 : active raid5 dm-9[2] dm-6[1] dm-3[0]
      48000 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

Info
The output isn't pretty: mdstat shows just dm-3 .. dm-11, meaning the /dev/mapper/loop* devices,
but ls -lsa /dev/disk/by-id shows you the current mapping.

On my machine the members of md10 are dm-3, dm-6 and dm-9 (meaning /dev/mapper/loop{0,1,2}p1, see the kpartx output above). The numbering starts at dm-3 because my LVM already uses dm-{0,1,2}, and it also depends on tests I did while writing this article.
You can also use mdadm --examine --scan, or get more detailed information via mdadm -Q --detail /dev/md10 /dev/md11 /dev/md12
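
If you would rather translate the dm-N names in a script, sysfs exposes the mapping. The sketch below assumes every member of the array is a device-mapper device (true in this sandbox, not on a typical real raid), and the names dm_name and md_members, as well as the optional sysfs root argument, are made up for this example.

```shell
# dm_name DM_NODE [SYSFS_ROOT]
# Prints the device-mapper name behind a dm-N node, e.g. dm-3 -> loop0p1.
dm_name() {
    cat "${2:-/sys}/block/$1/dm/name"
}

# md_members ARRAY [SYSFS_ROOT]
# Prints one line per member of the md array: "<dm node> <mapper name>".
md_members() {
    for s in "${2:-/sys}/block/$1/slaves"/*; do
        [ -e "$s" ] || continue          # array missing or without members
        node=${s##*/}                    # strip the directory part
        printf '%s %s\n' "$node" "$(dm_name "$node" "${2:-/sys}")"
    done
}

# Usage on the sandbox from this answer:
#   md_members md10
```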

Step 5

As root, silently create the filesystems and swap:

# mkswap /dev/md10 > /dev/null 2>&1
# mke2fs -m0 -Lroot /dev/md11 -F > /dev/null 2>&1
# mke2fs -m0 -Lhome /dev/md12 -F > /dev/null 2>&1

Mount your new RAID devices:

# swapon /dev/md10
# mount /dev/md11 root/
# mount /dev/md12 home/

Have a look at the structure and check that /dev/md10 is a valid swap partition
(my workstation also uses /dev/mapper/vg00-swap, hence its higher priority):

# \tree
.
|-- home
|   `-- lost+found
|-- root
|   `-- lost+found
|-- sda
|-- sdb
|-- sdc
`-- sdd

# cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/mapper/vg00-swap                   partition       9764856 53688   -1
/dev/md10                               partition       47992   0       -2
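
The same check can be scripted instead of read by eye. A small sketch, assuming the /proc/swaps layout shown above (one header line, device name in the first column). swap_active and its optional file argument are made-up names.

```shell
# swap_active DEVICE [SWAPS_FILE]
# Succeeds if DEVICE is listed as active swap. SWAPS_FILE defaults to
# /proc/swaps; the parameter lets you test against a saved copy.
swap_active() {
    awk -v dev="$1" '
        NR > 1 && $1 == dev { found = 1 }   # NR > 1 skips the header line
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/swaps}"
}

# Usage:
#   if swap_active /dev/md10; then echo "md10 is in use as swap"; fi
```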

Wow, a lot of work for the sandbox, but it's worth it when you want to play with mdadm. Use it!

Now you have a running raid5 and can test the migration.
I think there are some excellent answers here. Test them carefully on your system!

Last step

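After finishing your tests, release the swap and the mounts, shut down your md devices and delete your /dev/loop* devices:

# swapoff /dev/md10
# umount /dev/md11
# umount /dev/md12
# mdadm --stop /dev/md10
# mdadm --stop /dev/md11
# mdadm --stop /dev/md12
# kpartx -dv /dev/loop0
# kpartx -dv /dev/loop1
# kpartx -dv /dev/loop2
# kpartx -dv /dev/loop3
# losetup -d /dev/loop0
# losetup -d /dev/loop1
# losetup -d /dev/loop2
# losetup -d /dev/loop3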

Bringing it up again after a reboot

sudo su
cd ~/raidtest
# connecting the files to /dev/loop*
losetup /dev/loop0 sda
losetup /dev/loop1 sdb
losetup /dev/loop2 sdc
losetup /dev/loop3 sdd

# access to the partitions in /dev/loop*
kpartx -av /dev/loop0
kpartx -av /dev/loop1
kpartx -av /dev/loop2
kpartx -av /dev/loop3

# start the raid again
mdadm --assemble /dev/md10 /dev/mapper/loop0p1 /dev/mapper/loop1p1 /dev/mapper/loop2p1
mdadm --assemble /dev/md11 /dev/mapper/loop0p2 /dev/mapper/loop1p2 /dev/mapper/loop2p2
mdadm --assemble /dev/md12 /dev/mapper/loop0p3 /dev/mapper/loop1p3 /dev/mapper/loop2p3

# show active raids
cat /proc/mdstat
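
The repetition above can be folded into two loops. This is a sketch only: run and bring_up are made-up helper names, and DRY_RUN is a convention invented for this example. By default the commands are just printed so you can review them. Set DRY_RUN=0 and run it as root from ~/raidtest to really execute them.

```shell
# run CMD [ARGS...]: print the command when DRY_RUN=1 (default), else execute it
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi
}

bring_up() {
    # attach each image file and map its partitions
    n=0
    for img in sda sdb sdc sdd; do
        run losetup "/dev/loop$n" "$img"
        run kpartx -av "/dev/loop$n"
        n=$((n + 1))
    done
    # md10/md11/md12 are built from partition 1/2/3 of loop0..loop2
    p=1
    for md in md10 md11 md12; do
        run mdadm --assemble "/dev/$md" \
            "/dev/mapper/loop0p$p" "/dev/mapper/loop1p$p" "/dev/mapper/loop2p$p"
        p=$((p + 1))
    done
}

bring_up
```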

After testing: copy the partition table to /dev/sdd

Did your tests go fine?
OK, then you have to copy the partition table from /dev/sda to /dev/sdd, as we did in the sandbox with our files:

sfdisk -d /dev/sda | sfdisk /dev/sdd

Now you can add /dev/sdd to your raid

Info
If this fails because of different hard disk vendors/models, you have to play with -uS (sectors), -uB (blocks), -uC (cylinders) or -uM (megabytes). Consult man sfdisk!

Some of my real-life RAID combos were P-ATA <-> P-ATA, but even SCSI <-> P-ATA works fine, as long as the new device is at least as large as the other hard disks.
Software RAID is so flexible!
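
The size condition is easy to check before you touch the new disk. A tiny sketch: fits_in_array is a made-up helper that simply compares two byte counts, and the sizes in the example are invented. On a real system the numbers would come from something like blockdev --getsize64 /dev/sdd (needs root).

```shell
# fits_in_array NEW_BYTES MEMBER_BYTES
# Succeeds if the replacement is at least as large as the existing member.
fits_in_array() {
    [ "$1" -ge "$2" ]
}

# Example with invented sizes (a 500 GB member and a 1 TB replacement):
if fits_in_array 1000204886016 500107862016; then
    echo "replacement is large enough"
else
    echo "replacement is too small"
fi
```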

Update your /etc/mdadm/mdadm.conf

If you have an /etc/mdadm/mdadm.conf, please look into it and update it! mdadm can help you by displaying the correct syntax:

mdadm --detail --scan
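
One way to fold that output into the config without editing by hand is sketched below. It assumes the array definitions in your mdadm.conf are single lines starting with ARRAY, which is how mdadm --detail --scan prints them. If yours span several lines, edit the file manually instead. refresh_arrays and the /tmp file names are made up for this example.

```shell
# refresh_arrays CONF SCAN_OUTPUT
# Prints CONF without its old ARRAY lines, followed by the lines in SCAN_OUTPUT.
refresh_arrays() {
    grep -v '^ARRAY' "$1" || true   # grep exits 1 if nothing is left; not an error here
    cat "$2"
}

# Usage (as root), reviewing the result before replacing the real file:
#   mdadm --detail --scan > /tmp/arrays.new
#   refresh_arrays /etc/mdadm/mdadm.conf /tmp/arrays.new > /tmp/mdadm.conf.new
#   diff -u /etc/mdadm/mdadm.conf /tmp/mdadm.conf.new
```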

Good luck!