System does not boot from root LVM RAID1 partition when any of the underlying PVs is missing
I have LVM RAID 1 (not mdadm RAID 1, but native LVM2 RAID 1) for my root partition. I would like the system to boot even when one of the HDDs holding the underlying PVs of the root LV is missing.
The kernel command line in grub.cfg (automatically generated by GRUB2) looks like:
linux /kernel-genkernel-x86-3.11.7-hardened-r1-3 root=/dev/mapper/vg-root ro dolvm
It works perfectly with both disks present, and the system is fault-tolerant at runtime, i.e. it keeps running properly if either HDD fails while the system is up.
However, if I try to boot with one of the HDDs missing, I get
Refusing activation of partial LV root. Use --partial to override.
during boot, followed by a kernel panic. On the one hand this seems reasonable, since activating an LV with a missing PV is not normal behavior; on the other hand, it is exactly what is needed to boot the server. The only workaround I can think of is adding some additional option to the kernel command line.
Do you know how to make LVM RAID 1 work for the root partition when one of the HDDs does not work?
Solution 1:
I haven't tried this before, and I'm not sure it's the best solution, but it should get you up and going... I'd strongly recommend you at least browse through the sources at the bottom to sanity check what I've got here... :)
Basically, you'll need to boot off rescue media and rescan the PVs, VGs, and LVs:
lvm pvscan
lvm vgscan
lvm lvscan
Then, you should be able to force the VG activation:
lvm vgchange -ay --partial VolGroup00
Once that's done, you can remove the missing mirror copy from the VG:
lvm vgreduce --removemissing --force VolGroup00
Once that's done, you should be good to reboot into a non-mirrored config.
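Before rebooting, it may be worth sanity-checking that nothing still references the missing PV (VolGroup00 is just the placeholder VG name used above; substitute your own, e.g. vg from the question):
lvm lvs -a -o lv_name,segtype,devices VolGroup00
lvm vgs -o vg_name,pv_count,lv_count VolGroup00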
Once you get back up and happy, and you've fixed/replaced the bad drive(s), you'll need to add them back into the system, and then do something like the following (assuming that /dev/sdb is the drive that failed, and you've created an LVM partition on it using fdisk as /dev/sdb1, and that /dev/sda1 is where the good mirror is):
lvm pvcreate /dev/sdb1
lvm vgextend VolGroup00 /dev/sdb1
Then you'll need to recreate each of the LV mirrors with something like:
lvm lvconvert -m 1 /dev/VolGroup00/usr /dev/sda1 /dev/sdb1
lvm lvconvert -m 1 /dev/VolGroup00/var /dev/sda1 /dev/sdb1
lvm lvconvert -m 1 /dev/VolGroup00/root /dev/sda1 /dev/sdb1
...
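The mirrors will resynchronize in the background; something like the following should show the progress (the copy_percent field, shown as Cpy%Sync, is available in reasonably recent LVM versions):
lvm lvs -a -o lv_name,copy_percent,devices VolGroup00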
Sources:
- http://pleasedonttouchthescreen.blogspot.com/2011/11/mirroring-root-filesystem-with-lvm.html
- https://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/mirrorrecover.html
- http://www.datadisk.co.uk/html_docs/redhat/rh_lvm.htm
Solution 2:
The problem was solved in LVM version 2.02.108, released in June 2014.
A degraded LV activation mode was added and made the default. Basically, it activates an LV whenever this is possible without data loss, even if the LV is incomplete (as in the case of an LVM RAID 1 with one leg missing).
A more thorough description can be found here: LVM: activation: Add "degraded" activation mode.
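For reference, with LVM 2.02.108 or later the activation mode can also be requested explicitly, either on the command line (vg is the volume group name implied by root=/dev/mapper/vg-root in the question):
vgchange -ay --activationmode degraded vg
or persistently in /etc/lvm/lvm.conf:
activation {
    activation_mode = "degraded"
}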