How to install Ubuntu 14.04/16.04 64-bit with a dual-boot RAID 1 partition on an UEFI/GPT system?
Solution 1:
UPDATE: I have verified that the description below also works for Ubuntu 16.04. Other users have reported that it works on 17.10 and 18.04.1.
NOTE: This HOWTO will not give you LVM. If you want LVM too, try Install Ubuntu 18.04 desktop with RAID 1 and LVM on machine with UEFI BIOS instead.
After days of trying, I now have a working system! In brief, the solution consisted of the following steps:
- Boot using a Ubuntu Live CD/USB.
- Partition the SSDs as required.
- Install missing packages (mdadm and grub-efi).
- Create the RAID partitions.
- Run the Ubiquity installer (but do not boot into the new system).
- Patch the installed system (initramfs) to enable boot from a RAIDed root.
- Populate the EFI partition of the first SSD with GRUB and install it into the EFI boot chain.
- Clone the EFI partition to the other SSD and install it into the boot chain.
- Done! Your system will now have RAID 1 redundancy. Note that nothing special needs to be done after e.g. a kernel update, as the UEFI partitions are untouched.
A key component of step 6 of the solution was a delay in the boot sequence that otherwise dumped me squarely to the GRUB prompt (without keyboard!) if either of the SSDs were missing.
Detailed HOWTO
1. Boot
Boot using EFI from the USB stick. Exactly how you do this varies by system. Select Try Ubuntu without installing.
Start a terminal emulator, e.g. xterm, to run the commands below.
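If you want to confirm that the live session itself booted in UEFI mode, a simple optional check is to look for the EFI variables directory (general shell knowledge, nothing specific to this HOWTO):
[ -d /sys/firmware/efi ] && echo "Booted in UEFI mode" || echo "Booted in legacy/BIOS mode"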
1.1 Login from another computer
While trying this out, I often found it easier to login from another, already fully configured computer. This simplified cut-and-paste of commands, etc. If you want to do the same, you can login via ssh by doing the following:
On the computer to be configured, install the openssh server:
sudo apt-get install openssh-server
Change the password. The default password for user ubuntu is blank. You can probably pick a medium-strength password. It will be forgotten as soon as you reboot your new computer.
passwd
Now you can log into the Ubuntu live session from another computer. The instructions below are for Linux:
ssh -l ubuntu <your-new-computer>
If you get a warning about a suspected man-in-the-middle attack, you need to clear the ssh keys used to identify the new computer. This is because openssh-server generates new server keys whenever it is installed. The command to use is typically printed and should look like
ssh-keygen -f <path-to-.ssh/known_hosts> -R <your-new-computer>
After executing that command, you should be able to log in to the Ubuntu live session.
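If you do not know the address of the new computer, you can optionally look it up in the live session first using standard tools:
hostname -I     # lists the assigned IP addresses
ip addr show    # full interface details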
2. Partition disks
Clear any old partitions and boot blocks. Warning! This will destroy data on your disks!
sudo sgdisk -z /dev/sda
sudo sgdisk -z /dev/sdb
Create new partitions on the smaller of your two drives: 100M for the ESP, 32G for RAID swap, and the rest for the RAID root. If /dev/sda is the smaller drive, follow Section 2.1; otherwise follow Section 2.2.
2.1 Create partition tables (/dev/sda is smaller)
Do the following steps:
sudo sgdisk -n 1:0:+100M -t 1:ef00 -c 1:"EFI System" /dev/sda
sudo sgdisk -n 2:0:+32G -t 2:fd00 -c 2:"Linux RAID" /dev/sda
sudo sgdisk -n 3:0:0 -t 3:fd00 -c 3:"Linux RAID" /dev/sda
Copy the partition table to the other disk and regenerate unique UUIDs (this will actually regenerate the UUIDs for sda).
sudo sgdisk /dev/sda -R /dev/sdb -G
2.2 Create partition tables (/dev/sdb is smaller)
Do the following steps:
sudo sgdisk -n 1:0:+100M -t 1:ef00 -c 1:"EFI System" /dev/sdb
sudo sgdisk -n 2:0:+32G -t 2:fd00 -c 2:"Linux RAID" /dev/sdb
sudo sgdisk -n 3:0:0 -t 3:fd00 -c 3:"Linux RAID" /dev/sdb
Copy the partition table to the other disk and regenerate unique UUIDs (this will actually regenerate the UUIDs for sdb).
sudo sgdisk /dev/sdb -R /dev/sda -G
2.3 Create FAT32 file system on /dev/sda
Create FAT32 file system for the EFI partition.
sudo mkfs.fat -F 32 /dev/sda1
mkdir /tmp/sda1
sudo mount /dev/sda1 /tmp/sda1
sudo mkdir /tmp/sda1/EFI
sudo umount /dev/sda1
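As an optional sanity check before continuing, the two disks should now show identical partition sizes and type codes but different GUIDs. For example:
sudo sgdisk -p /dev/sda
sudo sgdisk -p /dev/sdb
lsblk -o NAME,SIZE,TYPE,FSTYPE /dev/sda /dev/sdb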
3. Install missing packages
The Ubuntu Live CD comes without two key packages: grub-efi and mdadm. Install them. (I'm not 100% sure grub-efi is needed here, but to maintain symmetry with the coming installation, bring it in as well.)
sudo apt-get update
sudo apt-get -y install grub-efi-amd64 # (or grub-efi-amd64-signed)
sudo apt-get -y install mdadm
You may need grub-efi-amd64-signed instead of grub-efi-amd64 if you have secure boot enabled. (See comment by Alecz.)
4. Create the RAID partitions
Create the RAID devices in degraded mode. The devices will be completed later. Creating a full RAID 1 sometimes gave me problems during the ubiquity installation below; I'm not sure why. (mount/unmount? format?)
sudo mdadm --create /dev/md0 --bitmap=internal --level=1 --raid-disks=2 /dev/sda2 missing
sudo mdadm --create /dev/md1 --bitmap=internal --level=1 --raid-disks=2 /dev/sda3 missing
Verify RAID status.
cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda3[0]
216269952 blocks super 1.2 [2/1] [U_]
bitmap: 0/2 pages [0KB], 65536KB chunk
md0 : active raid1 sda2[0]
33537920 blocks super 1.2 [2/1] [U_]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
Partition the md devices.
sudo sgdisk -z /dev/md0
sudo sgdisk -z /dev/md1
sudo sgdisk -N 1 -t 1:8200 -c 1:"Linux swap" /dev/md0
sudo sgdisk -N 1 -t 1:8300 -c 1:"Linux filesystem" /dev/md1
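Optionally, verify that the kernel has picked up the new partitions; md0p1 and md1p1 should show up as children of the md devices:
lsblk /dev/md0 /dev/md1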
5. Run the installer
Run the ubiquity installer, skipping the boot loader installation, which would fail anyway. (Note: If you have logged in via ssh, you will probably want to execute this on your new computer instead.)
sudo ubiquity -b
Choose Something else as the installation type and modify the md1p1 type to ext4, format: yes, and mount point /. The md0p1 partition will automatically be selected as swap.
Get a cup of coffee while the installation finishes.
Important: After the installation has finished, select Continue testing as the system is not boot ready yet.
Complete the RAID devices
Attach the waiting sdb partitions to the RAID.
sudo mdadm --add /dev/md0 /dev/sdb2
sudo mdadm --add /dev/md1 /dev/sdb3
Verify all RAID devices are ok (and optionally sync'ing).
cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb3[1] sda3[0]
216269952 blocks super 1.2 [2/1] [U_]
[>....................] recovery = 0.2% (465536/216269952) finish=17.9min speed=200000K/sec
bitmap: 2/2 pages [8KB], 65536KB chunk
md0 : active raid1 sdb2[1] sda2[0]
33537920 blocks super 1.2 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
The steps below, including the reboots, can be carried out while the sync is still in progress.
6. Configure the installed system
Set up to enable a chroot into the installed system.
sudo -s
mount /dev/md1p1 /mnt
mount -o bind /dev /mnt/dev
mount -o bind /dev/pts /mnt/dev/pts
mount -o bind /sys /mnt/sys
mount -o bind /proc /mnt/proc
cat /etc/resolv.conf >> /mnt/etc/resolv.conf
chroot /mnt
Configure and install packages.
apt-get install -y grub-efi-amd64 # (or grub-efi-amd64-signed; same as in step 3)
apt-get install -y mdadm
If your md devices are still sync'ing, you may see occasional warnings like:
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
This is normal and can be ignored (see answer at bottom of this question).
nano /etc/grub.d/10_linux
# change quick_boot and quiet_boot to 0
Disabling quick_boot will avoid the "Diskfilter writes are not supported" bug. Disabling quiet_boot is a matter of personal preference only.
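If you prefer a non-interactive edit over nano, something along the following lines should work, assuming the file contains the assignments quick_boot="1" and quiet_boot="1" (check first; the exact lines may differ between releases). Do not leave backup copies inside /etc/grub.d, since update-grub executes every executable file in that directory:
sed -i -e 's/quick_boot="1"/quick_boot="0"/' -e 's/quiet_boot="1"/quiet_boot="0"/' /etc/grub.d/10_linux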
Modify /etc/mdadm/mdadm.conf to remove any label references, i.e. change
ARRAY /dev/md/0 metadata=1.2 name=ubuntu:0 UUID=f0e36215:7232c9e1:2800002e:e80a5599
ARRAY /dev/md/1 metadata=1.2 name=ubuntu:1 UUID=4b42f85c:46b93d8e:f7ed9920:42ea4623
to
ARRAY /dev/md/0 UUID=f0e36215:7232c9e1:2800002e:e80a5599
ARRAY /dev/md/1 UUID=4b42f85c:46b93d8e:f7ed9920:42ea4623
This step may be unnecessary, but I've seen some pages suggest that the naming schemes may be unstable (name=ubuntu:0/1) and this may stop a perfectly fine RAID device from assembling during boot.
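The UUIDs above are examples from my system; yours will differ. One way to regenerate the ARRAY lines for your own arrays (and then strip the metadata= and name= fields by hand, as shown above) is:
mdadm --detail --scan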
Modify lines in /etc/default/grub to read
#GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX=""
Again, this step may be unnecessary, but I prefer to boot with my eyes open...
6.1. Add sleep script
(It has been suggested by the community that this step might be unnecessary and can be replaced by using GRUB_CMDLINE_LINUX="rootdelay=30" in /etc/default/grub. For reasons explained at the bottom of this HOWTO, I suggest sticking with the sleep script even though it is uglier than using rootdelay. Thus, we continue with our regular program...)
Create a script that will wait for the RAID devices to settle. Without this delay, mounting of root may fail due to the RAID assembly not being finished in time. I found this out the hard way - the problem did not show up until I had disconnected one of the SSDs to simulate disk failure! The timing may need to be adjusted depending on available hardware, e.g. slow external USB disks, etc.
Enter the following code into /usr/share/initramfs-tools/scripts/local-premount/sleepAwhile:
#!/bin/sh
echo
echo "sleeping for 30 seconds while udevd and mdadm settle down"
sleep 5
echo "sleeping for 25 seconds while udevd and mdadm settle down"
sleep 5
echo "sleeping for 20 seconds while udevd and mdadm settle down"
sleep 5
echo "sleeping for 15 seconds while udevd and mdadm settle down"
sleep 5
echo "sleeping for 10 seconds while udevd and mdadm settle down"
sleep 5
echo "sleeping for 5 seconds while udevd and mdadm settle down"
sleep 5
echo "done sleeping"
Make the script executable and install it.
chmod a+x /usr/share/initramfs-tools/scripts/local-premount/sleepAwhile
update-grub
update-initramfs -u
7. Enable boot from the first SSD
Now the system is almost ready; only the UEFI boot loader remains to be installed.
mount /dev/sda1 /boot/efi
grub-install --boot-directory=/boot --bootloader-id=Ubuntu --target=x86_64-efi --efi-directory=/boot/efi --recheck
update-grub
umount /dev/sda1
This will install the boot loader in /boot/efi/EFI/Ubuntu (a.k.a. EFI/Ubuntu on /dev/sda1) and install it first in the UEFI boot chain on the computer.
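To check that the entry was created and placed first in the boot order, you can list the current UEFI entries (still inside the chroot, assuming efibootmgr is available there, which it normally is when grub-efi is installed):
efibootmgr -v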
8. Enable boot from the second SSD
We're almost done. At this point, we should be able to reboot from the sda drive. Furthermore, mdadm should be able to handle a failure of either the sda or the sdb drive. However, the EFI partition is not RAIDed, so we need to clone it.
dd if=/dev/sda1 of=/dev/sdb1
In addition to installing the boot loader on the second drive, this will make the UUID of the FAT32 file system on the sdb1 partition (as reported by blkid) match that of sda1 and /etc/fstab. (Note, however, that the UUIDs of the /dev/sda1 and /dev/sdb1 partitions will still be different - compare ls -la /dev/disk/by-partuuid | grep sd[ab]1 with blkid /dev/sd[ab]1 after the install to check for yourself.)
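In other words, the following two checks (run after the dd) should report identical filesystem UUIDs but different partition UUIDs:
blkid /dev/sda1 /dev/sdb1
ls -la /dev/disk/by-partuuid | grep sd[ab]1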
Finally, we must insert the sdb1 partition into the boot order. (Note: This step may be unnecessary, depending on your BIOS. I have received reports that some BIOSes automatically generate a list of valid ESPs.)
efibootmgr -c -g -d /dev/sdb -p 1 -L "Ubuntu #2" -l '\EFI\ubuntu\grubx64.efi'
I did not test it, but it is probably necessary to have unique labels (-L) between the ESPs on sda and sdb.
This will generate a printout of the current boot order, e.g.
Timeout: 0 seconds
BootOrder: 0009,0008,0000,0001,0002,000B,0003,0004,0005,0006,0007
Boot0000 Windows Boot Manager
Boot0001 DTO UEFI USB Floppy/CD
Boot0002 DTO UEFI USB Hard Drive
Boot0003* DTO UEFI ATAPI CD-ROM Drive
Boot0004 CD/DVD Drive
Boot0005 DTO Legacy USB Floppy/CD
Boot0006* Hard Drive
Boot0007* IBA GE Slot 00C8 v1550
Boot0008* Ubuntu
Boot000B KingstonDT 101 II PMAP
Boot0009* Ubuntu #2
Note that Ubuntu #2 (sdb) and Ubuntu (sda) are the first in the boot order.
Reboot
Now we are ready to reboot.
exit # from chroot
exit # from sudo -s
sudo reboot
The system should now reboot into Ubuntu. (You may have to remove the Ubuntu Live installation media first.)
After boot, you may run
sudo update-grub
to attach the Windows boot loader to the grub boot chain.
Virtual machine gotchas
If you want to try this out in a virtual machine first, there are some caveats: apparently, the NVRAM that holds the UEFI information is remembered between reboots, but not between shutdown-restart cycles. In that case, you may end up at the UEFI Shell console. The following commands should boot you into your machine from /dev/sda1 (use FS1: for /dev/sdb1):
FS0:
\EFI\ubuntu\grubx64.efi
The first solution in the top answer of UEFI boot in virtualbox - Ubuntu 12.04 might also be helpful.
Simulating a disk failure
Failure of either RAID component device can be simulated using mdadm (a software-only alternative is sketched after the status output below). However, to verify that the boot stuff would survive a disk failure, I had to shut down the computer and disconnect power from a disk. If you do so, first ensure that the md devices are sync'ed.
cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdb3[2] sda3[0]
216269952 blocks super 1.2 [2/2] [UU]
bitmap: 2/2 pages [8KB], 65536KB chunk
md0 : active raid1 sda2[0] sdb2[2]
33537920 blocks super 1.2 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
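If you only want a software-level test first (without pulling any cables), a degraded array can also be produced with mdadm alone. A minimal sketch, using md0 and sdb2 as an example; adapt the device names to your setup:
sudo mdadm /dev/md0 --fail /dev/sdb2     # mark the component as faulty
sudo mdadm /dev/md0 --remove /dev/sdb2   # remove it from the array
cat /proc/mdstat                         # md0 should now show [2/1] [U_]
sudo mdadm /dev/md0 --re-add /dev/sdb2   # add it back; the write-intent bitmap keeps the resync short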
In the instructions below, sdX is the failed device (X=a or b) and sdY is the healthy device.
Disconnect a drive
Shutdown the computer. Disconnect a drive. Restart. Ubuntu should now boot with the RAID drives in degraded mode. (Celebrate! This is what you were trying to achieve! ;)
cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sda3[0]
216269952 blocks super 1.2 [2/1] [U_]
bitmap: 2/2 pages [8KB], 65536KB chunk
md0 : active raid1 sda2[0]
33537920 blocks super 1.2 [2/1] [U_]
bitmap: 0/1 pages [0KB], 65536KB chunk
unused devices: <none>
Recover from a failed disk
This is the process to follow if you have needed to replace a faulty disk. If you want to emulate a replacement, you may boot into a Ubuntu Live session and use
dd if=/dev/zero of=/dev/sdX
to wipe the disk clean before rebooting into the real system. If you just tested the boot/RAID redundancy in the section above, you can skip this step. However, you must at least perform steps 2 and 4 below to recover full boot/RAID redundancy for your system.
Restoring the RAID+boot system after a disk replacement requires the following steps:
- Partition the new drive.
- Add partitions to md devices.
- Clone the boot partition.
- Add an EFI record for the clone.
1. Partition the new drive
Copy the partition table from the healthy drive:
sudo sgdisk /dev/sdY -R /dev/sdX
Re-randomize UUIDs on the new drive.
sudo sgdisk /dev/sdX -G
2. Add to md devices
sudo mdadm --add /dev/md0 /dev/sdX2
sudo mdadm --add /dev/md1 /dev/sdX3
3. Clone the boot partition
Clone the ESP from the healthy drive. (Careful, maybe do a dump-to-file of both ESPs first to enable recovery if you really screw it up.)
sudo dd if=/dev/sdY1 of=/dev/sdX1
4. Insert the newly revived disk into the boot order
Add an EFI record for the clone. Modify the -L label as required.
sudo efibootmgr -c -g -d /dev/sdX -p 1 -L "Ubuntu #2" -l '\EFI\ubuntu\grubx64.efi'
Now, rebooting the system should have it back to normal (the RAID devices may still be sync'ing)!
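If you want to follow the resync progress, a simple way is:
watch -n 10 cat /proc/mdstat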
Why the sleep script?
It has been suggested by the community that adding a sleep script might be unnecessary and could be replaced by using GRUB_CMDLINE_LINUX="rootdelay=30" in /etc/default/grub followed by sudo update-grub. This suggestion is certainly cleaner and does work in a disk failure/replace scenario. However, there is a caveat...
I disconnected my second SSD and found out that with rootdelay=30, etc. instead of the sleep script:
1) The system does boot in degraded mode without the "failed" drive.
2) In non-degraded boot (both drives present), the boot time is reduced. The delay is only perceptible with the second drive missing.
1) and 2) sounded great until I re-added my second drive. At boot, the RAID array failed to assemble and left me at the initramfs prompt without knowing what to do. It might have been possible to salvage the situation by a) booting to the Ubuntu Live USB stick, b) installing mdadm and c) re-assembling the array manually, but... I messed up somewhere. Instead, when I re-ran this test with the sleep script (yes, I did start the HOWTO from the top for the nth time...), the system did boot. The arrays were in degraded mode and I could manually re-add the /dev/sdb[23] partitions without any extra USB stick. I don't know why the sleep script works whereas the rootdelay doesn't. Perhaps mdadm gets confused by two slightly out-of-sync component devices, but I thought mdadm was designed to handle that. Anyway, since the sleep script works, I'm sticking to it.
It could be argued that removing a perfectly healthy RAID component device, re-booting the RAID into degraded mode and then re-adding the component device is an unrealistic scenario: the realistic scenario is rather that one device fails and is replaced by a new one, leaving less opportunity for mdadm to get confused. I agree with that argument. However, I don't know how to test how the system tolerates a hardware failure except by actually disabling some hardware! And after testing, I want to get back to a redundant, working system. (Well, I could attach my second SSD to another machine and wipe it before I re-add it, but that's not feasible.)
In summary: to my knowledge, the rootdelay solution is clean, faster than the sleep script for non-degraded boots, and should work for a real drive failure/replace scenario. However, I don't know a feasible way to test it. So, for the time being, I will stick to the ugly sleep script.
Solution 2:
My suggestion is for Debian, but I think it would also work for Ubuntu and other distributions.
Lots of motherboards do not handle the UEFI entries correctly (Debian doesn't boot even if you make the correct entry with efibootmgr -c -g -d /dev/sda -p 1 -w -L "debian" -l /EFI/debian/grubx64.efi; the UEFI BIOS shows a "debian" bootable disk but won't boot from it). One possible way to work around this is to use the generic entry /boot/efi/EFI/boot/bootx64.efi instead. For example, the Asus Z87C doesn't like /EFI/debian/grubx64.efi.
So, if you have mounted the EFI partition /dev/sda1 at the /boot/efi path:
mkdir /boot/efi/EFI/boot
cp /boot/efi/EFI/debian/grubx64.efi /boot/efi/EFI/boot/bootx64.efi
Then reboot.
The UEFI BIOS will show a generic "UEFI OS" disk, as well as any other entries previously created with efibootmgr, but it will boot from the generic "UEFI OS" entry without any trouble.