Monday 27 November 2017

EFI System Partition in soft RAID1

One reason you might want to put the EFI System Partition (ESP) in a RAID1 array on a computer with Linux soft RAID is to have redundancy when booting. If one disk fails, you want the boot to continue from the other disk.

At first I thought this wasn't possible, since a RAID1 member partition wouldn't have the specific FAT filesystem and GUID required by the specification. However, the fact that the CentOS 7 install media offered the choice of putting the ESP on a RAID1 array, and that it actually works, made me doubt my hypothesis.

The key to this is that the CentOS 7 installer uses RAID metadata format 1.0, which places the RAID superblock at the end of the partition. Thus it doesn't clash with the beginning of the partition, which is where the firmware will check to see if the partition is an ESP. However, most Linux partition tools will detect it first as a RAID member, so it's not immediately obvious that it's an ESP.
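To see both views of the same partition, here is a minimal Python sketch that looks for the boot-sector signature at the start and for a 1.0 superblock near the end. The function names are my own, and the offset calculation (step back 8 KiB from the end, then round down to a 4 KiB boundary) is my reading of how format 1.0 is laid out, so treat it as an assumption rather than a reference implementation.

```python
import io
import struct

MD_MAGIC = 0xA92B4EFC   # md superblock magic, stored little-endian in 1.x formats
SECTOR = 512

def md10_superblock_offset(size_bytes):
    """Byte offset where a format 1.0 superblock is assumed to sit."""
    sectors = size_bytes // SECTOR
    if sectors < 16:
        raise ValueError("device too small to hold a 1.0 superblock")
    # Step back 8 KiB (16 sectors), then round down to a 4 KiB (8-sector) boundary.
    return ((sectors - 16) & ~7) * SECTOR

def inspect_partition(f):
    """Return (has_boot_signature, has_md10_superblock) for a binary file object."""
    f.seek(0, io.SEEK_END)
    size = f.tell()
    # A FAT boot sector ends with the bytes 0x55 0xAA at offset 510.
    f.seek(510)
    boot_sig = f.read(2) == b"\x55\xaa"
    # The RAID superblock, if present, starts with the md magic number.
    f.seek(md10_superblock_offset(size))
    raw = f.read(4)
    has_md = len(raw) == 4 and struct.unpack("<I", raw)[0] == MD_MAGIC
    return boot_sig, has_md
```

Run as root against a member partition opened in binary mode, for example open("/dev/sda1", "rb") (the device name is hypothetical), I would expect both values to be True on a setup like the one described here.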

There are some caveats to this scheme. All writes to the ESP must be made while it's mounted as a RAID array, so that the two members never diverge. If the only OS on the disks is Linux, this won't be a problem. But don't use this scheme if the ESP also boots other operating systems that don't know about Linux RAID, since they would write to only one member.
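One way to guard against accidental writes to a single member is to check what the ESP is actually mounted from before touching it. The following is a small sketch under some assumptions: the ESP is mounted at /boot/efi (the CentOS default), the array shows up under a name starting with /dev/md, and the helper names are hypothetical.

```python
def esp_mount_source(mounts_text, mount_point="/boot/efi"):
    """Return the device mounted at mount_point, or None if nothing is mounted there."""
    source = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            source = fields[0]  # last match wins: later mounts shadow earlier ones
    return source

def esp_is_on_md(mounts_text, mount_point="/boot/efi"):
    """True if the ESP is mounted from an md device (assumed /dev/md* naming)."""
    source = esp_mount_source(mounts_text, mount_point)
    return source is not None and source.startswith("/dev/md")
```

On a live system the input would be the contents of /proc/mounts, for instance esp_is_on_md(open("/proc/mounts").read()).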

For CentOS, when you look at the choice of boot devices in the firmware setup, you should see two disk boot candidates, both labelled CentOS.

On the machines I used, HP z230 workstations, I found that I had to disable Legacy Boot; otherwise, errors reading the boot sectors would be triggered.

The bottom line is I now have workstations with soft RAID1 whose disks are fully redundant. If one disk fails, the other will continue to boot and run with degraded arrays for each of the partitions.
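Since a machine set up this way keeps booting and running after a disk failure, it is worth checking for degraded arrays so that a dead disk doesn't go unnoticed. Here is a rough Python sketch that scans text in the /proc/mdstat format and lists arrays whose status string contains an underscore (a missing member). The function name is hypothetical, and the parsing assumes the usual two-line layout per array.

```python
import re

def degraded_arrays(mdstat_text):
    """Return the names of arrays whose member status (e.g. [U_]) shows a gap."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0]"
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status line, e.g. "204736 blocks super 1.0 [2/2] [UU]"
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded
```

On a live system the input would be open("/proc/mdstat").read(); an empty list means every array reports all of its members as up.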
