mdadm RAID5 RAID6 how to check consistency on running array
Run (replace md125 with your actual array):
echo "check" > /sys/block/md125/md/sync_action
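It reads all the drives, recomputes the parity for every stripe and compares the result with what is stored on disk. A check only counts the mismatches it finds (the counter is exposed in /sys/block/md125/md/mismatch_cnt) and does not rewrite them; writing "repair" instead of "check" makes md recompute and rewrite the parity. Sectors that fail to read during either pass are reconstructed from the remaining drives and written back, which takes care of the unreadable sectors that the bit error rate of modern, very large disks makes likely. In principle the dual parity of RAID6 allows detecting double errors and locating and correcting a single one (when just one drive has gone out of sync), but as far as I know the kernel's repair does not try to locate the bad drive and simply regenerates both parity blocks from the data.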
It'll report any important messages to the kernel log, readable via dmesg. You can monitor the status via the /proc/mdstat file or mdadm --detail /dev/md125.
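For scripted monitoring, the same information is available from the array's sysfs directory. The following is a minimal sketch, not a polished tool: check_status is a hypothetical helper name, md125 is only the example array from above, and the progress figure is a rough integer percentage derived from sync_completed.

```shell
#!/bin/sh
# Sketch: summarise a running check from the array's sysfs files.
# check_status is a hypothetical helper; pass your array's md directory.
check_status() {
    md_dir="${1:-/sys/block/md125/md}"
    action=$(cat "$md_dir/sync_action")
    completed=$(cat "$md_dir/sync_completed")   # "done / total" in sectors, or "none"
    mismatches=$(cat "$md_dir/mismatch_cnt")
    progress="n/a"
    if [ "$completed" != "none" ]; then
        done_sectors=${completed%% *}           # text before the first space
        total_sectors=${completed##* }          # text after the last space
        if [ "$total_sectors" -gt 0 ]; then
            progress="$(( 100 * done_sectors / total_sectors ))%"
        fi
    fi
    echo "action=$action progress=$progress mismatch_cnt=$mismatches"
}
```

Calling check_status /sys/block/md125/md prints one line that is easy to log or grep; a non-zero mismatch_cnt after the check has finished is the thing to look at.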
It is very useful to run the check periodically: it not only reveals miswrites, but also detects dying devices early so they get kicked out of the array before something worse happens. It is best to have the check invoked by the system scheduler (cron or systemd timers). Some Linux distros (e.g. Debian) set this up by default.
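If your distro does not schedule the check for you, a small wrapper like the one below could be run from cron. This is only a sketch under assumptions: start_check is a hypothetical helper, the script path and the schedule in the comment are made up for illustration, and the guard simply refuses to start a check while the array is doing anything else (resync, recovery, another check).

```shell
#!/bin/sh
# Sketch: start a check only when the array is idle.
# start_check is a hypothetical helper; pass your array's md directory.
start_check() {
    md_dir="${1:-/sys/block/md125/md}"
    state=$(cat "$md_dir/sync_action") || return 1
    if [ "$state" != "idle" ]; then
        echo "not starting check, array is busy: $state" >&2
        return 1
    fi
    echo check > "$md_dir/sync_action"
}

# Example crontab entry (hypothetical path and schedule, 03:00 on the 1st
# of each month), assuming the script ends with: start_check "$@"
#   0 3 1 * * /usr/local/sbin/md-check.sh /sys/block/md125/md
```

The idle guard matters because writing "check" while a rebuild is in progress is not what you want from an unattended job.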
While the first parity syndrome is simply an XOR, the second one is not: it is calculated using Galois field arithmetic, which is considerably more involved. Linux software RAID uses a field that allows RAID6 with no more than 257 active devices (not counting hot spares). This calculation is quite CPU-intensive, so it's better to run the check when your system isn't under much load. You can also limit its impact by capping the check rate: write a value in KiB/s to /sys/block/md125/md/sync_speed_max (the default is 200000, i.e. roughly 200 MB/s). Linux also benchmarks the available algorithms for the RAID syndrome calculation at boot and reports which one it picked, so you can see which one will be used and how fast it performs by reading the boot logs.
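As a rough illustration of the rate limit, the helper below writes either a number of KiB/s or the word "system" (which reverts the array to the system-wide value from /proc/sys/dev/raid/speed_limit_max) to sync_speed_max. set_check_speed is a hypothetical name and 50000 is an arbitrary example value, not a recommendation.

```shell
#!/bin/sh
# Sketch: cap or restore the check/resync rate of a single array.
# set_check_speed is a hypothetical helper; limit is KiB/s or "system".
set_check_speed() {
    md_dir="$1"
    limit="$2"
    case "$limit" in
        system) ;;                      # fall back to the system-wide limit
        ''|*[!0-9]*)
            echo "limit must be a number of KiB/s or 'system'" >&2
            return 1
            ;;
    esac
    echo "$limit" > "$md_dir/sync_speed_max"
}

# Usage (as root):
#   set_check_speed /sys/block/md125/md 50000    # about 50 MB/s
#   set_check_speed /sys/block/md125/md system   # back to the default
#   dmesg | grep -i raid6    # boot-time benchmark and the chosen algorithm
```

The limit applies to this array only; the other arrays keep using the system-wide setting.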
You can also interrupt the running check by sending idle:
echo "idle" > /sys/block/md125/md/sync_action