How to fix my broken raid10 array
Solution 1:
Big fat warning:
Anything you do with your array (including the steps I suggest) may lead to complete data loss. If the data is really valuable (expensive to regain) and not backed up, let someone experienced handle the situation for you, starting with making binary copies of all four drives.
From your output it seems you have "Device Role : Active device 3" twice in your --examine output. That points to an attempt at recovery, but one done incorrectly.
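A quick way to spot such a duplicated role is to count how often each role line appears across the members. The snippet below uses sample --examine lines for illustration (the values are assumptions, not taken from your system); in practice you would feed it real output as shown in the comment:

```shell
# Any role that appears more than once means two drives claim the same slot.
# With real metadata, the pipeline would be:
#   mdadm --examine /dev/sd[b-e] | grep 'Device Role' | sort | uniq -c
printf '%s\n' \
  '   Device Role : Active device 0' \
  '   Device Role : Active device 1' \
  '   Device Role : Active device 3' \
  '   Device Role : Active device 3' |
  sort | uniq -c | awk '$1 > 1 {print "duplicate:", $0}'
# prints the "Active device 3" line with a count of 2
```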
From /proc/mdstat it looks like your array gets assembled but not run. There are some very odd device numbers (4, 5, 1) where your drives should be numbered 0, 1, 2, 3. That too suggests discrepancies in the metadata.
Another point of interest is the Events counter in each drive's metadata. The counters agree for sd[b-d] but seem to be behind on sde. Are you sure that sdd was the drive that dropped out? This rather points to sde having been out of the array for some time.
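You can confirm which member is stale by comparing the counters directly. The sketch below uses made-up Events values that mirror the situation described (sde lagging); the comment shows how the real pairs would be produced:

```shell
# The member with the lowest Events count missed the most recent writes.
# Real data would come from something like:
#   for d in /dev/sd[b-e]; do
#     printf '%s %s\n' "$d" "$(mdadm --examine "$d" | awk '/Events/ {print $3}')"
#   done
# The values below are assumptions, not your actual counters:
printf '%s\n' 'sdb 74291' 'sdc 74291' 'sdd 74291' 'sde 74283' |
  sort -k2,2n | head -n1    # first line is the stale member
```

Here it would print "sde 74283", matching the suspicion that sde, not sdd, fell out of the array.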
You could try assembling the array without the dropped-out drive (mdadm -A -R /dev/md127 /dev/sd[bcd] or mdadm -A --force -R /dev/md127 /dev/sd[bce]). Leaving out the conflicting drive should sidestep the metadata conflict. Even if that works, do not write anything to the array: back up your data first, and only then try adding sdd back as a hot spare.
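The steps above can be sketched as one command sequence. The device names and the md127 array name are taken from the question; do not run this blindly, and take binary copies of the drives first if the data matters:

```shell
# Stop the half-assembled array before retrying.
mdadm --stop /dev/md127

# Attempt 1: assemble and run without sde (the drive with the lagging Events count).
mdadm -A -R /dev/md127 /dev/sd[bcd]

# Attempt 2 (only if attempt 1 fails): force-assemble without sdd instead.
#   mdadm --stop /dev/md127
#   mdadm -A --force -R /dev/md127 /dev/sd[bce]

# If the array runs, mount read-only and back everything up before any write.
mount -o ro /dev/md127 /mnt

# Only after a verified backup, re-add the dropped drive as a hot spare:
#   mdadm /dev/md127 --add /dev/sdd
```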
If it does not work, you might update your question with the output of mdadm -D /dev/md127 after assembling the array (with both suggested drive sets, actually).