mdadm raid5 recover double disk failure - with a twist (drive order)

Solution 1:

To answer your questions,

  1. Can it be restored?

    • First things first - STOP, sit back and just think a little. Yes, the layout algorithm, chunk size and disk order are vital to getting whatever filesystem was present to re-assemble properly. But since you've overwritten the superblocks, you're now left with trial and error.
    • Second, is there any way you can retrieve the previous disk layout? I always do an mdadm --detail > backupfile just to keep that disk layout somewhere safe. Check dmesg and /var/log for any evidence of how the disks were configured in the RAID.
    • Lastly, even if you match the previous chunk size and disk order, you may have damaged the ext4 superblock - there are ways to quickly scan for the backup superblocks (see the sketch after this list), and there's a nifty program called TestDisk that scans for superblocks of existing filesystems and lets you browse them manually: http://www.cgsecurity.org/wiki/Main_Page
  2. Since sdc is new, I would continue to try and assemble manually via the missing clause, and yes, sde must be in the correct position for the array to assemble in degraded mode. Once you find the correct layout, copy all data off the array and start again, documenting the layout (so you don't run into this issue again).
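A minimal sketch of that superblock scan, assuming an assembled (even degraded) array at /dev/md0 and a 4 KiB ext4 block size - both of those are assumptions, so adjust for your setup:

# List where ext4 would have placed its backup superblocks.
# mke2fs -n is a dry run: nothing is written to the device.
mke2fs -n /dev/md0

# Try fsck against one of the reported backup superblocks
# (32768 is a common location for a 4 KiB block size filesystem).
# -n answers "no" to every question, so the device is not modified.
e2fsck -n -b 32768 /dev/md0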

Good Luck

Solution 2:

Before you do ANYTHING else, capture an 'mdadm --examine /dev/sdX1' for each of the drives that WERE in your array, and an 'mdadm --detail /dev/md0'. From that output, you should be able to determine the exact layout.
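For example, something along these lines saves that output before anything else is touched; the device names are placeholders for whatever is actually in your array:

# Capture the per-member superblocks and the array overview to files.
# /dev/sd[bcde]1 and /dev/md0 are placeholders - use your real devices.
for d in /dev/sd[bcde]1; do
    mdadm --examine "$d" >> /root/raid-examine.txt
done
mdadm --detail /dev/md0 > /root/raid-detail.txt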

I just had to do this myself to recover a Synology array in a separate question:

How to recover an mdadm array on Synology NAS with drive in "E" state?

Edit: Sorry, just saw that you said you lost the superblocks on all the drives.

Your later commands LOOK correct. The simplest option might be to run the creates with each possible ordering, and then see whether you can mount and access the filesystem on the result read-only.
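If you do try that, a rough sketch like the one below can automate the loop. Note that --create rewrites the member superblocks even with --assume-clean, so (as the next answer stresses) it is much safer to run this against overlays rather than the raw disks. Device names, chunk size and metadata version here are assumptions:

#!/bin/sh
# Try one candidate ordering: re-create the array with --assume-clean
# (no resync, data blocks untouched), then attempt a read-only mount.
mkdir -p /mnt/test
try_order() {
    mdadm --stop /dev/md0 2>/dev/null
    mdadm --create /dev/md0 --run --assume-clean --metadata=1.2 \
          --level=5 --raid-devices=4 --chunk=512 "$@" || return 1
    if mount -o ro /dev/md0 /mnt/test; then
        echo "candidate order looks sane: $*"
        umount /mnt/test
    fi
}

# "missing" stands in for the failed member; extend this list until
# every plausible permutation has been tried.
try_order /dev/sdb1 missing /dev/sdd1 /dev/sde1
try_order /dev/sdd1 missing /dev/sdb1 /dev/sde1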

Solution 3:

This question is old and I'm sure nobody can help you now, but for others reading:

The most dangerous mistake you made is not one of the steps you numbered; it was running:

mdadm --create ...

on the original disks, before you knew what you were doing. That overwrote the metadata, so you have no record of drive order, data offset, chunk size, etc.

To recover from this, you need to overwrite the metadata again, this time with the correct values. The easiest way to learn those values is to look at the existing metadata, but you have already destroyed that. The next way is to guess: try different combinations of a command like this, varying any of the options except what you know (4 devices, level 5), and also varying the disk order:

mdadm --create /dev/md0 --assume-clean --metadata=1.2 --raid-devices=4 --level=5 --layout=... --chunk=512 --data-offset=128M /dev/sdb1 missing /dev/sdd1 /dev/sde1

But since you DO NOT know the correct values, you should not run that on the old disks and destroy them further, repeating the same fatal mistake. Instead, use an overlay; for example, this procedure should work to keep the originals safe.
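A minimal sketch of the overlay idea, using device-mapper snapshots in the spirit of the Linux RAID wiki recovery procedure; every write is diverted into a sparse copy-on-write file, so the original partitions are never modified. Overlay size, paths and device names are assumptions:

# Build a copy-on-write overlay on top of each surviving partition.
# Anything mdadm writes lands in the sparse .ovl file, not on the disk.
for d in sdb1 sdd1 sde1; do
    truncate -s 4G /tmp/$d.ovl                  # sparse COW store
    loop=$(losetup -f --show /tmp/$d.ovl)
    size=$(blockdev --getsz /dev/$d)            # size in 512-byte sectors
    dmsetup create $d-ovl --table "0 $size snapshot /dev/$d $loop P 8"
done

# Experiment against the overlays instead of the raw disks, e.g.:
mdadm --create /dev/md0 --assume-clean --metadata=1.2 --raid-devices=4 \
      --level=5 --chunk=512 /dev/mapper/sdb1-ovl missing \
      /dev/mapper/sdd1-ovl /dev/mapper/sde1-ovl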

Once you have found arguments that produce a working array that you can fsck or mount and verify (e.g. check the checksum of a file large enough to span all the RAID members, like an ISO you have stored with its checksum/PGP signature, or unzip -t or gunzip -t a large archive), you can then run the same mdadm --create against the real disks, or simply copy your data off the overlay-backed array.
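A hedged example of that verification step, assuming you had previously recorded a checksum for a large file on the array (the file paths here are invented):

# Read-only sanity checks against the assembled, overlay-backed array.
fsck.ext4 -n /dev/md0                 # -n: report problems, change nothing
mount -o ro /dev/md0 /mnt/test

# A file spanning many stripes is a good end-to-end test of the layout.
sha256sum /mnt/test/images/big.iso       # compare against the value you saved
unzip -t /mnt/test/backups/archive.zip   # or: gunzip -t some-file.gz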