Find files affected by bad blocks with md-raid5 and LVM
I've been doing a lot of research on this topic over the last few weeks - and I think I'm close to completing my recovery, as much as is possible at least. To make a long story short, I'll just describe the problem without filling in every tiny technical detail.
Assume you have multiple RAID-5 arrays, each with 8 disks, and have then spanned those together into a single LVM logical volume. One of the disks then dies in one of the arrays, and during rebuild you encounter an unrecoverable read error on a second disk in that array. And of course, there are no backups.
I've already ddrescue'd the data from the drive with the URE onto a new drive. Only 5K of data is damaged, all grouped into a very small area of the disk. I am also assuming that once I reassemble that MD device using the ddrescue'd copy, I will multiply the size of my data loss by the number of non-parity drives in my array (so 35K of data loss), as the parity calculations for the stripes using those blocks will be incorrect.
I've read and understood the procedures at http://smartmontools.sourceforge.net/badblockhowto.html for determining which files would be corrupted by a situation like this, but my problem is in figuring out exactly which blocks will be corrupt after the md rebuild, to use as input to debugfs. Figuring out all of the offsets where md and LVM store metadata isn't going to be fun either, but I think I can handle that part.
Can I just multiply all of my bad-block numbers by 7 and then assume that the following 6 blocks after each of those will also be bad, and then follow the LVM instructions in the guide linked above?
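One thing that makes a plain "multiply by 7" risky is that md stripes data in chunk-sized units rather than block by block, and rotates the parity chunk from stripe to stripe. The arithmetic can be sketched as below. This is only a sketch under stated assumptions: it assumes the md default left-symmetric RAID-5 layout, and the function names and parameters (member, chunk, data_offset) are made up for illustration. The real chunk size, layout, data offset, and device role have to be read from mdadm --detail and mdadm --examine for the actual array, and the LVM offset from the guide still has to be applied afterwards.

```python
def member_to_array_offset(member, member_offset, n_disks=8,
                           chunk=512 * 1024, data_offset=0):
    """Map a byte offset on one RAID-5 member to a byte offset on the md device.

    Assumes the left-symmetric layout. `member` is the device's role
    (0 .. n_disks-1), `data_offset` is where array data starts on the
    member, in bytes. Returns None if that position holds parity.
    """
    rel = member_offset - data_offset
    if rel < 0:
        raise ValueError("offset is before the start of array data")
    stripe, within = divmod(rel, chunk)
    # Left-symmetric: parity starts on the last disk and moves down by one
    # disk per stripe; data chunks follow the parity disk, wrapping around.
    parity_disk = (n_disks - 1) - (stripe % n_disks)
    if member == parity_disk:
        return None
    k = (member - parity_disk - 1) % n_disks  # index of the data chunk in this stripe
    return (stripe * (n_disks - 1) + k) * chunk + within


def stripe_range(member_offset, n_disks=8, chunk=512 * 1024, data_offset=0):
    """Byte range [start, end) on the md device covered by the whole stripe.

    A conservative upper bound: every data chunk in the stripe that
    contains the given member offset.
    """
    stripe = (member_offset - data_offset) // chunk
    start = stripe * (n_disks - 1) * chunk
    return start, start + (n_disks - 1) * chunk


if __name__ == "__main__":
    # Hypothetical numbers: bad byte 10 GiB into member 3, data offset of 1 MiB.
    bad = 10 * 1024**3
    print(member_to_array_offset(3, bad, data_offset=1024**2))
    print(stripe_range(bad, data_offset=1024**2))
```

Dividing the resulting md offsets by the filesystem block size (after subtracting the LVM offsets) would then give candidate block numbers for debugfs. Calling the first function once per member you believe carries wrong data gives exact positions, while stripe_range gives the whole stripe if you'd rather be cautious.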
And to be clear: I'm not concerned with repairing or re-mapping the bad blocks as the guide describes. I've replaced the disk and will be letting md handle that kind of thing. I just want to know which files on the ext4 filesystem have been affected.
You still looking for help on this? One way you might find affected files is to tar the filesystem to /dev/null. tar has to read every file, so it will complain about any file it cannot read. Something like:
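tar cf - /file/system/to/check | cat > /dev/null
might do it for you. (The pipe through cat is there because GNU tar skips reading file contents when it detects that the archive is /dev/null.)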
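If you'd rather get a clean list of paths than pick through tar's messages, the same read-everything idea can be sketched in Python. This is a minimal sketch, not a tested recovery tool: the function name find_unreadable is made up here, and like tar it only catches files whose reads fail with an I/O error. It will not notice blocks that read back successfully but contain wrong data.

```python
import os


def find_unreadable(root, bufsize=1024 * 1024):
    """Read every regular file under `root`; return (path, error) for failures."""
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file
            # (FIFOs, sockets, device nodes).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    # Read in chunks so large files do not fill memory.
                    while f.read(bufsize):
                        pass
            except OSError as exc:
                bad.append((path, exc))
    return bad


if __name__ == "__main__":
    import sys
    for path, exc in find_unreadable(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(f"{path}: {exc}")
```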