Need help with recovering RAID array

Solution 1:

Filesystem layers in Linux (listed bottom-up, from the physical drives to the filesystem):

  1. physical device
    • /dev/sdi
    • /dev/sdj
    • /dev/sdk
    • /dev/sdl
  2. special md partition type on each drive (when used)

    • this may or may not be present. While it is recommended that you simply make a single partition on each drive that spans the entire drive, it is possible to hand md the whole device by using the device name directly. Note that this can confuse some partition tools (because the partition table simply "goes away"), so I don't recommend it.

    In your case, the entire drives are specified, so there are no partitions to see. You won't have to worry about this.

  3. md driver (when used)

    • /dev/md2

    Your output from both mdadm --detail and /proc/mdstat reports that the array is up on all drives and that no drive is in a failed state. This means the array is healthy! (There is a short check sequence after this list.)

  4. LVM (when used)

    • Type the following into a shell while logged in as root:

    pvscan && vgscan && lvscan

    If there are any volumes to be found, they should show up here. Note that the scan is controlled by a filter in /etc/lvm/lvm.conf that can tell LVM to ignore certain devices, so make sure nothing is excluding /dev/md2 from the scan. Each LVM physical volume, volume group, and logical volume carries a UUID in its metadata; if that metadata is lost or corrupted, it can cause the issues you are seeing. The goal here is to get your LVM volumes recognized (see the sketch after this list). Once they are healthy, you'll be in good shape.

  5. filesystem

    I think you know the drill here.
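For reference, a minimal check sequence for steps 3 and 4 (a sketch only, assuming /dev/md2 is your array and that LVM sits directly on top of it) might look like:

    # Step 3: confirm the md array really is healthy
    mdadm --detail /dev/md2              # State should be "clean", all four members active
    cat /proc/mdstat                     # md2 should be active, e.g. [UUUU]

    # Step 4: look for LVM metadata on top of the array
    pvscan && vgscan && lvscan
    grep -n 'filter' /etc/lvm/lvm.conf   # check that no filter line excludes /dev/md2

If pvscan reports nothing on /dev/md2, check that filter line before concluding the LVM metadata is gone.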

From here you should be able to recover your filesystem(s).

Solution 2:

Usually LVM is layered 'on top of' MD. Maybe you set up LVM from the command line rather than with your distro's tools? If so, the startup scripts may not know about the LVM volumes.

First run a 'vgscan' and see if the volume group comes up. If it does, it's just a matter of untangling the scripts.
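A minimal sketch of that check (nothing here is destructive; it only scans and activates what it finds):

    vgscan          # rescan block devices for LVM volume groups
    vgchange -ay    # activate any volume groups that were found
    lvs             # list the logical volumes that are now visible

If the logical volumes appear after a manual vgchange -ay, the data is intact and the real problem is just that the activation step is missing from (or runs too early in) your startup scripts.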

Solution 3:

You may find it hard to get a "do this to fix it" answer here, largely because any good sysadmin is extremely paranoid about data loss, including when guiding someone else through a situation that could lead to data loss.

From what you've provided, I'll summarize what I see and where you might start.

  • /dev/md2 is a RAID 5 device with (4) 1.5TB drives
  • The entire underlying drive is being used by the raid module - no partitions on the drive
  • Your /dev/md2 device is now reporting a normal / happy status

Start by posting the results of: pvdisplay and vgscan

Do you have a "lvm-raid" file located in /etc/lvm/backup/ ?

Solution 4:

The first thing I would do in this kind of situation, if at all possible: make an exact copy of every disk forming /dev/md2 (with dd or something like that). This may take ages, but if you frack things up even more while attempting the repair, you can go back to where you started.
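A sketch of such a copy (the destination paths are placeholders, and the target location must have enough free space to hold full images of the drives; status=progress needs GNU dd and can simply be dropped on older systems):

    # Image each member drive before attempting any repair
    dd if=/dev/sdi of=/mnt/backup/sdi.img bs=1M conv=noerror,sync status=progress
    dd if=/dev/sdj of=/mnt/backup/sdj.img bs=1M conv=noerror,sync status=progress
    dd if=/dev/sdk of=/mnt/backup/sdk.img bs=1M conv=noerror,sync status=progress
    dd if=/dev/sdl of=/mnt/backup/sdl.img bs=1M conv=noerror,sync status=progress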