Odd mdadm output: --examine shows array state failed, --detail shows everything clean

The setup: eight disks in an mdadm-managed RAID5 array (/dev/md0, built from /dev/sdc through /dev/sdj). One disk (/dev/sdh) is showing SMART errors (an increasing pending-sector count), so I'm looking to replace it. Additionally, the machine boots from a Revodrive SSD in a PCIe slot, configured as a RAID0 stripe.

The oddness: mdadm --detail shows the array as clean, and everything appears to be running well (I can mount, read, and write the array without problems), but mdadm --examine output for every disk shows an array state of failed.

root@saturn:/backup# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdi1[6] sdj1[8] sdh1[5] sdg1[4] sdf1[3] sde1[2] sdd1[1] sdc1[0]
      20511854272 blocks super 1.0 level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>

/proc/mdstat only shows the mdadm-managed array of SATA drives, not the Revodrive, which is what I'd expect, since the Revodrive's RAID is managed by its own hardware controller.
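As a side note, the [8/8] [UUUUUUUU] status field can be checked mechanically rather than by eye. A small sketch, using the mdstat line above as sample input (the grep pattern is my own, not anything mdadm provides):

```shell
# Sanity-check the member status from /proc/mdstat. The sample line is
# copied from the array above: [8/8] means 8 of 8 members present, and
# [UUUUUUUU] means every member is up (a failed member shows as "_").
status='20511854272 blocks super 1.0 level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]'

# Pull out the U/_ flag group and count the "U" flags.
up=$(printf '%s\n' "$status" | grep -o '\[U*_*U*\]' | tr -cd 'U' | wc -c)
echo "members up: $up"
```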

root@saturn:/backup# mdadm --detail /dev/md0
mdadm: metadata format 01.00 unknown, ignored.
/dev/md0:
        Version : 01.00
  Creation Time : Wed Apr 20 10:14:05 2011
     Raid Level : raid5
     Array Size : 20511854272 (19561.63 GiB 21004.14 GB)
  Used Dev Size : 5860529792 (5589.04 GiB 6001.18 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Sep 19 13:42:21 2011
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : saturn:0  (local to host saturn)
           UUID : e535a44b:b319927e:4a574c20:39fc3f08
         Events : 45

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1
       3       8       81        3      active sync   /dev/sdf1
       4       8       97        4      active sync   /dev/sdg1
       5       8      113        5      active sync   /dev/sdh1
       6       8      129        6      active sync   /dev/sdi1
       8       8      145        7      active sync   /dev/sdj1

Obviously, there's a metadata format error in the first line, caused by an auto-generated metadata flag in mdadm.conf. This is mdadm v2.6.7.1 running on Ubuntu, and I've chalked it up to a known issue.
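For context, the known Ubuntu issue is that the auto-generated config spells the metadata version the old way. The ARRAY line below is a hypothetical reconstruction of what such an entry looks like (only the UUID is real, taken from the --detail output above), not the poster's actual file:

```
# Auto-generated style that triggers "metadata format 01.00 unknown, ignored":
ARRAY /dev/md0 level=raid5 num-devices=8 metadata=01.00 UUID=e535a44b:b319927e:4a574c20:39fc3f08

# Spelling that current mdadm accepts (or the metadata= keyword can be dropped):
ARRAY /dev/md0 level=raid5 num-devices=8 metadata=1.0 UUID=e535a44b:b319927e:4a574c20:39fc3f08
```

The warning is cosmetic either way; mdadm ignores the unparseable keyword and assembles the array regardless.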

root@saturn:/backup# mdadm --examine /dev/sdc1
mdadm: metadata format 01.00 unknown, ignored.
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x0
     Array UUID : e535a44b:b319927e:4a574c20:39fc3f08
           Name : saturn:0  (local to host saturn)
  Creation Time : Wed Apr 20 10:14:05 2011
     Raid Level : raid5
   Raid Devices : 8

 Avail Dev Size : 5860529904 (2794.52 GiB 3000.59 GB)
     Array Size : 41023708544 (19561.63 GiB 21004.14 GB)
  Used Dev Size : 5860529792 (2794.52 GiB 3000.59 GB)
   Super Offset : 5860530160 sectors
          State : clean
    Device UUID : 1b508410:b129e871:d92c7979:30764611

    Update Time : Mon Sep 19 13:52:58 2011
       Checksum : 2e68592 - correct
         Events : 45

         Layout : left-symmetric
     Chunk Size : 64K

    Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, failed, 7)
   Array State : Uuuuuuuu 1 failed

But in the --examine output, the Array State is failed. Each disk seems to show itself as the failed member (/dev/sdd shows uUuuuuuu, /dev/sde shows uuUuuuuu, and so on), yet all of them show the mystery ninth "failed" slot between slots 6 and 7 on the Array Slot line.
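To make that slot list easier to compare across eight disks, the parenthesised part of the Array Slot line can be split out positionally. A sketch using the sdc1 line above as sample input:

```shell
# Sample "Array Slot" line copied from the --examine output above.
line='Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, failed, 7)'

# Extract the parenthesised slot list, split on commas, and report which
# zero-based positions carry the "failed" marker.
echo "$line" |
  sed 's/.*(\(.*\)).*/\1/' |
  tr ',' '\n' |
  awk '{ gsub(/ /, "") } $0 == "failed" { print "position " NR-1 " marked failed" }'
```

Run against each member's output, this makes it obvious that every disk reports the same phantom entry in position 7.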

I'm guessing the disk superblocks are screwy, despite everything being functional. I'd like to get this sorted before replacing the suspect disk, as I'm a little concerned about how the array might behave when I fail a drive. What's the best way for me to proceed?
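For the eventual swap itself (once the superblock question is settled), the usual mdadm sequence would apply. Echoed here as a dry run for review rather than executed:

```shell
# Planned replacement sequence for the suspect member. Echoed rather
# than executed so it can be reviewed first; drop the echo only once
# the superblock oddity is resolved.
for cmd in \
  'mdadm /dev/md0 --fail /dev/sdh1' \
  'mdadm /dev/md0 --remove /dev/sdh1' \
  'mdadm /dev/md0 --add /dev/sdh1'; do
  echo "$cmd"
done
```

In practice the --add step happens after physically swapping the drive and partitioning the replacement to match the other members.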


You need to update mdadm to at least version 3.1.1. This bug describes the problem you're seeing; once mdadm is updated, the version-1.x superblock format is interpreted correctly and --examine stops reporting the phantom "failed" slot.
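As a quick check of whether an installed mdadm predates the fix, dotted version strings can be compared with sort -V. A sketch using the versions from the question:

```shell
# Compare the reporter's mdadm version (2.6.7.1) against the 3.1.1
# release said to fix the superblock parsing. sort -V orders dotted
# version strings numerically, so the older one sorts first.
have=2.6.7.1
need=3.1.1
oldest=$(printf '%s\n%s\n' "$have" "$need" | sort -V | head -n 1)
if [ "$oldest" = "$have" ]; then
  echo "mdadm $have predates $need - upgrade needed"
fi
```

On a live system, `have` would come from `mdadm --version` instead of being hard-coded.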