I've got a Smart Array E200i which seems to have some bad slots (slot 3 and slot 5). No matter which HD I put in these slots, it keeps telling me the drive is bad.

My question is two-fold:

  1. Is it perhaps just something I'm doing wrong? I was under the impression that all you had to do was take out the bad drive and put in the new. Am I mistaken on this?

  2. If the slots are indeed bad, can I replace the whole controller (which contains the boot drive of the OS on a RAID 5) without losing access to this drive's data after the swap?

On a slightly separate issue, this Smart Array has 2 SATA arrays, one with the OS and one which I believe is no longer in use. If the slots really are bad, I would like to delete the second array and use its slots for the first array, but I'm not sure how to be 100% sure that it's not in use by one of the logical drives. My reputation isn't high enough to post an image, so I've laid out the configuration as displayed in the Array Configuration Utility in text form as best I could:

Smart array E200i in embedded slot
    SATA Array A
         [+] Logical Drive 1 (953816 MB, RAID5)
         [ ] Unused Space, ??? 
    SATA Array B
         [X] Logical Drive 2 (1907675 MB, RAID 5) - Failed
         [ ] Unused Space, ???

UPDATE:

Response from hpacucli:

Smart Array E200i in Slot 0          (sn: QT91MP3908     )

   array A (SATA, Unused Space: 0 MB)

      logicaldrive 1 (931.5 GB, RAID 5, Interim Recovery Mode)

      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 500 GB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SATA, 500 GB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SATA, 1000.2 GB, Failed)

   array B (SATA, Unused Space: 0 MB)

      logicaldrive 2 (1.8 TB, RAID 5, Failed)

      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SATA, 2000.3 GB, OK)
      physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SATA, 0 MB, Failed)
      physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SATA, 2000.3 GB, OK)

Here's a list of things to check.

  • Are these HP disks? Are they generic disks installed in HP drive carriers? If the 500GB drive that appears as 1TB is HP-branded, that may be expected: HP replaces older, smaller-capacity disks with larger ones, depending on product availability, and 500GB SATA disks aren't made anymore.

  • It is unlikely that your controller is bad or needs replacement. Plus, being an E200i, it's on your motherboard, so it's not easily replaced. Your disks are connected to a backplane, so a bad slot == a bad backplane, not controller.

  • Check your firmware. Old versions of HP server BIOS and RAID controller firmware have bugs. If you're on a particularly old version of the Smart Array E200i firmware, things may not work as expected.

Here's the firmware page for your server/OS combination.

  • If you have backups of your data and can afford downtime, shut the machine down and remove power. Power on and follow the prompts at the BIOS array screen. If prompted, choose the F2 option to "reenable all logical drives".

  • You can use the hpacucli tool to view the progress of RAID rebuilds as well as the point-in-time disk status.

  • Smart Array controllers will perform recovery operations on one logical drive at a time. In the case of the output you've presented, "logicaldrive 2" has completely failed. Is there a chance you removed the wrong disk at some point and booted the system without heeding any of the BIOS prompts?

  • An additional complication here is RAID5 on SATA disks. There's a chance your "logicaldrive 1" has unrecoverable read errors (URE). If you see the status of "Waiting for Rebuild" after replacing a disk, the array may not be recoverable.
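
To make the point-in-time status check repeatable, here is a minimal sketch that parses configuration text in the same format as the hpacucli output you pasted and lists everything that is not "OK". The function names (`parse_config`, `problems`) are made up for illustration, and the parser assumes the status is always the last comma-separated field inside the parentheses, which holds for your output but may not for every hpacucli version.

```python
import re

# Sample text copied from the hpacucli output quoted in the question.
# On a live system you would capture this from `hpacucli ctrl all show config`.
SAMPLE = """\
   array A (SATA, Unused Space: 0 MB)

      logicaldrive 1 (931.5 GB, RAID 5, Interim Recovery Mode)

      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SATA, 500 GB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SATA, 500 GB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SATA, 1000.2 GB, Failed)

   array B (SATA, Unused Space: 0 MB)

      logicaldrive 2 (1.8 TB, RAID 5, Failed)

      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SATA, 2000.3 GB, OK)
      physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SATA, 0 MB, Failed)
      physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SATA, 2000.3 GB, OK)
"""

# Matches lines such as "physicaldrive 1I:1:3 (port ..., Failed)".
LINE = re.compile(r"^\s*(array|logicaldrive|physicaldrive)\s+(\S+)\s+\((.*)\)\s*$")


def parse_config(text):
    """Group logical and physical drives by array, keeping only their status."""
    arrays = {}
    current = None
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # blank lines, controller header, anything unrecognised
        kind, ident, fields = m.groups()
        if kind == "array":
            current = arrays.setdefault(ident, {"logical": {}, "physical": {}})
        elif current is not None:
            # Assumption: the status is the last comma-separated field.
            status = fields.rsplit(",", 1)[-1].strip()
            key = "logical" if kind == "logicaldrive" else "physical"
            current[key][ident] = status
    return arrays


def problems(arrays):
    """Return (array, kind, id, status) for every drive whose status is not OK."""
    found = []
    for name, members in arrays.items():
        for kind in ("logical", "physical"):
            for ident, status in members[kind].items():
                if status != "OK":
                    found.append((name, kind, ident, status))
    return found


if __name__ == "__main__":
    for name, kind, ident, status in problems(parse_config(SAMPLE)):
        print(f"array {name}: {kind} drive {ident} is {status}")
```

Run against your output, this flags logicaldrive 1 (Interim Recovery Mode) and bay 3 in array A, and logicaldrive 2 and bay 5 in array B. Re-running it after a disk swap shows whether the logical drive has moved on to a rebuild state.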