HP Proliant DL180 G6 - Smart Array P410, bay 11 error

I've been searching the internet for a solution to this. I'm the new IT person for my organization, and our previous IT staff did not keep records on certain things. I understand that this is bad practice, so I'm now documenting everything for future reference.

Having said that, I recently came across an issue on our server. We're using an HP ProLiant DL180 Gen6 server with ESXi 5.0 ... The issue is that I'm unable to power up certain VMs because they give an I/O error. The error is shown below:

Reason: 0 (Input/output error). Cannot open the disk '/vmfs/volumes/4e7a4edb-08851e40-0c1e-1cc1de700f23/EON-GATEWAY (192.168.0.1 )/EON-GATEWAY ( 192.168.0.1 )-000001.vmdk' or one of the snapshot disks it depends on.

So I powered down all the VMs and restarted the host to get into the BIOS and look at the RAID. I do not know what type of RAID the server is using; the screen shows something like:

Error on SLOT1 : bay 11 -- (as I remember)

Is there a way for me to check exactly what the issue is? I can see that the affected hard disk still flashes a green LED. Out of 12 bays, bay 1 shows an orange LED and bay 4 shows nothing at all.

I'm pretty confused about how to get this sorted. Can anyone guide me on exactly what I need to do, or give me a hint on how to check the RAID / array info?

Update

The images below are from the Smart Array controller...

enter image description here

Here's a video link to the server HDDs. I'm still curious, as bay 1 now flashes blue and amber while the other bays are blue (on the Smart Array screen, as seen above).


Solution 1:

This could be a VMware issue or a locking problem on the virtual disk. Can you capture the full error message? Do other virtual machines power on without problems?

Despite that, it appears you have a physical storage issue, too.

Here's what the HP Smart Array P410 configuration output on a DL180 G6 looks like:

      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 2 TB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 2 TB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 2 TB, OK)
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 2 TB, OK)
      physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS, 2 TB, OK)
      physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS, 2 TB, OK)

Are you sure that you're not mistaking the drive designation of 1I:1:1, which means (port 1I:box 1:bay 1) for "SLOT1 : Bay 11"? That would explain the amber/orange light in the first drive bay.
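To make the port:box:bay designation concrete, here is a minimal Python sketch that parses lines in the format shown above and lists any drive whose status is not OK. The names `PhysicalDrive` and `parse_drives` are hypothetical helpers, and the sketch assumes the status is always the last comma-separated field inside the parentheses, as it is in the sample.

```python
import re
from typing import List, NamedTuple


class PhysicalDrive(NamedTuple):
    port: str    # e.g. "1I"
    box: int     # e.g. 1
    bay: int     # e.g. 1
    status: str  # last field inside the parentheses, e.g. "OK"


# Matches lines such as:
#   physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 2 TB, OK)
DRIVE_RE = re.compile(
    r"physicaldrive\s+(?P<port>\w+):(?P<box>\d+):(?P<bay>\d+)\s+"
    r"\((?P<details>[^)]*)\)"
)


def parse_drives(text: str) -> List[PhysicalDrive]:
    """Extract port, box, bay and status from each physicaldrive line."""
    drives = []
    for match in DRIVE_RE.finditer(text):
        # Assumption: the status is the last comma-separated field.
        status = match.group("details").split(",")[-1].strip()
        drives.append(
            PhysicalDrive(
                match.group("port"),
                int(match.group("box")),
                int(match.group("bay")),
                status,
            )
        )
    return drives


# The six sample lines from this answer.
SAMPLE = """\
      physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 2 TB, OK)
      physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 2 TB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 2 TB, OK)
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS, 2 TB, OK)
      physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS, 2 TB, OK)
      physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS, 2 TB, OK)
"""

drives = parse_drives(SAMPLE)
not_ok = [d for d in drives if d.status != "OK"]
for d in not_ok:
    print(f"Check port {d.port}, box {d.box}, bay {d.bay}: {d.status}")
```

Read this way, 1I:1:1 is port 1I, box 1, bay 1, which is a different thing from a "bay 11".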

Given that this server was not documented well, there's a high probability that it was also configured with RAID 5 (a mean assumption, but probably a correct one).

  • Does the server boot?
  • What error messages do you see at POST?
  • Do you have to press any keys on the keyboard to allow the system to boot? (e.g. F1)
  • What capacity and type of disks are installed in the server?

If the server is on, you can view the RAID configuration from within ESXi. Do this by navigating to: Hardware Status > Sensors > Storage.

If your ESXi was installed using an HP-specific VMware image, you will see the RAID configuration there.

If you don't see anything inside of VMware, you will need to reboot and view the RAID configuration at the BIOS level.

When the system is powered on, you want to hit the F8 key when prompted to enter the Smart Array P410 configuration utility.

Once inside, select "View Logical Drives".

This will show you the RAID health status and you can hit Enter for details. This will tell you conclusively which disks are good/bad/missing in the array.

Solution 2:

I may be wrong but I think you have two problems here.

Yes, you appear to have a physical disk issue. If you can afford the downtime, boot from the HP SPP/ACU image, go into ACU, run the diagnostics, and replace parts as needed.

The first error, however, suggests the datastore is IP-based, i.e. NFS or iSCSI, rather than a local SAS/SATA disk such as the ones you're having real problems with. Have you got other IP-based datastores? If so, I'd look at where they're hosted and check whether something has been switched off or deleted.
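One way to test this reading of the error is to see which part of the path actually contains the IP address. The Python sketch below splits the path from the question and reports the components that contain an IPv4-looking string. It assumes the layout /vmfs/volumes/<datastore>/<VM folder>/<file>; the helper names `split_vmdk_path` and `components_with_ip` are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Dict, List

# Loose IPv4-looking pattern; good enough for spotting an address in a name.
IPV4_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

# Path copied from the error message in the question.
ERROR_PATH = (
    "/vmfs/volumes/4e7a4edb-08851e40-0c1e-1cc1de700f23/"
    "EON-GATEWAY (192.168.0.1 )/EON-GATEWAY ( 192.168.0.1 )-000001.vmdk"
)


def split_vmdk_path(path: str) -> Dict[str, str]:
    """Split a /vmfs/volumes/... path into datastore, VM folder and file."""
    parts = PurePosixPath(path).parts
    # Expected: ('/', 'vmfs', 'volumes', <datastore>, <vm folder>, ..., <file>)
    if len(parts) < 6 or parts[1:3] != ("vmfs", "volumes"):
        raise ValueError(f"not a /vmfs/volumes path: {path!r}")
    return {"datastore": parts[3], "vm_folder": parts[4], "file": parts[-1]}


def components_with_ip(path: str) -> List[str]:
    """Return the names of the path components that contain an IPv4 string."""
    return [
        name
        for name, value in split_vmdk_path(path).items()
        if IPV4_RE.search(value)
    ]


print(components_with_ip(ERROR_PATH))
```

If the address shows up only in the VM folder and file name, it may simply be part of the VM's name rather than a sign of a network datastore, so it is worth confirming the datastore type in the vSphere client before chasing NFS or iSCSI.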

Solution 3:

If you're lucky, your predecessor installed a proper HP image of ESXi on the box, in which case you should be able to access the HP System Management Homepage remotely:

https://ipofyourserver:2381

This should be able to tell you a little bit more about the general health of the server (which also includes the arrays).
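Before opening a browser, you can check whether anything is listening on that port at all. This is a minimal Python sketch using only the standard library; `smh_url` and `port_is_open` are hypothetical helper names, and an open port only shows that something answers on 2381, not that the management agents are healthy.

```python
import socket

SMH_PORT = 2381  # port used in the URL above


def smh_url(host: str) -> str:
    """Build the System Management Homepage URL for a host."""
    return f"https://{host}:{SMH_PORT}"


def port_is_open(host: str, port: int = SMH_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `port_is_open("ipofyourserver")` returning False would suggest the HP agents are not installed or not reachable, in which case the reboot route described next is the fallback.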

If not, you should reboot the server and hit F8 after the P410i controller is done initializing. That will get you into ORCA (Option ROM Configuration for Arrays). Select "Show logical drives". This should give you a list of the local logical drives, and will also say whether the array(s) are healthy or not. Note that you might have to "press any key" to actually see the P410i initialization messages after the HP logo has appeared.

One last thing: I've seen on several occasions that something goes haywire in the internal workings of the storage box in the server, which will either render the drive LEDs mute (off) or scramble them, so that a healthy drive blinks amber instead of green. Just a fair warning not to take the drive activity LEDs too seriously :)