3rd party SSD drives in HP Proliant server - monitoring drive health
As discussed in a previous question, we have 6 OWC Mercury Extreme SATA SSD drives installed in our HP Proliant DL360 G7 server (using a P410i RAID controller). They work great, and are very fast. However, I'm aware that SSD drives unfortunately don't last forever, and the HP ACU utility, not surprisingly, won't monitor the health of any of the drives.
Does anyone know of any Windows (Server 2008R2) software or utilities that will allow monitoring of the health of each individual drive in the array, so that we can proactively pick up on any potential issues?
You can use smartctl to peek at individual drives behind a cciss RAID controller like so:
smartctl -a -l ssd /dev/sda -d cciss,1
or:
smartctl -a -l ssd /dev/sda -d sat+cciss,1
(you may need to remove -l ssd if your smartctl is too old)
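With six drives behind the controller, you would repeat that command once per drive index. Here is a minimal sketch that only builds the command lines; the function name, the /dev/sda default and the assumption that the six drives sit at indices 0 to 5 are mine, so adjust them to your setup:

```python
def smartctl_commands(device="/dev/sda", drive_count=6, sata=True, ssd_log=True):
    """Build one smartctl argument list per physical drive index.

    Assumptions (not from the original answer): drives are numbered
    0 .. drive_count - 1 on the controller, and `device` is the block
    device that the controller exposes.
    """
    commands = []
    for index in range(drive_count):
        # SATA disks behind the controller use the sat+cciss form,
        # SAS/SCSI disks use plain cciss.
        dev_type = f"sat+cciss,{index}" if sata else f"cciss,{index}"
        args = ["smartctl", "-a"]
        if ssd_log:
            # Drop this pair if your smartctl is too old to know "-l ssd".
            args += ["-l", "ssd"]
        args += ["-d", dev_type, device]
        commands.append(args)
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for args in smartctl_commands():
        print(" ".join(args))
```

Each list can be passed as-is to subprocess.run on a machine where smartctl is installed, if you want to collect the output from a scheduled job.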
Don't bother... Really.
You have an enterprise server with an enterprise RAID controller and hot-swappable drives (with a 5-year warranty), presumably in a RAID 1+0 setup. Do you care why a drive fails beyond the fact that it fails? I don't. I wouldn't care why a spinning disk died either (S.M.A.R.T. errors, bearing failure, overheating, etc.)
High-end (SAS) HP Solid State drives do provide some additional health information. But if you're using RAID and know where to get a spare, I don't think this information is tremendously helpful. You get temperature readings and an "Estimated Life Remaining" figure.
That is all.
physicaldrive 1I:1:4
   Port: 1I
   Box: 1
   Bay: 4
   Status: OK
   Drive Type: Unassigned Drive
   Interface Type: Solid State SAS
   Size: 400 GB
   Firmware Revision: HPD9
   Serial Number: 00197356
   Model: HP MO0400FBRWC
   Current Temperature (C): 29
   Maximum Temperature (C): 43
   Usage remaining: 99.57%
   Power On Hours: 6418
   Estimated Life Remaining based on workload to date: 61922 days
   SSD Smart Trip Wearout: False
   PHY Count: 2
   PHY Transfer Rate: 6.0Gbps, Unknown
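If you do want to keep an eye on those few fields, a report in this "Key: value" layout is easy to parse. A rough sketch follows; the function names and the 10% usage threshold are hypothetical, and it assumes you have already captured the report text by whatever means you use:

```python
SAMPLE = """\
physicaldrive 1I:1:4
Status: OK
Current Temperature (C): 29
Maximum Temperature (C): 43
Usage remaining: 99.57%
Estimated Life Remaining based on workload to date: 61922 days
SSD Smart Trip Wearout: False
"""


def parse_drive_report(text):
    """Turn 'Key: value' lines into a dict; the header line becomes 'id'."""
    fields = {}
    for line in text.splitlines():
        stripped = line.strip()
        key, sep, value = stripped.partition(": ")
        if sep:
            fields[key] = value.strip()
        else:
            # "physicaldrive 1I:1:4" has no ": " separator, so it lands here.
            parts = stripped.split()
            if len(parts) == 2 and parts[0] == "physicaldrive":
                fields["id"] = parts[1]
    return fields


def health_warnings(fields, min_usage_pct=10.0):
    """Return a list of warning strings; min_usage_pct is an arbitrary example."""
    warnings = []
    if fields.get("Status") != "OK":
        warnings.append("status is " + str(fields.get("Status")))
    usage = fields.get("Usage remaining")
    if usage is not None and float(usage.rstrip("%")) < min_usage_pct:
        warnings.append("usage remaining is " + usage)
    if fields.get("SSD Smart Trip Wearout", "False") != "False":
        warnings.append("wearout trip flag is set")
    return warnings


if __name__ == "__main__":
    drive = parse_drive_report(SAMPLE)
    print(drive.get("id"), health_warnings(drive) or "no warnings")
```

Splitting on ": " rather than on ":" keeps the drive address on the header line in one piece.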