Replacing Hard Drives [closed]

I was wondering if it is a good idea to replace a hard drive in a (fairly) system-critical database server after a certain number of years of use, before it dies.

For example, I was thinking of replacing a hard drive after 3 years of use. Since I have many hard drives across servers, I could stagger which hard drives are replaced.

Is this a good idea, or do people just wait for the failure?


Google did a study on disk drives and found very little correlation between disk age and failure. SMART data was not a reliable predictor either: many drives failed without any prior SMART warning.

My local observations (>500 servers) are similar. I have had new disks fail quickly while old ones still chug along.

My general rule is that if we see disk issues (SMART or system errors), we replace the disk immediately. If not, the drives get cycled out when the server does.

Google study: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/disk_failures.pdf


No.

One of the biggest problems with replacing a hard drive on an active production server is that doing so will trigger a rebuild. Especially if you are using RAID5, and especially if you are using large drives, forcing a rebuild creates a very significant risk of an unrecoverable failure. The risk of losing the array during a rebuild is far greater than the risk involved in leaving a 3-year-old drive in place.

Taking an extreme example, if you successively replace every disk in a 6-disk RAID5 array made up of 2TB disks, your theoretical risk of an unrecoverable read error during one of the rebuilds is in the neighborhood of 58% (according to my napkin math; please do your own and compare notes). In other words: your "preventive" disk replacement is, in effect, nothing less than an act of sabotage.
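For anyone who wants to compare notes, here is one way the napkin math could be redone. This is a rough sketch, not necessarily the calculation used above. It assumes an unrecoverable read error (URE) rate of 1 per 10^14 bits read, a figure commonly quoted on consumer drive spec sheets but not stated in this answer. It also assumes errors are independent and that a rebuild reads every bit of the five surviving disks once. The function names are made up for illustration.

```python
import math

def rebuild_ure_probability(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Chance of at least one URE while reading every surviving disk once.

    ure_per_bit is an ASSUMED spec-sheet rate (1 error per 1e14 bits),
    not a number taken from the answer above.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

def any_rebuild_ure_probability(per_rebuild, rebuilds):
    """Chance that at least one of several independent rebuilds hits a URE."""
    return 1.0 - (1.0 - per_rebuild) ** rebuilds

# 6-disk RAID5: each rebuild reads the 5 surviving disks in full.
p_decimal = rebuild_ure_probability(5, 2 * 10**12)  # 2 TB  = 2 * 10^12 bytes
p_binary = rebuild_ure_probability(5, 2 * 2**40)    # 2 TiB = 2 * 2^40 bytes
p_all_six = any_rebuild_ure_probability(p_decimal, 6)

print(f"single rebuild, 2 TB disks:  {p_decimal:.1%}")
print(f"single rebuild, 2 TiB disks: {p_binary:.1%}")
print(f"any of six rebuilds (2 TB):  {p_all_six:.1%}")
```

Under these assumptions a single rebuild comes out at roughly 55% (decimal terabytes) to 58% (binary tebibytes), and the chance of hitting at least one such error somewhere across all six successive rebuilds is above 99%. Drives with a better rated error rate would lower these numbers considerably, so treat the result as an order-of-magnitude argument rather than a prediction.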

The only time I would consider refreshing drives in an old server would be in the course of "refurbishing" it, e.g. after it has been decommissioned from one task and before it is put back into service in a new role. Even at that point, capacity and performance requirements would be far more important than the age of the drives.