Hard drive read errors that... stop?

Solution 1:

If one specific physical region of the drive surface goes bad, then until those sectors can be successfully mapped out, you'll get unrecovered read errors whenever you try to read data that was written to that area. The drive knows the sectors are bad (from the failed attempts to access them) but cannot remap them on its own, because they still hold data that it cannot recover to copy into the spares. If you format the drive or overwrite the "bad" sectors, the drive gets an opportunity to map them out.
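If you want to nudge that along for a specific sector rather than reformatting the whole drive, something like the following works on a typical SATA drive; /dev/sdX and the LBA 123456789 are placeholders for your device and the sector reported in the kernel log, and the write step destroys whatever was stored in that sector:

    # Sectors waiting to be remapped, and sectors already remapped
    smartctl -A /dev/sdX | grep -i -e Current_Pending_Sector -e Reallocated_Sector_Ct

    # Confirm the suspect sector really is unreadable (a pending sector returns an I/O error here)
    hdparm --read-sector 123456789 /dev/sdX

    # Overwrite just that sector with zeroes, giving the firmware a chance to remap it
    hdparm --write-sector 123456789 --yes-i-know-what-i-am-doing /dev/sdX

    # Current_Pending_Sector should drop; Reallocated_Sector_Ct rises if a spare was actually used
    smartctl -A /dev/sdX | grep -i -e Current_Pending_Sector -e Reallocated_Sector_Ct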

Once the bad sectors are mapped out, and as long as more of the drive surface does not fail, you're in good shape.

I don't know enough about the failure modes of current drives to say whether one part of the media surface going bad is correlated with the problem spreading or recurring. If there is no correlation, then once the bad sectors are mapped out, you're in good shape. If there is a correlation, then this is the beginning of the end for the drive.

Solution 2:

Most modern drives will "vector out" a block that has gone bad. The drive has a pool of spare blocks, and the firmware uses these to replace any blocks it knows to be bad. The drive cannot do this remapping when it fails to READ a block, because it cannot supply the correct data; it just returns "read error". It does MARK the block as bad, so if the block ever does read correctly, it is vectored out and the correct data is written to the replacement block. If the OS ever WRITES to a block that is in this "vector out pending" state, the block is vectored out and the data is written to the replacement block.
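On most SATA drives you can watch this bookkeeping through SMART. The attribute names below are the common ones; the output is trimmed and purely illustrative, not from any particular drive:

    smartctl -A /dev/sdX
    ID# ATTRIBUTE_NAME          ... RAW_VALUE
      5 Reallocated_Sector_Ct   ... 0    <- blocks already vectored out to spares
    197 Current_Pending_Sector  ... 8    <- blocks marked bad, waiting for a good read or a write
    198 Offline_Uncorrectable   ... 8    <- blocks the drive's offline scan could not read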

Linux software RAID will, on getting a read error from a device, fetch the correct data from the other elements in the array and then try to WRITE the bad block again. If the write succeeds, the data is safe; if not, the drive does exactly what is described above: it vectors out the block and then performs the write to the replacement. So, with the help of the RAID layer, the drive has just repaired itself!
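You can exercise that rewrite path on purpose by scrubbing the array, which reads every block and lets md fix anything that fails. A sketch, assuming the array is /dev/md0:

    # Read every block in the array; a member that returns a read error gets
    # the reconstructed data written back to it
    echo check > /sys/block/md0/md/sync_action

    # Watch the scrub's progress
    cat /proc/mdstat

    # After it finishes, this counts blocks whose copies disagreed between members
    cat /sys/block/md0/md/mismatch_cnt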

Assuming such events are reasonably rare, it is probably safe to carry on. If too many replacement blocks are being used, the drive may have a problem. There is a limit to how many bad blocks can be vectored out to spare blocks, and that limit varies from drive to drive.
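A rough way to see how hard the spare pool is being hit is to watch the normalized value of the reallocation attribute against its threshold; again, /dev/sdX is a placeholder:

    # The normalized VALUE (typically starting at 100 or 200, depending on the vendor)
    # falls as spares are consumed; at or below THRESH the drive flags itself as failing
    smartctl -A /dev/sdX | grep Reallocated_Sector_Ct

    # The drive's overall self-assessment
    smartctl -H /dev/sdX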

Solution 3:

Yes, I have seen this as well, and under very similar circumstances. In my case, it was a "consumer-grade" Western Digital 1TB "Green" drive (WD10EARS) that pulled that stunt on me. The SMART Current_Pending_Sector raw value went from zero to 6, and then to 8, prompting the SMART monitoring daemon to send me some ominous emails.
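For anyone who wants the same kind of early warning, a single line in /etc/smartd.conf is enough; the mail address is a placeholder, and -M test just sends a test message at startup so you know mail delivery works:

    # Watch SMART health, attribute changes (including pending/uncorrectable sectors),
    # and the error logs on every detected drive, and send mail when something looks wrong
    DEVICESCAN -a -m admin@example.com -M test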

I used mdadm to --fail and --remove the drive from the array and ran a non-destructive pass of badblocks over it, and yes, there were apparently over two dozen bad blocks. When the replacement drive arrived about a day later, I fixed the array, and life went on.
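For reference, the dance went roughly like this (array and device names are placeholders):

    # Kick the suspect member out of the array
    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1

    # Non-destructive read-write test of the bare drive; prints any bad blocks it finds
    badblocks -nsv /dev/sdb

    # With the replacement drive installed and partitioned, add it and let md rebuild
    mdadm /dev/md0 --add /dev/sdb1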

Shortly thereafter, out of sheer boredom, I reran badblocks on the "failed" drive to see if it had worsened. On the contrary, the drive had completely "repaired" itself: zero bad blocks! Shaking my head, I wiped it and set it aside to be recycled or donated.

The lesson: Don't use consumer-grade drives in servers unless you are willing and able to put up with all manner of weirdness and unreliability. Corollary: Don't cheap out on server components, because you'll end up paying for it eventually anyway, in extra time and aggravation.