How are SMART selftests related to badblocks?
I have to disagree with voretaq7 — SMART is not magic. When one of a drive's sectors goes bad, you will no longer be able to read data from it, so it is perfectly possible to have an unreadable file on a modern disk drive. SMART marks such an unreadable sector as "Current Pending" and "Offline Uncorrectable" the first time it is accessed after the failure.
But when that sector is written to again, it is remapped into the drive's spare area, the marks are cleared, and the "Reallocated_Sector_Ct" counter increases. The whole drive then becomes readable again.
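A quick way to watch these counters is smartctl from smartmontools (a sketch; /dev/sda is just an example device):

# raw values of the relevant attributes
smartctl -A /dev/sda | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'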
The smartctl -t long test is useful — it scans the whole drive surface for unreadable sectors and logs and marks as "Current Pending" and "Offline Uncorrectable" the first bad sector it encounters on each run. I configure my servers to run this long test once per week on every drive. It does not affect normal drive operation much, as OS requests always take priority over SMART scans.
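One way to automate the weekly run (a sketch; the exact scheduling mechanism isn't part of this answer) is a smartd.conf entry, or a cron job calling smartctl directly:

# /etc/smartd.conf: monitor /dev/sda and start a long self-test every Sunday at 03:00
/dev/sda -a -s L/../../7/03

# or from cron: kick off the test, then check its outcome later
smartctl -t long /dev/sda
smartctl -l selftest /dev/sda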
On my servers I always run disks in RAID1 mirrors, so when a long test finds a bad sector I can rewrite its contents using the data from the other drive in the mirror, forcing a reallocation.
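If the mirror is Linux md software RAID (an assumption; the answer doesn't name the RAID implementation), a repair pass does that rewrite for you: md reads every sector and, on a read error, rewrites it from the healthy copy, forcing the drive to reallocate.

# md0 is a placeholder for your array
echo repair > /sys/block/md0/md/sync_action
cat /proc/mdstat    # watch progress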
badblocks is also useful sometimes — for example, it will test the whole drive and won't stop at the first error. It can test a single partition or any other part of a drive, so you can use it to quickly check whether a bad block was successfully reallocated.
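badblocks takes optional last-block and first-block arguments, so re-checking the area around a previously reported sector can be quick (a read-only sketch; the device and block numbers are made up):

# read-only scan of blocks 1000000..1000200 on /dev/sdb
badblocks -sv /dev/sdb 1000200 1000000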
Like I pointed out in my other answer, every modern hard drive has remapping space available (because especially at today's disk densities, no drive platter will be perfect - there will always be a few defects that the drive has to remap around, even on brand-new-never-been-used-came-off-the-assembly-line-into-my-hands drives).
Because of this, theoretically you should have a SMART failure reported before something like badblocks notices (end-user-visible) bad sectors on a drive. On modern hard disks, any end-user-visible bad sectors (as might be reported by badblocks or automatically detected by the OS) are the final gasp and shudder of a dying disk.
Ultimately SMART and badblocks test two different, but related, things:
SMART is a self-monitoring tool:
The hard drive knows some information about its operating parameters, and has some meta-knowledge as to what is "normal" for some, and "acceptable" for others.
If the drive senses that certain parameters are "abnormal" or "unacceptable" it will report a pre-failure condition -- in other words the drive is still functional, but might fail soon.
For example: The spindle motor normally draws 0.10 amps, but now it's drawing 0.50 amps -- an abnormally high draw that may indicate the shaft is binding or the permanent lubricant on the bearings is gone. Eventually the motor will be unable to overcome the resistance and the drive will seize.
Another example: The drive has 1000 "remap" blocks to deal with bad sectors. It has used 750 of them, and the engineers that built the drive determined that number of remaps indicates something internally wrong (bad platter, old-age failure, damaged head) - the drive will report a pre-failure condition allowing you time to get your data off before the remap space runs out and bad sectors become visible.
SMART is looking for more than bad sectors - it's a more comprehensive assessment of the drive's health. You could have a SMART pre-failure warning on a drive with no bad sectors and no read/write errors (for example, the spindle motor issue I described above).
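To see the drive's own pass/fail verdict and which attributes are flagged as pre-fail (a sketch; /dev/sda is just an example device):

# overall health self-assessment (PASSED or FAILED)
smartctl -H /dev/sda
# full attribute table; "Pre-fail" rows whose VALUE falls to THRESH trigger the warning
smartctl -A /dev/sda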
badblocks is a tool with a specific (outdated) purpose: finding bad sectors.
badblocks comes from a time before SMART and bad-sector remapping. Back then we knew drives had imperfections, but the only way to map them out, to prevent accidentally storing data there, was to stress-test the disk, cause a failure, and then remember not to put data there ever again.
I say it is outdated because the electronics on modern drives already do what badblocks does, internally and a few thousand times faster. badblocks basically allows ancient drives that lack sophisticated electronics to re-map (or skip over) sectors that have failed, but modern hard drives already detect failed sectors and remap them for you.
Theoretically you could use badblocks data to have the OS remap (visible) failures as if your modern disk were an ancient Winchester disk, but that is ultimately counterproductive -- like I said previously, ANY bad sectors detected with badblocks on a modern drive are a reason to discard the entire drive as defective (or about to fail).
Visible bad sectors indicate that the drive is out of remapping space, which is relatively rare for modern disks unless they're old (nearing end of functional life) or defective (bad platters/heads from the factory).
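If you do want the old OS-level behaviour anyway, the classic route still exists (a sketch, assuming an ext2/3/4 filesystem on /dev/sdb1): e2fsck can run badblocks itself and record any hits in the filesystem's bad-block list so those blocks are never allocated.

# read-only badblocks scan driven by e2fsck; found blocks go into the bad-block inode
e2fsck -c /dev/sdb1
# (mke2fs -c performs the same check when creating a new filesystem)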
So basically, if running badblocks on a disk before you deploy it in production makes you feel better, go ahead and do it, but if your disk was manufactured in this century and it shows a visible bad sector, you should chuck it in the trash (or call in its warranty).
For my money, SMART status and defense in depth are a better use of my time than manually checking disks.
Good answers to this question are:
https://superuser.com/a/693065
https://superuser.com/a/693064
Contrary to other answers, I find badblocks not outdated but a very useful tool. Once I upgraded my PC with a new hard drive and it started running unstably. It took me quite a while to realize, with the help of badblocks, that the disk surface had defects. Since then I run a full write-mode (destructive!) badblocks pass on every new hard drive before I start using it, and I have never had that problem again. I highly recommend a
time sudo badblocks -swvo sdX.log /dev/sdX
for every new hard drive. It will write and read back every single block of the disk a few times and so can save you a lot of trouble later.
During this test, bad blocks will be remapped by the drive itself. So note the "Reallocated_Sector_Ct" value before and after the test and compare it with the SMART threshold, since the change tells you something about the health of the drive.
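A minimal before/after comparison could look like this (a sketch; sdX is the same placeholder as above):

smartctl -A /dev/sdX | grep Reallocated_Sector_Ct    # note RAW_VALUE before
time sudo badblocks -swvo sdX.log /dev/sdX
smartctl -A /dev/sdX | grep Reallocated_Sector_Ct    # compare afterwards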