SMART-Test never finishes

When I run SMART self-tests with smartmontools, they never finish. I always get "Interrupted (host reset)" on various systems and disks, including Debian on x86 and ARM and OS X on x64, with both external and internal drives. This happens even in captive mode with the disks completely empty (zeroed with dd).

What am I doing wrong?


Solution 1:

When the drive does not handle any input/output activity during the test, it may go to standby, which raises the Interrupted (host reset) condition. Try to read from the disk at suitable intervals:

while true; do dd if=/dev/disk1 of=/dev/null count=1; sleep 60; done

(Replace /dev/disk1 with the appropriate device. This reads one sector from that device every 60 seconds until you press Ctrl-C.)

This helped in my environment: OS X 10.6.8, WD Elements USB-connected drive, SAT-SMART-driver 0.8.
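If the periodic read is served from the operating system's cache instead of the drive, it will not keep the drive awake. The following is a minimal sketch of the same loop with an uncached read. It assumes GNU dd on Linux, where iflag=direct opens the device with O_DIRECT; the function names and /dev/sdX are placeholders, and it has not been tried on the OS X setup described above.

```shell
# Read one 4096-byte block straight from the device, bypassing the page cache.
# Assumes GNU dd on Linux (iflag=direct); read_uncached is an illustrative name.
read_uncached() {
    dd if="$1" of=/dev/null bs=4096 count=1 iflag=direct 2>/dev/null
}

# Repeat the uncached read every INTERVAL seconds (default 60).
# Stops on Ctrl-C or as soon as a read fails.
keep_reading() {
    while read_uncached "$1"; do
        sleep "${2:-60}"
    done
}
```

Run it as root with something like keep_reading /dev/sdX 60. The 4096-byte block size is chosen so the read stays aligned on drives with either 512-byte or 4096-byte sectors.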

A captive test should theoretically keep the drive online. Yet the hardware command sent by smartctl may time out before the test completes, causing the kernel to reset the link and ending up in the same situation as above (bug #303).

See this thread on the smartmontools-support mailing list for further details. I acknowledge Christian Franke for the insight given here.

Solution 2:

I tried the solution from Tobu, but in my case I still kept finding the external USB drive in sleep mode some time after starting the test, which interrupted it. It seems dd ended up reading from a kernel cache, so the reads never reached the disk and it was able to enter sleep mode. I noticed that calling smartctl to ask for status was always able to "wake up" the disk, so this version of the same idea did the trick for me:

sudo bash -c 'while true; do smartctl -a /dev/sdb > /dev/null; sleep 60; done'

After 5 hours the external USB disk was still spinning. For the first time I could see a smartctl long test finish on an external disk.

I believe this solution also has the advantage that the disk heads are not moved unnecessarily every minute. The long test finished almost exactly in the predicted time (the keep-awake script did not add time to the run).
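The keep-awake loop can also stop by itself once the test is over. This is only a sketch: it assumes an ATA drive for which smartctl -a prints "Self-test routine in progress" in its self-test execution status while a test runs (other drives or bridges may word this differently), and the function names are made up for illustration.

```shell
# Succeeds if the smartctl output on stdin shows a self-test in progress.
# Assumption: ATA drives report "Self-test routine in progress" while testing.
self_test_running() {
    grep -q 'Self-test routine in progress'
}

# Query the drive every INTERVAL seconds (default 60) for as long as the
# self-test is running. Each query doubles as the keep-awake access.
keep_awake() {
    device=$1
    interval=${2:-60}
    while smartctl -a "$device" | self_test_running; do
        sleep "$interval"
    done
}
```

As root, start the test with smartctl -t long /dev/sdb, then call keep_awake /dev/sdb 60. When the function returns, smartctl -l selftest /dev/sdb shows whether the test completed or was interrupted.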