How to check the physical health of a USB stick in Linux?
There is no way to query a USB memory stick for SMART-like parameters;
I'm not aware of any memory sticks that support doing so
even via publicly-available proprietary software.
The best you can do is to check that you can successfully read+write to the entire device using badblocks (https://en.wikipedia.org/wiki/Badblocks).
You want to specify one of the write tests, which will wipe all data on the stick; make a backup first.
Find the device by looking at dmesg after plugging in the USB stick; you'll see a device name (most likely sd<letter>, e.g., sdc, sdd, etc.) and manufacturer information.
Make sure you're using the proper device!
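As a cross-check on the device name, here is a minimal sketch that filters lsblk output down to disks attached over USB. The helper name usb_disks is made up for this example, and it assumes a util-linux lsblk that supports the TRAN (transport) column.

```shell
# usb_disks: hypothetical helper, not a standard tool.
# Reads lines of "NAME TRAN [other columns]" on stdin and prints the
# /dev path of every disk whose transport column is "usb".
usb_disks() {
    awk '$2 == "usb" { print "/dev/" $1 }'
}

# Typical use (assumes lsblk supports the TRAN column):
#   lsblk -dn -o NAME,TRAN,SIZE,MODEL | usb_disks
```

If the stick you just plugged in is the only USB disk, the single path printed should match the name seen in dmesg.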
If the stick is formatted with a valid filesystem, you may have to unmount it first (with the umount command).
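To see whether anything on the stick is still mounted, a small check against the kernel's mount table can help. This is only a sketch: is_mounted is a hypothetical helper, and the optional second argument (a mounts file, defaulting to /proc/mounts) exists mainly so the function can be tried against a sample file.

```shell
# is_mounted DEVICE [MOUNTS_FILE]: hypothetical helper.
# Succeeds (exit 0) if DEVICE itself or one of its partitions
# (e.g. /dev/sdz1) is listed in the first column of the mounts table.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" '
        # match the device itself, or the device followed by a partition number
        $1 == d || (index($1, d) == 1 && substr($1, length(d) + 1) ~ /^p?[0-9]+$/) { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$mounts"
}

# Example, using the device name from this answer:
#   if is_mounted /dev/sdz; then echo "unmount /dev/sdz first"; fi
```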
Example syntax, for a USB stick enumerated as /dev/sdz, outputting progress information, with a data-destructive write test and error log written to usbstick.log:
sudo badblocks -w -s -o usbstick.log /dev/sdz
You'll need to repartition and reformat the stick afterwards, assuming it passes; this test will wipe everything on the stick. Any failures indicate either a failure of the device's memory controller or that the device has run out of spare blocks to remap failed blocks. In that case, no area of the device can be trusted.
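Once the run has finished, the log file tells you the result: badblocks -o writes one bad block number per line, so an empty file means no bad blocks were found. A minimal sketch for counting the entries follows; count_bad_blocks is a made-up helper name.

```shell
# count_bad_blocks LOGFILE: hypothetical helper.
# Prints the number of non-blank lines in LOGFILE, i.e. the number of
# bad blocks that badblocks -o recorded (0 for an empty log).
count_bad_blocks() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Example, using the log name from the command above:
#   count_bad_blocks usbstick.log
```

Anything other than 0 means the stick should not be trusted.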
Via [ubuntu] Error Check USB Flash Drive, I eventually found this, which could be helpful:
- http://oss.digirati.com.br/f3/ "F3 - an alternative to h2testw"
I arrived at the blogs Fight Flash Fraud and SOSFakeFlash, which recommend the software H2testw (see here or here) to test flash memories. I downloaded H2testw and found two issues with it: (1) it is for Windows only, and (2) it is not open source. However, its author was kind enough to include a text file that explains what it does; this page is about my GPLv3 implementation of that algorithm.
My implementation is simple and reliable, and I don't know exactly how F3 compares to H2testw since I've never run H2testw. I call my implementation F3, which is short for Fight Flash Fraud, or Fight Fake Flash.
Addendum by @pbhj: F3 is in the Ubuntu repos. It has two parts: f3write writes 1GB files to the device, and f3read attempts to read them back afterwards. This way both the capacity and the ability to write and then reliably read data are tested.
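The two F3 steps can be chained so that the read check only runs if the write step succeeded. This is a sketch: f3_check is a made-up wrapper name, and the mount point in the example is an assumption. Unlike badblocks, f3write and f3read take the path where the stick's filesystem is mounted, not the raw device.

```shell
# f3_check MOUNTPOINT: hypothetical wrapper around the two F3 tools.
# f3write fills the mounted filesystem with test files; f3read then
# reads them back and reports any corrupted or missing data.
f3_check() {
    f3write "$1" && f3read "$1"
}

# Example (the mount point is an assumption):
#   f3_check /media/usbstick
```

The test files stay on the stick after the run, so delete them once you have read the report.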
It depends on the failure mode, I suppose. They're cheap for a reason.
As a USB device, watching the bus via device manager in Windows or the output of dmesg in Linux will tell you if the device is even recognized as being plugged in. If it isn't, then either the controller on board or the physical connections are broken.
If the device is recognized as being plugged in, but doesn't get identified as a disk controller (and I don't know how that could happen, but...) then the controller is shot.
If it's recognized as a disk drive, but you can't mount it, you might be able to repair it via fdisk and rewrite the partition table, then make another filesystem.
If you're looking for the equivalent of S.M.A.R.T., then you won't find it. Thumbdrive controllers are cheap. They're commodity storage, and not meant to have the normal failsafes and intelligence that modern drives have.
Along the way to today, this thread raised some questions.
-How long will this take? (Implied by the discussion of letting it run overnight.)
I'm currently testing a USB 3.0 128G Sandisk using sudo badblocks -w -s -o; it is connected to my USB 3/USB-C PCIe card in an older Athlon 64 X2, so USB 3 into USB 3 on PCIe should be quite fast.
Here is my console output at 33% completion:
Testing with pattern 0xaa: 33.35% done, 49:47 elapsed. (0/0/0 errors)
and again later:
Testing with pattern 0xaa: 54.10% done, 1:17:04 elapsed. (0/0/0 errors)
Next came this segment:
Reading and comparing: 43.42% done, 2:23:44 elapsed. (0/0/0 errors)
This process repeats for each test pattern: 0xaa, then 0x55, 0xff, and finally 0x00.
ArchLinux gave an unqualified statement:
For some devices this will take a couple of days to complete.
N.B.: The testing was started at about 8:30 p.m. and had completed before 8:45 a.m. the next day, so it took about 12 hours in my situation.
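For a rough idea of the total run time from an early progress line, one pass can be extrapolated linearly and multiplied by the number of passes. This is a sketch under loud assumptions: estimate_total_seconds is a made-up helper, the elapsed time is taken to be measured within the first pass, throughput is assumed constant, and the default of 8 passes comes from badblocks -w writing and then reading back each of its four patterns.

```shell
# estimate_total_seconds PERCENT ELAPSED_SECONDS [PASSES]
# Hypothetical helper: extrapolates one pass linearly from a progress
# line, then multiplies by the number of passes (default 8, assuming
# four patterns with one write pass and one read pass each).
estimate_total_seconds() {
    awk -v p="$1" -v e="$2" -v n="${3:-8}" \
        'BEGIN { printf "%.0f\n", e * 100 / p * n }'
}

# Example with the first progress line above (33.35% after 49:47 = 2987 s):
#   estimate_total_seconds 33.35 2987
```

For that first progress line the result works out to roughly 20 hours, noticeably more than the 12 hours actually observed, so treat it as an upper-end guess; the read passes often run faster than the write passes.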
-Destructive testing isn't the only method possible.
Wikipedia offered this statement:
badblocks -nvs /dev/sdb
This would check the drive "sdb" in non-destructive read-write mode and display progress by writing out the block numbers as they are checked.
My current distro's man page confirms that -n is non-destructive.
-n Use non-destructive read-write mode. By default only a non-destructive read-only test is done.
-And finally, the "it isn't worth it" statement.
A summarizing statement: with billions of memory cells in a flash chip, a failure usually means a cell has already been written and erased tens of thousands of times and is now wearing out. And when a test shows that a cell has failed, remember that every file you added and erased has been running up those cycles.
The idea here is that when 1 cell fails, many more cells are also reaching the same failure point. One cell failed today, but you use it normally for a while longer, then 3 more cells fail, then 24 more fail, then 183, and before you know it, the memory array is riddled with bad spots. There are only so many cells that can die before your usable capacity begins to fall, eventually falling rapidly. How will you know more cells are failing? So, posts here are guarding your data by saying that once you have a bad cell, you are pretty much done as far as trustworthy storage is concerned. Your usage might still give you a few months.
It's your data.
HTH