Format USB and confirm all zeros
I'll throw my hat into the ring here as well. One alternative that I love to use is scrub. It is in the repositories, so to install it from a terminal window type in:
sudo apt-get install scrub
scrub supports many different types of scrubbing patterns. The available patterns are:
nnsa 3-pass NNSA NAP-14.1-C
dod 3-pass DoD 5220.22-M
bsi 9-pass BSI
usarmy 3-pass US Army AR380-19
random 1-pass One Random Pass
random2 2-pass Two Random Passes
schneier 7-pass Bruce Schneier Algorithm
pfitzner7 7-pass Roy Pfitzner 7-random-pass method
pfitzner33 33-pass Roy Pfitzner 33-random-pass method
gutmann 35-pass Gutmann
fastold 4-pass pre v1.7 scrub (skip random)
old 5-pass pre v1.7 scrub
dirent 6-pass dirent
fillzero 1-pass Quick Fill with 0x00
fillff 1-pass Quick Fill with 0xff
custom 1-pass custom="string" 16b max, use escapes \xnn, \nnn, \\
To use scrub to fill the drive with all zeros, first make sure the drive is not mounted. Then run the following line (-p specifies the pattern to use):
sudo scrub -p fillzero /dev/sdX
Then you should see something like this:
scrub: using Quick Fill with 0x00 patterns
scrub: please verify that device size below is correct!
scrub: scrubbing /dev/sdh 31260704768 bytes (~29GB)
scrub: 0x00 |..... |
Some of the scrubbing patterns include a verify pass to make sure the scrubbing succeeded.
If you would like, you can add hexdump (as in Byte Commander's answer) or any of the other answers' verification methods at the end, for example as sketched below.
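A minimal sketch of chaining the wipe with a verification step (assuming /dev/sdX is your unmounted target device; adjust it to match your system):

# wipe with zeroes, then dump the device; a fully zeroed drive prints a single
# all-zero row, a * for the repeated lines, and the total size at the end
sudo scrub -p fillzero /dev/sdX && sudo hexdump /dev/sdX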
Hope this helps!
Apply dd and tr for visual inspection:
dd if=/dev/sdb | tr '\0' 0
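On a large drive this will scroll endlessly, so you may prefer to spot-check only part of it and page through the result. A small sketch (the bs and count values are just assumptions for a 10 MiB sample):

# read the first 10 MiB, turn every null byte into the character '0' and page through it
dd if=/dev/sdb bs=1M count=10 | tr '\0' 0 | less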
Apply dd and grep for automatic checking:
dd if=/dev/sdb | grep -zq . && echo non zero
The above is significantly slower than the optimized command below:
grep -zq . /dev/sdb && echo non zero
grep -z reads in null-delimited lines. If all bytes are null, then each line is empty, so . should never match. Of course, this won't be true for a formatted partition: the filesystem will be using some bytes, and they will be non-null.
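To see this behaviour on a toy input (a sketch using printf to generate a few bytes by hand; nothing here touches a real device):

# all bytes are null: grep -z sees only empty lines, so nothing is printed
printf '\0\0\0\0' | grep -zq . && echo non zero
# a single non-null byte is enough to make it match
printf '\0\0A\0' | grep -zq . && echo non zero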
My suggestion would be hexdump. It displays the content of any file or device in hexadecimal format as rows of 16 bytes, but if two consecutive lines are equal, it omits them.
Here's the example output for the 512 MB file virtualdevice in the current directory of my HDD, which is filled with zeroes only. The leftmost column is the offset of the line in hexadecimal notation; the following 8 columns are the actual data, grouped in two bytes (4 hexadecimal characters) each:
$ hexdump ./virtualdevice
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
20000000
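If you want to reproduce the test setup, such a zero-filled file can be created with something like this (a sketch; the name virtualdevice and the 512 MB size simply match the example above):

# create a 512 MiB file consisting only of null bytes
dd if=/dev/zero of=./virtualdevice bs=1M count=512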
Performance:
I made the effort and compared my solution with the others in terms of real run time and CPU time for the described example file (512 MB, containing only binary zeroes, located on an HDD).
I measured every solution with the time command, two times with a freshly cleared disk cache and two times with the file already cached. The row names match those of the time command, and the additional row CPU is just the sum of the USER and SYS times. It can exceed the REAL time because I'm running a dual-core machine.
For most people, the interesting figures are REAL (the time from start to end, as if measured with a stopwatch; this also includes IO wait and the CPU time of other processes) and CPU (the CPU time actually occupied by the command).
Summary:
The best performance is achieved by muru's optimized second version (grep -zq . DEVICE), which uses remarkably little CPU processing time. Rank 2 is shared by cmp /dev/zero DEVICE (kos' optimized solution) and my own solution hexdump DEVICE; there is nearly no difference between them.
Piping the data from dd to cmp (dd if=/dev/zero | cmp - DEVICE, kos' unoptimized solution) is very inefficient; the piping seems to consume a lot of processing time. Using dd and grep shows by far the worst performance of the tested commands.
Conclusion:
Although the most critical part of operations like these is the IO access time, there are significant differences in the processing speed and efficiency of the tested approaches.
If you are very impatient, use the second version of muru's answer (grep -zq . DEVICE)! But you can just as well use either the second version of kos' answer (cmp /dev/zero DEVICE) or my own (hexdump DEVICE), as their performance is almost as good.
However, my approach has the advantage that you immediately see the file contents and can estimate how many bytes differ from zero and where they are located. If there is a lot of varying data though, the output will grow large and it will probably slow down.
What you should avoid in any case is using dd and pipes. The performance of dd could probably be improved by setting a suitable buffer size, but why do it the complicated way?
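For completeness, a sketch of what such a tuned dd invocation might look like (the bs=1M block size is only an assumption and was not part of the benchmark):

# a larger block size reduces the number of read/write calls in the pipe
dd if=/dev/sdb bs=1M | grep -zq . && echo non zero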
Please also note again that the test was done on a file on my disk instead of an actual device, and that the file contained only zeroes. Both affect the performance.
Here are the detailed results:
- hexdump ./virtualdevice (my own solution):

            | Uncached:          | Cached:
    Time:   | Run 1:    Run 2:   | Run 1:    Run 2:
    --------+--------------------+-------------------
    REAL    | 7.689s    8.668s   | 1.868s    1.930s
    USER    | 1.816s    1.720s   | 1.572s    1.696s
    SYS     | 0.408s    0.504s   | 0.276s    0.220s
    CPU     | 2.224s    2.224s   | 1.848s    1.916s
- dd if=./virtualdevice | grep -zq . && echo non zero (muru's unoptimized solution):

            | Uncached:          | Cached:
    Time:   | Run 1:    Run 2:   | Run 1:    Run 2:
    --------+--------------------+-------------------
    REAL    | 9.434s    11.004s  | 8.802s    9.266s
    USER    | 2.264s    2.364s   | 2.480s    2.528s
    SYS     | 12.876s   12.972s  | 12.676s   13.300s
    CPU     | 15.140s   15.336s  | 15.156s   15.828s
- grep -zq . ./virtualdevice && echo non zero (muru's optimized solution):

            | Uncached:          | Cached:
    Time:   | Run 1:    Run 2:   | Run 1:    Run 2:
    --------+--------------------+-------------------
    REAL    | 8.763s    6.485s   | 0.770s    0.833s
    USER    | 0.644s    0.612s   | 0.528s    0.544s
    SYS     | 0.440s    0.476s   | 0.236s    0.264s
    CPU     | 1.084s    1.088s   | 0.764s    0.808s
- dd if=/dev/zero | cmp - ./virtualdevice (kos' unoptimized solution):

            | Uncached:          | Cached:
    Time:   | Run 1:    Run 2:   | Run 1:    Run 2:
    --------+--------------------+-------------------
    REAL    | 7.678s    6.539s   | 3.151s    3.147s
    USER    | 2.348s    2.228s   | 2.164s    2.324s
    SYS     | 3.672s    3.852s   | 3.792s    3.516s
    CPU     | 6.020s    6.080s   | 5.956s    5.840s
- cmp /dev/zero ./virtualdevice (kos' optimized solution):

            | Uncached:          | Cached:
    Time:   | Run 1:    Run 2:   | Run 1:    Run 2:
    --------+--------------------+-------------------
    REAL    | 6.340s    9.183s   | 1.660s    1.660s
    USER    | 1.356s    1.384s   | 1.216s    1.288s
    SYS     | 0.640s    0.596s   | 0.428s    0.360s
    CPU     | 1.996s    1.980s   | 1.644s    1.648s
Commands used:
For each test I ran the following procedure twice to reduce inaccuracies, replacing <COMMAND> with the exact command from the headline of each table.
- Let the kernel drop all disk caches:

  sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
- First timed run (uncached); the file gets loaded into the cache during this:

  time <COMMAND>
- Second timed run (cached). This time most of the data is taken from the disk cache in RAM, therefore it's much faster than accessing the disk directly:

  time <COMMAND>
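Putting the steps together, the whole measurement for one candidate could be scripted roughly like this (a sketch; the command assigned to CMD is just an example and should be replaced by the one under test):

#!/bin/bash
# measure one candidate command with a cold and a warm cache
CMD="hexdump ./virtualdevice"                         # command under test (adjust as needed)
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches    # let the kernel drop all disk caches
time eval "$CMD"                                      # first run: uncached
time eval "$CMD"                                      # second run: mostly served from RAM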