What are the exact reasons `grep` on /proc and raw disks is a bad idea?
I ran `grep -r "searchphrase" /` today and that did not work. I did some research and found `find / -xdev -type f -print0 | xargs -0 grep -H "searchphrase"` to be the right approach.
I gather that `/proc` and disks like `/dev/sda1` are the culprits behind the unsuccessful grep.
I would love some deep technical background on the "why". I think that some links within `/proc` create infinite loops when traversed, and I read there are more reasons, but nothing specific.
Also, what happens when a raw disk is grepped? Can the binary data (which is accessible on `/dev/sda1`, as far as I know) not be interpreted, since only a mount with a filesystem type makes the data on the disk intelligible? Would it therefore still be possible to grep for a binary string?
Yes, you can grep `/dev/sda1` and `/proc`, but you probably don't want to. In more detail:
- Yes, you can grep the binary contents of `/dev/sda1`. But with modern large hard disks, this will take a very long time and the result is not likely to be useful.
- Yes, you can grep the contents of `/proc`, but be aware that your computer's memory is mapped in there as files (`/proc/kcore`, for example). On a modern computer with gigabytes of RAM, this will take a long time to grep and, again, the result is not likely to be useful.
As an exception, if you are looking for data on a hard disk with a damaged file system, you might run `grep something /dev/sda1` as part of an attempt to recover the file's data.
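To make the recovery idea a little more concrete, here is a minimal sketch of how one might locate a phrase on a raw device or disk image and then look at the bytes around it. The helper names `find_offsets` and `dump_around` are made up for this illustration, and GNU grep is assumed for the `-a`, `-b`, and `-o` options.

```shell
#!/usr/bin/env bash
# Sketch only: find_offsets and dump_around are hypothetical helper names.
# Assumes GNU grep; reading a real device such as /dev/sda1 normally needs root.

# Print the byte offset of every occurrence of a fixed string in a device or image.
find_offsets() {
    local phrase=$1 target=$2
    # -a: treat binary data as text   -b: print byte offsets
    # -o: report only the match itself, so the offset is that of the match
    # -F: fixed string, not a regex   LC_ALL=C: compare raw bytes
    LC_ALL=C grep -a -b -o -F -- "$phrase" "$target" | cut -d: -f1
}

# Dump `length` bytes of the target starting at byte `offset`.
dump_around() {
    local target=$1 offset=$2 length=$3
    # bs=1 is slow but keeps the arithmetic obvious; fine for small extracts.
    dd if="$target" bs=1 skip="$offset" count="$length" 2>/dev/null
}

# Example use (not run here):
#   find_offsets "searchphrase" /dev/sda1
#   dump_around /dev/sda1 <offset> 512
```

The offsets are positions within the raw partition, not within any file, so this only helps when the data you want is stored contiguously and unencrypted.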
Other problematic files in /dev
The hard disks and hard disk partitions under `/dev` can be grepped, if one has enough patience. Other files (hat tip: user2313067), however, may cause problems:
- `/dev/zero` is a file of infinite length. Fortunately, grep (at least the GNU version) is smart enough to skip it:
    $ grep something /dev/zero
    grep: input is too large to count
- `/dev/random` and `/dev/urandom` are also infinite. The command `grep something /dev/random` will run forever unless grep is signaled to stop.
  It can be useful to grep `/dev/urandom` when generating passwords. To get, for example, five random alphanumeric characters (`head -c 10` is used because grep prints each character followed by a newline):
    $ grep --text -o '[[:alnum:]]' /dev/urandom | head -c 10
    G
    4
    n
    X
    2
  This is not infinite because, after it has received enough characters, head closes the pipe, causing grep to terminate.
Infinite loops
"...links ... create infinite loops when traversed..."
Grep (at least the GNU version) is smart enough not to do that. Let's consider two cases:
- With the `-r` option, grep does not follow symbolic links unless they are explicitly specified on the command line. Hence, infinite loops are not possible.
- With the `-R` option, grep does follow symbolic links, but it checks them and refuses to get caught in a loop. To illustrate:
    $ mkdir a
    $ ln -s ../ a/b
    $ grep -R something .
    grep: warning: ./a/b: recursive directory loop
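If you want to compare the two options side by side, the following sketch builds the same kind of looping symlink in a scratch directory and runs both. It assumes GNU grep; the variable names and the `file.txt` test file are just for illustration, and the exact wording of the warning may differ between grep versions.

```shell
#!/usr/bin/env bash
# Build a scratch tree with a symlink that points back at an ancestor directory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/a"
ln -s ../ "$demo_dir/a/b"                 # a/b resolves to the top of the tree
echo "something" > "$demo_dir/a/file.txt"

# -r: symlinks met during the walk are not followed, so a/b is simply skipped.
r_matches=$(cd "$demo_dir" && grep -r -l something . 2>/dev/null)

# -R: symlinks are followed; grep is expected to notice the loop and warn
# on stderr instead of recursing forever. "|| true" keeps a non-zero exit
# status from aborting scripts that run under "set -e".
R_output=$(cd "$demo_dir" && grep -R -l something . 2>&1) || true

printf 'with -r:\n%s\n' "$r_matches"
printf 'with -R:\n%s\n' "$R_output"

rm -rf "$demo_dir"
```

Both runs should terminate promptly and report `./a/file.txt`; only the `-R` run is expected to mention the loop.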
Excluding problematic directories from grep -r
As an aside, grep provides a limited facility for skipping certain files or directories. For example, you can exclude all directories named `proc`, `sys`, and `dev` from grep's recursive search with:
grep --exclude-dir proc --exclude-dir sys --exclude-dir dev -r something /
Alternatively, we can exclude only the top-level `/proc`, `/sys`, and `/dev` directories using bash's extended globs:
shopt -s extglob
grep -r something /!(proc|sys|dev)
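For completeness, here is a rough sketch of the `find`-based approach from the question, wrapped in a function. The name `search_tree` is hypothetical. The `-prune` clause is belt and braces: on a typical Linux system `-xdev` already keeps `find` off `/proc` and `/sys` when it starts from `/`, because they are separate filesystems, and `-type f` keeps device nodes away from grep.

```shell
#!/usr/bin/env bash
# Sketch only: search_tree is a hypothetical helper name.
# Search regular files under a root directory for a fixed string, staying on
# one filesystem and never descending into /proc, /sys, or /dev.
search_tree() {
    local phrase=$1 root=$2
    find "$root" -xdev \
        \( -path /proc -o -path /sys -o -path /dev \) -prune \
        -o -type f -exec grep -H -I -F -- "$phrase" {} +
    # -H: always print the file name   -I: treat binary files as non-matching
    # -F: fixed string, not a regex    {} +: pass many files to each grep run
}

# Example use (not run here):
#   search_tree "searchphrase" /
```

Using `-exec ... {} +` instead of `-print0 | xargs -0` is a design choice made here to avoid depending on xargs options; either form handles file names with spaces.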