What are the exact reasons `grep` on /proc and raw disks is a bad idea?

I ran grep -r "searchphrase" / today and it did not work. After some research, I found find / -xdev -type f -print0 | xargs -0 grep -H "searchphrase" to be the right approach.

I gather that /proc and raw disks like /dev/sda1 are the culprits behind the unsuccessful grep.

I would love some deep technical background on the "why". I think that some links within /proc create infinite loops when traversed, and I read there are more reasons, but nothing specific.

Also, what happens when a raw disk is grepped? Is the binary data that is accessible on /dev/sda1 (as far as I know) impossible to interpret, since only mounting it with a filesystem type makes the data on the disk intelligible? Would it still be possible to grep for a binary string?


Yes, you can grep /dev/sda1 and /proc but you probably don't want to. In more detail:

  1. Yes, you can grep the binary contents of /dev/sda1. But with modern large hard disks, this will take a very long time, and the result is not likely to be useful.

  2. Yes, you can grep the contents of /proc, but be aware that your computer's memory is exposed there as files (/proc/kcore, for example). On a modern computer with gigabytes of RAM, this will take a long time to grep and, again, the result is not likely to be useful.

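As an exception, if you are looking for data on a hard disk with a damaged file system, you might run grep something /dev/sda1 as part of an attempt to recover a lost file's data. The raw device is just a stream of bytes, so grep can search it for a string without any file system being mounted.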
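
For such a recovery attempt, it helps to know where on the device a match sits. The following is a minimal sketch, assuming GNU grep; a temporary image file stands in for /dev/sda1 so that it is safe to run, and searchphrase is a placeholder. The -a option treats binary data as text, -b prints the byte offset, and -o prints only the matching part.

```shell
# A scratch file stands in for the raw device so the example is safe to run.
img=$(mktemp)
head -c 4096 /dev/zero >  "$img"   # binary padding before the string
printf 'searchphrase'  >> "$img"   # the data we hope to find
head -c 4096 /dev/zero >> "$img"   # binary padding after the string

# LC_ALL=C: match byte-wise, without locale decoding.
# -a: treat binary as text, -b: print byte offset, -o: print only the match.
hits=$(LC_ALL=C grep -a -b -o 'searchphrase' "$img")
printf '%s\n' "$hits"

rm -f "$img"
```

Each output line is the match prefixed by its byte offset. On a real device you would replace "$img" with the device path and would normally need root privileges.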

Other problematic files in /dev

The hard disks and hard disk partitions under /dev can be grepped, given enough patience. Other files (hat tip: user2313067), however, may cause problems:

  1. /dev/zero is a file of infinite length. Fortunately, grep (at least the GNU version) is smart enough to skip it:

    $ grep something /dev/zero
    grep: input is too large to count
    
  2. /dev/random and /dev/urandom are also infinite. The command grep something /dev/random will run forever unless grep is signaled to stop.

    It can be useful to grep /dev/urandom when generating passwords. To get, for example, five random alphanumeric characters (head -c 10 takes ten bytes, because -o prints each character followed by a newline):

    $ grep --text -o '[[:alnum:]]' /dev/urandom | head -c 10
    G
    4
    n
    X
    2
    

    This is not infinite because, after it has received enough characters, head exits and closes the pipe. grep's next write then fails, and grep terminates.
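
If the characters are wanted on a single line, the same pipeline can be wrapped in a small function. This is only a sketch: random_alnum is a hypothetical helper name, and LC_ALL=C is an addition here, so that [[:alnum:]] matches only ASCII letters and digits regardless of the user's locale.

```shell
# random_alnum N: print N random alphanumeric characters on one line.
# (random_alnum is a hypothetical helper name, not part of grep.)
random_alnum() {
    # LC_ALL=C makes [[:alnum:]] match only ASCII letters and digits.
    # head -n stops after N matches (one per line); tr joins the lines.
    LC_ALL=C grep -a -o '[[:alnum:]]' /dev/urandom | head -n "$1" | tr -d '\n'
    echo
}

random_alnum 5
```

As before, the pipeline ends once head has read its N lines and exits.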

Infinite loops

"...links ... create infinite loops when traversed..."

Grep (at least the GNU version) is smart enough not to do that. Let's consider two cases:

  1. With the -r option, grep does not follow symbolic links unless they are explicitly specified on the command line. Hence, infinite loops are not possible.

  2. With the -R option, grep does follow symbolic links but it checks them and refuses to get caught in a loop. To illustrate:

    $ mkdir a
    $ ln -s ../ a/b
    $ grep -R something .
    grep: warning: ./a/b: recursive directory loop
    

Excluding problematic directories from grep -r

As an aside, grep provides a limited facility for skipping certain files or directories. For example, you can exclude all directories named proc, sys, and dev, at any depth, from grep's recursive search with:

    grep --exclude-dir proc --exclude-dir sys --exclude-dir dev -r something /

Alternatively, we can exclude proc, sys, and dev using bash's extended globs:

    shopt -s extglob
    grep -r something /!(proc|sys|dev)
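
The find-based command from the question is a third option. With -xdev, find stays on the filesystem it started on, and because /proc, /sys, and /dev are typically separate mounts, they are skipped without being named. A minimal sketch follows; search_one_fs is a hypothetical helper name, and the demo runs on a scratch directory rather than / so that it is quick and safe.

```shell
# search_one_fs DIR PATTERN: grep regular files under DIR without
# crossing into other mounted filesystems.
search_one_fs() {
    # -xdev: stay on DIR's filesystem; -type f: regular files only;
    # -exec ... {} +: pass many file names to each grep invocation.
    find "$1" -xdev -type f -exec grep -H -- "$2" {} +
}

# Demo on a scratch directory instead of /.
demo=$(mktemp -d)
printf 'has searchphrase here\n' > "$demo/match.txt"
printf 'nothing to see\n'        > "$demo/other.txt"
search_one_fs "$demo" searchphrase
rm -rf "$demo"
```

For the original use case this would be called as search_one_fs / searchphrase. Note that -xdev also skips ordinary filesystems mounted below /, such as a separate /home partition, so those would need their own invocation.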