grep command and [ ] [duplicate]

I'm learning Bash and today I'm studying the grep command.

If I run:

$ ps -fU user | grep thunderbird

terminal shows:

user  17410     1  0 10:09 ?        00:00:20 /usr/lib/thunderbird/thunderbird
user  18990 15896  0 12:25 pts/1    00:00:00 grep --color=auto thunderbird

But if I run:

$ ps -fU user | grep [t]hunderbird

terminal shows:

user  17410     1  0 10:09 ?        00:00:20 /usr/lib/thunderbird/thunderbird

Why? I read the guide, but I don't understand.


There are two issues here. First, when you run ps | grep ..., the grep process itself also appears in the output of ps. With -f, ps prints the full command line each process was launched with, not only the process's name. This means that if you run ps -f | grep foo while a process called foo is running, two ps lines will match foo: the foo process and the grep itself, since foo is one of its arguments. This is why you get two lines when running ps -f | grep thunderbird.

Now, [ ] is a regular expression construct that defines a set of characters, a character class. For example, [abc] matches a, b, or c. When you run ps -f | grep [t]hunderbird, the class contains a single character, so the pattern is equivalent to thunderbird without the brackets. However, the grep process was launched with [t]hunderbird as its argument this time, not thunderbird. Therefore, its line in the output of ps contains [t]hunderbird. It will look like this:

terdon   23101 10991  0 16:53 pts/3    00:00:00 grep --color [t]hunderbird

This means that the grep line is not matched when you run ps -f | grep [t]hunderbird: it contains the text [t]hunderbird, in which a ] separates the t from hunderbird, so the string thunderbird never appears in it.
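To see this without depending on what is running on your machine, here is a minimal sketch that feeds two made-up lines (shortened stand-ins for the ps output, with hypothetical PIDs) to grep. The pattern is quoted here so the shell cannot expand it as a filename glob before grep sees it.

```shell
# Two sample lines standing in for ps output (PIDs are made up).
sample='user 17410 /usr/lib/thunderbird/thunderbird
user 18990 grep --color=auto [t]hunderbird'

# The regex [t]hunderbird matches the text "thunderbird".
# The second line only contains "t]hunderbird", so it is filtered out.
matches=$(printf '%s\n' "$sample" | grep '[t]hunderbird')
printf '%s\n' "$matches"
```

Only the first line is printed, which mirrors what happens to the real grep process in the ps listing.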

This is a common trick to avoid matching the grep process itself when running ps | grep. An alternative is to run ps -f | grep foo | grep -v grep to exclude the grep. The best approach, however, is to use a program specifically designed for this, pgrep:

$ pgrep -l thunderbird
11330 thunderbird
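For comparison, the grep -v grep variant can be sketched on sample text as well. The two lines below are made-up stand-ins for ps output, not real process data.

```shell
# Sample lines standing in for ps output (PIDs are made up).
sample='user 17410 /usr/lib/thunderbird/thunderbird
user 18990 grep --color=auto thunderbird'

# First keep every line containing "thunderbird",
# then drop every line that contains "grep".
filtered=$(printf '%s\n' "$sample" | grep thunderbird | grep -v grep)
printf '%s\n' "$filtered"
```

Note that grep -v grep drops any line containing the text grep, so a real process whose command line happens to include that word would be hidden too.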

In the first case you're looking for any process whose ps line contains the word thunderbird. There are two: thunderbird and the grep command itself.

In the second case you're still looking for the character t followed by hunderbird, because [t] means "match any one of the characters listed in the square brackets", and only one is listed, the letter t. This time, however, your two processes are:

user  17410     1  0 10:09 ?        00:00:20 /usr/lib/thunderbird/thunderbird
user  18990 15896  0 12:25 pts/1    00:00:00 grep --color=auto [t]hunderbird

The second process does not match because the regexp [t]hunderbird does not match the literal string [t]hunderbird: the ] between t and hunderbird prevents the match.
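The difference between the pattern as a regexp and the pattern as literal text can be checked with grep -F, which treats its argument as a fixed string. This is a small sketch on a made-up line shaped like the grep entry above:

```shell
# A sample line shaped like the grep entry in the ps output (PIDs are made up).
line='user 18990 15896 grep --color=auto [t]hunderbird'

# Count matching lines; "|| true" keeps the exit status clean when the count is 0.
regex_hits=$(printf '%s\n' "$line" | grep -c '[t]hunderbird' || true)
fixed_hits=$(printf '%s\n' "$line" | grep -cF '[t]hunderbird' || true)

printf 'regex: %s, fixed: %s\n' "$regex_hits" "$fixed_hits"
# prints: regex: 0, fixed: 1
```

As a regexp the pattern finds nothing on this line, while as a fixed string it matches the brackets literally.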