Use grep --exclude/--include syntax to not grep through certain files

I'm looking for the string foo= in text files in a directory tree. This is on a common Linux machine with a bash shell:

grep -ircl "foo=" *

The directories also contain many binary files that match "foo=". As these results are not relevant and slow down the search, I want grep to skip these files (mostly JPEG and PNG images). How would I do that?

I know there are the --exclude=PATTERN and --include=PATTERN options, but what is the pattern format? The man page of grep says:

--include=PATTERN     Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN     Recurse in directories skip file matching PATTERN.

Searching on grep include, grep include exclude, grep exclude and variants did not turn up anything relevant.

If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option. I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to make do with common tools (like grep or the suggested find).


Solution 1:

Use the shell globbing syntax:

grep pattern -r --include=\*.cpp --include=\*.h rootdir

The syntax for --exclude is identical.

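Note that the star is escaped with a backslash to keep the shell from treating it as a glob (quoting it, such as --include="*.cpp", would work just as well). The shell matches the whole word, --include=*.cpp, against file names in the current working directory, so in Bash the unescaped form usually reaches grep unchanged: it is only expanded if a file name matches that entire word. Escaping is still the safe habit. zsh reports an error by default when a glob matches nothing, Bash with nullglob set silently drops the argument, and in the space-separated form --include *.cpp the command line would expand to something like grep pattern -r --include foo.cpp bar.cpp rootdir, which would only search files named foo.cpp and treat bar.cpp as a file to search, which is quite likely not what you wanted.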
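As a quick check of the escaped and quoted forms, here is a minimal sketch that builds a throwaway tree and runs both. The sandbox directory and the file names (main.cpp, util.h, notes.txt) are invented for the demo, and it assumes a grep that supports -r and --include, such as GNU grep.

```shell
# Throwaway sandbox; every file name below is made up for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/rootdir/sub"
printf 'foo=1\n' > "$demo/rootdir/main.cpp"
printf 'foo=2\n' > "$demo/rootdir/sub/util.h"
printf 'foo=3\n' > "$demo/rootdir/notes.txt"
cd "$demo" || exit 1

# Backslash-escaped stars: the shell hands the literal globs to grep.
escaped=$(grep -rl --include=\*.cpp --include=\*.h 'foo=' rootdir | sort)

# Quoted stars: grep receives exactly the same arguments.
quoted=$(grep -rl --include='*.cpp' --include='*.h' 'foo=' rootdir | sort)

# Both variables list main.cpp and sub/util.h; notes.txt is skipped.
printf '%s\n' "$escaped"
```

Either spelling should give the same file list, so which one to use is mostly a matter of taste.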

Update 2021-03-04

I've edited the original answer to remove the use of brace expansion, which is a feature provided by several shells such as Bash and zsh to simplify patterns like this; but note that brace expansion is not POSIX shell-compliant.

The original example was:

grep pattern -r --include=\*.{cpp,h} rootdir

to search through all .cpp and .h files rooted in the directory rootdir.
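To see what the brace form turns into before grep ever runs, the following sketch prints the expanded arguments. It calls bash explicitly (an assumption that bash is installed), because a plain POSIX sh may leave the braces untouched.

```shell
# Run through bash explicitly, since a plain POSIX sh may not do brace expansion.
expanded=$(bash -c 'printf "%s\n" --include=\*.{cpp,h}')

# Prints one argument per line:
#   --include=*.cpp
#   --include=*.h
printf '%s\n' "$expanded"
```

In other words, the brace form is just shorthand for the two separate --include options shown at the top of this answer.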

Solution 2:

If you just want to skip binary files, I suggest you look at the -I (upper case i) option. It ignores binary files. I regularly use the following command:

grep -rI --exclude-dir="\.svn" "pattern" *

It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders, for whatever pattern I want. I have it aliased as "grepsvn" on my box at work.
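A rough sketch of that setup as a shell function rather than an alias, with a small sandbox to try it on. The function body mirrors the command above. The sandbox layout and file names are invented for the demo, and the fake image.png is just a file containing a NUL byte so that grep classifies it as binary.

```shell
# Hypothetical wrapper mirroring the "grepsvn" alias mentioned above.
grepsvn() {
    grep -rI --exclude-dir=.svn "$@"
}

# Throwaway sandbox; file names are invented for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/.svn"
printf 'foo=text\n' > "$demo/src/config.txt"
# The NUL byte makes grep treat this file as binary, so -I skips it.
printf 'foo=\000binary\n' > "$demo/src/image.png"
printf 'foo=metadata\n' > "$demo/.svn/entries"
cd "$demo" || exit 1

# -l lists matching file names only; just the text file should be reported.
hits=$(grepsvn -l 'foo=' .)
printf '%s\n' "$hits"
```

A function takes extra options such as -l or -i the same way an alias does, and it also works in scripts, where aliases are usually not expanded.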

Solution 3:

Please take a look at ack, which is designed for exactly these situations. Your example of

grep -ircl --exclude=*.{png,jpg} "foo=" *

is done with ack as

ack -icl "foo="

because ack never looks in binary files by default, and -r is on by default. And if you want only CPP and H files, then just do

ack -icl --cpp "foo="

Solution 4:

grep 2.5.3 introduced the --exclude-dir parameter which will work the way you want.

grep -rI --exclude-dir=\.svn PATTERN .

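You can also set an environment variable: GREP_OPTIONS="--exclude-dir=\.svn" (note that GNU grep has treated GREP_OPTIONS as obsolescent since version 2.21 and newer releases ignore it, so an alias or shell function is the more durable choice).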

I'll second Andy's vote for ack though, it's the best.

Solution 5:

It took me a long time to find this: you can add multiple includes and excludes, like so:

grep -r "z-index" . --include="*.js" --exclude="*js/lib/*" --exclude="*.min.js"
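Here is a small sketch of combining several filters on a throwaway tree. The directory layout and file names are invented for the demo. It also swaps in --exclude-dir=lib for the path-style --exclude pattern, on the assumption that matching a directory by name is the more predictable option across grep versions.

```shell
# Throwaway sandbox; the layout below is made up for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/js/lib" "$demo/css"
printf 'z-index: 1\n' > "$demo/js/app.js"
printf 'z-index: 2\n' > "$demo/js/app.min.js"
printf 'z-index: 3\n' > "$demo/js/lib/vendor.js"
printf 'z-index: 4\n' > "$demo/css/site.css"
cd "$demo" || exit 1

# Keep *.js files, drop minified ones, and skip any directory named lib.
matches=$(grep -rl --include='*.js' --exclude='*.min.js' --exclude-dir=lib 'z-index' .)

# Only ./js/app.js survives all three filters.
printf '%s\n' "$matches"
```

Quoting each glob keeps the shell from touching it, so grep itself applies the patterns to every file it visits.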