Use grep --exclude/--include syntax to not grep through certain files
I'm looking for the string foo=
in text files in a directory tree. It's on a common Linux machine, I have bash shell:
grep -ircl "foo=" *
In the directories are also many binary files which match "foo="
. As these results are not relevant and slow down the search, I want grep to skip searching these files (mostly JPEG and PNG images). How would I do that?
I know there are the --exclude=PATTERN
and --include=PATTERN
options, but what is the pattern format? The man page of grep says:
--include=PATTERN Recurse in directories only searching file matching PATTERN.
--exclude=PATTERN Recurse in directories skip file matching PATTERN.
Searching on grep include, grep include exclude, grep exclude and variants did not find anything relevant
If there's a better way of grepping only in certain files, I'm all for it; moving the offending files is not an option. I can't search only certain directories (the directory structure is a big mess, with everything everywhere). Also, I can't install anything, so I have to do with common tools (like grep or the suggested find).
Solution 1:
Use the shell globbing syntax:
grep pattern -r --include=\*.cpp --include=\*.h rootdir
The syntax for --exclude
is identical.
Note that the star is escaped with a backslash to prevent it from being expanded by the shell (quoting it, such as --include="*.cpp"
, would work just as well). Otherwise, if you had any files in the current working directory that matched the pattern, the command line would expand to something like grep pattern -r --include=foo.cpp --include=bar.cpp rootdir
, which would only search files named foo.cpp
and bar.cpp
, which is quite likely not what you wanted.
Update 2021-03-04
I've edited the original answer to remove the use of brace expansion, which is a feature provided by several shells such as Bash and zsh to simplify patterns like this; but note that brace expansion is not POSIX shell-compliant.
The original example was:
grep pattern -r --include=\*.{cpp,h} rootdir
to search through all .cpp
and .h
files rooted in the directory rootdir
.
Solution 2:
If you just want to skip binary files, I suggest you look at the -I
(upper case i) option. It ignores binary files. I regularly use the following command:
grep -rI --exclude-dir="\.svn" "pattern" *
It searches recursively, ignores binary files, and doesn't look inside Subversion hidden folders, for whatever pattern I want. I have it aliased as "grepsvn" on my box at work.
Solution 3:
Please take a look at ack, which is designed for exactly these situations. Your example of
grep -ircl --exclude=*.{png,jpg} "foo=" *
is done with ack as
ack -icl "foo="
because ack never looks in binary files by default, and -r is on by default. And if you want only CPP and H files, then just do
ack -icl --cpp "foo="
Solution 4:
grep 2.5.3 introduced the --exclude-dir
parameter which will work the way you want.
grep -rI --exclude-dir=\.svn PATTERN .
You can also set an environment variable: GREP_OPTIONS="--exclude-dir=\.svn"
I'll second Andy's vote for ack though, it's the best.
Solution 5:
I found this after a long time, you can add multiple includes and excludes like:
grep "z-index" . --include=*.js --exclude=*js/lib/* --exclude=*.min.js