How would you count every occurrence of a term in all files in the current directory?
How would you count every occurrence of a term in all files in the current directory and its subdirectories?
I've read that you would use grep for this; what is the exact command?
Also, is it possible to do the above with some other command?
Solution 1:
Using grep + wc (this caters for multiple occurrences of the term on the same line):
grep -rFo foo | wc -l
- -r in grep: searches recursively in the current directory hierarchy;
- -F in grep: matches against a fixed string instead of against a pattern;
- -o in grep: prints only the matches, each on its own line;
- -l in wc: prints the number of lines.
% tree
.
├── dir
│   └── file2
└── file1
1 directory, 2 files
% cat file1
line1 foo foo
line2 foo
line3 foo
% cat dir/file2
line1 foo foo
line2 foo
line3 foo
% grep -rFo foo | wc -l
8
Solution 2:
grep -Rc [term] *
will do that. The -R flag means you want to recursively search the current directory and all of its subdirectories. The * is a file selector meaning: all files (the shell expands it to every non-hidden entry in the current directory). The -c flag makes grep output, for each file, only the number of matching lines. This means that if the word occurs multiple times on a single line, it is counted only once.
From man grep:
-r, --recursive
Read all files under each directory, recursively, following symbolic links only if they are on the command line.
This is equivalent to the -d recurse option.
-R, --dereference-recursive
Read all files under each directory, recursively. Follow all symbolic links, unlike -r.
If you have no symbolic links in your directory, there is no difference.
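To make the difference between counting matching lines (what -c reports) and counting every occurrence (what -o piped into wc -l reports) concrete, here is a minimal Python sketch. It uses the contents of file1 from the example in Solution 1; the function names are made up for illustration and are not part of grep or any library.

```python
# Contents of file1 from the example in Solution 1.
text = "line1 foo foo\nline2 foo\nline3 foo\n"
term = "foo"


def count_matching_lines(text, term):
    """Count lines that contain the term at least once (like grep -F -c)."""
    return sum(1 for line in text.splitlines() if term in line)


def count_occurrences(text, term):
    """Count every non-overlapping occurrence (like grep -F -o | wc -l)."""
    return text.count(term)


print(count_matching_lines(text, term))  # 3
print(count_occurrences(text, term))     # 4
```

The first line of the file contains the term twice, so it contributes 1 to the line count but 2 to the occurrence count.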
Solution 3:
In a small python script:
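#!/usr/bin/env python3
import os
import sys

s = sys.argv[1]
n = 0
for root, dirs, files in os.walk(os.getcwd()):
    for f in files:
        f = root+"/"+f
        try:
            # close the file again after reading it
            with open(f) as fh:
                n = n + fh.read().count(s)
        except Exception:
            # skip files that cannot be read as text
            pass
print(n)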
- Save it as count_string.py.
- Run it from the directory with the command:
python3 /path/to/count_string.py <term>
Notes
- If the term includes spaces, use quotes.
- It counts every occurrence of the term recursively, including multiple occurrences on one line.
Explanation:
# get the current working directory
currdir = os.getcwd()
# get the term as argument
s = sys.argv[1]
# count occurrences, set start to 0
n = 0
# use os.walk() to read recursively
for root, dirs, files in os.walk(currdir):
    for f in files:
        # join the path(s) above the file and the file itself
        f = root+"/"+f
        # try to read the file (will fail if the file is unreadable for some reason)
        try:
            # add the number of found occurrences of <term> in the file
            with open(f) as fh:
                n = n + fh.read().count(s)
        except Exception:
            pass
print(n)
Solution 4:
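As a variant of @kos's nice answer, if you are interested in itemizing the counts, you can use grep's -c switch to get a count per file. Note that -c counts matching lines rather than individual occurrences, so -o has no effect on the numbers. With the example files from Solution 1, each file therefore reports 3 (its matching lines) rather than 4 (its occurrences):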
$ grep -rFoc foo
file1:3
dir/file2:3
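If you want itemized counts of every occurrence rather than of matching lines, one option is to adapt the Python approach from Solution 3 so that it keeps a count per file. The following is only a sketch under a few assumptions: the function name count_per_file is invented here, files are read as text with undecodable bytes ignored, and unreadable files are silently skipped.

```python
import os
import sys


def count_per_file(top, term):
    """Return {relative path: number of occurrences} for files under top."""
    counts = {}
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                # errors="ignore" drops bytes that cannot be decoded as text
                with open(path, errors="ignore") as fh:
                    n = fh.read().count(term)
            except OSError:
                continue  # unreadable file: skip it
            if n:
                counts[os.path.relpath(path, top)] = n
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # print one "path:count" line per file, similar to grep -c output
    for path, n in sorted(count_per_file(os.getcwd(), sys.argv[1]).items()):
        print(f"{path}:{n}")
```

Run from the directory to search, with the term as the only argument. Unlike grep -c, files without any occurrence are left out of the listing.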