What is the best way to count "find" results?
Why not
find <expr> | wc -l
as a simple portable solution? Your original solution spawns a new printf process for every individual file found, and that's very expensive (as you've just found).
Note that this will overcount if you have filenames with newlines embedded, but if you have that then I suspect your problems run a little deeper.
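To make the overcount concrete, here is a minimal sketch. It assumes mktemp -d is available; the scratch directory and file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo: two regular files, one with a newline in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt"
touch "$demo_dir/with
newline.txt"

# Two files exist, but find prints three lines: the second name spans two.
line_count=$(find "$demo_dir" -type f | wc -l)
echo "files: 2, lines counted: $line_count"

rm -rf "$demo_dir"
```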
Try this instead (requires find's -printf support):
find <expr> -type f -printf '.' | wc -c
It will be more reliable and faster than counting the lines.
Note that I use find's own -printf, not an external command.
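As a rough sketch of why this is more reliable, the same idea can be wrapped in a function and run against a name that contains a newline. The function name count_regular_files and the scratch files are made up for illustration, and -printf is assumed to be available (it is a GNU find feature).

```shell
#!/bin/sh
# count_regular_files is a hypothetical helper name; -printf needs GNU find.
count_regular_files() {
    # Emit one '.' per regular file, then count the bytes.
    find "${1:-.}" -type f -printf '.' | wc -c
}

demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/odd
name"

# The newline in the third name does not matter: one dot per file.
dot_count=$(count_regular_files "$demo_dir")
echo "regular files: $dot_count"

rm -rf "$demo_dir"
```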
Let's benchmark a bit:
$ ls -1
a
e
l
ll.sh
r
t
y
z
Benchmark of my snippet:
$ time find -type f -printf '.' | wc -c
8
real 0m0.004s
user 0m0.000s
sys 0m0.007s
With full lines:
$ time find -type f | wc -l
8
real 0m0.006s
user 0m0.003s
sys 0m0.000s
So my solution is faster =) (the important part is the real line)
This solution is certainly slower than some of the other find -> wc solutions here, but if you were inclined to do something else with the file names in addition to counting them, you could read from the find output.
n=0
while IFS= read -r -d '' file; do
  ((n++)) # count
  # maybe perform another action on "$file"
done < <(find <expr> -print0)
echo "$n"
It is just a modification of a solution found in BashGuide that properly handles files with nonstandard names by making the find output delimiter a NUL byte using -print0, and reading from it using '' (NUL byte) as the loop delimiter.
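For instance, the loop can count files and pick out some of them in the same pass. This is only a sketch: it needs bash (for read -d and process substitution), and the scratch directory, file names, and the log_files array are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: count all files and collect the ones ending in .log in one pass.
demo_dir=$(mktemp -d)
touch "$demo_dir/app.log" "$demo_dir/notes.txt" "$demo_dir/odd
name.log"

total=0
log_files=()
# IFS= keeps leading/trailing whitespace; -d '' splits on NUL bytes.
while IFS= read -r -d '' path; do
    total=$((total + 1))
    case $path in
        *.log) log_files+=("$path") ;;
    esac
done < <(find "$demo_dir" -type f -print0)

echo "total: $total, log files: ${#log_files[@]}"
rm -rf "$demo_dir"
```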
This is the countfiles function in my ~/.bashrc (it's reasonably fast, should work for Linux & FreeBSD find, and does not get fooled by file paths containing newline characters; the final wc just counts NUL bytes):
countfiles ()
{
    command find "${1:-.}" -type f -name "${2:-*}" -print0 |
        command tr -dc '\0' | command wc -c;
    return 0
}
countfiles
countfiles ~ '*.txt'
POSIX compliant and newline-proof:
find /path -exec printf %c {} + | wc -c
And, from my tests in /, not even two times slower than the other solutions, which are either not newline-proof or not portable.
Note the + instead of \;. That is crucial for performance, as \; spawns one printf command per file name, whereas + gives as many file names as it can to a single printf command. (And in the possible case where there are too many arguments, find spawns additional printf processes on demand to cope with it, so it would be as if
{
printf %c very long argument list1
printf %c very long argument list2
printf %c very long argument list3
} | wc -c
were called.)
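The difference in spawned processes can be observed directly. In this sketch, each command that find starts prints one line, so counting lines counts spawns. The scratch directory and file names are made up, and sh -c stands in for printf purely for illustration.

```shell
#!/bin/sh
# Sketch: count how many commands find spawns with \; versus +.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c" "$demo_dir/d"

# \; starts one sh per file, so four lines are printed.
per_file=$(find "$demo_dir" -type f -exec sh -c 'echo spawned' sh {} \; | wc -l)
# + passes all four names to a single sh, so one line is printed.
batched=$(find "$demo_dir" -type f -exec sh -c 'echo spawned' sh {} + | wc -l)

printf 'per-file: %s spawns, batched: %s spawns\n' "$per_file" "$batched"
rm -rf "$demo_dir"
```

With only four short names everything fits in one argument list; on a very large tree the batched count would grow, but far more slowly than the per-file count.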