Linux command: How to 'find' only text files?
After a few Google searches, this is what I came up with:
find my_folder -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text
This is unwieldy and prints unneeded text such as MIME type information. Are there any better solutions? I have lots of images and other binary files in the same folder as the many text files I need to search through.
Solution 1:
I know this is an old thread, but I stumbled across it and thought I'd share my method, which I have found to be a very fast way to use find to find only non-binary files:
find . -type f -exec grep -Iq . {} \; -print
The -I option to grep tells it to immediately ignore binary files, and the . pattern along with -q makes it match text files immediately, so it runs very fast. You can change the -print to -print0 for piping into xargs -0 or similar if you are concerned about spaces (thanks for the tip, @lucas.werkmeister!)
Also, the first dot (the starting directory) is only necessary for certain BSD versions of find, such as on OS X, but it doesn't hurt to have it there all the time if you want to put this in an alias or something.
EDIT: As @ruslan correctly pointed out, the -and can be omitted since it is implied.
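To make the -print0 / xargs -0 suggestion concrete, here is a minimal sketch that chains the text-file filter into a search. The function name search_with_find is hypothetical, and the binary-file detection described is that of GNU grep; other grep implementations may decide differently.

```shell
#!/bin/bash
# Sketch only: search_with_find is an illustrative name, not part of the answer.
# usage: search_with_find DIRECTORY NEEDLE_TEXT
search_with_find() {
    local dir=$1 needle=$2
    # grep -Iq . succeeds only for non-empty files that grep does not
    # consider binary, so -print0 is reached for text files only.
    # NUL-terminated names keep paths with spaces intact for xargs -0.
    find "$dir" -type f -exec grep -Iq . {} \; -print0 |
        xargs -0 grep -l -- "$needle"
}
```

Running search_with_find my_folder "needle text" should print only the text files that contain the needle. Note that empty files never match the . pattern, so they are skipped too.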
Solution 2:
Based on this SO question:
grep -rIl "needle text" my_folder
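For reference, -r recurses into the directory, -I skips files grep considers binary, and -l prints only the names of matching files. A small wrapper might look like the sketch below; the name search_with_grep is made up for illustration, and the added -F (treat the needle as a literal string rather than a regex) is my own assumption about what is wanted, not part of the original command.

```shell
#!/bin/bash
# Sketch only: search_with_grep is a hypothetical helper name.
# usage: search_with_grep DIRECTORY NEEDLE_TEXT
search_with_grep() {
    local dir=$1 needle=$2
    # -r: recurse, -I: skip binary files, -l: list matching file names,
    # -F: literal match (an assumption; drop it to search with a regex).
    # "--" stops option parsing in case the needle starts with a dash.
    grep -rIlF -- "$needle" "$dir"
}
```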
Solution 3:
Why is it unhandy? If you need to use it often and don't want to type it every time, just define a bash function for it:
function findTextInAsciiFiles {
    # usage: findTextInAsciiFiles DIRECTORY NEEDLE_TEXT
    find "$1" -type f -exec grep -l "$2" {} \; -exec file {} \; | grep text
}
Put it in your .bashrc and then just run:
findTextInAsciiFiles your_folder "needle text"
whenever you want.
EDIT to reflect OP's edit:
If you want to cut out the MIME information, you could add a further stage to the pipeline that filters it out. This should do the trick, by taking only what comes before the colon, cut -d':' -f1:
function findTextInAsciiFiles {
    # usage: findTextInAsciiFiles DIRECTORY NEEDLE_TEXT
    find "$1" -type f -exec grep -l "$2" {} \; -exec file {} \; | grep text | cut -d ':' -f1
}
Solution 4:
find . -type f -print0 | xargs -0 file | grep -P text | cut -d: -f1 | xargs grep -Pil "search"
Unfortunately, this is not space-safe. Putting it into a bash script makes it a bit easier.
This version is space-safe:
#!/bin/bash
# Require a search term; print usage and fail otherwise.
if [ -z "$1" ]; then
    echo "Usage: $0 <search>"
    exit 1
fi
find . -type f -print0 \
| xargs -0 file \
| grep -P text \
| cut -d: -f1 \
| xargs -I% grep -Pil -- "$1" "%"