diff a directory recursively, ignoring all binary files
Working on a Fedora Constantine box. I am looking to diff
two directories recursively to check for source changes. Due to the setup of the project (prior to my own engagement with said project! sigh), the directories contain both source and binaries, as well as large binary datasets. While diffing eventually works on these directories, it would take perhaps twenty seconds if I could ignore the binary files.
As far as I understand, diff does not have an 'ignore binary file' mode, but does have an ignore argument which will ignore regular expression within a file. I don't know what to write there to ignore binary files, regardless of extension.
I'm using the following command, but it does not ignore binary files. Does anyone know how to modify this command to do this?
diff -rq dir1 dir2
Solution 1:
Kind of cheating but here's what I used:
diff -r dir1/ dir2/ | sed '/Binary\ files\ /d' >outputfile
This recursively compares dir1 to dir2, sed removes the lines for binary files(begins with "Binary files "), then it's redirected to the outputfile.
Solution 2:
Maybe use grep -I
(which is equivalent to grep --binary-files=without-match
) as a filter to sort out binary files.
dir1='folder-1'
dir2='folder-2'
IFS=$'\n'
for file in $(grep -Ilsr -m 1 '.' "$dir1"); do
diff -q "$file" "${file/${dir1}/${dir2}}"
done