Compare source code files, ignoring formatting differences (like whitespace, linebreaks, ...)
Solution 1:
You can use dwdiff. From man dwdiff:
dwdiff - a delimited word diff program
The program is very clever; see dwdiff --help:
$ dwdiff --help
Usage: dwdiff [OPTIONS] <OLD FILE> <NEW FILE>
-h, --help Print this help message
-v, --version Print version and copyright information
-d <delim>, --delimiters=<delim> Specify delimiters
-P, --punctuation Use punctuation characters as delimiters
-W <ws>, --white-space=<ws> Specify whitespace characters
-u, --diff-input Read the input as the output from diff
-S[<marker>], --paragraph-separator[=<marker>] Show inserted or deleted blocks
of empty lines, optionally overriding the marker
-1, --no-deleted Do not print deleted words
-2, --no-inserted Do not print inserted words
-3, --no-common Do not print common words
-L[<width>], --line-numbers[<width>] Prepend line numbers
-C<num>, --context=<num> Show <num> lines of context
-s, --statistics Print statistics when done
--wdiff-output Produce wdiff compatible output
-i, --ignore-case Ignore differences in case
-I, --ignore-formatting Ignore formatting differences
-m <num>, --match-context=<num> Use <num> words of context for matching
--aggregate-changes Allow close changes to aggregate
-A <alg>, --algorithm=<alg> Choose algorithm: best, normal, fast
-c[<spec>], --color[=<spec>] Color mode
-l, --less-mode As -p but also overstrike whitespace
-p, --printer Use overstriking and bold text
-w <string>, --start-delete=<string> String to mark begin of deleted text
-x <string>, --stop-delete=<string> String to mark end of deleted text
-y <string>, --start-insert=<string> String to mark begin of inserted text
-z <string>, --stop-insert=<string> String to mark end of inserted text
-R, --repeat-markers Repeat markers at newlines
--profile=<name> Use profile <name>
--no-profile Disable profile reading
Test it with:
cat << EOF > test_diff1.txt
else if (prop == "P1") { return 0; }
EOF
cat << EOF > test_diff2.txt
else if (prop == "P1") {
return 0;
}
EOF
Then run the comparison:
$ dwdiff test_diff1.txt test_diff2.txt --statistics
else if (prop == "P1") {
return 0;
}
old: 9 words 9 100% common 0 0% deleted 0 0% changed
new: 9 words 9 100% common 0 0% inserted 0 0% changed
Note the 100% common in the statistics above: the two files contain the same words and differ only in whitespace and line breaks.
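To see why the statistics come out at 100% common, here is a minimal Python sketch of a word-level comparison. It is not dwdiff's actual algorithm: it assumes that "words" are simply whitespace-separated chunks and uses the standard difflib module to match them. The helper names words and word_stats are hypothetical.

```python
import difflib


def words(text: str) -> list:
    # str.split() with no arguments splits on any run of whitespace,
    # so spaces, tabs and newlines are all treated alike.
    return text.split()


def word_stats(old: str, new: str) -> tuple:
    """Return (common, deleted, inserted) word counts (hypothetical helper)."""
    a, b = words(old), words(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    common = sum(block.size for block in matcher.get_matching_blocks())
    return common, len(a) - common, len(b) - common


old = 'else if (prop == "P1") { return 0; }\n'
new = 'else if (prop == "P1") {\n  return 0;\n}\n'
print(word_stats(old, new))  # (9, 0, 0)
```

A limitation of splitting on whitespace alone: if(x) and if (x) count as different words, because whitespace is the only delimiter in this sketch.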
Solution 2:
I doubt this is something that diff can do. If the whitespace changes are within a line, then it will work (as will similar programs like kompare). At worst, you can do a search-and-replace to collapse tab characters and the like. But what you're asking for is to ignore whitespace changes that span more than one line...
You would need a program that understands the C++ language. Note that languages differ in how they treat whitespace; Python, in particular, uses indentation to define code blocks. As such, I doubt any general diff-like program would work with "any" programming language, or even reliably with a specific one.
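The Python case can be checked with Python's own parser. The sketch below is only an illustration for Python source, not a general tool: it uses the standard ast module, and the function name same_structure and the sample snippets are made up for the example.

```python
import ast


def same_structure(src_a: str, src_b: str) -> bool:
    # ast.dump() omits line and column attributes by default, so only the
    # structure of the parsed program is compared, not its layout.
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))


original = "if ready:\n    x = 1\n    y = 2\n"
# Extra spaces, a blank line and deeper indentation: same program.
respaced = "if ready :\n\n        x  =  1\n        y = 2\n"
# One line dedented: y = 2 is no longer inside the if block.
dedented = "if ready:\n    x = 1\ny = 2\n"

print(same_structure(original, respaced))  # True
print(same_structure(original, dedented))  # False
```

Both variants differ from the original only in whitespace, yet only the first one is a pure formatting change. A tool that ignores all whitespace would miss the second.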
You might consider some kind of parser to go through the two source files and then compare the outputs of this parser.
This is beyond my background, but I suggest you look into Lex and Yacc. Both have Wikipedia pages; you might also want to take a look at this page, which gives a concise explanation and an example.
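As a rough idea of what the lexer half of such a tool might look like, here is a minimal Python sketch that splits C-like source into tokens with a regular expression and compares the token lists. It is an assumption-laden illustration, not a real C++ lexer: the token pattern is hand-written for this example, the names TOKEN_RE, tokens and same_tokens are hypothetical, and comments and preprocessor directives (where line breaks matter) are not handled.

```python
import re

# Hand-written token pattern for illustration only.
TOKEN_RE = re.compile(r'''
    "(?:\\.|[^"\\])*"        # double-quoted string literal, kept intact
  | '(?:\\.|[^'\\])*'        # character literal
  | [A-Za-z_]\w*             # identifier or keyword
  | \d+(?:\.\d+)?            # simple integer or decimal number
  | ==|!=|<=|>=|&&|\|\||->|::|\+\+|--   # a few two-character operators
  | \S                       # any other single non-space character
''', re.VERBOSE)


def tokens(source: str) -> list:
    # Whitespace between tokens never matches, so it is simply skipped.
    return TOKEN_RE.findall(source)


def same_tokens(src_a: str, src_b: str) -> bool:
    return tokens(src_a) == tokens(src_b)


a = 'else if (prop == "P1") { return 0; }'
b = 'else if(prop=="P1"){\n    return 0;\n}\n'
c = 'else if (prop == "P 1") { return 0; }'

print(same_tokens(a, b))  # True: only the layout differs
print(same_tokens(a, c))  # False: the string literal itself changed
```

Unlike a whitespace-only word split, this treats if(prop and if (prop as the same tokens, while still noticing a space added inside a string literal.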