How to compare the differences between two PDF files on Windows?
Try WinMerge with the xdocdiff plugin. Both are completely free. No strings attached.
A couple of the comments below suggest they don't see any difference. That means the plug-in isn't installed correctly. Here's how:
1. Put the files where the xdocdiff plugin's readme file says to put them (there are two places; I won't list them here as filenames can change, etc.; read the readme).
2. In WinMerge, go to Plugins > List and tick the "Enable Plugins" checkbox (this step is missing from the xdocdiff readme).
3. In WinMerge, choose Plugins > Automatic Unpacking (this was disabled prior to step 2).
Then when comparing, you'll see what look like text files in the comparison windows.
On Linux and Windows you can use diffpdf (which differs from the diff-pdf mentioned in this thread).
On Ubuntu install using:
sudo apt-get install diffpdf
See also this UbuntuGeek page on comparing PDFs textually or visually.
For Windows, this DiffPDF Windows version works really well. You can download it from http://soft.rubypdf.com/software/diffpdf (scroll down to the Win32 static version).
I recently found this and I love it.
https://github.com/vslavik/diff-pdf
Cross platform, free, and works well.
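For use in scripts rather than interactively, diff-pdf can also write the differences to a PDF and report the result through its exit status (0 when the files match, 1 when they differ, per its README). The following is only a sketch: the function name compare_pdfs and the DIFF_PDF override variable are made up for this example, and any other exit status is simply treated as an error.

```shell
#!/bin/sh
# Sketch: non-interactive diff-pdf run, e.g. for a build script.
# DIFF_PDF is a hypothetical override for the binary name (default: diff-pdf).

compare_pdfs() {
  old="$1"
  new="$2"
  out="${3:-diff.pdf}"   # PDF that receives the highlighted differences

  if "${DIFF_PDF:-diff-pdf}" --output-diff="$out" "$old" "$new"; then
    status=0
  else
    status=$?
  fi

  case "$status" in
    0) echo "identical" ;;
    1) echo "different (see $out)" ;;
    *) echo "error (exit status $status)" ;;
  esac
  return "$status"
}
```

Usage would look like: compare_pdfs old.pdf new.pdf changes.pdf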
Here is a screenshot of diff-pdf in action. Note that the text itself does not differ between the two PDFs, only the fonts (and, correspondingly, the layout):
The call to obtain that image was:
diff-pdf --view testA.pdf testB.pdf
... where testA.pdf/testB.pdf are obtained by compiling this simple LaTeX file with pdflatex (once for each PDF, see the comments):
\documentclass[12pt]{article}
% without mathpazo: testA.pdf
\usepackage{mathpazo} % with mathpazo: testB.pdf
\usepackage{lipsum}
\title{A brand new test}
\author{Testulio}
\begin{document}
\maketitle
\lipsum[1-3]
\end{document}
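To build both test files without editing the source by hand, something along these lines might work. It is a sketch under a few assumptions: the LaTeX source above is saved as test.tex (a filename not given in the original), pdflatex is on the PATH, and the helper names strip_mathpazo and build_variants are invented here.

```shell
#!/bin/sh
# Sketch: build testA.pdf (without mathpazo) and testB.pdf (with mathpazo)
# from a single source file. All names below are assumptions.

# Copy LaTeX source from stdin to stdout, dropping the mathpazo line.
strip_mathpazo() {
  grep -v '^\\usepackage{mathpazo}'
}

# build_variants SOURCE.tex
build_variants() {
  src="$1"
  strip_mathpazo < "$src" > testA.tex   # variant A: default fonts
  cp "$src" testB.tex                   # variant B: mathpazo kept
  pdflatex -interaction=nonstopmode testA.tex &&
    pdflatex -interaction=nonstopmode testB.tex
}
```

After running build_variants test.tex, the two PDFs can be passed to the diff-pdf --view call shown above.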
We also needed to compare PDFs at our company and were not satisfied with any of the solutions we found, so we made our own: i-net PDFC. It's not free, but we do offer a 30-day trial.
It's written in Java, so it's cross-platform.
What makes it special is that it compares the content as opposed to only the text (or just converting the pdf to an image and comparing the image). It also has a nice visual comparison tool.
I wanted to do this (diff PDFs) recently with these requirements:
- ignore whitespace, line breaks, page breaks, etc.
- easily see when just a couple of words changed, not just entire lines/paragraphs.
- color diff output
I installed pdftotext, wdiff, and colordiff, available in various package managers. (With MacPorts: sudo port install poppler wdiff colordiff)
Then:
wdiff <(pdftotext old.pdf -) <(pdftotext new.pdf -) | colordiff
Now I can see which words, nicely colored, have changed.
More details: http://philfreo.com/blog/how-to-view-a-color-diff-of-text-from-two-pdfs/
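The one-liner relies on bash process substitution. For scripts, a slightly more defensive variant might check that the tools are installed and use temporary files instead. This is only a sketch; the function name pdf_wdiff is made up for the example.

```shell
#!/bin/sh
# Sketch: the word-diff pipeline wrapped in a function with basic checks.
# The function name pdf_wdiff is an assumption, not part of any tool.

pdf_wdiff() {
  # Fail early if one of the three tools is not installed.
  for tool in pdftotext wdiff colordiff; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "pdf_wdiff: $tool not found" >&2
      return 127
    fi
  done

  old_txt=$(mktemp) || return 1
  new_txt=$(mktemp) || { rm -f "$old_txt"; return 1; }

  # Extract the text of both PDFs, then compare word by word.
  if pdftotext "$1" "$old_txt" && pdftotext "$2" "$new_txt"; then
    wdiff "$old_txt" "$new_txt" | colordiff
    status=0
  else
    status=1
  fi

  rm -f "$old_txt" "$new_txt"
  return "$status"
}
```

Usage would look like: pdf_wdiff old.pdf new.pdf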
Variation:
Using dwdiff can produce slightly better results.
I also wanted HTML output so this tiny script makes a basic web page with a bit of CSS.
bash pc-script.bash old.pdf new.pdf > q.html
Then open q.html with your web browser.
pc-script.bash file:
#!/bin/bash
OLD="$1"
NEW="$2"
cat <<EOF
<html><head><meta charset="UTF-8"/><title>Changes from $OLD to $NEW</title></head><style>
.plus { color: green; background: #E7E7E7; }
.minus { color: red; background: #D7D7D7; text-decoration: line-through; }
</style><body><h1>Changes from [ <span class="minus">$OLD</span> ] to [ <span class="plus">$NEW</span> ]</h1><pre>
EOF
# Word-level diff of the extracted text, with changes wrapped in <span> tags.
dwdiff -i -A best -P \
--start-delete='<span class="minus">' --stop-delete='</span>' \
--start-insert='<span class="plus" >' --stop-insert='</span>' \
<( pdftotext -enc UTF-8 -layout "$OLD" - ) \
<( pdftotext -enc UTF-8 -layout "$NEW" - )
cat <<EOF
</pre></body></html>
EOF
An example of output can be seen here
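One caveat: the script puts the extracted text into the page unescaped, so a literal < or & in either PDF may garble the HTML. A possible fix is to escape the text before dwdiff sees it, so that the span tags dwdiff adds stay intact. This is a sketch; the helper name html_escape is made up.

```shell
#!/bin/sh
# Sketch: escape HTML special characters in text read from stdin.
# "&" is handled first so the entities added afterwards are not re-escaped.

html_escape() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# In pc-script.bash, each input would then be filtered like this:
#   <( pdftotext -enc UTF-8 -layout "$OLD" - | html_escape )
```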