Merge PDF annotations from two files
I have two versions of the same PDF document. One has annotations that I made while reading it on my laptop, and the other has annotations that I made on a tablet. Now I want to merge both sets of annotations into a single file.
I know that Adobe Acrobat allows me to do this (see for example this answer on Ask Different). Is there any software I can use in Ubuntu that will allow me to do this?
For what it is worth, I am using Xodo on the tablet.
Solution 1:
At least okular stores comments as objects of /Type/Annot; see these examples for the syntax:
17 0 obj
<<
/Type/Annot
/Rect[67.023 756.168 85.203 774.333]
/Subtype/Text
/M(D:20170828091301)
/T(■ somebody)
/Contents(■ text)
/NM(okular-{8ff65cc1-7b89-45c6-8adf-1aa6cec06cd0})
/F 4
/C[1 1 0]
/CA 0.5
/Border[0 0 1]
/P 20 0 R
>>
endobj
18 0 obj
<<
/Type/Annot
/Rect[37.7 597.841 236.675 615.979]
/Subtype/FreeText
/DA(/Invalid_font 10 Tf)
/M(D:20170828091316)
/T(■ somebody)
/Contents(■ text)
/NM(okular-{50420111-1c05-4e07-8db5-08deffb0ec7e})
/F 20
/C[1 1 0]
/CA 0.5
/Border[0 0 1]
/Q 0
/IT/FreeText
/P 20 0 R
>>
endobj
Those objects are linked to pages by an entry such as /Annots 14 0 R in the page object. That is how the following script deletes all comments in a given PDF file: it simply deletes all the /Annots lines:
pdftk original.pdf output uncompressed.pdf uncompress
LANG=C sed -n '/^\/Annots/!p' uncompressed.pdf > stripped.pdf
pdftk stripped.pdf output final.pdf compress
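Before stripping, it may be worth checking that the sed pattern matches anything at all in your file. The following is only a sketch: the function names count_annots_lines and strip_annots_lines are made up for illustration, and it assumes the input is already the uncompressed.pdf produced by the first pdftk call, with each /Annots entry at the start of its own line.

```shell
# Hypothetical helpers (names are not part of pdftk or sed).
# Both expect a file already uncompressed with:
#   pdftk original.pdf output uncompressed.pdf uncompress

# Print how many lines start with /Annots.
count_annots_lines() {
    LANG=C grep -c '^/Annots' "$1"
}

# Print the file without the lines that start with /Annots
# (same sed expression as in the script above).
strip_annots_lines() {
    LANG=C sed -n '/^\/Annots/!p' "$1"
}

# Example use:
#   count_annots_lines uncompressed.pdf
#   strip_annots_lines uncompressed.pdf > stripped.pdf
```

A count of 0 suggests that the file lays out its /Annots entries differently, in which case the sed line would leave the comments in place.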
If you dive deep into the structure of your specific PDF documents (just open them with a text editor), you may be able to understand what's going on and manipulate them with e.g. sed. However, I seriously doubt that a solution exists that fits every type of PDF document. For what it's worth, (at least for my test file) the following one-liner prints the comments of input.pdf in a terminal:
pdftk input.pdf output - uncompress | sed '/^\/Contents (/!d'
Add >> comments to the end of that line to store the output in a file named comments instead (>> appends if the file already exists).
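To look at whole annotation objects rather than only their /Contents lines, something like the awk sketch below could work. The function name print_annot_objects is hypothetical, and the sketch assumes an already uncompressed file in which every object starts with a line like 17 0 obj, ends with endobj, and has /Type/Annot (with or without a space after /Type) on a single line. I have not checked this against PDFs from other annotation tools.

```shell
# Hypothetical helper (name is not part of pdftk or awk).
# Prints every "N G obj ... endobj" block that contains /Type/Annot.
# Expects a file already uncompressed with pdftk.
print_annot_objects() {
    LANG=C awk '
        # Start of an object: reset the buffer and the flags.
        /^[0-9]+ [0-9]+ obj/ { buf = ""; inobj = 1; isannot = 0 }
        # Collect every line of the current object.
        inobj { buf = buf $0 "\n" }
        # Remember whether this object is an annotation.
        inobj && /\/Type *\/Annot([^A-Za-z]|$)/ { isannot = 1 }
        # End of the object: print it only if it was an annotation.
        /^endobj/ { if (inobj && isannot) printf "%s", buf; inobj = 0 }
    ' "$1"
}

# Example use:
#   pdftk input.pdf output uncompressed.pdf uncompress
#   print_annot_objects uncompressed.pdf
```

This only lists the objects. Copying them into another PDF would also require fixing object numbers, the /P page reference and the page's /Annots entry, which this sketch does not attempt.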