PDF metadata viewer / tag editor for Ubuntu

There are plenty of questions and answers about the best PDF viewer for Ubuntu, but I want to parse a PDF file and inspect its details, such as the images, fonts and links it contains.

Are there any PDF metadata viewer/tag editors available?


  1. View the PDF metadata for a file called Example.pdf:

    pdfinfo Example.pdf  
    
  2. Dump the existing metadata to a text file and edit it in the terminal with the nano editor:

    pdftk Example.pdf dump_data output Metadata-output.txt
    nano Metadata-output.txt  
    
  3. Update metadata:

    pdftk Example.pdf update_info Metadata-output.txt output Example-new.pdf
    

Nano editor keyboard shortcuts
Use the keyboard combination Ctrl + O and after that press Enter to save the file to its current location.
Use the keyboard combination Ctrl + X to exit nano.
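
pdfinfo only reports document-level fields. For the fonts and images mentioned in the question, the poppler-utils package that provides pdfinfo also ships pdffonts and pdfimages. A minimal sketch, assuming both tools are installed; the function name pdf_inventory is made up for illustration:

```shell
# Hypothetical helper: list the fonts and images found in a PDF.
# Assumes pdffonts and pdfimages (from poppler-utils) are on the PATH.
pdf_inventory() {
    # $1 = path to the PDF file
    echo "== Fonts =="
    pdffonts "$1"
    echo "== Images =="
    # -list prints a table describing each image instead of extracting it
    pdfimages -list "$1"
}
```

After defining it, run it as: pdf_inventory Example.pdf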


CLI solution

Another utility worth looking into is exiftool. The advantage exiftool holds over pdfinfo is that it supports a lot more metadata types (e.g. XMP tags).

Here's an example of a command that prints all available metadata, including duplicate tags (-a), with each tag labeled by its group name (-G1):

exiftool -a -G1 "$File"

Overviews of the supported PDF-related tags:

  • PDF Tags
  • XMP PDF tags
  • XMP dc tags

You can install exiftool on Ubuntu with:

sudo apt-get install libimage-exiftool-perl
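
exiftool can also write tags, not just read them. The sketch below wraps a read and a write call in two small functions; the function names and the choice of tags are only examples. By default exiftool keeps a backup of the untouched file with "_original" appended to its name.

```shell
# Hypothetical helpers around exiftool; names and tag selection are illustrative.

# Print a few common tags with group names (-G1) and short tag names (-s).
show_pdf_tags() {
    # $1 = path to the PDF file
    exiftool -G1 -s -Title -Author -Subject -Keywords "$1"
}

# Set the title and author of a PDF.
set_pdf_title_author() {
    # $1 = path to the PDF file, $2 = new title, $3 = new author
    exiftool -Title="$2" -Author="$3" "$1"
}
```

Example: set_pdf_title_author Example.pdf "Quarterly Report" "Jane Doe", then show_pdf_tags Example.pdf to check the result.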

GUI solution

If you are looking for a GUI PDF metadata viewer/editor, you could give PDFMtEd a try. It's a set of graphical utilities I wrote for managing PDF metadata with exiftool.


The answer to "best" really depends on how much detail you want and how stable you need the tool to be. Many programs for viewing and even editing PostScript and PDF files exist on Linux, but all of them seem to have been removed from the current Ubuntu repositories (probably due to stability issues).

For now I'd recommend trying pdfedit. If you are using Quantal (12.10) or earlier, it can be installed via

sudo apt-get install pdfedit

For newer releases you'll need to download it from its project page, unpack it, and compile it yourself.


To elaborate on the pdftk editing method, which is nice because it shows you everything that is being set while letting you change anything you like, here is a shell function (for your .bashrc or other aliases file) that does it with one command. It dumps the metadata to a text file, opens that file in your favourite editor, writes your changes to a new copy of the PDF, and sets the modification time of the new PDF to match the original. To use it, after re-sourcing your .bashrc file, just type

editPDFmetadata myfile.pdf

Here's the function:

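editPDFmetadata() {
    # Usage: editPDFmetadata myfile.pdf  (writes myfile.pdf-new.pdf)
    local input="$1"
    local output="${input}-new.pdf"
    # Keep the report next to the input so paths containing directories work
    local metadata="${input}-report.txt"
    pdftk "$input" dump_data output "$metadata"
    # Fall back to nano if $EDITOR is not set
    "${EDITOR:-nano}" "$metadata"
    pdftk "$input" update_info "$metadata" output "$output"
    # Copy the original file's timestamps onto the new PDF
    touch -r "$input" "$output"
}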

Simply place the definition above into the .bashrc file in your home folder, then open a new terminal and it will be ready to use.
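
If you only want to change one field and skip the editor, the same dump_data/update_info round trip can be scripted. This is a rough sketch, not a tested tool: the function names are made up, it assumes the dump lists each field as an "InfoKey: ..." line followed directly by an "InfoValue: ..." line, and it assumes plain ASCII values. If the key is not already present in the dump, nothing is changed.

```shell
# Hypothetical helpers; only pdftk's dump_data and update_info come from the answers above.

# Read dump_data text on stdin and replace the InfoValue that follows "InfoKey: <key>".
replace_info_value() {
    # $1 = key (e.g. Title), $2 = new value; passed via the environment
    # so awk does not interpret backslashes in them
    KEY="$1" VAL="$2" awk '
        $0 == ("InfoKey: " ENVIRON["KEY"]) { print; hit = 1; next }
        hit && /^InfoValue: / { print "InfoValue: " ENVIRON["VAL"]; hit = 0; next }
        { hit = 0; print }
    '
}

# set_pdf_info INPUT.pdf KEY VALUE OUTPUT.pdf
# pdftk writes the dump to stdout when no output file is given,
# and update_info reads from stdin when given "-".
set_pdf_info() {
    pdftk "$1" dump_data | replace_info_value "$2" "$3" | pdftk "$1" update_info - output "$4"
}
```

Example: set_pdf_info Example.pdf Title "New Title" Example-new.pdf. The output file must differ from the input, since pdftk cannot write over the file it is reading.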