How can I convert an ODT file to a PDF?

Does anyone know how to convert an ODT file (LibreOffice) to PDF?


Solution 1:

You can also use LibreOffice's command-line interface. It has the advantage of batch conversion, though it handles single files just as well. This example converts all ODT files in the current directory to PDF:

libreoffice --headless --convert-to pdf *.odt

Get more information on command-line options with:

man libreoffice
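For larger batches it can help to wrap the one-liner in a small function that writes the PDFs to a separate directory and refuses to run when no ODT file matches. This is only a sketch: the function name convert_odts and the SOFFICE override variable are made up for illustration and are not part of LibreOffice; --outdir is the regular LibreOffice option for the output directory.

```shell
#!/bin/sh
# Sketch: convert every .odt file in a directory to PDF.
# convert_odts and SOFFICE are hypothetical names, not LibreOffice features.
# SOFFICE lets you point at another launcher (e.g. soffice); default: libreoffice.
convert_odts() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    # With no matches the glob stays literal, so test the first result.
    set -- "$src_dir"/*.odt
    if [ ! -e "$1" ]; then
        echo "no .odt files found in $src_dir" >&2
        return 1
    fi
    "${SOFFICE:-libreoffice}" --headless --convert-to pdf --outdir "$out_dir" "$@"
}

# Usage (uncomment to run):
# convert_odts . ./pdf
```

Passing all files to a single LibreOffice invocation keeps the batch advantage mentioned above, since the program only starts once.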

Solution 2:

Just open the document with LibreOffice and choose File > Export as PDF...

For a command-line solution, there is unoconv:

unoconv -f pdf mydocument.odt

Note: unoconv depends on LibreOffice.
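If you are not sure which of the two tools is installed on a given machine, a small wrapper can pick one at run time. This is a rough sketch: to_pdf is a made-up helper name, and the fallback assumes a launcher called libreoffice is in your PATH (it may be soffice on your system).

```shell
#!/bin/sh
# Sketch: prefer unoconv when it is installed, otherwise fall back to LibreOffice.
# to_pdf is a hypothetical helper name.
to_pdf() {
    doc=$1
    if command -v unoconv >/dev/null 2>&1; then
        unoconv -f pdf "$doc"
    elif command -v libreoffice >/dev/null 2>&1; then
        # --outdir is set so the PDF lands next to the source file
        # instead of in the current working directory.
        libreoffice --headless --convert-to pdf --outdir "$(dirname "$doc")" "$doc"
    else
        echo "neither unoconv nor libreoffice found in PATH" >&2
        return 127
    fi
}

# Usage (uncomment to run):
# to_pdf mydocument.odt
```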

Solution 3:

Here are a few more details about the "non-GUI" method.

  1. You can use this method for more than converting ODT files to PDF. It also works for MS Word DOCX files (as well as LibreOffice is able to handle the particular DOCX) and, in general, for all file types which LibreOffice can open.

  2. A binary named libreoffice, as used in one of the other answers, may not exist on every system. However, there is soffice(.bin), the binary that can be used to start LibreOffice from the command line. It is usually located in /usr/lib/libreoffice/program/, and very often a symlink /usr/bin/soffice points to that location.

  3. Then, in most cases the parameters --headless --convert-to pdf are not sufficient. It needs to be:

    --headless --convert-to pdf:writer_pdf_Export
    

    Be sure to follow exactly this capitalization!

  4. Next, the command will not work if there is already a LibreOffice GUI instance up and running on your system. It is caused by bug #37531, known since 2011. Add this additional parameter to your command:

     "-env:UserInstallation=file:///tmp/LibreOffice_Conversion_${USER}"
    

    This will create a new, separate environment which can be used by a second, headless LO instance without interfering with a possibly running first GUI LO instance started by the same user.

  5. Also, make sure that the --outdir /pdf you specify exists and that you have write permission to it. Or rather, use a different output directory, even if it is just for a first round of testing and debugging:

    $ mkdir ${HOME}/lo_pdfs
    
  6. Hence:

    /path/to/soffice                                                     \
      --headless                                                         \
      "-env:UserInstallation=file:///tmp/LibreOffice_Conversion_${USER}" \
      --convert-to pdf:writer_pdf_Export                                 \
      --outdir ${HOME}/lo_pdfs                                           \
    /path/to/test.docx
    

    This works for me on Mac OS X Yosemite 10.10.5 with LibreOffice v5.1.2.2 (using my specific path for the binary soffice which will be different on Ubuntu anyway...). It also works on Debian Jessie 8.0 (using path /usr/lib/libreoffice/program/soffice). Sorry, cannot test it on Ubuntu right now....
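The points above can be rolled into one helper, roughly like this. Treat it as a sketch under assumptions: find_soffice and convert_to_pdf are made-up names, and the candidate paths are simply the ones mentioned in this answer plus whatever "soffice" resolves to in your PATH. Adjust the list for your system.

```shell
#!/bin/sh
# Sketch combining the points above.
# find_soffice and convert_to_pdf are hypothetical helper names.
find_soffice() {
    for candidate in \
        "$(command -v soffice 2>/dev/null)" \
        /usr/bin/soffice \
        /usr/lib/libreoffice/program/soffice \
        /Applications/LibreOffice.app/Contents/MacOS/soffice
    do
        if [ -n "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

convert_to_pdf() {
    input=$1
    out_dir=${2:-$HOME/lo_pdfs}
    soffice_bin=$(find_soffice) || { echo "soffice not found" >&2; return 1; }
    mkdir -p "$out_dir" || return 1
    # A separate profile directory keeps the headless run away from
    # a GUI instance that may already be running for the same user.
    "$soffice_bin" \
        --headless \
        "-env:UserInstallation=file:///tmp/LibreOffice_Conversion_${USER:-$(id -un)}" \
        --convert-to pdf:writer_pdf_Export \
        --outdir "$out_dir" \
        "$input"
}

# Usage (uncomment to run):
# convert_to_pdf /path/to/test.docx "$HOME/lo_pdfs"
```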

    If none of this works when you try to process a DOCX file:

  7. It may be a problem with the specific DOCX file you try the command with... So create a very simple DOCX document of your own first. Use LibreOffice itself for this. Write "Hello World!" on an otherwise empty page. Save it as DOCX.

  8. Try again. Does it work with the simple DOCX?

  9. If it again doesn't work, repeat step 7, but save as ODT this time.

  10. Repeat step 8, but make sure to reference the ODT this time.

  11. Last: use the full paths to soffice, soffice.bin and libreoffice, and run each with the -h parameter:

    $ /path/to/libreoffice -h  # if that path exists, which I doubt!
    $ /path/to/soffice -h
    $ /path/to/soffice.bin -h
    
    • Do you get an output here?
    • For which one of the three binaries/symlinks?
    • Record the outputs.
    • Tell us your outputs!!!
       

    Compare them to the command line you used:

    • Are there any changes in parameter names, capitalizations, number of dashes used, etc.??
       

    For comparison, my own (Mac OS X) output is here:

     $ /Applications/LibreOffice.app/Contents/MacOS/soffice -h 
    
     LibreOffice 5.1.2.2 d3bf12ecb743fc0d20e0be0c58ca359301eb705f
    
     Usage: soffice [options] [documents...]
    
     Options:
     --minimized    keep startup bitmap minimized.
     --invisible    no startup screen, no default document and no UI.
     --norestore    suppress restart/restore after fatal errors.
     --quickstart   starts the quickstart service
     --nologo       don't show startup screen.
     --nolockcheck  don't check for remote instances using the installation
     --nodefault    don't start with an empty document
     --headless     like invisible but no user interaction at all.
     --help/-h/-?   show this message and exit.
     --version      display the version information.
     --writer       create new text document.
     --calc         create new spreadsheet document.
     --draw         create new drawing.
     --impress      create new presentation.
     --base         create new database.
     --math         create new formula.
     --global       create new global document.
     --web          create new HTML document.
     -o             open documents regardless whether they are templates or not.
     -n             always open documents as new files (use as template).
    
     --display <display>
           Specify X-Display to use in Unix/X11 versions.
     -p <documents...>
           print the specified documents on the default printer.
     --pt <printer> <documents...>
           print the specified documents on the specified printer.
     --view <documents...>
           open the specified documents in viewer-(readonly-)mode.
     --show <presentation>
           open the specified presentation and start it immediately
     --accept=<accept-string>
           Specify an UNO connect-string to create an UNO acceptor through which
           other programs can connect to access the API
     --unaccept=<accept-string>
           Close an acceptor that was created with --accept=<accept-string>
           Use --unnaccept=all to close all open acceptors
     --infilter=<filter>[:filter_options]
           Force an input filter type if possible
           Eg. --infilter="Calc Office Open XML"
               --infilter="Text (encoded):UTF8,LF,,,"
     --convert-to output_file_extension[:output_filter_name[:output_filter_options]] [--outdir output_dir] files
           Batch convert files (implies --headless).
           If --outdir is not specified then current working dir is used as output_dir.
           Eg. --convert-to pdf *.doc
               --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc
               --convert-to "html:XHTML Writer File:UTF8" *.doc
               --convert-to "txt:Text (encoded):UTF8" *.doc
     --print-to-file [-printer-name printer_name] [--outdir output_dir] files
           Batch print files to file.
           If --outdir is not specified then current working dir is used as output_dir.
           Eg. --print-to-file *.doc
               --print-to-file --printer-name nasty_lowres_printer --outdir /home/user *.doc
     --cat files
           Dump text content of the files to console
           Eg. --cat *.odt
     --pidfile=file
           Store soffice.bin pid to file.
     -env:<VAR>[=<VALUE>]
           Set a bootstrap variable.
           Eg. -env:UserInstallation=file:///tmp/test to set a non-default user profile path.
    
     Remaining arguments will be treated as filenames or URLs of documents to open.
    
  12. Add one more argument to your command line to enforce the application of an input filter when soffice opens your DOCX file:

    --infilter="Microsoft Word 2007/2010/2013 XML"
    

    or one of the following:

    --infilter="Microsoft Word 2007/2010/2013 XML"
    --infilter="Microsoft Word 2007-2013 XML"
    --infilter="Microsoft Word 2007-2013 XML Template"
    --infilter="Microsoft Word 95 Template"
    --infilter="MS Word 95 Vorlage"
    --infilter="Microsoft Word 97/2000/XP Template"
    --infilter="MS Word 97 Vorlage"
    --infilter="Microsoft Word 2003 XML"
    --infilter="MS Word 2003 XML"
    --infilter="Microsoft Word 2007 XML Template"
    --infilter="MS Word 2007 XML Template"
    --infilter="Microsoft Word 6.0"
    --infilter="MS WinWord 6.0"
    --infilter="Microsoft Word 95"
    --infilter="MS Word 95"
    --infilter="Microsoft Word 97/2000/XP"
    --infilter="MS Word 97"
    --infilter="Microsoft Word 2007 XML"
    --infilter="MS Word 2007 XML"
    --infilter="Microsoft WinWord 5"
    --infilter="MS WinWord 5"
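The check from step 11 can be partly automated. The following is a sketch, not a full diagnosis: check_launchers is a made-up helper name, and it only reports whether each path is executable and whether its -h output mentions --convert-to. You still have to compare the exact spelling of the other parameters yourself.

```shell
#!/bin/sh
# Sketch: report which launcher paths are usable and whether their
# help text mentions --convert-to. check_launchers is a hypothetical name.
check_launchers() {
    for bin in "$@"; do
        if [ ! -x "$bin" ]; then
            printf '%s: not found or not executable\n' "$bin"
            continue
        fi
        # Capture stdout and stderr; ignore the exit status of -h.
        help_text=$("$bin" -h 2>&1) || true
        case $help_text in
            *--convert-to*) printf '%s: -h output lists --convert-to\n' "$bin" ;;
            *)              printf '%s: -h output does not list --convert-to\n' "$bin" ;;
        esac
    done
}

# Usage (replace the /path/to/ placeholders with your own paths):
# check_launchers /path/to/libreoffice /path/to/soffice /path/to/soffice.bin
```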
    

Solution 4:

Nautilus Script

This script uses LibreOffice to convert any file that LibreOffice can open to PDF.

#!/bin/bash
## PDFconvert 0.1
## by Glutanimate (https://askubuntu.com/users/81372/)
## License: GPL 3.0
## depends on python, libreoffice
## Note: if you are using a non-default LO version (e.g. because you installed it 
## from a precompiled package instead of the official repos) you might have to change
## 'libreoffice' according to the version you're using, e.g. 'libreoffice3.6'

# Get work directory
base="$(python -c 'import gio,sys; print(gio.File(sys.argv[1]).get_path())' "$NAUTILUS_SCRIPT_CURRENT_URI")"

#Convert documents
while [ $# -gt 0 ]; do
    document=$1
    libreoffice --headless --invisible --convert-to pdf --outdir "$base" "$document" 
    shift

done

For installation instructions see here: How can I install a Nautilus script?
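If the Python gio module is not available on your system, one possible variant is to skip the work-directory lookup and write each PDF next to its source file. This is a sketch under the same assumption the loop above makes, namely that Nautilus hands the selected files to the script as arguments; pdf_beside_source and the LIBREOFFICE override variable are made-up names.

```shell
#!/bin/bash
# Sketch: write each PDF into the directory of its source document,
# so the Python call for the work directory is not needed.
# pdf_beside_source and LIBREOFFICE are hypothetical names.
pdf_beside_source() {
    local document
    for document in "$@"; do
        "${LIBREOFFICE:-libreoffice}" --headless --convert-to pdf \
            --outdir "$(dirname -- "$document")" "$document"
    done
}

# In a Nautilus script, pass the script's arguments through (uncomment to use):
# pdf_beside_source "$@"
```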

Solution 5:

I'm adding a new answer because Pandoc has gained the ability to read ODT files, which opens up a series of new conversion paths.

When Pandoc reads in a file format, it converts it into an internal representation, "native" (an abstract syntax tree, which can also be serialized as JSON).

From its native form, it can then export the document into a whole range of other formats. Not only PDF, but also DocBook, HTML, EPUB, DOCX, AsciiDoc, DokuWiki, MediaWiki and what-not...

Since the wanted output format here is PDF, we have a further choice between different paths, provided by what Pandoc calls a pdf-engine. Here is the list of currently available PDF engines (valid for Pandoc v2.7.2 and later; previous versions may support only a smaller list):

  • pdflatex: This requires LaTeX to be installed in addition to Pandoc.

  • xelatex: This requires XeLaTeX to be installed in addition to Pandoc (also available as an additional package to general TeX distributions).

  • context: This requires ConTeXt to be installed in addition to Pandoc (ConTeXt is available as an additional package to most general TeX distributions).

  • lualatex: This requires LuaTeX to be installed in addition to Pandoc (also available as an additional package to general TeX distributions).

  • pdfroff: This requires GNU Roff to be installed in addition to Pandoc.

  • wkhtmltopdf: This requires wkhtmltopdf to be installed in addition to Pandoc.

  • prince: This requires PrinceXML to be installed in addition to Pandoc.

  • weasyprint: This requires weasyprint to be installed in addition to Pandoc.

There are some more and newer PDF engines now integrated into Pandoc, which I have not yet used myself and which I currently cannot describe in more detail: tectonic and latexmk.
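Before picking an engine, you may want to see which of them are actually installed. A quick sketch, assuming each engine's executable carries the same name as listed above (available_pdf_engines is a made-up helper name):

```shell
#!/bin/sh
# Sketch: list which of the PDF engine executables named above are in PATH.
# available_pdf_engines is a hypothetical helper name.
available_pdf_engines() {
    for engine in pdflatex xelatex lualatex context pdfroff \
                  wkhtmltopdf prince weasyprint tectonic latexmk
    do
        if command -v "$engine" >/dev/null 2>&1; then
            printf '%s\n' "$engine"
        fi
    done
}

# Usage (uncomment to run):
# available_pdf_engines
```

It prints one engine name per line and nothing at all if none of them is found.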

WARNING: Do not expect any of these PDF outputs to look identical to the print preview or PDF export of the original ODT! Pandoc does not preserve layout when converting; it preserves the content and the structure of documents: paragraphs remain paragraphs, emphasized words remain emphasized, headings remain headings, etc. But the overall look can change considerably.

Example commands

pdflatex:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=pdflatex

XeLaTeX:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=xelatex

LuaLaTeX:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=lualatex

ConTeXt:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=context

GNU troff:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=pdfroff

wkhtmltopdf:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=wkhtmltopdf

PrinceXML:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=prince

weasyprint:

 pandoc -f odt -o mydoc.pdf mydoc.odt --pdf-engine=weasyprint

The above commands are the most basic ones for the conversion. Depending on the PDF engine you pick, there may be many other options to control the appearance of the output PDF file. For example, the following additional parameters may be added to all those paths routing through LaTeX:

 -V geometry:"paperwidth=23.3cm, paperheight=1000pt, margin=11.2mm, top=2cm"

which will use a custom page size (a bit larger than DIN A4) with a margin of 2cm at the top edge and 1.12cm at the other three edges.
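Since the geometry variable only makes sense for the LaTeX routes, a wrapper can add it conditionally. This is a sketch with made-up names (odt_to_pdf, and a PANDOC override variable); the margin value is just an example, and the output name is derived by replacing the .odt suffix.

```shell
#!/bin/sh
# Sketch: run pandoc with a chosen PDF engine and add the geometry variable
# only for the LaTeX-based engines.
# odt_to_pdf and PANDOC are hypothetical names.
odt_to_pdf() {
    input=$1
    engine=${2:-pdflatex}
    output=${input%.odt}.pdf
    # Build the argument list step by step.
    set -- -f odt -o "$output" "$input" "--pdf-engine=$engine"
    case $engine in
        pdflatex|xelatex|lualatex)
            set -- "$@" -V "geometry:margin=2cm"
            ;;
    esac
    "${PANDOC:-pandoc}" "$@"
}

# Usage (uncomment to run):
# odt_to_pdf mydoc.odt xelatex
# odt_to_pdf mydoc.odt weasyprint
```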