Substitute a font in a PDF document

I have a PDF document (not encrypted) with editable form fields. However, the font for those fields is broken: it is missing some glyphs, so when I enter text some gaps appear.

How can I modify the PDF document (I have no access to the source document used to create it) to substitute a different font in place of the broken one?

The font in question is Adobe's Caliban Regular, which I can see embedded in the document. The glyphs that display blank include “i”, “T”, “V”; perhaps others that I haven't discovered.

I also have another similar document using Caliban, which does display properly including the glyphs that are listed above as broken. If someone can tell me how to take a font from one PDF and substitute it into an existing PDF, that would be a solution.

I'm currently using:

  • Debian GNU+Linux
  • Evince (and I also tried Okular) for viewing
  • The Poppler library for PDF rendering
  • Emacs (or any text editor) for editing the PDF code
  • pdftk and OpenOffice.org installed, if that helps

I would be interested in other free software PDF editing tools (whether zero-price or not), if they'll help with this task.


Solution 1:

It is extremely difficult to replace a font that is embedded in a PDF. I'm not aware of any free-as-in-speech (e.g. GPL-licensed) or free-as-in-beer (gratis) software that can do that (by un-embedding the font first and then re-embedding a substitute font). I only know of two commercial products that do: callassoftware.com's pdfToolbox4 and Enfocus' PitStop (there surely are others, but I'm not aware of them, and these two are the market leaders here).

Here is a way to extract an embedded font from a PDF using Free Software. Be aware that you are only legally allowed to do this if the font's license does not forbid it. The Ghostscript source code repository contains a PostScript utility named extractFonts.ps which can help here:

  1. Install Ghostscript. Use the latest version (8.71 at the time of writing).
  2. Download the file http://svn.ghostscript.com/ghostscript/trunk/gs/toolbin/extractFonts.ps
  3. You may want to read comments contained in the downloaded file.
  4. Run the following command in a DOS box (cmd.exe):

    gswin32c.exe ^
        -q ^
        -dNODISPLAY ^
        C:/path/to/extractFonts.ps ^
        -c "(c:/path/to/your-pdf-file.pdf) extractFonts quit"
    
  5. Take good note of any warning or error messages the command may spit out.
  6. Successfully extracted fonts will now be stored in your current directory using the same name as in the PDF.

(Be aware that extracting fonts here does not mean removing them from the PDF; it creates font files that are copies of the ones embedded in the PDF.)
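
The command above is written for Windows. Since the question mentions Debian, here is a rough Python sketch of the same invocation. It assumes the Ghostscript executable is called `gs` on your system (the usual name on Linux); the function names `build_extract_command` and `extract_fonts` are made up for this example, and all paths are placeholders.

```python
import subprocess


def build_extract_command(script_path, pdf_path, gs_executable="gs"):
    """Return an argument list mirroring the gswin32c.exe command above.

    gs_executable="gs" is an assumption about how Ghostscript is named
    on a Linux system; adjust it if your installation differs.
    """
    # PostScript passed via -c: push the PDF name as a string, call the
    # extractFonts procedure defined by the script, then quit.
    # Caveat: a path containing parentheses or backslashes would need
    # escaping inside the PostScript string; this sketch does not do that.
    postscript = "({}) extractFonts quit".format(pdf_path)
    return [gs_executable, "-q", "-dNODISPLAY", script_path, "-c", postscript]


def extract_fonts(script_path, pdf_path, output_dir="."):
    """Run Ghostscript with output_dir as the current directory.

    Use absolute paths for script_path and pdf_path, because relative
    paths would be resolved against output_dir.
    """
    command = build_extract_command(script_path, pdf_path)
    # check=True raises CalledProcessError if Ghostscript exits non-zero.
    return subprocess.run(command, cwd=output_dir, check=True)
```

Calling `extract_fonts("/path/to/extractFonts.ps", "/path/to/your-pdf-file.pdf")` should then behave like the command in step 4, with the extracted fonts written to the current directory.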


Here is another building block that may help you achieve what you want. You may want to decompress all compressed streams in your PDF so that you can edit the file more easily with a simple text editor. (Warning: editing PDFs is not a simple, straightforward task; it requires substantial knowledge of PDF file format internals.)

This trick also uses a utility from Ghostscript's Subversion toolbin sub-directory.

  1. Download the file http://svn.ghostscript.com/ghostscript/trunk/gs/toolbin/pdfinflt.ps
  2. You may want to read comments in the downloaded file.
  3. Run the following command in a DOS box (cmd.exe):

    gswin32c.exe ^
         -- ^
         c:/path/to/pdfinflt.ps ^
         c:/path/to/your-pdf-file.pdf ^
         c:/path/to/your-pdf-file-decompressed.pdf
    

This command will try to decompress all Flate-compressed streams. (Flate is the same deflate algorithm that zip uses. If you are unlucky, your file will also contain streams using other filters, such as LZW or DCT, which this command leaves unchanged.)
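
To illustrate what "decompressing a Flate stream" means, here is a minimal Python sketch using only the standard `re` and `zlib` modules. It is not a replacement for pdfinflt.ps: it does not rewrite the PDF, update `/Length` entries or the cross-reference table, or handle predictors and filter chains. It only returns the decoded bytes of each stream so that you can inspect them. The name `inflate_streams` is made up for this example.

```python
import re
import zlib

# Matches the raw bytes between the "stream" and "endstream" keywords.
# Simplification: a real parser would use the stream's /Length entry.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the decoded contents of every stream that zlib can inflate."""
    decoded = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj() tolerates the end-of-line bytes that sit
            # between the compressed data and the "endstream" keyword.
            decoded.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not Flate-compressed (or uses another filter); skip it.
            continue
    return decoded
```

Reading the file with `open(path, "rb").read()` and passing the bytes to `inflate_streams` gives a list of decoded streams; content streams and font programs compressed with another filter are simply skipped.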