Is it possible to edit a PDF file directly?

I have a PDF file that is produced as part of a help file compilation. There is always late-breaking content that goes into a text file (e.g. "What's new in this version"), and while Help and Manual lets you include content from a text file, that only works for the CHM output, not for the PDF.

I'm wondering if I can do it by generating a unique placeholder string instead and then using some tool (I may need to write one) to do a search and replace of that unique string with the contents of the late breaking info text file.

Is this feasible? Or will it break some sort of internal structure?


Solution 1:

"It Depends."

You'll probably need a couple of things. First, the text can't have been rasterized; if it has, all bets are off. Second, the entire font must have been embedded. If the font was subsetted (which is most often the case), you may not have the required glyphs. Finally, you'd probably want to keep the text area being modified as small as possible, to avoid having to deal with large amounts of reflow, and leave as much whitespace around the placeholder as possible.

Now, this probably won't be something you can do with a simple text editor, but there might be some PDF-manipulating tools that can do the substitution for you.
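
As a rough feasibility check for the conditions above, one could search the PDF's raw bytes for the placeholder. The sketch below is only a heuristic under stated assumptions: the class name PlaceholderProbe and the placeholder @@WHATSNEW@@ are made up for illustration, and it only finds text stored as a literal, uncompressed, ASCII-compatible string. Text inside compressed streams, hex-encoded strings, subset-font encodings, or strings split up for kerning will report zero hits even though the text is visible on the page, so a zero count does not prove the text was rasterized.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of any PDF library.
class PlaceholderProbe {

    /** Counts non-overlapping literal occurrences of the placeholder in raw bytes. */
    static int countOccurrences(byte[] data, String placeholder) {
        if (placeholder.isEmpty()) {
            throw new IllegalArgumentException("placeholder must not be empty");
        }
        // ISO-8859-1 maps every byte to exactly one char, so binary data survives intact.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(placeholder, from)) >= 0) {
            count++;
            from += placeholder.length();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: PlaceholderProbe <file.pdf> <placeholder>");
            System.exit(1);
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(countOccurrences(data, args[1]) + " literal occurrence(s) found");
    }
}
```

With Java 11 or later this can be run directly, for example as java PlaceholderProbe.java input.pdf "@@WHATSNEW@@". If the count is zero, a simple search and replace on the file is unlikely to work as-is.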

Solution 2:

You can use the open-source qpdf utility (available for Linux, Windows and Mac OS X) to unpack the PDF into a more readable format. From there you can try some of the advice from the other answers:

qpdf.exe ^
   --qdf ^
     input.pdf ^
     output.pdf

The file output.pdf will have uncompressed object streams, all objects renumbered and re-sorted in ascending order, and some helpful comments sprinkled into the file. The file can be edited in a text editor, provided the editor doesn't corrupt the remaining binary sections.
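
Once the file is in this unpacked form, the search and replace from the question could be sketched roughly as follows. This is a minimal illustration under several assumptions, not a definitive implementation: the class name PlaceholderPatcher and the placeholder string are hypothetical, the placeholder is assumed to appear as a plain literal string in an uncompressed content stream, and the embedded font is assumed to contain the glyphs for the replacement. The replacement is padded with spaces to exactly the placeholder's length so that no byte offsets or stream lengths change, which also means it cannot be longer than the placeholder.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of qpdf or any PDF library.
class PlaceholderPatcher {

    /** Escapes the characters that are special inside a PDF literal string. */
    static String escapePdfLiteral(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 32 || c > 126) {
                throw new IllegalArgumentException("this sketch supports printable ASCII only");
            }
            if (c == '(' || c == ')' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Replaces every literal occurrence of the placeholder, keeping the file length unchanged. */
    static byte[] patch(byte[] pdf, String placeholder, String replacement) {
        if (placeholder.isEmpty()) {
            throw new IllegalArgumentException("placeholder must not be empty");
        }
        StringBuilder padded = new StringBuilder(escapePdfLiteral(replacement));
        if (padded.length() > placeholder.length()) {
            throw new IllegalArgumentException("escaped replacement is longer than the placeholder");
        }
        // Pad with spaces so the byte count of the stream stays the same.
        while (padded.length() < placeholder.length()) {
            padded.append(' ');
        }
        // ISO-8859-1 round-trips every byte value, so binary sections are preserved.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        return text.replace(placeholder, padded).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: PlaceholderPatcher <in.pdf> <out.pdf> <placeholder> <replacement>");
            System.exit(1);
        }
        byte[] in = Files.readAllBytes(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), patch(in, args[2], args[3]));
    }
}
```

If an edit does change the length of the file, qpdf ships a companion tool, fix-qdf, that is intended to repair the offsets and stream lengths of a hand-edited QDF file. Either way, the replacement text will not reflow, so this approach only suits short, single-line substitutions.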

Solution 3:

If you are willing to get your hands dirty, iText should work.

There are examples which cover a wide range of topics and should get you pointed in the right direction.

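Note the example below, which uses the document.add method to add Paragraph objects to a PDF. As written it builds a new document rather than modifying an existing one; for changing an existing PDF, iText's PdfReader and PdfStamper classes are the usual starting point. DatabaseConnection, HsqldbConnection and FilmFonts appear to be helper classes from the sample's own supporting code rather than part of iText itself.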

    protected void createPdf(String filename)
        throws IOException, DocumentException, SQLException {
        // Open the database connection
        DatabaseConnection connection = new HsqldbConnection("filmfestival");
        // step 1
        Document document = new Document();
        // step 2
        PdfWriter.getInstance(document, new FileOutputStream(filename));
        // step 3
        document.open();
        // step 4
        // Add text with a local destination
        Paragraph p = new Paragraph();
        Chunk top = new Chunk("Country List", FilmFonts.BOLD);
        top.setLocalDestination("top");
        p.add(top);
        document.add(p);
        // Add text with a link to an external URL
        Chunk imdb = new Chunk("Internet Movie Database", FilmFonts.ITALIC);
        imdb.setAction(new PdfAction(new URL("http://www.imdb.com/")));
        p = new Paragraph(
            "Click on a country, and you'll get a list of movies, containing links to the ");
        p.add(imdb);
        p.add(".");
        document.add(p);
        // Add text with a remote goto
        p = new Paragraph("This list can be found in a ");
        Chunk page1 = new Chunk("separate document");
        page1.setAction(new PdfAction("movie_links_1.pdf", 1));
        p.add(page1);
        p.add(".");
        document.add(p);
        document.add(Chunk.NEWLINE);
        // Get a list with countries from the database
        Statement stm = connection.createStatement();
        ResultSet rs = stm.executeQuery(
            "SELECT DISTINCT mc.country_id, c.country, count(*) AS c "
            + "FROM film_country c, film_movie_country mc WHERE c.id = mc.country_id "
            + "GROUP BY mc.country_id, country ORDER BY c DESC");
        // Loop over the countries
        while (rs.next()) {
            Paragraph country = new Paragraph(rs.getString("country"));
            country.add(": ");
            Chunk link = new Chunk(String.format("%d movies", rs.getInt("c")));
            link.setAction(
                PdfAction.gotoRemotePage("movie_links_1.pdf", rs.getString("country_id"), false, true));
            country.add(link);
            document.add(country);
        }
        document.add(Chunk.NEWLINE);
        // Add text with a local goto
        p = new Paragraph("Go to ");
        top = new Chunk("top");
        top.setAction(PdfAction.gotoLocalPage("top", false));
        p.add(top);
        p.add(".");
        document.add(p);
        // step 5
        document.close();
        // Close the database connection
        connection.close();
    }

Solution 4:

PDFedit might do the trick. To quote the blurb on its SourceForge site:

Free editor for PDF documents. Complete editing of PDF documents is possible with PDFedit. You can change raw pdf objects (for advanced users) or use many gui functions. Functionality can be easily extended using a scripting language (ECMAScript)

As of June 2013, there are *nix and Windows versions.