Is there a command-line method to split a gigantic single-page PDF file into one file with multiple pages?
How can I split (tile?) one huge PDF onto multiple pages? The result should be one PDF with multiple A4 pages or several A4 PDFs.
I'm not looking for a solution specific to how the file was generated, but for context: the input PDF was produced by Graphviz dot, e.g. dot -Tpdf sample.dot > sample.pdf. When I do not add size="8,11"; ratio="fill"; to the graph, the output PDF is very large. If I add the size/fill hints, dot only scales things down for me.
Let me give you an example. If my original PDF were huge like this:
+-------------------+
|                   |
|  O                |
|  :                |
|  :..........C     |
|  :          :     |
|  :          :     |
|  :          :     |
|  :          G     |
|  :          :     |
|  :          :     |
|  :          :     |
|  :          :     |
|  U          :     |
|             B     |
|                   |
+-------------------+
This should be split by a command like
pdftile sample.pdf -x 2 -y 3 > sample-2x3.pdf
into
+---------+---------+
|         |         |
|  O      |         |
|  :      |         |
|  :......|...C     |
|  :      |   :     |
+---------+---------+
|  :      |   :     |
|  :      |   :     |
|  :      |   G     |
|  :      |   :     |
|  :      |   :     |
+---------+---------+
|  :      |   :     |
|  :      |   :     |
|  U      |   :     |
|         |   B     |
|         |         |
+---------+---------+
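To make the requested split concrete, here is a minimal Python sketch of the geometry only: it computes the crop rectangle of each tile for an nx-by-ny grid. The function name tile_boxes and the 1200 x 2700 pt page size are made up for illustration, and the sketch does not read or write PDFs. A real tool would additionally scale each rectangle to A4 and possibly add overlap between tiles.

```python
from typing import List, Tuple

# (left, bottom, right, top) in PDF points
Box = Tuple[float, float, float, float]


def tile_boxes(width: float, height: float, nx: int, ny: int) -> List[Box]:
    """Return crop boxes for an nx-by-ny tiling, top row first, left to right."""
    if nx < 1 or ny < 1:
        raise ValueError("nx and ny must be at least 1")
    tile_w = width / nx
    tile_h = height / ny
    boxes = []
    for row in range(ny):
        # PDF coordinates start at the bottom left, so row 0 (the top row)
        # has the largest y values.
        top = height - row * tile_h
        for col in range(nx):
            left = col * tile_w
            boxes.append((left, top - tile_h, left + tile_w, top))
    return boxes


if __name__ == "__main__":
    # Hypothetical 1200 x 2700 pt page, split like "-x 2 -y 3" above.
    for box in tile_boxes(1200, 2700, 2, 3):
        print(box)
```

Each of the six boxes corresponds to one cell of the 2x3 diagram above, in reading order.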
It looks like pdfposter can do this.
Actually, mutool has almost the same syntax as the one you're suggesting (are you its author?):
mutool poster -x 2 -y 3 sample.pdf sample-2x3.pdf
To install mutool, just install mupdf, which is packaged by most GNU/Linux distributions (some, such as Debian and Ubuntu, ship the command-line tools in a separate mupdf-tools package).
Take a look at http://www.graphviz.org/doc/FAQ.html: Q14. How can I print a big graph on multiple pages?
The page attribute, if set, tells Graphviz to print the graph as an array of pages of the given size. Thus, the graph
digraph G { page="8.5,11"; ... }
will be emitted as 8.5 by 11 inch pages. When printed, the pages can be tiled to make a drawing of the entire graph. At present, the feature only works with PostScript output. Alternatively, there are various tools and viewers which will take a large picture and allow you to extract page-size pieces, which can then be printed.
PostScript only! But for automation that is no showstopper :-). Just run ps2pdf over the output. It worked for me.
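The two steps could be scripted roughly as follows. This is a sketch, not a tested recipe: it assumes dot and ps2pdf are on the PATH, sets the page attribute from the command line with dot's -G option instead of editing the graph, and the helper names paged_pdf_commands and run are made up for this example.

```python
import shlex
import subprocess
from typing import List


def paged_pdf_commands(dot_file: str, pdf_file: str,
                       page: str = "8.5,11") -> List[List[str]]:
    """Build the two commands: dot to paged PostScript, then ps2pdf to PDF."""
    ps_file = pdf_file.rsplit(".", 1)[0] + ".ps"
    return [
        # -Gpage=... sets the graph attribute page, as in the FAQ excerpt.
        ["dot", "-Tps", f"-Gpage={page}", dot_file, "-o", ps_file],
        ["ps2pdf", ps_file, pdf_file],
    ]


def run(commands: List[List[str]]) -> None:
    """Execute the commands in order; requires Graphviz and Ghostscript."""
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


if __name__ == "__main__":
    # Only print the commands here; call run(...) to actually execute them.
    for cmd in paged_pdf_commands("sample.dot", "sample-paged.pdf"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them by hand before letting the script run them.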
Several Linux utilities come to mind:
(from their man pages)
pdfseparate [options] INPUT.PDF OUTPUT%d.PDF
reads INPUT.PDF, extracts one or more pages, and writes one PDF file for each page to OUTPUT%d.PDF (%d is a placeholder for the page number) (from the 'poppler-utils' package)
pdftk INPUT.PDF burst
reads INPUT.PDF, producing one or more PDF files containing individual pages, named 'pg_XXXX.pdf' (unless an output filename is specified) (from the 'pdftk' package)
In addition to the tools already mentioned, there is also a tool I wrote because I was unhappy with pdfposter (no overlap between pages for gluing) and wanted more control over the output. The primary interface to the tool is a GUI, but it also has a command-line interface that lets you do the same things. You can grab it here:
https://pypi.org/project/plakativ/
If you are on Linux you can install it with pip. Windows executables are also regularly built on AppVeyor CI: https://ci.appveyor.com/project/josch/plakativ/build/artifacts
The source code is hosted here: https://gitlab.mister-muffin.de/josch/plakativ