Combining pdf files with separators
During corona days, I am managing my courses via Moodle.
Students submit their work to Moodle in pdf format.
Then I download all of them as a single zip file to my disk.
When I unzip it, I get the following tidy directory structure under my directory x:
.
x
|- StudentA
| |- nameA.pdf
|- StudentB
| |- nameB.pdf
One relevant detail: I have already asked my students to include their name in the filename of their submission.
I would like to collect these files and merge them into a single file for grading. Beyond that, I want some kind of "separator" inserted between the files so that I can easily jump from one student's paper to the next, that is, have direct access to any paper.
- Automatic insertion of a table of contents would be perfect.
- Or a bookmark for each paper would do.
- If worst comes to worst, I would settle for inserting the filename before each file.
One can merge the PDF files in one directory into a single file, as explained in "How can I combine multiple PDFs using the command line?" or "Mac OS X: How to merge pdf files in a directory according to their file names".
Solution 1:
benwiggy created a Python script (joinpdfs.py) that takes PDF files as arguments and joins them, creating a Table of Contents with each file at the top level and each file's own TOC merged in under it.
Script here: https://github.com/benwiggy/PDFsuite/blob/master/Automator_Scripts/joinpdfs.py
(Download it by Option-clicking the Raw button on the GitHub link above.)
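To illustrate what "each file at the top level" of the Table of Contents means, here is a minimal sketch of how the starting page of each file in the combined PDF could be computed. This is not benwiggy's code; the function name `toc_entries` and the page counts are made up for illustration.

```python
from typing import List, Tuple


def toc_entries(files: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return (title, first_page) pairs, with pages numbered from 1.

    `files` is a list of (file name, page count) in the order the
    files are joined.
    """
    entries = []
    next_page = 1
    for title, page_count in files:
        # Each file starts where the previous one ended.
        entries.append((title, next_page))
        next_page += page_count
    return entries


if __name__ == "__main__":
    # Hypothetical page counts, for illustration only.
    for title, page in toc_entries([("nameA.pdf", 3), ("nameB.pdf", 5)]):
        print(f"{title} starts on page {page}")
```

Each top-level entry then points at the first page of one student's paper, which gives the direct access the question asks for.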
You can pretty easily integrate that script into a Terminal-based workflow.
If you're not super savvy with Terminal, there's a simple drag-and-drop workflow to make this work.
But first you have to make the script you downloaded executable via Terminal.
To make joinpdfs.py executable (only need to do this once):
- Open a Terminal window.
- Type chmod +x (don't forget the space at the end).
- Drag the downloaded script joinpdfs.py into the Terminal window. (This will put the entire path to the file in as an argument to the command you just typed.)
- Hit return.
The script will now be executable in Terminal.
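If you would rather not type the command, roughly the same effect as chmod +x can be had from Python. A minimal sketch, assuming you know the path to the downloaded script; the helper name `make_executable` is hypothetical.

```python
import os
import stat


def make_executable(path: str) -> None:
    """Add execute permission for user, group and others, roughly like `chmod +x`."""
    mode = os.stat(path).st_mode
    # Keep the existing permission bits and switch on the three execute bits.
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Example call with a hypothetical path:
# make_executable("/Users/me/Downloads/joinpdfs.py")
```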
To Join PDFs via the Terminal:
- Open a Terminal window.
- Drag joinpdfs.py into the Terminal window.
- Drag all PDFs to join into the Terminal window. (The order in which the files appear in the Finder window will be the order in which they appear in the PDF. You can change this easily by dragging the files, or groups of files, into the window in a particular order.)
- Hit return.
(Note: I'm getting the following error, but it looks like you can ignore it: "CoreGraphics PDF has logged an error. Set environment variable "CG_PDF_VERBOSE" to learn more.")
A new file Combined.pdf will be created in the same folder as the first PDF you dragged in.
This file will have a TOC with the filenames at the top level, and each PDF's own TOC nested within.
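With one folder per student, as in the question, dragging every PDF in by hand gets tedious. The dragging step could instead be scripted. The following is a minimal sketch, assuming joinpdfs.py has already been made executable and that every submission sits exactly one folder below the directory x; the function names and the example paths are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def collect_pdfs(root: str) -> List[str]:
    """Return the PDFs found one folder below root, sorted by folder and file name."""
    candidates = Path(root).glob("*/*")
    pdfs = [p for p in candidates if p.is_file() and p.suffix.lower() == ".pdf"]
    return [str(p) for p in sorted(pdfs)]


def join_command(script: str, pdfs: List[str]) -> List[str]:
    """Build the argument list: the script first, then the PDFs in order."""
    return [script] + pdfs


def run_join(script: str, root: str) -> None:
    """Run the (already executable) script on every PDF found under root."""
    pdfs = collect_pdfs(root)
    if not pdfs:
        raise SystemExit("No PDF files found under " + root)
    subprocess.run(join_command(script, pdfs), check=True)


# Example call with hypothetical paths:
# run_join("/Users/me/Downloads/joinpdfs.py", "/Users/me/x")
```

Since Combined.pdf is written next to the first PDF passed in, with this ordering it should end up in the first student's folder.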
To Join PDFs via Automator
benwiggy has instructions on this readme page for incorporating all his python scripts into Automator actions. The steps are:
- Download the scripts. (No need to make them executable first.)
- Launch Automator and create a New Service.
- Set the drop-down menus to read "Service receives PDF Files in Finder".
- Add a "Run Shell Script" action (under "Utilities").
- Set the shell drop-down list to /usr/bin/python and "Pass input" to "as arguments".
- Paste in the script you want to use (replacing the existing text).
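With "Pass input" set to "as arguments", the files selected in Finder reach the pasted script as command-line arguments, which in Python means sys.argv[1:]. The sketch below is not part of benwiggy's scripts. It only shows how those arguments could be read and put into a predictable order by filename (useful when students' names are in the filenames); the helper name `order_by_filename` is hypothetical. It avoids newer syntax so that it also runs under an older /usr/bin/python.

```python
import os
import sys


def order_by_filename(paths):
    """Sort paths by file name (case-insensitive), ignoring the folder part."""
    return sorted(paths, key=lambda p: os.path.basename(p).lower())


if __name__ == "__main__":
    # In an Automator "Run Shell Script" action, the selected files
    # arrive here as command-line arguments.
    for path in order_by_filename(sys.argv[1:]):
        print(path)
```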