How to extract and/or remove the last page of a bunch of PDFs?

One of our vendors started tacking an unnecessarily huge image onto the last page of the PDFs we get from them. I need to trim this out. However, we have hundreds of these, so doing it manually is prohibitive. What's the best way to automatically extract and then delete the last page of a PDF? Preferably as two separate steps, first one and then the other: I still need to confirm via filesize that I'm not deleting a last page which doesn't have the image. OS is Linux.

I can extract it using ghostscript, with something along the lines of gs -dFirstPage=5 -dLastPage=5, but I need to automate this. I can't go through and manually find out the number of the last page for each file.

Any ideas?

Edit: To clarify, I simply want to split out/delete the last page. Not just the image on it; I want to excise the last page entirely.


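As @Daniel Andersson already commented, this can easily be done with pdftk. The first call reverses the page order (end-1), and the second reverses it back while stopping at page 2 of the reversed file, which leaves out the original last page: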

pdftk input.pdf cat end-1 output temp.pdf
pdftk temp.pdf cat end-2 output output.pdf
rm temp.pdf

I don't know if it can be done with one call to pdftk though...

Edit: you could combine it with thanosk's answer and use (in bash):

pdftk input.pdf cat 1-$((last-1)) output output.pdf

once you have stored the number of the last page (i.e. the page count) in the variable $last.
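
If you don't have the page count yet, one way to get it is pdftk's own dump_data report, which includes a NumberOfPages line. This is only a sketch and not necessarily what the other answer does; the function names page_count and drop_last_page are made up for illustration:

```bash
#!/usr/bin/env bash

# Print the page count of the PDF given as $1 by reading the
# "NumberOfPages: N" line from pdftk's dump_data report.
page_count() {
    pdftk "$1" dump_data | awk '/^NumberOfPages:/ {print $2}'
}

# Write pages 1..(last-1) of $1 to $2 (hypothetical helper).
drop_last_page() {
    local src=$1 dst=$2 last
    last=$(page_count "$src")
    # A one-page or unreadable file would give the invalid range 1-0, so skip it.
    if [ -z "$last" ] || [ "$last" -lt 2 ]; then
        echo "skipping $src: fewer than two pages" >&2
        return 1
    fi
    pdftk "$src" cat "1-$((last - 1))" output "$dst"
}
```

Calling drop_last_page input.pdf output.pdf would then do the same thing as the one-liner above, with the page count looked up for you.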


To further improve on @eldering's answer, pdftk versions 1.45 and later can reference pages in reverse order by prepending the lower-case letter r to the page number. The final page in a PDF is r1, the next-to-last page is r2, etc.

For example, the single pdftk call:

pdftk input.pdf cat 1-r2 output output.pdf

will drop the final page from input.pdf -- the input should be at least two pages long.

To extract just the final page of a PDF in order to test its filesize, run:

pdftk input.pdf cat r1 output final_page.pdf

Pdftk is available on Linux. Many distros have a binary you can install. You should make sure it is version 1.45 or later, though. If not, you can build pdftk from source code.
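
For hundreds of files, the two calls above could be wrapped in a small script that extracts the last page, checks its size, and only then writes a trimmed copy. Treat this as a rough sketch rather than a finished tool: the 1000000-byte THRESHOLD is a placeholder I picked (compare a few of your own extracted pages to find a sensible value), and the function names trim_if_large and trim_all are hypothetical.

```bash
#!/usr/bin/env bash

# Assumed cutoff in bytes: a last page bigger than this is treated as the
# vendor's image page. Placeholder value, adjust after checking real files.
THRESHOLD=${THRESHOLD:-1000000}

# Extract the last page of $1, and write a copy without it to $2 only if
# the extracted page is larger than THRESHOLD. Returns 2 if left untouched.
trim_if_large() {
    local src=$1 dst=$2 workdir size
    workdir=$(mktemp -d) || return 1
    # Step 1: extract just the final page (r1 needs pdftk 1.45 or later).
    if ! pdftk "$src" cat r1 output "$workdir/last.pdf"; then
        rm -rf "$workdir"
        return 1
    fi
    size=$(wc -c < "$workdir/last.pdf")
    rm -rf "$workdir"
    # Step 2: drop the final page only when it looks like the big image.
    if [ "$size" -gt "$THRESHOLD" ]; then
        # pdftk is expected to reject this range for a one-page file.
        pdftk "$src" cat 1-r2 output "$dst"
    else
        echo "keeping $src as is: last page is only $size bytes" >&2
        return 2
    fi
}

# Run trim_if_large on every *.pdf in directory $1, writing results to $2.
trim_all() {
    local srcdir=$1 dstdir=$2 f
    mkdir -p "$dstdir"
    for f in "$srcdir"/*.pdf; do
        [ -e "$f" ] || continue
        trim_if_large "$f" "$dstdir/$(basename "$f")"
    done
}
```

Something like trim_all ./incoming ./trimmed leaves the originals alone and writes trimmed copies to a separate directory, so you can spot-check the results before deleting anything.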