Batch resize and compress PDF files
Solution 1:
I'm suggesting a command line tool here, which can be easily batched with loops in built-in scripting languages in Windows, Linux, OS X, etc.
ImageMagick supports PDFs, and its convert tool has a -resize option. I've never used it personally, but you can try playing around with it.
You can also use the -compress option (there's an example here):
Rotate a PDF
$ convert -rotate 270 -density 300x300 -compress lzw in.pdf out.pdf
This assumes a TIFF-backed PDF. The density parameter is important because otherwise ImageMagick down-samples the image (for some reason). Adding in the compression option helps keep the overall size of the PDF smaller, with no loss in quality.
For multipage PDFs, you may want to use pdftk to split the file into single pages, then use mogrify from ImageMagick to convert each page in place:
$ pdftk in.pdf burst
$ mogrify -rotate 270 -density 300x300 -compress lzw pg_*.pdf
$ pdftk pg*.pdf cat output out.pdf
$ rm pg*.pdf
To convert PDF files with ImageMagick, you need to have GhostScript installed.
ImageMagick can convert multipage PDFs. While mogrify will convert in place, I recommend you use convert so you can keep the originals in case of accident.
I've done some testing on your provided sample PDF. This worked quite well for me:
convert -density 200 -compress jpeg -quality 20 test.pdf test2.pdf
Density defaults to 72 DPI. By setting it higher we get a higher resolution and therefore acceptable quality. It looked alright at 150, and was a little smaller, but if you want to cater for a range of PDFs, 200 should work.
The JPEG quality level should either be chosen automatically or default to 92, on a scale of 1 to 100 with 100 being the best. With it set to 20, the output looks almost as good as the original (a little fuzzier, and the small text at the bottom is a little hard to read, but it was in the original anyway).
These options bring your 1.7MB sample down to 0.5MB, while keeping it readable. You can experiment a little.
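The experimenting can be scripted. The following is a minimal sketch, not a tested recipe: it re-runs the convert command above with progressively lower -quality values until the output fits under a byte limit. The function name shrink_to_limit, the quality steps (80 down to 20) and the CONVERT override variable are assumptions made for illustration. ImageMagick 7 installs the tool as magick, so setting CONVERT=magick may be needed there.

```shell
#!/bin/sh
# Hypothetical helper: shrink_to_limit IN.pdf OUT.pdf MAX_BYTES
# Re-runs convert with lower JPEG quality until OUT.pdf is small enough.
# Returns 0 on success, 1 if a conversion fails or even the lowest
# quality step is still too large.
shrink_to_limit() {
    in=$1
    out=$2
    max_bytes=$3
    for q in 80 60 40 20; do
        # CONVERT is an assumed override; it defaults to ImageMagick's convert.
        ${CONVERT:-convert} -density 200 -compress jpeg -quality "$q" "$in" "$out" || return 1
        # Arithmetic expansion strips the padding some wc versions print.
        size=$(( $(wc -c < "$out") ))
        if [ "$size" -le "$max_bytes" ]; then
            echo "quality $q: $size bytes"
            return 0
        fi
    done
    return 1
}
```

For example, shrink_to_limit test.pdf test2.pdf 2000000 would stop at the first quality step that produces a file of at most 2,000,000 bytes. Whether the result is still readable has to be checked by eye.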
If you want a smaller size (both of the file and of the image/PDF), you can use -resize #%, e.g. -resize 75%. On your example PDF, this makes the small print at the bottom pretty much unreadable, though.
If you're still tight for space, especially for the multipage PDFs, you could compress further by adding the files to a ZIP (or other) archive. This brought the file size down to 0.43MB on that test PDF (reducing the JPEG compression quality has a much more drastic effect). You could also split the PDF file into pages with pdftk, as @glallen suggested in his edit, or split the archive and recombine at the other end.
2MB is also a rather small attachment limit, so you may want to look into other email providers. From memory, Gmail allows over 10MB per email.
These options, and more, are fully documented on the ImageMagick website.
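To apply the same settings to a whole folder, a loop along these lines should work. This is a sketch under assumptions rather than a definitive script: the function name compress_all, the directory arguments and the CONVERT override variable are made up for illustration, and the convert options are simply the ones that worked on the sample PDF. Output goes to a separate directory so the originals are kept.

```shell
#!/bin/sh
# Hypothetical helper: compress_all SRC_DIR OUT_DIR
# Writes a compressed copy of every PDF in SRC_DIR to OUT_DIR,
# leaving the originals untouched.
compress_all() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.pdf; do
        # Skip the unexpanded pattern when SRC_DIR contains no PDFs.
        [ -e "$f" ] || continue
        out="$out_dir/$(basename "$f")"
        # CONVERT is an assumed override; it defaults to ImageMagick's convert.
        ${CONVERT:-convert} -density 200 -compress jpeg -quality 20 "$f" "$out" \
            || echo "failed: $f" >&2
    done
}
```

Calling compress_all . ./small would process every PDF in the current directory and put the results in ./small. On Windows the same idea can be expressed with a for loop in a batch file.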
Solution 2:
convert from ImageMagick produces a rasterized PDF, but many people want to keep vector graphics and text untouched so that only the embedded images are compressed.
A good alternative for compression is gs from the ghostscript package. Example usage:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=out.pdf in.pdf
In the above command, the parameter -dPDFSETTINGS=/ebook is the important one. It can take one of five values:
-dPDFSETTINGS=/screen (screen-view-only quality, 72 dpi images)
-dPDFSETTINGS=/ebook (low quality, 150 dpi images)
-dPDFSETTINGS=/printer (high quality, 300 dpi images)
-dPDFSETTINGS=/prepress (high quality, color preserving, 300 dpi imgs)
-dPDFSETTINGS=/default (general-purpose output, possibly at the cost of a larger file)
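To make the preset easy to switch when batching, the gs command can be wrapped in a small function. This is only a sketch: the name gs_compress, the default of ebook and the GS override variable are assumptions for illustration, while the gs flags are the ones from the command above. On Windows the console executable is usually named gswin64c, which is what the override is for.

```shell
#!/bin/sh
# Hypothetical helper: gs_compress IN.pdf OUT.pdf [PRESET]
# PRESET is one of screen, ebook, printer, prepress, default
# (ebook is assumed when it is omitted).
gs_compress() {
    in=$1
    out=$2
    preset=${3:-ebook}
    case $preset in
        screen|ebook|printer|prepress|default) ;;
        *) echo "unknown preset: $preset" >&2; return 2 ;;
    esac
    # GS is an assumed override; it defaults to gs.
    ${GS:-gs} -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
        -dPDFSETTINGS="/$preset" -dNOPAUSE -dQUIET -dBATCH \
        -sOutputFile="$out" "$in"
}
```

A batch run could then look like: mkdir -p small; for f in *.pdf; do gs_compress "$f" "small/$f" screen; done. Writing into a separate directory keeps the originals and avoids re-compressing earlier output on a second run.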