Convert a directory of JPEG files to a single PDF document

Use the convert command from the ImageMagick package:

convert *.jpg -auto-orient pictures.pdf

This produces a single PDF containing every JPEG in the current folder. The -auto-orient option reads each image's EXIF orientation data and rotates the image accordingly.

Install ImageMagick with:

sudo apt-get install imagemagick

Sources: Stack Overflow, ImageMagick options

Edit: Note that the images will not appear in the expected order unless their file names are numbered with leading zeros. With 10 or more images, name them filename01.jpg ... filename99.jpg; with 100 or more, use 001 ... 999. The shell expands *.jpg in lexical order, so without the leading zeros filename10.jpg sorts before filename2.jpg.
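
If the files are already numbered without leading zeros, a small shell helper can generate the padded names. The sketch below is only an illustration: pad_name is a hypothetical function, it assumes every name has an extension with the number sitting directly before it, and the loop prints the mv commands instead of running them.

```shell
# Hypothetical helper: zero-pad the trailing number of a file name.
# $1 = file name (must have an extension), $2 = digit width (default 3).
pad_name() {
    name=$1
    width=${2:-3}
    ext=${name##*.}                                   # text after the last dot
    base=${name%.*}                                   # name without the extension
    prefix=$(printf '%s' "$base" | sed 's/[0-9]*$//') # base without trailing digits
    num=${base#"$prefix"}                             # the trailing digits, if any
    if [ -z "$num" ]; then
        printf '%s\n' "$name"                         # no number: leave name alone
        return
    fi
    num=$(printf '%s' "$num" | sed 's/^0*//')         # avoid octal parsing of 08, 09
    printf "%s%0${width}d.%s\n" "$prefix" "${num:-0}" "$ext"
}

# Dry run: show which renames would happen in the current directory.
for f in *.jpg; do
    [ -e "$f" ] || continue                           # glob matched nothing
    new=$(pad_name "$f")
    [ "$f" = "$new" ] || echo mv -- "$f" "$new"
done
```

Removing the echo would perform the renames; it is worth checking the printed list first.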

Unfortunately, convert re-encodes the images, and so changes their quality, before packing them into the PDF. To keep the loss of quality minimal, it is better to embed the original JPEG files (this works with .png too) in the PDF directly, and for that you need img2pdf.

I use these commands:

A shorter one-liner solution, also using img2pdf, as suggested in the comments:

Make PDF

img2pdf *.jp* --output combined.pdf

(Optional) OCR the output PDF

ocrmypdf combined.pdf combined_ocr.pdf
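
One thing to watch with the *.jp* glob: it matches any name containing .jp, so it would also pick up a leftover file such as scan.jpg.pdf. As a hedged sketch, a filter function can narrow a list of names to those ending in .jpg or .jpeg. The name only_jpegs is hypothetical and not part of img2pdf.

```shell
# Hypothetical filter: print only the arguments that end in .jpg or .jpeg
# (any letter case), one per line.
only_jpegs() {
    for f in "$@"; do
        case $f in
            *.[jJ][pP][gG] | *.[jJ][pP][eE][gG]) printf '%s\n' "$f" ;;
        esac
    done
}

# Example with literal names, so no files are needed:
only_jpegs a.jpg b.jpeg c.jpg.pdf d.JPG notes.txt

# Possible use, assuming GNU xargs (-d) and file names without newlines:
# only_jpegs * | xargs -d '\n' img2pdf --output combined.pdf
```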

Below are the commands from the original answer, which take more steps and more tools:

This command makes a PDF file out of every JPEG image without loss of either resolution or quality:

ls -1 ./*.jpg | xargs -I {} img2pdf {} -o {}.pdf

This command concatenates the PDF pages into one document:

pdftk *.pdf cat output combined.pdf

And finally, I add an OCR text layer, which makes the PDF searchable without changing the quality of the scans:

pypdfocr combined.pdf

An alternative to using pypdfocr:

`ocrmypdf combined.pdf combined_ocr.pdf`

convert `ls -1v` file.pdf

Here ls -1v lists the files one per line in "natural" order (1, 2, 3, ... rather than 1, 10, 2), and convert adds the pages in that order.
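
Because the backquoted ls output is split on whitespace, the command above breaks on file names that contain spaces, and it passes every file in the directory to convert, not only the JPEGs. A possible variant is sketched below, assuming GNU find, sort -V and xargs; list_natural and jpgs_to_pdf are hypothetical helper names.

```shell
# Print the .jpg files of a directory in natural (version) order, one per line.
# Assumes file names contain no newline characters.
list_natural() {
    find "${1:-.}" -maxdepth 1 -type f -name '*.jpg' | sort -V
}

# Pass the sorted names to convert as separate arguments, so spaces survive.
# $1 = output PDF, $2 = directory (default: current directory).
# Note: for extremely long lists xargs may split the work into several calls.
jpgs_to_pdf() {
    list_natural "${2:-.}" | tr '\n' '\0' |
        xargs -0 -r sh -c 'out=$1; shift; convert "$@" -auto-orient "$out"' sh "$1"
}

# Usage (not run here):
# jpgs_to_pdf file.pdf .
```

Calling list_natural on its own first is an easy way to check the page order before building the PDF.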
