Convert a directory of JPEG files to a single PDF document
From the imagemagick
package, use the convert
command:
convert *.jpg -auto-orient pictures.pdf
You will get a single pdf containing all jpg in the current folder.
The option -auto-orient
reads the image's EXIF data to rotate the image.
Install IM with:
sudo apt-get install imagemagick
sources: stackoverflow imagemagick options
Edit: Note that images will be out of specific order if they are not numbered. if you have 10 or more you need to name them ending filename01.jpg...filename99.jpg etc. The leading zeros are required for proper ordering. If you have 100 or more 001...999.
Unfortunately, convert
changes the image quality before "packing it" into the PDF. So, to have minimal loss of quality, is better to put the original jpg
, (works with .png
too) into the PDF, you need to use img2pdf
.
I use these commands:
A shorter one liner solution also using img2pdf
as suggested in the comments**
-
Make PDF
img2pdf *.jp* --output combined.pdf
-
(optional) OCR the output PDF
ocrmypdf combined.pdf combined_ocr.pdf
Below is the original answer commands with more command and more tools needed:
-
This command is to make a
pdf
file out of everyjpg
image without loss of either resolution or quality:ls -1 ./*jpg | xargs -L1 -I {} img2pdf {} -o {}.pdf
-
This command will concatenate the
pdf
pages into one document:pdftk *.pdf cat output combined.pdf
-
And finally, I add an OCRed text layer that doesn't change the quality of the scan in the pdfs so that they can be searchable:
pypdfocr combined.pdf
An alternative to using pypdfocr
:
`ocrmypdf combined.pdf combined_ocr.pdf`
convert `ls -1v` file.pdf
- This ls will list one file a time in a "natural order" (1,2,3...) and proceed with conversion.