Finding Image resolution in PDF file?
slhck's answer and scruss' comment deserve an update: pdfimages now (at least since version 0.26.5) explicitly lists x-ppi and y-ppi. Here is a sample output:
$ pdfimages -list example.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    2244  2244  cmyk    4   8  image  no       215  0   301   301  418K 2.1%
   2     1 image     900   600  rgb     3   8  image  no       324  0  1524  1525 35.5K 2.2%
On Debian (Wheezy) and Fedora (23), pdfimages is part of the poppler-utils package.
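If you only want the resolution columns, the listing can be filtered with awk. This is a minimal sketch that assumes the 16-column layout shown in the sample output above (x-ppi and y-ppi in columns 13 and 14); other poppler versions may lay the table out differently. The function name list_ppi is made up for illustration.

```shell
# Print page, pixel size and x/y PPI for each image row of `pdfimages -list`.
# Assumption: two header lines, then 16 whitespace-separated columns per image,
# as in the sample output above.
list_ppi() {
    awk 'NR > 2 { printf "page %s: %sx%s px, %s x %s ppi\n", $1, $4, $5, $13, $14 }'
}

# Usage (requires poppler-utils):
#   pdfimages -list example.pdf | list_ppi
```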
I know that you don't want to extract the image data, but this is probably the only way to find out the original resolution.
On *nix, if you have ImageMagick's identify and Xpdf installed [1]:
pdfimages -j test.pdf test && for file in $(find . -name "test*.jpg"); do identify "$file"; done
Where test.pdf is your input PDF. The output files are written to test-000.jpg, test-001.jpg, et cetera. This gives you the original size of all the images contained in that PDF [2].
Example output for a PDF file that only contains one big image:
./test-000.jpg JPEG 2500x1961 2500x1961+0+0 8-bit DirectClass 1.022MB 0.000u 0:00.000
1) Windows has these too, but the script would be different of course.
2) Note that images don't really carry DPI information. Simply speaking: That's just something used for printing and images don't need an inherent measure of DPI.
What is the optimum resolution for converting a text file into an image PDF: 96dpi, 300dpi or more?
Generally, anything you want to print should be 300dpi or more. Most printers will handle a higher resolution too.
For some reason, the latest version of pdfimages that I can upgrade to on my CentOS is version 3.04, so I don't have the -list option mentioned in the previous answers. However, the test image that pdfimages creates, following slhck's answer, contains the desired information!
identify -verbose test-0000.jpg | more
Image: test-0000.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Mime type: image/jpeg
Class: DirectClass
Geometry: 6600x5100+0+0
Resolution: 600x600
Print size: 11x8.5
So the dpi is shown explicitly on the 6th line of the output when identify is run with the -verbose option, and slhck's answer can be modified as follows:
pdfimages -j test.pdf test && for file in $(find . -name "test*.jpg"); do identify -verbose "$file" | awk 'NR==6'; done
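Picking the 6th line by number is fragile, since the position of the Resolution line can shift between ImageMagick versions and image formats. A hedged alternative is to match on the label instead. The helper name resolution_of is made up for illustration, and the sketch assumes the "Resolution: WxH" line format shown in the output above.

```shell
# Extract the value of the "Resolution:" line from `identify -verbose` output,
# matching on the label rather than on a fixed line number.
resolution_of() {
    awk '$1 == "Resolution:" { print $2 }'
}

# Usage (requires ImageMagick):
#   identify -verbose test-0000.jpg | resolution_of
```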
On another note, I tried running
identify -verbose test.pdf
Format: PDF (Portable Document Format)
Mime type: application/pdf
Class: DirectClass
Geometry: 792x612+0+0
Resolution: 72x72
Print size: 11x8.5
It seems that ImageMagick always assumes 72dpi when it reads a PDF directly, so the resolution printed here says nothing about the embedded image and appears to be incorrect.
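The Geometry line is still useful, though. At 72dpi one pixel corresponds to one PDF point (1/72 inch), so 792x612 is simply the page size in points. A small sketch of the conversion to inches, with pt_to_in as a made-up helper name:

```shell
# Convert a length in PDF points (1/72 inch) to inches.
pt_to_in() {
    awk -v pt="$1" 'BEGIN { printf "%g\n", pt / 72 }'
}

pt_to_in 792   # page width in inches
pt_to_in 612   # page height in inches
```

This matches the "Print size: 11x8.5" line that identify reports.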
This worked with a PDF generated from a Kyocera MFP... This is probably only valid for full-page images like scans.
- Open the PDF with Reader.
- Go to File > Properties > Description tab > Page size. My example said 8.5x11.0 in.
- Open the PDF with a text editor (Notepad) and look for /Width and /Height.
- Take the height and width and divide them by the page height and width (in inches).
Example:
5100/8.5=600
6600/11.0=600
My PDF was scanned at a 600x600 resolution.
You can skip the first 2 steps if you already know the page size (A4, for example, is 8.27x11.69 in).
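The division above can be scripted. This is a minimal sketch in which the helper name ppi is made up for illustration; it takes the pixel count from /Width or /Height and the matching page dimension in inches.

```shell
# Resolution in pixels per inch = image pixels / page size in inches.
ppi() {
    awk -v px="$1" -v inches="$2" 'BEGIN { printf "%.0f\n", px / inches }'
}

ppi 5100 8.5    # short side of the scanned page
ppi 6600 11.0   # long side of the scanned page
```

Both calls print 600 for the example values, which agrees with the 600x600 scan resolution.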