Finding Image resolution in PDF file?

slhck's answer and scruss' comment deserve an update: pdfimages now (at least since version 0.26.5) explicitly lists x-ppi and y-ppi. Here is a sample output:

$ pdfimages -list example.pdf 
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    2244  2244  cmyk    4   8  image  no       215  0   301   301  418K 2.1%
   2     1 image     900   600  rgb     3   8  image  no       324  0  1524  1525 35.5K 2.2%

On Debian (Wheezy) and Fedora (23), pdfimages is part of the poppler-utils package.
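If you only want the resolution columns, a small filter over the -list output may help. This is a minimal sketch, assuming the column layout shown above (x-ppi and y-ppi as the 13th and 14th whitespace-separated fields); list_ppi is a hypothetical helper name, and other poppler versions may lay the columns out differently.

```shell
# Hypothetical helper: reads `pdfimages -list` output on stdin and prints
# one summary line per image. Assumes the column layout shown above.
list_ppi() {
  # Skip the two header lines, then print page, pixel size and ppi.
  awk 'NR > 2 && NF >= 14 {
    printf "page %s: %sx%s px, %s x %s ppi\n", $1, $4, $5, $13, $14
  }'
}

# Usage (example.pdf is a placeholder):
#   pdfimages -list example.pdf | list_ppi
```

For the first row of the sample output this would report a 2244x2244 px image at 301 x 301 ppi.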

I know that you don't want to extract the image data, but this is probably the only way to find out the original resolution.

On *nix, if you have ImageMagick's identify and Xpdf installed¹:

pdfimages -j test.pdf test && for file in $(find . -name "test*.jpg"); do identify "$file"; done

Where test.pdf is your input PDF. The output files are written to test-000.jpg, test-001.jpg, et cetera. This would give you the original size of all the contained images of that PDF².

Example output for a PDF file that only contains one big image:

./test-000.jpg JPEG 2500x1961 2500x1961+0+0 8-bit DirectClass 1.022MB 0.000u 0:00.000

¹ Windows has these too, but the script would be different of course.
² Note that images don't really carry DPI information. Simply speaking: that's just something used for printing, and images don't need an inherent measure of DPI.

What is the optimum resolution for converting a text file into an image PDF: 96 dpi, 300 dpi or more?

Generally, anything you want to print should be 300dpi or more. Most printers will handle a higher resolution too.

For some reason, the latest version of pdfimages that I can upgrade to on my CentOS system is version 3.04.

So, I don't have the -list option as stated by previous answers. However, the test image created from pdfimages based on slhck's answer contains the desired answer!

identify -verbose test-0000.jpg | more

Image: test-0000.jpg  
Format: JPEG (Joint Photographic Experts Group JFIF format)  
Mime type: image/jpeg  
Class: DirectClass  
Geometry: 6600x5100+0+0  
Resolution: 600x600  
Print size: 11x8.5

So the dpi is explicitly shown in the Resolution field (the 6th line of this output) when identify is run with the -verbose option.

So, slhck's answer can be modified to the following.

pdfimages -j test.pdf test && for file in $(find . -name "test*.jpg"); do identify -verbose "$file" | awk 'NR==6'; done
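Picking line 6 depends on the exact layout of identify -verbose, which may differ between ImageMagick versions. A slightly more tolerant sketch matches the Resolution label instead; resolution_of is a hypothetical helper name, and it assumes the field is printed as "Resolution: WxH" as in the output above.

```shell
# Hypothetical helper: reads `identify -verbose` output on stdin and
# prints the value of the first "Resolution:" field it finds.
resolution_of() {
  # $1 is the label regardless of leading indentation; stop after one match.
  awk '$1 == "Resolution:" { print $2; exit }'
}

# Usage (test-0000.jpg is the file extracted above):
#   identify -verbose test-0000.jpg | resolution_of
```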

On another note, I tried running

identify -verbose test.pdf

Format: PDF (Portable Document Format)  
Mime type: application/pdf  
Class: DirectClass  
Geometry: 792x612+0+0  
Resolution: 72x72  
Print size: 11x8.5

It seems that ImageMagick assumes 72 dpi when it reads a PDF directly, so the resolution printed here does not reflect the embedded image.

This worked with a PDF generated by a Kyocera MFP. It is probably only valid for full-page images such as scans.

1. Open the PDF in Reader.
2. Go to File > Properties, Description tab, and read the page size. My example said 8.5x11.0 in.
3. Open the PDF in a text editor (e.g. Notepad) and look for /Width and /Height.
4. Divide the image width and height (in pixels) by the page width and height (in inches).

Example:

5100/8.5=600
6600/11.0=600

My PDF was scanned at a 600x600 resolution.

You can skip the first 2 steps if you already know the page size (A4, for example, is 8.27x11.69 in).
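The division can be scripted as well. This is a minimal sketch using awk for the floating-point arithmetic; dpi is a hypothetical helper name, and the pixel counts and page sizes are the ones from the scan above.

```shell
# Hypothetical helper: dpi PIXELS INCHES -> dots per inch, rounded.
dpi() {
  awk -v px="$1" -v inches="$2" 'BEGIN { printf "%.0f\n", px / inches }'
}

dpi 5100 8.5    # short side of the scan: prints 600
dpi 6600 11.0   # long side of the scan: prints 600
```

If the two results differ noticeably, the image probably does not fill the page, and this method would not apply.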
