Determining JPG quality in Python (PIL)

I am playing around with the PIL library in Python, and I am wondering how to determine the quality of a given JPG image. I am trying to open a JPG image, do something to it, and save it again at its original quality. Image.save lets me set the desired quality:

im.save(name, quality = x)  

but I can't see any way to extract the original one. For now I am just guessing: I try to get an output file of the same size as the input by doing a binary search on the 'quality' parameter, but this is not an acceptable long-term solution :)
I also tried using Image.info, but most of my images don't have any useful information there (e.g. 'adobe', 'icc_profile', 'exif', 'adobe_transform').
Help!


Solution 1:

In PIL (and in most software and libraries that use libjpeg), the quality setting is used to construct the quantization table (ref.). In libjpeg, the quality number scales the sample table values (from the JPEG specification, Section K.1). Other libraries assign different tables to different qualities (e.g. Photoshop, digital cameras).

In other words, the quality is really the quantization table, so it is more complex than just a number.
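
To make the scaling concrete, here is a minimal sketch of how libjpeg derives a luminance quantization table from a quality number, based on my reading of jpeg_quality_scaling and jpeg_add_quant_table in libjpeg's jcparam.c. The name scaled_luminance_table is made up for this example; the base table is the sample luminance table from Annex K.1 of the JPEG specification, and the clamp to 255 assumes baseline JPEG.

```python
# Sample luminance quantization table from the JPEG specification, Annex K.1
# (natural row-major order, not zigzag).
BASE_LUMINANCE = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scaled_luminance_table(quality):
    """Return the 64 luminance quantizers for a libjpeg-style quality (1-100)."""
    quality = max(1, min(100, quality))
    # Quality 50 maps to a 100% scale factor (table unchanged);
    # lower qualities scale the table up, higher qualities scale it down.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Round to nearest, then clamp to the baseline range 1..255.
    return [max(1, min(255, (q * scale + 50) // 100)) for q in BASE_LUMINANCE]


print(scaled_luminance_table(75)[:8])  # -> [8, 6, 5, 8, 12, 20, 26, 31]
```

This is also why a file written by a non-libjpeg encoder usually has no exact "quality" equivalent: its tables need not be a scaled copy of the Annex K tables.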

If you want to save your modified images with the same "quality", you only need to use the same quantization table. Fortunately, the quantization table is embedded in each JPEG. Unfortunately, it's not possible to specify a quantization table when saving in PIL. cjpeg, a command-line utility that comes with libjpeg, can do that.
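
As an illustration of the tables being embedded in the file, here is a small sketch that reads them back. It assumes a current Pillow install, where an opened JPEG exposes a `quantization` attribute; the in-memory test image is only there so the example needs no file on disk.

```python
import io

from PIL import Image

# Build a small JPEG in memory so the example is self-contained.
buf = io.BytesIO()
Image.new("RGB", (32, 32), (90, 140, 200)).save(buf, "JPEG", quality=50)
buf.seek(0)

im = Image.open(buf)

# `quantization` maps a table index to its 64 quantizer values
# (for a typical colour JPEG: 0 = luminance, 1 = chrominance).
for index, table in im.quantization.items():
    print(index, list(table)[:8])
```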

Here's some rough code that saves a JPEG with a specified quantization table:

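from subprocess import Popen, PIPE
from PIL import Image

im = Image.open('in.jpg')  # placeholder input; must be in mode 'L' or 'RGB'

# cjpeg reads a PGM/PPM image on stdin and writes the JPEG to -outfile.
proc = Popen(['path/to/cjpeg', '-sample', '1x1', '-optimize', '-progressive',
              '-qtables', '/path/to/qtable', '-outfile', 'out.jpg'],
             stdin=PIPE, stdout=PIPE, stderr=PIPE)
# PNM magic number: P5 = binary greyscale (PGM), P6 = binary RGB (PPM).
P = '5' if im.mode == 'L' else '6'
header = ('P%s\n%s %s\n255\n' % (P, im.size[0], im.size[1])).encode('ascii')
# tobytes() replaces the removed tostring(); stdin expects bytes in Python 3.
stdout, stderr = proc.communicate(header + im.tobytes())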

You will need to find a way to extract the quantization table from the original JPEG. djpeg (part of libjpeg) can do that:

djpeg -verbose -verbose image.jpg > /dev/null

You will also need to find and set the sampling. For more info on that, check here. You can also look at test_subsampling.

UPDATE

I made a PIL fork that adds the ability to specify subsampling, quantization tables, or both when saving a JPEG. You can also specify quality='keep' when saving, and the image will be saved with the same quantization tables and subsampling as the original (the original needs to be a JPEG). There are also some presets (based on Photoshop) that you can pass to quality when saving. My fork.

UPDATE 2

My code is now part of Pillow 2.0. So just do:

pip install Pillow
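
A short usage sketch, assuming a recent Pillow. As far as I can tell, quality='keep' only works on the image object returned by Image.open, because operations such as filter() return a new image that no longer counts as a JPEG; in that case the tables and subsampling of the original can be passed explicitly. The in-memory test image and the variable names are placeholders for this example.

```python
import io

from PIL import Image, ImageFilter, JpegImagePlugin

# Build a small JPEG in memory so the example is self-contained.
buf = io.BytesIO()
Image.new("RGB", (64, 64), (200, 120, 40)).save(buf, "JPEG", quality=60)
buf.seek(0)
original = Image.open(buf)

# Case 1: the object is still the opened JPEG, so 'keep' can be used directly.
kept = io.BytesIO()
original.save(kept, "JPEG", quality="keep")

# Case 2: the edited image is a new object, so reuse the original's
# quantization tables and subsampling explicitly.
edited = original.filter(ImageFilter.BLUR)
explicit = io.BytesIO()
edited.save(
    explicit,
    "JPEG",
    qtables=original.quantization,
    subsampling=JpegImagePlugin.get_sampling(original),
)
```

Reopening either output and comparing its `quantization` attribute with the original's is a quick way to check that the tables were carried over.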

Solution 2:

Quality is an input used to generate the data that is stored in the JPEG. The number itself is not stored in the JPEG.

One way you might be able to determine the quality is to take the top-left 8x8 pixel cell of the image before you edit it and run the JPEG compression formula on just that cell to get close to the original. You need to develop a distance function from the result of that to your original (pixel difference).

You will still be doing a binary search with quality, but it's a much smaller amount of work.
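
A rough sketch of that idea with Pillow, under several assumptions: the function name estimate_quality is made up, the distance function is a plain sum of absolute pixel differences, and because the crop is tiny it scans every candidate quality instead of doing a binary search. It returns the quality whose re-compression changes the cell least, which is only a guess and can be misleading on flat or heavily subsampled images.

```python
import io

from PIL import Image, ImageChops, ImageStat


def estimate_quality(im, block=8, candidates=range(1, 96)):
    """Guess the quality whose re-compression changes the top-left cell least."""
    cell = im.convert("RGB").crop((0, 0, block, block))
    best_quality, best_distance = None, None
    for quality in candidates:
        # Re-compress just the small cell at this candidate quality.
        buf = io.BytesIO()
        cell.save(buf, "JPEG", quality=quality)
        buf.seek(0)
        recompressed = Image.open(buf).convert("RGB")
        # Distance function: sum of absolute per-channel pixel differences.
        diff = ImageChops.difference(cell, recompressed)
        distance = sum(ImageStat.Stat(diff).sum)
        # Strict '<' keeps the lowest quality when several candidates tie.
        if best_distance is None or distance < best_distance:
            best_quality, best_distance = quality, distance
    return best_quality


# Usage on an in-memory JPEG (a placeholder image, not real data).
source = io.BytesIO()
Image.new("RGB", (16, 16), (10, 200, 90)).save(source, "JPEG", quality=70)
source.seek(0)
guess = estimate_quality(Image.open(source))
print(guess)
```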

Here is information on how JPEG compression works:

https://www.dspguide.com/ch27/6.htm

Here's another way, from an MS FAQ:

https://support.microsoft.com/kb/324790

You will have to translate it from C#.