Which duplicate should I remove?

A while ago I somehow managed to duplicate a lot of photos, but I didn't realise until many months later, and now I can't tell which copy of each pair is the duplicate. I can, however, see a consistent pattern of differences between the two copies:

  • One is .jpg and the other is .JPG
  • The .JPG file is consistently a few megabytes bigger than the .jpg
  • The histograms (links below) differ: the .jpg one is smooth and the .JPG one is spiky

I don't know much about histograms, but would it be fair to say that the .JPG histogram is worse because it is spiky, and that the .JPG is therefore the one I should delete? Or should the .JPG be kept because it technically has more "data", given its larger file size?

.jpg histogram .JPG histogram


Solution 1:

I would go for the .JPG ones as being the originals, based on these considerations:

  • They have the additional metadata of "Album name : disc" (meaning CD)
  • The name part has "IMGP" in upper case, which is consistent with an upper-case .JPG extension
  • A program would normally use the lower-case extension .jpg for the files that it creates.

It seems to me that at some point you passed the images through an image optimizer, which reduced their file size (and perhaps also their quality).
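
Before deleting anything, it may be worth checking that the size pattern really holds for every pair. The following is only a rough sketch: it assumes the two copies share the same base name when case is ignored (for example imgp0001.jpg and IMGP0001.JPG) and that they live somewhere under one root folder. The function names `find_case_pairs` and `size_report` are made up for this illustration.

```python
from collections import defaultdict
from pathlib import Path


def find_case_pairs(root):
    """Return (lower_ext_path, upper_ext_path) pairs whose base names match, ignoring case."""
    groups = defaultdict(lambda: {"lower": [], "upper": []})
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        # Compare the extension exactly, so .jpg and .JPG stay distinct.
        if path.suffix == ".jpg":
            groups[path.stem.lower()]["lower"].append(path)
        elif path.suffix == ".JPG":
            groups[path.stem.lower()]["upper"].append(path)

    pairs = []
    for stem in sorted(groups):
        for lower_path in groups[stem]["lower"]:
            for upper_path in groups[stem]["upper"]:
                pairs.append((lower_path, upper_path))
    return pairs


def size_report(pairs):
    """Return (lower_path, upper_path, size of .JPG minus size of .jpg in bytes)."""
    return [
        (lower_path, upper_path, upper_path.stat().st_size - lower_path.stat().st_size)
        for lower_path, upper_path in pairs
    ]
```

If every third element returned by `size_report(find_case_pairs("some/folder"))` is positive, the .JPG copy is the larger one in every pair, which would fit the optimizer explanation. Pairs where it is not are worth inspecting by hand.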

Addition by Mokubai:

A scene optimiser may well also account for the smoother histogram: a gentle noise-removing blur would make the image more "compressor friendly" for the algorithms used. A camera sensor, by contrast, may produce a more quantised and noisy image, which might show up as the more "spiky" histogram. I agree with you that the .JPG is likely the original, as your other points also seem valid.
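
If you want to compare the histograms across many pairs rather than by eye, "spikiness" can be turned into a number. The sketch below is one possible measure, not a standard one: it takes a list of bin counts and reports how far each bin deviates, on average, from the mean of its two neighbours, relative to the average bin count. The function name `spikiness` is an assumption made for this example.

```python
def spikiness(hist):
    """Average deviation of each interior bin from the mean of its neighbours,
    divided by the average bin count. 0.0 means perfectly smooth."""
    if len(hist) < 3:
        raise ValueError("need at least three bins")
    total = sum(hist)
    if total == 0:
        return 0.0
    mean_count = total / len(hist)
    # Sum of |bin - average of left and right neighbour| over interior bins.
    deviation = sum(
        abs(hist[i] - (hist[i - 1] + hist[i + 1]) / 2)
        for i in range(1, len(hist) - 1)
    )
    return deviation / (len(hist) - 2) / mean_count
```

Assuming the Pillow library is installed, a 256-bin luminance histogram for a file could be obtained with `Image.open(path).convert("L").histogram()` and passed to this function. A consistently higher score for the .JPG copies would confirm the visual impression; on its own it does not say which copy is the original.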