What is the value of MD5 checksums if the MD5 hash itself could potentially also have been manipulated?

Downloads on websites sometimes have an MD5 checksum, allowing people to confirm the integrity of the file. I have heard this is to allow not only corrupted files to be instantly identified before they cause a problem but also any malicious changes to be easily detected.

I follow the logic as far as file corruption is concerned but if someone deliberately wants to upload a malicious file, then they could generate a corresponding MD5 checksum and post that on the download site along with the altered file. This would deceive anyone downloading the file into thinking it was unaltered.

How can MD5 checksums provide any protection against deliberately altered files if there is no way of knowing if the checksum itself has been compromised?


Solution 1:

I have heard this is to allow [...] for any malicious changes to be detected also.

Well, you heard wrong, then. MD5 (or SHA or whatever) checksums are provided (next to download links, specifically) only for verifying a correct download. The only thing they aim to guarantee is that you have the same file as the server. Nothing more, nothing less. If the server is compromised, you’re SOL. It’s really as simple as that.
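For illustration, here is a minimal sketch of the check a downloader would do, assuming Python's standard hashlib module. The function names file_digest and matches_published are made up for this example, and published_hex stands for whatever checksum string the download page shows.

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large downloads need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path, published_hex, algorithm="md5"):
    """True if the local file hashes to the checksum shown on the site."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

A match only says the local copy equals what the server handed out. If an attacker replaced both the file and the posted checksum, this check still passes.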

Solution 2:

The solution used by some package management systems such as dpkg is to sign the hash: use the hash as input to one of the public key signing algorithms. See http://www.pgpi.org/doc/pgpintro/#p12

If you have the signatory's public key, you can verify the signature, which proves the hash is unmodified. This just leaves you with the problem of getting the right public key in advance. However, someone who tampers with the key distribution once also has to tamper with everything you might later verify with that key; otherwise you'll spot that something strange is going on.
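To make the sign-the-hash idea concrete, here is a toy sketch using textbook RSA with only the Python standard library (Python 3.8+ for the modular inverse). Everything in it is an assumption for illustration: the key is built from two small well-known Mersenne primes, there is no padding, SHA-256 is an arbitrary choice of hash, and the names sign and verify are made up. It is not how dpkg or PGP actually implement signatures, and it is not secure.

```python
import hashlib

# Toy key from two known Mersenne primes. Far too small and unpadded
# for real use; this only illustrates the principle.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept by the signer

def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce the digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """Signer side: apply the private exponent to the hash."""
    return pow(digest_as_int(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Downloader side: needs only the public values N and E."""
    return pow(signature, E, N) == digest_as_int(data)
```

Someone who alters the file cannot produce a matching signature without D, so swapping the file and the posted hash together no longer goes unnoticed, provided the public key (N, E) you hold is the genuine one.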