How is the Git hash calculated?

Solution 1:

As described in "How is git commit sha1 formed", the formula is:

(printf "<type> %s\0" $(git cat-file <type> <ref> | wc -c); git cat-file <type> <ref>)|sha1sum

In the case of the commit 9eabf5b536662000f79978c4d1b6e4eff5c8d785 (which is v2.4.2^{}, i.e. the commit that the tag v2.4.2 points to, and which references a tree):

(printf "commit %s\0" $(git cat-file commit 9eabf5b536662000f79978c4d1b6e4eff5c8d785 | wc -c); git cat-file commit 9eabf5b536662000f79978c4d1b6e4eff5c8d785 )|sha1sum

That will give 9eabf5b536662000f79978c4d1b6e4eff5c8d785.

As would:

(printf "commit %s\0" $(git cat-file commit v2.4.2^{} | wc -c); git cat-file commit v2.4.2^{})|sha1sum

(still 9eabf5b536662000f79978c4d1b6e4eff5c8d785)

Similarly, computing the SHA1 of the tag v2.4.2 would be:

(printf "tag %s\0" $(git cat-file tag v2.4.2 | wc -c); git cat-file tag v2.4.2)|sha1sum

That would give 29932f3915935d773dc8d52c292cadd81c81071d.
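The same formula can be tried without a repository. The sketch below wraps it in a helper that hashes whatever arrives on stdin; git_object_sha1 is a made-up name for illustration (not a Git command), and it assumes sha1sum is installed (on macOS, shasum would be the substitute).

```shell
# git_object_sha1 is a hypothetical helper name, not part of Git.
# It reads an object body from stdin and hashes "<type> <size>\0<body>".
git_object_sha1() {
    obj_type=$1
    body=$(mktemp)
    cat > "$body"                          # buffer stdin so its size can be measured
    size=$(wc -c < "$body" | tr -d ' ')    # byte count; strip the padding some wc versions add
    { printf '%s %s\0' "$obj_type" "$size"; cat "$body"; } | sha1sum | cut -d' ' -f1
    rm -f "$body"
}

printf 'test content\n' | git_object_sha1 blob
# d670460b4b4aece5915caf5c68d12f560a9fe3e4, the same ID that
# `echo 'test content' | git hash-object --stdin` reports
```

Piping `git cat-file commit <ref>` into `git_object_sha1 commit` should reproduce the one-liners above.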

Solution 2:

There's a bit of confusion here. Git uses different types of objects: blobs, trees, commits and (annotated) tags. The following command:

git cat-file -t <hash>

Tells you the type of object for a given hash. So in your example, the hash 9eabf5b536662000f79978c4d1b6e4eff5c8d785 corresponds to a commit object.

Now, as you figured out yourself, running this:

git cat-file -p 9eabf5b536662000f79978c4d1b6e4eff5c8d785

Gives you the content of the object according to its type (in this instance, a commit).

But, this:

git hash-object fi

...computes the hash of a blob whose content is the output of the previous command (in your example), but it could be anything else (like "hello world!"). Here, try this:

(printf "blob 277\0"; cat fi) | shasum

The output is the same as that of git hash-object fi (277 being the size of fi in bytes in this example). This is basically how Git hashes a blob: it prepends a "blob <size>\0" header to the content and takes the SHA-1 of the result. So by hashing fi, you are generating a blob object. But as we have seen, 9eabf5b536662000f79978c4d1b6e4eff5c8d785 is a commit, not a blob. So you cannot hash fi as a blob and expect to get the same hash.
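To see that the object type alone changes the result, here is a small sketch that hashes the same bytes twice, once with a blob header and once with a commit header. The temporary file stands in for fi, and sha1sum is assumed to be available (shasum on macOS).

```shell
# Same bytes, two different type headers. The temp file is a stand-in for "fi".
content=$(mktemp)
printf 'test content\n' > "$content"
size=$(wc -c < "$content" | tr -d ' ')   # size in bytes, as used in the header

as_blob=$({ printf 'blob %s\0' "$size"; cat "$content"; } | sha1sum | cut -d' ' -f1)
as_commit=$({ printf 'commit %s\0' "$size"; cat "$content"; } | sha1sum | cut -d' ' -f1)

echo "blob header:   $as_blob"
echo "commit header: $as_commit"
rm -f "$content"
```

The two digests differ even though the content is byte-for-byte identical, which is why git hash-object (which defaults to the blob type) cannot reproduce a commit ID.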

A commit's hash is based on several other pieces of information that make it unique (such as the tree, the parent commits, the author, the committer, the dates and the message). The following article tells you exactly what a commit hash is made of:

The anatomy of a git commit

So you could get the same hash by providing all the data specified in the article with the exact same values as those used in the original commit.
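As a rough sketch of that idea, the snippet below assembles the body of a parentless commit by hand and hashes it with a commit header. Every value in it (identity, timestamp, message) is invented for illustration; the tree ID is that of Git's empty tree. It assumes sha1sum is available.

```shell
# All values below are made up for illustration; a real commit has its own.
tree=4b825dc642cb6eb9a060e54bf8d69288fbee4904    # ID of Git's empty tree
ident='A U Thor <author@example.com> 1700000000 +0000'

# Body layout of a root commit: tree, author, committer, blank line, message.
commit_body=$(mktemp)
printf 'tree %s\nauthor %s\ncommitter %s\n\nInitial commit\n' \
    "$tree" "$ident" "$ident" > "$commit_body"
size=$(wc -c < "$commit_body" | tr -d ' ')

commit_id=$({ printf 'commit %s\0' "$size"; cat "$commit_body"; } | sha1sum | cut -d' ' -f1)
echo "$commit_id"
rm -f "$commit_body"
```

A commit with parents would carry one "parent <hash>" line per parent right after the tree line. Changing any single field, even the timestamp by one second, yields a different ID.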

This might be helpful as well:

Git from the bottom up