How to recover Git objects damaged by hard disk failure?
I have had a hard disk failure which resulted in some files of a Git repository getting damaged. When running git fsck --full I get the following output:
error: .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack SHA1 checksum mismatch
error: index CRC mismatch for object 6c8cae4994b5ec7891ccb1527d30634997a978ee from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack at offset 97824129
error: inflate: data stream error (invalid code lengths set)
error: cannot unpack 6c8cae4994b5ec7891ccb1527d30634997a978ee from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack at offset 97824129
error: inflate: data stream error (invalid stored block lengths)
error: failed to read object 0dcf6723cc69cc7f91d4a7432d0f1a1f05e77eaa at offset 276988017 from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack
fatal: object 0dcf6723cc69cc7f91d4a7432d0f1a1f05e77eaa is corrupted
I have backups of the repository, but the only backup that includes the pack file has it already damaged. So I think that I have to find out a way to retrieve the single objects from different backups and somehow instruct Git to produce a new pack with only correct objects.
Can you please give me hints how to fix my repository?
In some earlier backups, your bad objects may have been packed in different files, or may still be loose objects, so they may be recoverable.
It seems only a few objects in your database are bad, so you could do it the manual way.
git hash-object, git mktree and git commit-tree do not write an object that is already found in a pack, so start by doing this:
mv .git/objects/pack/* <somewhere>
for i in <somewhere>/*.pack; do
git unpack-objects -r < "$i"
done
rm <somewhere>/*
(Your packs are moved out from the repository, and unpacked again in it; only the good objects are now in the database)
You can do:
git cat-file -t 6c8cae4994b5ec7891ccb1527d30634997a978ee
and check the type of the object.
If the type is blob: retrieve the contents of the file from previous backups (with git show, git cat-file or git unpack-file); then you may use git hash-object -w to rewrite the object in your current repository.
If the type is tree: you could use git ls-tree to recover the tree from previous backups; then git mktree to write it again in your current repository.
If the type is commit: the same with git show, git cat-file and git commit-tree.
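As an alternative to handling each type with its own command, the recovery above can be sketched with git cat-file and git hash-object alone, provided the backup is itself a readable Git repository. This is only a minimal sketch: the function name copy_object and the backup path passed to it are hypothetical, and it assumes a Git version that supports the -C option.

```shell
#!/bin/sh
# copy_object BACKUP_REPO OBJECT_ID  (hypothetical helper, not a Git command)
# Copies one object from a backup repository into the repository in the
# current directory, whatever its type.
copy_object() {
    backup=$1
    sha=$2
    # Ask the backup what kind of object this is (blob, tree, commit or tag).
    kind=$(git -C "$backup" cat-file -t "$sha") || return 1
    # Stream the raw contents out of the backup and write them here.
    # hash-object prints the ID it wrote, which should be identical to $sha.
    git -C "$backup" cat-file "$kind" "$sha" |
        git hash-object -t "$kind" -w --stdin
}
```

Run from inside the damaged repository, for example as copy_object ../repobackup 6c8cae4994b5ec7891ccb1527d30634997a978ee (the path is an assumption). If the printed ID matches the one you asked for, the object is back in the database.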
Of course, I would back up your original working copy before starting this process.
Also, take a look at How to Recover Corrupted Blob Object.
Banengusk put me on the right track. For further reference, I want to post the steps I took to fix my repository corruption. I was lucky enough to find all the needed objects either in older packs or in repository backups.
# Unpack last non-corrupted pack
$ mv .git/objects/pack .git/objects/pack.old
$ git unpack-objects -r < .git/objects/pack.old/pack-012066c998b2d171913aeb5bf0719fd4655fa7d0.pack
$ git log
fatal: bad object HEAD
$ cat .git/HEAD
ref: refs/heads/master
$ ls .git/refs/heads/
$ cat .git/packed-refs
# pack-refs with: peeled
aa268a069add6d71e162c4e2455c1b690079c8c1 refs/heads/master
$ git fsck --full
error: HEAD: invalid sha1 pointer aa268a069add6d71e162c4e2455c1b690079c8c1
error: refs/heads/master does not point to a valid object!
missing blob 75405ef0e6f66e48c1ff836786ff110efa33a919
missing blob 27c4611ffbc3c32712a395910a96052a3de67c9b
dangling tree 30473f109d87f4bcde612a2b9a204c3e322cb0dc
# Copy HEAD object from backup of repository
$ cp repobackup/.git/objects/aa/268a069add6d71e162c4e2455c1b690079c8c1 .git/objects/aa
# Now copy all missing objects from backup of repository and run "git fsck --full" afterwards
# Repeat until git fsck --full only reports dangling objects
# Now garbage collect repo
$ git gc
warning: reflog of 'HEAD' references pruned commits
warning: reflog of 'refs/heads/master' references pruned commits
Counting objects: 3992, done.
Delta compression using 2 threads.
fatal: object bf1c4953c0ea4a045bf0975a916b53d247e7ca94 inconsistent object length (6093 vs 415232)
error: failed to run repack
# Check reflogs...
$ git reflog
# ...then clean
$ git reflog expire --expire=0 --all
# Now garbage collect again
$ git gc
Counting objects: 3992, done.
Delta compression using 2 threads.
Compressing objects: 100% (3970/3970), done.
Writing objects: 100% (3992/3992), done.
Total 3992 (delta 2060), reused 0 (delta 0)
Removing duplicate objects: 100% (256/256), done.
# Done!
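The "copy the missing objects, run git fsck, repeat" step in the transcript above could be automated roughly as follows. This is a sketch under stated assumptions: restore_missing is a hypothetical helper, the backup must be a readable Git repository, and only objects that git fsck lists as "missing <type> <id>" are handled. A ref that points at a missing object (as HEAD did above) is reported differently and still needs a manual copy.

```shell
#!/bin/sh
# restore_missing BACKUP_REPO  (hypothetical helper)
# Repeatedly asks git fsck which objects are missing and copies each one
# from the backup until fsck stops reporting missing objects.
restore_missing() {
    backup=$1
    while :; do
        # fsck prints lines such as "missing blob <id>"; keep only the IDs.
        missing=$(git fsck --full 2>/dev/null | awk '$1 == "missing" { print $3 }')
        if [ -z "$missing" ]; then
            break
        fi
        restored=0
        for sha in $missing; do
            # Skip objects that the backup does not have either.
            kind=$(git -C "$backup" cat-file -t "$sha" 2>/dev/null) || continue
            if git -C "$backup" cat-file "$kind" "$sha" |
                git hash-object -t "$kind" -w --stdin >/dev/null; then
                restored=$((restored + 1))
            fi
        done
        # Stop instead of looping forever when nothing more can be restored.
        if [ "$restored" -eq 0 ]; then
            echo "not found in backup: $missing" >&2
            return 1
        fi
    done
}
```

Call it from inside the damaged repository, for example restore_missing ../repobackup (the path is an assumption), and run git fsck --full afterwards to see what remains.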
Try the following commands first (re-run them if needed):
$ git fsck --full
$ git gc
$ git gc --prune=today
$ git fetch --all
$ git pull --rebase
If you still have problems after that, you can try to:
-
remove all the corrupt objects, e.g.
fatal: loose object 91c5...51e5 (stored in .git/objects/06/91c5...51e5) is corrupt
$ rm -v .git/objects/06/91c5...51e5
-
remove all the empty objects, e.g.
error: object file .git/objects/06/91c5...51e5 is empty
$ find .git/objects/ -size 0 -exec rm -vf "{}" \;
-
check a "broken link" message by:
git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
This will tell you what file the corrupt blob came from!
-
to recover the file, you might be really lucky, and it may be the version that you already have checked out in your working tree. Run:
git hash-object -w my-magic-file
If it outputs the missing SHA1 (4b945..), you're all done!
-
assuming that it was some older version that was broken, the easiest way to do it is to do:
git log --raw --all --full-history -- subdirectory/my-magic-file
and that will show you the whole log for that file (please realize that the tree you had may not be the top-level tree, so you need to figure out which subdirectory it was in on your own), then you can now recreate the missing object with hash-object again.
-
to get a list of all refs with missing commits, trees or blobs:
$ git for-each-ref --format='%(refname)' | while read -r ref; do git rev-list --objects "$ref" >/dev/null || echo "in $ref"; done
It may not be possible to remove some of those refs using the regular branch -d or tag -d commands, since they will die if git notices the corruption. So use the plumbing command git update-ref -d $ref instead. Note that in case of local branches, this command may leave stale branch configuration behind in .git/config. It can be deleted manually (look for the [branch "$ref"] section).
-
After all refs are clean, there may still be broken commits in the reflog. You can clear all reflogs using git reflog expire --expire=now --all. If you do not want to lose all of your reflogs, you can search the individual refs for broken reflogs:
$ (echo HEAD; git for-each-ref --format='%(refname)') | while read -r ref; do git rev-list -g --objects "$ref" >/dev/null || echo "in $ref"; done
(Note the added -g option to git rev-list.) Then, use git reflog expire --expire=now $ref on each of those. When all broken refs and reflogs are gone, run git fsck --full to check that the repository is clean. Dangling objects are OK.
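For the step above that re-hashes a checked-out file in the hope that it matches the missing blob, a search over many candidate files might look like the sketch below. The name find_blob_source is hypothetical. It runs git hash-object without -w, so nothing is written until you have found a match.

```shell
#!/bin/sh
# find_blob_source BLOB_ID [DIRECTORY]  (hypothetical helper)
# Prints every file under DIRECTORY whose Git blob ID equals BLOB_ID.
find_blob_source() {
    wanted=$1
    dir=${2:-.}
    # Skip Git's own metadata; hash every other regular file.
    find "$dir" -type f ! -path '*/.git/*' | while IFS= read -r f; do
        # Without -w, hash-object only computes the ID and writes nothing.
        if [ "$(git hash-object "$f")" = "$wanted" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Once a file is printed, git hash-object -w on that file writes the blob back. The same search can be pointed at a directory of unpacked backups instead of the working tree.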
Below are more advanced commands that can cause data loss in your Git repository if used carelessly, so make a backup first to avoid doing further damage. Try them at your own risk, and only if you know what you're doing.
To pull the current branch on top of the upstream branch after fetching:
$ git pull --rebase
You also may try to checkout new branch and delete the old one:
$ git checkout -b new_master origin/master
To find the corrupted object in git for removal, try the following command:
while [ true ]; do f=`git fsck --full 2>&1|awk '{print $3}'|sed -r 's/(^..)(.*)/objects\/\1\/\2/'`; if [ ! -f "$f" ]; then break; fi; echo delete $f; rm -f "$f"; done
For OSX, use sed -E instead of sed -r.
Another idea is to unpack all objects from the pack files to regenerate all objects inside .git/objects, so try running the following commands within your repository:
$ cp -fr .git/objects/pack .git/objects/pack.bak
$ for i in .git/objects/pack.bak/*.pack; do git unpack-objects -r < "$i"; done
$ rm -frv .git/objects/pack.bak
If the above doesn't help, you can try to rsync or copy the Git objects from another repo, e.g.
$ rsync -varu git_server:/path/to/git/.git local_git_repo/
$ rsync -varu /local/path/to/other-working/git/.git local_git_repo/
$ cp -frv ../other_repo/.git/objects/. .git/objects/
To fix a broken branch when checkout fails as follows:
$ git checkout -f master
fatal: unable to read tree 5ace24d474a9535ddd5e6a6c6a1ef480aecf2625
Try to remove it and checkout from upstream again:
$ git branch -D master
$ git checkout -b master github/master
If Git leaves you in a detached HEAD state, check out master and merge the detached branch into it.
Another idea is to rebase the existing master recursively:
$ git reset HEAD --hard
$ git rebase -s recursive -X theirs origin/master
See also:
- Some tricks to reconstruct blob objects in order to fix a corrupted repository.
- How to fix a broken repository?
- How to remove all broken refs from a repository?
- How to fix corrupted git repository? (seeques)
- How to fix corrupted git repository? (qnundrum)
- Error when using SourceTree with Git: 'Summary' failed with code 128: fatal: unable to read tree
- Recover A Corrupt Git Bare Repository
- Recovering a damaged git repository
- How to fix git error: object is empy / corrupt
- How to diagnose and fix git fatal: unable to read tree
- How to deal with this git error
- How to fix corrupted git repository?
- How do I 'overwrite', rather than 'merge', a branch on another branch in Git?
- How to replace master branch in git, entirely, from another branch?
- Git: "Corrupt loose object"
- Git reset = fatal: unable to read tree
Here are the steps I followed to recover from a corrupt blob object.
1) Identify corrupt blob
git fsck --full
error: inflate: data stream error (incorrect data check)
error: sha1 mismatch 241091723c324aed77b2d35f97a05e856b319efd
error: 241091723c324aed77b2d35f97a05e856b319efd: object corrupt or missing
...
Corrupt blob is 241091723c324aed77b2d35f97a05e856b319efd
2) Move corrupt blob to a safe place (just in case)
mv .git/objects/24/1091723c324aed77b2d35f97a05e856b319efd ../24/
3) Get parent of corrupt blob
git fsck --full
Checking object directories: 100% (256/256), done.
Checking objects: 100% (70321/70321), done.
broken link from tree 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180
to blob 241091723c324aed77b2d35f97a05e856b319efd
Parent hash is 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180.
4) Get file name corresponding to corrupt blob
git ls-tree 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180
...
100644 blob 241091723c324aed77b2d35f97a05e856b319efd dump.tar.gz
...
Find this particular file in a backup or in the upstream git repository (in my case it is dump.tar.gz). Then copy it somewhere inside your local repository.
5) Add previously corrupted file in the git object database
git hash-object -w dump.tar.gz
6) Celebrate!
git gc
Counting objects: 75197, done.
Compressing objects: 100% (21805/21805), done.
Writing objects: 100% (75197/75197), done.
Total 75197 (delta 52999), reused 69857 (delta 49296)
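Steps 3 and 4 above (going from a "broken link" report to a file name) can be condensed into a small helper. This is a sketch only: blob_name is a hypothetical function, and it looks at a single tree, so the name it prints is relative to that tree rather than to the repository root.

```shell
#!/bin/sh
# blob_name TREE_ID BLOB_ID  (hypothetical helper)
# Prints the name under which BLOB_ID is recorded in the tree TREE_ID.
blob_name() {
    # ls-tree prints "<mode> <type> <id><TAB><name>", one entry per line.
    # It only reads the tree, so it works even when the blob itself is gone.
    git ls-tree "$1" | awk -F '\t' -v sha="$2" '
        { split($1, meta, " "); if (meta[3] == sha) print $2 }'
}
```

With the IDs from the fsck output above, blob_name 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180 241091723c324aed77b2d35f97a05e856b319efd would be the call that corresponds to step 4.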