How can I fix a corrupted Git repository?

I tried cloning my repository which I keep in my Ubuntu One folder to a new machine, and I got this:

cd ~/source/personal
git clone ~/Ubuntu\ One\ Side\ Work/projects.git/

Cloning into 'projects'...
done.
fatal: unable to read tree 29a422c19251aeaeb907175e9b3219a9bed6c616

So I tried looking at the many other questions like this that have been asked here and most of them say to run git fsck --full and then I get this when I try that.

cd ~/Ubuntu\ One\ Side\ Work/projects.git
git fsck --full

Checking object directories: 100% (256/256), done.
Checking objects: 100% (447/447), done.
broken link from  commit 235ae1f48701d577d71ebd430344a159e5ba4881
              to  commit 984c11abfc9c2839b386f29c574d9e03383fa589
broken link from    tree 632a9cf0ef9fccea08438b574e2f1c954f4ff08b
              to    blob 25a742dff0a403b2b3884f2ffddf63eb45721fac
broken link from    tree 632a9cf0ef9fccea08438b574e2f1c954f4ff08b
              to    blob dd4e97e22e159a585b20e21028f964827d5afa4e
broken link from    tree 632a9cf0ef9fccea08438b574e2f1c954f4ff08b
              to    tree 29a422c19251aeaeb907175e9b3219a9bed6c616
broken link from    tree 632a9cf0ef9fccea08438b574e2f1c954f4ff08b
              to    tree 8084e8e04d510cc28321f30a9646477cc50c235c
broken link from    tree 774b5b4157b4caae1c6cad96c8eaf5d4eba2c628
              to    blob a0daa0c1567b55d8de2b4d7a3bc010f58c047eab
broken link from    tree 774b5b4157b4caae1c6cad96c8eaf5d4eba2c628
              to    blob e9052d35bfb6d30065b206fc43f4200a04d5281b
broken link from    tree 774b5b4157b4caae1c6cad96c8eaf5d4eba2c628
              to    blob 1a3a5e4dd2502ac121c22f743c4250e254a94eeb
broken link from    tree 4aa336dc1a5838e8918e03b85580069d83f4ad09
              to    tree 8cc55ec952dc192a233e062201d1e7e873ac3db0
broken link from    tree e5674a91a53e15575a1f3bf5786bc5cc719fb483
              to    blob 4a994e1e7bb7ce28dcec98bad48b9a891d7dec51
broken link from    tree e5674a91a53e15575a1f3bf5786bc5cc719fb483
              to    blob ac033bf9dc846101320c96a5ce8aceb8c96ec098
broken link from    tree 252ab84542264e1589576b6ee51e7a31e580a0e2
              to    tree 2069041cd5950e529e2991d37b7290ec021d90d4
broken link from    tree 2d4964aa4d4f5d8c7228518ce72ef6a63f820c6d
              to    blob d83690e1b9a6bdd8a08754b38231799acefcb2ab
broken link from    tree c7192e82fc581bd6448bda1a25e8729bdac5f4ff
              to    blob 30d54d47ae82add1917ca173d42e58b396df580b
broken link from    tree 7c66306901fc71389623286936cef172d4ffe408
              to    blob bc7e05d705401273b1df4e939de0f540597c0931
broken link from    tree 0940f5fd227d4c84d6e6749d872db50a4522ae3a
              to    tree 923767594ac22023e824948d65622fe5b407d1a1
broken link from    tree 8eadcd2a971e8357d24f0d80f993d2963452209f
              to    blob 2598bde3dc8cb80ee49510b8159344004b88645f
broken link from    tree ffa302dd0d969172ef23caeefe856ab2f57a4e4d
              to    blob d6925fa431be1ac585bf9a481e98f75107a6e6fb
broken link from    tree 7045b8870a49ce30a2027537a96d73d162bda773
              to    blob 25688652dea26f61f576ca1b52b9d1a18fbfd01d
broken link from    tree 37e4705d34bd440ce681ae32ae9a180a13256d72
              to    tree 246f564d4cee53339b8a4244f3173b61caa518eb
missing blob d6925fa431be1ac585bf9a481e98f75107a6e6fb
missing blob ac033bf9dc846101320c96a5ce8aceb8c96ec098
missing tree 29a422c19251aeaeb907175e9b3219a9bed6c616
missing tree 8084e8e04d510cc28321f30a9646477cc50c235c
missing blob 30d54d47ae82add1917ca173d42e58b396df580b
missing tree 8cc55ec952dc192a233e062201d1e7e873ac3db0
missing blob e9052d35bfb6d30065b206fc43f4200a04d5281b
dangling tree 4b26e95db542c72ac4a22ec25abe38fb2de79752
missing blob d83690e1b9a6bdd8a08754b38231799acefcb2ab
missing blob 25a742dff0a403b2b3884f2ffddf63eb45721fac
missing tree 923767594ac22023e824948d65622fe5b407d1a1
missing blob 25688652dea26f61f576ca1b52b9d1a18fbfd01d
missing blob 2598bde3dc8cb80ee49510b8159344004b88645f
dangling tree 3a683869f1bb0c1634de75700c316b3b36570dbd
dangling blob 4098d30843380d798a811f1aa9a02994f0dbbb27
missing tree 2069041cd5950e529e2991d37b7290ec021d90d4
missing blob 4a994e1e7bb7ce28dcec98bad48b9a891d7dec51
missing blob 1a3a5e4dd2502ac121c22f743c4250e254a94eeb
missing blob a0daa0c1567b55d8de2b4d7a3bc010f58c047eab
dangling tree 6c7b5162aa7a303fa3fe8dc393c5da564e309521
missing commit 984c11abfc9c2839b386f29c574d9e03383fa589
missing blob bc7e05d705401273b1df4e939de0f540597c0931
missing blob dd4e97e22e159a585b20e21028f964827d5afa4e
missing tree 246f564d4cee53339b8a4244f3173b61caa518eb
dangling commit a01f5c1e5315dc837203d6dee00d3493be9c5db9

That looks really bad. When I do a git log | head I get this

git log | head

error: Could not read 984c11abfc9c2839b386f29c574d9e03383fa589
fatal: Failed to traverse parents of commit 235ae1f48701d577d71ebd430344a159e5ba4881
commit 2fb0d2d0643b445440f01b164f11ee9ee71fca48
Author: christopher <[email protected]>
Date:   Wed Aug 7 15:51:42 2013 -0400

    finishing chapter 7

Other questions here have said to look at ./git/refs/heads/master. It's a bare repo and refs/heads/ exists but refs/heads/master does not. HEAD in the bare repository says ref: refs/heads/master though.

packed-refs does say this though

# pack-refs with: peeled
2fb0d2d0643b445440f01b164f11ee9ee71fca48 refs/heads/master

Still other questions have suggested running git reflog and no output shows up when I run that.

So I really have no idea what to do here. What strategy should be taken? Is it possible to reset head to this last commit on Aug 7?

Doing a git log and going to the bottom of the screen output shows this:

commit 996e03b949aea176238e3c7a8452700bbb987ac9
Author: christopher <christopher@christopher>
Date:   Wed Jul 3 23:00:44 2013 -0400

    many many changes
error: Could not read 984c11abfc9c2839b386f29c574d9e03383fa589
fatal: Failed to traverse parents of commit 235ae1f48701d577d71ebd430344a159e5ba4881

That seems to be preventing the Git prune from working.


Solution 1:

As an alternative to Todd's last option (Full Restores and Re-Initialization), if only the local repository is corrupted, and you know the URL to the remote, you can use this to reset your .git to match the remote (replacing ${url} with the remote URL):

mv -v .git .git_old &&            # Remove old Git files
git init &&                       # Initialise new repository
git remote add origin "${url}" && # Link to old repository
git fetch &&                      # Get old history
# Note that some repositories use 'master' in place of 'main'. Change the following line if your remote uses 'master'.
git reset origin/main --mixed     # Force update to old history.

This leaves your working tree intact, and only affects Git's bookkeeping.

I also recently made a Bash script for this very purpose (Appendix A), which wraps a bit of safety around this operation.

Note:

  • If your repository has submodules, this process will mess them up somehow, and the only solution I've found so far is deleting them and then using git submodule update --init (or recloning the repository, but that seems too drastic).
  • This tries to determine the correct choice between 'main' and 'master' depending on local configuration settings, however there may be some issues if used on a repository that uses 'master', on a machine that has 'main' as the default branch.
  • This uses wget to check that the url is reachable before doing anything. This is not necessarily the best operation to determine that a site is reachable, and if you haven't got wget available, this can likely be replaced with ping -c 1 "${url_base}" (linux), ping -n 1 "${url_base}" (windows), or curl -Is "${url_base}"

Appendix A - Full script

Also published as a gist, though it is now out of date.

#!/bin/bash

# Usage: fix-git [REMOTE-URL]
#   Must be run from the root directory of the repository.
#   If a remote is not supplied, it will be read from .git/config
#
# For when you have a corrupted local repo, but a trusted remote.
# This script replaces all your history with that of the remote.
# If there is a .git, it is backed up as .git_old, removing the last backup.
# This does not affect your working tree.
#
# This does not currently work with submodules!
# This will abort if a suspected submodule is found.
# You will have to delete them first
# and re-clone them after (with `git submodule update --init`)
#
# Error codes:
# 1: If a URL is not supplied, and one cannot be read from .git/config
# 4: If the URL cannot be reached
# 5: If a Git submodule is detected


if [[ "$(find -name .git -not -path ./.git | wc -l)" -gt 0 ]] ;
then
    echo "It looks like this repo uses submodules" >&2
    echo "You will need to remove them before this script can safely execute" >&2
    echo "Then use \`git submodule update --init\` to re-clone them" >&2
    exit 5
fi

if [[ $# -ge 1 ]] ;
then
    url="$1"
else
    if ! url="$(git config --local --get remote.origin.url)" ;
    then
        echo "Unable to find remote 'origin': missing in '.git/config'" >&2
        exit 1
    fi
fi

if ! branch_default="$(git config --get init.defaultBranch)" ;
then
    # if the defaultBranch config option isn't present, then it's likely an old version of git that uses "master" by default
    branch_default="master"
fi

url_base="$(echo "${url}" | sed -E 's;^([^/]*://)?([^/]*)(/.*)?$;\2;')"
echo "Attempting to access ${url_base} before continuing"
if ! wget -p "${url_base}" -O /dev/null -q --dns-timeout=5 --connect-timeout=5 ;
then
    echo "Unable to reach ${url_base}: Aborting before any damage is done" >&2
    exit 4
fi

echo
echo "This operation will replace the local repo with the remote at:"
echo "${url}"
echo
echo "This will completely rewrite history,"
echo "but will leave your working tree intact"
echo -n "Are you sure? (y/N): "

read confirm
if ! [ -t 0 ] ; # i'm open in a pipe
then
    # print the piped input
    echo "${confirm}"
fi
if echo "${confirm}"|grep -Eq "[Yy]+[EeSs]*" ; # it looks like a yes
then
    if [[ -e .git ]] ;
    then
        # remove old backup
        rm -vrf .git_old | tail -n 1 &&
        # backup .git iff it exists
        mv -v .git .git_old
    fi &&
    git init &&
    git remote add origin "${url}" &&
    git config --local --get remote.origin.url | sed 's/^/Added remote origin at /' &&
    git fetch &&
    git reset "origin/${branch_default}" --mixed
else
    echo "Aborting without doing anything"
fi

Solution 2:

TL;DR

Git doesn't really store history the way you think it does. It calculates history at run-time based on an ancestor chain. If your ancestry is missing blobs, trees, or commits then you may not be able to fully recover your history.

Restore Missing Objects from Backups

The first thing you can try is to restore the missing items from backup. For example, see if you have a backup of the commit stored as .git/objects/98/4c11abfc9c2839b386f29c574d9e03383fa589. If so you can restore it.

You may also want to look into git-verify-pack and git-unpack-objects in the event that the commit has already been packed up and you want to return it to a loose object for the purposes of repository surgery.

Surgical Resection

If you can't replace the missing items from a backup, you may be able to excise the missing history. For example, you might examine your history or reflog to find an ancestor of commit 984c11abfc9c2839b386f29c574d9e03383fa589. If you find one intact, then:

  1. Copy your Git working directory to a temporary directory somewhere.
  2. Do a hard reset to the uncorrupted commit.
  3. Copy your current files back into the Git work tree, but make sure you don't copy the .git folder back!
  4. Commit the current work tree, and do your best to treat it as a squashed commit of all the missing history.

If it works, you will of course lose the intervening history. At this point, if you have a working history log, then it's a good idea to prune your history and reflogs of all unreachable commits and objects.

Full Restores and Re-Initialization

If your repository is still broken, then hopefully you have an uncorrupted backup or clone you can restore from. If not, but your current working directory contains valid files, then you can always re-initialize Git. For example:

rm -rf .git
git init
git add .
git commit -m 'Re-initialize repository without old history.'

It's drastic, but it may be your only option if your repository history is truly unrecoverable. YMMV.

Solution 3:

Before trying any of the fixes described on this page, I would advise to make a copy of your repository and work on this copy only. Then at the end if you can fix it, compare it with the original to ensure you did not lose any file in the repair process.

Another alternative which worked for me was to reset the Git head and index to its previous state using:

git reset --keep

You can also do the same manually by opening the Git GUI and selecting each "Staged changes" and click on "Unstage the change". When everything is unstaged, you should now be able to compress your database, check your database and commit.

I also tried the following commands, but they did not work for me. But they might for you depending on the exact issue you have:

git reset --mixed
git fsck --full
git gc --auto
git prune --expire now
git reflog --all

Finally, to avoid this problem of synchronization damaging your Git index (which can happen with Dropbox, SpiderOak, or any other cloud disk), you can do the following:

  1. Convert your .git folder into a single "bundle" Git file by using: git bundle create my_repo.git --all, then it should work just the same as before, but since everything is in a single file you won't risk the synchronization damaging your git repo any more.
  2. Disable instantaneous synchronization: SpiderOak allows you to set the scheduling for checking changes to "automatic" (which means that it is as soon as possible, being monitoring file changes thanks to the OS notifications). This is bad, because it will start to upload changes as soon as you are doing a change, and then download the change, so it might erase the latest changes you were just doing. A solution to fix this issue is to set the changes monitoring delay to 5 minutes or more. This also fixes issues with instant saving note taking applications (such as Notepad++).