How can I clean my .git folder? Cleaned up my project directory, but .git is still massive

The .git/objects directory in my Rails project is still massive, even after I deleted hundreds of megabytes of accidentally generated garbage.

I have tried git add -A, as well as other commands to update the index and remove nonexistent files. I gather, perhaps incorrectly, that the files with two character names in the directory are blobs. I have tried rolling back to previous commits, but no luck.

What can I do to clean this directory?


Solution 1:

  • If you added the files and then removed them, the blobs still exist but are dangling. git fsck will list unreachable blobs, and git prune will delete them.

  • If you added the files, committed them, and then rolled back with git reset --hard HEAD^, they’re stuck a little deeper. git fsck will not list any dangling commits or blobs, because your branch’s reflog is holding onto them. Here’s one way to ensure that only objects which are in your history proper will remain:

    git reflog expire --expire=now --all
    git repack -ad  # Remove dangling objects from packfiles
    git prune       # Remove dangling loose objects
    
  • Another way is to clone the repository, since a clone carries only the objects that are reachable. However, if the dangling objects got packed (and if you performed many operations, git may well have packed them automatically), then a local clone will carry the entire packfile:

    git clone foo bar                 # bad
    git clone --no-hardlinks foo bar  # also bad
    

    You must specify a protocol to force git to compute a new pack:

    git clone file:///path/to/foo bar  # good (file:// takes an absolute path)
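
To see the second case in action, here is a rough sketch that reproduces it in a throwaway repository. The directory, file names, and identity settings are made up for the demo; only the three cleanup commands come from the answer above.

```shell
#!/bin/sh
# Sketch: reproduce the "committed, then reset --hard" case in a scratch repo.
# All paths and names here are hypothetical, chosen only for the demo.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "initial"

# Commit about 1 MB of garbage, remember its blob ID, then roll back.
head -c 1000000 /dev/urandom > garbage.bin
git add garbage.bin
git commit -q -m "oops"
blob=$(git rev-parse HEAD:garbage.bin)
git reset -q --hard HEAD^

# The reflog still points at the old commit, so the blob survives.
git cat-file -e "$blob" && echo "blob still present"

# The cleanup sequence from the answer.
git reflog expire --expire=now --all
git repack -adq
git prune

if git cat-file -e "$blob" 2>/dev/null; then
    echo "blob still present"
else
    echo "blob removed"
fi
```

If the cleanup behaves as described, the first check should report the blob as present and the second as removed. Note that expiring the reflog throws away your safety net for recovering lost commits, so try this on a copy first.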
    

Solution 2:

Have you tried the git gc command?
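
A quick way to check whether gc actually helps is to compare git count-objects before and after. This is only a sketch: report_gc_effect is a made-up helper name, and the repository path is whatever you pass in. Keep in mind that gc leaves alone anything the reflog still references, so the reflog step from Solution 1 may be needed first.

```shell
#!/bin/sh
# Sketch: show object counts and sizes before and after a gc run.
# report_gc_effect is a hypothetical helper; $1 is the repository path.
report_gc_effect() {
    repo=${1:-.}
    echo "before:"
    git -C "$repo" count-objects -vH
    # Repack and drop unreachable loose objects regardless of their age.
    git -C "$repo" gc --quiet --prune=now
    echo "after:"
    git -C "$repo" count-objects -vH
}
```

Call it as report_gc_effect /path/to/repo and compare the size and size-pack lines in the two reports.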

Solution 3:

SparkleShare created 13 GB of tmp_pack_ files in my .git directory after repeatedly failing to pull a huge check-in of images. The only thing that helped was:

rm -f .git/objects/*/tmp_*

'git gc' did not remove those files.

Solution 4:

If you still have a large repo after pruning and repacking (gc --aggressive --prune=tomorrow...) then you can simply go looking for the odd one out:

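git rev-list --objects --all |
    while read -r sha1 fname
    do
        # Print size in bytes, object ID, and path (the path is empty for commits)
        printf '%s\t%s\t%s\n' "$(git cat-file -s "$sha1")" "$sha1" "$fname"
    done | sort -n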

This will give you a list of objects sorted by ascending size, so the largest ones end up at the bottom. You could then use git filter-branch to remove the culprit from your repo.

See "Removing Objects" in http://progit.org/book/ch9-7.html for guidance
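
On a big repository the loop above starts one git cat-file process per object, which can be slow. A possible alternative, sketched here, streams all object IDs through a single git cat-file --batch-check. The helper name list_objects_by_size is made up for this example.

```shell
#!/bin/sh
# Sketch: list every reachable object as "size id path", smallest first.
# list_objects_by_size is a hypothetical helper; $1 is the repository path.
list_objects_by_size() {
    repo=${1:-.}
    git -C "$repo" rev-list --objects --all |
        # %(rest) echoes whatever followed the object ID on the input line,
        # which for rev-list --objects is the path (empty for commits).
        git -C "$repo" cat-file --batch-check='%(objectsize) %(objectname) %(rest)' |
        sort -n
}
```

For example, list_objects_by_size . | tail -n 20 shows the twenty largest objects. The sizes are uncompressed object sizes, not the space used inside the packfile.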

Solution 5:

Recursively:

find ./ -iname '*.!*' -size 0 -delete
for i in */.git; do ( echo "$i"; cd "$i/.." && git gc --aggressive --prune=now --force ); done