Recover files that were added to the index but then removed by a git reset

I added some files to the index but then by mistake I deleted them with git reset --hard. How do I recover them? Here's what happened:

I added all files using git add .
I then committed
When I checked the status, there were still files that weren't included in the commit from the add, which was strange
I added the untracked files again and it worked this time
But I wanted everything to be in 1 single commit so I looked up how to unstage what I just committed
I used git reset --hard HEAD^ — bad idea obviously, all files were deleted
so then I used git reflog to find where I left off
then I used git reflog ______ to go back to my last commit.
then I used git reset HEAD to unstage the commit (what I should have originally done) but the files I added (see above) after the commit were still gone.

How do I get those files back?

First, make a full backup of your Git repository!

When you git add a file, git will create a blob out of this file's content and add it to its object database (.git/objects/??/*).
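To see this for yourself, here is a minimal sketch in a throwaway repository (the file name demo.txt and its content are made up for illustration). It stages a file without committing and then asks Git for the object the index points to:

```shell
set -e
repo=$(mktemp -d)        # scratch repository, unrelated to your real one
cd "$repo"
git init -q

echo 'hello' > demo.txt
git add demo.txt         # staged only, never committed

# The index records the blob hash for each staged path (second column).
blob=$(git ls-files -s demo.txt | awk '{print $2}')

# The blob already exists in the object database.
obj_type=$(git cat-file -t "$blob")
obj_content=$(git cat-file -p "$blob")
echo "$blob is a $obj_type containing: $obj_content"
```

This is why staged content can usually still be recovered after a reset: the blob stays in .git/objects until it is garbage-collected.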

Let's look at your commands, one by one:

I added all files using git add .

$ git add .

This will add all files contained in the current directory and its subdirectories to the index and write their content as blobs to Git's object database. Untracked files matching patterns from .gitignore files will not be added. Tree objects are not written at this point; they are created when you commit. Please see the end of my answer for a note on this command.

I then committed

$ git commit -m'added all files'

This will write a new commit object to the object database. This commit will reference a single tree. The tree references blobs (files) and other trees (subdirectories).

When I checked the status, there were still files that weren't included in the commit from the add, which was strange

$ git status

I can think of two scenarios where this happens: something modified your files or new files were added behind your back.

I added the untracked files again and it worked this time

$ git add .

I assume you used the same add command again, as in step 1.

But I wanted everything to be in 1 single commit so I looked up how to unstage what I just committed

I will tell you a better way at the end of this answer, one that does not require a potentially dangerous reset.

I used git reset --hard HEAD^ — bad idea obviously, all files were deleted

$ git reset --hard HEAD^

This command will set your current working tree and index to be exactly at the commit HEAD^ (the parent of the current commit). In other words, it will discard any local uncommitted changes and move the branch pointer back one commit. It does not touch untracked files, but files that were staged and never committed count as tracked, so they are deleted from the working tree.
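A small sketch of what that means in practice, again in a throwaway repository with made-up file names: a file that was staged but never committed is removed by the hard reset, while a file that was never added survives.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo               # local identity so commits work anywhere
git config user.email demo@example.com

echo one > first.txt
git add first.txt
git commit -q -m 'first'

echo two > second.txt
git add second.txt
git commit -q -m 'second'

echo staged > staged.txt                # staged but never committed
git add staged.txt
echo loose > untracked.txt              # never added at all

git reset -q --hard HEAD^

# first.txt and untracked.txt remain; second.txt and staged.txt are gone.
ls
```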

so then I used git reflog to find where I left off

$ git reflog

This shows the commits that HEAD pointed to recently (identical to git reflog HEAD). If you specify a branch name, it will show you the commits that this branch pointed to recently.

then I used git reflog __ to go back to my last commit.

Not sure about this one. git reflog is (mostly) a read-only command and cannot be used to "get back" to commits. You can only use it to find commits that a branch (or HEAD) pointed to.
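What you probably ran was a reset to one of the entries the reflog listed. A minimal sketch of that combination, assuming a scratch repository with two made-up commits: the reflog is only used to look up the old position, and git reset does the actual moving.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com

echo one > first.txt
git add first.txt
git commit -q -m 'first'
echo two > second.txt
git add second.txt
git commit -q -m 'second'
tip=$(git rev-parse HEAD)

git reset -q --hard HEAD^       # the "mistake": second.txt disappears

# HEAD@{0} is the reset itself, HEAD@{1} is where HEAD was before it.
git reflog -n 2

git reset -q --hard 'HEAD@{1}'  # move the branch back to the old tip
restored=$(git rev-parse HEAD)
```

Note that this only brings back what was committed. Content that was merely staged is not referenced by any reflog entry.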

then I used git reset HEAD to unstage the commit (what I should have originally done) but the files I added (see above) after the commit were still gone.

$ git reset HEAD

This will not unstage this commit, but it will unstage all staged (but uncommitted) changes from the index. Originally (1st step), you wanted to say git reset HEAD^ (or git reset --mixed HEAD^) – this will leave your working tree untouched, but set the index to match the tree pointed to by the commit named by HEAD^.
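For comparison, here is a sketch of the mixed reset in a scratch repository (file names are hypothetical). The commit is undone, but the file it introduced stays on disk:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com

echo one > first.txt
git add first.txt
git commit -q -m 'first'
echo two > second.txt
git add second.txt
git commit -q -m 'second'

git reset -q HEAD^              # same as git reset --mixed HEAD^

commit_count=$(git rev-list --count HEAD)
status=$(git status --porcelain)

# second.txt is still in the working tree and is reported as untracked.
echo "commits: $commit_count"
echo "$status"
```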

Now, to get back your files, you have to use git fsck --full --unreachable --no-reflogs. It will scan all objects in Git's object database and perform a reachability analysis. You want to look for blob objects. A tree object describing the state after your second git add . exists only if something wrote one (a commit or git write-tree, for example); git add on its own writes only blobs.

git cat-file -p <object hash> will print the file's content, so you can verify that you have the right objects. Blobs store only content, not file names, so you have to recognize each file by what it contains. For blobs, you can use IO redirection to write the content to the correct file name. For trees, you have to use git commands (git read-tree). If it's only a few files, you are better off writing them directly to files.
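The whole recovery can be sketched roughly like this. The script first reproduces the loss in a scratch repository, then dumps every unreachable blob into a directory named recovered (both the directory name and the file names are my own choices for illustration). On your real repository you would run only the second half, after making the backup.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com

echo base > base.txt
git add base.txt
git commit -q -m 'base'

# Reproduce the loss: stage a file, then hard-reset it away.
echo 'precious content' > lost.txt
git add lost.txt
git reset -q --hard HEAD

# Recovery: fsck prints lines of the form "unreachable blob <hash>".
hashes=$(git fsck --full --unreachable --no-reflogs 2>/dev/null |
         awk '$2 == "blob" {print $3}')

mkdir -p recovered
for h in $hashes; do
    # Blobs carry no file name, so each one is saved under its hash.
    git cat-file -p "$h" > "recovered/$h"
done

ls recovered
```

Afterwards you can inspect the files in recovered and rename the ones you need.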

A few notes here:

If you want to add files to the last commit (or edit its commit message), you can simply use git commit --amend. It's basically a wrapper around git reset --soft HEAD^ && git commit -c HEAD@{1}.
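A sketch of that workflow in a scratch repository (file names are made up): the forgotten file is staged and folded into the existing commit, so the history still contains a single commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com

echo a > a.txt
git add a.txt
git commit -q -m 'added all files'

# A file that was missed the first time around.
echo b > b.txt
git add b.txt
git commit -q --amend --no-edit   # reuse the previous commit message

commit_count=$(git rev-list --count HEAD)
files=$(git ls-tree --name-only HEAD | tr '\n' ' ')
echo "$commit_count commit(s) containing: $files"
```

No reset is involved, so nothing in the working tree can be lost along the way.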

Also, it's almost never a good idea to use "git add .". Usually, you only want to use it the first time, when you are creating a new repository. Better alternatives are git add -u or git commit -a, which will stage all changes to tracked files. To track new files, it is better to specify them explicitly.

I had a similar issue, but my repo contained a lot of dangling blobs and trees, so I ended up grepping the content of every dangling blob and printing the hashes of the ones that matched. Assuming ${UNIQUE_CODE} is some string that is unique to the files you had in the index, this should give you the hashes of the blobs you are looking for:

for b in $(git fsck --lost-found | awk '$2 == "blob" {print $3}'); do git cat-file -p "$b" | grep -q -- "${UNIQUE_CODE}" && echo "$b"; done
