Rewrite and re-sign Git commit history with git-filter-repo
I have an existing private Git repository (with two branches), hosted on GitHub. I plan to make it public, but I don't want to expose the author email address. So the plan is to rewrite the commit history, replacing <old-author-email> with <new-author-email>, and re-sign the commits. I also want to preserve the timestamps (this is important for me!).
Based on extensive research, I have come up with the following steps:
- Clone the repository on my local machine.
- Create a Python 3.x virtual environment, and run: pip install git-filter-repo (source)
- Create a .mailmap file in the repository's WORKING_DIR, with the email remapping information. (source)
- Run: git-filter-repo --force --mailmap .mailmap (source1, source2)
- Somehow re-sign the new commits!
- Then run the following to rewrite the remote commit history: git push origin <your_branch_name> --force (source)
  - If this step doesn't work, I would simply delete the original GitHub repository, create a new one with the same name, and then push the local repository to it.
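For the .mailmap step in the list above, here is a minimal sketch of what the remapping entries could look like when only the email has to change. It assumes the email-only form from the gitmailmap format, "<proper@email> <commit@email>", which as far as I know git-filter-repo also accepts. The helper names mailmap_email_line and write_mailmap are made up for illustration.

```python
from typing import Iterable


def mailmap_email_line(new_email: str, old_email: str) -> str:
    """Return a mailmap entry that replaces old_email with new_email.

    Uses the email-only form "<proper@email> <commit@email>", so the
    recorded author/committer names are left untouched.
    """
    return f"<{new_email}> <{old_email}>\n"


def write_mailmap(path: str, new_email: str, old_emails: Iterable[str]) -> None:
    """Write one remapping line per old address to the file at path."""
    with open(path, "w", encoding="utf-8") as f:
        for old in old_emails:
            f.write(mailmap_email_line(new_email, old))
```

For example, write_mailmap(".mailmap", "<new-author-email>", ["<old-author-email>"]) would produce a one-line file that can be passed to git-filter-repo --mailmap.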
The above steps (up to step 4) rewrite the author email and recompute the hashes, but discard the signatures. The documentation says (source):
Since git filter-repo calls fast-export and fast-import to do a lot of the heavy lifting, it inherits limitations from those systems ... commits get rewritten meaning they will have new hashes; therefore, signatures on commits and tags cannot continue to work and instead are just removed (thus signed tags become annotated tags)
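One way to confirm that the signatures really were dropped (and later, that re-signing worked) is to ask git log for the signature status of every commit. The sketch below relies on the %G? format placeholder, which prints one letter per commit, such as G for a good signature or N for no signature. The function names are hypothetical, and it assumes git is on the PATH.

```python
import subprocess
from collections import Counter


def signature_summary(log_output: str) -> Counter:
    """Count commits per signature status.

    Expects the output of: git log --all --format='%H %G?'
    Each line is "<hash> <status letter>", e.g. "... N" for an unsigned commit.
    """
    counts = Counter()
    for line in log_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counts[parts[1]] += 1
    return counts


def read_signature_log(repo_dir: str) -> str:
    """Run git log over all refs in repo_dir and return "<hash> <status>" lines."""
    result = subprocess.run(
        ["git", "log", "--all", "--format=%H %G?"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Calling signature_summary(read_signature_log("<repo-directory>")) right after git-filter-repo should then report every commit under N, and after a successful re-sign under G (or another status if the key cannot be validated locally).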
So, to re-sign the commits, I found the following method (source):
git filter-branch --commit-filter 'git commit-tree -S "$@";' HEAD
But using git filter-branch is not recommended (source), even by Git's own documentation (source).
Questions:
- Is there a way to rewrite the author email and re-sign the new commits, all using git-filter-repo?
- If not, how can I rewrite the author email over the entire commit history and re-sign the new commits? (A better way than mine!)
This method DOESN'T work if the repository has multiple branches. So, before getting started, merge all secondary branches into the primary branch (usually master/main), as your situation requires. Then delete all the secondary branches, in both the local and the remote repositories.
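If merging the branches is undesirable, a possible alternative (untested on this repository) is to let git filter-branch walk every ref instead of only HEAD: everything after "--" is handed to git rev-list, so "-- --all" selects all branches. The sketch below only builds and runs that command. The function names are hypothetical, it assumes a signing key is already configured, and git filter-branch remains discouraged by Git's documentation.

```python
import subprocess
from typing import List


def resign_command(all_refs: bool = True, force: bool = False) -> List[str]:
    """Build a git filter-branch command that re-signs commits with commit-tree -S."""
    cmd = ["git", "filter-branch"]
    if force:
        # -f overwrites the backup refs left behind by an earlier filter-branch run.
        cmd.append("-f")
    cmd += ["--commit-filter", 'git commit-tree -S "$@";']
    # Arguments after "--" go to git rev-list; "--all" covers every ref.
    cmd += ["--", "--all"] if all_refs else ["HEAD"]
    return cmd


def resign_all_branches(repo_dir: str) -> None:
    """Run the re-signing command inside repo_dir (assumes git and gpg are set up)."""
    subprocess.run(resign_command(force=True), cwd=repo_dir, check=True)
```

Because shared ancestors are rewritten once in a single pass, the branches should keep their common history, which separate per-branch runs would not guarantee.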
I came up with a possible solution:
- Clone the repository on my local machine.
- Create a Python 3.x virtual environment, activate it, and then run: pip install git-filter-repo GitPython
- cd <repo-directory>/../, i.e., the immediate parent directory of the local repository's WORKING_DIR.
- Run the following Python script. It creates a .mailmap file, right outside the <repo-directory>, listing every unique author/committer name and email combination that appears in the repository's commit history, each mapped to NEW_NAME and NEW_EMAIL. (Edit the code further to exclude certain names/emails from being rewritten.)
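from git import Repo

repo_path = '<repo-directory>'
repo = Repo(repo_path)

# Collect every unique (name, email) pair reachable from the current branch.
commits_list = list(repo.iter_commits())
unq = set()
for cmt in commits_list:
    unq.add((cmt.committer.name, cmt.committer.email))
    unq.add((cmt.author.name, cmt.author.email))

NEW_NAME = "<new-author-name>"
NEW_EMAIL = "<new-author-email>"

# One mailmap line per old identity: "New Name <new@email> Old Name <old@email>".
lines = list()
for c in unq:
    lines.append(f"{NEW_NAME} <{NEW_EMAIL}> {c[0]} <{c[1]}>\n")

# Written to the current directory, i.e. the parent of the repository.
with open(".mailmap", "w") as f:
    f.writelines(lines)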
- cd <repo-directory>, and then run: git-filter-repo --mailmap ../.mailmap
  - Running this command removes the remote repository links, i.e., git remote -v returns nothing afterwards. (git-filter-repo drops the origin remote on purpose, to make it harder to push rewritten history by accident.)
  - To restore the repository's remote, run: git remote add origin <remote-repo-url>
- DON'T DO THIS if you have multiple branches: to sign the commits, run git filter-branch --commit-filter 'git commit-tree -S "$@";' HEAD. You might also have to add the -f flag to make it work. With HEAD, the command only covers the current branch, so it would have to be run for each branch separately, i.e., you would need to check out each branch and run the command there.
  - After performing the above steps, I ran git fsck --full --strict and got no errors. So, I think it worked fine!
  - But is there a better way to check the repository's integrity?
- Finally, run git push origin <primary-branch> --force to overwrite the remote with the updated commit history.
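Since preserving the timestamps is a hard requirement, it may be worth comparing the dates before and after the steps above, in addition to git fsck. The hashes change during the rewrite, so this sketch compares the sorted (author date, committer date) pairs from an untouched clone with those from the rewritten one. It uses the %aI and %cI placeholders (strict ISO 8601 dates); the function names and the idea of keeping a second, untouched clone are assumptions for illustration.

```python
import subprocess
from typing import List, Tuple


def parse_dates(log_output: str) -> List[Tuple[str, str]]:
    """Parse "git log --all --format='%aI %cI'" output into sorted date pairs."""
    pairs = []
    for line in log_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
    return sorted(pairs)


def read_dates(repo_dir: str) -> str:
    """Return "<author date> <committer date>" lines for all commits in repo_dir."""
    result = subprocess.run(
        ["git", "log", "--all", "--format=%aI %cI"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout


def same_timestamps(before: str, after: str) -> bool:
    """True if both logs contain exactly the same set of date pairs."""
    return parse_dates(before) == parse_dates(after)
```

A call such as same_timestamps(read_dates("<original-clone>"), read_dates("<repo-directory>")) returning True would suggest that neither the author dates nor the committer dates were altered.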