What's the fastest way to edit hundreds of Git commit messages?
I have a fairly large Git repository with 1000s of commits, originally imported from SVN. Before I make my repo public, I'd like to clean up a few hundred commit messages that don't make sense in my new repo, as well as to remove all that git-svn informational text that got added.
I know that I can use 'git rebase -i' and then 'git commit --amend' to edit each individual commit message, but with hundreds of messages to be edited, that's a huge pain in the you-know-what.
Is there any faster way to edit all of these commit messages? Ideally I'd have every commit message listed in a single file where I could edit them all in one place.
Thanks!
This is an old question, but since nobody has mentioned git filter-branch, I'll add my two cents.
I recently had to mass-replace text in commit messages, swapping one block of text for another without changing the rest of each message. For instance, I had to replace Refs: #xxxxx with Refs: #22917.
I used git filter-branch like this:
git filter-branch --msg-filter 'sed "s/Refs: #xxxxx/Refs: #22917/g"' master..my_branch
- I used the --msg-filter option to edit only the commit message, but you can use other filters to change files, edit full commit info, etc.
- I limited filter-branch to the commits that are not in master (master..my_branch), but you can apply it to your whole branch by omitting the range of commits.
As suggested in the doc, try this on a copy of your branch. Hope that helps.
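For the git-svn text mentioned in the question, the same --msg-filter approach could call a small script instead of sed. Here is a minimal sketch in Python, assuming the import left trailer lines of the form "git-svn-id: <url>@<rev> <uuid>"; the function name strip_git_svn_id is made up for illustration.

```python
import re

# Matches a whole "git-svn-id: ..." line, including its trailing newline.
# Assumption: git-svn wrote the trailer at the start of its own line.
GIT_SVN_ID = re.compile(r"^git-svn-id: .*\n?", re.MULTILINE)


def strip_git_svn_id(message: str) -> str:
    """Remove git-svn-id lines and the blank lines left behind at the end."""
    cleaned = GIT_SVN_ID.sub("", message)
    return cleaned.rstrip() + "\n"


# To use this as a message filter, wire it to stdin/stdout in a script:
#   import sys
#   sys.stdout.write(strip_git_svn_id(sys.stdin.read()))
```

Saved as a script with the last two lines uncommented, it could be passed as --msg-filter "python3 /path/to/strip-svn-id.py" (the path is a placeholder). As with the sed version, try it on a copy first.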
Sources used for the answer
- Use case on when to use the function : https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#The-Nuclear-Option:-filter-branch
- Function reference (with the list of options) : https://git-scm.com/docs/git-filter-branch
- Examples of rewrite : https://davidwalsh.name/update-git-commit-messages
This is easy to do as follows:
- Perform first import.
- Export all commits as text:
git format-patch -10000
The number should be larger than the total number of commits. This creates many files named NNNNN-commit-description.patch.
- Edit these files with a script. (Do not touch anything in them except the commit message at the top.)
- Copy or move the edited files to an empty Git repo or branch.
- Import all edited commits back:
git am *.patch
This only works with a single branch, but it works very well.
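The "edit these files using some script" step could look roughly like the sketch below. It assumes each patch file has the usual format-patch layout: mail headers, a Subject: line, the message body, then a line containing only "---" before the diffstat and diff. The function name rewrite_patch_message is hypothetical, and the code only touches lines between the Subject: header and that separator.

```python
def rewrite_patch_message(patch_text: str, old: str, new: str) -> str:
    """Replace old with new in the Subject and body of one patch file."""
    lines = patch_text.split("\n")
    # The commit message starts at the Subject: header...
    start = next(i for i, line in enumerate(lines) if line.startswith("Subject: "))
    # ...and ends at the first "---" line, which precedes the diffstat and diff.
    end = next(i for i in range(start, len(lines)) if lines[i] == "---")
    for i in range(start, end):
        lines[i] = lines[i].replace(old, new)
    return "\n".join(lines)


# Possible driver loop (assumes the patches are UTF-8 text):
#   from pathlib import Path
#   for path in sorted(Path(".").glob("*.patch")):
#       text = path.read_text(encoding="utf-8")
#       path.write_text(rewrite_patch_message(text, "old", "new"), encoding="utf-8")
```

A file without a Subject: line or a "---" separator raises StopIteration here, which is probably what you want: better to stop than to edit the diff by accident.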
git-filter-repo (https://github.com/newren/git-filter-repo) is now the recommended tool. I used it like this:
PS C:\repository> git filter-repo --commit-callback '
>> msg = commit.message.decode(\"utf-8\")
>> newmsg = msg.replace(\"old string\", \"new string\")
>> commit.message = newmsg.encode(\"utf-8\")
>> ' --force
New history written in 328.30 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 087f91945a blah blah
Enumerating objects: 346091, done.
Counting objects: 100% (346091/346091), done.
Delta compression using up to 8 threads
Compressing objects: 100% (82068/82068), done.
Writing objects: 100% (346091/346091), done.
Total 346091 (delta 259364), reused 346030 (delta 259303), pack-reused 0
Completely finished after 443.37 seconds.
PS C:\repository>
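You probably don't want to copy the PowerShell prompt and output, so here is just the command (the backslash-escaped quotes are there for PowerShell; in a POSIX shell, plain double quotes inside the single-quoted script should work):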
git filter-repo --commit-callback '
msg = commit.message.decode(\"utf-8\")
newmsg = msg.replace(\"old string\", \"new string\")
commit.message = newmsg.encode(\"utf-8\")
' --force
If you want to hit all the branches, don't use --refs HEAD. If you don't want to use --force, you can run it on a clean git clone --no-checkout. This got me started: https://blog.kawzeg.com/2019/12/19/git-filter-repo.html
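Since the callback body is plain Python, the replacement logic can be tried out on its own before rewriting anything. Below is a small sketch that works directly on bytes (commit.message is a bytes object, which is why the command above decodes it), so messages in an old non-UTF-8 encoding, not unusual after an SVN import, don't raise a decode error. The REPLACEMENTS table and the name rewrite_message are placeholders I made up.

```python
# Placeholder table: each key is replaced by its value in every message.
REPLACEMENTS = {
    b"old string": b"new string",
    b"Refs: #xxxxx": b"Refs: #22917",
}


def rewrite_message(raw: bytes) -> bytes:
    """Apply every replacement to a raw commit message, without decoding it."""
    for old, new in REPLACEMENTS.items():
        raw = raw.replace(old, new)
    return raw


# Inside a --commit-callback body, the same loop would operate on
# commit.message and assign the result back to it.
```

git-filter-repo also has a --message-callback option for callbacks that only touch the message; check the manual of your installed version for the exact contract.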
You can use git rebase -i and replace pick with reword (or just r). The rebase then stops at every such commit, giving you a chance to edit the message.
The only disadvantages are that you don't see all messages at once and that you can't go back when you spot an error.
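With hundreds of commits, changing pick to reword by hand gets tedious too. Git runs the program named in the GIT_SEQUENCE_EDITOR environment variable on the rebase todo file, so a script can do that part. This is only a sketch: mark_reword and the hash prefixes are made-up names, and it assumes the default todo format of "pick <abbreviated hash> <subject>".

```python
from typing import Iterable


def mark_reword(todo_text: str, prefixes: Iterable[str]) -> str:
    """Change 'pick' to 'reword' for todo lines whose hash starts with a prefix."""
    wanted = tuple(prefixes)
    result = []
    for line in todo_text.splitlines():
        parts = line.split(maxsplit=2)
        is_pick = len(parts) >= 2 and parts[0] in ("pick", "p")
        if is_pick and any(parts[1].startswith(p) for p in wanted):
            parts[0] = "reword"
            line = " ".join(parts)
        result.append(line)
    return "\n".join(result) + "\n"


# As a sequence editor, the script receives the todo file path as its argument:
#   import sys
#   path = sys.argv[1]
#   with open(path) as f:
#       text = f.read()
#   with open(path, "w") as f:
#       f.write(mark_reword(text, ["5d6e7f8"]))
```

You would still type each new message by hand when the rebase stops, so this only saves the todo-editing step.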
A great and simple way to do this is to use git filter-branch --msg-filter with a Python script.
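The Python script would look something like this:
import os
import re
import sys

# Matches e.g. "Issue-123" or "issue-42" (case-insensitive).
pattern = re.compile(r"(?i)Issue-\d{1,4}")
# filter-branch exports the ID of the commit being rewritten;
# it is read here in case you need per-commit logic.
commit_id = os.environ["GIT_COMMIT"]
message = sys.stdin.read()
if message and pattern.search(message):
    message = pattern.sub("Issue", message)
# Write the message back without adding an extra newline.
sys.stdout.write(message)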
The command line call you would make is git filter-branch -f --msg-filter "python /path/to/git-script.py"
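The script above reads GIT_COMMIT but doesn't use it. One thing it could be used for is what the question asks for: keeping all the corrected messages in a single file and looking them up by commit ID. A rough sketch, assuming a hand-written JSON file that maps full original commit hashes to new messages; the file name and the function name choose_message are invented for the example.

```python
import json


def choose_message(commit_id: str, original: str, overrides_json: str) -> str:
    """Return the replacement listed for this commit, or the original message."""
    overrides = json.loads(overrides_json)
    return overrides.get(commit_id, original)


# Wiring for use as a --msg-filter script (the path is a placeholder):
#   import os, sys
#   with open("/path/to/messages.json", encoding="utf-8") as f:
#       table = f.read()
#   sys.stdout.write(
#       choose_message(os.environ["GIT_COMMIT"], sys.stdin.read(), table)
#   )
```

The keys have to be the full hashes of the original commits, since that is what filter-branch puts in GIT_COMMIT. Commits that aren't listed keep their message unchanged.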