How to remove sensitive data from a file in GitHub history
I am using a shared GitHub repository to collaborate on a project. Because I am an idiot, I committed and pushed a script file containing a password which I don't want to share (yes, I can change the password, but I would like to remove it anyway!).
Is there any way to revert the commits from GitHub's history, remove the password locally, and then recommit and push the updated files? I do not want to remove the file completely, and I would rather not lose the commit history on GitHub.
(The question "How can I completely remove a file from a git repository?" shows how to remove a sensitive file, but not how to edit sensitive data out of a file, so this is not a duplicate.)
Solution 1:
I would recommend using the new git filter-repo, which replaces BFG and git filter-branch.
Note: if you get the following error message when running the commands below:
Error: need a version of `git` whose `diff-tree` command has the `--combined-all-paths` option
it means you have to update git.
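To check for this before rewriting anything, a small version comparison can help. This is only a sketch: the helper name version_at_least is made up for illustration, and the 2.22.0 threshold is my assumption about when the --combined-all-paths option appeared, so verify it against the git filter-repo documentation. It also assumes a sort that supports -V.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Relies on `sort -V` (version sort), which GNU and recent BSD sort provide.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum; adjust if the git filter-repo documentation says otherwise.
required="2.22.0"

if command -v git >/dev/null 2>&1; then
  # `git --version` prints e.g. "git version 2.39.2"; the third field is the number.
  current="$(git --version | awk '{print $3}')"
  if version_at_least "$current" "$required"; then
    echo "git $current should be new enough for git filter-repo"
  else
    echo "git $current is older than $required; update git first"
  fi
else
  echo "git is not installed"
fi
```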
First: do that on a copy of your local repo (a new clone).
See "Content based filtering" in the git filter-repo documentation, quoted further down.
At the end, you can (if you are the only one working on that repository) do a git push --force.
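Put together, the whole procedure might look like the sketch below. The function name scrub_and_push and its two arguments are hypothetical, and the expressions file it expects is the one described in the documentation excerpt that follows. Note that git filter-repo removes the origin remote as a safety measure, so it has to be added back before pushing.

```shell
# Hypothetical helper: scrub_and_push <repo-url> <absolute-path-to-expressions-file>
scrub_and_push() {
  repo_url="$1"
  expressions="$2"   # use an absolute path, because the commands run inside the clone
  work_dir="$(mktemp -d)"

  # Work on a fresh clone, never on your day-to-day checkout.
  git clone "$repo_url" "$work_dir/repo" || return 1

  (
    cd "$work_dir/repo" || exit 1
    # Rewrite every commit, applying the replacement rules to file contents.
    git filter-repo --replace-text "$expressions" || exit 1
    # filter-repo dropped the origin remote, so add it back.
    git remote add origin "$repo_url" || exit 1
    # Overwrite the published history; only safe if nobody else depends on it.
    git push --force --all origin || exit 1
    git push --force --tags origin
  )
}
```

Defining the function does nothing by itself; it would be called as, for example, scrub_and_push with your own repository URL and the path to your expressions file. Anyone else who has cloned the repository will need a fresh clone afterwards.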
If you want to modify file contents, you can do so based on a list of expressions in a file, one per line.
For example, with a file named expressions.txt containing:
p455w0rd
foo==>bar
glob:*666*==>
regex:\bdriver\b==>pilot
literal:MM/DD/YYYY==>YYYY-MM-DD
regex:([0-9]{2})/([0-9]{2})/([0-9]{4})==>\3-\1-\2
then running
git filter-repo --replace-text expressions.txt
will go through and replace:
- p455w0rd with ***REMOVED***,
- foo with bar,
- any line containing 666 with a blank line,
- the word driver with pilot (but not if it has letters before or after; e.g. drivers will be unmodified),
- the exact text MM/DD/YYYY with YYYY-MM-DD, and
- date strings of the form MM/DD/YYYY with ones of the form YYYY-MM-DD.
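If you want to see the effect before touching a real repository, you can try it in a throwaway one. The sketch below assumes git and git filter-repo are installed; the file name deploy.sh and the demo identity are invented for illustration. It uses only the first rule from the example (a bare string with no ==> part).

```shell
# Throwaway repository so that nothing real is touched.
demo_dir="$(mktemp -d)"
expressions="$demo_dir/expressions.txt"

# One rule per line; a bare string is replaced with ***REMOVED***.
printf 'p455w0rd\n' > "$expressions"

git init -q "$demo_dir/repo"
cd "$demo_dir/repo"
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit a script that leaks the password.
printf 'db_password="p455w0rd"\n' > deploy.sh
git add deploy.sh
git commit -q -m "Add deploy script"

if git filter-repo --version >/dev/null 2>&1; then
  # --force is needed only because this demo repo is not a fresh clone.
  git filter-repo --force --replace-text "$expressions"
  # Print the file as it is now stored in the rewritten history.
  git show HEAD:deploy.sh
else
  echo "git filter-repo is not installed; skipping the rewrite"
fi
```

On a real project, drop --force and run the command in a fresh clone instead, as recommended above.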