git: change styling (whitespace) without changing ownership/blame?

We have a massive, ancient codebase that needs a lot of cleanup. We have always had coding standards and everyone has always tried to follow them, but they were not enforced, so over time a lot of violations have crept in. Many of them are just whitespace issues, like using tabs instead of spaces, or spaces where there shouldn't be any, or missing spaces where there should be. We are going to start actively enforcing our coding standards to make sure more violations don't creep in, but it's difficult to enforce them in an automated way on only the changes, so it would be nice to clean up these old files.

There are tools that can automate fixing these issues; however, if I do that, blame is going to show me as the owner of those lines, when in reality I may never have even seen them. I know there is a setting to make blame ignore whitespace changes, but I can't make everyone use blame the same way, and that includes other visual tools and things like gitstats. In an ideal world there would be some way to rewrite history to look like the violations were never introduced, without covering up who introduced the actual code, but I can't find anything like that.


If you are trying to track down a root cause using blame, don't forget to use the -w flag to ignore whitespace and indentation changes. That way you get the last real change to the code, instead of a commit that only re-indented it or removed trailing spaces.

git blame -w app/to/file.rb

Or you can define a git slap alias and use that instead:

git config alias.slap "blame -w";
git slap app/path/to/file.rb

which gives the same results :D
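
To see what -w actually buys you, here is a small throwaway demo. It is only a sketch: the repository, the file name, and the authors Alice and Bob are made up for illustration. Alice writes a line, then Bob changes nothing but its indentation.

```shell
#!/bin/sh
# Throwaway demo repo; every name in here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name Demo
git config user.email demo@example.com

# Alice commits the original line, indented with a tab.
printf '\tputs "hello"\n' > file.rb
git add file.rb
git commit -q --author='Alice <alice@example.com>' -m 'Add greeting'

# Bob changes only the indentation (tab -> two spaces).
printf '  puts "hello"\n' > file.rb
git commit -q --author='Bob <bob@example.com>' -am 'Fix indentation'

# Pull the author name out of blame's machine-readable output.
plain=$(git blame --line-porcelain file.rb | sed -n 's/^author //p')
ignoring_ws=$(git blame -w --line-porcelain file.rb | sed -n 's/^author //p')
echo "blame:    $plain"
echo "blame -w: $ignoring_ws"
```

Plain blame should credit Bob for the line, while blame -w should look through the whitespace-only commit and credit Alice.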


In an ideal world there would be some way to rewrite history to look like the violations were never introduced

git filter-branch does precisely that.

http://git-scm.com/docs/git-filter-branch

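This has the same issue as every history-rewriting command: each rewritten commit gets a new hash, so it essentially invalidates all cloned repositories, which then have to be re-cloned or rebased onto the new history.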
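
As a rough sketch of what that could look like (not a recipe for a real repository): the demo below builds a throwaway repo and re-runs every commit through a stand-in "formatter" using --tree-filter. The sed one-liner that strips trailing whitespace, the file name, and the authors are all assumptions for illustration; you would substitute your own tool. FILTER_BRANCH_SQUELCH_WARNING is set only to skip the warning that recent Git versions print before filter-branch runs.

```shell
#!/bin/sh
# Throwaway demo repo; every name in here is hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name Demo
git config user.email demo@example.com

# Alice's commit introduces a trailing-whitespace violation.
printf 'puts "hello"   \n' > file.rb
git add file.rb
git commit -q --author='Alice <alice@example.com>' -m 'Add greeting'

# Bob's later commit adds a clean line.
printf 'puts "bye"\n' >> file.rb
git commit -q --author='Bob <bob@example.com>' -am 'Add farewell'

# Rewrite every commit: strip trailing whitespace from each .rb file.
# The sed call is a stand-in for whatever formatter you actually use.
git filter-branch -f --tree-filter '
  for f in *.rb; do
    sed "s/[[:space:]]*\$//" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
' -- --all >/dev/null 2>&1

# After the rewrite: who owns each line, and is any trailing space left?
authors=$(git blame --line-porcelain file.rb | sed -n 's/^author //p' | paste -sd, -)
trailing=$(git show HEAD:file.rb | grep -c ' $' || true)
echo "authors: $authors"
echo "lines with trailing space: $trailing"
```

The rewritten commits keep their original authors, so blame still points at Alice and Bob for their own lines, but the violation no longer appears anywhere in history. Note that --tree-filter checks out every commit, so on a massive repository this is slow, and moving a temp file over the original as done here does not preserve things like the executable bit.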


Building on Mario's answer, I would suggest git shame as a global git-alias:

git config --global alias.shame 'blame -w -M'

...and use it instead of git-blame:

git shame path/to/file

To explain:

  • -w Ignores whitespace changes, so as not to blame someone who only re-indented the code
  • -M Detects lines that were moved or copied within the file, and blames the original author

EDIT:

Some argue that the -M is misleading, blaming the wrong person
(i.e.: don't blame me if someone rearranged what I wrote).
If you feel the same, please use the original suggestion: git slap
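
If you want to judge the -M behaviour for yourself, a small experiment along these lines may help. It is a sketch under assumptions: the repo, file, method names, and the authors Alice and Bob are invented, and the moved block is deliberately long enough to clear -M's default threshold of 20 alphanumeric characters. Alice writes two methods, then Bob only swaps their order.

```shell
#!/bin/sh
# Throwaway demo repo; every name in here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name Demo
git config user.email demo@example.com

# Alice writes two small methods.
cat > file.rb <<'EOF'
def greet_the_user_by_name(name)
  puts "hello there, #{name}, welcome back"
end
def say_goodbye_to_the_user(name)
  puts "goodbye for now, #{name}, see you soon"
end
EOF
git add file.rb
git commit -q --author='Alice <alice@example.com>' -m 'Add greeting methods'

# Bob only swaps the order of the two methods.
cat > file.rb <<'EOF'
def say_goodbye_to_the_user(name)
  puts "goodbye for now, #{name}, see you soon"
end
def greet_the_user_by_name(name)
  puts "hello there, #{name}, welcome back"
end
EOF
git commit -q --author='Bob <bob@example.com>' -am 'Reorder methods'

# Distinct authors reported with and without move detection.
without_m=$(git blame -w --line-porcelain file.rb | sed -n 's/^author //p' | sort -u | paste -sd, -)
with_m=$(git blame -w -M --line-porcelain file.rb | sed -n 's/^author //p' | sort -u | paste -sd, -)
echo "blame -w:    $without_m"
echo "blame -w -M: $with_m"
```

Without -M, the block that the diff treats as newly added is attributed to Bob; with -M, blame should trace it back to Alice. Whether that is the answer you want is exactly the disagreement described in the EDIT above.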


I submitted a pull request to the TextMate git Bundle to set this "-w" parameter by default for the "Browse Annotated File (Blame)" command. Thanks Mario Zaizar, you made my day.

diff --git a/Support/lib/git.rb b/Support/lib/git.rb
index 5e8de13..5192953 100644
--- a/Support/lib/git.rb
+++ b/Support/lib/git.rb
@@ -307,6 +307,9 @@ module SCM
       file = make_local_path(file_path)
       args = [file]
       args << revision unless revision.nil? || revision.empty?
+      # Ignore whitespace when comparing the parent's version and
+      # the child's to find where the lines came from.
+      args << '-w'
       output = command("annotate", *args)
       if output.match(/^fatal:/)
         puts output