Can I add metadata to git commits? Or can I hide some tags in gitk

I want to associate custom metadata with a git commit, specifically a review ID from a code review, although it could be anything. Tags seem a natural way to do that, but I expect to have a review for every commit and I don't want to clutter gitk with tons of tags. Is there some other mechanism to add custom metadata? Can I make certain tags invisible? If I could tell gitk not to display tags matching some pattern or RE, that would likely work, but I don't see a way to do that.


Git-notes

With git notes you can add a “note” to a commit. You can also add them to other Git objects, but let’s just focus on commits since that is what the question is about.

A note is a Git object, and can in principle be “whatever” (arbitrary data). But we’ll focus on something simple and textual for our purposes.

Example: review id

The question mentions review ids, so let’s make up some way to represent such a thing. I don’t know what review ids really look like, but hopefully the following would be sensible:

Review-id: 42

So this is effectively a key-value pair. Let’s add the above string to the current commit:

git notes add -m "Review-id: 42"

If you run git log the note will be shown inline:(†1)

Author: Victor Version Control <[email protected]>
Date:   Tue Nov 8 21:10:25 2016 +0100

    Implement feature x

Notes:
    Review-id: 42
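
To read the note back without going through git log, git notes show prints it for a given commit. The following is a minimal sketch, assuming a throwaway repository created with mktemp; the identity and the sed extraction of the value are illustrative choices, not something the question prescribes:

```shell
#!/bin/sh
set -eu

# Scratch repository so the example does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity
git commit -q --allow-empty -m "Implement feature x"

# Attach the note to HEAD (the default object for `git notes add`).
git notes add -m "Review-id: 42"

# Read the note back directly.
note=$(git notes show HEAD)
echo "$note"

# Extract just the value of the Review-id key.
review_id=$(printf '%s\n' "$note" | sed -n 's/^Review-id: //p')
echo "$review_id"
```

Keeping one key: value pair per line is what makes the last step a one-line sed call.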

Another example

Of course you can add more “subnotes” to this note (we will stick with the simple key: value syntax, one value per line). For example, if you found out three months later that the commit message got something wrong, just append the correction to the note:

git notes append -m "Errata: It was actually feature y."

git log:

Author: Victor Version Control <[email protected]>
Date:   Tue Nov 8 21:10:25 2016 +0100

    Implement feature x

Notes:
    Review-id: 42

    Errata: It was actually feature y.

We use git notes append in order to easily add this extra data to the note. You could also use git notes edit in order to edit the file directly.

Of course, since a Git note is just a single mutable file, you can run into merge conflicts. To make that less likely, you can:

  1. Stick to simple data like the above (one key-value per line).
  2. Use special merge strategies; see man git-notes, section “Notes merge strategies”.
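
As a rough illustration of the second point, the sketch below merges two notes refs that annotate the same commit, using the cat_sort_uniq strategy from man git-notes. The ref name theirs is made up for the example and stands in for notes written by someone else:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity
git commit -q --allow-empty -m "Implement feature x"

# Two independent notes refs annotate the same commit.
# "theirs" is a hypothetical ref name (refs/notes/theirs).
git notes add -m "Review-id: 42"
git notes --ref=theirs add -m "Tested-by: ci"

# Merge refs/notes/theirs into refs/notes/commits. cat_sort_uniq
# concatenates both notes, sorts the lines, and drops duplicates,
# so no manual conflict resolution is needed.
git notes merge -s cat_sort_uniq theirs

merged=$(git notes show HEAD)
echo "$merged"
```

This strategy only makes sense when the order of lines in a note carries no meaning, which is the case for one key-value pair per line.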

Visibility

The OP asked:

> Can I make certain tags invisible?

By default, git log only shows notes from one notes ref, namely refs/notes/commits (stored as .git/refs/notes/commits). commits is just one ref in the refs/notes/ namespace. Maybe you want issues to be in their own ref:

git notes --ref=issues add -m "Fixes: #32"

Since this is stored in .git/refs/notes/issues and not in .git/refs/notes/commits, “Fixes: #32” won’t show up when you run git log. So you have effectively made such notes invisible by default.

If you want it to be shown, pass --notes=issues to git log:

$ git log --notes=issues
Author: Victor Version Control <[email protected]>
Date:   Tue Nov 8 21:10:25 2016 +0100

    Implement feature x

Notes (issues):
    Fixes: #32

But now the notes in .git/refs/notes/commits are hidden. They can easily be included as well:

$ git log --notes=issues --notes=commits
Author: Victor Version Control <[email protected]>
Date:   Tue Nov 8 21:10:25 2016 +0100

    Implement feature x

Notes (issues):
    Fixes: #32

Notes:
    Review-id: 42

    Errata: It was actually feature y.

The notes.displayRef variable configures which notes refs are shown by default, in addition to the default ref; see man git-config.
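
A sketch of that configuration, assuming you want every ref under refs/notes/ displayed: set notes.displayRef to a glob. The setting is made repository-local here so the example does not alter your global config:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity
git commit -q --allow-empty -m "Implement feature x"

git notes add -m "Review-id: 42"
git notes --ref=issues add -m "Fixes: #32"

# Without configuration only refs/notes/commits is displayed.
default_log=$(git log -1)

# Display every ref under refs/notes/ from now on (this repository only).
git config notes.displayRef 'refs/notes/*'
all_log=$(git log -1)
echo "$all_log"
```

A narrower value such as refs/notes/issues would show only that ref alongside the default one.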

Benefits compared to commit messages

Metadata can of course be recorded in the commit message directly.(†2) But commit messages are immutable, so to change them really means to make a whole new commit, with all the rippling consequences that that entails. Git-notes on the other hand are mutable, so you are always able to revise them. And each modification of a note is of course version controlled. In our case, for .git/refs/notes/commits:

$ git log refs/notes/commits
commit 9f0697c6bbbc6a97ecce9834d4c9afa0d668bcad
Author: Victor Version Control <[email protected]>
Date:   Tue Nov 8 21:13:52 2016 +0100

    Notes added by 'git notes append'

commit b60997e49444732ed2defc8a6ca88e9e21001a1d
Author: Victor Version Control <[email protected]>
Date:   Tue Nov 8 21:10:38 2016 +0100

    Notes added by 'git notes add'
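
The same history can be checked from a script. This sketch, run in a scratch repository, counts the commits on the notes ref after one add and one append:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity
git commit -q --allow-empty -m "Implement feature x"

git notes add -m "Review-id: 42"
git notes append -m "Errata: It was actually feature y."

# Every change to a note is recorded as a commit on the notes ref.
revisions=$(git rev-list --count refs/notes/commits)
echo "$revisions"

# Subjects of those commits, newest first.
subjects=$(git log --format=%s refs/notes/commits)
echo "$subjects"
```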

Sharing notes

Your notes aren’t shared by default; you have to do so explicitly. And compared to other refs, sharing notes isn’t very user-friendly. We have to use the refspec syntax:

git push origin 'refs/notes/*'

The above will push all of your notes to your remote.

It seems that fetching notes is a bit more involved; you can do it if you specify both sides of the refspec:

git fetch origin 'refs/notes/*:refs/notes/*'

So that’s definitely not convenient. If you intend to use Git-notes regularly, you’ll probably want to set up your gitconfig to always fetch notes:

[remote "origin"]
    …
    fetch = +refs/notes/*:refs/notes/*
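
To see the whole round trip in one place, here is a sketch that uses a local bare repository as a stand-in for the remote and two clones as two users. The directory names origin.git, alice and bob are invented for the example:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git        # stand-in for the shared remote

# First clone: create a commit, annotate it, publish both.
git clone -q origin.git alice 2>/dev/null
cd alice
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity
git commit -q --allow-empty -m "Implement feature x"
commit=$(git rev-parse HEAD)
git push -q origin HEAD
git notes add -m "Review-id: 42"
git push -q origin 'refs/notes/*'    # notes are pushed only on request
cd ..

# Second clone: branches arrive, notes do not.
git clone -q origin.git bob 2>/dev/null
cd bob
before=$(git for-each-ref refs/notes)

# Fetch the notes explicitly, naming both sides of the refspec.
git fetch -q origin 'refs/notes/*:refs/notes/*'
fetched=$(git notes show "$commit")
echo "$fetched"
```

The refspecs are quoted so the shell does not try to expand the * itself.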

Carry over notes on rewrites

Git has the inconvenient default that notes are not carried over when a commit is rewritten. So if you for example rebase a series of commits, the notes will not carry over to the new commits.

The variable notes.rewrite.<command> is set to true by default, so one might assume that notes are carried over. The problem is that the variable notes.rewriteRef, which determines which notes will be carried over, has no default value. To set this value to match all notes, execute the following:

git config --global notes.rewriteRef "refs/notes/*"

Now all notes will be carried over when doing rewrite operations like git rebase.
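
A quick way to check this is with git commit --amend, which is also a rewrite operation. The sketch below sets notes.rewriteRef for a scratch repository only (rather than with --global) and confirms that the note follows the commit to its new hash; the file name is arbitrary:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity

echo "x" > feature.txt                         # arbitrary file for the example
git add feature.txt
git commit -q -m "Implement feature x"
git notes add -m "Review-id: 42"

# Repository-local equivalent of the --global setting.
git config notes.rewriteRef 'refs/notes/*'

old=$(git rev-parse HEAD)
git commit -q --amend -m "Implement feature y"  # rewrites the commit
new=$(git rev-parse HEAD)

# The note was copied from the old commit to the rewritten one.
carried=$(git notes show "$new")
echo "$carried"
```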

Carry over notes through email patches

If you are using git format-patch to format your changes to be sent as emails, and you have some metadata stored as Git notes, you can pass the --notes option to git format-patch in order to append the notes to the email draft.
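
For example, the following sketch writes a patch for the latest commit to standard output and prints the part between the --- separator and the next blank line, which is where the notes end up. The repository and file name are made up for the example:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.invalid"   # placeholder identity

echo "x" > feature.txt                         # arbitrary file for the example
git add feature.txt
git commit -q -m "Implement feature x"
git notes add -m "Review-id: 42"

# --stdout prints the patch instead of writing a .patch file.
patch=$(git format-patch -1 --notes --stdout)

# Show the section that follows the "---" line; the notes appear there.
printf '%s\n' "$patch" | sed -n '/^---$/,/^$/p'
```

Because the notes sit after the --- line, they are not part of the commit message when the patch is applied with git am.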


† 1: “This is the default for git log […] when there is no --pretty, --format, or --oneline option given on the command line.” ― man git-log, git version 2.10.2

† 2: One practice/convention for metadata-in-commit-messages that is used in projects like e.g. Git and the Linux kernel is to add key–value pairs in the “trailer” of the commit message, i.e. at the bottom. See for example this commit by Linus Torvalds:

 mm: remove gup_flags FOLL_WRITE games from __get_user_pages()

 This is an ancient bug that was actually attempted to be fixed once
 (badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
 get_user_pages() race for write access") but that was then undone due to
 problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").

 In the meantime, the s390 situation has long been fixed, and we can now
 fix it by checking the pte_dirty() bit properly (and do it better).  The
 s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
 software dirty bits") which made it into v3.9.  Earlier kernels will
 have to look at the page state itself.

 Also, the VM has become more scalable, and what used a purely
 theoretical race back then has become easier to trigger.

 To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
 we already did a COW" rather than play racy games with FOLL_WRITE that
 is very fundamental, and then use the pte dirty flag to validate that
 the FOLL_COW flag is still valid.

 Reported-and-tested-by: Phil "not Paul" Oester <[email protected]>
 Acked-by: Hugh Dickins <[email protected]>
 Reviewed-by: Michal Hocko <[email protected]>
 Cc: Andy Lutomirski <[email protected]>
 Cc: Kees Cook <[email protected]>
 Cc: Oleg Nesterov <[email protected]>
 Cc: Willy Tarreau <[email protected]>
 Cc: Nick Piggin <[email protected]>
 Cc: Greg Thelen <[email protected]>
 Cc: [email protected]
 Signed-off-by: Linus Torvalds <[email protected]>

See:

  • man git-interpret-trailers
  • This Kernel Wiki page which lists various trailer lines (usually key–value pairs) that are used in various projects.
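
As a small sketch of the trailer approach, git interpret-trailers can append a trailer to a commit message and parse trailers back out. The Review-id key is reused from the answer above and is not a standard trailer:

```shell
#!/bin/sh
set -eu

# Append a trailer to a draft commit message read from standard input.
msg=$(printf 'Implement feature x\n\nLonger explanation of the change.\n' |
      git interpret-trailers --trailer 'Review-id: 42')
printf '%s\n' "$msg"

# Parse the trailers back out of the finished message.
trailers=$(printf '%s\n' "$msg" | git interpret-trailers --parse)
echo "$trailers"
```

Unlike a note, a trailer becomes part of the commit itself, so changing it later means rewriting the commit.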

That's precisely what git notes are for.