How do we verify commit messages for a push?

Solution 1:

Using the update hook

You know about hooks - please, read the documentation about them! The hook you probably want is update, which is run once per ref. (The pre-receive hook is run once for the entire push) There are tons and tons of questions and answers about these hooks already on SO; depending on what you want to do, you can probably find guidance about how to write the hook if you need it.

To emphasize that this really is possible, a quote from the docs:

This hook can be used to prevent forced update on certain refs by making sure that the object name is a commit object that is a descendant of the commit object named by the old object name. That is, to enforce a "fast-forward only" policy.

It could also be used to log the old..new status.

And the specifics:

The hook executes once for each ref to be updated, and takes three parameters:

  • the name of the ref being updated,
  • the old object name stored in the ref,
  • and the new objectname to be stored in the ref.

So, for example, if you want to make sure that none of the commit subjects are longer than 80 characters, a very rudimentary implementation would be:

#!/bin/bash
long_subject=$(git log --pretty=%s $2..$3 | egrep -m 1 '.{81}')
if [ -n "$long_subject" ]; then
    echo "error: commit subject over 80 characters:"
    echo "    $long_subject"
    exit 1
fi

Of course, that's a toy example; in the general case, you'd use a log output containing the full commit message, split it up per-commit, and call your verification code on each individual commit message.

Why you want the update hook

This has been discussed/clarified in the comments; here's a summary.

The update hook runs once per ref. A ref is a pointer to an object; in this case, we're talking about branches and tags, and generally just branches (people don't push tags often, since they're usually just for marking versions).

Now, if a user is pushing updates to two branches, master and experimental:

o - o - o (origin/master) - o - X - o - o (master)
 \
  o - o (origin/experimental) - o - o (experimental)

Suppose that X is the "bad" commit, i.e. the one which would fail the commit-msg hook. Clearly we don't want to accept the push to master. So, the update hook rejects that. But there's nothing wrong with the commits on experimental! The update hook accepts that one. Therefore, origin/master stays unchanged, but origin/experimental gets updated:

o - o - o (origin/master) - o - X - o - o (master)
 \
  o - o - o - o (origin/experimental, experimental)

The pre-receive hook runs only once, just before beginning to update refs (before the first time the update hook is run). If you used it, you'd have to cause the whole push to fail, thus saying that because there was a bad commit message on master, you somehow no longer trust that the commits on experimental are good even though their messages are fine!

Solution 2:

You could do it with the following pre-receive hook. As the other answers have noted, this is a conservative, all-or-nothing approach. Note that it protects only the master branch and places no constraints on commit messages on topic branches.

#! /usr/bin/perl

my $errors = 0;
while (<>) {
  chomp;
  next unless my($old,$new) =
    m[ ^ ([0-9a-f]+) \s+   # old SHA-1
         ([0-9a-f]+) \s+   # new SHA-1
         refs/heads/master # ref
       \s* $ ]x;

  chomp(my @commits = `git rev-list $old..$new`);
  if ($?) {
    warn "git rev-list $old..$new failed\n";
    ++$errors, next;
  }

  foreach my $sha1 (@commits) {
    my $msg = `git cat-file commit $sha1`;
    if ($?) {
      warn "git cat-file commit $sha1 failed";
      ++$errors, next;
    }

    $msg =~ s/\A.+? ^$ \s+//smx;
    unless ($msg =~ /\[\d+\]/) {
      warn "No bug number in $sha1:\n\n" . $msg . "\n";
      ++$errors, next;
    }
  }
}

exit $errors == 0 ? 0 : 1;

It requires all commits in a push to have a bug number somewhere in their respective commit messages, not just the tip. For example:

$ git log --pretty=oneline origin/master..HEAD
354d783efd7b99ad8666db45d33e30930e4c8bb7 second [123]
aeb73d00456fc73f5e33129fb0dcb16718536489 no bug number

$ git push origin master
Counting objects: 6, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (5/5), 489 bytes, done.
Total 5 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (5/5), done.
No bug number in aeb73d00456fc73f5e33129fb0dcb16718536489:

no bug number

To file:///tmp/bare.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'file:///tmp/bare.git'

Say we fix the problem by squashing the two commits together and pushing the result:

$ git rebase -i origin/master
[...]

$ git log --pretty=oneline origin/master..HEAD
74980036dbac95c97f5c6bfd64a1faa4c01dd754 second [123]

$ git push origin master
Counting objects: 4, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 279 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
To file:///tmp/bare.git
   8388e88..7498003  master -> master

Solution 3:

This is a python version of pre-receive, which took me a while to finish, hope it could help others. I mainly use it with Trac, but it could be easily modified for other purposes.

I have also put down the instructions to modify back the historical commit message, which is a little more complicated than I thought.

#!/usr/bin/env python
import subprocess

import sys 
import re

def main():
    input  = sys.stdin.read()
    oldrev, newrev, refname = input.split(" ")
    separator = "----****----"


    proc = subprocess.Popen(["git", "log", "--format=%H%n%ci%n%s%b%n" + separator, oldrev + ".." +  newrev], stdout=subprocess.PIPE)
    message = proc.stdout.read()
    commit_list = message.strip().split(separator)[:-1] #discard the last line

    is_valid = True

    print "Parsing message:"
    print message

    for commit in commit_list:
        line_list = commit.strip().split("\n")
        hash = line_list[0]
        date = line_list[1]
        content = " ".join(line_list[2:])
        if not re.findall("refs *#[0-9]+", content): #check for keyword
            is_valid = False

    if not is_valid:
        print "Please hook a trac ticket when commiting the source code!!!" 
        print "Use this command to change commit message (one commit at a time): "
        print "1. run: git rebase --interactive " + oldrev + "^" 
        print "2. In the default editor, modify 'pick' to 'edit' in the line whose commit you want to modify"
        print "3. run: git commit --amend"
        print "4. modify the commit message"
        print "5. run: git rebase --continue"
        print "6. remember to add the ticket number next time!"
        print "reference: http://stackoverflow.com/questions/1186535/how-to-modify-a-specified-commit"

        sys.exit(1)

main()

Solution 4:

You need made a script on your pre-receive.

In this script you receive the old and new revision. You can check all commit and return false if one of this is bad.

Solution 5:

You didn't mention what is your bug tracker, but if it is JIRA, then the add-on named Commit Policy can do this for without any programming.

You can set up a commit condition which requires the commit message to match a regular expression. If it doesn't, the push is rejected, and the developer must amend (fix) the commit message, then push again.