Git keyword substitution like those in Subversion?

I used to work under Subversion/SVN and was instantly using nice feature called keyword substitution. Just putting in source files smth like:

/*
 *   $Author: ivanovpv $
 *   $Rev: 42 $
 *   $LastChangedDate: 2012-05-25 21:47:42 +0200 (Fri, 25 May 2012) $
 */

And each time Subversion was substituting keywords (Author, Rev, LastChangedDate) with actual ones.

Some time ago I was forced to move to Git and just wondering is there's something similar to Subversion's keyword substitution in Git?


Git doesn't ship with this functionality out of the box. However, there is a chapter in the Git Book on Customizing Git and one of the examples is how to use git attributes to implement a similar result.

It turns out that you can write your own filters for doing substitutions in files on commit/checkout. These are called “clean” and “smudge” filters. In the .gitattributes file, you can set a filter for particular paths and then set up scripts that will process files just before they’re checked out (“smudge”) and just before they’re staged (“clean”). These filters can be set to do all sorts of fun things.

There is even an example for $LastChangedDate: $:

Another interesting example gets $Date$ keyword expansion, RCS style. To do this properly, you need a small script that takes a filename, figures out the last commit date for this project, and inserts the date into the file. Here is a small Ruby script that does that:

#! /usr/bin/env ruby
data = STDIN.read
last_date = `git log --pretty=format:"%ad" -1`
puts data.gsub('$Date$', '$Date: ' + last_date.to_s + '$')

All the script does is get the latest commit date from the git log command, stick that into any $Date$ strings it sees in stdin, and print the results – it should be simple to do in whatever language you’re most comfortable in. You can name this file expand_date and put it in your path. Now, you need to set up a filter in Git (call it dater) and tell it to use your expand_date filter to smudge the files on checkout. You’ll use a Perl expression to clean that up on commit:

$ git config filter.dater.smudge expand_date
$ git config filter.dater.clean 'perl -pe "s/\\\$Date[^\\\$]*\\\$/\\\$Date\\\$/"'

This Perl snippet strips out anything it sees in a $Date$ string, to get back to where you started. Now that your filter is ready, you can test it by setting up a Git attribute for that file that engages the new filter and creating a file with your $Date$ keyword:

date*.txt filter=dater
$ echo '# $Date$' > date_test.txt If you commit

those changes and check out the file again, you see the keyword properly substituted:

$ git add date_test.txt .gitattributes
$ git commit -m "Testing date expansion in Git"
$ rm date_test.txt
$ git checkout date_test.txt
$ cat date_test.txt
# $Date: Tue Apr 21 07:26:52 2009 -0700$

You can see how powerful this technique can be for customized applications. You have to be careful, though, because the .gitattributes file is committed and passed around with the project, but the driver (in this case, dater) isn’t, so it won’t work everywhere. When you design these filters, they should be able to fail gracefully and have the project still work properly.


Solution

Well, you could easily implement such a feature yourself.

Basically I embedded the commit command into a shell script. This script will first substitute the desired macros and then commit the changes. The project consists of two files:

Content?

keysub, a bash shell script and keysub.awk an awk script to replace keywords in a specific file. A third file is a config file which contains the values that should be substituted (besides variable stuff like commit count and timestamp).

How do you use it?

You call keysub instead of commit with the same options. The -m or -a option should come before any other commit option. A new option (that should always come first) is -f which takes a config file as a value. Example:

$ git add 'someJavaFile.java'
$ keysub -m 'fixed concurrent thread issue'
$ git push

or

$ git -f .myfile.cnf -m 'enhanced javadoc entries'

keysub

#!/bin/bash

# 0 -- functions/methods
#########################
# <Function description>
function get_timestamp () {
  date    # change this to get a custom timestamp
}

# 1 -- Variable declarations
#############################
# input file for mapping
file=".keysub.cnf"
timestamp=$(get_timestamp)


# 2 -- Argument parsing and flag checks
########################################

# Parsing flag-list
while getopts ":f:m:a" opt;
do
  case $opt in
    f) file=${OPTARG}
       ;;
    a) echo 'Warning, keyword substitution will be incomplete when invoked'
       echo 'with the -a flag. The commit message will not be substituted into'
       echo 'source files. Use -m "message" for full substitutions.'
       echo -e 'Would you like to continue [y/n]? \c'
       read answer
       [[ ${answer} =~ [Yy] ]] || exit 3
       unset answer
       type="commit_a"
       break
       ;;
    m) type="commit_m"
       commitmsg=${OPTARG}
       break
       ;;
   \?) break
       ;;
  esac
done
shift $(($OPTIND - 1))

# check file for typing
if [[ ! -f ${file} ]]
then
  echo 'No valid config file found.'
  exit 1
fi

# check if commit type was supplied
if [[ -z ${type} ]]
then
  echo 'No commit parameters/flags supplied...'
  exit 2
fi

# 3 -- write config file
#########################
sed "
  /timestamp:/ {
    s/\(timestamp:\).*/\1${timestamp}/
  }
  /commitmsg:/ {
    s/\(commitmsg:\).*/\1${commitmsg:-default commit message}/
  }
" ${file} > tmp

mv tmp ${file}

# 4 -- get remaining tags
##########################
author=$(grep 'author' ${file} | cut -f1 -d':' --complement)


# 5 -- get files ready to commit
#################################
git status -s | grep '^[MARCU]' | cut -c1-3 --complement > tmplist

# 6 -- invoke awk and perform substitution
###########################################
# beware to change path to your location of the awk script
for item in $(cat tmplist)
do
  echo ${item}
  awk -v "commitmsg=${commitmsg}" -v "author=${author}" \
      -v "timestamp=${timestamp}" -f "${HOME}/lib/awk/keysub.awk" ${item} \
      > tmpfile
  mv tmpfile ${item}
done
rm tmplist

# 5 -- invoke git commit
#########################
case ${type} in
  "commit_m") git commit -m "${commitmsg}" "$@"
              ;;
  "commit_a") git commit -a "$@"
              ;;
esac

# exit using success code
exit 0

keysub.awk

# 0 BEGIN
##########
BEGIN {
  FS=":"
  OFS=": "
}

# 1 parse source files 
########################
# update author
$0 ~ /.*\$Author.*\$.*/ {
  $2=author " $"
}

# update timestamp
$0 ~ /.*\$LastChangedDate.*\$.*/ {
  $0=$1
  $2=timestamp " $"
}

# update commit message
$0 ~ /.*\$LastChangeMessage.*\$.*/ {
  $2=commitmsg " $"
}

# update commit counts
$0 ~ /.*\$Rev.*\$.*/ {
  ++$2
  $2=$2 " $"
}

# print line
{
  print
}

Config file

author:ubunut-420
timestamp:Fri Jun 21 20:42:54 CEST 2013
commitmsg:default commit message

Remarks

I've tried to document well enough so you can easily implement it and modify it to your own, personal needs. Note that you can give the macros any name you want to, as long as you modify it in the source code. I also aimed to keep it relatively easy to extend the script, you should be able to add new macros fairly easily. If you're interested in extending or modifying the script, you might want to take a look at the .git directory too, there should be plenty of info there that can help to enhance the script, due to lack of time I didn't investigate the folder though.


Sadly not natively.

Does Git have keyword expansion? Keyword expansion is not recommended. Keyword expansion causes all sorts of strange problems and isn't really useful anyway, especially within the context of an SCM. You can perform keyword expansion outside of Git using a custom script. The Linux kernel export script does this to set the EXTRA_VERSION variable in the Makefile.

See gitattributes(5) if you really want to do this. If your translation is not reversible (eg SCCS keyword expansion) this may be problematic. (Hint: the supplied $Id$-expansion puts the 40-character hexadecimal blob object name into the id; you can figure out which commits include this blob by using a script like this.)

Read their documentation, link attached: Keyword Expansion