Git commit auditing

I have a git server running over ssh and each user has a unix account on the system.

Given that two users have access to a repo, how can I be sure which user performed which commit, since the commit user name and email are supplied and controlled by the git client?

I am concerned that a user might try to impersonate another, even if they have the same authorization rights.


If you are that worried about it, there are a couple of ways of addressing the issue.

  1. Have your users sign their commits; git supports GPG signing.
  2. Don't give users the right to commit to the main repository. Have them commit to their own repository and then have a trusted user bring the changes into the main repository. That's why the log messages for some git projects (such as git itself) have separate fields for "Author" (the person who created the change) and "Committer" (the person who committed the change into the repository).
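
For option 1, here is a minimal sketch of what signing and checking might look like. The client-side commands in the comments are standard git, with KEYID as a placeholder for the developer's own key. The helper name unsigned_commits is made up for illustration; it prints every commit in a revision range that git verify-commit does not accept.

```shell
# Client side, run by each developer (KEYID is a placeholder):
#   git config user.signingkey KEYID
#   git commit -S -m "message"

# Reviewer or server side: print commits in a range without a valid signature.
# Usage: unsigned_commits <revision-range>, e.g. unsigned_commits origin/master..HEAD
unsigned_commits() {
  for commit in $(git rev-list "$1"); do
    # verify-commit exits non-zero for unsigned or unverifiable commits.
    git verify-commit "$commit" >/dev/null 2>&1 || echo "$commit"
  done
}
```

Note that the account running the check needs the developers' public keys in its GPG keyring; without them, correctly signed commits are reported as well.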

I see two good ways of getting this kind of information. One is by increasing the logging from sshd itself, and the other by doing deeper monitoring of the git repository on disk. Since neither one individually gives you the information you want, you may want to do both and correlate the log data using an external log analysis engine or on demand using human eyes and timestamps.

sshd Modifications

By default, as you've no doubt seen, you can see when a user logged in, and from where, using the ssh authentication logs. What you want to do is change the level at which sshd logs. So edit your /etc/ssh/sshd_config and find the line that looks like

#LogLevel INFO

and change that to

LogLevel VERBOSE

then restart the sshd service. This increases the logging level of sshd by 1 step, which gives a lot more information. Check out this log snippet of my remote access after making that change.

Nov  2 08:37:09 node1 sshd[4859]: Connection from 10.10.10.5 port 50445
Nov  2 08:37:10 node1 sshd[4859]: Found matching RSA key: f2:9e:a1:ca:0c:33:02:37:9b:de:e7:63:d5:f4:25:06
Nov  2 08:37:10 node1 sshd[4860]: Postponed publickey for scott from 10.10.10.5 port 50445 ssh2
Nov  2 08:37:10 node1 sshd[4859]: Found matching RSA key: f2:9e:a1:ca:0c:33:02:37:9b:de:e7:63:d5:f4:25:06
Nov  2 08:37:10 node1 sshd[4859]: Accepted publickey for scott from 10.10.10.5 port 50445 ssh2
Nov  2 08:37:10 node1 sshd[4859]: pam_unix(sshd:session): session opened for user scott by (uid=0)
Nov  2 08:37:10 node1 sshd[4859]: User child is on pid 4862
Nov  2 08:40:27 node1 sshd[4862]: Connection closed by 10.10.10.5
Nov  2 08:40:27 node1 sshd[4862]: Transferred: sent 30632, received 7024 bytes
Nov  2 08:40:27 node1 sshd[4862]: Closing connection to 10.10.10.5 port 50445
Nov  2 08:40:27 node1 sshd[4859]: pam_unix(sshd:session): session closed for user scott 

The important things to notice in here are two-fold

  1. We see the fingerprint of the public key used to authenticate me
  2. We see the timestamp of my log off

With the default LogLevel (INFO), sshd logs neither of those items. Getting the fingerprint of a key takes one extra step: process the appropriate authorized_keys file with ssh-keygen, like so.

[root@node1 ssh]# ssh-keygen -l -f /home/scott/.ssh/authorized_keys
4096 f2:9e:a1:ca:0c:33:02:37:9b:de:e7:63:d5:f4:25:06 /home/scott/.ssh/authorized_keys (RSA)

So now you know the following pieces of information:

  1. Username that logged on
  2. Time that user logged on
  3. Which public key was used for authentication
  4. Time that user logged off

Now that we have a way to attribute user action at a specific time, assuming both users weren't logged in at the same time, we can start looking at the changes made to the repository.
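
To pull those four pieces of information out of the log, something like the following awk wrapper may help. It is only a sketch: it assumes the exact message wording shown in the snippet above (other OpenSSH versions phrase these lines differently), and the function name sshd_sessions is hypothetical.

```shell
# Summarise sshd sessions from LogLevel VERBOSE syslog lines read on stdin.
# Prints one line per closed session: user, key fingerprint, login time, logout time.
sshd_sessions() {
  awk '
    {
      # Field 5 looks like "sshd[4859]:"; reduce it to the bare PID.
      pid = $5
      gsub(/[^0-9]/, "", pid)
      stamp = $1 " " $2 " " $3
    }
    /Found matching .* key:/  { fp[pid] = $NF }
    /Accepted publickey for/  { user[pid] = $9; login[pid] = stamp }
    /session closed for user/ {
      # Only report sessions whose login we also saw in this input.
      if (pid in login)
        print user[pid], fp[pid], login[pid], stamp
      delete login[pid]
    }
  '
}
```

Feed it the authentication log, for example `sshd_sessions < /var/log/secure` on RedHat-style systems (the path varies by distro). For the sample above it prints a single line: `scott f2:9e:a1:ca:0c:33:02:37:9b:de:e7:63:d5:f4:25:06 Nov 2 08:37:10 Nov 2 08:40:27`.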

Directory Monitoring with Auditd

As sysadmin1138 said, this could be an excellent use case for the auditd subsystem. If you are not using a RedHat based distro there is probably an analogue, but you'll have to find it. The configuration for auditd is pretty intense and has a redonkulous number of configuration options. To get an idea of some of the options, please check out this question on our sister site for Information Security Professionals.

Minimally, I would recommend setting up what's called a "watch" on the directory on disk that contains your git repository in question. What this does is instruct the kernel module to report on attempts to perform file access calls, such as open() or creat(), on file handles pointing to the files or directories we list.

Here's a sample config that would do this, and only this. So be careful to read through, and understand, your existing /etc/audit/audit.rules in order to integrate changes appropriately.

# This file contains the auditctl rules that are loaded
# whenever the audit daemon is started via the initscripts.
# The rules are simply the parameters that would be passed
# to auditctl.

# First rule - delete all
-D

# Increase the buffers to survive stress events.
# Make this bigger for busy systems
-b 1024

# Watch the repository directory for writes (w) and attribute changes (a)
-w /path/to/git/repos -p wa

# Disable adding any additional rules - note that adding *new* rules will require a reboot
-e 2

The only technical approach that you can take is to trust the identity of the ssh connection. You could then enforce that each user only pushes commits that he made by validating the committer of each new pushed commit.
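
As a rough sketch of that committer check, a pre-receive hook could look something like the following. The function names is_zero and check_committers are made up for this example, and how you map a unix account to its expected committer email is left to the caller.

```shell
#!/bin/sh
# Sketch of a pre-receive check: every newly pushed commit must carry the
# committer email expected for the account that opened the ssh session.

# True when the argument is an all-zero object id (ref created or deleted).
is_zero() {
  case $1 in
    *[!0]*) return 1 ;;
    *)      return 0 ;;
  esac
}

# Usage: check_committers EXPECTED_EMAIL
# Reads "<old> <new> <ref>" lines on stdin, as git feeds them to pre-receive.
check_committers() {
  expected=$1
  status=0
  while read -r old new ref; do
    # Ref deletion: nothing new to inspect.
    if is_zero "$new"; then continue; fi
    if is_zero "$old"; then
      # New ref: look only at commits not reachable from any existing ref.
      range="$new --not --all"
    else
      range="$old..$new"
    fi
    # $range is left unquoted on purpose so it splits into separate arguments.
    for commit in $(git rev-list $range); do
      email=$(git log -1 --format=%ce "$commit")
      if [ "$email" != "$expected" ]; then
        echo "rejected $ref: $commit has committer $email, expected $expected" >&2
        status=1
      fi
    done
  done
  return $status
}
```

Installed as hooks/pre-receive in the bare repository, the script would end with a call such as `check_committers "$(id -un)@example.com"`. The `@example.com` mapping from login name to email is purely an assumption; a real setup would need its own lookup.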

For this to be reliable you almost certainly don't want to give your users unrestricted shell access to the box where the repository resides. You would want to restrict them to something like git-shell; otherwise the restrictions are easily worked around.

Users would still be able to impersonate each other as authors, though. You could restrict this as well, but you would lose a lot of common workflows such as cherry-picking and rebasing and perhaps even branching (depending on your hook implementation), so you may not want to do this.

At some point, to some extent, you need to trust your developers.