Squash to only one "proper" commit for github pull request

I have a repo on github which someone else (Bob, for the sake of argument) has issued a pull request for. His code's not perfect, so we go through a few rounds of markups. As I understand it, he commits and pushes to his pull-request for each set of marked up changes.

So my repository now looks like this:

master: ---o A (Chowlett)
           |
           |
pull-req:  o---o---o---o
               B   C   D (all Bob)

Commit SHAs and msgs are as follows:

A:

123456 Good commit <chowlett>

B:

777ccc Fix the widget bug <bob>

C:

888ddd Review markups <bob>

D:

999eee Further markups <bob>

I'm now happy to accept this pull request; but I'd rather the pre-markup versions weren't in my repo. Can I achieve all of the following; and how?

  • Merge B, C & D into my repo as a single commit
  • Generate the "Merge pull request #99 into ..." commit as well
  • Have github automatically close the pull request

Note that Bob doesn't have to squash his commits when he makes a GitHub PR.
Since March 2016, that operation can be left to the maintainer (you) accepting his PR.

See "Squash your commits" and its new documentation

This is a new option which lets you force commit squashing on all pull requests merged via the merge button.

https://help.github.com/assets/images/help/pull_requests/squash-and-merge.png


There are two 'squash' functions built into git: git merge --squash, and the squash action in git rebase --interactive. The former doesn't retain any author or date information; it just collects all changes from a series of commits into the local working copy. The latter is annoying because it requires interaction.

The git squash extension does what you want. It rebases the current HEAD onto a specified base while automatically squashing the commits in between. It also provides a command-line option to set the message on the final squashed commit in cases where the rebase doesn't create conflicts.

Throwing this together with hub and ghi, you might be able to construct a script along these lines:

git pull upstream master
hub checkout https://github.com/$user/$repo/pull/$issue
git squash master
rev=$(git rev-parse --short HEAD)
git checkout master
git merge $rev
git commit --amend -m "Merged pull request #$issue"
git push upstream master
ghi close $issue $user/$repo
ghi comment $issue "Merged as $rev" $user/$repo 
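
One caveat if the "Merge pull request #99" commit matters to you: a plain git merge of a single squashed commit that sits directly on top of master will fast-forward, so no merge commit is created. Below is a minimal sketch in a throwaway repository, assuming the squash is done with git merge --squash onto a temporary branch rather than with the git squash extension. The branch name pull-req-squashed, the file name, and Bob's e-mail address are made up for illustration.

```shell
set -eu
# Throwaway repository so nothing real is touched.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "chowlett"
git config user.email "chowlett@example.com"

echo base > widget.txt
git add widget.txt
git commit -q -m "Good commit"                      # commit A

# Bob's three commits (B, C, D) on the pull-request branch.
git checkout -q -b pull-req
for msg in "Fix the widget bug" "Review markups" "Further markups"; do
    echo "$msg" >> widget.txt
    git commit -q -a -m "$msg" --author="bob <bob@example.com>"
done

# Squash B, C and D into one commit on a temporary branch cut from master.
git checkout -q -b pull-req-squashed master
git merge -q --squash pull-req
git commit -q -m "Fix the widget bug" --author="bob <bob@example.com>"

# --no-ff forces a real merge commit even though a fast-forward is possible.
git checkout -q master
git merge -q --no-ff -m "Merge pull request #99" pull-req-squashed

git log --graph --format='%h %an: %s' master
```

The resulting master history holds A, one squashed commit authored by bob, and the merge commit on top; B, C and D are not reachable from master.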

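You can use the --squash option for merge. Since git merge itself takes a branch or commit rather than a URL, the simplest way to fetch and squash-merge in one step is git pull, which passes --squash through to the merge:

git pull --squash <remote url> <remote branch>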

This will however not produce a merge commit. It will instead produce a normal set of working tree changes as if you manually applied all of his changes to your copy. You would then commit like normal.

The downside will be that your history on master will not show this commit as a merge from his branch. Unless you set the author explicitly when committing (git commit --author), it will just look like you did the work yourself and not give Bob credit.
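
To make the credit problem concrete, here is a small sketch in a throwaway repository. It shows that git merge --squash only stages the changes, and that git commit --author records Bob as author while you remain the committer. The names, e-mail addresses and file name are placeholders.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "chowlett"
git config user.email "chowlett@example.com"
git commit -q --allow-empty -m "Good commit"

# Two commits by Bob on the pull-request branch.
git checkout -q -b pull-req
echo fix > widget.txt
git add widget.txt
git commit -q -m "Fix the widget bug" --author="bob <bob@example.com>"
echo markup >> widget.txt
git commit -q -a -m "Review markups" --author="bob <bob@example.com>"

git checkout -q master
before=$(git rev-parse HEAD)
git merge -q --squash pull-req
# Nothing has been committed yet: HEAD is unchanged and the changes are staged.
after=$(git rev-parse HEAD)
staged=$(git diff --cached --name-only)

# Record Bob as author; the committer stays whoever runs the command.
git commit -q -m "Fix the widget bug" --author="bob <bob@example.com>"
git log -1 --format='author: %an, committer: %cn'
```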


Using git rebase

One idea would be to check out the branch, squash all commits into one using an interactive rebase, then force push to update the pull request and merge (though part of this work could be delegated to Bob).

To automatically squash all commits from a branch into the first one and apply this to the pull request, you can use the following commands:

$ git checkout pull-req
$ GIT_SEQUENCE_EDITOR='sed -i "2,\$s/^pick/s/g" $1' git rebase -i origin/master
$ git push --force

GIT_SEQUENCE_EDITOR is a Git environment variable that sets the editor used for the rebase commit list, here only for this one command. We set it to an inline script which replaces the word pick with s (meaning squash) at the beginning of every line except the first (that's the 2,\$ in the sed pattern). The commit list that gets passed to the script is a simple text file. Git then proceeds with the rebase and lets you edit the final commit message.
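
To see what the substitution does without touching a repository, you can run it on a hand-written todo list. This is only a sketch: the three lines below imitate what git rebase -i would generate for commits B, C and D, and the output goes to stdout instead of being edited in place with -i.

```shell
set -eu
# Hypothetical todo list resembling what git rebase -i hands to the editor.
todo=$(mktemp)
cat > "$todo" <<'EOF'
pick 777ccc Fix the widget bug
pick 888ddd Review markups
pick 999eee Further markups
EOF

# Same substitution as in the rebase command: from line 2 to the last line,
# turn a leading "pick" into "s" (squash). Line 1 keeps its "pick".
result=$(sed '2,$s/^pick/s/' "$todo")
printf '%s\n' "$result"
rm -f "$todo"
```

The first line stays as pick and the other two become s 888ddd ... and s 999eee .... The comment lines that a real todo file carries are left alone, because they do not start with pick.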

Also, with a git hook you could then more or less easily edit this final message to suit your needs (say add a visual separator between the squashed commits' messages).

Using git merge --squash

Squashing is also possible through git merge --squash. See here for the difference between the two methods. The script below squashes the commits of a branch into a single commit using the merge command. It also creates a backup of the branch (just in case).

MAINBRANCH="master"    
CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)

# update current feature branch with master
git pull origin $MAINBRANCH

# delete existing backup
git branch -D "$CURRENT_BRANCH-backup"

# backup current feature branch and go back
git checkout -b "$CURRENT_BRANCH-backup" && git checkout -

# checkout and update master branch
git checkout $MAINBRANCH && git pull

# create new branch from master
git checkout -b "$CURRENT_BRANCH-squashed"

# set it to track the corresponding feature branch
git branch "$CURRENT_BRANCH-squashed" --set-upstream-to "$CURRENT_BRANCH"

# merge and squash the feature branch into the one created
git merge --squash $CURRENT_BRANCH

# commit the squashed changes
git commit

# force push to the corresponding feature branch
git push -f . HEAD:$CURRENT_BRANCH

# checkout the feature branch
git checkout $CURRENT_BRANCH

# delete the squashed copy
git branch -D "$CURRENT_BRANCH-squashed"
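
Before deleting the backup, it may be worth checking that the squash changed only the history and not the content. The following sketch replays the core of the script in a throwaway repository and compares the squashed branch with its backup. The branch name feature and the file name are placeholders, and git branch -f stands in for the local force push.

```shell
set -eu
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "chowlett"
git config user.email "chowlett@example.com"
git commit -q --allow-empty -m "Good commit"

# A feature branch with three commits, plus a backup pointing at its tip.
git checkout -q -b feature
for n in 1 2 3; do
    echo "change $n" >> widget.txt
    git add widget.txt
    git commit -q -m "change $n"
done
git branch feature-backup

# Squash onto a fresh branch cut from master, then move feature to the result.
git checkout -q -b feature-squashed master
git merge -q --squash feature
git commit -q -m "Squashed feature"
git branch -f feature feature-squashed

# One commit ahead of master, and the same tree as the backup.
commits=$(git rev-list --count master..feature)
if git diff --quiet feature-backup feature; then same_tree=yes; else same_tree=no; fi
echo "commits ahead of master: $commits, same content as backup: $same_tree"
```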