When should I use git pull --rebase?

git

I know of some people who use git pull --rebase by default and others who insist never to use it. I believe I understand the difference between merging and rebasing, but I'm trying to put this in the context of git pull. Is it just about not wanting to see lots of merge commit messages, or are there other issues?

Solution 1:

I would like to provide a different perspective on what "git pull --rebase" actually means, because it seems to get lost sometimes.

If you've ever used Subversion (or CVS), you may be used to the behavior of "svn update". If you have changes to commit and the commit fails because changes have been made upstream, you "svn update". Subversion proceeds by merging upstream changes with yours, potentially resulting in conflicts.

What Subversion just did, was essentially "pull --rebase". The act of re-formulating your local changes to be relative to the newer version is the "rebasing" part of it. If you had done "svn diff" prior to the failed commit attempt, and compare the resulting diff with the output of "svn diff" afterwards, the difference between the two diffs is what the rebasing operation did.

The major difference between Git and Subversion in this case is that in Subversion, "your" changes only exist as non-committed changes in your working copy, while in Git you have actual commits locally. In other words, in Git you have forked the history; your history and the upstream history has diverged, but you have a common ancestor.

In my opinion, in the normal case of having your local branch simply reflecting the upstream branch and doing continuous development on it, the right thing to do is always "--rebase", because that is what you are semantically actually doing. You and others are hacking away at the intended linear history of a branch. The fact that someone else happened to push slightly prior to your attempted push is irrelevant, and it seems counter-productive for each such accident of timing to result in merges in the history.

If you actually feel the need for something to be a branch for whatever reason, that is a different concern in my opinion. But unless you have a specific and active desire to represent your changes in the form of a merge, the default behavior should, in my opinion, be "git pull --rebase".

Please consider other people that need to observe and understand the history of your project. Do you want the history littered with hundreds of merges all over the place, or do you want only the select few merges that represent real merges of intentional divergent development efforts?

Solution 2:

You should use git pull --rebase when

your changes do not deserve a separate branch

Indeed -- why not then? It's more clear, and doesn't impose a logical grouping on your commits.

Ok, I suppose it needs some clarification. In Git, as you probably know, you're encouraged to branch and merge. Your local branch, into which you pull changes, and remote branch are, actually, different branches, and git pull is about merging them. It's reasonable, since you push not very often and usually accumulate a number of changes before they constitute a completed feature.

However, sometimes--by whatever reason--you think that it would actually be better if these two--remote and local--were one branch. Like in SVN. It is here where git pull --rebase comes into play. You no longer merge--you actually commit on top of the remote branch. That's what it actually is about.

Whether it's dangerous or not is the question of whether you are treating local and remote branch as one inseparable thing. Sometimes it's reasonable (when your changes are small, or if you're at the beginning of a robust development, when important changes are brought in by small commits). Sometimes it's not (when you'd normally create another branch, but you were too lazy to do that). But that's a different question.

Solution 3:

Perhaps the best way to explain it is with an example:

Alice creates topic branch A, and works on it
Bob creates unrelated topic branch B, and works on it
Alice does git checkout master && git pull. Master is already up to date.
Bob does git checkout master && git pull. Master is already up to date.
Alice does git merge topic-branch-A
Bob does git merge topic-branch-B
Bob does git push origin master before Alice
Alice does git push origin master, which is rejected because it's not a fast-forward merge.
Alice looks at origin/master's log, and sees that the commit is unrelated to hers.
Alice does git pull --rebase origin master
Alice's merge commit is unwound, Bob's commit is pulled, and Alice's commit is applied after Bob's commit.
Alice does git push origin master, and everyone is happy they don't have to read a useless merge commit when they look at the logs in the future.

Note that the specific branch being merged into is irrelevant to the example. Master in this example could just as easily be a release branch or dev branch. The key point is that Alice & Bob are simultaneously merging their local branches to a shared remote branch.

Solution 4:

I think you should use git pull --rebase when collaborating with others on the same branch. You are in your work → commit → work → commit cycle, and when you decide to push your work your push is rejected, because there's been parallel work on the same branch. At this point I always do a pull --rebase. I do not use squash (to flatten commits), but I rebase to avoid the extra merge commits.

As your Git knowledge increases you find yourself looking a lot more at history than with any other version control systems I've used. If you have a ton of small merge commits, it's easy to lose focus of the bigger picture that's happening in your history.

This is actually the only time I do rebasing(*), and the rest of my workflow is merge based. But as long as your most frequent committers do this, history looks a whole lot better in the end.

(*) While teaching a Git course, I had a student arrest me on this, since I also advocated rebasing feature branches in certain circumstances. And he had read this answer ;) Such rebasing is also possible, but it always has to be according to a pre-arranged/agreed system, and as such should not "always" be applied. And at that time I usually don't do pull --rebase either, which is what the question is about ;)

Solution 5:

I don't think there's ever a reason not to use pull --rebase -- I added code to Git specifically to allow my git pull command to always rebase against upstream commits.

When looking through history, it is just never interesting to know when the guy/gal working on the feature stopped to synchronise up. It might be useful for the guy/gal while he/she is doing it, but that's what reflog is for. It's just adding noise for everyone else.