Git cherry-pick and data model integrity
When two branches have diverged and only a specific commit from one branch (not the whole branch) needs to be brought into the other, git cherry-pick does exactly that.
Some time later, the two branches need to be merged completely. How will Git know that it already has the commit that was cherry-picked earlier, so that it won't reintroduce it?
Solution 1:
The "avoiding duplicate commit" article mentioned in tonio's answer says:
Imagine we have the master branch and a branch b:
o---X   <-- master
     \
      b1---b2---b3---b4   <-- b
Now we urgently need the commits b1 and b3 in master, but not the remaining commits in b. So we check out the master branch and cherry-pick commits b1 and b3:
$ git checkout master
$ git cherry-pick <SHA of b1>
$ git cherry-pick <SHA of b3>
The result would be:
o---X---b1'---b3'   <-- master
     \
      b1---b2---b3---b4   <-- b
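To see why Git can later recognize b3' as a copy of b3, it may help to compare the two commits in a scratch repository. What follows is a minimal sketch, not part of the quoted article: the temporary repository, the commit helper, the file names, and the "Picker" committer name are all made up for illustration. It suggests that the copy gets a new SHA while git patch-id reports the same ID for both diffs.

```shell
set -e
# Scratch repository; every name below is illustrative, not from the article.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: create one file named after the commit and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit o
commit X
git checkout -q -b b
commit b1; commit b2; commit b3; commit b4

git checkout -q master
orig_sha=$(git rev-parse b~1)          # b3 is the parent of b4
GIT_COMMITTER_NAME="Picker" git cherry-pick "$orig_sha" >/dev/null
copy_sha=$(git rev-parse HEAD)         # b3', the copy on master

# A patch ID is a hash of the diff only, so it ignores parent and committer.
orig_id=$(git show "$orig_sha" | git patch-id --stable | cut -d' ' -f1)
copy_id=$(git show "$copy_sha" | git patch-id --stable | cut -d' ' -f1)

echo "commit SHAs: $orig_sha vs $copy_sha"
echo "patch IDs:   $orig_id vs $copy_id"
```

The commit SHAs differ because the copy has a different parent and committer, while the patch IDs match because both commits introduce the same diff.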
Let’s say we do another commit on master and we get:
o---X---b1'---b3'---Y   <-- master
     \
      b1---b2---b3---b4   <-- b
If we now merged branch b into master:
$ git merge b
We would get the following:
o---X---b1'---b3'---Y---M   <-- master
     \                 /
      b1---b2---b3---b4     <-- b
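The merged history above can be reproduced in a throwaway repository. This is only a sketch under assumptions: the temporary directory, the commit helper, the one-file-per-commit layout, and the "Picker" committer name are invented for the demo. Since cherry-pick keeps the original commit message, listing the subjects that occur more than once in the log is one simple way to spot the duplicated changes.

```shell
set -e
# Scratch repository; names and layout are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: create one file named after the commit and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit o
commit X
git checkout -q -b b
commit b1; commit b2; commit b3; commit b4

git checkout -q master
# b~3 is b1 and b~1 is b3; a different committer guarantees new SHAs.
GIT_COMMITTER_NAME="Picker" git cherry-pick b~3 b~1 >/dev/null
commit Y
git merge -q --no-edit b               # creates the merge commit M

git --no-pager log --oneline --graph
# Subjects that appear more than once in master's history.
dupes=$(git log --format=%s | sort | uniq -d | xargs)
echo "duplicated subjects: $dupes"
```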
That means the changes introduced by b1 and b3 would appear twice in the history (once as b1 and b3, and once as b1' and b3'). To avoid that we can rebase instead of merge:
$ git rebase master b
Which would yield:
o---X---b1'---b3'---Y   <-- master
                     \
                      b2---b4   <-- b
Finally:
$ git checkout master
$ git merge b
gives us:
o---X---b1'---b3'---Y---b2---b4 <-- master, b
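The rebase-then-merge sequence can be tried end to end in a scratch repository. The sketch below is an illustration, not the article's own script: the temporary directory, the commit helper, the file names, and the "Picker" committer name are assumptions. It rebuilds the history from the diagrams, rebases b onto master, and then fast-forwards master.

```shell
set -e
# Scratch repository; names and layout are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: create one file named after the commit and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit o
commit X
git checkout -q -b b
commit b1; commit b2; commit b3; commit b4

git checkout -q master
GIT_COMMITTER_NAME="Picker" git cherry-pick b~3 b~1 >/dev/null   # b1', b3'
commit Y

# Replay b on top of master; commits whose change is already in master are skipped.
git rebase -q master b >/dev/null 2>&1
replayed=$(git log --format=%s master..b | xargs)   # newest first
count=$(git rev-list --count master..b)
echo "replayed on top of master: $replayed ($count commits)"

git checkout -q master
git merge -q --ff-only b               # plain fast-forward, no merge commit
git --no-pager log --oneline
```

Only b2 and b4 are replayed, so master ends up with a linear history in which each change appears once.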
(after this thread)
The OP adds in the comments:
But still it seems that I don't quite understand how rebase works. I mean, even after rebasing, shouldn't the cherry-picked commits still appear?
No. The git rebase man page explicitly mentions:
If the upstream branch already contains a change you have made (e.g., because you mailed a patch which was applied upstream), then that commit will be skipped.
For example, running git rebase master on the following history (in which A' and A introduce the same set of changes, but have different committer information):
      A---B---C topic
     /
D---E---A'---F master
will result in:
               B'---C' topic
              /
D---E---A'---F master
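You can detect whether a commit is already present on master with git cherry master (if you are on the topic branch). Commits that already have an equivalent on master are listed with a - prefix; the others are listed with a +.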
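As a rough illustration of git cherry, the sketch below rebuilds the master/b history from the first solution in a scratch repository and runs the command from branch b. The temporary directory, the commit helper, the file names, and the "Picker" committer name are assumptions made for the demo.

```shell
set -e
# Scratch repository; names and layout are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: create one file named after the commit and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit o
commit X
git checkout -q -b b
commit b1; commit b2; commit b3; commit b4

git checkout -q master
GIT_COMMITTER_NAME="Picker" git cherry-pick b~3 b~1 >/dev/null   # b1', b3'

git checkout -q b
# One line per commit on b that is not reachable from master:
#   "-" means master already has an equivalent change, "+" means it does not.
git cherry -v master
already=$(git cherry master | grep -c '^-')
missing=$(git cherry master | grep -c '^+')
echo "already on master: $already, still missing: $missing"
```

Here b1 and b3 should be the commits marked with -, and b2 and b4 the ones marked with +.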
Solution 2:
You might want to read Git Cherry-pick vs Merge Workflow for a good comparison between merge and cherry-pick. In particular, cherry-pick does not record the original commit as a parent: the copy is a new commit with its own SHA, so the history alone carries no link between the copy and the original, and a later merge will bring in the original commit as well.
See also http://davitenio.wordpress.com/2008/09/27/git-merge-after-git-cherry-pick-avoiding-duplicate-commits/ on how to avoid duplicating commits in this case by using rebase.
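The point about parents can be checked directly. In this sketch (scratch repository, commit helper, file names, and "Picker" committer name are all assumptions for the demo), the cherry-picked commit has exactly one parent, the previous tip of master. The -x option of git cherry-pick only adds a "(cherry picked from commit ...)" line to the commit message; it does not add a parent link.

```shell
set -e
# Scratch repository; names and layout are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo"

# Hypothetical helper: create one file named after the commit and commit it.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit o
commit X
git checkout -q -b b
commit b1; commit b2

git checkout -q master
commit Y
y_sha=$(git rev-parse HEAD)
GIT_COMMITTER_NAME="Picker" git cherry-pick -x b >/dev/null   # copy b2 onto master

# rev-list --parents prints the commit followed by its parents.
parent_count=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
parent_sha=$(git rev-parse HEAD^)
note=$(git log -1 --format=%B | grep -c 'cherry picked from commit')

echo "parents of the copy: $parent_count"
git --no-pager log -1 --format=%B
```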