TFS: Merge best practices

We have a standard branch architecture: a development branch for each team, a common integration branch (from which all development branches are branched), and a production branch branched from Integration.

During the development phase I make lots of commits into the development branch. At the end of the phase I merge my changes to integration and later on to production.

Does it make sense to merge every commit individually, copying the original commit description and linking to the original task? The other option is, of course, to merge all commits at once with a single merge operation. I ask because the first way takes a lot of time, and I don't see any automation tools in TFS that would link a merge into another branch back to the original commit.

I would like to hear your opinion on best practices.


Solution 1:

  1. Merges from Dev* -> Integration and Integration -> Production should always be "copy" merges. This is the safest way to preserve the stability of your downstream branches.
    1. First merge in the other direction (e.g. Integration -> Dev2) to pick up the latest changes from the target branch.
    2. If there are conflicts, handle the diffs on a case by case basis. AcceptMerge (either auto or manual) is usually the desired result but sometimes you'll want to take one or the other branch's copy unchanged.
    3. Use the source branch (Dev2 in our case) to fully incorporate, react to, and stabilize these changes.
    4. Merge in the desired direction (e.g. Dev2 -> Integration). Resolve all conflicts as AcceptTheirs [aka "copy from source"].
    5. Ensure there are no changes in the target branch between steps #1-4. If the Dev branch is accepting merges early & often, like it should be, then it shouldn't be burdensome to lock the target branch during this hopefully-short process. If you anticipate a "big bang" merge of death for whatever reason, there's a decent chance locking will block other teams from doing the same thing in parallel, so you may have to iterate thru steps #1-4 repeatedly until you're ready.
  2. Do "catch up" merges whenever possible. That is, merge things in the same order they were checked in. If changesets 10, 20, and 30 are candidates to merge from A -> B, then any of these merge ranges is a "catch up:" 10~10, 10~20, 10~30. Any changeset range that skips #10 is known as a "cherry pick." Once you start cherry picking you run into a few hazards:
    1. You can't use the merge down, copy up model described above. For that alone, Laura Wingerd would say you're jumping over a curb.
    2. If any of the files touched in your range were also touched previously, you'll have to do a 3-way content merge so that only the cherry-picked diffs are propagated. No diff tool is perfect; you're adding a nonzero risk of bringing over more code than intended, accidentally overwriting changes made in the target, introducing logic bugs where the two branches diverge, etc.
    3. The set of changes you're promoting into the supposedly more stable branch represents a configuration that has never been built or tested before. You can make a decent guess about the final state of the target branch. "I'm merging all the changes affecting Module Foo, and I tested the new version of Foo in Dev, so that's how Foo will behave in Integration, right?" Sure...maybe...if you can track every dependency in your head (including everything that may have changed in Integration while you were testing Dev). But these guesses are in no way known to or validated by your SCM tool chain.
    4. In TFS specifically, cherry picking where namespace changes are involved is just asking to get burned. If your version range and/or path scope excludes the source of a rename, it'll come over as a branch instead. If you exclude the target, it'll pend a delete. If your path scope doesn't include the root of an undelete you'll get cryptic errors. If your range spans a time in between an undelete & re-delete, you'll get "phantom" files appearing in the target even if you don't include the undelete itself. If you merge Moves with all your path & version scopes correct, but do so out of order, it's possible to end up with a different target name than the source name even after all the candidate changesets have been exhausted. I'm sure there are more ways for this combo to go wrong that aren't coming to mind right now...just trust me.
  3. Always do a Get on the target branch before merging. Advanced version for ultimate safety: sync the workspace where you'll be merging to a specific changeset number that's at or near the Tip, then also [catch-up] merge to that same changeset. Doing so avoids a few potential issues:
    1. Merging into stale code, yielding confusing 3-way diffs that appear to remove changes from what you see at Tip. [you'd eventually get them back upon Checkin + Resolve, but no reason to go thru two risky diffs when you can avoid both]
    2. Having to go thru the conflict resolution process twice: once on Merge, once on Checkin. There's no way to avoid this in the general case, but most of the time the # of simultaneous changes made while you Merge + Resolve is tiny compared with the # of changes you'd encounter in a workspace that might be days or weeks out of date.
  4. Don't merge by label (or workspace) unless you really really know what you're doing. Let's review the features offered by TFS labels and then dissect why each is inappropriate for safe & consistent merging.
    1. Labels can represent multiple points in time. If a label represents a consistent snapshot of the VCS -- and was always intended as such -- then it has no technical advantage over a date or changeset #. Unfortunately it's quite difficult to tell if a label is in fact consistent over time. If not, merging by label can lead to:
      1. Inadvertent cherry-picking, if the range begins with a label that points to an item @ a time ahead of its first candidate
      2. Inadvertent exclusion, if the range begins with a label that points to an item @ a time ahead of the end of the range
      3. Inadvertent exclusion, if the range ends with a label that points to an item @ a time prior to the start of the range
    2. Label versionspecs represent a specific set of items. They can be used to deliberately exclude files and folders that a pure recursive query would otherwise see. This feature, too, is a bad match for Merge operations. (And again, if you don't need this ability, you're incurring the following risk without gaining anything over dates & changesets.)
      1. Items not present in the label will simply be ignored, rather than merged as pending deletes. Unlike some of the edge cases covered so far, this is a big deal: it's quite likely to happen in mainstream scenarios, yet most people miss it. [As a result, TFS 2010 adds support for deleted items inside labels.]
      2. Inadvertent cherry picking, if you add an item to the label that has been present for a while but was excluded from prior merges due to one of the aforementioned side effects.
      3. Intentional cherry picking. The whole advantage this feature brings to Merge is to break one of our guidelines, so obviously that's not a good reason at all. Furthermore, it causes cherry-picking at the file level, which is even more dangerous than "ordinary" cherry picking by changeset.
    3. Labels have friendly customizable names, owners, and comments. Thus we have a pure usability difference vs dates/changesets; no technical advantage is conferred. But even here it's not as attractive as it looks. TFS doesn't do much to actually surface labels in the UI, whereas you can see changeset comments all over the place. Querying by owner is fast (server side), but most other searches are slow (client side) unless you know the exact label name. Management facilities are virtually nonexistent. No changelog or auditing, only a timestamp. In all, these are hardly reasons to abandon the surety provided by changesets.
  5. Always merge the entire branch at once. Merging files or subtrees is sometimes tempting, but ultimately amounts to mere cherry-picking under a new guise.
  6. Plan ahead. Unfortunately, re-parenting branches in TFS is a painful topic. Sometimes it's arduous, sometimes it's only a few steps, but it's never obvious; there is no built in command (until 2010). Pulling it off in 2005/2008 requires a pretty deep knowledge of your current branch structure, desired structure, and how to abuse the side effects of various TF commands.
  7. Don't create branches inside of branches. For example, branching & merging is sometimes recommended as a way to maintain common modules or binaries between loosely coupled projects. I don't think this is very good advice to begin with -- far better to make your build system do its primary job properly than to shoehorn your source control system into doing something it's not really designed to do. Anyway, this "sharing" tactic clashes terribly with projects that themselves live inside a broader branch hierarchy for SCM purposes. If you're not uber careful, TFS will happily let you create arbitrary many-to-many branch relationships between version control items. Good luck sorting that out (I once had to do it for a customer; not pretty).
  8. Don't create files with the same relative path in two branches independently; use Merge to branch them around or you'll spend hours chasing namespace conflicts. (n/a in 2010)
  9. Don't re-add files on top of a path where other items used to exist. Whether the old items were Rename/Moved away, or simply Deleted, you'll face interesting challenges at Merge time; at minimum, it'll require two Checkins to fully propagate. (n/a in 2010, though the experience is still somewhat degraded; only 1 checkin is required, item contents is preserved, but the name history is in that branch & all downstream branches)
  10. Don't use the /force flag unless you know what you're doing. All /force merges are effectively cherry picks, leading to very similar risks (code getting lost during the Resolve process, etc etc).
  11. Don't use the /baseless flag unless you really really know what you're doing. You miss out on deletes -- similar to labels, except that renames always get morphed into branches instead of just in the unlucky edge cases. You don't get any debit/credit protections whatsoever. And scariest of all, you'll be creating new branch relationships. Sometimes. (No feedback is shown to the user as to whether each target item is new, old with a new relationship, or old with an existing relationship.)
  12. Avoid /discard (and equivalently, AcceptYours resolutions) when possible. Discarding some changesets only to accept subsequent ones is yet another name for cherry-picking :)
  13. Be careful with your resolutions in general. Each has unique downstream effects apart from its effect on the merge at hand.
    1. AcceptTheirs is a quick & powerful way to get a "copy" merge, as advocated in the first guideline. If you use it in other scenarios as well, remember you're not just telling TFS to make the file contents the same. You're telling it that the two files are completely in sync from a versioning POV. To wit, any prior changes to the target file that might've merged in the opposite direction will no longer be considered candidates once you check in an AcceptTheirs.
    2. Note that an AcceptMerge (auto or manual) whose resulting contents is identical to the source file will be considered an AcceptTheirs by the server. There is no differentiation to be found in the Checkin webservice protocol.
    3. Using AcceptYours when renames are involved can twist your brain. You'll quickly end up in a situation where "the same" item has different names in different branches. Assuming you have a good reason for discarding changes in the first place, this phenomenon isn't unsafe per se -- in fact, it's probably necessary to avoid either build breaks or one-off customizations to your makefiles. It's just confusing to humans, and very likely to break any automation scripts you have that assume tree structures are consistent from branch to branch.
    4. AcceptMerge is the default for a reason. It sometimes leads to more version conflicts than seem strictly necessary, but is the safest choice when true merging is required. (E.g. step #1 of the primary guideline "merge down, copy up".) So long as you're following the other guidelines, the number of merges that require manual attention should fall -- dramatically so if you're coming from a workflow that's heavy on cherry-picking.
  14. Bugs should be linked to the changeset where the fix was actually made. If you later need to drill into downstream branches to see when, where (and possibly how) the bugfix was propagated, that's a purely source control function. No need to pollute the work item with extra baggage, much less alter the way you fundamentally perform merges. In 2005/2008 you can traverse merge history with the 'tf merges' command or a 3rd party UI like Attrice SideKicks. In 2010 you get nifty visualizations built into Visual Studio. Instructions & screenshots on MSDN.
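
To make guideline 2 concrete, here is a minimal Python sketch of the catch-up versus cherry-pick distinction. It is not a TFS API: `classify_merge_range` and its arguments are hypothetical names made up for illustration, and it assumes you already have the list of candidate changeset numbers (in practice TFS works those out for you). A range counts as a catch-up only if it doesn't skip an older candidate.

```python
from typing import Sequence


def classify_merge_range(candidates: Sequence[int], start: int, end: int) -> str:
    """Classify the range start~end as 'catch-up', 'cherry-pick', or 'empty'.

    candidates: changeset numbers not yet merged from source to target
    (hypothetical input; a real tool would query the server for these).
    """
    if start > end:
        raise ValueError("start must not exceed end")

    pending = sorted(candidates)
    selected = [c for c in pending if start <= c <= end]
    if not selected:
        return "empty"

    # A catch-up takes candidates in check-in order, so nothing older
    # than the first selected changeset may be left behind.
    skipped = [c for c in pending if c < selected[0]]
    return "cherry-pick" if skipped else "catch-up"


# Illustrative candidates taken from the example in guideline 2.
candidates = [10, 20, 30]
for start, end in [(10, 10), (10, 20), (10, 30), (20, 30)]:
    print(f"{start}~{end}: {classify_merge_range(candidates, start, end)}")
```

With candidates 10, 20, and 30, the ranges 10~10, 10~20, and 10~30 classify as catch-ups, while 20~30 is a cherry pick because it skips changeset 10.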

Solution 2:

I've only ever merged a range of commits into the integration branch, specifying just the range of changesets being merged.
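
As a rough illustration of such a bulk range merge, the sketch below assembles a `tf merge` command line for a changeset range. The server paths and changeset numbers are placeholders, `build_range_merge` is a hypothetical helper, and the script only builds the argument list; it doesn't run anything. I'm assuming the `/recursive`, `/preview`, and `/version:C<from>~C<to>` options of `tf merge`, so check the `tf merge` help for your TFS version before relying on it.

```python
from typing import List


def build_range_merge(source: str, target: str, first: int, last: int,
                      preview: bool = False) -> List[str]:
    """Build (but do not run) a tf merge command for changesets first~last."""
    if first > last:
        raise ValueError("first changeset must not exceed last")

    cmd = ["tf", "merge", "/recursive", f"/version:C{first}~C{last}"]
    if preview:
        # /preview asks tf to report what would be merged without pending it.
        cmd.append("/preview")
    cmd += [source, target]
    return cmd


# Placeholder paths and changeset numbers, for illustration only.
print(" ".join(build_range_merge("$/Project/Dev2", "$/Project/Integration", 10, 30)))
```

Starting the range at the oldest unmerged changeset keeps this a single bulk merge rather than a pick of individual commits.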

The work items linked to individual commits at the development stage are development-phase work items. I don't think there's any need to roll them forward to integration or release.

You haven't specified where you're recording bugs / feature requests from customers. If you're assigning these to the release branch, you're probably creating other, more detailed work items for the development branch, and when merging you can simply mark all the issues that the bug fixes address as resolved for the branch you're merging into.

So, summing it up, I see no reason not to go with bulk merges.