Note: This is the fourth post in a series of posts about branching in Team Foundation Version Control. You can get the other posts here (1, 2, 3).
Have you ever sworn at a colleague because he or she repeatedly didn’t check in source files before the nightly build? Have you ever contemplated homicide because someone has checked out the project or solution file and then gone on holiday?
I have, and even though it is usually possible to get someone to go in with administrative rights and undo the check-out, that still doesn’t make it easier to put up with that kind of behaviour. The problem is that the source code repository is a shared resource and we are essentially using a cooperative algorithm to dish out access.
While I think that a pre-emptive check-in/out system would be fun to watch as developers scramble to make their changes before their time-slice ends, I think that ultimately everyone would spend all their time context switching.
The Hardware Guys are Laughing at the Software Guys
There is a certain element of truth in the saying that “there is nothing new under the sun”. In this case our issue with source file contention is something that the hardware guys solved when they started building machines with multiple processors.
When a CPU starts executing instructions, a small amount of data is moved from main memory into a high-speed cache. There is one cache per CPU, so on multi-processor machines concurrency is achieved by copying relevant data into a separate work area and executing on that rather than having each processor go back to main memory (an expensive and slow resource, comparatively speaking).
Obviously there are a few subtleties in the analogy that I have skipped over here but the point is that contention can be reduced by duplicating data (branching) and in the case of Team Foundation Version Control merging those changes back in.
Funnily enough, the hardware guys can’t just merge their data back in. I bet they would love to have access to our intelligent merge tools – who’s laughing now?
Patterns for Branching to Reduce Contention
Creating a branch is like creating an easy to access cache just for you (or your small team) but if you are going to take this approach you need to think about what pattern you are going to use and the impact that is going to have on your source code repository.
Unlike the mainstream branching approaches used for versioning and patching, branching to reduce contention is far more risky because it’s something that your configuration manager is going to have a harder time keeping on top of (because of the volume). In order to be successful, individual developers are going to need to understand branching.
There are three different branching strategies to reduce contention that I have identified and you can use one or a combination of them depending on your scenario.
- Branch per Task
- Branch per Developer
- Branch per Feature
Each strategy has a slightly different impact on Team Foundation Version Control and on the kinds of things that you are going to need to keep on top of.
Branch per Task
One of the central features in Team Foundation Server is the ability to track project progress by updating a series of work items, and when it comes to code-cutting activities we are specifically talking about Tasks. If source file contention and code stability are an issue on your project, you may consider using the Branch per Task strategy.
The Branch per Task strategy involves creating a branch of the source code for each coding-related task that you are assigned, working on that task until the coding is complete, and then integrating the changes or additions back into the main version branch.
The diagram above shows six different branch per task invocations staggered over time, and I have highlighted key points in the branch’s lifecycle. The first step is creating the branch, and what I like to do here is create the branch off an appropriate piece of the source tree and check in the branch while associating it with the task that was allocated from work item tracking. This immediately creates a connection between the work item and the branch that the work was performed on.
In the second step the code is completed and code from the main version branch is forward integrated into the task branch. This allows the developer to ensure only minimal stabilisation work will need to be done after reverse integrating. It also provides an opportunity to ensure that any new unit tests that were written continue to pass successfully (both from the task branch, and the version branch). Once forward integration is completed successfully you can go ahead and reverse integrate.
Once reverse integration has occurred it is possible that stabilisation is still required because of code that slipped in after forward integrating; however, there should be minimal problems. Using this approach allows developers to work freely on the entire code base and not be impacted by others who have checked out critical solution and project files.
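To make the order of those steps concrete, here is a minimal sketch in Python. It is a toy model and not how Team Foundation Version Control works internally: a branch is just a dictionary of file contents, merging is done per whole file, and the names `Branch`, `create_branch`, `merge`, `forward_integrate` and `reverse_integrate` are all my own inventions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """A toy branch (hypothetical model, not a TFVC type)."""
    name: str
    files: dict                               # path -> content
    base: dict = field(default_factory=dict)  # parent snapshot at last sync


def create_branch(parent, name):
    """Step 1: copy the parent's files and remember them as the merge base."""
    return Branch(name, dict(parent.files), dict(parent.files))


def merge(source, target, base):
    """File-level three-way merge of source into target.

    Returns (merged_files, conflicts). A file conflicts when both sides
    changed it since the base and ended up with different content; in that
    case the target's version is kept until someone resolves it by hand.
    """
    merged, conflicts = dict(target), []
    for path in set(source) | set(target) | set(base):
        s, t, b = source.get(path), target.get(path), base.get(path)
        if s == t or s == b:
            continue  # source brings nothing new for this file
        if t == b:
            # only the source changed: take its version (or its deletion)
            if s is None:
                merged.pop(path, None)
            else:
                merged[path] = s
        else:
            conflicts.append(path)
    return merged, sorted(conflicts)


def forward_integrate(task, main):
    """Step 2: pull the main branch's changes into the task branch."""
    task.files, conflicts = merge(main.files, task.files, task.base)
    task.base = dict(main.files)  # the task branch is now synced to main
    return conflicts


def reverse_integrate(task, main):
    """Step 3: push the task branch's changes back onto the main branch."""
    main.files, conflicts = merge(task.files, main.files, task.base)
    return conflicts


# Example run with made-up file names and contents.
main = Branch("Main", {"app.sln": "v1", "core.cs": "v1"})
task = create_branch(main, "ProductX-Task3967")

task.files["core.cs"] = "v1 + task change"  # work done on the task branch
main.files["app.sln"] = "v2"                # someone else checks in to main

fi_conflicts = forward_integrate(task, main)
ri_conflicts = reverse_integrate(task, main)
```

In this run neither integration reports a conflict, and the main branch ends up with both the other developer's solution file and the task's code change. If someone checked in to main between the two integration steps, `reverse_integrate` could still report conflicts, which is the "code that slipped in" case described above.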
If you are going to use this strategy there are some rules of engagement:
- Follow a naming convention; what you don’t want in your version control repository is a whole heap of branches like “ProductX-IWasHere95” and “ProductX-YickityYack”. It’ll just make it impossible to quickly scan the source tree and find what you want. I would suggest doing something like “ProductX-Task3967” to keep it nice and orderly.
- Keep those branches short; when you branch, you branch per task. Don’t try to squeeze tasks onto a branch where they don’t belong, and make sure that the tasks are small enough in size (but not too small) that you can close the task off in a day or two.
- Be reasonable; you don’t need a branch to remove a carriage return from the end of a file. Understand the difference between stuff that can be done without consequence and things that are going to affect other developers on the team.
- Clean up after yourself; if you have finished a task, let the branch lie around for a little while, but make sure you clean it up. If you combine the Branch per Task strategy with the branch per major version approach and you don’t clean up, then branching from a Team Project’s area will result in a whole heap of files being duplicated unnecessarily.
While I have taken the time to describe this branching strategy, it’s not necessarily my favorite, especially on large teams (small teams of five to ten developers should be OK).
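The naming and clean-up rules are easy to check mechanically. The following is a rough Python sketch, assuming the “ProductX-Task3967” convention suggested above; the function names, the exact pattern, and the extra task number 4001 are my own assumptions, and you would feed it branch names and closed task IDs from wherever you keep them.

```python
import re

# Assumed convention: <Product>-Task<work item id>, e.g. "ProductX-Task3967".
TASK_BRANCH = re.compile(r"(?P<product>[A-Za-z][A-Za-z0-9]*)-Task(?P<task_id>[0-9]+)")


def parse_task_branch(name):
    """Return (product, task_id) if the name follows the convention, else None."""
    match = TASK_BRANCH.fullmatch(name)
    if match is None:
        return None
    return match.group("product"), int(match.group("task_id"))


def branches_to_clean_up(branch_names, closed_task_ids):
    """List well-named task branches whose task has already been closed."""
    stale = []
    for name in branch_names:
        parsed = parse_task_branch(name)
        if parsed is not None and parsed[1] in closed_task_ids:
            stale.append(name)
    return stale


# Example with made-up data.
names = ["ProductX-Task3967", "ProductX-IWasHere95", "ProductX-Task4001"]
badly_named = [n for n in names if parse_task_branch(n) is None]
stale = branches_to_clean_up(names, closed_task_ids={3967})
```

Here `badly_named` picks out the branch that ignores the convention, and `stale` lists the branch whose task is closed and which is therefore a candidate for deletion once it has been left alone for a little while.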
Branch per Developer
The Branch per Developer strategy involves developers creating a branch when they start work on a version and maintaining that branch through a series of forward and reverse integrations over the course of a development cycle. Each task assigned to the developer during the cycle is worked on in their branch and is then integrated once the task is complete. When the next task is started it continues on the same branch.
There are a few advantages and disadvantages of using this approach. One disadvantage is that the longer the branch is active, the harder it is to synchronise it with its parent, so this strategy definitely relies on developers being very disciplined. Having said that, if you have ever worked with CVS you will know that some developers manage to keep their sandbox clean for extraordinary periods of time, and there is nothing stopping you from scratching the branch and creating another one.
Two advantages that I can see are that, compared to the Branch per Task strategy, far fewer branches are created, and once developers know what their branch is they are less likely to make a mistake and start working on one they shouldn’t. As a side effect it becomes possible to ease new team members into the project by allowing them to branch and throw away their code as they get familiar with the code-base.
Branch per Feature
Finally, let’s have a look at a more hierarchical approach to branching inside a development cycle, which I like to call Branch per Feature. Many organisations, including Microsoft, use this approach to separate the work of the various teams until they are ready to integrate.
With Branch per Feature each team creates a branch for the portion of code that they are responsible for in the development cycle. The team works on that branch until their code reaches a level of stability that it can be integrated with the main product. Integration would normally happen on significant milestones like just before a BETA release.
I actually used the diagram above in my second post on branching terminology (just in case it looks familiar). I think that Branch per Feature is probably my favorite branching strategy if I had to pick one, because it reflects the work breakdown structure of your average project and it is inclusive: developers and testers can all work on the same sub-branch and even produce a common build for that feature, which can help with communication about issues from smoke tests.
Here I have presented three similar but different options for branching to avoid source file contention. Which one you use, or whether you need to use one at all, will depend largely on the way you and your team want to work. If you are a one-man band there isn’t much point branching beyond what is required to support patching, but as the size of a team grows it gets harder to stop treading on people’s toes (especially when a physical barrier such as the Pacific or Indian Ocean is put in the way).