Linus On Branching Practices
				rocket22 writes "Not long ago Linus Torvalds made some comments about issues the kernel maintainers were facing while applying the 'feature branch' pattern. They are using Git (which is very strong with branching and merging) but still need to take care of the branching basics to avoid ending up with unstable branches due to unstable starting points. While most likely your team doesn't face the same issues that kernel development does, and even if you're using a different DVCS like Mercurial, it's worth taking a look at the description of the problem and the clear solution to be followed in order to avoid repeating the same mistakes. The same basics can be applied to every version control system with good merge tracking, so let's avoid religious wars and focus on the technical details."
RELIGIOUS? (Score:1)
Challenge Accepted!
Re:RELIGIOUS? (Score:4, Funny)
Re: (Score:1, Funny)
Oh yeah? My VCS [wikipedia.org] has over 30 years of continuous history!
Re: (Score:2)
You are incredulous that there is a possibility that choice of version control could lead to a holy war?
Man, you haven't developed much code then... even CVS versus Source Safe can lead to fisticuffs. Don't even mention Subversion or Perforce unless you're ready for a bit of a row. Some of us are old enough to have used RCS in our home folders.
This is serious business, and everybody has a feature set they feel th
Re: (Score:2)
Hence why it's a challenge to avoid religious wars over the issue?
I'm not sure who deserves the whoosh here, you or me.
Re: (Score:2)
Likely me.  ;-)
Re: (Score:1)
I find that neither emacs nor vi qualifies to be a proper editor; that simplifies discussions very much. More serious contestants for the title include nano and mcedit, but roughly any GUI editor wins from CLI editors if you ask me.
Re: (Score:2)
nano!
Yeah (Score:4, Funny)
> so let's avoid religious wars and focus on the technical details
Hahahaha... Good one!
Re:Yeah (Score:5, Funny)
I know. Linus is such a Linux Fanboy. It's so obvious.
Re:Yeah (Score:4, Funny)
him and his blanket
This all sounds complicated (Score:5, Insightful)
Which I imagine makes sense, as the kernel is very complicated from a dev standpoint.
For most projects I’ve been involved with, the path to success is keeping the trunk in a stable state, and using _that_ as the baseline. Dev code should never be in the trunk, imo; the trunk should always be in a ready-to-release (or proceed to formal testing, or whatever) state. Everyone branches from the trunk, everyone can update their branch to the latest trunk, and everyone merges back down into the trunk when it’s good and ready.
Resisting the temptation to make “quick fixes” in the trunk is also important. Additionally, dev platforms should be set up so the system can be run from any branch as easily as from the trunk (making it a pain to test out the system from a branch is a great way to ensure unstable code ends up in your trunk).
Obviously in the case of the kernel, they probably have branches off branches off branches, but I think for most reasonably sized projects, that shouldn’t be necessary.
Re: (Score:3, Insightful)
I think you actually restated the point that Linus made in the original thread. Which was: Don't branch and start new development from an unknown state.
For you, the stable baseline is equal to the trunk. For Linus, the stable baseline is equal to the labeled release build node.
Re: (Score:2, Insightful)
You merge several branches together into an "integration" branch, then test that and merge it to the trunk if it passes.
Re: (Score:2)
And don't branch off the integration branch for exactly the same reasons stated in the article.
The problem comes when stable releases are too infrequent. Changes start requiring features and fixes from other changes waiting to go to the trunk, and a programmer is tempted to branch from the integration branch to pull those in.
A better solution is to branch from a stable point on the trunk and merge the needed changes into that branch. This makes it clear exactly what dependencies exist and the required cha
Re: (Score:2)
In a past life I had a fairly straightforward strategy:
1. The trunk was always the next enhancement release--assumed unstable at all times.
2. When we were ready to test that release, we would branch the trunk.
3. From that point forward, new development would be done on the trunk again and only fixes would go into the release branch.
4. Once we had the release branch stable enough to release, we would tag it and cut a release.
5. A release branch would wind up with one or more "maintenance releases," so each t
Re: (Score:2)
The approach I tend to like is:
- merge the trunk back up into the branch
- do your "pre trunk commit" testing in the branch
- merge branch down into trunk
If things do get crazy, you can create an "integration branch"... but I think that can generally be avoided
Re:This all sounds complicated (Score:5, Informative)
He's also saying that everybody should branch from the exact same point along the branch or trunk. That way everybody has a set of diffs against the same baseline to merge back in.
If you always branch from trunk, then as more stuff gets added, you start from a different point than you might otherwise.
The specifically labeled "point in time" means that three separate changes can more readily be integrated as they'll be all from the exact same baseline.
If the trunk is ready for formal testing, and it affects your other branches, you have a harder time if you fix things and need to push them back into those branches.
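A minimal Python sketch of the shared-baseline point above. The commit names and the single-parent history are made up for illustration (real histories also contain merges). It only tries to show that branches cut from the same label carry just their own work as a diff, while a branch cut from the moving trunk also drags in whatever landed after the label.

```python
from typing import Dict, List, Optional

# Hypothetical history: each commit maps to its single parent (None = root).
PARENT: Dict[str, Optional[str]] = {
    "c1": None, "c2": "c1", "v1.0": "c2",  # "v1.0" is the labeled baseline
    "c4": "v1.0", "c5": "c4",              # trunk keeps moving after the label
    "a1": "v1.0", "a2": "a1",              # feature A, branched from the label
    "b1": "v1.0",                          # feature B, branched from the label
    "x1": "c5",                            # feature X, branched from "wherever"
}


def ancestors(commit: Optional[str]) -> List[str]:
    """Return the commit followed by all of its ancestors, newest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = PARENT[commit]
    return chain


def commits_since(baseline: str, tip: str) -> List[str]:
    """Commits that a merge of `tip` adds on top of `baseline`, oldest first.

    Raises ValueError if `baseline` is not an ancestor of `tip`.
    """
    chain = ancestors(tip)
    return list(reversed(chain[:chain.index(baseline)]))


print(commits_since("v1.0", "a2"))  # only feature A's own commits
print(commits_since("v1.0", "b1"))  # only feature B's own commit
print(commits_since("v1.0", "x1"))  # X plus the unlabeled trunk commits c4, c5
```

Features A and B can be reviewed and integrated against the same known state. Feature X cannot be separated from c4 and c5, which nobody labeled as stable.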
Re:This all sounds complicated (Score:5, Interesting)
There are two conflicting goals: to avoid breaking the main branch (trunk) and to get changes out to the other developers soon. A broken main branch wastes the time of other developers on the project. But integrating changes late has its own inefficiencies: Problems in the modifications will only be raised after the work is done. It is more likely for one set of modifications to conflict with another set if both are being developed in parallel for a longer time. Other developers might have to wait for a full set of changes to arrive while they only need a subset, or they might start merging the subset from each other's development branches, creating a confusing mix of versions.
Committing directly into trunk can be acceptable and even desirable depending on the project. It depends on how likely commits are to break the code: How many developers are there? How many mistakes do they make? (a combination of experience and carefulness) Is there decent test coverage before committing? How fragile is the code base; are there many unexpected side effects? And it depends on how much damage a broken main branch does: How long does it typically take to find and fix a problem? How modular is the code base: will a bug in one part be a nuisance to developers working on another part? And it also depends on how much there is to gain from early merging: Is the project in the start-up phase where it is likely that other developers are waiting for new core functionality, or is the code base mature and are most changes done on the edges of the program? Are all design decisions made before code is written or are developers doing design and implementation work at the same time?
Re:This all sounds complicated (Score:4, Interesting)
I go in the reverse.
Trunk is dev, branches are stable. We haven't had much trouble with this set up at all.
Re: (Score:3, Interesting)
We did this where I worked previously too. It was also the MO for the artists building the artwork for the game.
Your trunk is the "Main line", a boiling pot of all the changes, and it can change on a minute-by-minute basis right near crunch time. This is good because you fail early if your change is not compatible with other changes, instead of at the end of the day or whatever. This is very important for artwork.
The last known good build is tagged/labelled (or branch if you prefer) and was generated by an auto
Re: (Score:2)
> Trunk is dev, branches are stable. We haven't had much trouble with this set up at all.
I think the problem is that this does not really work when you have people all over the world trying to do different things on a huge base of code over different timescales. If everyone is working towards a single release date then this makes more sense as you can implement things like feature freezes.
In kernel development, though, there can be people working on a refinement that will take a very long period of time, so there will be several releases that go by in between. There will also be people starting
Re: (Score:2)
That's the model that ended up causing the "Source" engine to be named so.
From what I heard, near the end of the development of Half-Life 1, Valve had their "src" directory (their mainline) and wanted to make some more radical engine improvements. These improvements were too last-minute to make it into the game, so they created a branch for the "gold" version of Half-Life, called "goldsrc", which was only used to commit stable code and polish the game, and the experimental changes were committed to the trun
Re:This all sounds complicated (Score:4, Informative)
I hate to break it to you, but even if your trunk is clean, you will still have this problem in some other branch. Let's examine a very common situation where you have an interface being changed, one or more implementations of that interface, and one or more users of that interface. Developers are working simultaneously on both sides of that interface in order to meet a deadline.
Because of your clean trunk rule, none of the changes can be checked into the trunk until all of the changes are ready, but they still need to be shared among the people working on it, or they will have no idea if it is "good and ready." So those developers create their own branch, which of necessity is sometimes in a temporarily broken state. You might not think of it as a branch, if it's John's working directory and the "checkout" procedure is him emailing files around, but it's conceptually a branch nonetheless.
Linus is simply acknowledging that temporary brokenness is inevitable when multiple people integrate changes to the same code, and therefore whatever branch contains that messy integration should use tags to communicate the best branch points. I'm not saying keeping a clean trunk isn't a good idea, just that you have to deal with broken branch points one way or another, even if it's just John deciding when the best time is to email out the new header files to his team.
comment from original page (Score:2)
Seems reasonable to me... Don't know why this wouldn't solve the problem, or any other reason why it's not desirable.
Re:comment from original page (Score:5, Informative)
Yep, this is standard practice if your scm support knows what they're doing. The only reason it's not "desirable" to only branch off of stable, 'known-good' baselines is developer laziness. It can take more time setting up the branch, and sometimes that quick checkout-edit-checkin on the trunk is just SOOOO tempting as a shortcut. I see this a lot in groups working on new products, too - "it's never been released to production, so we'll just branch from wherever, and call it a day." Usually they grow out of this type of practice after they spend a few days untangling a mess they've created, but there are some die-hards who just hate having to deal with anybody else, and insist on doing their own thing.
This is why it's important to have:
1) Management / leadership that understands the value of proper configuration management, and expects good practices to be used;
2) Support for your SCM system that knows how to set up these practices and is empowered to enforce them;
3) Mature developers who understand that "fastest" isn't always "best";
(Full disclosure: part of my role in my current job involves ClearCase admin, and I've also worked with SVN, CVS, PVCS, and (shudder) VSS in varying capacities)
Re:comment from original page (Score:4, Insightful)
And those people get smacked on the knuckles with a ruler. If they keep failing to abide by your policies, you smack 'em on the ass with the ruler. If they keep going like that, you get rid of them.
There are very few things more destructive to a development team than some prima donna who won't follow the rules and procedures. In the long run, if they won't play by the rules laid down, they'll do more harm than good.
Source Code Management and "cowboys" can't really coexist if you want to be able to have maintainable software. I've seen someone who would apply changes to any old branch and more or less decree it was someone else's problem to get them onto main -- buh bye, if you're sabotaging the build process, we don't need you.
Re:comment from original page (Score:4, Insightful)
I agree, and if the choice were mine, there are some people I work with who would be pink-slipped immediately... but, politics at a large-ish company being what they are, it's a matter of demonstrating to managers that the actions are counter-productive and costing us time and money... then letting them draw the proper conclusions. In a well-run meritocracy, these people would be gone for violating the "No Asshole" rule.
The problem is, some of the managers are over-promoted cowboys themselves - I've heard, no exaggeration, the following from a manager when I was arguing for locking down one of our production systems because people kept making changes live: "I know it's good policy, but as soon as policy slows down my developers, the policy goes out the window."
The technical problems are easy. It's this political maneuvering that requires the patience of a saint.
Re:comment from original page (Score:5, Insightful)
Run. Run fast, run far.
If managers are going to support the notion of un-tracked changes on a production server in the name of getting things done, then eventually someone will be looking to lay blame for something that went horribly wrong.
Failure to understand why people have change procedures for live systems is pretty significant. And, depending on your industry... un-tracked fixes and tweaks can actually get you in legal trouble. Think Sarbanes-Oxley.
In almost any sane shop, failure to follow the change procedures can be grounds for immediate dismissal.
Re: (Score:2)
Most companies are not sane. I once worked for a forex provider where policies were not followed and never enforced by management. It was a reactive and chaotic environment, where engineering had direct access to production and build/release was responsible for production operations instead of an actual operations/MIS team. I blame this squarely on the youth culture in the IT world, where discipline is rare.
Re: (Score:2)
s/Source Code Management/Software Configuration Management/g
Re: (Score:3, Insightful)
I'm sorry, how does "automated testing of the main line via a CI tool after the changes are committed to the main line" assure that your main line stays stable?
"Virtually stable" is not "stable". When you work for a financial services firm whose livelihood depends on the market data and trading systems your team builds, "virtually stable" is nowhere near "stable" and doesn't even begin to approach "good enough".
Re: (Score:2)
> I'm sorry, how does "automated testing of the main line via a CI tool after the changes are committed to the main line" assure that your main line stays stable?
> "Virtually stable" is not "stable". When you work for a financial services firm whose livelihood depends on the market data and trading systems your team builds, "virtually stable" is nowhere near "stable" and doesn't even begin to approach "good enough".
How are you going to know that the mainline is stable unless you are going to test it?
How are you going to ensure the testing was complete and consistently applied unless you're going to automate it?
You NEED CI to assure the testing is done as part of the process on every mainline check-in. You NEED it because it runs the suites of tests which PROVE the main line is stable.
If your test suite doesn't prove the mainline is stable, then it's a fault of the test suite, not the CI / build / automated testing sy
Re: (Score:2)
> Yep, this is standard practice if your scm support knows what they're doing.
And I have yet to see it done right in practice. Especially with ClearCase. Every config spec I've seen includes /main/LATEST or similar, instead of working off of labels.
The closest I've seen was with Subversion, where the policy was to only branch off of the tags directory. But even then most people just worked off of trunk. It was a mess.
If you're using Subversion, the right way for a project of any decently large size is to only work off of a branch and only branch off a tag.
Re: (Score:3, Informative)
You can do it right with a 4-line config spec. The config spec needs to include that /main/LATEST clause at the bottom because new elements being added to the branch aren't labeled with the baseline you're branching from.
The config spec should take the form of:
element * CHECKEDOUT
element * .../branch/LATEST
element * BASELINE -mkbranch branch
element * /main/LATEST -mkbranch branch
The only time the /main/LATEST rule will ever be evaluated is if an element is added to the branch after the BASELINE is applied, 
Re: (Score:1)
element * CHECKEDOUT
element * .../branch/LATEST
element * BASELINE -mkbranch branch
element * /main/0 -mkbranch branch
Re: (Score:2)
If you're labeling properly, that'll have more or less the same effect - the only time you should be 'falling through' to the /main/X clause is if it's a new element added after label "BASELINE" was applied, and if that's the case, then /main/0 should be /main/LATEST unless you're adopting this config spec halfway through a dev cycle and scared you'll pick up things you don't intend to.
Re: (Score:1)
Let's say you had an uncaught labelling error on a source file (foo.c) when you created your baseline label. I know, you're checking for that, but just for the sake of argument, let's say it happened. With a /main/LATEST catch-all rule, you would be getting whatever version of foo.c happens to be /main/LATEST. When you run a build, you may or may not get an error, but you run the risk of introducing a bizarre bug in your build image, which cou 
Re: (Score:2)
Sure, you can do that if you have 5-10 commits per day. However, Linus merges on *average* around 100 change sets to the Linux kernel trunk every day and has been doing that for a long time now. You cannot expect to keep up that speed and also keep the trunk 100% stable all the time.
Creating feature branches from known-to-be-stable points is a much easier approach for everyone involved.
it's simply ignorance! (Score:2, Interesting)
The kernel devs don't do development on master! However, git's fast-forward merge will, by default, push development/intermediate commits onto master. Those intermediate commits are extremely useful for code inspection/code review and bisect-based debugging. They're not meant for starting a new dev branch, and that's why they're not tagged! There's nothing new or interesting in that article other than a bunch of stupid comments at the bottom. The whole thing smells like disguised advertising for PlasticSCM
Re: (Score:1)
I must admit I do not often work with multiple branches, but one problem I see here is that patches are kept in a branch for a very long time, because some developers do not like to call it stable. As soon as it is stable they should stop tinkering and get assigned something else, OR other people will see their bugs in stable... and they might just as well keep it in the branch.
Developers use tools without thought. News at 11 (Score:2)
Branch-then-merge is NASTY (Score:1, Insightful)
Even the best merge tools can't guarantee the logic of the merged code is correct no matter how stable/good your branch point is.
Re: (Score:2)
No single tool can guarantee the logic of any code is correct. Your only options in that department are mathematical proof, unit tests, integration tests, and field tests.
In other words, scm, like all other incredibly useful tools, does not constitute a silver bullet in software development.
Re: (Score:2)
And what is a company with multiple shipping releases of the same code supposed to do? Good merge tools with good engineers using them are their only hope. When a bug is discovered in 5.0 and it needs to be fixed in 5.1, 5.2 and 6.0 three merges are necessary, no mat
Re: (Score:2, Interesting)
Never underestimate the stupidity of some people. I've seen some VOBs get royally hosed, and it took a day or two of going through the version trees of individual elements to untangle their merge history. This was all due to two things: 1) OzPeter's Point [slashdot.org]; 2) lazy CM that didn't want to provide simple scripts and lock down a standard method for view/config-spec management.
Re: (Score:2)
+a million, spot on.
One of the guys I worked with at my first job working with ClearCase put it thus: "ClearCase is great, it's powerful and flexible. And it gives you plenty of rope to hang yourself with."
Re: (Score:2, Interesting)
> ClearCase identified and solved all these problems in the commercial world long before Free software. The problem is that the sort of people who do kernel work don't work for the sort of companies that can afford the ClearCase licensing fees - or at least *mostly* don't.
Well, I'll go with that if you expand the definition to include those whose companies went with ClearCase and who learned first hand how much of a royal PITA it is.
It remains that for many smaller organizations (or teams), something boring but 100% predictable and almost maintenance free like SVN gets the job done just fine. But it's not sexy/expensive...
Re: (Score:2)
SVN is neither.
Re: (Score:2)
> SVN is neither.
Ooh, tres trendy. In reality, SVN has powered a staggeringly large number of software projects, as did CVS before it. In the same way that 'ant' (or heck, even 'make') remains remarkably effective for many projects. When you look at 10-500 dev years in a project, your source code control and build maintenance costs can often be in the 10-20 hour range (over the project lifetime). Try that with ClearCase and its brethren. And yes, I've worked on several projects in the 3-10 million LOC range that did ju
Isn't this kind of obvious? (Score:3, Insightful)
The whole story seems to be summed up by: "Don't just branch from some random point. Wait until your code is stable and branch from that." and "Create these stable points from which you can branch as often as is practical". I'm sorry but I've never been tempted to branch from an unstable point, and I'd be horrified if anyone on my team tried to do so.
As for only adding features to a stable release I find that depends on the size, complexity and maturity of the project. Early on nothing is feature complete and everyone tends to work on an unstable head/trunk/master/whatever-your-scm-calls-it. Once development has settled down and there's been a release, it's much more controlled and people do tend to add their code from a stable point.
I'm sorry I just don't see any life changing revelations here.
Re: (Score:2)
> I'm sorry but I've never been tempted to branch from an unstable point, and I'd be horrified if anyone on my team tried to do so.
What?
This would happen just by checking out the latest version and starting to write some code. How is that a practise that you should be "horrified" by?
Re: (Score:2)
> > I'm sorry but I've never been tempted to branch from an unstable point, and I'd be horrified if anyone on my team tried to do so.
> What?
> This would happen just by checking out the latest version and starting to write some code. How is that a practise that you should be "horrified" by?
What on earth are you talking about? No, you do not automatically branch every time you check in or check out code. I think you need to go back and read up on the concept of a branch.
Re: (Score:2)
And you need to read up on the concept of git.
Seriously, this isn't SVN.
Typical work flow in git is:
1) Clone remote repository
2) Make a branch of the clone's master (origin/master, say) and call it "master".
3) Add your commits to your branch.
4) Continue until you're happy.
5) Merge your changes with any changes that other people have done.
6) Push your changes onto the remote server.
What Linus is saying is that step 5 can cause trouble. Instead of making a branch of clone's master, you should use the lates
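A small Python sketch of the choice made in step 2 of the workflow above: start from the tip of master, or from the most recent tagged commit. The commit ids, the tag names, and the `start_point` helper are all hypothetical; the history is modeled as a plain oldest-first list rather than read from a real repository.

```python
from typing import List, Optional, Tuple

Commit = Tuple[str, Optional[str]]  # (commit id, tag name or None)


def start_point(history: List[Commit], use_latest_tag: bool = True) -> str:
    """Pick the commit a new branch should start from.

    `history` is ordered oldest first. With use_latest_tag=True the newest
    tagged commit wins; otherwise the tip is used, whatever state it is in.
    """
    if not use_latest_tag:
        return history[-1][0]
    for commit_id, tag in reversed(history):
        if tag is not None:
            return commit_id
    raise ValueError("no tagged commit in history")


# Hypothetical master: two tagged releases, then untagged work in progress.
master = [("c1", "v1.0"), ("c2", None), ("c3", "v1.1"), ("c4", None), ("c5", None)]

print(start_point(master))                        # the commit tagged v1.1
print(start_point(master, use_latest_tag=False))  # the untagged tip
```

In git itself the equivalent is passing a tag as the start point when creating the branch, as in `git checkout -b my-feature <tag>`, instead of branching from whatever master currently points at.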
Re: (Score:2)
> And you need to read up on the concept of git.
> Seriously, this isn't SVN.
I've used git exactly once, and that was last weekend to pull down Spring Security source. So no argument I need more time in git to comment on git specific process. But did you miss the article summary stating that this should be discussed regardless of the source control flavour?
> Typical work flow in git is...
> What Linus is saying is that step 5 can cause trouble. Instead of making a branch of clone's master, you should use the latest tag instead.
This is not the way you'd do it on any small or medium pro
Re: (Score:2)
Unless you are using a VCS that does mandatory per file locking, then you are conceptually branching every time you check out code, since before you check the code back in, somebody else could come along and check in other changes.
Granted, most version control systems don't label a working directory as a branch. The only real difference is that a working directory does not have a series of commits while a branch does. Of course, even that is not much of a difference, since good practice would be to exami
Re: (Score:2)
> Unless you are using a VCS that does mandatory per file locking, then you are conceptually branching every time you check out code, since before you check the code back in, somebody else could come along and check in other changes.
Sorry, but no. "Conceptually" branches and tags have very specific properties that aren't fulfilled by checking out. A key feature being that you can go back to a precise version of the whole code base which you have labelled (hopefully clearly).
Re: (Score:2)
That depends very much on the VCS. Many modern VCSs have the property that any complete checkout has a distinct revision identifier, meaning that you can always go back to the same version of the entire codebase. Furthermore, there are version control systems out there that do not permit whole-repository branching, but do support per-file or per-directory (non-recursive) branching. Those are rather esoteric systems, but they do exist.
Re: (Score:2)
I don't know, I never really put much thought into branch policy, namely, what exactly does the "master" branch do and is it necessarily always a safe place to base new revisions? Practice seems to vary.
Also this seems to bring into question how you recognize "good" commits in the system versus "in progress" ones. If I were adding a new feature to source in a repository, would I necessarily start from a release tag, and then merge commits from master since then to get everything that changed since the rel
Should have linked to the actual article (Score:5, Informative)
Heisenberg as applied to SW development (Score:5, Insightful)
Some devs know where STABLE is located, some devs know what direction their new code is going, and a successful merge is where a dev violates the Heisenberg Uncertainty Principle and accomplishes both at the same time.
branch/merge is sux (Score:1)
Re: (Score:2, Interesting)
Do you really want to do all the validation testing every time you put back to the trunk? Including Installation testing? There are only so many scenarios you can catch with test-first development. The rest are usually discovered by the testers. That's why they still have jobs and why you can actually accomplish anything in a CI environment. There will always be the heroic tales of development teams that are on version 11000 on the trunk and have never busted a customer. Maybe you are him/her? >.
Re: (Score:1)
BAH! No edit button !
Yes, good SCM teams let you do this and protect you from the worst of your merge nightmares.
Anonymous Coward (Score:1, Funny)
From TFA:
"Vaselines are key"
I totally agree.
The original email complained about failed bisect (Score:3, Interesting)
git allows you to bisect from known-good and known-bad kernels to try and find the source of the problem. The original complaint was that some of the intermediate changes don't build.
The problem here is not necessarily branching/merging, but that maintainers and developers do something along the lines of "commit bad change, notice problem, commit fix" in their own private branch. Then, rather than cleaning up their private branch, they let that whole history get merged into the main kernel tree.
This has the advantage of showing more details of development, but has the downside that a bisect that hits the "bad change" commit won't build and will require some manual action to select a "nearby" commit that will build.
As I view it, it's less about rebase/merge and more about developers/maintainers being more diligent about keeping their trees clean before merging back to the mainline.
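To make the failed-bisect complaint concrete, here is a rough Python sketch of bisecting a linear history in which some commits do not build. It is a simplified model, not git's actual algorithm: the commit names, the `check` callback, and its "good"/"bad"/"skip" verdicts are assumptions chosen for the example.

```python
from typing import Callable, List


def first_bad_candidates(commits: List[str],
                         check: Callable[[str], str]) -> List[str]:
    """Narrow down the first bad commit in a linear history.

    commits[0] is known good and commits[-1] is known bad.
    check(commit) returns "good", "bad", or "skip" (the commit does not build).
    Returns one commit when the answer is unambiguous, or every remaining
    suspect when unbuildable commits block further narrowing.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Try the midpoint first, then its neighbours: the manual
        # "select a nearby commit that will build" step.
        for i in sorted(range(lo + 1, hi), key=lambda i: abs(i - mid)):
            verdict = check(commits[i])
            if verdict == "good":
                lo = i
                break
            if verdict == "bad":
                hi = i
                break
        else:
            break  # nothing between lo and hi can be tested
    return commits[lo + 1:hi + 1]


commits = [f"c{i}" for i in range(8)]


def clean_history(commit: str) -> str:
    # Hypothetical project where the bug arrives in c5 and everything builds.
    return "good" if int(commit[1:]) < 5 else "bad"


def messy_history(commit: str) -> str:
    # Same project, but c4 is a "commit bad change" that does not build.
    return "skip" if commit == "c4" else clean_history(commit)


print(first_bad_candidates(commits, clean_history))  # a single culprit
print(first_bad_candidates(commits, messy_history))  # culprit is ambiguous
```

With a clean history the search lands on one commit. With the unbuildable c4 sitting right before the bug, the best it can report is that the problem came in with c4 or c5, which is the situation `git bisect skip` leaves you in when the broken intermediate commits are merged as-is.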