Programming Linux

Linus On Branching Practices

rocket22 writes "Not long ago Linus Torvalds made some comments about issues the kernel maintainers were facing while applying the 'feature branch' pattern. They are using Git (which is very strong with branching and merging) but still need to take care of the branching basics to avoid ending up with unstable branches due to unstable starting points. While most likely your team doesn't face the same issues that kernel development does, and even if you're using a different DVCS like Mercurial, it's worth taking a look at the description of the problem and the clear solution to be followed in order to avoid repeating the same mistakes. The same basics can be applied to every version control system with good merge tracking, so let's avoid religious wars and focus on the technical details."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    [...] so let's avoid religious wars and focus on the technical details.

    Challenge Accepted!
    • by Anonymous Coward on Tuesday November 30, 2010 @12:22PM (#34390676)
      Agreed. Except when it comes to mercurial which is the sux0rs.
    • [...] so let's avoid religious wars and focus on the technical details.

      Challenge Accepted!

      You are incredulous that there is a possibility that the choice of version control could lead to a holy war?

      Man, you haven't developed much code then ... even CVS versus Source Safe can lead to fisticuffs. Don't even mention Subversion or Perforce unless you're ready for a bit of a row. Some of us are old enough to have used RCS in our home folders.

      This is serious business, and everybody has a feature set they feel th

      • Hence why it's a challenge to avoid religious wars over the issue?

        I'm not sure who deserves the whoosh here, you or me.

      • I find that neither emacs nor vi qualifies to be a proper editor; that simplifies discussions very much. More serious contestants for the title include nano and mcedit, but roughly any GUI editor wins from CLI editors if you ask me.

  • Yeah (Score:4, Funny)

    by truthsearch ( 249536 ) on Tuesday November 30, 2010 @12:23PM (#34390696) Homepage Journal

    so let's avoid religious wars and focus on the technical details

    Hahahaha... Good one!

  • by Anrego ( 830717 ) * on Tuesday November 30, 2010 @12:24PM (#34390716)

    Which I imagine makes sense, as the kernel is very complicated from a dev standpoint.

    For most projects I’ve been involved with, the path to success is keeping the trunk in a stable state, and using _that_ as the baseline. Dev code should never be in the trunk imo... the trunk should always be in a ready to release (or proceed to formal testing, or whatever) state. Everyone branches from the trunk.. everyone can update their branch to the latest trunk.. and everyone merges back down into the trunk when it’s good and ready.

    Resisting the temptation to make “quick fixes” in the trunk is also important. Additionally, dev platforms should be set up so the system can be run from any branch as easily as from the trunk (making it a pain to test out the system from a branch is a great way to ensure unstable code ends up in your trunk).

    Obviously in the case of the kernel.. they probably have branches off branches off branches, but I think for most reasonably sized projects, that shouldn’t be necessary.

    • Re: (Score:3, Insightful)

      I think you actually restated the point that Linus made in the original thread. Which was: Don't branch and start new development from an unknown state.

      For you, the stable baseline is equal to the trunk. For Linus, the stable baseline is equal to the labeled release build node.

    • by gstoddart ( 321705 ) on Tuesday November 30, 2010 @01:05PM (#34391472) Homepage

      For most projects I’ve been involved with, the path to success is keeping the trunk in a stable state, and using _that_ as the baseline. Dev code should never be in the trunk imo... the trunk should always be in a ready to release (or proceed to formal testing, or whatever) state. Everyone branches from the trunk.. everyone can update their branch to the latest trunk.. and everyone merges back down into the trunk when it’s good and ready.

      He's also saying that everybody should branch from the exact same point along the branch or trunk. That way everybody has a set of diffs against the same baseline to merge back in.

      If you always branch from trunk, then as more stuff gets added, you start from a different point than you might otherwise.

      The specifically labeled "point in time" means that three separate changes can more readily be integrated as they'll be all from the exact same baseline.

      If the trunk is ready for formal testing, and it affects your other branches, you have a harder time if you fix things and need to push them back into those branches.
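To make the "same baseline" point concrete, here is a minimal Python sketch. It assumes a toy history in which every commit has exactly one parent; the commit names (c0, v1.0, t1, and so on) are made up for illustration and do not come from the kernel tree. It is a simplified model, not how any particular VCS computes a merge base.

```python
# Hypothetical toy history: each commit maps to its single parent (None = root).
parents = {
    "c0": None,
    "v1.0": "c0",   # tagged, known-good baseline
    "t1": "v1.0",   # later, untagged trunk work
    "t2": "t1",
    "a1": "v1.0",   # feature A, branched from the tag
    "b1": "v1.0",   # feature B, branched from the tag
    "x1": "t1",     # feature X, branched from wherever trunk was that day
    "y1": "t2",     # feature Y, branched from a later trunk tip
}


def ancestors(commit):
    """Return the commit followed by all of its ancestors, nearest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = parents[commit]
    return chain


def merge_base(a, b):
    """Nearest common ancestor of two commits in this single-parent model."""
    seen = set(ancestors(a))
    for commit in ancestors(b):
        if commit in seen:
            return commit
    return None


print(merge_base("a1", "b1"))  # v1.0
print(merge_base("x1", "y1"))  # t1
```

In this toy model, features A and B both diff against the tagged v1.0, while X and Y share only t1, an untagged trunk state that nobody declared stable.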

    • by MtHuurne ( 602934 ) on Tuesday November 30, 2010 @01:32PM (#34391928) Homepage
      I think that the development process should be selected to match the particular project and the stage it is in. There is no perfect process that applies to every project, or even to one project forever. A team of 4 in a single room working on a demo for a new product idea will have very different requirements from a team of 20 working in two locations on an improved version of a product that is already in production...

      There are two conflicting goals: to avoid breaking the main branch (trunk) and to get changes out to the other developers soon. A broken main branch wastes the time of other developers on the project. But integrating changes late has its own inefficiencies: Problems in the modifications will only be raised after the work is done. It is more likely for one set of modifications to conflict with another set if both are being developed in parallel for a longer time. Other developers might have to wait for a full set of changes to arrive while they only need a subset, or they might start merging the subset from each other's development branches, creating a confusing mix of versions.

      Committing directly into trunk can be acceptable and even desirable depending on the project. It depends on how likely commits are to break the code: How many developers are there? How many mistakes do they make? (a combination of experience and carefulness) Is there decent test coverage before committing? How fragile is the code base; are there many unexpected side effects? And it depends on how much damage a broken main branch does: How long does it typically take to find and fix a problem? How modular is the code base: will a bug in one part be a nuisance to developers working on another part? And it also depends on how much there is to gain from early merging: Is the project in the start-up phase where it is likely that other developers are waiting for new core functionality, or is the code base mature and are most changes done on the edges of the program? Are all design decisions made before code is written or are developers doing design and implementation work at the same time?
    • by i_ate_god ( 899684 ) on Tuesday November 30, 2010 @01:42PM (#34392092)

      I go in the reverse.

      Trunk is dev, branches are stable. We haven't had much trouble with this set up at all.

      • Re: (Score:3, Interesting)

        by Anonymous Coward

        We did this where I worked previously too. It was also the MO for the artists building the artwork for the game.

        Your trunk is the "Main line", a boiling pot of all the changes, and it can change on a minute-by-minute basis right near crunch time. This is good because you fail early if your change is not compatible with other changes instead of at the end of the day or whatever. This is very important for artwork.

        The last known good build is tagged/labelled (or branch if you prefer) and was generated by an auto

      • Trunk is dev, branches are stable. We haven't had much trouble with this set up at all.

        I think the problem is that this does not really work when you have people all over the world trying to do different things on a huge base of code over different timescales. If everyone is working towards a single release date then this makes more sense as you can implement things like feature freezes.

        In kernel development, though, there can be people working on a refinement that will take a very long period of time, so several releases will go by in between. There will also be people starting

      • by mgiuca ( 1040724 )

        That's the model that ended up causing the "Source" engine to be named so.

        From what I heard, near the end of the development of Half-Life 1, Valve had their "src" directory (their mainline) and wanted to make some more radical engine improvements. These improvements were too last-minute to make it into the game, so they created a branch for the "gold" version of Half-Life, called "goldsrc", which was only used to commit stable code and polish the game, and the experimental changes were committed to the trun

    • by kbielefe ( 606566 ) <karl@bielefeldt.gmail@com> on Tuesday November 30, 2010 @02:00PM (#34392420)

      I hate to break it to you, but even if your trunk is clean, you will still have this problem in some other branch. Let's examine a very common situation where you have an interface being changed, one or more implementations of that interface, and one or more users of that interface. Developers are working simultaneously on both sides of that interface in order to meet a deadline.

      Because of your clean trunk rule, none of the changes can be checked into the trunk until all of the changes are ready, but they still need to be shared among the people working on it, or they will have no idea if it is "good and ready." So those developers create their own branch, which of necessity is sometimes in a temporarily broken state. You might not think of it as a branch, if it's John's working directory and the "checkout" procedure is him emailing files around, but it's conceptually a branch nonetheless.

      Linus is simply acknowledging that temporary brokenness is inevitable when multiple people integrate changes to the same code, and therefore whatever branch contains that messy integration should use tags to communicate the best branch points. I'm not saying keeping a clean trunk isn't a good idea, just that you have to deal with broken branch points one way or another, even if it's just John deciding when the best time is to email out the new header files to his team.

  • Or you do what we do: keep trunk pristine - no development should EVER occur in your trunk and it should always be possible to push a stable release from it.
    Seems reasonable to me... Don't know why this wouldn't solve the problem, or any other reason why it's not desirable.
    • by Americano ( 920576 ) on Tuesday November 30, 2010 @12:55PM (#34391244)

      Yep, this is standard practice if your scm support knows what they're doing. The only reason it's not "desirable" to only branch off of stable, 'known-good' baselines is developer laziness. It can take more time setting up the branch, and sometimes that quick checkout-edit-checkin on the trunk is just SOOOO tempting as a shortcut. I see this a lot in groups working on new products, too - "it's never been released to production, so we'll just branch from wherever, and call it a day." Usually they grow out of this type of practice after they spend a few days untangling a mess they've created, but there are some die-hards who just hate having to deal with anybody else, and insist on doing their own thing.

      This is why it's important to have:
      1) Management / leadership that understands the value of proper configuration management, and expects good practices to be used;
      2) Support for your SCM system that knows how to set up these practices and is empowered to enforce them;
      3) Mature developers who understand that "fastest" isn't always "best";

      (Full disclosure: part of my role in my current job involves ClearCase admin, and I've also worked with svn, cvs, pvcs, and (shudder) vss in varying capacities)

      • by gstoddart ( 321705 ) on Tuesday November 30, 2010 @01:15PM (#34391620) Homepage

        Usually they grow out of this type of practice after they spend a few days untangling a mess they've created, but there are some die-hards who just hate having to deal with anybody else, and insist on doing their own thing.

        And those people get smacked on the knuckles with a ruler. If they keep failing to abide by your policies, you smack 'em on the ass with the ruler. If they keep going like that, you get rid of them.

        There are very few things more destructive to a development team than some prima donna who won't follow the rules and procedures. In the long run, if they won't play by the rules laid down, they'll do more harm than good.

        Source Code Management and "cowboys" can't really coexist if you want to be able to have maintainable software. I've seen someone who would apply changes to any old branch and more or less decree it was someone else's problem to get them onto main -- buh bye, if you're sabotaging the build process, we don't need you.

        • by Americano ( 920576 ) on Tuesday November 30, 2010 @01:29PM (#34391876)

          I agree, and if the choice were mine, there are some people I work with who would be pink-slipped immediately... but, politics at a large-ish company being what they are, it's a matter of demonstrating to managers that the actions are counter-productive and costing us time and money... then letting them draw the proper conclusions. In a well-run meritocracy, these people would be gone for violating the "No Asshole" rule.

          The problem is, some of the managers are over-promoted cowboys themselves - I've heard, no exaggeration, the following from a manager when I was arguing for locking down one of our production systems because people kept making changes live: "I know it's good policy, but as soon as policy slows down my developers, the policy goes out the window."

          The technical problems are easy. It's this political maneuvering that requires the patience of a saint.

          • by gstoddart ( 321705 ) on Tuesday November 30, 2010 @01:34PM (#34391982) Homepage

            I've heard, no exaggeration, the following from a manager when I was arguing for locking down one of our production systems because people kept making changes live: "I know it's good policy, but as soon as policy slows down my developers, the policy goes out the window."

            Run. Run fast, run far.

            If managers are going to support the notion of un-tracked changes on a production server in the name of getting things done, then eventually someone will be looking to lay blame for something that went horribly wrong.

            Failure to understand why people have change procedures for live systems is pretty significant. And, depending on your industry ... un-tracked fixes and tweaks can actually get you in legal trouble. Think Sarbanes-Oxley.

            In almost any sane shop, failure to follow the change procedures can be a grounds for immediate dismissal.

            • by ishobo ( 160209 )

              In almost any sane shop, failure to follow the change procedures can be a grounds for immediate dismissal.

              Most companies are not sane. I once worked for a forex provider where policies were not followed and never enforced by management. It was a reactive and chaotic environment, where engineering had direct access to production and build/release was responsible for production operations versus an actual operations/mis team. I blame this squarely on the youth culture in the IT world, where discipline is rare.

        • by wurp ( 51446 )

          s/Source Code Management/Software Configuration Management/g

      • Shouldn't your continuous integration's regression suite assure that mainline is virtually stable? Why isn't the call for better regression tests?
        • Re: (Score:3, Insightful)

          by Americano ( 920576 )

          I'm sorry, how does "automated testing of the main line via a CI tool after the changes are committed to the main line" assure that your main line stays stable?

          "Virtually stable" is not "stable". When you work for a financial services firm whose livelihood depends on the market data and trading systems your team builds, "virtually stable" is nowhere near "stable" and doesn't even begin to approach "good enough".

          • Re: (Score:2, Insightful)

            It's not the CI tool that assures that mainline is stable; it's the quality of the regressions.
          • by ebuck ( 585470 )

            I'm sorry, how does "automated testing of the main line via a CI tool after the changes are committed to the main line" assure that your main line stays stable?

            "Virtually stable" is not "stable". When you work for a financial services firm whose livelihood depends on the market data and trading systems your team builds, "virtually stable" is nowhere near "stable" and doesn't even begin to approach "good enough".

            How are you going to know that the mainline is stable unless you are going to test it?

            How are you going to ensure the testing was complete and consistently applied unless you're going to automate it?

            You NEED CI to assure the testing is done as part of the process on every mainline check-in. You NEED it because it runs the suites of tests which PROVE the main line is stable.

            If your test suite doesn't prove the mainline is stable, then it's a fault of the test suite, not the CI / build / automated testing sy

      • Yep, this is standard practice if your scm support knows what they're doing.

        And I have yet to see it done right in practice. Especially with ClearCase. Every config spec I've seen includes /main/LATEST or similar, instead of working off of labels.

        The closest I've seen was with Subversion, where the policy was to only branch off of the tags directory. But even then most people just worked off of trunk. It was a mess.

        If you're using Subversion, the right way for a project of any decently large size is to only work off of a branch and only branch off a tag.

        • Re: (Score:3, Informative)

          by Americano ( 920576 )

          You can do it right with a 4-line config spec. The config spec needs to include that /main/LATEST clause at the bottom because new elements being added to the branch aren't labeled with the baseline you're branching from.

          The config spec should take the form of:

          element * CHECKEDOUT
          element * .../branch/LATEST
          element * BASELINE -mkbranch branch
          element * /main/LATEST -mkbranch branch

          The only time the /main/LATEST rule will ever be evaluated is if an element is added to the branch after the BASELINE is applied,

          • If you want to be strict about it, you could always replace the /main/LATEST rule with a /main/0 rule, like so:

            element * CHECKEDOUT
            element * .../branch/LATEST
            element * BASELINE -mkbranch branch
            element * /main/0 -mkbranch branch

            • If you're labeling properly, that'll have more or less the same effect - the only time you should be 'falling through' to the /main/X clause is if it's a new element added after label "BASELINE" was applied, and if that's the case, then /main/0 should be /main/LATEST unless you're adopting this config spec halfway through a dev cycle and scared you'll pick up things you don't intend to.
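The fall-through behavior discussed in these config specs can be sketched in Python. This is a simplified, hypothetical model of first-match rule selection that only mirrors the order of the four rules quoted above; it is not the ClearCase engine, and the element dictionaries and their keys are invented for illustration.

```python
def select_version(element, fallback="/main/LATEST"):
    """Return (rule, version) for the first rule that matches the element.

    `element` is a hypothetical dict with a "main" list of version numbers
    and optional "checked_out", "branch", and "baseline" entries.
    """
    if element.get("checked_out") is not None:
        return "CHECKEDOUT", element["checked_out"]
    if element.get("branch"):                  # versions exist on the dev branch
        return ".../branch/LATEST", element["branch"][-1]
    if element.get("baseline") is not None:    # version carrying the label
        return "BASELINE", element["baseline"]
    main = element["main"]                     # nothing matched: catch-all rule
    if fallback == "/main/LATEST":
        return fallback, main[-1]
    return fallback, main[0]


labeled = {"main": [0, 1, 2, 3], "baseline": 2}
in_progress = {"main": [0, 1, 2, 3], "baseline": 2, "branch": [0, 1]}
unlabeled = {"main": [0, 1, 2, 3]}             # label missing on this element

print(select_version(labeled))               # ('BASELINE', 2)
print(select_version(in_progress))           # ('.../branch/LATEST', 1)
print(select_version(unlabeled))             # ('/main/LATEST', 3)
print(select_version(unlabeled, "/main/0"))  # ('/main/0', 0)
```

In this model only an element without the label ever reaches the catch-all rule. With /main/LATEST it silently picks up whatever is newest on main, while /main/0 pins it to version 0.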

              • Agreed, that's why I said "If you want to be strict about it".

                Let's say you had an uncaught labelling error on a source file (foo.c) when you create your baseline label. I know, you're checking for that, but just for the sake of argument, let's say it happened. With a /main/LATEST catch-all rule, you would be getting whatever version of foo.c happens to be /main/LATEST. When you run a build, you may or may not get an error, but you run the risk of introducing a bizarre bug in your build image, which cou

    • Sure, you can do that if you have 5-10 commits per day. However, Linus merges on *average* around 100 change sets to the Linux kernel trunk every day and has been doing that for a long time now. You can not expect to keep both that speed and also keep the trunk 100% stable all the time.

      Creating feature trunks from known-to-be-stable points is a much easier approach for everyone involved.

    • by bogolisk ( 18818 )

      The kernel devs don't do development on master! However, git's fast-forward merge will, by default, push development/intermediate commits onto master. Those intermediate commits are extremely useful for code inspection/code review and bisect-based debugging. They are not meant for starting a new dev branch, and that's why they're not tagged! There's nothing new or interesting in that article other than a bunch of stupid comments at the bottom. The whole thing smells like disguised advertising for PlasticSCM

    • by leuk_he ( 194174 )

      I must admit I do not often work with multiple branches, but one problem I see here is that patches are kept in a branch for a very long time, because some developers do not like it to be called stable. As soon as it is stable they should stop tinkering and get assigned something else, OR other people will see their bugs in stable... and they could just as well keep it in the branch.

  • That's what it amounts to. Developers are happily branching and branching off branches with no concern for whether what they branched off was stable in the first place.
  • by Anonymous Coward

    Even the best merge tools can't guarantee the logic of the merged code is correct no matter how stable/good your branch point is.

    • No single tool can guarantee the logic of any code is correct. Your only options in that department are mathematical proof, unit tests, integration tests, and field tests.

      In other words, scm, like all other incredibly useful tools, does not constitute a silver bullet in software development.

    • Not always. It's quite useful if your group is doing potentially destabilizing new work and you still need to keep up with bug fixes in a release branch. What's the alternative, a gazillion "#ifdef MY_NEW_FEATURE"s, twiddled makefiles, etc?

      And what is a company with multiple shipping releases of the same code supposed to do? Good merge tools with good engineers using them are their only hope. When a bug is discovered in 5.0 and it needs to be fixed in 5.1, 5.2 and 6.0 three merges are necessary, no mat
  • by syousef ( 465911 ) on Tuesday November 30, 2010 @12:49PM (#34391128) Journal

    The whole story seems to be summed up by: "Don't just branch from some random point. Wait until your code is stable and branch from that." and "Create these stable points from which you can branch as often as is practical". I'm sorry but I've never been tempted to branch from an unstable point, and I'd be horrified if anyone on my team tried to do so.

    As for only adding features to a stable release I find that depends on the size, complexity and maturity of the project. Early on nothing is feature complete and everyone tends to work on an unstable head/trunk/master/whatever-your-scm-calls-it. Once development has settled down and there's been a release, it's much more controlled and people do tend to add their code from a stable point.

    I'm sorry I just don't see any life changing revelations here.

    • > I'm sorry but I've never been tempted to branch from an unstable point, and I'd be horrified if anyone on my time tried to do so.

      What?

      This would happen just checking out the latest version, then start writing some code. How is that a practise that you should be "horrified" by ?

      • by syousef ( 465911 )

        > I'm sorry but I've never been tempted to branch from an unstable point, and I'd be horrified if anyone on my time tried to do so.

        What?

        This would happen just checking out the latest version, then start writing some code. How is that a practise that you should be "horrified" by ?

        What on earth are you talking about? No, you do not automatically branch every time you check in or check out code. I think you need to go back and read up on the concept of a branch.

        • And you need to read up on the concept of git.

          Seriously, this isn't SVN.

          Typical work flow in git is:

          1) Clone remote repository
          2) Make a branch of the clone's master (origin/master), say, and call it "master".
          3) Add your commits to your branch.
          4) Continue until you're happy.
          5) Merge your changes with any changes that other people have done.
          6) Push your changes onto the remote server.
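Step 2 starts from whatever origin/master happens to be at clone time. As a rough Python sketch of the alternative raised in this thread, the function below picks the newest tagged commit as the starting point instead of the tip. The history list, commit names, and tag names are all hypothetical, and a real repository history is a graph, not a flat list.

```python
# Hypothetical linear history of master, oldest first. Tags mark the points
# a maintainer has declared known-good; everything else is in-between work.
history = ["c1", "c2", "c3", "c4", "c5", "c6"]
tags = {"c2": "v1.0", "c4": "v1.1-rc1"}


def branch_point(history, tags):
    """Return (commit, tag) for the newest tagged commit, or None if untagged."""
    for commit in reversed(history):
        if commit in tags:
            return commit, tags[commit]
    return None


print(history[-1])                  # c6
print(branch_point(history, tags))  # ('c4', 'v1.1-rc1')
```

Here the tip is c6, which may sit in the middle of other people's unfinished integration, while the tagged c4 is the last point someone vouched for.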

          What Linus is saying is that step 5 can cause trouble. Instead of making a branch of clone's master, you should use the lates

          • by syousef ( 465911 )

            And you need to read up on the concept of git.

            Seriously, this isn't SVN.

            I've used git exactly once, and that was last weekend to pull down Spring Security source. So no argument I need more time in git to comment on git specific process. But did you miss the article summary stating that this should be discussed regardless of the source control flavour?

            Typical work flow in git is...

            What Linus is saying is that step 5 can cause trouble. Instead of making a branch of clone's master, you should use the latest tag instead. This is not the way you'd do it on any small or medium pro

        • by Tacvek ( 948259 )

          Unless you are using a VCS that does mandatory per file locking, then you are conceptually branching every time you check out code, since before you check the code back in, somebody else could come along and check in other changes.

          Granted that most version control systems don't label a working directory as a branch. The only real difference is that a working directory does not have a series of commits while a branch does. Of course, even that is not much of a difference, since good practice would be to exami

          • by syousef ( 465911 )

            Unless you are using a VCS that does mandatory per file locking, then you are conceptually branching every time you check out code, since before you check the code back in, somebody else could come along and check in other changes.

            Sorry, but no. "Conceptually" branches and tags have very specific properties that aren't fulfilled by checking out. A key feature being that you can go back to a precise version of the whole code base which you have labelled (hopefully clearly).

            • by Tacvek ( 948259 )

              That depends very much on the VCS. Many modern VCSs have the property that any complete checkout has a distinct revision identifier, meaning that you can always go back to the same version of the entire codebase. Furthermore, there are version control systems out there that do not permit whole-repository branching, but do support per-file or per-directory (non-recursive) branching. Those are rather esoteric systems, but they do exist.

    • I don't know, I never really put much thought into branch policy, namely, what exactly does the "master" branch do and is it necessarily always a safe place to base new revisions? Practice seems to vary.

      Also this seems to bring into question how you recognize "good" commits in the system versus "in progress" ones. If I were adding a new feature to source in a repository, would I necessarily start from a release tag, and then merge commits from master since then to get everything that changed since the rel

  • by Shandalar ( 1152907 ) on Tuesday November 30, 2010 @12:50PM (#34391162)
    Here is the actual article that the submitter should have linked to. [lkml.org] It's Linus's post. Instead, the submitter linked to his or her advert site, which is a blog that has ads which hawk their own, non-git source control system, all of which you get to read before you are given the link to Linus's actual post.
  • by vlm ( 69642 ) on Tuesday November 30, 2010 @12:53PM (#34391220)

    Some devs know where STABLE is located, some devs know what direction their new code is going, and a successful merge is where a dev violates the Heisenberg Uncertainty Principle and accomplishes both at the same time.

  • Branching and merging is a huge waste of time. I'm not sure why continuous integration is considered just a delaying of the problem. With a CI/streaming model merging happens continuously, not all at one time while trying to hit a moving target. And if CI is not allowed to break, no one has any problems downstream.
    • Re: (Score:2, Interesting)

      by Mordstrom ( 1285984 )

      Do you really want to do all the validation testing every time you put back to the trunk? Including Installation testing? There are only so many scenarios you can catch with test-first development. The rest are usually discovered by the testers. That's why they still have jobs and why you can actually accomplish anything in a CI environment. There will always be the heroic tales of development teams that are on version 11000 on the trunk and have never busted a customer. Maybe you are him/her? >.

      • BAH! No edit button !

        Yes, good SCM teams let you do this and protect you from the worst of your merge nightmares.

      • There are different modes of release. CI makes sense where I work because we are on very short continuous release cycles. We don't have 6 months to engineer something that will be sent to Jupiter without the capability for updates. I'm suggesting streaming rather than using branches.
  • by Anonymous Coward

    From TFA:

    "Vaselines are key"

    I totally agree.

  • by Chirs ( 87576 ) on Tuesday November 30, 2010 @02:05PM (#34392512)

    git allows you to bisect from known-good and known-bad kernels to try and find the source of the problem. The original complaint was that some of the intermediate changes don't build.

    The problem here is not necessarily branching/merging, but that maintainers and developers do something along the line of "commit bad change, notice problem, commit fix" in their own private branch. Then, rather than clean up their private branch that whole history gets merged into the main kernel tree.

    This has the advantage of showing more details of development, but has the downside that a bisect that hits the "bad change" commit won't build and will require some manual action to select a "nearby" commit that will build.

    As I view it, it's less about rebase/merge and more about developers/maintainers being more diligent about keeping their trees clean before merging back to the mainline.
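The cost of a non-building commit during bisection can be sketched in Python. This is a minimal model, loosely in the spirit of marking a commit untestable with `git bisect skip`; it is not git's actual algorithm. It assumes a linear history with a known-good first commit and a known-bad last commit, and the commit names and statuses are made up.

```python
def bisect(commits, test):
    """Narrow down the first bad commit in a linear history.

    commits[0] must be known good and commits[-1] known bad.
    test(commit) returns "good", "bad", or "skip" (the commit does not build).
    Returns the list of commits that could be the first bad one.
    """
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Try commits strictly between lo and hi, nearest the midpoint first.
        candidates = sorted(range(lo + 1, hi), key=lambda i: abs(i - mid))
        for i in candidates:
            result = test(commits[i])
            if result == "good":
                lo = i
                break
            if result == "bad":
                hi = i
                break
        else:
            break  # everything in between is untestable; cannot narrow further
    return commits[lo + 1:hi + 1]


commits = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]

# c3 is the "bad change" that does not build, c4 is its fix, c5 adds the bug.
status = {"c0": "good", "c1": "good", "c2": "good", "c3": "skip",
          "c4": "good", "c5": "bad", "c6": "bad", "c7": "bad"}
print(bisect(commits, status.__getitem__))   # ['c5']

# Same history, but the bug arrives together with the broken commit.
status2 = dict(status, c4="bad")
print(bisect(commits, status2.__getitem__))  # ['c3', 'c4']
```

In the first run the broken commit only costs an extra test. In the second, the search cannot tell whether c3 or c4 introduced the bug, which is the kind of manual follow-up the comment above describes.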
