Linus on GIT and SCM

Posted by kdawson on Sat Jun 02, 2007 09:13 PM
from the strong-opinions dept.
An anonymous reader sends us to a blog posting (with the YouTube video embedded) about Linus Torvalds' talk at Google a few weeks back. Linus talked about developing GIT, the source control system used by the Linux kernel developers, and exhibited his characteristic strong opinions on subjects around SCM, by which he means "Source Code Management." SCM is a subject that coders are either passionate about or bored by. Linus appears to be in the former camp. Here is his take on Subversion: "Subversion has been the most pointless project ever started... Subversion used to say, 'CVS done right.' With that slogan there is nowhere you can go. There is no way to do CVS right."
  • Source Safe (Score:5, Funny)

    by EraserMouseMan (847479) on Saturday June 02 2007, @09:25PM (#19367699)
    Well Linus didn't have anything bad to say about MS Source Safe. . .

    [ducking] Sorry, I couldn't resist the urge. ;-)
  • Why winge? (Score:5, Funny)

    by gilesjuk (604902) <{giles.jones} {at} {zen.co.uk}> on Saturday June 02 2007, @09:31PM (#19367737)
    CVS and Subversion are open source projects, Linus should fix them.
    • Re:Why winge? (Score:4, Insightful)

      by Brandybuck (704397) on Saturday June 02 2007, @09:36PM (#19367757) Homepage Journal
      The problem with CVS and Subversion is one of fundamental design. At least, that is what Linus is suggesting. You can't fix them without rewriting them completely from the ground up.
    • Re:Why winge? (Score:5, Insightful)

      by zzatz (965857) on Saturday June 02 2007, @09:47PM (#19367821)
      Linus isn't saying that CVS and Subversion have fixable bugs or missing features. It's not about the code.

      He is saying that they solve the wrong problem. The Subversion team wants to solve Problem A, and Linus wants to solve Problem B. No amount of code will turn the solution to Problem A into a solution for Problem B. Bothering the Subversion team with code addressing Problem B will only irritate them, since they're working on Problem A.

      The right way to handle differing goals is to start a different project. That's what he did.

      Don't be confused by the labels. Source Code Management means different things to different people, and there isn't always much overlap in how each person defines it. Ships and airplanes are both 'vehicles', but that doesn't mean that a few changes will turn one to the other.
      • Re:Why winge? (Score:4, Insightful)

        by Eil (82413) on Sunday June 03 2007, @01:24PM (#19372775) Homepage Journal
        Linus isn't saying that CVS and Subversion have fixable bugs or missing features. It's not about the code.

        I think it's more about bashing some thing or another to gain attention.

        I liked Linus, and I've held him in high regard for more than a decade for all that he's helped accomplish while still being mostly modest about it. But he seems to be slowly evolving into another Stallman lately. Everything has been, "I don't like x and therefore x is stupid and you're a mentally-retarded asshat if you don't agree with everything I say."

        Just last week he started a flamewar on LKML about software suspend. Linus threw a shitfit when some bugs in suspend-to-disk were affecting suspend-to-RAM. After insulting a few other kernel developers without provocation, he basically ended the conversation by saying that the whole thing was going to be ripped out and redone only as suspend-to-RAM because he didn't use suspend-to-disk. Since he himself didn't use it, he postulated that it was a completely useless waste of time for anyone to implement it.

        The Subversion team wants to solve Problem A, and Linus wants to solve Problem B.

        So why couldn't he have simply said, "GIT solves the problem that I need it to solve, which is different from Subversion's"? Oh, right, because that wouldn't be interesting enough to make the front page of Slashdot. Too bad if the alternative alienates the half of the open source community that likes Subversion for what it does.
    • Re:Why winge? (Score:5, Informative)

      by RedWizzard (192002) on Saturday June 02 2007, @09:49PM (#19367829)

      CVS and Subversion are open source projects, Linus should fix them.
      He did fix them: he wrote GIT. He's not really whinging, he's saying "I wrote this tool because the other options are crap".
  • how to learn git? (Score:5, Informative)

    by zojas (530814) <kevin@astrophoenix.com> on Saturday June 02 2007, @09:36PM (#19367759) Homepage
    I've tried to use git, and I feel like if you want to do anything more than commit, you have to jump off a cliff which has serious spikes at the bottom. seriously, if you want to learn how to do more than 1 or 2 of the simplest operations with it, you have to invest serious time. I tried, and never could get there.

    anybody have a good tutorial? (not the crappy one which comes with it)

    I'm not an SCM rube either. I've competently used tla (arch), darcs, and of course CVS. but git just seems too hard to use. damn fast though.

    • by Omnifarious (11933) * on Saturday June 02 2007, @09:59PM (#19367881) Homepage Journal

      My favorite, of course, is Mercurial [selenic.com]. My main draw is that I had been interested in distributed SCMs for years, but had never found one that made any sense to me whatsoever. I was on the hunt again and stumbled on Mercurial, and I've been hooked ever since.

      Of the various distributed SCMs, Mercurial is the easiest to use one I've found. And it's pretty fast, though not quite as fast as git (though I have some ideas on how to fix that). And since it's written in Python with only a very small C component it runs on many platforms.

    • Re:how to learn git? (Score:5, Informative)

      by Anonymous Coward on Saturday June 02 2007, @10:28PM (#19368025)

      # set up new project
      cd project
      git init
      git add .
      git commit -a -m "Initial commit"
       
      # edit a file
      vi file.php
      git commit -a
       
      # add a file
      vi new.php
      git add new.php
      git commit -a
       
      # see the log
      git log
       
      # make a branch
      git branch working
      git checkout working
      # or in one step
      git checkout -b working
       
      # add some changes to this branch
      vi file.php
       
      # see what you changed
      git status
       
      # check it in
      git commit -a
       
      # see all branches
      git branch
       
      # go back to the first branch (initial branch is called "master" by default)
      git checkout master
       
      # make some other changes
      vi other.php
      git commit -a
       
      # merge the working branch into this one
      git merge working
       
      # see the branches and merges in a graphical browser
      gitk --all
       
      # let's do a log of all commits in "working" that don't exist in "master"
      git log master..working
       
      # hmm let's undo that last merge (tip of branch is HEAD, one commit back is HEAD^.. we are "popping" one commit)
      git reset --hard HEAD^
       
      # push your changes out (push the tip of local "master" branch to remote "incoming" branch)
      git push foo.bar.com:~/myrepo master:incoming
       
      # pull changes from another repo (fetch remote "feature1" and merge it into the current branch)
      git pull baz.bar.com:~/otherrepo feature1
       
      # move the branch point of the "working" branch to the top of the "master" branch
      git checkout working
      git rebase master
      It can get a LOT more complex of course.

      When you're starting out, just remember "git commit -a" and you'll be fine. Also check out "git reflog" to see the linear history of your repo. The pulling/pushing stuff can get a lot more complex but it's damn powerful. If you can figure out Arch (yeesh) you can figure out git!

      SLASHDOT SEZ: you have too few characters per line. Okay, slashdot, here's part of the man page for git-rebase:

      If <branch> is specified, git-rebase will perform an automatic git checkout <branch> before doing anything else. Otherwise it remains on the current branch. All changes made by commits in the current branch but that are not in <upstream> are saved to a temporary area. This is the same set of commits that would be shown by git log <upstream>..HEAD. The current branch is reset to <upstream>, or <newbase> if the --onto option was supplied. This has the exact same effect as git reset --hard <upstream> (or <newbase>).
  • by rustalot42684 (1055008) <rustalot42684.gmail@com> on Saturday June 02 2007, @09:37PM (#19367775)
    We ALL know that the people who use CVS and SVN are version control Nazis!
  • I've used CVS, SVN, and GIT in serious projects and I can say I far prefer SVN to GIT, and GIT to CVS. GIT was incredibly confusing to use, and it may just have been that the repository was poorly administered, but I never knew if I was synched with everyone else's checkouts and the command names made no sense. It's been over a year so I don't remember the details of GIT, but I remember having to do a lot of things "twice". Need to do a checkout? Two commands. Need to commit? Two commands. It was a bitch to use and I am glad I'm done with it. SVN, on the other hand, I felt very comfortable with from the start and most important of all, I trusted SVN to do what I wanted it to and to keep me from screwing up. In a year of using it, it has failed to lose my trust.

    I'm not trying to say SVN is better than GIT. The best repository depends on the type of project and type of development. But defaming SVN in favor of GIT is not, I believe, a valid statement. Especially when (I'm pretty certain) many, many more projects use SVN rather than choosing to use GIT.
    • by Black Acid (219707) on Saturday June 02 2007, @10:57PM (#19368167)

      It's been over a year so I don't remember the details of GIT, but I remember having to do a lot of things "twice". Need to do a checkout? Two commands. Need to commit? Two commands. It was a bitch to use and I am glad I'm done with it. SVN, on the other hand, I felt very comfortable with from the start

      Most distributed version control systems exhibit this phenomenon, because by "checking out" you are actually doing two operations: pulling the latest changes from someone else, and updating your workspace. For example, in Monotone you would type (I imagine git operates similarly):

      mtn pull
      mtn update


      The first command retrieves revisions from the server, and the second updates your workspace with those new changes. To "commit" a change, in a distributed version control system you first 1) commit the change to your local repository and then 2) push it to someone else:

      mtn commit
      mtn push


      It is often useful to keep these operations separate. For example, you can commit without pushing. Make a bunch of changes, commit each one separately, and only push once you're satisfied with the result. Other developers can still see each change you made individually, but only after you've pushed, so they won't be stuck with an incomplete in-progress version of the tree.

      Similarly, by being able to update without pulling, you can revert to any revision you would like without contacting the network. Likewise, since commit does not require network access, it is no extra effort to work offline. Once an Internet connection is available, you can synchronize your repositories, but in the meantime you can make any change you want - even with no network connection.
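A rough git analogue of the two-step flow above can be sketched as follows. This is only an illustration under stated assumptions: the repository names (alice, bob, shared.git), the file name, and the identity values are made up, and treating `git fetch` plus `git merge` as the counterpart of `mtn pull` plus `mtn update` is an approximation, not an exact equivalence.

```shell
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.invalid"

work=$(mktemp -d)
cd "$work"

# Alice starts a project and records one commit locally.
git init -q alice
cd alice
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
cd ..

# A bare copy stands in for the shared server; Bob clones it.
git clone -q --bare alice shared.git
git -C alice remote add origin "$work/shared.git"
git clone -q shared.git bob

# Alice: commit (local only), then push (publish). Two separate steps.
cd alice
echo "second draft" >> notes.txt
git commit -q -a -m "Revise notes"
git push -q origin "$branch"

# Bob: fetch (download history), then merge (update the workspace).
cd ../bob
git fetch -q origin
drafts_after_fetch=$(grep -c draft notes.txt)    # workspace not touched yet
git merge -q "origin/$branch"
drafts_after_merge=$(grep -c draft notes.txt)    # workspace now updated

echo "after fetch: $drafts_after_fetch line(s), after merge: $drafts_after_merge line(s)"
```

For the common case git also offers `git pull`, which performs the fetch and the merge in one command.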

      The main disadvantage of a decentralized version control system is that it requires workflow changes [pidgin.im] to get the most out of it. If you are only familiar with centralized version control systems, it will take some time getting used to. But I'm glad to say, an increasing number of projects are making the change to distributed version control [slashdot.org], among them, Mozilla and Pidgin. They are not using Git (but Mercurial and Monotone, respectively) but they're all distributed. Git is being used by the Beryl [beryl-project.org] project, among others. Subversion has momentum in FOSS because it is familiar for those used to centralized version control (everyone knows CVS), and SourceForge [sourceforge.net] provides free SVN hosting. Once a free open source hosting site provides hosting for a distributed version control system, I expect more low-resource open source projects to use it.
  • by suv4x4 (956391) on Saturday June 02 2007, @09:41PM (#19367793)
    No one said that if you're famous and contributed something incredible to the world (such as Linux) you can't speak out of your ass most of the time, just because you enjoy how everybody listens and tries to decipher whether they should care about it, or just laugh and pass by.

    I use SVN in a medium-sized team and see SVN used extensively in all kinds of projects around the globe with great success. I personally love the workflow of SVN.

    The only thing that they need to work on is merging of branches, and incidentally I've talked to the developers; they're quite aware of this flaw of SVN and are working on it. We'll see new versions that can track changes in each branch and even attempt automated merges with good success.

    I know a guy who has the same personality as Linus. The guy is very smart; he is single-handedly coding an application which is very popular in its area (won't mention it since that's internal stuff). He keeps bitching all the time: about customer feature requests, about random products and how sucky they are, how people can't see that. And he could also change his opinion overnight for no apparent reason and go to the other extreme. But he's a friggin' programming genius and what he does is great, even though it takes a lot of effort to deal with him.

    Well, probably those two go together: being an amazing creator, and being an amazing ass with huge ego. Who knows.
    • by True Vox (841523) on Saturday June 02 2007, @10:37PM (#19368083) Homepage
      Well, probably those two go together: being an amazing creator, and being an amazing ass with huge ego. Who knows.

      I disagree entirely that those two traits must go together. I'm living proof that you're wrong, in fact. I don't have a creative bone in my body.
        • by SnowZero (92219) on Sunday June 03 2007, @05:30AM (#19369739)

          No, wait: I recognize the benefits of his system. The problem is, his system has benefits with open source projects at most.
          I have several projects which have not been released as open source, and I can state for a fact there are benefits.

          But here's what: in OSS, he can afford to "reject 99% of the branches out there". This is because he "believe[s] most of you guys are incompetent idiots".
          What you may not realize is that those branches still feed code to his tree, through delegation. Linus trusts 10 people, and they in turn trust 10 people, and in a few levels you can accept code from all those trees through an established web of trust.

          In a small team, we don't throw 99% of our work out or keep a consistent base of developers who we believe are incompetent idiots.
          Can you give brand-new interns commit access to your core system? Probably not. Can I code review a patch from a brand-new intern and accept it if it meets the quality standards for our core system? Yes I can. A distributed VCS can take advantage of possibly less skilled developers because it's about trusting the well-defined patch, not the person. While such a workflow is not always precluded by centralized systems, distributed VCS's make this workflow very easy and natural.

          Instead, we work together, frequently communicate, have fast turnaround times, and often work with files that can't be actually merged together (such as design related files, AI, PSD etc.).
          Centralization of the VCS itself has little or no effect on this. How would a distributed VCS inhibit your team from acting this way?

          I clearly see the benefits of his system, but it shines for his own needs, SVN shines for the needs of the majority of small teams out there, and for more linear/classic style organized projects.
          Like I said, you just don't know what you are missing. If a developer has ever checked in something to mainline that wasn't done, just so another person could use it, you've just found a case where you really wanted a distributed VCS. If you've ever wanted to commit code on an airplane, so that you could easily revert if a new idea didn't work, you really wanted a distributed VCS. If you've ever wished you could test two patches together before merging them into the mainline, to avoid affecting other developers if the integrated version was buggy, you really wanted a distributed VCS. I've done all those things, and our project has had only 2-5 developers. At the same time, if you want to, you can use a distributed VCS in a centralized and linear manner; The only difference is that your team now has the choice of workflow models.
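The "test two patches together before merging them into the mainline" case can be sketched in git with a throwaway integration branch. This is a minimal sketch, not anyone's actual workflow: the branch names (patch-one, patch-two, integration), the file names, and the stand-in "test" are all hypothetical.

```shell
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.invalid"

work=$(mktemp -d)
cd "$work"
git init -q proj
cd proj

echo base > app.txt
git add app.txt
git commit -q -m "Base"
trunk=$(git symbolic-ref --short HEAD)
trunk_before=$(git rev-parse "$trunk")

# Two independent patches, each on its own branch off the trunk.
git checkout -q -b patch-one "$trunk"
echo one > one.txt
git add one.txt
git commit -q -m "Patch one"

git checkout -q -b patch-two "$trunk"
echo two > two.txt
git add two.txt
git commit -q -m "Patch two"

# Combine both on a scratch branch; the trunk is never touched.
git checkout -q -b integration "$trunk"
git merge -q --no-edit patch-one
git merge -q --no-edit patch-two

# Stand-in for a real test suite: both changes are present together.
combined_ok=no
if [ -f one.txt ] && [ -f two.txt ]; then combined_ok=yes; fi

# Throw the scratch branch away; the mainline is exactly where it was.
git checkout -q "$trunk"
git branch -q -D integration
trunk_after=$(git rev-parse "$trunk")

echo "combined ok: $combined_ok"
```

If the combined build were broken, nobody else would ever see it, because nothing was pushed or merged into the trunk.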

          It also works for the majority of small OSS projects, which can't afford to be spread in hundreds of branches at a time, as features are clearly defined, priorities as well, so there's no need to spread what we do in hundreds of branches by definition.
          You aren't being forced to have hundreds of branches, just like SVN doesn't force you to make hundreds of checkouts. This is a red herring.

          Also one of the benefits he mentions, basically everyone has his own branch and can diff locally with the other revisions, until someone "pulls" from him. That's handy but in its very basic core is the same as SVN cache. I can diff locally, save files as much as I want, revert files to the locally stored revisions etc., before I commit to the central repo. Not quite all the features he has, but as long as it does the job, that's all that takes.
          How about commit so that you have a logical development history, separating changes into logical parts, such as a new feature and a random bugfix you found along the way that you'd like to also push to stable -- all while disconnected. Like you say, if you don't run into that, it's not a problem. However, when I used CVS, I didn't realize all the times where I would have used a distributed VCS feature, because at the time I didn't think in those terms. Ignorance was bliss. Now that I use a distributed VCS, I use it even for single developer projects; The extra features do come in handy, and they aren't a hindrance.
  • by paroneayea (642895) on Saturday June 02 2007, @09:46PM (#19367815) Homepage
    ... And that is that CVS/SVN are centralized, while GIT is distributed, like GNU Arch.

    There are appropriate uses for both of these, and in kernel development I think it makes sense to have distributed development. However, smaller projects really *need* a very specific direction (for example, Wesnoth, I would think, would not have gotten where it is today if there were so many branches where people were all making their own art).

    Linus is enough of a famed leader that he's going to be listened to, and thus kind of pulls the community around him as a central source of development. That's not necessarily going to happen everywhere.
    • Re: (Score:3, Informative)

      Nothing about git prevents you from establishing a repository and telling all your developers that it's the central integration point for your project. It supports svn-style centralized development just fine. (In fact, it even interoperates bidirectionally with existing svn repositories, though you lose some of the advanced features.) The difference is it doesn't force you into a centralized model.
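A centralized setup of that kind can be sketched with one bare repository that everyone pushes to. The names here (central.git, seed, alice, bob) and the identity values are assumptions for illustration only. The second push is refused until Bob has integrated Alice's work, which is roughly the "update before commit" discipline svn users already know.

```shell
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.invalid"

work=$(mktemp -d)
cd "$work"

# Seed the project, then publish it as the agreed central repository.
git init -q seed
cd seed
echo "project" > README
git add README
git commit -q -m "Initial import"
branch=$(git symbolic-ref --short HEAD)
cd ..
git clone -q --bare seed central.git

# Every developer clones the same central repository.
git clone -q central.git alice
git clone -q central.git bob

# Alice commits and pushes first.
cd "$work/alice"
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's change"
git push -q origin "$branch"

# Bob commits without updating; his push is refused as out of date.
cd "$work/bob"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"
rejected=no
if ! git push -q origin "$branch" 2>/dev/null; then rejected=yes; fi

# Bob integrates the central history, then pushes successfully.
git fetch -q origin
git merge -q --no-edit "origin/$branch"
git push -q origin "$branch"

central_commits=$(git -C "$work/central.git" rev-list --count "$branch")
echo "first push rejected: $rejected, commits in central: $central_commits"
```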
      • by SnowZero (92219) on Sunday June 03 2007, @03:39AM (#19369311)

        Even bigger projects need specific directions, witness GCC. I am sorry, but to everyone who thinks distributed SCMs are a good thing: I am going to say they are a bad thing for free software because they allow people to develop stuff in private. Yes this already happens. This is why for an example GCC's rules for branches are that they are free for all and they are used a lot.
        I'm sorry but this appears to be self-contradictory. If branches are good, distributed VCS's are good, because they are based around branches as a core concept. You seem to be under the mistaken belief that a branch in a distributed VCS must be private. In fact, you can push or publish any branch if you'd like to do so, and pull and merge any branch anyone makes public. All you need is for the development website to support multiple trees (this happens on kernel.org with multiple Git trees).

        Take a look at http://gcc.gnu.org/svn.html [gnu.org] and see how many branches there are; most of them are inactive but some are very active (and getting ready to be merged into the trunk). I guess distributed SCM allows for people not to develop in public any more which I say is a bad thing. I have been working on a branch of GCC and whoever says merging is hard is wrong (I am dealing with an IR change which touches all front-ends and 50% of the middle-end, the tree level and I am able to merge at least once a week and the merge sometimes fixes some of the regressions I was trying to fix before :)).
        For the most part you seem to be arguing for a distributed VCS, which makes branching easier. If you think CVS/SVN merging is easy, then these other systems would appear to be trivial. A not insignificant thing is that you can merge changesets from any branch, not just the trunk, and everything still works when you merge with the trunk later (think branch B which needs a feature from branch A, before A is ready to merge). No VCS forces people to work in public, and in fact distributed VCS are much better about encouraging people to check in early and often, rather than waiting until a feature is completely done (a common CVS model), or checking in too early when a feature has not been fully tested.

        I don't know about you but having a policy of branches are free for all is a good thing and it causes what distributed SCM will cause which is more development and a "private" tree (though it is public but the branch is yours to deal with).
        Nothing forces anyone to hide their trees. You could require all developers to keep up-to-date trees on a centralized machine if it really matters.

        I guess people don't see the bad side of distributed SCM that much because they don't deal with projects like GCC. The kernel has the same issue and I think Linus does not get the idea that public development is a good thing.
        I don't see how you draw that conclusion, or what evidence your conclusion is based on. Why do you feel this way?

        I could have developed all of my branch (pointer_plus) in a private local tree but I would not get some of the testing I have been getting from people I did not expect to be testing my branch. Plus I don't have all the resources that other people are putting into the testing that they do.
        Nothing stops you from keeping the same workflow with a distributed VCS. Just put your tree somewhere where people can pull from it. You can even have people continuously following and aggregating/merging branches, allowing you to know about integration issues ahead of time.
  • by Anonymous Coward on Saturday June 02 2007, @10:01PM (#19367891)
    I took a look at git a while ago and was completely underwhelmed. The UI was so bad it was useless, and it didn't "seem" to do anything that Darcs didn't do. (I used to love Darcs because of the automatic patch dependency computations).

    Now that all the "next generation" SCM tools have matured somewhat, I took a look at all of them again. I had to stop using Darcs because of the "patch of death" problem, which basically is this: after using Darcs on a project with long-lived parallel branches, the repository may eventually enter a wedged state you can't get out of, due to exponentially complex patch dependencies. Oops.

    At this point I had an idea of what an SCM should do, how it should work, what the "mental model" should be. I want to create changesets, add them to branches, combine multiple branches (and keep track of renames and so forth between branches), re-order changesets, collapse multiple changesets into one, discard old branches, etc.

    Of course, CVS and close cousin Subversion are SO UTTERLY USELESS I didn't even consider them. Seriously, Subversion is like gold-plated shit. Looks nice but it's still shit. Reading people say stuff like "Subversion is awesome" makes me wince. How can something that doesn't have "real" branches, and doesn't have tags OF ANY KIND, be useful for anything? How do you keep track of multiple merges between branches? Answer: you don't. Or you keep track of revision numbers using svnmerge and pray it all works. Even the Subversion docs sort of hand-wave this away. I.e., they hand-wave away one of the FUNDAMENTAL ASPECTS of source code management: branching and merging. It's like hearing people talk about OO databases. They mean well but they just don't comprehend the generality of the underlying problem.

    That's why I was so excited about Darcs: the author "gets it". Unfortunately the implementation is flawed.

    I checked out a few more (Mercurial, bzr) but finally settled on git because it let me do all the things I needed to do, and it did them FAST. Once I figured out the underlying model I was pretty impressed. Git can be viewed at many levels: very low-level plumbing, or UI-level, or in between. The UI and documentation is still pretty shitty, but thankfully they are working on improving it and are moving away from the idea of having interchangeable UIs. Just focus on improving "core git".

    One great thing about git is that so much of it is just files in the .git dir and shell scripts that combine very simple low-level functions. For instance, you can create a branch just by saving the SHA1 ID of the tip into a file in .git. You can branch off any point in the history this way, including branches you've deleted in the past (git keeps all the old commit objects by default, even ones that aren't pointed to by any branch or tag.. this is very simple and understandable model, like reference-counting in a way).
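The "a branch is just a file holding a SHA1" claim can be checked with a small sketch. It assumes git's default loose-file ref storage (refs may also live in .git/packed-refs, or in another backend entirely), and the branch name from-first is made up. The supported commands for this are `git branch` and `git update-ref`; writing the file by hand is shown only to make the storage model visible.

```shell
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.invalid"

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

echo one > f.txt
git add f.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)     # ID of the commit we want to branch from

echo two >> f.txt
git commit -q -a -m "second"

# Create a branch at the older commit by writing its ID into a ref file.
# Assumption: the repository uses the default "files" ref storage.
echo "$first" > .git/refs/heads/from-first

# git now resolves the new name like any other branch.
resolved=$(git rev-parse from-first)
echo "from-first -> $resolved"
```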

    The other great thing about git is how easy it is to sling changes around and reorder them and combine them. For instance let's say you add a file to your project as commit "A". Then you add some code that uses this file as commit "B". Then you fix a bug in the file as commit "C". So you have A-B-C. Now you'd like to combine A and C into a single patch A', and put B on top of it, like this: A'-B. In git, this is super-easy. I can think of two ways to do it off the top of my head.
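One possible way to turn A-B-C into A'-B is sketched below, using `git cherry-pick` and `git commit --amend` on a new branch. The poster does not say which two ways they had in mind, so this is only one candidate; the file names, the branch name tidy, and the commit messages are invented for the example.

```shell
set -e
# Hypothetical identity so commits work in a clean environment.
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.invalid"

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# A: add a file.
echo "helper v1 (buggy)" > helper.txt
git add helper.txt
git commit -q -m "A: add helper"
A=$(git rev-parse HEAD)

# B: add code that uses the file.
echo "uses helper" > main.txt
git add main.txt
git commit -q -m "B: use helper"
B=$(git rev-parse HEAD)

# C: fix a bug in the file.
echo "helper v2 (fixed)" > helper.txt
git commit -q -a -m "C: fix helper"
C=$(git rev-parse HEAD)

# Rebuild the history on a new branch that starts at A.
git checkout -q -b tidy "$A"
git cherry-pick -n "$C"              # apply C's change without committing
git commit -q --amend --no-edit      # fold it into A, giving A'
git cherry-pick "$B" > /dev/null     # replay B on top of A'

count=$(git rev-list --count HEAD)
first_helper=$(git show HEAD~1:helper.txt)
git log --reverse --format=%s
```

The other common route is an interactive rebase (`git rebase -i`), where the same reordering and squashing is done by editing the list of commits.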

    I was checking into a CVS project the other day (for a client) and wanted to do this. Then I realized, you can't move things around in CVS like this *twitch*. So nowadays I do everything in git and only after the changes are beautiful and self-contained and well-commented do I check them into CVS one at a time.

    Okay so the point is, check out git (or honestly? Check out ANYTHING that isn't CVS or svn). Even if you think Linus is an asshole (which he is) or you don't like the git UI (it's not that bad now), check it out anyway.

    And if you don't use SCM at all? You suck. Start learning. It's a best practice that you can't live without, once you start.
        • by Senjutsu (614542) on Sunday June 03 2007, @12:25AM (#19368501)

          Making a copy is not the same as making a branch. ... And for fucks sake Subversion, creating a copy in a directory called "tags" is not the same as making an actual tag.
          The way subversion does "copies" (there is no duplication of shared data between copied directories), there is no difference in practice.
  • by Black Acid (219707) on Saturday June 02 2007, @10:24PM (#19368003)
    The ultimate reason why Linus dislikes SVN, CVS, etc. is that it is centralized. Everyone checks out source from a central server and commits their changes to the same centralized area. This has problems: your workspace is not versioned. By this I mean, you cannot track local changes to your workspace without committing them to the central server.

    A common pattern in development is to try one approach, test it, tweak it, and possibly try another approach if the first did not work out, perhaps reverting to a prior approach. With decentralized version control, you can commit your changes to a local repository and work from there. All the local changes you make are versioned, and can be committed, checked out, and examined, all without contacting a central repository. This is ideal, because you often want to try various options to find the one that works best, before pushing your changes to the rest of the world. In centralized version control, you can use a branch for this purpose, but often branches in these systems are difficult to either create, merge, or maintain, so they are rarely used. The end result is that with centralized version control, developers version their workspace in their head. DVCS systems remove the mental burden.

    Fortunately, FOSS developers are realizing the usefulness of DVCS and major projects are converting to some form of DVCS. Mozilla is switching to Mercurial [mozillazine.org]. The Pidgin [pidgin.im] project, which just released 2.0.1, is using Monotone [pidgin.im]. (Linus favorably mentioned both of these version control systems in his Git talk, as they are both distributed).

    Once you accept that DVCS is better than the centralized model (which may not be true for some situations), only a few (but growing number of) version control systems are viable. This is currently a hot area in open source development, with software such as GNU Arch, Monotone, Mercurial, Git, Darcs, Bazaar, and more paving the way. Many open source DVCS's are still in development and not ready for general usage. I can't speak for Mercurial, but Monotone doesn't have the greatest performance, instead preferring integrity over speed. This led Linus to write git, since speed is very crucial for a large project like the Linux kernel.

    Whatever the actual program (git, Mercurial, or Monotone), more and more open source developers are realizing the advantages that distributed version control can offer. I encourage all developers that haven't used any DVCS to try it -- once you do, you won't go back.
    • Richard Dawkins spent a good deal of time in his book, "The Blind Watchmaker", talking about what the gradualist and punctuationist views of Darwinism are. His gripe was that the latter was sold as a whole new theory, opposing the old gradualist view. Dawkins was rightly pissed about this, because the latter is merely an improved version of the former. I feel the same about the centralized vs. distributed topic. The distributed system is basically a centralized system where EVERY COPY HAS FULL REVISION HISTORY.

      There is still a central or main copy, otherwise you'd be herding a lot of slowly diverging forks! Most projects want to produce a release eventually and there is a main copy of sourcecode which the release is produced from.

      IMO, Linus dislikes SVN, CVS, and pretty much everything else for three reasons: speed; most SCMs cannot merge different copies of a repository and work only at the commit level instead; and they do not make it easy for development to route around the central copy.
      • Re: (Score:3, Informative)

        So, how do you take that diff, revert half of it back to the server's version, begin coding in a completely new direction, realize you were right the first time, go back to the original diff you took, then pull in half the stuff you did while doing the wrong thing, finish coding and push the commit back to the server?

        You can't, because Subversion has no client-side version control.
      • by Black Acid (219707) on Sunday June 03 2007, @12:45AM (#19368571)
        Monotone's inode prints [monotone.ca] (which, incidentally, Linus was a major contributor to [mail-archive.com]) can speed up some things, but the initial pull of a large repository is still unacceptably slow. The Pidgin developers have worked around [pidgin.im] this performance bottleneck by supplying bzip2'd Monotone databases via http, which the developer can then sync with the latest repository on pidgin.im to obtain an up-to-date database with the latest changes. Partial pulls should partially fix this problem in a future release of Monotone, or so I hear.

        For what it's worth, I use Monotone daily and find the performance acceptable. For the record, Linus used Monotone at a particularly bad time in its development cycle [mail-archive.com], when it was very slow and the main designer was on vacation. Nonetheless, the Monotone developers emphasize correctness and integrity over speed, and Mercurial and Git were direct responses to the performance of Monotone. Still, the performance of Monotone is always improving.
  • Linus talks about his distributed model, how everyone has a branch, and how this avoids politics associated with who gets commit access. He claims (and I admit I've seen this happen in some) that many projects have quite the internal politicking on who has CVS commit access. But then he claims that Git's special sauce eliminates these internal politics. Ok, I was intrigued, so I listened on.

    Essentially, he explains, the secret with Git is that everyone has commit access on their own branch - they do whatever they want. He says that the way it works is that someone does something cool with their own branch, then they start hollering to say "Hey, I have a good branch, merge mine" and it will get merged. Politics over.

    Ok, so now I'm scratching my head. How is this a fundamentally different paradigm? In CVS, basically anyone can check out the whole tree and make any changes they like. They can then say, see, my changes are good, and ask for them to get committed or ask for commit access themselves. In Git, this commit access bottleneck is just moved from the commit stage to the merge stage. You make your changes, commit them to your separate and unique branch, and then ask someone with merge access to merge it, or to give you the ability to merge it into mainstream. How exactly does this eliminate the politics? You are still going to have some people with "the power" and some people without. In any project where you have people who are going to fight about who gets commit access, you'll just have a fight about who has the ability to merge into mainstream.

    So, ok, distributed is nice (though for some projects central may be preferred) but I don't see how this magic system bypasses politics. In fact, I can potentially see more internal politics over this method. I can see factions gathering to support this or that branch, arguing about which is better, fighting about which one gets merged in. I can see the potential for branches going longer between merges, and more changes happening at once, making it harder to track problems. I don't claim these scenarios are more likely, but I do claim that this changing from a commit access to a merge access paradigm is just renaming the problem.

    • You may be saying this in the context of large FOSS projects, but for most projects, not allowing all the team members to commit changes seems like a really bad idea. If you don't trust them, why are they on your team?

      Complaining about the occasional inefficiencies of file locking while forcing some developers to waste time waiting for permission to commit, seems really ironic to me.
    • Re: (Score:3, Interesting)

      I tend to agree - what becomes the "official" code (i.e. what would go into a release tarball) is a social problem without technical solution. A coordinated release requires AGREEMENT, however that agreement is arrived at.

      What GIT does differently, as I understand it, is it makes flipping around branches much easier than before. CVS and SVN have the concept of a central server, so if two developers are trying to resolve differences in their branches before either can get their changes into the main tree t
    • Re: (Score:3, Informative)

      > In any project where you have people who are going to fight about who gets commit
      > access, you'll just have a fight about who has the ability to merge into mainstream.

      I really wish he would have addressed that question a little more directly, too.

      I think the problem is that you're thinking about it from a classic centralized development model. I have some trouble getting my head around it, too.

      Basically, from a truly distributed SCM perspective, there is no "mainstream". All branches are equal. Obvi
    • Re: (Score:3, Informative)

      There are a couple things.

      Everyone has a complete tree, so everyone can push patches between themselves. Linus doesn't have to accept it into his own tree. That cuts down on the politics. Before, everyone had their own tarball and pushed patches around, but you lose history and it takes more work.

      The other thing is that it's easier to delegate political questions. Let's say Linus pulls networking patches without even looking at them. The networking maintainer gets to deal with all the political issues. T
    • by iabervon (1971) on Sunday June 03 2007, @01:08AM (#19368641) Homepage Journal
      The advantage is that MergePrivileges can be fine-grained: there can be many answers for "merge into what?" There's a -mm tree, a -stable tree, a -linus tree, a -rt tree, and a lot of vendor and distro trees. Each of these has a different maintainer, and can have a different idea of what is acceptable. And only the maintainer can merge things into their tree, and they can decide based on a variety of features of the things they're considering. For example, Linus only merges from a few people directly: maintainers of various subsystems. And he doesn't even trust them completely; if the SD/MMC maintainer has a change which changes x86 architecture code in the tree Linus is asked to merge, he'll notice and ask what's up with that. And if there are changes that look too intrusive for the current point in the development cycle, he'll put it off until the next cycle, and ask for a tree with just fixes. And -linus isn't special, except that almost everybody trusts him implicitly and merges his stuff into their trees (the main exception being -stable, which is why a new 2.6.20.x kernel isn't derived from 2.6.21; and vendor and distro kernels are generally based on -stable of some sort, and only get new stuff from Linus when they go to a new series). Also, maintainers of subsystems know the people who work in their areas, and can apply the same sorts of rules: the guy from Intel who works on their network drivers can get e100 changes into the -netdev tree, because the maintainer knows they know what they're doing for e100 changes. And Linus sees that the e100 changes are coming in through -netdev, and the network maintainer knows what policy to apply to the drivers around there, so they're fine, even if Linus has no clue who should be allowed to do what in e100.

      It's not that the politics go away. It's that the policy is no longer a binary "yes or no" decision, so the technical arrangement mirrors the social arrangement. This doesn't work with CommitAccess because people wouldn't commit the same change everywhere they should, and they couldn't be restricted to only making changes they're trusted to make (there are people who are trusted to correct spelling in comments in any file in the tree, and Linus can look through the total changes they send and verify that they only change spelling in comments).
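
      Here is a rough Python sketch of that "each tree has its own merge policy" idea. Everything in it is hypothetical: the Tree and Change classes, the two policy functions, and the file paths are invented for illustration and have nothing to do with how git or the kernel workflow is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    author: str
    paths: tuple  # files touched by the change


@dataclass
class Tree:
    name: str
    accepts: object  # policy: callable(change, source_name) -> bool
    changes: list = field(default_factory=list)

    def pull(self, source_name, incoming):
        """Merge the changes this tree's policy accepts; return the rest."""
        rejected = []
        for change in incoming:
            if self.accepts(change, source_name):
                self.changes.append(change)
            else:
                rejected.append(change)
        return rejected


def netdev_policy(change, source):
    # Hypothetical rule: the driver author is trusted only for e100 files.
    return source == "intel-dev" and all(
        p.startswith("drivers/net/e100") for p in change.paths)


def linus_policy(change, source):
    # Hypothetical rule: take networking code from -netdev, question the rest.
    return source == "netdev" and all(
        p.startswith(("drivers/net/", "net/")) for p in change.paths)


netdev = Tree("netdev", netdev_policy)
linus = Tree("linus", linus_policy)

from_intel = [
    Change("intel-dev", ("drivers/net/e100.c",)),
    Change("intel-dev", ("arch/x86/kernel/setup.c",)),  # outside their area
]
held_by_netdev = netdev.pull("intel-dev", from_intel)
held_by_linus = linus.pull("netdev", netdev.changes)
```

      The e100 change flows through -netdev into -linus, while the x86 change is held back at the first hop. Each maintainer's policy is just a function of who sent the change and what it touches, which is the non-binary part.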
  • by ClosedSource (238333) on Saturday June 02 2007, @10:29PM (#19368029)
    If you have a project that has thousands of developers all over the world like Linux does, an SCM system that is focused on merging makes a lot of sense. Unfortunately, there is a tendency for some people to overdo merging on small projects when they don't really need to. If the application is designed in a modular fashion and developers are assigned specific modules, then merging is rarely needed. Of course, many control freaks don't like this approach because it makes it harder for them to "correct" other developers' code.
  • by sohp (22984) <snewton.io@com> on Saturday June 02 2007, @11:42PM (#19368321) Homepage
    Distributed version control the way git does it (conceptually, not necessarily the implementation) is the best idea in SCM since concurrent development and optimistic merge conflict resolution on check-in.

    Notice how, even years after better ideas superseded the lock-modify-unlock paradigm, many tools and shops still use exclusive-lock SCM.

    It could be quite a while before you see anything like the way git does SCM in use in the majority of programming shops.
  • Actual Youtube link (Score:5, Informative)

    by beegle (9689) on Sunday June 03 2007, @06:10AM (#19369907) Homepage
    http://www.youtube.com/watch?v=4XpnKHJAok8 [youtube.com]

    This is the video from the article. You can either watch it in the tiny embedded window, or you can go to youtube and click the button to watch it full-screen.

    Look, posters: if you're going to point to a video that's hosted on YouTube (or another video hosting site), just link to that site. Don't link to some random web page that has the video embedded in it.
    • Re: (Score:3, Insightful)

      by Anonymous Coward

      I hope you're working for one of my company's competitors, if you are so eager to hamstring your developers and limit their productivity! Having to wait for someone else to finish a major piece of development before I can fix a bug in an unrelated section of a file they happen to be modifying... yeah, that's the way to turbocharge your development process.

        • by starwed (735423) on Saturday June 02 2007, @09:46PM (#19367817)
          You missed the point of the thread; to discuss git, not to be one.
        • The thing is, you've got the wrong solution to the problem. Rather than not allowing branches, you need to control when and how often they're made, and how long they're allowed to survive. You're fixing a policy problem with technology, which never works well. If the branches are kept under control, you don't have the last-second merge problem. Merges should be happening constantly throughout the process so everyone stays in sync. If someone isn't committing their work at least once a day, that's when they get a stern talking to from the lead developer. Because if a developer needs to coordinate with another developer to change one line of code, then you've wasted two people's time instead of one.
          • by Black Acid (219707) on Saturday June 02 2007, @10:35PM (#19368063)
            You hit the nail on the head. Distributed version control often comes with superior merging, making the process less painful and encouraging it to occur frequently. Monotone employs a 3-way merge [wikipedia.org], Codeville has an innovative merging algorithm [zooko.com], and some may even support 5-way merging [nongnu.org] ("left's immediate ancestor, left, merged, right, right's immediate ancestor") in the future.

            In my experience, nearly all merges occur automatically and cleanly. Only if two developers modified code in conflicting areas of the source code do you have to merge manually--and even then, only one person has to do it. It is much better to have merging operate automatically and transparently when possible, than to have two people manually coordinate each and every one of their changes beforehand.
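
            For anyone who hasn't seen why a 3-way merge resolves most cases by itself, here is a minimal Python sketch of the decision rule. It is deliberately simplified: it merges whole files in a {path: content} map rather than lines within a file, it is not Monotone's or Codeville's actual algorithm, and merge3 is a name I made up.

```python
def merge3(base, left, right):
    """Three-way merge of {path: content} maps; a missing path means 'absent'."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(path), left.get(path), right.get(path)
        if l == r:        # both sides agree (or neither touched it)
            result = l
        elif l == b:      # only the right side changed it
            result = r
        elif r == b:      # only the left side changed it
            result = l
        else:             # both changed it differently: a human decides
            conflicts.append(path)
            continue
        if result is not None:   # None means the path was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.c": "1", "b.c": "1", "c.c": "1"}
left = {"a.c": "2", "b.c": "1", "c.c": "L"}               # edits a.c and c.c
right = {"a.c": "1", "b.c": "1", "c.c": "R", "d.c": "1"}  # edits c.c, adds d.c

merged, conflicts = merge3(base, left, right)
```

            Only c.c, which both sides changed in different ways, is left for a person to sort out. The edit to a.c and the new d.c merge without anyone coordinating beforehand, because the common ancestor tells you which side actually made the change.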
          • by KyleCordes (10679) on Saturday June 02 2007, @10:41PM (#19368103) Homepage
            I wrote about Linus's talk a few weeks ago:

            http://kylecordes.com/2007/05/17/linux-git-distributed/ [kylecordes.com]

            Looking back at that, and at your comment, some things come to mind:

            * the tool Linus is pushing greatly facilitates frequent, easy merges, and Linus mentions that a tool with great, fast merges helps you merge early and often.

            * on the other hand, your comment is about "you need to control when and how often [branches] are made...", while a big point of distributed SC tools is the opposite of that control: these tools make the power of the tool fully available to all users. A "main" repository may (and probably should) have permissions/hooks set to enforce some policy about what happens to what branches. Individual users can always create local quasi-branches by simply not checking things in; with a tool like Git they can create real (local) branches too, which can then be promoted to official status (i.e. on a blessed central repository) if needed.
        • Re: (Score:3, Funny)

          by Anonymous Coward

          Yep. $1.2 million.
          Well, if that's your measure of correctness then you'll have to admit Linus is right and you're wrong, because his house is worth more than that.
    • So what you are saying is that RCS was done right and everything done since is wrong...
    • by Frankie70 (803801) on Saturday June 02 2007, @10:58PM (#19368171)

      So don't do it


      Wow! I bet you have never worked on anything other than hobby
      projects.

      Most projects I have worked on cannot do without branching &
      branching big & I am not talking about branches created for
      individual devs.

      What do you do if you have to make patches on an earlier release(s)?
      What do you do if your project team has 50 devs working on
      5 different modules inside? If one guy makes a buggy submit
      it will break every one else? Typically each team does weekly
      sanity tests & then propagates the changes to the main.

      Yeah - and I agree with Linus - CVS is rubbish.

      Have used CVS, Clearcase & Source Depot. Source Depot
      is a Microsoft internal Source Control system. Microsoft
      licensed Perforce & developed on it. I used to work with
      MS long back & Source Depot was the best Source Control
      System I have ever used.

      CVS lacks too many features.
      1) Atomic checkins/submits
          I am trying to submit changes in 5 files as a single bugfix.
      A submit/checkin should either succeed for all 5 or fail for all 5.
      CVS doesn't do this. The end result is that I may end up submitting
      a change in the header without submitting a corresponding change in the
      implementation file.

      2) Changelists
          After checking in multiple files together, at any point in time, I should
      be able to find out all the changes that were checked in at the same time.
      CVS has no way of doing this - Submitting 5 files together is the same as
      submitting 5 files separately as far as CVS is concerned.

      3) More Changelist features for non-submitted changes
      Let us say I am working on 3 different bugfixes. Source Depot allows me
      to group together my changes in different changelists even before I
      submit the changes. That is, I can create changelists A, B & C.
      In changelist A - I have files a.c & a1.c changed, in changelist
      B, I have b.c & b1.c changed & so on. So I decide I am done with
      all the changes required in the subset A, I can submit it very easily
      or undo all changes in changelist B.

      4) Merges
      Merges between branches are a breeze with Source Depot. With CVS it's
      a pain. Source Depot stores a lot of information about merges which have
      already happened, which is invaluable. In CVS, merges between branches
      are very little more than changes manually copied from one branch to
      another.
      I can do a lot of stuff which I can't do with CVS
      - I can very trivially merge Bugfix 1111 (comprising 5 files
      checked into changelist XXXX) from a branch to another branch or
      the main trunk.
      - Because Source Depot stores information about merges, I can do periodic
      single command merges very easily between a branch & the trunk - Source Depot
      will not try to merge in changes which have already been merged the last
      time I did a merge.

      I could go on & on, but the point is that something like Source Depot makes
      a developer's life so much easier. I could work around all these
      things in CVS (i.e. do it in multiple steps) but the ease is something
      worth paying for I think. If Microsoft ever released Source Depot
      as a commercial product, it would be great, but I don't suppose their
      license with Perforce would allow it.
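
      A rough Python sketch of two of the features above, all-or-nothing submits and merges that remember what was already merged. This is a toy model under my own assumptions: it is not how Source Depot or Perforce work internally, the Branch class and the read_only parameter are invented, and it ignores conflicting edits entirely.

```python
class Branch:
    """Toy branch that records every changelist it contains (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.files = {}   # path -> content
        self.log = {}     # changelist id -> {path: content}, in submit order

    def submit(self, cl_id, edits, read_only=()):
        """Apply all edits in a changelist, or none of them."""
        blocked = sorted(path for path in edits if path in read_only)
        if blocked:
            # Check everything before touching any file, so a failure
            # leaves the branch exactly as it was.
            raise PermissionError(
                f"changelist {cl_id} rejected, cannot write {blocked}")
        self.files.update(edits)
        self.log[cl_id] = dict(edits)

    def merge_from(self, other):
        """Bring over only the changelists this branch has not seen yet."""
        new_ids = [cl_id for cl_id in other.log if cl_id not in self.log]
        for cl_id in new_ids:
            self.submit(cl_id, other.log[cl_id])
        return new_ids


trunk, bugfix = Branch("trunk"), Branch("bugfix")
bugfix.submit(1111, {"a.h": "decl v2", "a.c": "impl v2"})

try:
    # One file is read-only, so the whole changelist must fail.
    bugfix.submit(1112, {"b.h": "decl v2", "b.c": "impl v2"}, read_only={"b.c"})
except PermissionError:
    pass

first_merge = trunk.merge_from(bugfix)
bugfix.submit(1113, {"c.c": "impl v2"})
second_merge = trunk.merge_from(bugfix)  # 1111 is not merged a second time
```

      The log also answers the "what was checked in together" question: trunk.log[1111] lists the header and implementation file as one unit.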

    • by Timothy Brownawell (627747) <tbrownaw@prjek.net> on Saturday June 02 2007, @11:14PM (#19368213) Journal

      So don't do it.

      The Wise adapts himself to the world. The Fool adapts the world to himself [revctrl.org]. Therefore, all progress depends on the Fool.

      It's always done late in a development cycle, in the rush to get the project out the door.

      Why? It doesn't have to be. At least if you use something that isn't horribly broken [red-bean.com].

      So don't branch, and DON'T allow concurrent checkout of any code - FORCE the DEVELOPERS who need to work on the same code to COORDINATE their work EARLY in the development cycle. Of course they'll bitch.

      Yes, they will. Because this is a monumentally stupid idea. Because the entire *purpose* of revision control systems (note: "CVS" stands for "Concurrent Versions System") is to make it possible for developers to work on things at the same time. The idea is that you get more benefit from the concurrency than you get difficulties from merging.

      If your technical leadership has the spine to show prima donna twits who won't follow development rules the door. Of the entire company.

      Rules like "merge early, merge often", perhaps? Fixes the problem, and *doesn't* cripple development horribly like your idea would.

      • Re:git (Score:5, Interesting)

        by Anonymous Coward on Saturday June 02 2007, @09:36PM (#19367765)
        He is only human. Just because he is the head of a huge software project doesn't make him infallible.

        Just look at the whole 'RMS vs Linus' thing.

        His opinions should carry some weight, especially since he should know more than anyone what the limitations of SCM software are when it comes to larger projects like the Linux kernel. But a lot of SCM comes down to the way a project is managed, the preferences of the people involved, and how they deal with their project. I doubt there is a blanket solution... a 'one SCM package to rule them all' so to speak.

        Especially in the software industry you can always find someone just as good as yourself that strongly holds opinions that are the polar opposite of yours.
        • Re:git (Score:5, Insightful)

          His opinions should carry some weight, especially since he should know more than anyone what the limitations of SCM software are when it comes to larger projects like the Linux kernel.

          The thing is, Linux is actually a pretty small project. Much larger projects would include FreeBSD, which uses CVS not only for the kernel but for every line of source of the entire OS. Now, Linus is a smart guy, but I don't know why he thinks CVS (and SVN by extension) won't work for large projects. It clearly can. It may not be suitable for the way he wants to run his project, but that's a different issue.

          • Re:git (Score:4, Insightful)

            by SnowZero (92219) on Sunday June 03 2007, @02:31AM (#19369049)
            Yeah and luckily the whole "haves versus have nots" on who gets CVS commit access rights has never, ever, been a problem in *BSD or XFree86. Right?

            Seriously, centralized version control fails for large open source projects for political reasons, not technical ones. That's really Linus' main point, although his lack of tact in presentation is going to cause many people to miss that insight. With a changeset-based distributed version control system, you only have to trust patches and code, not people. The whole concept of "the chosen few who get commit access" goes away, and problems like the XFree86/X.org fork or the EGCS/GCC semi-fork disappear.
      • Re:git (Score:4, Interesting)

        by Anonymous Coward on Sunday June 03 2007, @02:38AM (#19369071)

        I was at the talk and I have to say he lost a HUGE amount of respect from me (and other people in the room whose job has to do with source control).

        The way git works as a decentralized solution with a chain of trust is simply not useable for really large, multiple projects with interdependencies. And it's even worse when you need to control access to certain portions of the code.

        I see Git as a pyramid scheme [wikipedia.org] with Linus sitting on top. I can't begin to imagine the job of the poor release engineer in a big corp who would need to merge the changes of sub-engineers, and the chain of trust involved to reach the top! What I see is that everyone would code and test on out-of-sync code, a bit like Vista's development was.

        Git is a solution that is fine-tuned to Linus's specific needs, but it's ages away from a solution that's flexible enough for most of the industry's needs.

        I'm a big fan of subversion, and while I'll admit it's far from perfect it's way better than cvs could ever be. It does the job well most of the time, and SVK [bestpractical.com] is filling some of the holes.

      • Re:Linus knows it. (Score:5, Informative)

        by CastrTroy (595695) on Saturday June 02 2007, @10:18PM (#19367977) Homepage
        You might want to check out TortoiseSVN if you're using svn on windows. It makes version control really easy, and you don't even have to touch the command line.
        • Re:Linus knows it. (Score:5, Informative)

          by statusbar (314703) <jeffk@statusbar.com> on Saturday June 02 2007, @10:31PM (#19368043) Homepage Journal
          I use SVN on windows, mac os x, linux (ubuntu, debian, fedora) as well as netbsd. TortoiseSVN works great on windows especially for the point and click style users who need to use SCM. SvnX works great on Mac OS X. Altium PCB designer works great with the svn command line tools and shows graphical diffs of our circuit boards. But for some reason, TortoiseSVN and svn.exe are unable to access a GIT repository.

          In addition, git works well for simple projects but not so well for projects that have many different related subprojects which share code.

          For instance, our SVN repository holds everything needed for an entire product, including embedded linux with busybox, initrd and custom software and libraries - as well as DSP source code for two different add on cards, the GUI for mac, windows, and linux, the docutils xml file for the various manuals, and manufacturing and test code.

          I'd love to use git once it attains the required maturity level so that I can do what I need with it.

          --jeffk++