Linus on GIT and SCM
Posted by kdawson on Sat Jun 02, 2007 09:13 PM
from the strong-opinions dept.
An anonymous reader sends us to a blog posting (with the YouTube video embedded) about Linus Torvalds' talk at Google a few weeks back. Linus talked about developing GIT, the source control system used by the Linux kernel developers, and exhibited his characteristic strong opinions on subjects around SCM, by which he means "Source Code Management." SCM is a subject that coders are either passionate about or bored by. Linus appears to be in the former camp. Here is his take on Subversion: "Subversion has been the most pointless project ever started... Subversion used to say, 'CVS done right.' With that slogan there is nowhere you can go. There is no way to do CVS right."
Source Safe (Score:5, Funny)
[ducking] Sorry, I couldn't resist the urge.
Re:Source Safe (Score:4, Funny)
(http://www.hae.com)
aaaaaaab
aaaaaaac
aaaaaaad
aaaaaaae
aaaaaaaf
.
.
.
1,712,928 Files...
Why winge? (Score:5, Funny)
Re:Why winge? (Score:4, Insightful)
(http://www.usermode.org/ | Last Journal: Tuesday April 17 2007, @09:13PM)
Re:Why winge? (Score:5, Insightful)
He is saying that they solve the wrong problem. The Subversion team wants to solve Problem A, and Linus wants to solve Problem B. No amount of code will turn the solution to Problem A into a solution for Problem B. Bothering the Subversion team with code addressing Problem B will only irritate them, since they're working on Problem A.
The right way to handle differing goals is to start a different project. That's what he did.
Don't be confused by the labels. Source Code Management means different things to different people, and there isn't always much overlap in how each person defines it. Ships and airplanes are both 'vehicles', but that doesn't mean that a few changes will turn one to the other.
Re:Why winge? (Score:4, Insightful)
(http://bityard.net/ | Last Journal: Thursday August 08 2002, @04:18PM)
I think it's more about bashing some thing or another to gain attention.
I liked Linus, and I've held him in high regard for more than a decade for all that he's helped accomplish while still being mostly modest about it. But he seems to be slowly evolving into another Stallman lately. Everything has been, "I don't like x and therefore x is stupid and you're a mentally-retarded asshat if you don't agree with everything I say."
Just last week he started a flamewar on LKML about software suspend. Linus threw a shitfit when some bugs in suspend-to-disk were affecting suspend-to-RAM. After insulting a few other kernel developers without provocation, he basically ended the conversation by saying that the whole thing was going to be ripped out and redone only as suspend-to-RAM because he didn't use suspend-to-disk. Since he himself didn't use it, he postulated that it was a completely useless waste of time for anyone to implement it.
"The Subversion team wants to solve Problem A, and Linus wants to solve Problem B."
So why couldn't he have simply said, "GIT solves the problem that I need it to solve, which is different from Subversion's"? Oh, right, because that wouldn't be interesting enough to make the front page of Slashdot. Too bad if the alternative alienates the half of the open source community that likes Subversion for what it does.
Re:Why winge? (Score:5, Informative)
how to learn git? (Score:5, Informative)
(http://www.desertsol.com/~kevin)
anybody have a good tutorial? (not the crappy one which comes with it)
I'm not an SCM rube either. I've competently used tla (arch), darcs, and of course CVS. but git just seems too hard to use. damn fast though.
Re:how to learn git? - answer, don't! (Score:5, Interesting)
(http://www.omnifarious.org/~hopper/ | Last Journal: Tuesday October 02, @12:21PM)
My favorite, of course, is Mercurial [selenic.com]. My main draw is that I had been interested in distributed SCMs for years, but had never found one that made any sense to me whatsoever. I was on the hunt again and stumbled on Mercurial, and I've been hooked ever since.
Of the various distributed SCMs, Mercurial is the easiest one to use that I've found. And it's pretty fast, though not quite as fast as git (though I have some ideas on how to fix that). And since it's written in Python with only a very small C component, it runs on many platforms.
Re:how to learn git? (Score:5, Informative)
When you're starting out, just remember "git commit -a" and you'll be fine. Also check out "git reflog" to see the linear history of your repo. The pulling/pushing stuff can get a lot more complex but it's damn powerful. If you can figure out Arch (yeesh) you can figure out git!
SLASHDOT SEZ: you have too few characters per line. Okay, slashdot, here's part of the man page for git-rebase:
If <branch> is specified, git-rebase will perform an automatic git checkout <branch> before doing anything else. Otherwise it remains on the current branch. All changes made by commits in the current branch but that are not in <upstream> are saved to a temporary area. This is the same set of commits that would be shown by git log
Godwin's law (Score:4, Funny)
Well, speaking from my own experience... (Score:5, Interesting)
(http://www.ece.utexas.edu/~olsen)
I'm not trying to say SVN is better than GIT. The best repository depends on the type of project and type of development. But defaming SVN in favor of GIT is not, I believe, a valid statement. Especially when (I'm pretty certain) many, many more projects use SVN rather than choosing to use GIT.
Re:Well, speaking from my own experience... (Score:5, Informative)
Most distributed version control systems exhibit this phenomenon, because by "checking out" you are actually doing two operations: pulling the latest changes from someone else, and updating your workspace. For example, in Monotone you would type (I imagine git operates similarly):
The first command retrieves revisions from the server, and the second updates your workspace with those new changes. To "commit" a change, in a distributed version control system you first 1) commit the change to your local repository and then 2) push it to someone else:
It is often useful to keep these operations separate. For example, you can commit without pushing. Make a bunch of changes, commit each one separately, and only push once you're satisfied with the result. Other developers can still see each change you made individually, but only after you've pushed, so they won't be stuck with an incomplete in-progress version of the tree.
Similarly, by being able to update without pulling, you can revert to any revision you would like without contacting the network. Likewise, since commit does not require network access, it is no extra effort to work offline. Once an Internet connection is available, you can synchronize your repositories, but in the meantime you can make any change you want - even with no network connection.
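The split between local and network operations described above can be sketched with a toy model. This is only an illustration, not Monotone or git: the `Repo` and `Workspace` classes and their method names are hypothetical, and revisions are plain strings rather than real changesets.

```python
# Toy model (not Monotone or git): each repository holds an ordered
# list of revisions; a workspace tracks which revision is checked out.

class Repo:
    def __init__(self):
        self.revisions = []          # history stored in this repository


class Workspace:
    def __init__(self, local, remote):
        self.local = local           # repository on your own disk
        self.remote = remote         # someone else's repository
        self.checked_out = None      # revision the working files reflect

    def commit(self, change):
        """Record a change locally; no network access needed."""
        self.local.revisions.append(change)
        self.checked_out = change

    def push(self):
        """Send revisions the remote does not have yet."""
        for rev in self.local.revisions:
            if rev not in self.remote.revisions:
                self.remote.revisions.append(rev)

    def pull(self):
        """Fetch new revisions; the working files are left untouched."""
        for rev in self.remote.revisions:
            if rev not in self.local.revisions:
                self.local.revisions.append(rev)

    def update(self, rev=None):
        """Point the workspace at a revision already in the local repo."""
        target = rev if rev is not None else self.local.revisions[-1]
        if target not in self.local.revisions:
            raise ValueError("unknown revision: %r" % (target,))
        self.checked_out = target


server = Repo()
alice = Workspace(Repo(), server)
bob = Workspace(Repo(), server)

alice.commit("fix typo")
alice.commit("add feature")        # two local commits, nothing shared yet
before_push = list(server.revisions)
alice.push()                       # now both revisions become visible

bob.pull()                         # bob has the history...
bob_before_update = bob.checked_out
bob.update()                       # ...and now his workspace reflects it
bob.update("fix typo")             # go back to an older revision, offline
```

In this sketch only `push` and `pull` touch the other repository; `commit` and `update` work purely on local data, which is why they keep working without a network connection.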
The main disadvantage of a decentralized version control system is that it requires workflow changes [pidgin.im] to get the most out of it. If you are only familiar with centralized version control systems, it will take some time to get used to. But I'm glad to say, an increasing number of projects are making the change to distributed version control [slashdot.org], among them Mozilla and Pidgin. They use Mercurial and Monotone, respectively, rather than Git, but both are distributed. Git is being used by the Beryl [beryl-project.org] project, among others. Subversion has momentum in FOSS because it is familiar to those used to centralized version control (everyone knows CVS), and SourceForge [sourceforge.net] provides free SVN hosting. Once a free open source hosting site provides hosting for a distributed version control system, I expect more low-resource open source projects to use it.
Well, Linus is an ass, what's new. (Score:4, Insightful)
I use SVN in a medium-sized team and see SVN used extensively in all kinds of projects around the globe with great success. I personally love the workflow of SVN.
The only thing that they need to work on is merging of branches, and incidentally I've talked to the developers; they're quite aware of this flaw of SVN and are working on it. We'll see new versions that can track changes in each branch and even attempt automated merges with good success.
I know a guy who has the same personality as Linus. The guy is very smart, and he is single-handedly coding an application which is very popular in its area (won't mention it since that's internal stuff). He keeps bitching all the time: about customer feature requests, about random products and how sucky they are, how people can't see that. And he could also change his opinion overnight for no apparent reason and go to the other extreme. But he's a friggin' programming genius and what he does is great, even though it takes a lot of effort to deal with him.
Well, probably those two go together: being an amazing creator, and being an amazing ass with huge ego. Who knows.
Re:Well, Linus is an ass, what's new. (Score:5, Funny)
(http://truevox.net/)
I disagree entirely that those two traits must go together. I'm living proof that you're wrong, in fact. I don't have a creative bone in my body.
Re:Well, Linus is an ass, what's new. (Score:4, Funny)
Re:Well, Linus is an ass, what's new. (Score:4, Insightful)
There's a difference between GIT and SVN (Score:5, Informative)
(http://www.lingocomic.com/)
There are appropriate uses for both of these, and in kernel development I think it makes sense to have distributed development. However, smaller projects often really *need* a very specific direction (for example, Wesnoth, I would think, would not have gotten where it is today if there were so many branches where people were all making their own art).
Linus is enough of a famed leader that he's going to be listened to, and thus kind of pulls the community around him as a central source of development. That's not necessarily going to happen everywhere.
Re:There's a difference between GIT and SVN (Score:4, Funny)
Cvs is already done right (Score:2, Funny)
Cvs is already done right. These would-be improvements are pointless.
git is pretty cool, take a closer look (Score:5, Interesting)
Now that all the "next generation" SCM tools have matured somewhat, I took a look at all of them again. I had to stop using Darcs because of the "patch of death" problem, which basically is this: after using Darcs on a project with long-lived parallel branches, the repository may eventually enter a wedged state you can't get out of, due to exponentially complex patch dependencies. Oops.
At this point I had an idea of what an SCM should do, how it should work, what the "mental model" should be. I want to create changesets, add them to branches, combine multiple branches (and keep track of renames and so forth between branches), re-order changesets, collapse multiple changesets into one, discard old branches, etc.
Of course, CVS and close cousin Subversion are SO UTTERLY USELESS I didn't even consider them. Seriously, Subversion is like gold-plated shit. Looks nice but it's still shit. Reading people say stuff like "Subversion is awesome" makes me wince. How can something that doesn't have "real" branches, and doesn't have tags OF ANY KIND, be useful for anything? How do you keep track of multiple merges between branches? Answer: you don't. Or you keep track of revision numbers using svnmerge and pray it all works. Even the Subversion docs sort of hand-wave this away. I.e., they hand-wave away one of the FUNDAMENTAL ASPECTS of source code management: branching and merging. It's like hearing people talk about OO databases. They mean well but they just don't comprehend the generality of the underlying problem.
That's why I was so excited about Darcs: the author "gets it". Unfortunately the implementation is flawed.
I checked out a few more (Mercurial, bzr) but finally settled on git because it let me do all the things I needed to do, and it did them FAST. Once I figured out the underlying model I was pretty impressed. Git can be viewed at many levels: very low-level plumbing, or UI-level, or in between. The UI and documentation are still pretty shitty, but thankfully they are working on improving them and are moving away from the idea of having interchangeable UIs. Just focus on improving "core git".
One great thing about git is that so much of it is just files in the
The other great thing about git is how easy it is to sling changes around and reorder them and combine them. For instance let's say you add a file to your project as commit "A". Then you add some code that uses this file as commit "B". Then you fix a bug in the file as commit "C". So you have A-B-C. Now you'd like to combine A and C into a single patch A', and put B on top of it, like this: A'-B. In git, this is super-easy. I can think of two ways to do it off the top of my head.
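To make the A-B-C to A'-B rewrite concrete, here is a toy sketch. It is not git and does not show either of the two ways the poster has in mind: a commit is modelled, purely as an assumption for illustration, as a name plus a dict of `{path: new content}`, and the helper names (`apply_all`, `squash`, `touches_same_files`) are hypothetical.

```python
# Toy model, not git internals: a commit is a name plus a dict of
# {path: new file content}. Applying commits in order builds the tree.

def apply_all(commits):
    """Replay commits in order and return the resulting file tree."""
    tree = {}
    for _name, changes in commits:
        tree.update(changes)
    return tree


def squash(first, second, name):
    """Combine two commits; the later commit's content wins."""
    merged = dict(first[1])
    merged.update(second[1])
    return (name, merged)


def touches_same_files(x, y):
    """True if the two commits modify at least one common path."""
    return bool(set(x[1]) & set(y[1]))


A = ("A", {"util.py": "def f(): return 1/0\n"})       # add a file
B = ("B", {"main.py": "import util\nutil.f()\n"})     # use the file
C = ("C", {"util.py": "def f(): return 1\n"})         # fix a bug in it

history = [A, B, C]

# C can move in front of B only because they touch different files.
can_reorder = not touches_same_files(B, C)
rewritten = [squash(A, C, "A'"), B]

# The rewritten history A'-B yields the same final tree as A-B-C.
same_result = apply_all(rewritten) == apply_all(history)
```

The point of the sketch is the invariant: reordering and combining commits changes how the history reads, while the final tree stays the same. A real tool also has to handle the case where B and C touch the same file, which this model simply refuses to reorder.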
I was checking into a CVS project the other day (for a client) and wanted to do this. Then I realized, you can't move things around in CVS like this *twitch*. So nowadays I do everything in git and only after the changes are beautiful and self-contained and well-commented do I check them into CVS one at a time.
Okay so the point is, check out git (or honestly? Check out ANYTHING that isn't CVS or svn). Even if you think Linus is an asshole (which he is) or you don't like the git UI (it's not that bad now), check it out anyway.
And if you don't use SCM at all? You suck. Start learning. It's a best pract
Re:git is pretty cool, take a closer look (Score:5, Informative)
Distributed version control gaining ground in FOSS (Score:5, Informative)
A common pattern in development is to try one approach, test it, tweak it, and possibly try another approach if the first did not work out, perhaps reverting to a prior approach. With decentralized version control, you can commit your changes to a local repository and work from there. All the local changes you make are versioned, and can be committed, checked out, and examined, all without contacting a central repository. This is ideal, because you often want to try various options to find the one that works best, before pushing your changes to the rest of the world. In centralized version control, you can use a branch for this purpose, but often branches in these systems are difficult to create, merge, or maintain, so they are rarely used. The end result is that with centralized version control, developers version their workspace in their head. DVCS systems remove the mental burden.
Fortunately, FOSS developers are realizing the usefulness of DVCS and major projects are converting to some form of DVCS. Mozilla is switching to Mercurial [mozillazine.org]. The Pidgin [pidgin.im] project, which just released 2.0.1, is using Monotone [pidgin.im]. (Linus favorably mentioned both of these version control systems in his Git talk, as they are both distributed).
Once you accept that DVCS is better than the centralized model (which may not be true for some situations), only a few (but growing number of) version control systems are viable. This is currently a hot area in open source development, with software such as GNU Arch, Monotone, Mercurial, Git, Darcs, Bazaar, and more paving the way. Many open source DVCS's are still in development and not ready for general usage. I can't speak for Mercurial, but Monotone doesn't have the greatest performance, instead preferring integrity over speed. This led Linus to write git, since speed is very crucial for a large project like the Linux kernel.
Whatever the actual program (git, Mercurial, or Monotone), more and more open source developers are realizing the advantages that distributed version control can offer. I encourage all developers that haven't used any DVCS to try it -- once you do, you won't go back.
Re:Distributed version control gaining ground in F (Score:4, Informative)
For what it's worth, I use Monotone daily and find the performance acceptable. For the record, Linus used Monotone at a particularly bad time in its development cycle [mail-archive.com], when it was very slow and the main designer was on vacation. Nonetheless, the Monotone developers emphasize correctness and integrity over speed, and Mercurial and Git were direct responses to the performance of Monotone. Still, the performance of Monotone is always improving.
Re:Distributed version control gaining ground in F (Score:4, Insightful)
There is still a central or main copy, otherwise you'd be herding a lot of slowly diverging forks! Most projects want to produce a release eventually, and there is a main copy of the source code which the release is produced from.
Imo, Linus dislikes SVN and CVS and pretty much everything else because of speed, because most SCMs work at the level of individual commits instead of merging different copies of a repository, and because they do not allow development to route easily around the central copy.
~$ mv CommitAccess MergePrivileges (Score:5, Insightful)
(http://excelcia.org/)
Essentially, he explains, the secret with Git is that everyone has commit access on their own branch - they do whatever they want. He says that the way it works is that someone does something cool with their own branch, then they start hollering to say "Hey, I have a good branch, merge mine" and it will get merged. Politics over.
Ok, so now I'm scratching my head. How is this a fundamentally different paradigm? In CVS, basically anyone can check out the whole tree and make any changes they like. They can then say, see, my changes are good, and ask for them to get committed or ask for commit access themselves. In Git, this commit access bottleneck is just moved from the commit stage to the merge stage. You make your changes, commit them to your separate and unique branch, and then ask someone with merge access to merge it, or to give you the ability to merge it into mainstream. How exactly does this eliminate the politics? You are still going to have some people with "the power" and some people without. In any project where you have people who are going to fight about who gets commit access, you'll just have a fight about who has the ability to merge into mainstream.
So, ok, distributed is nice (though for some projects central may be preferred) but I don't see how this magic system bypasses politics. In fact, I can potentially see more internal politics over this method. I can see factions gathering to support this or that branch, arguing about which is better, fighting about which one gets merged in. I can see the potential for branches going longer between merges, and more changes happening at once, making it harder to track problems. I don't claim these scenarios are more likely, but I do claim that this changing from a commit access to a merge access paradigm is just renaming the problem.
Re:~$ mv CommitAccess MergePrivileges (Score:5, Informative)
(http://iabervon.org/~barkalow/ | Last Journal: Saturday May 31 2003, @02:01AM)
It's not that the politics go away. It's that the policy is no longer a binary "yes or no" decision, so the technical arrangement mirrors the social arrangement. This doesn't work with CommitAccess because people wouldn't commit the same change everywhere they should, and they couldn't be restricted to only making changes they're trusted to make (there are people who are trusted to correct spelling in comments in any file in the tree, and Linus can look through the total changes they send and verify that they only change spelling in comments).
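As a rough illustration of that kind of per-contributor check, here is a sketch of how someone merging a change could verify that it only touches comments. The helper names (`is_comment`, `only_comments_changed`) are hypothetical, the comment test is a crude assumption about C-style sources, and nothing here is how Linus or git actually does it.

```python
import difflib


def is_comment(line):
    """Very rough test for a C-style comment line (an assumption).

    It would also match code such as '*p = 1;', so a real check
    would need an actual parser.
    """
    stripped = line.strip()
    return stripped.startswith(("//", "/*", "*"))


def only_comments_changed(old_text, new_text):
    """True if every added or removed line is a comment line."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    for entry in diff:
        tag, line = entry[:2], entry[2:]
        # '+ ' and '- ' mark added and removed lines; unchanged lines
        # and '? ' hint lines are skipped.
        if tag in ("+ ", "- ") and not is_comment(line):
            return False
    return True


old = "/* initalize the counter */\nint n = 0;\n"
spelling_fix = "/* initialize the counter */\nint n = 0;\n"
code_change = "/* initialize the counter */\nint n = 1;\n"

spelling_ok = only_comments_changed(old, spelling_fix)   # comment-only edit
code_ok = only_comments_changed(old, code_change)        # touches real code
```

A check like this runs on the total change someone sends, so the trust decision can be made per change rather than as a yes-or-no grant of commit access.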