Performance Tuning Subversion

BlueVoodoo writes "Subversion is one of the few version control systems that can store binary files using a delta algorithm. In this article, senior developer David Bell explains why Subversion's performance suffers when handling binaries and suggests several ways to work around the problem."
This discussion has been archived. No new comments can be posted.

  • Re:Why binaries? (Score:2, Insightful)

    by eikonoklastes ( 530797 ) on Wednesday May 23, 2007 @03:57PM (#19243497) Journal
    Tracking images/graphics while developing a web site?
  • Re:Why binaries? (Score:5, Insightful)

    by jfengel ( 409917 ) on Wednesday May 23, 2007 @04:04PM (#19243615) Homepage Journal
    It's really nice to be able to have your entire product in one place and under version control. Third party DLLs (or .so's or jars), images, your documentation... just about anything that's part of your product.

    That way it's all in one place and easily backed up. If you get a new version of the DLL/jar/so you can just drop it into a new branch for testing. If your customer won't upgrade from version 2.2 to version 3.0, you can recreate the entire product to help fix bugs in the old version rather than just saying, "We've lost it, you've got to upgrade."

    Basically, by putting your entire project under version control, you know that it's all in one place, no matter what version it is you want. Even if the files don't change, you know how to reconstruct a development installation without having to dig around in multiple locations (source in version control, DLLs in one directory on the server, etc.)

    Yeah, so it costs some extra disk to store it. Disk is cheap.
  • Re:Why binaries? (Score:5, Insightful)

    by autocracy ( 192714 ) on Wednesday May 23, 2007 @04:05PM (#19243643) Homepage
    Oh, I shouldn't feed trolling... but he does have an account... The target audience and main users of Subversion are not "high level network techs." Software developers / coders is where you want to look. That said, I'm disappointed in the article... I was hoping for tweaks rather than "use a tarball." The information / stats provided was interesting, though.
  • Re:Why binaries? (Score:5, Insightful)

    by javaxman ( 705658 ) on Wednesday May 23, 2007 @04:11PM (#19243743) Journal
    1) you want deployment without the need to build
    2) you have proprietary build tools limited to developer use, or release engineers unable to build for whatever reason ( similar to #1, I know... )
    3) images, of course.
    4) Word, Excel, other proprietary document formats are all binary.
    5) third-party binary installation packages, patches, dynamic libs, tools, etc.

    You're just not trying, or you're thinking of version control as something that only programmers would use, and that they'd only use it to store their text source. There are as many reasons to store binary files in version control as there are reasons to have binary files...
  • by OverlordQ ( 264228 ) on Wednesday May 23, 2007 @04:13PM (#19243773) Journal
    Subversion fails to follow symbolic links that point to code that other projects share for the sake of a minority that still develops using Windows (which doesn't have real symbolic links).

    I am an SVN newbie, but that kinda sounds like Externals (svn:externals).
  • by Vellmont ( 569020 ) on Wednesday May 23, 2007 @04:38PM (#19244195) Homepage

    If SVN is so great... why is the majority not using it? It's not like it is entirely new.

    Momentum for the most part. CVS is good enough 95% of the time, so it takes some reason to change over. I've recently started using svn after using cvs for years. I'm still not as familiar with svn as I am with CVS.

    Personally I don't really like the different branching/tagging behavior in subversion, but I also think I just don't know it as well. Someday I'll have to find some decent documentation on how to use it properly.
  • by Cee ( 22717 ) on Wednesday May 23, 2007 @04:54PM (#19244451)
    Yes, version control is more difficult than not using any tool at all, but that goes for most stuff in life. There are certainly areas where usability can be improved.

    Fiddling with stuff you are not supposed to fiddle with is generally a no-no when using source control. I found though that I got used to the Subversion way to do things (learned that the hard way). For example Subversion on the client side does not really handle server side rollbacks of the complete repository since the files are cached and hashed locally. One way to make source control more transparent to the user could be to let the filesystem handle it.
  • by Anonymous Coward on Wednesday May 23, 2007 @05:28PM (#19245013)
    You mentioned that perforce allowed you to directly make changes to files and later reconcile them. Generally speaking that's a version control nightmare (as it's expected that you check in and check out copies) and many users do exactly that.

    You sound like someone who's only used to the VSS way of doing things. Lock-Edit-Release. Try this with a parallel development shop where different teams are on different continents, throw in production support and bug fixing, and you'll quickly see where the true nightmare lies. (Especially if you don't do any branching or tagging)

    SVN/CVS users normally do optimistic locking, i.e. Copy-Edit-Merge.

    I personally prefer to have my local copy completely disconnected from source control, allowing me to edit files willy-nilly. (maybe to test some changes or do some debugging)

    Generally speaking, it's only a nightmare if you don't know what you're doing, or don't know how to merge.
  • by iangoldby ( 552781 ) on Wednesday May 23, 2007 @05:34PM (#19245071) Homepage
    If you put the toolchain into CM, do you also put the operating system in? Just as the source code is no good if you don't have the right toolchain to build it, the toolchain is no good if you don't have the right OS to run it.

    I suspect the answer (if you really need it) is to save a 'Virtual PC' image of the machine that does the build each time you make an important baseline (or each time the build machine configuration changes). Since the image is likely to be in the GB size range, you might want to store it on a DVD rather than in your CM system.
  • Re:Why binaries? (Score:5, Insightful)

    by rblancarte ( 213492 ) on Wednesday May 23, 2007 @06:03PM (#19245369) Homepage
    I was thinking the same - especially since I use Subversion.

    But taking a quick look at the article, I get an idea - storing your binaries at different version levels w/ it. Say I am developing a software package and use SVN for each level of revisions. With major releases I could store the produced binaries with the package to prevent the need to recompile when I am pulling down a version. Basically it would truly version control your binaries as well.

    In some ways the article makes me wish I did that with the project I am currently working on. I might start doing it now.

  • by jgrahn ( 181062 ) on Wednesday May 23, 2007 @06:11PM (#19245447)

    You are not alone, but I think the problem is intrinsic (or nearly so). VC is one more thing you have to worry about that is not actually doing your work.

    If it isn't about doing your work, then why do you do it?

    Of course it is about doing your job. If you're a programmer, it's analogous to asking your C compiler not to suppress warnings. You would have to find those bugs anyway, and you would do a much worse job without the help.

    In my work, version control (or whatever fancy name ending in "management" you like to put on it) relieves me of enormous burdens. It lets me do separate work in isolation. It lets me plan and replan my work, reschedule so that feature B gets delivered before feature A. It lets me review other people's changes, and it lets others review mine. It lets me track the root cause of a bug, created years ago. It lets me know exactly what I delivered to some poor guy.

    Note though that you need more than a tool. You need to have a common view on how to use it in your environment.

    And you cannot have people who think it's useless non-productive non-work, because they won't care -- and quite soon they will turn it into useless non-productive non-work by taking "a few shortcuts" which negate all the positive effects of version control, making it analogous to wearing an expensive Armani suit and leaving the fly open.

  • Re:What about git? (Score:3, Insightful)

    by javaxman ( 705658 ) on Wednesday May 23, 2007 @06:40PM (#19245753) Journal

    The only reason I'd ever choose native Subversion over a newer system like git or Mercurial is if I needed some tool that had builtin Subversion integration and didn't support anything else. Absent that criterion, IMO if you choose Subversion it's a sign you don't really understand version control too well.

    What if you have a bunch of developers working with some ( unfortunately, let me say that ) Windows-only tools for historical reasons ? Are you really saying that I should have a team of Visual Studio users install Cygwin on their systems ?

    git is great for Linux kernel developers, but 'install this massive compatibility layer to use this product' will fail to make you a lot of friends, especially in a Windows-friendly corporate environment. I say that as an avid, daily Cygwin user and longtime Windows hater. We could have maybe picked Mercurial, but a year ago when we looked, it didn't even hit our radar as a possibility.

    Subversion has some little issues, but it's getting lots of attention, and the problems aren't bad. I'm a little suspicious that the performance claims of Mercurial might not be measuring apples-to-apples... an 'svn commit' is both an 'hg commit' and 'hg push', if you want to be fair.

  • by GrievousMistake ( 880829 ) on Wednesday May 23, 2007 @08:08PM (#19246655)
    Monotone is my current favourite also, but it's pretty different from the CVS/SVN style of work, and not nearly as widespread, which makes it harder to use in a team project. Git borrows a lot from it and gets exposure from being used for Linux kernel VC.

    There are still some reasons for choosing SVN over monotone though, the major one for me is partial checkout, which you learn to appreciate once you've been stuck behind dialup or on a cell phone. (On the other hand, SVN doesn't do complete checkouts.)

    People tread carefully when dealing with their version control. I think both Sourceforge and Gnome only relatively recently went from CVS to SVN. If you're still using CVS for current projects (or God forbid, Visual SourceSafe), it may make sense to get them switched over to SVN, and use monotone for small sandbox projects until you can make a good case for using it in a new, bigger project (especially one where you anticipate a lot of branched work, maybe with branches maintained in parallel).
    It seems simpler to develop and integrate tools with monotone than with CVS, and there's development going on for things like trac support, so I have high hopes for the eventual availability of a large number of tools for working with monotone.
  • by Anonymous Coward on Wednesday May 23, 2007 @08:25PM (#19246797)
    *sighs* So learn to use the tools you use properly then.

    >1) You want to make a copy of trunk to send to somebody:
    > tar cvf project.tar .

    tar cvf project.tar --exclude .svn .

    That excludes the subversion metadata. But, if you do that, you are most definitely doing the wrong thing. Never, ever, send things to third parties without checking things in and noting the revision of the stuff you send. Doing otherwise defeats one of the purposes of version control, to keep track of what is happening. If you send somebody a copy of your working tree, three weeks later you have absolutely no idea of what you actually sent him. If whatever you have checked out is broken and you don't want to break the trunk, create a branch, commit on that branch, and do an export from that branch instead. Branches are almost free in Subversion (only some metadata gets copied) and the advantages of knowing what you sent off are immense.
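
The branch, commit, export sequence recommended in the last comment might look roughly like the sketch below. It is only an illustration: the repository URL, the conventional trunk/ and branches/ layout, and the helper names `run` and `snapshot_for_third_party` are all assumptions, not anything taken from the thread. Because `svn export` writes a tree without .svn directories, no `--exclude` is needed when archiving.

```shell
#!/bin/sh
# Sketch only: the trunk/ and branches/ layout and every name here are assumptions.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_for_third_party REPO_URL BRANCH_NAME OUT_DIR
# Intended to be called from the root of a working copy of trunk.
snapshot_for_third_party() {
    repo_url=$1
    branch=$2
    out_dir=$3
    branch_url="$repo_url/branches/$branch"

    # Cheap server-side copy: create the branch from trunk.
    run svn copy "$repo_url/trunk" "$branch_url" -m "Branch $branch for external snapshot" || return 1
    # Point the working copy at the branch; local modifications are kept.
    run svn switch "$branch_url" . || return 1
    # Commit the work in progress on the branch, leaving trunk untouched.
    run svn commit -m "Snapshot sent to third party" || return 1
    # Export a clean tree (no .svn metadata) and archive it.
    run svn export "$branch_url" "$out_dir" || return 1
    run tar cf "$out_dir.tar" "$out_dir" || return 1
}
```

Called from a trunk working copy as, say, `snapshot_for_third_party http://svn.example.com/repo client-fix snapshot`, this leaves `snapshot.tar` to send off, and the revision number that `svn commit` prints is the thing to write down. Setting `DRY_RUN=1` prints the commands instead of running them, which is a safer way to try it first.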
