Performance Tuning Subversion

Posted by ScuttleMonkey
from the geek-tweaks dept.
BlueVoodoo writes "Subversion is one of the few version control systems that can store binary files using a delta algorithm. In this article, senior developer David Bell explains why Subversion's performance suffers when handling binaries and suggests several ways to work around the problem."

  • Re:Why binaries? (Score:4, Informative)

    by autocracy (192714) <slashdot2007NO@SPAMstoryinmemo.com> on Wednesday May 23, 2007 @03:57PM (#19243491) Homepage
    First answer: Images. Many other possible answers... :)
  • Re:Why binaries? (Score:4, Informative)

    by teknopurge (199509) on Wednesday May 23, 2007 @03:59PM (#19243537) Homepage
    release management - you can store _compiled_ application bundles, ready-to-go.
  • Re:Why binaries? (Score:5, Informative)

    by Anonymous Coward on Wednesday May 23, 2007 @04:00PM (#19243557)
    putting a toolchain under CM control, so that you can go back to not only an earlier version of your own code, but the version of the toolchain you used to compile the code at that point in time. Absolutely necessary to be able to recreate the full software environment of a past build, without relying on that version of the toolchain still being publicly available (not to mention including any patches/mods you made to the public toolchain).
  • by scribblej (195445) on Wednesday May 23, 2007 @04:04PM (#19243617)
    You ever try to move a directory structure full of source code from one place to another in CVS -- or even to move or rename a single file...?

    HINT: When you do it the way CVS provides, you will lose all of your revision history.

    SVN does not have this fatal flaw.

  • by frodo from middle ea (602941) on Wednesday May 23, 2007 @04:08PM (#19243681) Homepage
    My solution: use svn+ssh and keep an ssh connection to the svn server open in master mode. All svn+ssh activity tunnels through this master connection, so there's no need for an ssh handshake each time, or for that matter no need to even open a socket each time.

    Plus, if the master connection is set to compress data (-C), then you get transparent compression.

    Now if only I could expand all this to fit 2 pages....Profit!!!
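
    A rough sketch of the master-connection setup described above, written as a small Python helper that only builds the command lines. The host name and control-socket path are made up for illustration; the ssh flags (-M, -S, -C, -N) are standard OpenSSH options, and SVN_SSH is the environment variable Subversion consults for the svn+ssh tunnel command. Treat it as one possible arrangement, not necessarily what the poster actually runs.

```python
import os
import shlex

# Hypothetical control-socket location (an assumption for this sketch).
# ssh expands %r, %h and %p to the remote user, host and port.
CONTROL_PATH = "~/.ssh/svn-master-%r@%h:%p"


def master_command(host, control_path=CONTROL_PATH):
    """Build argv for a long-lived, compressed master connection.

    -M  put this ssh client into master mode for connection sharing
    -S  path of the control socket that later ssh clients attach to
    -C  compress all data sent over the connection
    -N  run no remote command; just hold the connection open
    """
    return ["ssh", "-M", "-S", os.path.expanduser(control_path), "-C", "-N", host]


def svn_ssh_value(control_path=CONTROL_PATH):
    """Value for SVN_SSH so that svn+ssh:// reuses the master's socket.

    Assumes the expanded path contains no spaces, so no quoting is needed.
    """
    return "ssh -S " + os.path.expanduser(control_path)


if __name__ == "__main__":
    # svn.example.com is a placeholder host name.
    print(" ".join(shlex.quote(arg) for arg in master_command("svn.example.com")))
    print("export SVN_SSH=" + shlex.quote(svn_ssh_value()))
```

    Start the first command once in a spare terminal, export SVN_SSH in the shell where you run svn, and later svn+ssh operations should ride on the already-open connection.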

  • by Anonymous Coward on Wednesday May 23, 2007 @04:21PM (#19243905)
    You ever try to move a directory structure full of source code from one place to another in CVS -- or even to move or rename a single file...?

    HINT: When you do it the way CVS provides, you will lose all of your revision history.

    SVN does not have this fatal flaw.


    What the hell are you talking about? You just log into the CVS server and move the directory/file in the repository.

    Having to write config files by hand to route around non-existent symbolic link support on a platform that does support symbolic links is what I call a "fatal flaw".

    If SVN is so great... why is the majority not using it? It's not like it is entirely new.

    I can tell you why. Because developers are still angry with that wet-script-kiddie-dream-called-autoconf and its self-important complaints about M4-here and can't-find-AC_blablabla-there. They don't want to run into the next self-important barrier on their way to actually getting their project done. CVS just WORKS! For many years now. And if you have problems moving files/directories because your project is hosted on SF then that's the consequence of your choice and not CVS's fault.

    But maybe it's more about configuring the project's development environment these days than getting work done.

  • by eli173 (125690) on Wednesday May 23, 2007 @04:25PM (#19243965)

    Anyway, it's safe practice to check in the trunk modifications before you merge.

    I think you missed his point... he'd committed all his changes. The problem is that if you merge a file or directory deletion in, where that file or directory had modifications committed, Subversion won't tell you about the conflict, but will delete the file or directory including the new modifications.

    You wanted to delete it, so who cares, right?

    Subversion represents renames as a copy & delete. So now, you rename a file or directory, and do the same dance as above, and the renamed file or directory does not have changes that were made on trunk under their previous names. So renaming a file can re-introduce a bug you already fixed.

    No big deal, the devs will fix it soon, right? Wrong [tigris.org] and wrong again [tigris.org].

    That is the problem.
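
    To make the rename case above concrete, here is a toy model in Python. It is not Subversion's merge code; it just assumes, as described above, that a rename is recorded as "add the new path with the old content" plus "delete the old path", and that the delete is applied to trunk without a conflict check. The file names and contents are invented.

```python
def merge_branch_changes(trunk, branch_ops):
    """Replay branch operations onto a trunk snapshot (toy model only)."""
    result = dict(trunk)
    for op in branch_ops:
        if op[0] == "add":
            _, path, content = op
            result[path] = content
        elif op[0] == "delete":
            _, path = op
            # No conflict check: the path is removed even if trunk modified it.
            result.pop(path, None)
    return result


# State of the tree when the branch was created.
base = {"util.py": "def f(): return 1  # buggy"}

# Trunk later commits a fix to the same file.
trunk = {"util.py": "def f(): return 2  # fixed"}

# The branch renames util.py to helpers.py, recorded as copy + delete.
# The copy carries the content the branch saw, i.e. the unfixed version.
branch_ops = [
    ("add", "helpers.py", base["util.py"]),
    ("delete", "util.py"),
]

merged = merge_branch_changes(trunk, branch_ops)

if __name__ == "__main__":
    for path, content in merged.items():
        print(path, "->", content)
```

    After the merge, the only file left is helpers.py with the buggy body: the fix committed on trunk under the old name is gone, and nothing flagged a conflict.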

  • by weinerofthemonth (1027672) on Wednesday May 23, 2007 @04:40PM (#19244247)
    Based on the headline, I was expecting some great method for tuning Subversion for increased performance. This article was about performance tuning your processing, not Subversion.
  • Re:Why binaries? (Score:1, Informative)

    by Anonymous Coward on Wednesday May 23, 2007 @04:45PM (#19244317)
    For this you should create a software repository where you store your jars / exes / binary files, and include their version number in the name or directory. Then back it up.

    Version Control is for when you can actually see a difference in versions.

    If you have jars checked into CVS / SVN you should move to using something like Maven so you can store your internal jars on a web server.
  • by Crazy Taco (1083423) on Wednesday May 23, 2007 @04:53PM (#19244439)

    For many open source projects, finding good documentation is hard. In the case of Subversion, it couldn't be easier. In fact, the Subversion team has taken documentation to such a level that they should be considered THE model for documentation in the open source community. They have written a book (published in print by O'Reilly, but maintained and posted for free by them on the Internet) that documents their system, and it is very good. My job at the last company I worked for was to write wizards for the Eclipse platform that would automate several of the most common tasks that a Subversion user would try to do, and that book was the only reference I needed. You can find the book on their site here: http://svnbook.red-bean.com/ [red-bean.com] . They even do nightly builds of the book, so not only is their documentation complete and useful, it is also incredibly thorough and up to date.

    If anyone on here hasn't read it, DO IT, because the first half will teach you why you want Subversion rather than CVS or some other alternative, and how to use it and how to get the most out of it (second half is lower level stuff you may not care about). It even includes best practices. Once you really learn how to use Subversion, you won't want to use anything else. And this is the way to get started.

  • by LionMage (318500) on Wednesday May 23, 2007 @05:20PM (#19244903) Homepage

    So at worst it would reintroduce a bug you would be able to find and fix later - but who merges without checking it worked?

    What if the merges are done by someone who isn't familiar with all the code changes and the expected associated application behaviors? What if there are dozens or even hundreds of code changes in a branch being merged to trunk? What if your QA work is being done by people who are not developers and who have no involvement in the merge process?

    These are not just hypothetical issues. I work on a team which espouses the agile methodology, and many times we've missed bug fixes in merges because of the way Subversion treats moves (copy + delete instead of truly changing the parent directory of a given file), or because Subversion's merge facility got confused (especially when changes were made both to the branch and trunk versions of a file).

    Recently, I was put in charge of merging a branch to the trunk for my team's project, and discovered that some methods were duplicated because one of our programmers had deleted the original version of a given method, then pasted a completely different implementation into a different location in the same source file. It was easy enough to catch this with Java classes (since they won't compile correctly if you have two instances of the same method signature in the same class), but JavaScript was a slightly different story...
  • by javaxman (705658) on Wednesday May 23, 2007 @05:35PM (#19245089) Journal
    At least in a general case, I couldn't expect the developers I work with to gzip their binaries before checking them into version control.

    Doing so means you have to unzip them to use them. Not very handy. Most users want to use Subversion the way they should be able to use version control: a checkout should give you all of the files you need to work with on a given project, with minimal need to move/install pieces after checkout. Implementing the 'best' suggested workaround would mean needing a script or other way to get the binaries unpacked. Programmers are often annoyed enough by the extra step of *using* version control, and now you have to zip any binaries you commit to the repository?
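
    For a sense of what that extra unpack step might look like, here is a minimal Python sketch. It assumes the binaries were committed as individual .gz files sitting where the unpacked file should end up; the function name and that layout are my own assumptions, not something the article prescribes.

```python
import gzip
import shutil
from pathlib import Path


def unpack_gzipped(working_copy):
    """Decompress every *.gz under working_copy next to its archive.

    Returns the list of files written, in sorted order.
    """
    written = []
    for archive in sorted(Path(working_copy).rglob("*.gz")):
        if ".svn" in archive.parts:
            continue  # leave Subversion's own metadata alone
        target = archive.with_suffix("")  # foo.bin.gz -> foo.bin
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # stream, so big files stay off the heap
        written.append(target)
    return written
```

    Every developer would have to run something like unpack_gzipped(".") after each checkout or update, which is exactly the kind of extra step people will forget.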

    I'm unimpressed by their performance testing methodology... they give shared server and desktop performance numbers, but have no idea what 'else' those machines were doing? Pointless. I'd like more details regarding what they're doing in their testing. Their tests were done with a "directory tree of binary files", but they don't say what size or how many files.

    My tests on our server show a 28MB binary checkout (LAN, SPARC server, Pentium M client) takes ~20 seconds. Export takes ~2 sec. That must be a big set of files to cause a 9 minute *export*... several gigs, am I wrong? It'd be nice for them to say. Most of us, even in a worst case, won't have more than a few hundred MB in a single project.
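
    Back-of-the-envelope version of that estimate, as a few lines of Python. It uses only the numbers quoted above and assumes transfer time scales linearly with size on comparable hardware, which is a big assumption given that the article doesn't describe its test machines.

```python
# Figures from the measurements quoted above.
SIZE_MB = 28
CHECKOUT_SECONDS = 20
EXPORT_SECONDS = 2
ARTICLE_EXPORT_SECONDS = 9 * 60  # the 9 minute export


def implied_size_mb(rate_mb_per_s, seconds):
    """Data volume moved at a constant rate (linear-scaling assumption)."""
    return rate_mb_per_s * seconds


checkout_rate = SIZE_MB / CHECKOUT_SECONDS  # MB/s for a checkout
export_rate = SIZE_MB / EXPORT_SECONDS      # MB/s for an export

size_at_export_rate = implied_size_mb(export_rate, ARTICLE_EXPORT_SECONDS)
size_at_checkout_rate = implied_size_mb(checkout_rate, ARTICLE_EXPORT_SECONDS)

if __name__ == "__main__":
    print(f"export rate:   {export_rate:.1f} MB/s")
    print(f"checkout rate: {checkout_rate:.1f} MB/s")
    print(f"9 min at export rate:   ~{size_at_export_rate / 1024:.1f} GB")
    print(f"9 min at checkout rate: ~{size_at_checkout_rate / 1024:.1f} GB")
```

    At my export rate, nine minutes works out to something over 7 GB; even at the slower checkout rate it is still several hundred MB.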

    The only *real* solution will be a Subversion configuration option which lets you say "please, use all my disk space, speed is all I care about when it comes to binary files". CollabNet is focused enough on getting big-business support contracts that it shouldn't be long before we see this issue addressed in one manner or another. You -know- they're reading this article!
