
Tips on Managing Concurrent Development?

Posted by Cliff
from the dealing-w/-many-hands-in-the-cookie-jar dept.
An Anonymous Coward queries: "I work on a fairly large project with at least a dozen developers. Advanced tools like CVS and ClearCase allow concurrent development, and provide merging tools to merge in different changes to the same file. This can be a significant productivity gain, particularly with files that are unavoidably common to several developers (C header files, most notoriously). During crunch times, such as before delivery deadlines, we often find that we are checking in changes to the same file several times a day, often hourly. The problem does not seem to be with conflicting changes to the same lines of code, but rather with developers knowing the sequence in which concurrent changes will be checked in. It is not possible to always be aware of who is checking in what and when, so programmers submitting patches to the baseline often have to redo those patches multiple times in a day in order to have them applied. Have other programming projects developed solutions for dealing with this problem?" The submitter proposes a solution below; how well would it work?

"Take, for example, the extreme case of something like Linux (not only concurrent development, but geographically distributed development), how is this managed? One solution we were contemplating was to try to do an 'air traffic control' type of sequencing and conflict resolution. As early as possible in the development stage, we try to identify what will be finished when, and assign a one-up sequence number to each patch. Developers then know that they will be patching against the baseline that was patched by the patch with the previous sequence number. It is hoped that this prevents a lot of rework of patches. A potential problem with this approach is the need for a responsive central authority to assign sequence numbers. Also, such sequence numbers may have to be rearranged in the face of last minute advances and setbacks in developer progress. Despite careful scheduling and detailed design, it may be impossible to know the exact check-in sequence of patches more than a week or two in advance.

Will such an idea be successful, or is it fatally flawed? Are there better solutions to the problem with less effort? Are we treating symptoms and not the disease (i.e., should we be planning better so that we know patch sequences and dependencies early on)? Management likes to keep staff productively occupied and working up until deadlines, so this usually means a lot of checkins within a short period of time, rather than staged checkins. Can checkins be spread out over time while keeping developers productively occupied?"
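
The "air traffic control" proposal above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Baseline` class, its `submit` method, and the rule that every patch names the sequence number it was built against are hypothetical and are not features of CVS or ClearCase. The sketch shows where the rework comes from: a patch built against sequence N is rejected once another patch has already landed on top of N.

```python
class StalePatch(Exception):
    """Raised when a patch was built against an out-of-date baseline."""


class Baseline:
    """Toy baseline that accepts patches strictly in sequence order (hypothetical model)."""

    def __init__(self):
        self.seq = 0        # sequence number of the last applied patch
        self.applied = []   # names of applied patches, in order

    def submit(self, name, built_against):
        # A patch is only accepted against the baseline it was prepared for.
        if built_against != self.seq:
            raise StalePatch(
                f"{name} was built against #{built_against}, baseline is at #{self.seq}"
            )
        self.seq += 1
        self.applied.append(name)
        return self.seq


baseline = Baseline()
baseline.submit("alice-header-fix", built_against=0)  # becomes patch #1
baseline.submit("bob-new-api", built_against=1)       # becomes patch #2

try:
    # Carol was scheduled before Bob, slipped, and never rebuilt her patch.
    baseline.submit("carol-refactor", built_against=1)
except StalePatch as err:
    print("rework needed:", err)
```

In this model the scheme only saves work if the planned order holds; any slip turns into a rejected patch, which is the rescheduling risk the submitter already anticipates.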

This discussion has been archived. No new comments can be posted.

  • by Anomolous Cow Herd (457746) on Friday March 15, 2002 @05:19PM (#3170630) Journal
    Even though it is backed by CVS (and you could possibly get away with using just that), SourceForge OnSite (c) (sold by VA Software at a reasonable price) makes managing CVS and concurrent code development a snap! Just plug it in and code away with the knowledge that you are paying for the support services of one of the leading vendors of enterprise-grade Linux solutions.

    I wouldn't, however, recommend working with anything from Microsoft. Benchmarks and real-life statistics have shown that their source control solutions are not only slower, but are also less stable and more likely to corrupt your source tree. I hope you have backups!

  • BitKeeper (Score:2, Interesting)

    by EricKrout.com (559698) on Friday March 15, 2002 @05:21PM (#3170645) Homepage
    http://bitkeeper.com/Products.BitKeeper.html [bitkeeper.com]

    If it's good enough for Linus & friends, it's good enough for me ;-)

    MONOLINUX.com :: All Linux. No ads. [monolinux.com]
  • Subversion! (Score:3, Interesting)

    by Pointer80 (38430) on Friday March 15, 2002 @05:21PM (#3170650)
    Check out subversion [tigris.org].
    It's CVS, but better and based on WebDAV for RPC and BerkeleyDB for storage.

    Cheers,

    pointer
  • Extreme Programming (Score:5, Interesting)

    by Frank Sullivan (2391) on Friday March 15, 2002 @05:28PM (#3170698) Homepage
    Check out the development techniques of Extreme Programming (just search Google, silly, and buy a book or three). They have a real solid handle on concurrent rapid development.

    The real heart of Extreme Programming is "test-first" programming. The entire development process revolves around unit and integration tests, for extremely fine-grained control over code quality. Any changes that might impact other code should break a test. You fix the stuff that breaks, check in your changes, and move on.

    Multiple programmers touching the same C files many times a day sounds like you have either design issues, structural issues, or both. That just should not happen, crunch time or not. Heck, crunch shouldn't happen if you're managing your development correctly.

    If you're using cvs, conflicts with source checkins should be very easy to resolve. Even if two programmers touch the same file, they shouldn't be in the same function. If they are, you're back to management and architecture problems, and you need to fix those NOW before work grinds to a complete halt.
  • by jordan (17131) on Friday March 15, 2002 @05:38PM (#3170742) Homepage
    This is a common problem most easily managed by a human, not automated tools like CVS. On a per-tree, per-branch or even per-module basis, each chunk has a person responsible for managing changes to the tree, which I traditionally call the "Patch Master". This alleviates the common problem of multiple patches wiping each other out, as described.

    Patches are either sent directly to the patch master, diff'd against a base or branch, or are committed on a per-developer branch, after which the patch master is notified either by built-in CVS mechanism or email. In both cases, it is the Patch Master's responsibility to merge changes from diff or from branches. Merging is a tedious process, but this alleviates the productivity problems affecting everyone on the devteam, limiting it to just one person and allowing everyone else to progress with further development.

    Some people complain that having one person manage patches does not scale (i.e. "Linus does not scale"), but what I'm suggesting is a more collaborative, distributed, team-oriented approach -- perhaps you have a team of 10 developers with 5 "modules" in active development; each module is assigned a "team lead" as patch master and they are responsible for managing commits.

    --jordan
  • by dant (25668) on Friday March 15, 2002 @05:41PM (#3170760) Journal
    Yikes! You have the process problems of a group five times your size.

    If you have several people changing the same file in a given day, then one of two things (probably both) is wrong:

    • Coordination between features/projects. Somebody should be keeping an eye on the list of fixes/enhancements that are coming down the line and making sure you don't get too many in the same neck of the woods. This person doesn't necessarily have to be a developer (but should be able to speak developerese), and their whole job is to tell people, 'No, I'm sorry, but there's no room in the schedule for feature X that you want. Can I interest you in feature Y, which is in a different part of the code?'

      Also, if your code is fairly big (more than a few hundred thousand lines), you need to break it into logical chunks and assign somebody to watch every checkin to each chunk. That person is a developer and responsible for making sure new code gets reviewed and unexpected changes aren't being made. If your code is smaller, one person can probably do that.

    • Code Architecture. If several functionally-unrelated features end up needing to change the same file, then something is wrong with that file. There's too much going on in it--you've got to be diligent about keeping your components small and keeping each component in a separate file. See the excellent Lakos book [barnesandnoble.com] for tips on how and why to do that.

    Most likely, your organization went way too fast at some point in the course of setting up the core code architecture and the processes by which you decide what does and does not go into a release. You need to get started fixing both--or this problem will keep getting worse and worse until you're unable to move forward through your own inertia.
  • by Rogerborg (306625) on Friday March 15, 2002 @05:51PM (#3170803) Homepage
    • To put this in perspective - while at Oracle with 1000s of engineers working on the same tree, we used ClearCase and it was awesome. The difference here is that there was a much steeper learning curve, and no normal engineers could actually do complex tasks - i.e. create branches etc. We had a complete group dedicated to ClearCase.

    To put this in perspective, I currently handle the Clearcase side of a transatlantic development effort, with maybe 200 developers. The other side uses Continuus (office politics, don't ask). They have a complete config/build group. They even have a tools group that does nothing but evaluate, purchase and support tools for the config/build group. Until very recently, I handled the Clearcase side on my own. Part time (I'm a developer). It got to the stage where I would actually take the source from Continuus, import it to Clearcase, produce reports, perform a build and test it before the Continuus team could do it, and my builds got used in preference to theirs.

    Just goes to show, there's always a worse system, or other alternatives to explore. The developers who're used to using Continuus are all in love with Clearcase, and rebellion is brewing. One guy said that he'd learned to do in Clearcase in two weeks what it had taken him two years to learn in Continuus. And yet I agree with you: CVS is even easier than Clearcase, and does everything you'd need to do on a typical project!

  • Clearcase branches (Score:2, Interesting)

    by VictimlessChris (562438) on Friday March 15, 2002 @05:52PM (#3170810)
    I work with a number of other software developers on a fairly large project where things are constantly changing. We currently use Clearcase. To avoid multiple submissions in order to have changes accepted once, each developer has their own branch, which is a copy of the main one. Developers work in their branch changing files until they feel they're ready to submit changes to the baseline. They merge the current baseline into their branch, grabbing all changes so far, and then merge back into the main branch. This effectively checks out the files, so only one person can make and submit changes at a time.
  • by feloneous cat (564318) on Friday March 15, 2002 @05:56PM (#3170837)
    Oh, did I mention design?

    Most companies find (the ones that actually DO it rather than pay lip service to it) that designing a project PREVENTS the "patches on patches".

    Another thing that many people say they do (but rarely do) is actually have meetings that accomplish REAL goals rather than perceived ones. I have been in "design" meetings that were merely CYA meetings - nothing was designed and it was all a waste of time. On the other hand, I have been in meetings that I was invited to (but really had no business being in) that actually a) SOLVED problems BEFORE they happened or b) reworked the nature of the beast so that it was not nearly so intractable a design.

    Communication. Not just CYA, but actually TALKING and LISTENING (you wouldn't believe the number of software engineers that just will talk and talk and never hear a damn thing).

    Making it a death penalty to break someone's locks helps too...

  • by Anonymous Coward on Friday March 15, 2002 @06:14PM (#3170910)
    Another note, I get really worried when people say that process problems only show up at the end crunch time. If it is crunchtime it is time to use all of your processes, because the processes should be designed to produce the best bug free code the quickest... otherwise it shouldn't be in the process...

    Of course all real programmers should realise that this attitude is either from someone in academia, someone who never worked on a big project with real deadlines, or he's lying.
  • by Arandir (19206) on Friday March 15, 2002 @06:44PM (#3171049) Homepage Journal
    Where I work we DON'T have this problem. Some minor glitches occur of course, but that's life. But we don't get crunchtime panic.

    So what do we do that's different? I don't know 'cuz I haven't worked for some of these chaotic outfits that everyone talks about. Here's some stuff that we do that might help however: the code base is divided into domains, then subdivided into feature sets; if the code in question isn't in your area, you generally don't work on it; only the feature lead checks in code related to a feature; bugs are assigned to individuals according to their area of expertise; if our code affects other areas or other domains, we alert people in those areas that we will be checking in, giving them enough time to freeze their view. Finally, and this may be a shock to some people, we actually have postponed handoff dates if we aren't ready to hand off.
  • Enjoy Coca-Cola! (Score:1, Interesting)

    by Anonymous Coward on Friday March 15, 2002 @06:46PM (#3171061)
    what's worse: having VA Software promoted on a VA property?

    or having it mod'd up to 5, while competitor (bitkeeper) suggestions mod'd down?

  • by conradp (154683) on Friday March 15, 2002 @07:01PM (#3171125) Homepage
    As someone who has used both CVS and ClearCase extensively, I'd say that this points out one of the major problems with ClearCase: checked-in files are seen immediately by all developers. (This is sometimes touted as an advantage, or even a "feature".)

    CVS lets developers update to see other people's changes at their own convenience. But that also means developers need to exercise some discipline to update frequently enough that their code does not remain too far out of sync with the baseline. This, combined with a "checkin early and checkin often" approach, should really minimize the number of conflicts, even for fairly large projects.

    I can't imagine the problems that the original poster described ever happening with proper use of CVS, but perhaps there's something in that "developing patch sets" phrase that he hasn't fully explained to us.

    Just a couple of other thoughts:

    Distributed ClearCase works reasonably (though I wouldn't say well) for projects that have a few interconnected sites, but is not well-suited at all for a project involving many different developers each in a different location. CVS is ideally suited for that type of environment.

    CVS really needs a way to move or rename files, and a way to do atomic checkins of multiple files. When will this happen? I know, "sooner, if I help."

    No version control system should prevent people from fixing code just because the code "belongs" to someone else, or is "being modified" by someone else. This sort of "coordination" and "planning" hinders progress more than it helps.

    Although it's possible in theory for an automatic merge to succeed while being semantically incorrect (with either CVS or ClearCase), I've never once seen it happen. If your code is well-written, the dependencies on certain assumptions should be fairly collocated, not spread all over the code where they could get out of sync.

    In a large, well-segmented project where the "frequent checkin" policy is used, it is rare indeed that two people even modify the same file at the same time, let alone modify the same lines.

  • Checkin Token (Score:2, Interesting)

    by Technomancer (51963) on Friday March 15, 2002 @07:04PM (#3171139)
    In one company I worked for we used Perforce and a checkin token file. Only the person who had the token was allowed to check in, and he also had to smoke test the project before releasing the token. We also had a tradition of adding haikus to the checkin token file. You can read them in one of the easter eggs :)
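
The token rule described above can be modelled in a few lines. This is a sketch under assumptions, not Perforce's API: `CheckinToken` and its methods are hypothetical names, and the smoke test is passed in as any callable that returns True or False.

```python
class CheckinToken:
    """Toy model of a single check-in token (hypothetical, not a Perforce feature)."""

    def __init__(self):
        self.holder = None   # developer currently holding the token, if any
        self.log = []        # (developer, change) pairs that were checked in

    def acquire(self, developer):
        # Only one developer may hold the token at a time.
        if self.holder is not None:
            return False
        self.holder = developer
        return True

    def checkin(self, developer, change):
        if self.holder != developer:
            raise PermissionError(f"{developer} does not hold the check-in token")
        self.log.append((developer, change))

    def release(self, developer, smoke_test):
        # The token is handed back only after the smoke test passes.
        if self.holder != developer:
            raise PermissionError(f"{developer} does not hold the check-in token")
        if not smoke_test():
            return False
        self.holder = None
        return True


token = CheckinToken()
token.acquire("ann")
token.checkin("ann", "fix header guard")
print(token.acquire("bob"))                # bob has to wait for ann
print(token.release("ann", lambda: True))  # smoke test passed, token is free again
```

The design trades throughput for a baseline that is known to build after every check-in, since check-ins are fully serialized.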
  • by curunir (98273) on Friday March 15, 2002 @07:38PM (#3171268) Homepage Journal
    CVS is most everything you want from revision control

    What about file locking, code promotion, build labels or grouped check-ins? As far as I know, CVS has none of these. These are big issues.

    File locking removes the need for constant branching. Granted CVS's automatic merging capabilities are more advanced than most of its competitors, but branching is the enemy. It should be avoided unless it is absolutely necessary. You lose the ability to have two people work on the same file at once but, from my experience, saving yourself the hassle of losing changes is a big plus.

    Code promotion (as I understand it, I haven't worked too extensively with it) is nice because it allows developers to continue development while their code moves through the QA process and have their bug fixes easily merged back into the source tree.

    Build labels are great because they allow you to group file versions into a logical release (rather than just the current version at a specific date).

    Grouped check-ins are probably the feature that is most lacking in CVS. It amazes me how many people won't call MySQL a real database because of its lack of atomic transactions but are still willing to call CVS a version control system. If all application code was contained in one file, this wouldn't be necessary. However, it is often necessary to make a change to one file that requires a change to another file. If these files are checked in individually (as CVS does it), it is possible to get version conflicts with these files. To make matters worse, if the change needs to be rolled back, you have to remember to roll back both files. The situation gets exponentially worse the more mutually dependent files you check in.

    The only real advantages of CVS over most commercial versioning software are
    a) free...important for open source projects without funding.
    b) readily available to make your source tree available to people outside your development team...also important for open source projects.
    and c) the large selection of front ends (GUI, text, web and otherwise) that have been written for it.

    However none of these features qualify it as being an "advanced" (as the original post called it) version control solution.
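
The difference between per-file and grouped check-ins raised above can be shown with a toy in-memory repository. Everything here is an assumption made for illustration: `ToyRepo`, `commit_per_file`, and `commit_grouped` are hypothetical names, and a `None` file body stands in for any check-in that fails partway through.

```python
import copy


class ToyRepo:
    """In-memory stand-in for a repository: maps path -> list of revisions."""

    def __init__(self):
        self.revisions = {}

    def _store(self, path, text):
        if text is None:
            raise ValueError(f"rejected check-in of {path}")
        self.revisions.setdefault(path, []).append(text)

    def commit_per_file(self, changes):
        # Each file is committed on its own; a failure leaves earlier files committed.
        for path, text in changes.items():
            self._store(path, text)

    def commit_grouped(self, changes):
        # All-or-nothing: restore the previous state if any file fails.
        snapshot = copy.deepcopy(self.revisions)
        try:
            for path, text in changes.items():
                self._store(path, text)
        except Exception:
            self.revisions = snapshot
            raise


# A header change and its implementation; None simulates the second check-in failing.
change = {"api.h": "int f(int, int);", "api.c": None}

per_file = ToyRepo()
try:
    per_file.commit_per_file(change)
except ValueError:
    pass
print(sorted(per_file.revisions))   # the header landed without its implementation

grouped = ToyRepo()
try:
    grouped.commit_grouped(change)
except ValueError:
    pass
print(sorted(grouped.revisions))    # nothing landed
```

With the per-file commit the tree is left half-updated, which is the mismatch between dependent files that the comment warns about; the grouped commit leaves the tree as it was.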
  • by tlambert (566799) on Friday March 15, 2002 @08:26PM (#3171386)
    As the author of the original 386BSD 0.1 PatchKit software, I have to say that your "air traffic control" approach will not work.

    The 386BSD 0.1 patchkit used a serialization of patch numbers, with central assignment. The reason for this was that the patch dependency management was done by manually applying patches posted to Usenet, and then diffing the modified version of the code against a version with the previous N-1 applied.

    Effectively, it was a "human CVS repository" system.

    It was necessary, because the latency in the Usenet system meant that you couldn't "lock down" a file or set of files for some major change: you had to do what you wanted to do against what you had, which was almost never "the most current consensual version" of the code, and then hope someone else didn't win the race to "the repository" (at the time, terry@cs.weber.edu's incoming email, and then, later, Rod Grimes', Nate Williams', and then Jordan Hubbard's... no one wanted it for very long).

    This led to all sorts of problems; the major one was that the patch kit format was "reverse engineered" (not hard; the patch tools, except the creation software itself, were widely distributed), and a group started releasing patches in the "1000+" ID range, under the incorrect impression that the concern was over the patch namespace collision, not topological application problems. This eventually led to a big argument, and other people going off to play in their own sandbox.

    You've probably heard of "NetBSD". A couple (not all, of course) were motivated by community rejection of the 1000+ numbered patches, which, while they were not colliding in serial number space, seriously blew out topological dependency space for modified files.

    In any case, that's exactly what you are doing with your code, when you plan on assigning patch numbers based on expectation of completion.

    With the number of people you have, the comments about contested interfaces being agreed to beforehand, and the comments about you having no real problem here in the first place are probably accurate.

    You can basically take a couple of approaches.

    The first is: don't accumulate patches, just check the code in. This resolves the problem of stale patches by not permitting them to become stale in the first place.

    The second is: "cvs tag" before any major commits, so that there is a baseline from which to work to resolve conflicts.

    Really, you should not be accumulating large patch sets, with as few people as are involved.

    If you have a huge offline latency from a developer or group of developers (e.g. you send a CDROM to Antarctica, and two months later they send back a CDROM with their patches on it), or if you have a huge number of developers, you should reconsider your choice of tools.

    The 386BSD patchkit serialization of patch sequence numbers through a couple of human beings was a serious mistake. It had the emergent property of having a tiered set of privilege. I'm convinced that this is what resulted in the current "core team/committer/less-than-dirt" striation in the BSD camps today.

    I mention this, because CVS has a similar, though somewhat less profound, emergent property of "The One True HEAD Branch". By its nature, it encourages a single direction for all experimentation and all forward looking thought, denying nourishment to any contradictory lines of inquiry, by chopping off the roots. CVS is, in a nutshell, anti-research. It prevents people from going off 90 degrees from where everyone else is headed, and discovering new territory.

    Perhaps you've heard of OpenBSD. It emerged because there was "One True HEAD Branch" in NetBSD (an early adopter of CVS, in Open Source-land), and several people felt strongly enough that the focus of the project should be secure systems research, that the resulting code directions were incompatible.

    Tools issues are at the base of nearly any strong divide you can name in an Open Source community.

    Linux currently has an issue, where Linus is investigating the use of Larry McVoy's BitKeeper (Larry was smart, in that early on, he recognized the emergent properties tools choices force onto projects, and tried to design around the problem). It turns out that a single human CVS repository doesn't scale infinitely.

    FreeBSD is in the throes of a "To use Perforce or not to use Perforce" decision. Perforce supports separate lines of concurrent development.

    It fosters, as my former boss' boss, Ray Noorda, used to say, "coopetition": help each other make the best implementation according to their design, and then may the best design win.

    Perforce lets this happen, but it also tends to balkanize development, if not everyone is using the tool. There are complaints in FreeBSD that significant work is taking place in Perforce branches that aren't visible to normal CVS users. The Perforce users complain back that there would be no need for Perforce, if the development were permitted in the main CVS tree -- along with the breakage that would entail. Both arguments have merit. Right now, there is a truce... more of an agreement to disagree, and not force the issue today, but a promise that the battle will be fought to the death at some later date.

    For your project, a tool which supports multiple concurrent "One True HEAD Branches" seems like it fits the bill (though as I wrote that, I still asked myself why, with so few people, it was an issue for you in the first place).

    Whether the tool you pick is Perforce, Bitkeeper, or some other tool that can support that development model is irrelevant.

    What is relevant is that you understand that our tools shape the way we think about solving problems, and if you have already arrived at an approach that doesn't -- or *can't* -- fit into the shape dictated by CVS, then it's probably time to look at another tool.

    No matter what you do, I can guarantee you that layering another, less adequate, tool on top of an already inadequate tool, will not fix your problem.

    I can also guarantee you that if you can't change your model to fit an existing tool, you're going to find yourself in the source code control tools business, instead of the business you intended to be in.

    Probably, you should rethink whatever premise it is that's resulting in large, infrequently integrated patch sets. If it's just your release engineering department not wanting to do their work on a branch, well, that's tough. Branch tag for releases as a matter of policy, and move on. If on the other hand, it's something more profound, perhaps you need to rethink your assumptions in favor of what the tools can do, vs. what you would like them to be able to do.

    Alternately, welcome to the source code control tool business.

    -- Terry
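
The distinction drawn above between patch-number collisions and "topological" application problems can be illustrated with a small dependency graph. This is a sketch on made-up data, not a reconstruction of the 386BSD patchkit: the patch ids, the `depends_on` table, and the `first_violation` helper are all hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# patch id -> ids of the patches it was diffed against (hypothetical data).
# Every id is unique, so there is no collision in the serial number space.
depends_on = {
    1: set(),
    2: {1},
    3: {2},
    1001: {1},       # out-of-band patch, made against the tree after patch 1
    4: {3, 1001},    # later patch that builds on the out-of-band change
}


def first_violation(order, depends_on):
    """Return (patch, missing_dependencies) for the first patch applied too early, else None."""
    applied = set()
    for patch in order:
        missing = depends_on[patch] - applied
        if missing:
            return patch, missing
        applied.add(patch)
    return None


# Applying strictly by serial number breaks, even though no two ids collide.
numeric_order = sorted(depends_on)
print(first_violation(numeric_order, depends_on))

# An order derived from the dependency graph applies cleanly.
valid_order = list(TopologicalSorter(depends_on).static_order())
print(first_violation(valid_order, depends_on))
```

The point of the sketch is that a serial number only orders patches along one line, while the real constraint is the graph of which patch was built on which.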
  • Re:BitKeeper (Score:2, Interesting)

    by akc (207721) on Saturday March 16, 2002 @02:45AM (#3172380) Homepage
    The funny thing is: this article [monolinux.com] on the site you mention flames bitkeeper to hell.

    I can't see how the article flames bitkeeper to hell. It selectively provides part of the linux kernel mailing list, with the original petition against the use of bitkeeper, followed by a number of the regular kernel hackers stating that a) bitkeeper is good, b) nobody is forced to use it and c) the original author is listening to comments about improving it.

    That definitely doesn't seem like flames to me.

  • What about Aegis? (Score:2, Interesting)

    by dohmp (13306) <dohmp@nOspAM.yahoo.com> on Saturday March 16, 2002 @04:17AM (#3172506)
    i've searched through the threads on this article, and i'm ABSOLUTELY SHOCKED that not one person has mentioned Aegis. i have used cvs since its earliest days, and rcs prior to that (not counting a number of commercial offerings), and i must say, i despise it greatly. i find it to be EXCEEDINGLY clunky at managing concurrent development of FEATURES THAT YOU AREN'T CERTAIN ARE GOING TO MIGRATE TOGETHER INTO PRODUCTION.

    see, in my experiences, when dealing with huge software systems, it's rather tough to get all DEPENDENT changes to move in lock-step with your own project's changes when the problem is rather large. given that, we end up having to create branches & associated tags for EVERY FEATURE which we then merge all around into release-branches, which all just becomes a rather kludgy mess. the time spent on source code control begins to grow exponentially, and the skill required to do this safely grows with it.

    Aegis [sourceforge.net] solves these problems like the commercial configuration management tools (clearcase, PCMS/PVCS, etc, etc) in that it allows for deltas to code to be aggregated together and considered as one atomic change (many times relating these together as implementing a specific managed change (say from a bug-tracking system, etc)).

    this is ABSOLUTELY NECESSARY when you have hundreds of software developers on a project in my (admittedly limited in the grand scale of things) experience.

    i've always been shocked that so many open source developers were simply willing to put up with cvs given all its warts.

    hopefully this article and the discussions surrounding it force some folks to stand up and demand that the "state of the art" be advanced. (i realize that the state of the art is really far beyond what cvs does, but cvs HAS MIND SHARE).

    btw: aegis is GPL'ed and has been around for a LONG time. in addition, its core concepts are similar to almost every other CM tool i've used (even including cvs, not counting the advanced features) so most people can get up to speed with it quickly.

    i'm also confident that the SIMPLER aspects of source code control / configuration management could be integrated into most IDEs by building cvs-command-compliant wrappers around Aegis if the time-to-market for the integration were deemed more important than a native integration...

    oh yeah, last thing: aegis heavily entices you to check in test cases for your code. i've found this to be a simple but effective mechanism to aid people in building good regression test suites...

    i'm curious to see whether i'm alone in my opinions on these topics or not...

    cheers

    Peter
