Slashdot
Designing a New Version Control System?
Posted by
Cliff
on Tue Jul 16, 2002 07:17 AM
from the replacing-rcs-and-obsoleting-cvs dept.
tekvov asks: "When Linus Torvalds decided to use BitKeeper as the version control system for Linux there seemed to be a lot of controversy and many challenges to create a better system than CVS. My question is exactly what would this 'better system' look like? How is the subversion project, Tigris, doing at creating a new version control system? Basically, does the Open Source Community need new tools in this aspect of development? And if so, how should these new tools look?"
This discussion has been archived.
No new comments can be posted.
538 comments
pretty gui's (Score:3, Funny)
Re:Clearcase... (Score:5, Informative)
(Try scaling a clearcase server and you'll see how bad the design is... Hint: Adding more CPUs won't help you.) No, most people won't care, but you do if you need to scale it to several thousands of active developers.
Even though I dislike the product, it has more functions than you'll ever need. Integration with different platforms and products is superb, if you're willing to pay...
However it lacks in areas where the developer isn't fully connected (i.e. with LAN access to the view and vob servers).
IMNHO, what the open source community, by definition, needs is something that'll work in a distributed (and disconnected) environment. Clearcase does NOT come even close to delivering that. CVS does, but functionality-wise, BitKeeper blows them both away.
I haven't looked at SubVersion in a long time (before it was self hosting) and it looked promising, but IIRC it lacked some of the more advanced functionality that BitKeeper has.
Personally I'd much prefer using a completely free version. Not because I don't like to support the BitKeeper team (I'll buy the product if I use it commercially!), but because of the open logging function.
It just comes down to the fact that I like my privacy...
-oswa
Shome mishtake? (Score:4, Informative)
Shurely, the Tigris project subversion (http://subversion.tigris.org/)??
Simple answer. (Score:3, Insightful)
You said it, BitKeeper. It's there, it's very good, don't people have anything better to do than nagging about other people just charging for their own work?
If you want to give away your work, please do (I'm happy to use it), but you are not BitMover's (the company's) mom and have no business telling them what to do.
Re:Simple answer. (Score:4, Funny)
I nominate:
'ByteLoser'
Who wants to slap up the SourceFarce page and start working on the icon?
We use Perforce at work (Score:5, Interesting)
Re:We use Perforce at work (Score:5, Informative)
And in case you don't like their forthcoming Linux GUI (I hadn't heard about that before, thanks WPIDalamar) they do provide you with an API so you can make one of your own (KPerforce ^_^), which shouldn't take that long really.
The pricing seems very high for an individual, but it is really cheap for this kind of software (for companies), and you can use it without a license, though then with a maximum of two client specifications. They also have good support (something that is not common, unfortunately).
http://www.perforce.com -- go there and check it out. If you hate paying and want to make your own set of tools you can learn a lot from Perforce.
And I agree, SourceSafe is icky, and so are CVS and SourceOffSite. I haven't had a reason to try out BitKeeper, so unfortunately I don't know how it stacks up compared to Perforce.
Re:We use Perforce at work (Score:5, Informative)
The choice is clear (Score:5, Funny)
More open-source revision control systems (Score:5, Informative)
OpenCM [opencm.org]
arch [regexps.com]
Stellation [ibm.com]
PRCS [sourceforge.net]
Re:More open-source revision control systems (Score:4, Informative)
Re:More open-source revision control systems (Score:4, Informative)
There's also Vesta [vestasys.org], which has some pretty cool features [vestasys.org]
one word: patchsets (Score:3, Interesting)
add that to cvs and make it actually work correctly and it would be pretty good.
at least that's what I miss the most when using cvs - the ability to change several files and commit them at once, and when I do an update on a file it should figure out all the dependencies on all other files and update those as well. how should it figure this out? simple - all the files that were committed at one time should also be updated together, because it is bloody likely that they depend on each other.
of course this process should be repeated on all files that are a part of the patchset so that after updating a single file to a new version all the other files are compatible with it.
and yes, I know this could theoretically be done with tagging, but then I would have to tag all files every time I commit, and it still does not handle the case when one file of the patchset depends on some other patchset.
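The update rule described above (files committed together get updated together, and the process repeats for every file pulled in) amounts to a transitive closure over patchsets. Here is a minimal sketch in Python. The `patchsets` list is a made-up in-memory structure for illustration; CVS itself records nothing like it, which is the whole complaint.

```python
from collections import deque

def files_to_update(start_file, patchsets):
    """Return every file that should be updated along with start_file.

    patchsets is a hypothetical list of sets; each set holds the names
    of the files that were committed together in one change.
    """
    # Index: file name -> patchsets that touched it.
    by_file = {}
    for patchset in patchsets:
        for name in patchset:
            by_file.setdefault(name, []).append(patchset)

    # Breadth-first walk: every file sharing a patchset with a file we
    # already plan to update gets pulled in too.
    seen = {start_file}
    queue = deque([start_file])
    while queue:
        current = queue.popleft()
        for patchset in by_file.get(current, []):
            for other in patchset:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen

# Illustrative data only.
example_patchsets = [
    {"parser.c", "parser.h"},
    {"parser.h", "lexer.c"},
    {"README"},
]
print(sorted(files_to_update("parser.c", example_patchsets)))
```

This sketch ignores revision numbers. A real tool would presumably follow only the patchsets newer than what the working copy already has, otherwise one update could drag in most of the tree.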
One thing must exist (Score:4, Interesting)
Need new languages (Score:5, Interesting)
Most people will probably hate this, but suppose, for instance, that a comment for a specific line/block of code always had to appear in a specific area or in a syntactically consistent way. The version control system could then recognize when a piece of code changed but its comments did not, and ask whether the comments need to be updated as well. Or, if a function's parameters or return value have changed, it could ask whether all instances/uses of that function have also been changed, etc.
That is not to say that you cannot create a great system on top of existing languages, but perhaps making some minor tweaks to the language to make the language itself easier to manage/version may open up new tool possibilities.
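The comment check described above can be sketched in a few lines of Python. This assumes the hard part is already done: some language-aware parser (not shown, and hypothetical) has paired each named block with its comment and its code, for both the old and the new revision.

```python
def stale_comment_candidates(old_blocks, new_blocks):
    """List blocks whose code changed while their comment stayed the same.

    Both arguments are hypothetical mappings of
    block name -> (comment_text, code_text).
    """
    flagged = []
    for name, (new_comment, new_code) in new_blocks.items():
        if name not in old_blocks:
            continue  # brand-new block, nothing to compare against
        old_comment, old_code = old_blocks[name]
        if new_code != old_code and new_comment == old_comment:
            flagged.append(name)
    return sorted(flagged)

# Illustrative data only.
old = {
    "area": ("# width times height", "return w * h"),
    "name": ("# user's display name", "return user.name"),
}
new = {
    "area": ("# width times height", "return w * h / 2"),
    "name": ("# user's display name", "return user.name"),
}
print(stale_comment_candidates(old, new))
```

A tool would only ask the question ("does this comment still hold?") for the flagged names. It cannot know the answer.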
Version control system minimum requirements (Score:5, Interesting)
1. atomic commits - your change happens only if all the
files can be processed. This prevents a corrupted workspace
when CVS processes half your files in a commit and then exits
on an error throwing the other half of your files on the floor.
2. change list management - all commits have a unique
reference number. CVS processes files by directory instead of
by workspace, so it is impossible to tell which files are
associated with a commit.
3. access control by workspace or workspace directory - the
ability to give certain users or groups access to certain
workspaces or directories. Ideally, access control can be
done by bug ID.
4. graphical resolve of conflicts - a graphical three-way
diff is the only way to resolve complex conflicts
5. The ability to move files and directories and maintain
file history and label integrity from the client. CVS
requires the whole workspace to be locked so that moves can
be performed on the server side and does not maintain label
integrity.
6. web viewer and graphical difference viewer - the ability
to browse via the web change set lists to see what files
changed and what the actual differences were.
7. the ability to integrate workspaces across projects - the
ability to arbitrarily merge/integrate any source code from
any project to any other project.
8. powerful labeling features (parallel development and
prior version support).
9. rollback or undo multiple changes - this is a great way to
recover from a developer commit disaster.
10. multi platform support - must run on all platforms.
11. command line and graphical interface. Command line for
scripts and graphical interface for those who can't work
without it.
12. push and pull notifications - built in support for e-mail
and news group notification of changes in the workspace.
Your humble build servant
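Items 1 and 2 in the list above (atomic commits, and one reference number per commit) fit together naturally. Below is a rough Python sketch under some loud assumptions: the repository is just a local directory, file names are flat (no subdirectories), and the `HEAD` file and `rev-N` directory names are invented for illustration. No existing tool is claimed to work this way.

```python
import os
import shutil
import tempfile

def atomic_commit(repo_dir, files):
    """Store all files as one new revision, or store nothing at all.

    files maps a flat file name to its new contents (a str).
    Returns the new revision number, which identifies the whole change.
    """
    os.makedirs(repo_dir, exist_ok=True)
    head_path = os.path.join(repo_dir, "HEAD")
    current = 0
    if os.path.exists(head_path):
        with open(head_path) as fh:
            current = int(fh.read().strip())
    new_rev = current + 1
    rev_dir = os.path.join(repo_dir, "rev-%d" % new_rev)

    # Stage everything in a scratch directory first.
    staging = tempfile.mkdtemp(dir=repo_dir, prefix="staging-")
    try:
        for name, contents in files.items():
            with open(os.path.join(staging, name), "w") as fh:
                fh.write(contents)
        # Clear leftovers from a commit that died before updating HEAD.
        shutil.rmtree(rev_dir, ignore_errors=True)
        os.rename(staging, rev_dir)
    except Exception:
        # Any failure: throw the staging area away, repository untouched.
        shutil.rmtree(staging, ignore_errors=True)
        raise

    # Commit point: readers only trust revisions that HEAD names.
    tmp_head = head_path + ".tmp"
    with open(tmp_head, "w") as fh:
        fh.write(str(new_rev))
    os.replace(tmp_head, head_path)
    return new_rev

# Illustrative use in a throwaway directory.
demo_repo = tempfile.mkdtemp()
print(atomic_commit(demo_repo, {"foo.c": "int foo;", "bar.c": "int bar;"}))
```

The design choice is that nothing counts as committed until the single `HEAD` replacement happens, so a failure halfway through the file list never leaves half a change visible. Concurrent committers would still need a lock, which this sketch leaves out.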
Re:Version control system minimum requirements (Score:5, Interesting)
13. Version control on a sub-file granularity.
While I agree that this is a difficult problem, a typical use case is the "split a file" problem, which is supported by none of the available VC systems.
Most renames of files I have seen in large projects are not simple renames, but splits, where a file's code is moved to separate files due to a refactoring. Only one of those files can be associated with the old file using a rename-aware version control system. The revision histories of the functions in the other files are lost.
I don't have experience with implementing version control, so I don't know how solvable this is, but I can dream, no?
C-C
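To make the "split a file" case concrete, here is a toy Python sketch of history kept per function rather than per file. It assumes a lot: a hypothetical parser has already broken each revision into named functions, and function names are unique across the project. Extracting functions reliably is probably the hard part, and this does not attempt it.

```python
def function_history(snapshots, func_name):
    """Trace one function across revisions, even when it changes files.

    snapshots is a hypothetical list (oldest first) of mappings:
    file name -> {function name: source text}.
    Returns a list of (revision, file name, source text) tuples.
    """
    history = []
    for revision, snapshot in enumerate(snapshots, start=1):
        for file_name, functions in snapshot.items():
            if func_name in functions:
                history.append((revision, file_name, functions[func_name]))
    return history

# Illustrative data: revision 2 splits util.c into two files.
example_snapshots = [
    {"util.c": {"parse": "void parse() {}", "render": "void render() {}"}},
    {"parse.c": {"parse": "void parse() {}"},
     "render.c": {"render": "void render() { draw(); }"}},
]
for entry in function_history(example_snapshots, "render"):
    print(entry)
```

With file-level history, only one of `parse.c` or `render.c` could be recorded as the successor of `util.c`. Keyed by function, both keep their past.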
Re:Version control system minimum requirements (Score:5, Funny)
some thoughts (Score:3, Insightful)
Generally speaking, stuff like commit emails needs the addition of specific wrappers (see http://cvsreport.sourceforge.net for instance), and CVS doesn't scale well to big projects.
It's quite usable
Yes, it is time for a new tool... (Score:5, Informative)
And if that doesn't convince you, well, it's not for nothing that some of the primary developers of CVS are now working on subversion.
Now, of the new crop of tools, the only one I've played with extensively is subversion---but I am absolutely blown away by how well it seems to make common operations simple. Even with its documentation in a very rough state, and despite its many architectural differences from CVS (with which I have several years of experience), I was able to figure out how to maintain a vendor branch and local modifications, perform updates on both, merge them, tag releases, etc., very quickly and easily. Its developers have obviously looked at CVS to see what things it does not do well that people do frequently, and designed accordingly.
Is subversion for you? Who knows. But if you use CVS a lot---especially if you find yourself cursing CVS a lot---you should do yourself a favor and look at some of the alternatives. A lot of lessons have been learned, and you should avail yourself of the benefits.
I think the question is wrong (Score:4, Insightful)
Now, what would an ideal system be? I don't think one size fits all. You need very quick local net access (bye bye CC), and you need to handle infrequently, loosely connected internet developers. But not at the same time. So I don't think there is one unique response to your question.
From my perspective: Need Windows Support (Score:4, Insightful)
Key Feature: directory awareness (Score:5, Informative)
- fake out CVS by doing a remove/add pair on every file you want to move (which means you lose the revision history of each such file!), or
- manually move files around in the repository (which entirely defeats the purpose of using a revision control system in the first place!)
If anyone out there creates a successor to CVS, please fix this fatal flaw!
There's a big difference between (Score:4, Interesting)
- The file you want versioned.
- The archive that holds it.
- The workfile you extracted from an archive.
- The shadow file automatically extracted from an archive.
- A directory.
- A project, which is not always 1:1 with a directory.
- A view, which is not a subset of files or directories.
For instance, I may have the file archive.c,v which I check out as myfile.c, which is shadowed as mainfile.c, which exists in multiple projects, inside different subdirectories, exposed whenever I have a view of a particular time on a particular branch for a given subset of a module.
Every time a version control system tries to combine these things, you run into problems. Take the GUI version of PVCS, which called a collection of files (from different directories) a Project -- which ended up enforcing that all filenames had to be unique, even if they were in different directories. And what they call Views is actually a subset of the list of available filenames.
Ever get the idea developers are so into archiving versions of a file that they gloss over the fact the file organization itself is a structure that also needs preserving?
Look to ClearCase for some pointers (Score:5, Informative)
However ClearCase has some -very- good features, and here is what I would arrive at (ideally):
1) Make your repository a mountable file system, supporting multiple types of connection, NFS, SMB, Active Directory, FTP, etc.. When connecting you must specify a profile to be used.
2) Make every user have a number of profiles (Min:1) (like ClearCase views), these profiles contain -all- the info needed to access file versions correctly. They should allow sharing ('base my profile X on the profile Y created by user Z'). And support concepts such as labelling, conditional branching, etc..
3) All profiles are managed from a central server (redundancy?) via a web interface (to achieve cross-platform conformity) and command-line interface (SSH based) for scripting/power-users.
I could go on forever, but I think the three above points are the things that matter most to me. Obviously you also need security, administration, storage, etc.. but I think that making files available simultaneously via many common file sharing protocols would produce the greatest benefit.
Finally: MAKE DIRECTORIES VERSIONABLE/BRANCHABLE! Yes, it causes some potential headaches, but its benefits easily outweigh them.
10 problems with CVS (Score:5, Interesting)
2. Updates don't always work as expected. They won't grab new directories and a few other quirky things.
3. Empty directories should be pruned by default in a checkout or update.
4. I'm tired of seeing a CVS directory everywhere I look. How about
5. Access control is poorly handled. It's good that you can map virtual user names, but it would also be useful to control access by groups.
6. Local CVS tree file ownership is by user, not the CVS owner. This opens up all manner of problems for users with a local CVS repository. Repository data should be in a non-user account, checkout should force authentication, and the server should handle who has access to what. This would not be tremendously hard to manage, since in the general case a user has access to a project or not. Fine-grained access control of the repository isn't a common necessity.
7. Plays badly with (most) IDEs. When I want to work on a project in an IDE, it floods my checked out directories with all manner of crap I don't want in the repository. You can set up refuse files to clean these out, but it might break your IDE project. This is more a fault of IDEs than CVS, really.
8. Needs smarter add functionality. I don't like writing stuff like 'find
9. CVS is a boring acronym.
10. I can't think of a tenth thing.
What Exactly IS Wrong With CVS? (Score:5, Interesting)
Re:What Exactly IS Wrong With CVS? (Score:5, Interesting)
But a couple of day-to-day common tasks are painful (or just plain impossible).
Personally, sharing source files across multiple projects is a real pain. We do it with soft links in the repository (gag) so it can be done, but it's ugly.
Let's say you want to reorganize your directory structure without screwing up your history. Well, that's hard to do with CVS, so instead we'll just let the organization continue to be cluttered and confusing.
Heck, let's say you just want to rename a file, let alone a directory:
cp foo.c bar.c
cvs add bar.c
cvs remove -f foo.c
cvs ci -m "renamed foo.c to bar.c"
It just gets really annoying, and now bar.c can't be reverted version-wise unless you KNOW that its previous contents were in foo.c. It's a manual, error-prone, and tedious process if you ever need to do that.
I've been running a subversion server for months now just to test out. I can't wait to move to it. I like being able to say:
svn mv foo.c bar.c
svn ci -m "renamed foo.c bar.c"
and keep my history intact.
In fact, writing this makes me want to just start migrating stuff by hand today! Subversion's important bugs (it is still alpha I think, it's slashdotted so I can't check the status as of right now) are almost all in features that CVS doesn't have anyway.
That said, I haven't really tried any of the other open source projects such as arch which have similar features. The main draw of subversion for me is the fact that I had to learn almost nothing to use it. As an experienced CVS user, subversion is trivial to learn. The effort they have put in to keeping things the same as long as there is no good reason to do otherwise is well-spent (at least from my point of view).
Plus, the subversion code is super readable and well-commented--honestly the best source I've seen.
Why "The Open Source Community?" (Score:3, Interesting)
Why only ask about the open source community. Do programmers need new configuration management tools?
CVS works fine for me. BitKeeper seems nice too. What I hate is that there's so much controversy just because BitKeeper isn't open source.
I once started coding a version control system... (Score:3, Funny)
I've never seen my code since...
I would love this: (Score:3, Interesting)
Of course a strict bug submission policy would be required to make this possible, but surely something like this could be done?
An added benefit would be clearer bug submissions which would help development to no end...
Yet Another useless discussion about CVS. (Score:5, Informative)
Yes, King.
I would not hesitate to say that it has its share of difficulties, but there is no way anything is going to replace it anytime soon. There are many meta-features of CVS that make it irreplaceable:
1. Multi-platform: I don't mean 3 or 4 or even 5 or 6, bla bla bla. I mean EVERYWHERE. I've seen CVS in more places than anything besides emacs and gcc. And really, anyplace gcc or emacs goes, cvs is the third guy there.
2. Massive Acceptance: CVS is everywhere. 10 million people use it with sourceforge. Another few million elsewhere. It is the common thread that binds us together (kinda like the force!)
3. Massive, Massive Tool support: This is my favorite. You can use it about a hundred different ways. Not 1 gui, but 50!. It goes into command line apps like great!. Show me another tool that has integration with the windows explorer (via TortoiseCVS) like it has. You Can't. (Don't even try that god-awful Bitkeeper's integration:yuk!)
4. Simplicity: It's REALLY simple to use. It's not that complicated. If CVS throws you for a loop, maybe software development really isn't where you should be working. The incompetence among developers is what makes all software look bad.
5. Protocols: You can run CVS thru SSH, RSH, PServer, File Access, and more... It fits into every environment. It works across any damn network. It can jump tall buildings in a single bound!
Really, until someone makes something that trounces CVS in all those areas, AND provides features that "I can't live without" CVS will Rule.
When I first read the story title... (Score:3, Funny)
Can you hear me now? Good...
What about Aegis? (Score:4, Informative)
Every time the issue of version control and source code management comes up here, I've never seen anyone mention Aegis [sourceforge.net], which appears to have been designed to address the missing functionality in tools such as CVS which focus solely or mostly on simply maintaining multiple versions of a source base concurrently. Here's an excerpt from the CVS comparison in the CVS Transition Guide [sourceforge.net]:
1.5.1. Why should I change from CVS to Aegis?
The software seems to be pretty mature (currently at version 4.5, first released in 1991). Has anyone here used it?
One word (Score:4, Informative)
- Scratch pad versions. Ever needed to play around
with a piece of code (put in debug statements,
or change part of it temporarily to help debug
something) but didn't want to check it out
and have the threat of making the changes
accidentally permanent? Envy had the ability
to make a "scratch" version of a file - letting
you edit it, but not worry about accidental
check ins, or forgetting that you had made a
file writeable.
- Version/Releases. Not only could you label
a specific state of an application a "version"
but you could also label a version of an
app a "release". This allowed some subtle
distinctions between "ok here's a workable
version we can get back to (demo)" and "here's
the real, outgoing released version".
- Manager. Code could be given specific people
that were the manager, or "owner" of a piece
of code. If you wanted to enter your changes
into the code base in general, you had to get
the owner to do it. This control could be
anywhere from every check-in, to version or
releases. An owner could give permissions to
other people as well.
- Multiple checkouts. Envy recognized that
sometimes people have to work on the same
file, as much as it's best prevented. So,
it allowed multiple check-outs, with facilities
to integrate the files back together on
check-in.
It was quite complex, but looking back at it I now understand why many of the facilities were there, and I'd die to have them for my team. We're using SourceSafe (blech), and it works OK, but something like Envy would be great.
Get rid of the file system completely - simplify! (Score:5, Interesting)
Let's get rid of the file system/directory structure schema and go with a completely revamped code storage method.
This has a ton of implications, but one thing that everyone seems to ask for that is difficult to solve on the old model is easy to work with if you remove the files and directories - sub-file VC. Being able to move modules from file to file, split files, move directories, etc.
The files and directories are there to help us understand the structure of the project, they were not meant to dictate the structure to us. We've locked ourselves into them so much so that we can't restructure the project without losing a lot of the benefits of VC.
Let's stuff our code into a database (which is like a more powerful file system, if you can't get your head around the idea). Atomic updates can be built in. Symlinks are simple. Shifting a piece of code to another 'file' is simple and the VC is not lost.
I can't be the first person to have thought of this - why hasn't it been done? Possible cons are:
Until the compilers and IDEs understand the new schema (regarding header files, includes, etc) the VC will also have to provide scripts to combine portions of code into files that the compilers can use.
How do we store the data in the database? It would depend largely on the language. Would we put a function in a blob of a record, or maybe even do line-by-line records? In highly OO languages (Java) we could structure the database so there are class records that link to member records that link to variable and function records, etc.
Eventually the toolchains will attach to the DB directly.
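For what it's worth, the database idea is easy to prototype. The sketch below uses Python's standard `sqlite3` module with an in-memory database. The table layout (`unit_versions`, with one row per version of a function) and all helper names are invented for this example. It picks the "function in a record" option from the questions above and makes no attempt at the class/member breakdown.

```python
import sqlite3

def open_store():
    """Create an empty in-memory code store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE unit_versions ("
        " unit_id TEXT, revision INTEGER, container TEXT, body TEXT,"
        " PRIMARY KEY (unit_id, revision))"
    )
    return db

def save(db, unit_id, container, body):
    """Record a new version of one code unit inside one transaction."""
    with db:  # commits on success, rolls back on error
        row = db.execute(
            "SELECT COALESCE(MAX(revision), 0) FROM unit_versions"
            " WHERE unit_id = ?", (unit_id,)).fetchone()
        revision = row[0] + 1
        db.execute("INSERT INTO unit_versions VALUES (?, ?, ?, ?)",
                   (unit_id, revision, container, body))
    return revision

def history(db, unit_id):
    """Return (revision, container) pairs for one unit, oldest first."""
    return db.execute(
        "SELECT revision, container FROM unit_versions"
        " WHERE unit_id = ? ORDER BY revision", (unit_id,)).fetchall()

def export_file(db, container):
    """Rebuild a compiler-friendly 'file' from the latest unit versions."""
    rows = db.execute(
        "SELECT v.body FROM unit_versions v"
        " WHERE v.revision = (SELECT MAX(revision) FROM unit_versions"
        "                     WHERE unit_id = v.unit_id)"
        " AND v.container = ? ORDER BY v.unit_id", (container,)).fetchall()
    return "\n".join(body for (body,) in rows)

# Illustrative use: 'render' moves from util.c to render.c.
store = open_store()
save(store, "parse", "util.c", "void parse(void) {}")
save(store, "render", "util.c", "void render(void) {}")
save(store, "render", "render.c", "void render(void) {}")
print(history(store, "render"))
print(export_file(store, "util.c"))
```

Moving a function to another 'file' is just a new row with a different `container`, so its history stays attached to it. `export_file` stands in for the scripts that would have to hand ordinary files to compilers until the toolchains catch up.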
Consider how this would aid huge and tiny projects alike.
I swear, the sooner we get rid of the file system (as is) the better - not just for this, but for all our information. But let's not get ahead of ourselves.
-Adam