Open Source Code Maintainability Analyzed 264
gManZboy writes "Four computer scientists have done a formal analysis of five Open Source software projects to determine how being "Open Source" contributes to or inhibits source code maintainability. While they admit further research is needed, they conclude that open source is no magic bullet on this particular issue, and argue that Open Source software development should strive for even greater code maintainability." From the article: "The disadvantages of OSS development include absence of complete documentation or technical support. Moreover, there is strong evidence that projects with clear and widely accepted specifications, such as operating systems and system applications, are well suited for the OSS development model. However, it is still questionable whether systems like ERP could be developed successfully as OSS projects. "
At least... (Score:5, Insightful)
Documentation? (Score:2, Insightful)
The disadvantages of OSS development include absence of complete documentation or technical support.
Yeah, it's nothing like closed source software, which always has complete documentation. I mean, look at Windows itself. All of that documentation about all of those API calls, lots of useful specifications about interoperating with the underlying kernel, plenty of specifications about the NTFS file system...
Oh, wait. It's all kept "secret". Nevermind.
Disadvantage? (Score:3, Insightful)
And this differs from commercial software, how?
I've spent 20 hours trying to figure out how undocumented or broken features behave in Rational's Enterprise Product Suite 2003. And that's expensive software.
I'll choose the software whose source code I can examine any day of the week. Granted, I'm a developer. But it's much worse to lack both documention and source code.
Re:Ah yes. (Score:3, Insightful)
No suprise, some projects are best suited for OSS (Score:4, Insightful)
This is why accounting software, office software and lots of general use applications "suck" in the OSS word. The "motivation" is not there, even "ego" is not a good enough motivation. My fellow hackers will give me more props for some lousy 500 line python hack which does something weird and not so useful than a complete accounting software suite.
What would be interesting is to see a group of companies start an OSS project from the ground up, pour their own money, pay programmers. But then again, there is no motivation for that! Big companies are only interested in jumping on OSS projects that happen to have gained fame...
ERP (Score:3, Insightful)
Yeah, it's also still questionable whether systems like ERP can be developed successfully at all. I'd like to see statistics on the number of ERP implementations that go horribly wrong and wind up crippling or even bankrupting companies.
Why not open-source ERP? (Score:3, Insightful)
Custom Enterprise Resource Planning software sometimes includes parts no boss would want the IRS or other authorities to know. With Open Source they become blatantly obvious. In this case Security Through Obscurity is the only safe model.
Sure a HONEST resource planning software can be open source. But it won't ever make the company as successful as one with some... extras.
Re:Was this really a surprise? (Score:5, Insightful)
In fact, I'd go so far as to extend this to software in general. Even when the comments can really matter, like API docs for libraries, the documentation sucks as often as not. I see no advantage to OSS here, but I don't see a disadvantage either.
Re:At least... (Score:3, Insightful)
But it's usually illegal. The copyrights to that program are still owned by somebody somewhere, collecting dust and mold.
maintainability index = bullshit (Score:2, Insightful)
If you look at the desription, you'll see that the equation was mainly "calibrated" based on a bunch of projects at HP. But fitting such an equation to a handful of self-selected projects doesn't give you any idea of how statistically valid it is.
Furthermore, the maintainability index contains measures that you would expect to go up as software systems become bigger; therefore, it isn't even a meaningful comparison of software systems of different size (or a single software system over time): maintaining a 1MLOC project is just a lot harder than maintaining a 100kloc project, but you may be doing equally well on both of them.
Particularly amusing is a term of 50 * sin (sqrt(2.4 * perCM)) in the maintainability index, where perCM is the average percentage of comment lines per module. As perCM ranges from 0 to 100, the argument for the sin(.) function will range from 0 to 15 (ponder for yourself what that means about how much you should comment).
Until someone produces sound maintainability data with hundreds of software projects, the use of MI is just bullshit.
Re:You don't want to know what goes into sausage (Score:1, Insightful)
Re:At least... (Score:5, Insightful)
Without appropriate documentations, you end up doing what has been done all over again, studying the software to understand how it works, which can be taxing. Go look at somewhat complex OSS projects, try hacking gcc to spit out a different binary format without reading any documentation. Try understanding postgres without documentation. GUI applications like a CAD system are even harder to make sense out of. If you are actually talented enough, the sheer effort you will poor into understanding the system, you might as well spend it designing from the ground up.
Most people are not hackers, If they were, why would they need source code? crackers don't need source code to add functionality to any system, it's a matter of patching the object code, having a section of the code jump to your own code and return. But it's ugly, having source code makes it a little bit prettier but not much.
Documentation is the key!
Re:Was this really a surprise? (Score:5, Insightful)
Here's a nice experiment for you:
1. Select a random project, preferably one that's slightly buggy during ordinary use.
2. Subscribe to project's mailing list.
3. Politely inquire if the project has any kind of automated test suite.
4. Observe stumped reaction.
5. Kindly explain the absolute necessity of such a system in any non-trivial app.
6. Go down in flames.
That attitude needs to change.
Re:At least... (Score:4, Insightful)
Re:Bleh (Score:3, Insightful)
The difference, though, is that commercial products can't exist (or at least by all economic rights SHOULDN'T exist) without a userbase. SourceForge is littered with stuff that's so bad it's completely unusable. You can't get away with that with a commercial product, although that doesn't necessarily mean the project is MAINTAINABLE ;-)
And I didn't think you were blaming OSS, just picking up your thread and running with it.
Re:Results would be fairer (Score:4, Insightful)
Anyone who has used perl knows that if you need anything beyond POD, you're not ready for perl.
Perl is "unclean" for a reason - it is there just to get the job done quickly, not necessarily cleanly.
Just try and start documenting perl - since there is more than one way to do things, you'll end up giving in to the urge to change code as you document it - so the documentation never gets written, and the code never gets finished. It's a fool's errand, a sisyphian task, the modern equivalent of a "bucket of steam". Sort of like the distraction of commenting while meta-modding.
Teach a man perl - he writes code to do the task in a few minutes. Ask him to document perl - he writes code forever - and still leaves the job unfinished.
Some tasks, and some languages, just weren't made for documentation. [tt]
Re:Design docs (Score:3, Insightful)
Not knocking inline documentation - I think it is a great idea, but you have to make sure that developers buy into it.
Really there is a lot of common sense that can go into coding standards to help reduce recurring bugs in "problem functions". Rules for initializing and using globals, rules for maximum method length, code ownership, and small group code walkthroughs can do a lot to prevent the kind of problems you mention.
Corporate OSS is an Ad-hoc Corporate Alliance (Score:4, Insightful)
The problem with a software company filling this role is that their system is proprietary and unmodifiable by the client. Most companies *do* have the resources to hire a programmer or a contractor to add a feature to a piece of OSS.
Anyone have any ideas on how to prevent abuse of such a system? That is, too many people using the system and not enough people contributing?
Re:At least... (Score:5, Insightful)
Sort of ... kind of ...
There comes a point where, particularly without design documentation, the bar is raised so high that the effort involved in maintaining something is more than that involved in moving to a new product. There's a scaling problem here. What works with small, simple direct programs doesn't work with large, complex or indirect programs.
And some OSS code is simply completely undocumented, not even a comment -- apart from the licence. Something I discovered wandering through the XFree86 XKB code.
See http://firstmonday.org/issues/issue4_12/bezroukov/ index.html [firstmonday.org]
for a discussion some of the weaknesses of the open source model when it comes to program comprehension.
Re:Extremely Ridiculous Publishing (Score:5, Insightful)
It used to mean the combination of MRP ("Material Requirements Planning") + Accounting. Then along came PeopleSoft and kinda changed it to HR + Accounting. Then along came Siebel and everyone scurried to make it MRP + HR + accounting + CRM (not quite there yet, though). Then they noticed Kronos and they all scurried to make it MRP + HR + Accounting + CRM + Time & Attendance. And failed, because Time & Attendance is a big pain in the butt. Heh. So they partnered with Kronos instead.
The march of "embrace and extend" continues. Next app up: Expense Reporting (say bye-bye to Concur, etc., that's an easy app). Already on deck: data warehousing (say bye-bye to Cognos, Business Objects, etc., say hello to SAP BW). Soon to come: business process automation (say bye-bye to Ariba, etc.)
And so on, if you believe the pundits.
"ERP" has become a meaningless acronym, an umbrella under which every business app known to man is rammed into the same stinking pile of multi-million dollar shit. At some point it will probably implode from its own weight, and we'll go right back to the "best of breed" interoperable software model.
But it will be a while yet. I suspect in the meantime there will be some Open Source alternatives. I sure hope so.
Re:GUI (Score:5, Insightful)
You're actually trying to claim that Winamp's design is good?
Winamp and other players which try to emulate the look and feel of a "new wave" stereo do nothing but piss me off. Stereo systems have the bad interfaces they do because of an inherent lack of physical space; something that's still a concern with computers, but much less of one.
Here's to more programs like Rhythmbox and iTunes which have the *important* controls accessible, allow for easy categorisation of songs, and use screen space nicely. All that without having to resort to 6pt fonts.
Re:Was this really a surprise? (Score:5, Insightful)
Sin(Sqrt(comments_in_percent)) ??? (Score:3, Insightful)
They take sin(sqrt(mumble_percent)).
Now, I'm all for emperical data, but that is just bistromatics and totally insane.
They don't even say if the argument to the sine function is in degrees or radians and one is left to wonder if they even know themselves...
I have no doubt that if you take a piece of code and does a before&after check after some major rewriting it may tell you something.
But comparing two different pieces of code with this formula is just plain bogus.
Poul-Henning
Quality is still a happy user (Score:3, Insightful)
the works well and hopefully doesn't need a lot of documentation to make it work well. Great software
tends to teach the user how to make it perofmr or at least motivate the user to want ot invest the time to master the software for a particular use.
These guys need to understand that this approach to quality applies to all software, irrespective of
development model behind it. A software product with a lot of customers creates the momentum to maintain and enhance that product. An OSS product can be infused with similar energy due to acceptance by a large community of users (esp if many are programmer's too). The feedback from the users incents the programmers to maintain and enhance the product.
New models can be built from hybrids of OSS (donated programming in the commons) and products
that one must buy. If there emerges an ERP OSS app then there will be a business opportunity to document/train, support/fix/enhance/customize that application... and Oracle will feel the same frustration competing with that model that MS does competing with Linux.
These complaints against OSS as a model (no obtion to buy support or docs) are a business opportunity
that has been put into play by JBOSS, MySQL, and soon to be hundreds of others. The low barrier to entry is the key to high usage... It's try and don't buy (unless you'd like some training, customization, focus product enhancements, etc).
Volume, usage and effectiveness drives the software world. Quality just makes the ride more comfortable. And OSS gets more comfortable everytime the train puls through the station.
Re:This assumes commercial software is any better (Score:4, Insightful)
> dead documents that are outdated as soon
> as the first line of code is written.
So true. Or only one page will be kept up to date - the database schema diagram, because it can be automatically generated from the production database schema.
Meanwhile, new hires are referred to these documents with mumbles of "this is the design documentation, read this and you'll know everything". This statement is usually accompanied with a cynical smile and a shrug, indicating to the new hire the uselessness of the ritual. Ack.
Re:Was this really a surprise? (Score:4, Insightful)
If it needs more comments than code, it's a sign its overly-complicated and you need to rethink what you're doing and how you're doing it. In other words, your algorithm sux the bag.
If you can't write test cases for it because it's too tightly coupled to the rest of your code, you probably misunderstood the problem in the first place (or at least you're approaching it from the wrong direction)...
All the comments and documentation in the world won't make spaghetti understandable or maintainable.
Re:maintainability index = bullshit (Score:5, Insightful)
It's only amusing to people who don't bother to think about why it's there. It's actually a very insightful part of the metric.
First of all, perCM ranges from 0 to 1, not 0 to 100. Yes, that isn't explicitly stated, but it would be ridiculous otherwise. Second, try looking at the damn graph.
As I told somebody else, do it now. Don't pretend to do it, GRAPH the damn thing and look at it: sin(sqrt(2.4*x)) for x=0..1.
That graph makes it completely transparent what they're trying to accomplish with that part of the formula. First off, if comments are 0, the value is 0. Having no comments does not positively impact maintainability! Second, the function PEAKS at around 0.43. This represents an avgCM of 0.03, or 3%. Then, the function begins to go down again, but not as drastically as it rose.
What this is saying is that the benefit of comments has a maximum at around 3%. Having more comments than this tends to DECREASE the maintainability (and this is borne out by experience). However, having too many comments is better than having too few comments, so the function is skewed to the left side by the sqrt() function.
You see, every part of that expression makes total sense if you spend more than 2 nanoseconds thinking about it. Sheesh.
How many projects have died for maintainability? (Score:4, Insightful)
We can't use JSP's, there hard to maintain!
We can't use Javascript, it's loosely typed!
We have to use an Object Broker, SQL is not maintainable!
All the projects that I have been on where code maintainability has been the primary goal have one thing in common. They all failed.
If you spend all of your time worrying about how the code looks, you will never finish the project. Talk to people who have built successful software. (The ones that sold millions of copies.) Very few of them are proud of the code the wrote, but they are happy with the product.
The focus should always be on product quality, not code quality.
Re:Language and coding style (Score:3, Insightful)
I don't follow. How is that any more or less clear than:
object->doThis(that);
object->forThis(that);
Are you trying to say that the former is better because it looks more like English? Weird argument to make, considering the majority of the world's population doesn't speak English as their first language, so it's irrelevant that it happens to look like English.
Any language (except maybe assembler) can be written in a readable way. You just have to learn the idioms.
A Plead For Generic Code (Score:2, Insightful)
That also makes generic code even more important for OS. We have seen huge progress in the last years and most of the lower level stuff has appeared or become more mature.
Unfortunately though, this is also what I still found to be the most lacking in higher level libraries. Most projects go like this:
"We need a really cool search front-end for Google. Let's see whether there's a cool library for this. Nothing yet? Ok, then let's just start from scratch."
And then a library is implemented, with an API for searching using Google. It is perfectly usable for exactly what the developer wanted.
Now this seems reasonable at first, but think about this: In OS development, we are not bound to time lines. Why should we chose a half-hearted attack on a problem that will surely hit others after us again? Instead, the developer could also ask, "if I create a generic library, how generic can I make it? Do we have to limit it's capabilities to requesting stuff from Google? Or maybe I could even create a library that allows to query any search engine? Or maybe, provide an API the lets you query anything, like search engines, your local hard disk, p2p networks and a local database."
In commercial software development, this would be completely unreasonable overkill. In OS however, it's a great way to collaborate. And once implemented, the foundation for a lot of applications has been lied. And it's also fun - writing libraries is fun. It is a great component model either. And it is also a pleasure to see when the work is done, because once the foundation was set writing the actual application is as easy as plugging components together.
There are also good examples, of course. Gstreamer is one. But there is still so much potential. I would like to see this kind of thinking more and more.
Re:Was this really a surprise? (Score:2, Insightful)
I would be pretty certain that if you were maintaining code then you would have access to it...
*sigh*
Re:Results would be fairer (Score:2, Insightful)
Is there any other language that has an equal amount of code equally well documented? CPAN has the best documentation of reusable code in comparison with other languages. Sure, there's always a bad example, but seriously, saying that documenting Perl is hard is certainly far from the truth. -Samuel
cash for quality, cheap for free (Score:2, Insightful)
But ya, I know what you mean. I'm not a coder, just a consumer, but I would be more than willing to pay good cash (not x -hundreds, but maybe x-couple/3 dozen dollars) for an OS (once a year, not 3-4 times a year scamola) that paid all the developers, was still open source, had fewer apps but all of them *worked* and if I had a question or problem still remaining, I could go post on their bulletin board and get an actual for-real honest informed reply, from a paid company person, that answered my question or resolved the problem or at least determined that it was in fact pretty close to impossible to be resolved. Tell me the truth in other words, I can live with it. I'm getting sorta tired of "yep, we gots us a real community volunteer effort and it's gonna be the bestus and it almost works, just you wait until the next release!!11!1" stuff.
There's an open source business model that might work to sell a real joe consumer desktop and still pay the coders. I'll pay for 20 really well written apps with extensive user accessible written in hoo-mann speech docs, said apps that work and serve basic normal computing functionality for this "the masses" guy (moi and probably millions more), but I am not going to pay for two hundred or one thousand "almost works" volunteer apps on 6 cds and a dvd plus download even more!, etc , including the no credible help even if they have a so called wiki or board or mailing list or irc channel. Nope. I used to believe that but not any more. Not prudent any longer. This is 2005. We have entered the "better work right now right when it's installed" era, not the "coming soon" era, that was LAST century.
signed, joe consumer
Re:Was this really a surprise? (Score:3, Insightful)
If you are going to use your own experience as justification for your point, you should at the least state exactly what that experience is. It would be nice if you could also compile some statistics on the total number of open and closed source companies so that you can support your claim that your experience is representitive of each camp.
If you can't or won't do this, please do not presume to speak with such authority.
Some languages invite a more formal approach (Score:2, Insightful)
In Perl's case specifically, the language lends itself to quick scripting and shorthand. This is great for little tasks, but as everyone knows, it doesn't scale well. This isn't Perl's or Larry Wall's fault. It was designed to
However, I would like to point out that some languages actually enforce a clean specification. In particular, the functional languages almost literally *force* you to it. I am thinking here of the likes of SML, Haskell, OCaml, Clean, etc. This is not the place to discuss the fact that they also have a sane type system, where others have a broken one (e.g., C - see: http://perl.plover.com/yak/typing/). Language advocacy sucks, anyway ( http://www.perl.com/pub/a/2000/12/advocacy.html).
There's this story a guy posted once on comp.lang.functional about how once he wrote a Haskell program, handed it to the manager and he said: "great, you wrote a specification, now write the program." (!) The manager actually though what was executable Haskell code was just a "spec." Haskell fully suppports Knuth's literate programming approach.
If you ever tried writing something in those languages you know how they force you to write clean code. It is simply the easier way to break code in small functions. "Factoring", I believe it's called.
My point is that I feel the OSS community could greatly benefit from non-mainstream languages. These languages have seen nearly 2 decades of intense research. Arguments pertaining to performance just don't hold anymore vis-a-vis other mainstream languages like C#, Java, Python, Perl, etc. Clean has a video-game library to prove the point (http://www.cs.ru.nl/~clean/). OCaml, SML and Clean approach C performance (Ocaml coming second to C in some "language shootouts" for most benchmarks, for all its worth). Also, some of these have been used in large "real world" problems. OCaml has been deployed in the C code verification for correctness of the flight control software for the Airbus 340 airplane, for instance (http://www.astree.ens.fr/). It is widely known that Ericsson uses Erlang for their telephony switches (www.erlang.org). The CIL - Infrastructure for C Program Analysis and Transformation is written in OCaml and if more widely known, could prevent the weekly flow of buffer overflows developed in Berkeley and can be used *right now* in large OSS projects (http://manju.cs.berkeley.edu/cil/). They've tested it in the Linux kernel, for example (whether they sent patches or not, I don't know).
Of course, you have to have an open mind, and be willing to learn and throw some old tricks out and work using a different approach/mindset. Learning things thoroughly is always hard work, but the OSS shouldn't dismiss functional languages as "academic" - and for that matter, other serious approaches, like Squeak Smalltalk, for instance.
My 2 cents.
PS: I'm no expert at programming, just a beginner, but I offer my opinion here because I feel some people just haven't been introduced to some facts and haven't heard of some stuff. Not everyone has a big company to promote their language.
Re:Was this really a surprise? (Score:3, Insightful)
And if it not OSS, then either you have access to the code (in which case you can read it also), or you don't (in which case maintainability isn't even the issue, because there isn't any).
Whether the source is open or closed has no (direct) bearing on how good the source and documentation are, for those allowed to see the source and documentation.
Re:Was this really a surprise? (Score:2, Insightful)
I agree, though, that automated test suites are underused.
Automated test suites are vastly overrated. If all your subprograms have cleanly defined functions, cleanly defined I/O, and are small enough, it is very easy to run them through a few critical test cases.
Some people even gone so far as to substitute automated testing for clarity in design - the so called Test Driven Design. That seems like a recipe for disaster to me.