Open Source Code Maintainability Analyzed
Posted by
Zonk
on Tue Feb 15, 2005 04:30 PM
from the i-read-about-this-in-college dept.
gManZboy writes "Four computer scientists have done a formal analysis of five Open Source software projects to determine how being "Open Source" contributes to or inhibits source code maintainability. While they admit further research is needed, they conclude that open source is no magic bullet on this particular issue, and argue that Open Source software development should strive for even greater code maintainability." From the article: "The disadvantages of OSS development include absence of complete documentation or technical support. Moreover, there is strong evidence that projects with clear and widely accepted specifications, such as operating systems and system applications, are well suited for the OSS development model. However, it is still questionable whether systems like ERP could be developed successfully as OSS projects. "
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Results would be fairer (Score:3, Funny)
Re:Results would be fairer (Score:4, Insightful)
Anyone who has used perl knows that if you need anything beyond POD, you're not ready for perl.
Perl is "unclean" for a reason - it is there just to get the job done quickly, not necessarily cleanly.
Just try and start documenting perl - since there is more than one way to do things, you'll end up giving in to the urge to change code as you document it - so the documentation never gets written, and the code never gets finished. It's a fool's errand, a Sisyphean task, the modern equivalent of a "bucket of steam". Sort of like the distraction of commenting while meta-modding.
Teach a man perl - he writes code to do the task in a few minutes. Ask him to document perl - he writes code forever - and still leaves the job unfinished.
Some tasks, and some languages, just weren't made for documentation. [tt]
Re:Results would be fairer (Score:3, Interesting)
If it's simple, it doesn't need documentation (remember the bad old days when people insisted on commenting every line of assembler - akkk! :-)
Me, I prefer the "Mobster" approach for perl - no comments:
I tried to read it but... (Score:5, Funny)
Only one man... (Score:3, Funny)
...dared to challenge this article.
(insert rousing action-series music) Hercules!
bah! (Score:3, Funny)
Haven't had a problem yet....
Re:bah! (Score:4, Funny)
Nothing to be ashamed of, that's a pretty average size.
Was this really a surprise? (Score:5, Interesting)
Re:Was this really a surprise? (Score:5, Insightful)
In fact, I'd go so far as to extend this to software in general. Even when the comments can really matter, like API docs for libraries, the documentation sucks as often as not. I see no advantage to OSS here, but I don't see a disadvantage either.
Re:Was this really a surprise? (Score:5, Insightful)
Re:Was this really a surprise? (Score:5, Insightful)
Here's a nice experiment for you:
1. Select a random project, preferably one that's slightly buggy during ordinary use.
2. Subscribe to project's mailing list.
3. Politely inquire if the project has any kind of automated test suite.
4. Observe stumped reaction.
5. Kindly explain the absolute necessity of such a system in any non-trivial app.
6. Go down in flames.
That attitude needs to change.
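To make "automated test suite" concrete, here is a minimal sketch using Python's standard unittest module. The function under test, parse_score, is a hypothetical example invented for illustration (it reads the score out of a comment header such as the ones on this page); it does not come from any project discussed here.

```python
import unittest


def parse_score(header: str) -> int:
    """Hypothetical function under test: pull N out of '... (Score:N, Label)'."""
    marker = "(Score:"
    start = header.rindex(marker) + len(marker)  # raises ValueError if absent
    end = header.index(",", start)
    return int(header[start:end])


class ParseScoreTest(unittest.TestCase):
    def test_reads_score(self):
        self.assertEqual(parse_score("bah! (Score:3, Funny)"), 3)

    def test_rejects_header_without_score(self):
        with self.assertRaises(ValueError):
            parse_score("no score here")


def run_suite() -> unittest.TestResult:
    """Run every test in ParseScoreTest and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseScoreTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() after every change is the whole idea: a regression shows up as a failing test instead of as a bug report on the mailing list.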
Re:Was this really a surprise? (Score:5, Informative)
I agree, though, that automated test suites are underused. Also, not enough programmers (OSS or otherwise) seem to understand the importance of refactoring.
A message to coders: People, if your function is more than 10 lines long, you should start to consider splitting it. If it's more than 100 lines long, you're probably doing something wrong. If you have the same code written with slight modifications two or more different ways, you're probably doing something wrong. Use templates rather than repeating code if your language supports them. If you ever feel "this should probably be commented more", don't comment it - split it up into functions and let the functions be their own comments (if you have to, comment the functions as well). Use const as much as physically possible (in supporting languages). Use array objects that clean themselves up instead of arrays allocated on-the-fly whenever physically possible. If you find certain variables being used often together, group them into an object. If you find a set of functions operating on an object and only that object, make them member functions. Etc.
Just doing basic refactoring can make code far more organized and readable.
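A rough sketch of a few of those suggestions, written in Python as an assumption (the advice above is phrased with C++ in mind). Rect, area, scaled and total_area are hypothetical names chosen for illustration. Variables that travel together are grouped into one object, the functions that operate only on that object become methods, and frozen=True serves as the nearest Python analogue to "use const".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)  # frozen: instances cannot be modified after creation
class Rect:
    """Width and height always appear together, so they are grouped here."""
    width: float
    height: float

    def area(self) -> float:
        # Operates on a Rect and nothing else, so it is a member function.
        return self.width * self.height

    def scaled(self, factor: float) -> "Rect":
        # Returns a new Rect rather than mutating this one.
        return Rect(self.width * factor, self.height * factor)


def total_area(rects: Iterable[Rect]) -> float:
    """One short function whose name says what it does; no comment needed."""
    return sum(r.area() for r in rects)
```

For example, total_area([Rect(1, 1), Rect(2, 3)]) adds the two areas, and an attempt to assign to a Rect's width raises an error instead of silently changing shared state.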
Re:Was this really a surprise? (Score:4, Insightful)
If it needs more comments than code, it's a sign it's overly complicated and you need to rethink what you're doing and how you're doing it. In other words, your algorithm sux the bag.
If you can't write test cases for it because it's too tightly coupled to the rest of your code, you probably misunderstood the problem in the first place (or at least you're approaching it from the wrong direction)...
All the comments and documentation in the world won't make spaghetti understandable or maintainable.
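As a small illustration of the coupling point, here is a sketch in Python. The function greeting and its now_hour parameter are hypothetical names made up for this example. The function receives the thing it depends on (the current hour) as an argument instead of reaching out to the system clock itself, which is what makes a test case easy to write.

```python
from typing import Callable


def greeting(now_hour: Callable[[], int]) -> str:
    """Pick a greeting from the hour supplied by the caller.

    Because the clock is passed in, a test can hand over a fixed hour
    instead of depending on when the test happens to run.
    """
    hour = now_hour()
    return "Good morning" if hour < 12 else "Good afternoon"
```

Production code would pass something like lambda: datetime.datetime.now().hour, while a test passes lambda: 9 or lambda: 15 and checks the result.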
Extremely Ridiculous Publishing (Score:4, Interesting)
Berkeley Software Distribution (BSD)
are all defined in the article.
But not ERP.
Go figure.
Re:Extremely Ridiculous Publishing (Score:5, Insightful)
It used to mean the combination of MRP ("Material Requirements Planning") + Accounting. Then along came PeopleSoft and kinda changed it to HR + Accounting. Then along came Siebel and everyone scurried to make it MRP + HR + accounting + CRM (not quite there yet, though). Then they noticed Kronos and they all scurried to make it MRP + HR + Accounting + CRM + Time & Attendance. And failed, because Time & Attendance is a big pain in the butt. Heh. So they partnered with Kronos instead.
The march of "embrace and extend" continues. Next app up: Expense Reporting (say bye-bye to Concur, etc., that's an easy app). Already on deck: data warehousing (say bye-bye to Cognos, Business Objects, etc., say hello to SAP BW). Soon to come: business process automation (say bye-bye to Ariba, etc.)
And so on, if you believe the pundits.
"ERP" has become a meaningless acronym, an umbrella under which every business app known to man is rammed into the same stinking pile of multi-million dollar shit. At some point it will probably implode from its own weight, and we'll go right back to the "best of breed" interoperable software model.
But it will be a while yet. I suspect in the meantime there will be some Open Source alternatives. I sure hope so.
At least... (Score:5, Insightful)
Re:At least... (Score:3, Interesting)
Sure you can. It's easy to forget, but there are people who are fluent in assembly language and can figure out a defunct, proprietary piece of software if necessary.
I agree that it barely meets the definition of "maintainable," but it can be done with some effort. I've done it myself, while trying to find a problem in one of our distribution binaries -- the bug I w
Re:At least... (Score:3, Insightful)
But it's usually illegal. The copyrights to that program are still owned by somebody somewhere, collecting dust and mold.
Re:At least... (Score:4, Insightful)
Re:At least... (Score:5, Insightful)
Without appropriate documentation, you end up redoing what has already been done: studying the software to understand how it works, which can be taxing. Go look at somewhat complex OSS projects; try hacking gcc to spit out a different binary format without reading any documentation. Try understanding postgres without documentation. GUI applications like a CAD system are even harder to make sense of. Even if you are actually talented enough, the sheer effort you will pour into understanding the system might as well be spent designing from the ground up.
Most people are not hackers. If they were, why would they need source code? Crackers don't need source code to add functionality to any system; it's a matter of patching the object code, having a section of the code jump to your own code and return. But it's ugly. Having source code makes it a little bit prettier, but not much.
Documentation is the key!
Re:At least... (Score:5, Insightful)
Sort of ... kind of ...
There comes a point where, particularly without design documentation, the bar is raised so high that the effort involved in maintaining something is more than that involved in moving to a new product. There's a scaling problem here. What works with small, simple direct programs doesn't work with large, complex or indirect programs.
And some OSS code is simply completely undocumented, not even a comment -- apart from the licence. Something I discovered wandering through the XFree86 XKB code.
See http://firstmonday.org/issues/issue4_12/bezroukov/index.html [firstmonday.org]
for a discussion of some of the weaknesses of the open source model when it comes to program comprehension.
Bleh (Score:3, Interesting)
Re:Bleh (Score:4, Interesting)
Yeah, but don't blame it on OSS. This is simply another embodiment of the long-tailed distribution of human stupidity. In any human endeavor there are a large number of people who are Unskilled and Unaware of it [phule.net]. These people will try their hand at whatever catches their attention, and the results usually range from mediocre to terrible.
There's a lot of really bad fan fiction out there, too. And terrible amateur cartoons. And naive, uninformed political opinions.
What we witness on SourceForge is merely a demonstration of the inability of most people to accomplish anything of any importance. Nothing specific to OSS.
Re:Bleh (Score:3, Insightful)
The difference, though, is that commercial products can't exist (or at least by all economic rights SHOULDN'T exist) without a userbase. SourceForge is littered with stuff that's so bad it's completely unusable. You can't get away with that with a commercial product, although that doesn't necessarily mean the project is MAINTAINABLE ;-)
And I didn't think you were blaming OSS, just picking up your thread
Re:Language and coding style (Score:3, Insightful)
I don't follow. How is that any more or less clear than:
object->doThis(that);
object->forThis(that);
Are you trying to say that the former is better because it looks more like English? Weird argument to make, considering the majority of the
This assumes commercial software is any better (Score:5, Interesting)
Many of us have and are working in the "real world" out there, and I've been less than impressed with most documentation on large products.
Not to mention design documents, which end up being dead documents that are outdated as soon as the first line of code is written. To many corporations, there's no big incentive to spend so much money on these types of activities when you can have people just churning out code and finishing the darned product in the end.
I'm not saying commercial development is any worse, but I can't say it's any better for sure either.
Re:This assumes commercial software is any better (Score:4, Insightful)
> dead documents that are outdated as soon
> as the first line of code is written.
So true. Or only one page will be kept up to date - the database schema diagram, because it can be automatically generated from the production database schema.
Meanwhile, new hires are referred to these documents with mumbles of "this is the design documentation, read this and you'll know everything". This statement is usually accompanied with a cynical smile and a shrug, indicating to the new hire the uselessness of the ritual. Ack.
Mirror this article for closed source development. (Score:4, Interesting)
The real question is whether or not closed source projects are all that better off.
Disadvantage? (Score:3, Insightful)
And this differs from commercial software, how?
I've spent 20 hours trying to figure out how undocumented or broken features behave in Rational's Enterprise Product Suite 2003. And that's expensive software.
I'll choose the software whose source code I can examine any day of the week. Granted, I'm a developer. But it's much worse to lack both documentation and source code.
Experience may vary... (Score:3, Interesting)
No surprise, some projects are best suited for OSS (Score:4, Insightful)
This is why accounting software, office software and lots of general use applications "suck" in the OSS world. The "motivation" is not there, even "ego" is not a good enough motivation. My fellow hackers will give me more props for some lousy 500 line python hack which does something weird and not so useful than a complete accounting software suite.
What would be interesting is to see a group of companies start an OSS project from the ground up, pour their own money, pay programmers. But then again, there is no motivation for that! Big companies are only interested in jumping on OSS projects that happen to have gained fame...
Corporate OSS is an Ad-hoc Corporate Alliance (Score:4, Insightful)
The problem with a software company filling this role is that their system is proprietary and unmodifiable by the client. Most companies *do* have the resources to hire a programmer or a contractor to add a feature to a piece of OSS.
Anyone have any ideas on how to prevent abuse of such a system? That is, too many people using the system and not enough people contributing?
Re:Corporate OSS is an Ad-hoc Corporate Alliance (Score:3, Informative)
Not quite. Most enterprise software comes with source available, and pretty much all of it gets customized once you get into bigger customers. It's actually a real PITA when it comes time to do upgrades. And yes, I'm an architect at an ERP vendor.
You don't want to know what goes into sausage (Score:5, Informative)
I'm sure it's the same with ERP. It's just a huge polished turd, but because you don't have the source code you don't know it's a turd. You only see the polish.
ERP (Score:3, Insightful)
Yeah, it's also still questionable whether systems like ERP can be developed successfully at all. I'd like to see statistics on the number of ERP implementations that go horribly wrong and wind up crippling or even bankrupting companies.
Why not open-source ERP? (Score:3, Insightful)
Custom Enterprise Resource Planning software sometimes includes parts no boss would want the IRS or other authorities to know. With Open Source they become blatantly obvious. In this case Security Through Obscurity is the only safe model.
Sure, an HONEST resource planning software can be open source. But it won't ever make the company as successful as one with some... extras.
In theory... (Score:4, Interesting)
On the flip side, a closed source module could be built "top down" to a unified set of coding standards that would help maintainability. But it's not a requirement. I've seen plenty of code bases built just this way that were horrific... But still maintained and not changed because management was willing to throw enough money to keep things going (but not enough money to make it more interoperable).
YMMV.
Open Source ERP (Score:5, Interesting)
I could be mistaken, but isn't Compiere [compiere.org] an established OSS ERP implementation?
I think the question shouldn't be: 'Can software like ERP be developed as OSS?' But rather: 'Are there enough people in the OSS community interested enough to develop this kind of software without any form of financial support?' I think the answer has turned out to be 'no'. The same goes for things like (good) financial software, and anything that would require heaps of work, high precision and coordination, but no spectacular result for the common man to brag about.
Sin(Sqrt(comments_in_percent)) ??? (Score:3, Insightful)
They take sin(sqrt(mumble_percent)).
Now, I'm all for empirical data, but that is just bistromatics and totally insane.
They don't even say if the argument to the sine function is in degrees or radians and one is left to wonder if they even know themselves...
I have no doubt that if you take a piece of code and do a before-and-after check after some major rewriting, it may tell you something.
But comparing two different pieces of code with this formula is just plain bogus.
Poul-Henning
Re:Sin(Sqrt(comments_in_percent)) ??? (Score:4, Informative)
Now, I'm all for empirical data, but that is just bistromatics and totally insane.
Metrics are already "black magic." This one is no worse or better than any other dimensionless metric I've seen.
Obviously the input is in radians. The argument to a trig function is always assumed to be radians unless otherwise specified. Now, the sqrt(mumble percent) can only range from 0 to 1, so what we're looking at here is the graph of the sin function from 0 to 2.4 radians.
Do it now. Graph it. Graph the function sin(sqrt(2.4*x)) from x=0..1
Notice that this function (you might call it a transfer function) ramps up and peaks at 0.43 radians. That corresponds to a comment percentage of 3%. Then it begins to go down again. What does this mean? It means that there is a point beyond which more comments are not useful. If more than 3% of your code is comments, there's something wrong. That's all that part of the equation means!
You only classify it as "bistromatics" because you're too lazy to do the thinking and figure out what it's for.
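For anyone who would rather compute than graph, here is a small numeric sketch in Python. It assumes the term is exactly sin(sqrt(2.4*x)) with the argument in radians, as written above; comment_term and argmax_on_grid are hypothetical helper names. Setting sqrt(2.4*x) = pi/2 puts the first turning point of the sine at x = pi^2/9.6, roughly 1.03, so the sketch lets a reader see where the maximum lands for whichever range of x they assume.

```python
import math


def comment_term(x: float) -> float:
    """The comment-dependent part of the formula: sin(sqrt(2.4 * x)), in radians."""
    return math.sin(math.sqrt(2.4 * x))


def argmax_on_grid(lo: float, hi: float, steps: int = 10_000) -> float:
    """Return the grid point in [lo, hi] where comment_term is largest."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(xs, key=comment_term)


# First x at which sqrt(2.4 * x) reaches pi/2, i.e. where sine first peaks.
turning_point = math.pi ** 2 / (4 * 2.4)

if __name__ == "__main__":
    print("turning point:", turning_point)
    print("argmax on [0, 1]:", argmax_on_grid(0.0, 1.0))
    print("argmax on [0, 2]:", argmax_on_grid(0.0, 2.0))
```

Running it with x as a fraction (0 to 1) and again with a wider range shows how much the reading of the metric depends on the units assumed for the comment percentage.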
Quality is still a happy user (Score:3, Insightful)
the works well and hopefully doesn't need a lot of documentation to make it work well. Great software tends to teach the user how to make it perform, or at least motivates the user to want to invest the time to master the software for a particular use.
These guys need to understand that this approach to quality applies to all software, irrespective of the development model behind it. A software product with a lot of customers creates the momentum to maintain and enhance that product. An OSS product can be infused with similar energy due to acceptance by a large community of users (esp. if many are programmers too). The feedback from the users incents the programmers to maintain and enhance the product.
New models can be built from hybrids of OSS (donated programming in the commons) and products that one must buy. If there emerges an ERP OSS app then there will be a business opportunity to document/train, support/fix/enhance/customize that application... and Oracle will feel the same frustration competing with that model that MS does competing with Linux.
These complaints against OSS as a model (no option to buy support or docs) are a business opportunity that has been put into play by JBOSS, MySQL, and soon to be hundreds of others. The low barrier to entry is the key to high usage... It's try and don't buy (unless you'd like some training, customization, focused product enhancements, etc).
Volume, usage and effectiveness drive the software world. Quality just makes the ride more comfortable. And OSS gets more comfortable every time the train pulls through the station.
Was in "Look at the Numbers!". Positive results. (Score:4, Informative)
OSS is no silver bullet. Their last point is "OSS code quality seems to suffer from the very same problems that have been observed in CSS projects." Er, big surprise, they're all software.
How many projects have died for maintainability? (Score:4, Insightful)
We can't use JSPs, they're hard to maintain!
We can't use Javascript, it's loosely typed!
We have to use an Object Broker, SQL is not maintainable!
All the projects that I have been on where code maintainability has been the primary goal have one thing in common. They all failed.
If you spend all of your time worrying about how the code looks, you will never finish the project. Talk to people who have built successful software. (The ones that sold millions of copies.) Very few of them are proud of the code they wrote, but they are happy with the product.
The focus should always be on product quality, not code quality.
Re:How many projects have died for maintainability (Score:4, Informative)
We can't use Javascript, it's loosely typed!
We have to use an Object Broker, SQL is not maintainable!
All the projects that I have been on where code maintainability has been the primary goal have one thing in common. They all failed.
If that is their idea of "maintainable", they didn't fail because they shot for maintainable, they failed because they drank the Kool-Aid and trapped themselves into software paradigms that only work when oodles of resources are thrown at them. Smaller teams require more agile methods to get results, and that is also the mechanism whereby smaller teams can produce software where larger teams failed. (It goes both ways, I'm not claiming that as an absolute. But that small teams can and have beaten much bigger ones is an unassailable fact.)
Certainly you've got some good facts at hand to learn from, but I think you're taking the wrong lesson away. Projects that simply ignore maintainability fail, too. Can you imagine Mozilla with no concern for maintainability, or the Linux kernel?
The focus should always be on product quality, not code quality.
If you don't have quality code, you don't have a quality product. You may have an adequate product. You may be in a situation where an adequate product is all you need. I have an adequate set of knives in my kitchen, because I can not afford quality knives. But I do not pretend that they are therefore quality knives.
You're calling for a classic short-term focus, and you can and will suffer the classic penalties for short-term focus. I know, I've seen it first hand and dragged software products out of their local optima by the sweat of my brow. It's not easy, but either it happens or the product dies a code-quality death.
You need to use the proper metric for quality. Inappropriately using and paying for a strong type system is anti-quality in my book; that goes for your other two examples as well, when done correctly. (SQL and JSP code both need to be rationally minimized via the application of Once and Only Once, but they are not the cause of the unmaintainability; the abandonment of Once and Only Once is. Once and Only Once is one of the most important aspects of any proper quality metric.) Your quality metric should have functionality built into it.
The language is very important (Score:5, Interesting)
- C# and Java are more readable than C++
- At the end of this list are functional programming languages.
If you can read source more easily, then maintainability will be better.
This article [paulgraham.com] will tell you why you should be interested in functional programming languages. If you're smart and open minded, you will be convinced.
The best functional languages are Haskell [haskell.org] and Erlang [erlang.org] (click "next" at the bottom of the page).
For example, with Java you prevent bugs by statically typing variables:
int numberOfTries = 3;
If you later try to fill "numberOfTries" with a string, the compiler will warn you of a bug and you'll have prevented it.
With Haskell, you don't have to type int. Haskell will figure out the type for you, you get the benefit of preventing bugs with the convenience of not having to type variables.
The reason I chose Erlang is that with purely functional programming languages like Erlang, you can automatically multitask your program over several CPUs (or this will take minimal effort). Nice feature to have in the future because every CPU manufacturer is going multi-core now. Also, you can easily make a server that never goes down with Erlang because your server is automatically clustered. Just plonk down a couple of networked PCs and if one dies, the server cluster will just keep on going (a bit slower) until you have replaced the power supply of the broken PC.
There are tons of other advantages but, as I said, the above links will convince you if you're smart. Haskell is a bit more academic in nature, they're figuring out the best possible language and Erlang is more polished and ready to go. It was invented by Ericsson to create ultra reliable realtime servers.
Re:Ah yes. (Score:3, Insightful)
Re:Design docs (Score:3, Insightful)
Not knocking inline documentation - I think it is a great idea, but you have to make sure that developers buy into it.
Really there is a lot of common sense that can go into coding standards to help reduce recurring bugs in "problem functions". Rules for initializing and using globals, rules for maximum method length, code ownership, and small group code walkthroughs can do a lot to prevent the kind of problem
Re:GUI (Score:5, Insightful)
You're actually trying to claim that Winamp's design is good?
Winamp and other players which try to emulate the look and feel of a "new wave" stereo do nothing but piss me off. Stereo systems have the bad interfaces they do because of an inherent lack of physical space; something that's still a concern with computers, but much less of one.
Here's to more programs like Rhythmbox and iTunes which have the *important* controls accessible, allow for easy categorisation of songs, and use screen space nicely. All that without having to resort to 6pt fonts.
Re:maintainability index = bullshit (Score:5, Insightful)
It's only amusing to people who don't bother to think about why it's there. It's actually a very insightful part of the metric.
First of all, perCM ranges from 0 to 1, not 0 to 100. Yes, that isn't explicitly stated, but it would be ridiculous otherwise. Second, try looking at the damn graph.
As I told somebody else, do it now. Don't pretend to do it, GRAPH the damn thing and look at it: sin(sqrt(2.4*x)) for x=0..1.
That graph makes it completely transparent what they're trying to accomplish with that part of the formula. First off, if comments are 0, the value is 0. Having no comments does not positively impact maintainability! Second, the function PEAKS at around 0.43. This represents an avgCM of 0.03, or 3%. Then, the function begins to go down again, but not as drastically as it rose.
What this is saying is that the benefit of comments has a maximum at around 3%. Having more comments than this tends to DECREASE the maintainability (and this is borne out by experience). However, having too many comments is better than having too few comments, so the function is skewed to the left side by the sqrt() function.
You see, every part of that expression makes total sense if you spend more than 2 nanoseconds thinking about it. Sheesh.
Re:maintainability index = bullshit (Score:3, Informative)
The etymology of the word is irrelevant. In practice, people use the term "percentage" to mean parts-in-100 OR a fraction. Look at the second definition listed in the dictionary. It's a "part of a whole." I've used it both ways many times. So has every engineer I've ever worked with. It's usually obvious from the context, as it was in this case.
Well, what you illustrate