Open Source Code Maintainability Analyzed (264 comments)

Posted by Zonk
from the i-read-about-this-in-college dept.
gManZboy writes "Four computer scientists have done a formal analysis of five Open Source software projects to determine how being 'Open Source' contributes to or inhibits source code maintainability. While they admit further research is needed, they conclude that open source is no magic bullet on this particular issue, and argue that Open Source software development should strive for even greater code maintainability." From the article: "The disadvantages of OSS development include absence of complete documentation or technical support. Moreover, there is strong evidence that projects with clear and widely accepted specifications, such as operating systems and system applications, are well suited for the OSS development model. However, it is still questionable whether systems like ERP could be developed successfully as OSS projects."
This discussion has been archived. No new comments can be posted.

  • by rdwald (831442) on Tuesday February 15, 2005 @05:40PM (#11682150)
    GNU General Public License (GPL)
    Berkeley Software Distribution (BSD)


    are all defined in the article.

    But not ERP.

    Go figure.
    It seems like ERP stands for Enterprise Resource Planning [wikipedia.org].
  • by melted (227442) on Tuesday February 15, 2005 @05:48PM (#11682252) Homepage
    I've worked on a major product in the CRM market, and let me tell you, you don't want to know what goes into the sausage. If you knew, you wouldn't touch this code with a 10-foot pole, much less bet your company on it.

    I'm sure it's the same with ERP. It's just a huge polished turd, but because you don't have the source code you don't know it's a turd. You only see the polish.
  • by Brent Nordquist (11533) <bjnord@gSLACKWAREmail.com minus distro> on Tuesday February 15, 2005 @05:51PM (#11682297) Homepage
    However, it is still questionable whether systems like ERP could be developed successfully as OSS projects.

    GNU | Enterprise [gnuenterprise.org]

  • No ERP, eh? (Score:1, Informative)

    by Anonymous Coward on Tuesday February 15, 2005 @05:56PM (#11682386)
    However, it is still questionable whether systems like ERP could be developed successfully as OSS projects.

    It is funny how sourceforge [sf.net] lists [sourceforge.net] several [sourceforge.net] ERP [sourceforge.net]-systems [sourceforge.net] then [sourceforge.net]. And the list goes on, by the way.
  • by Rei (128717) on Tuesday February 15, 2005 @06:21PM (#11682731) Homepage
    Well, the importance of a test suite rises dramatically with the complexity of the project. The difficulty of making a test suite increases with the amount of hardware that you need to implement it. When I think of "big" open source projects that aren't very hardware dependent, for example ITK (the Insight Toolkit), they tend to have nice test suites. Naturally, the little ones don't, but little projects of most things don't have test suites.

    I agree, though, that automated test suites are underused. Also, not enough programmers (OSS or otherwise) seem to understand the importance of refactoring.

    A message to coders: People, if your function is more than 10 lines long, you should start to consider splitting it. If it's more than 100 lines long, you're probably doing something wrong. If you have the same code written with slight modifications two or more different ways, you're probably doing something wrong. Use templates rather than repeating code if your language supports them. If you ever feel "this should probably be commented more", don't comment it - split it up into functions and let the functions be their own comments (if you have to, comment the functions as well). Use const as much as physically possible (in supporting languages). Use array objects that clean themselves up instead of arrays allocated on-the-fly whenever physically possible. If you find certain variables being used often together, group them into an object. If you find a set of functions operating on an object and only that object, make them member functions. Etc.

    Just doing basic refactoring can make code far more organized and readable.
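
    A few of the points above can be sketched in C++. This is only an illustration, not code from any project mentioned in the thread: Point, squaredNorm, sum and squaredNorms are hypothetical names made up for the example. It shows variables that travel together grouped into an object, a function that only touches that object turned into a const member, a template in place of repeated code, and a self-cleaning array object (std::vector) in place of a hand-allocated array.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: x and y are always used together,
// so they are grouped into one object.
struct Point {
    double x;
    double y;

    // Operates on this object only, so it is a member; const because
    // it does not modify the object.
    double squaredNorm() const { return x * x + y * y; }
};

// One template instead of a near-identical copy per element type.
template <typename T>
T sum(const std::vector<T>& values) {
    T total{};
    for (const T& v : values) {
        total += v;
    }
    return total;
}

// std::vector frees its own storage, so there is no new[]/delete[]
// pair to keep in sync.
std::vector<double> squaredNorms(const std::vector<Point>& points) {
    std::vector<double> result;
    result.reserve(points.size());
    for (const Point& p : points) {
        result.push_back(p.squaredNorm());
    }
    // Sanity check: one output value per input point.
    assert(result.size() == points.size());
    return result;
}
```

    Calling sum(squaredNorms(points)) then reads almost like the comment one would otherwise have written above a hand-rolled loop.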
  • by dwheeler (321049) on Tuesday February 15, 2005 @06:37PM (#11682935) Homepage Journal
    It's worth noting that a slightly older variation of this paper was already referenced in Why Open Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers! [dwheeler.com] back in 2004-09-30. Look at their results, the actual numbers give a rather positive story: "1. Using tools such as MI derived for measuring CSS quality, OSS code quality appears to be at least equal and sometimes better than the quality of CSS code implementing the same functionality."

    OSS is no silver bullet. Their last point is "OSS code quality seems to suffer from the very same problems that have been observed in CSS projects." Er, big surprise, they're all software.

  • by pclminion (145572) on Tuesday February 15, 2005 @06:49PM (#11683074)
    They take sin(sqrt(mumble_percent)).

    Now, I'm all for empirical data, but that is just bistromatics and totally insane.

    Metrics are already "black magic." This one is no worse or better than any other dimensionless metric I've seen.

    Obviously the input is in radians. The argument to a trig function is always assumed to be radians unless otherwise specified. Now, the sqrt(mumble percent) can only range from 0 to 1, so what we're looking at here is the graph of the sin function from 0 to 2.4 radians.

    Do it now. Graph it. Graph the function sin(sqrt(2.4*x)) from x=0..1

    Notice that this function (you might call it a transfer function) ramps up and peaks at 0.43 radians. That corresponds to a comment percentage of 3%. Then it begins to go down again. What does this mean? It means that there is a point beyond which more comments are not useful. If more than 3% of your code is comments, there's something wrong. That's all that part of the equation means!

    You only classify it as "bistromatics" because you're too lazy to do the thinking and figure out what it's for.
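
    The "graph it" suggestion above can also be followed numerically. A minimal sketch, assuming the term is sin(sqrt(2.4*x)) exactly as written in this comment; commentTerm and argmaxCommentTerm are hypothetical helper names, and the range passed in (0..1 for a fraction, 0..100 for a percentage) is left to the caller because the thread does not agree on it.

```cpp
#include <cassert>
#include <cmath>

// The comment-share term as written above, without any constant in front.
// x is the share of comments; its scale is the caller's choice.
double commentTerm(double x) {
    return std::sin(std::sqrt(2.4 * x));
}

// Hypothetical helper: sample commentTerm on [lo, hi] at evenly spaced
// points and return the x with the largest sampled value.
double argmaxCommentTerm(double lo, double hi, int samples) {
    assert(samples >= 2 && lo < hi);
    double bestX = lo;
    double bestValue = commentTerm(lo);
    for (int i = 1; i < samples; ++i) {
        const double x = lo + (hi - lo) * i / (samples - 1);
        const double value = commentTerm(x);
        if (value > bestValue) {
            bestValue = value;
            bestX = x;
        }
    }
    return bestX;
}
```

    Since sine reaches its first maximum at pi/2, the sampled maximum should land where 2.4*x is about 2.47 whenever that point lies inside the sampled range, and at the upper end of the range otherwise.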

  • by rjstanford (69735) on Tuesday February 15, 2005 @07:14PM (#11683344) Homepage Journal
    The problem with a software company filling this role is that their system is proprietary and unmodifiable by the client. Most companies *do* have the resources to hire a programmer or a contractor to add a feature to a piece of OSS.

    Not quite. Most enterprise software comes with source available, and pretty much all of it gets customized once you get into bigger customers. It's actually a real PITA when it comes time to do upgrades. And yes, I'm an architect at an ERP vendor.
  • by pclminion (145572) on Tuesday February 15, 2005 @07:37PM (#11683618)
    Oh, it is quite explicitly stated: perCM is described as a "percentage" (as in "per hundred"), so it ranges from 0 to 100, not 0 to 1.

    The etymology of the word is irrelevant. In practice, people use the term "percentage" to mean parts-in-100 OR a fraction. Look at the second definition listed in the dictionary. It's a "part of a whole." I've used it both ways many times. So has every engineer I've ever worked with. It's usually obvious from the context, as it was in this case.

    Well, what you illustrate again is that the MI is a seat-of-the-pants kind of measure that was thrown together because it looks nice, not because anybody thought about statistical validity.

    Do you even know what the term "index" MEANS? It's a magical number used to quantify the unquantifiable! It's dimensionless! What, do you think the Consumer Price Index is any less black magic? Nobody ever implied that the number has any statistical validity, it is just a number which happens to do a fairly good job at helping people compare things.

    If it's useful to you, use it. If not, don't.

  • by BlueCodeWarrior (638065) <steevk@gmail.com> on Tuesday February 15, 2005 @08:11PM (#11684000) Homepage
    Speaking of CPAN, I use Acme::Bleach [cpan.org] to ensure that all of my Perl scripts are clean, easy to understand, and maintainable.
  • by Rei (128717) on Tuesday February 15, 2005 @08:12PM (#11684003) Homepage
    > e.g., writing just one function and using a function pointer to control
    > the inner loop behavior

    No, function pointers are evil. Pointers themselves can make things rather nasty, although they are important in some situations. Use inheritance and virtual inline functions. Inline functions run as fast as macros, and inheritance is a lot cleaner than using function pointers.

    > It is a hand-unrolled implementation of a bit-oriented run-length encoder

    I'd like to see a piece of code that by hand-unrolling runs 7 times faster and can't be functionalized. I sure haven't seen one yet, and can't picture how that would even be possible.

    > away from typical ideas of "maintainability" for the sake of speed

    Seldom is that the case nowadays in modern languages. You sound like you're writing in pure C, though ;)
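
    The function-pointer versus inheritance point earlier in this comment can be sketched as follows. This is an illustration only, not the code under discussion: ElementOp, doubleIt, Operation, Doubler and the two sum functions are hypothetical names. Whether either call gets inlined depends on the compiler and on whether it can see the concrete target, so the sketch makes no performance claim.

```cpp
#include <cassert>
#include <vector>

// Variant 1: a function pointer selects the inner-loop behavior.
using ElementOp = int (*)(int);

int doubleIt(int v) { return 2 * v; }

int sumWithPointer(const std::vector<int>& values, ElementOp op) {
    assert(op != nullptr);
    int total = 0;
    for (int v : values) {
        total += op(v);
    }
    return total;
}

// Variant 2: inheritance with a virtual function selects the behavior.
struct Operation {
    virtual ~Operation() = default;
    virtual int apply(int v) const = 0;
};

struct Doubler : Operation {
    int apply(int v) const override { return 2 * v; }
};

int sumWithVirtual(const std::vector<int>& values, const Operation& op) {
    int total = 0;
    for (int v : values) {
        total += op.apply(v);
    }
    return total;
}
```

    Both variants keep a single copy of the loop; they differ only in how the per-element operation is handed in.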
  • by Jerf (17166) on Tuesday February 15, 2005 @10:27PM (#11685144) Journal
    We can't use JSPs, they're hard to maintain!
    We can't use Javascript, it's loosely typed!
    We have to use an Object Broker, SQL is not maintainable!

    All the projects that I have been on where code maintainability has been the primary goal have one thing in common. They all failed.


    If that is their idea of "maintainable", they didn't fail because they shot for maintainable, they failed because they drank the Kool-Aid and trapped themselves into software paradigms that only work when oodles of resources are thrown at them. Smaller teams require more agile methods to get results, and that is also the mechanism whereby smaller teams can produce software where larger teams failed. (It goes both ways, I'm not claiming that as an absolute. But that small teams can and have beaten much bigger ones is an unassailable fact.)

    Certainly you've got some good facts at hand to learn from, but I think you're taking the wrong lesson away. Projects that simply ignore maintainability fail, too. Can you imagine Mozilla with no concern for maintainability, or the Linux kernel?

    The focus should always be on product quality, not code quality.

    If you don't have quality code, you don't have a quality product. You may have an adequate product. You may be in a situation where an adequate product is all you need. I have an adequate set of knives in my kitchen, because I can not afford quality knives. But I do not pretend that they are therefore quality knives.

    You're calling for a classic short-term focus, and you can and will suffer the classic penalties for short-term focus. I know, I've seen it first hand and dragged software products out of their local optima by the sweat of my brow. It's not easy, but either it happens or the product dies a code-quality death.

    You need to use the proper metric for quality. Inappropriately using and paying for a strong type system is anti-quality in my book; that goes for your other two examples as well, when done correctly. (SQL and JSP code both need to be rationally minimized via the application of Once and Only Once, but they are not the cause of the unmaintainability; the abandonment of Once and Only Once is. Once and Only Once is one of the most important aspects of any proper quality metric.) Your quality metric should have functionality built into it.
  • by Skuld-Chan (302449) on Wednesday February 16, 2005 @02:46AM (#11686565)
    I'm on the support side of a CRM product and I totally believe you.

    It's often a case of fix one bug, create two more. We've got customers who refuse to upgrade because they are worried about losing data, running into strange bugs that didn't exist in previous versions, etc.

    I think a lot of it stems from the fact that developers of this stuff tend to focus on putting new features in the program rather than stabilizing or documenting it.
  • Urban legend alert! (Score:3, Informative)

    by Anonymous Brave Guy (457657) on Wednesday February 16, 2005 @03:35PM (#11691764)
    People, if your function is more than 10 lines long, you should start to consider splitting it. If it's more than 100 lines long, you're probably doing something wrong.

    So I've been told, sometimes by some of the biggest names in programming. Unfortunately, a firm belief across the industry doesn't make it right.

    Rather than debunking this one here, I'll simply refer you to Steve McConnell's excellent Code Complete. McConnell cites a large amount of hard data to show that longer routines can be at least as good on both development time and error count grounds as shorter routines, and indeed exceptionally short routines (the 10-liners you're advocating) are amongst the worst on both metrics.

    As an aside for general interest, since I'm sure a lot of people reading this comment also found that book very good, it seems a second edition has recently been published, updating the examples by a decade or so and putting much more emphasis on recent coding approaches, particularly OO. Whether that is an improvement remains to be seen, but I'll certainly be buying a copy. I guess if he's reversed his position based on more recent studies, I'll have to eat my words, too, but I doubt it. ;-)
