Coverity Report Finds OSS Bug Density Down Since 2006

Posted by timothy
from the but-3333-is-a-cool-number dept.
eldavojohn writes "In 2008, static analysis company Coverity analyzed security issues in open source applications. Their recent study of 11.5 billion lines of open source code reveals that between 2006 and 2009 static analysis defect density is down in open source. The numbers say that open source defects have dropped from one in 3,333 lines of code to one in 4,000 lines of code. If you enter some basic information, you can get the complimentary report that has more analysis and puts three projects at the top tier in quality of the 280 open source projects: Samba, tor, OpenPAM, and Ruby. While Coverity has developed automated error checking for Linux, their static analysis seems to be indifferent toward open source."
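To put the two figures from the summary on a common scale, here is a minimal sketch that converts "one defect in N lines" into defects per thousand lines (KLOC) and works out the relative change. The function name `defects_per_kloc` is made up for illustration; only the 3,333 and 4,000 figures come from the summary.

```python
def defects_per_kloc(lines_per_defect):
    """Convert 'one defect per N lines' into defects per 1,000 lines."""
    return 1000.0 / lines_per_defect


before = defects_per_kloc(3333)  # 2006 figure quoted in the summary
after = defects_per_kloc(4000)   # 2009 figure quoted in the summary

# Relative reduction in defect density between the two figures.
relative_drop = (before - after) / before

print(f"2006: {before:.3f} defects/KLOC")
print(f"2009: {after:.3f} defects/KLOC")
print(f"relative reduction: {relative_drop:.1%}")
```

So the headline change is a drop of roughly a sixth in Coverity-reported defects per line, which says nothing by itself about defects the tool does not detect.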
This discussion has been archived. No new comments can be posted.

  • Three? (Score:5, Funny)

    by Dhar (19056) on Wednesday September 23, 2009 @02:30PM (#29519217) Homepage

    "...puts three projects at the top tier in quality of the 280 open source projects: Samba, tor, OpenPAM, and Ruby."

    Counting, apparently, was low in quality.

    • Re: (Score:1, Insightful)

      by Anonymous Coward

      TFA says four.

      So, not only are the /. summaries merely paragraphs copied from the article nowadays, they're paragraphs copied incorrectly.

      • Re:Three? (Score:5, Funny)

        by akadruid (606405) <slashdot@thedrBO ... o.uk minus berry> on Wednesday September 23, 2009 @02:38PM (#29519337) Homepage

        and then you get so-called slashdotphiles, who think they can hear artifacts in the lossy story compression.

        let's see how you fare in a double blind test

        • Re: (Score:3, Funny)

          by Ihmhi (1206036)

          I have gold-plated Ethernet cables, so my Internets sound nice and crisp. You can really hear the richness in the lower kbps range.

      • Re:Three? (Score:5, Insightful)

        by eldavojohn (898314) * <eldavojohn.gmail@com> on Wednesday September 23, 2009 @02:59PM (#29519627) Journal

        TFA says four.

        So, not only are the /. summaries merely paragraphs copied from the article nowadays, they're paragraphs copied incorrectly.

        So if my summary was "merely paragraphs copied from the article" then where did I get the 1 in 3,333 and 1 in 4,000 numbers from?

        Also, if all I did was copy/paste the article, I'd be plagiarizing and -- not only that -- I would have copy/pasted the correct count of the projects in Rung 3 status. Instead I skimmed the report and was thinking "Rung 3" when I wrote that sentence, so the three was put in instead of the four. Doesn't make me any less wrong but I hate anonymous non-constructive criticism that's modded up. I apologize for my human error; obviously the human editor also missed it. Since you're anonymous, I can't assume you're human and beg you to relate to my plight of errors. I'm sure my error made the summary completely unreadable. I'm also certain that you've published hundreds of articles on Slashdot without so much as a single error in any of them.

        You do know that of the submissions I've had recently, almost all have had some flaw or error in them? That's simply because I realize there's no reward for fact checking, and there's no penalty for getting an error published. So assuming the summary sells to eyeballs and there's no error large enough to get it rejected, the next thing is timing. I've written submissions that have been beaten out by a few minutes and I get marked "dupe" by firehose. So that pushes me from taking 10-15 minutes to create a summary to 2-3 minutes. Oh well, the worst penalty is if I respond to the article (like this) I'm modded down by righteous moderators. Doesn't really bother me.

        If the editors aren't catching the errors and I've got no incentive to reduce the errors, do you think they're going to go away?

        • Re: (Score:1, Insightful)

          by Anonymous Coward

          You have an excuse. Mistakes happen.

          Mistakes like this are why we have editors. The post you replied to was somewhat out of line, though as a general rule I'd say they would have been more accurate than they were in this case. Most submissions ARE copied directly from TFA.

          The real issue is that this was a blatantly obvious, easy-to-catch mistake. We're not talking about to/too or their/there issues that a technically-oriented person may not pick up on at first glance; we're talking about something that t

        • Re: (Score:3, Insightful)

          by evanbd (210358)
          We're bitching about the slashdot editors, not you. It's their job to catch submitter mistakes. That is what an editor does. The really annoying thing is they're as likely to "edit" the summary to introduce mistakes as to remove them.
        • by pha3r0 (1210530)

          Honestly here everyone. It's a human error and it's bound to happen. Hell, why even waste time bitching? Just read on, use your brain to figure out that the 3 was an oversight, and spend all the time you do bitching on a more productive task.

          And I know we are all anal retentive nitpicks but by FAR /. has clearer, more intelligible writing than any other news or news summary site.

          I find typos, not just grammar errors, in almost every major news story I read. Give the volunteers a break and go chew on someone that's

          • Honestly here everyone. It's a human error and it's bound to happen. Hell, why even waste time bitching? Just read on, use your brain to figure out that the 3 was an oversight, and spend all the time you do bitching on a more productive task.

            It is fair sport to pick nits with spelling, etc. It becomes tedious and unsporting when the nitpicking erupts into a flame war.

      • by zizzo (86200)

        It's true but static analysis can fix this problem.

    • "Coverity Report Finds OSS Bug Density Down Since 2006"

      Bad news for entomologists, huh?

    • Re: (Score:1, Redundant)

      by Tubal-Cain (1289912)
      0. Samba
      1. tor
      2. OpenPAM
      3. Ruby
  • Fewer but bigger (Score:1, Interesting)

    by Anonymous Coward

    Why would Samba and Linux have got so unstable over the years, then?

    • Re: (Score:1, Offtopic)

      by poetmatt (793785)

      Maybe because grammar is tough for windows users?

    • by Virak (897071)

      Wait, Samba and Linux are unstable now? That's news to me. I can't remember the last time either crashed for me ever.

    • by jhol13 (1087781)

      Possibilities are endless, just a few here:
      1. Fixing bugs found by Coverity might give a false sense of "goodness", especially as:
      2. Coverity does not catch all problems, e.g. timing or parallelism related. Dual cores are now abundant.
      3. A lot of hardware is flaky.

  • by ifwm (687373)

    puts three projects at the top tier in quality of the 280 open source projects: Samba, tor, OpenPAM, and Ruby

    Hmmm...

    In all seriousness, this seems to point to an increasing level of sophistication and maturity in OSS products and procedures, which can only be a good thing.

    • by V!NCENT (1105021)

      What was NASA's 'bug guideline'? I remember seeing or reading it somewhere; I thought it was one bug in 10,000 lines of code.

      I could be absolutely wrong! But I just like to know...

    • Not really, it's a meaningless statistic. Coverity has been publishing these reports for a few years. Every time they do, the relevant projects fix all of the bugs they find. The next year, some proportion of that code is the same code that already had these bugs fixed, so if the total number of bugs per line of code didn't go down it would be quite disappointing. On top of that, there are other static analysis tools, like clang, that are used by a lot of open source projects. Even if Coverity reports
  • by StuartHankins (1020819) on Wednesday September 23, 2009 @02:34PM (#29519281)
    "... and puts three projects at the top tier in quality of the 280 open source projects: Samba, tor, OpenPAM, and Ruby."

    Our chief weapon is surprise...surprise and fear...fear and surprise....
    Our two weapons are fear and surprise... and ruthless efficiency....
    Our three weapons are fear, surprise, and ruthless efficiency...
    and an almost fanatical devotion to the Pope....
    Our four... no...
    Amongst our weapons... Amongst our weaponry...
    are such elements as fear, surprise...
    I'll come in again.
  • by MosesJones (55544) on Wednesday September 23, 2009 @02:34PM (#29519283) Homepage

    The question, of course, is "Is 4000 good, average or bad?" It can't be answered because closed source companies just aren't going to publish this sort of information.

    So what we can say is that the quality of OSS is trending upwards, but we can't say whether this makes it better, equivalent or worse than closed source competitors.

    What are the odds on any of them taking up the challenge?

    • Re: (Score:2, Informative)

      by Anonymous Coward

      Actually the topic is the subject of research and the blog below quotes some book that says Microsoft is at 1/2000 lines of code.
      http://amartester.blogspot.com/2007/04/bugs-per-lines-of-code.html

      Of course, these studies try to assess the number of defects that have not been found yet... So the numbers are to be taken with a grain of salt, but apparently testing the software before delivery gets 90% of the bugs.

      The Coverity report is likely based on what the tool says, so you need a grain of salt for that too.

      Th

      • Actually the topic is the subject of research and the blog below quotes some book that says Microsoft is at 1/2000 lines of code.

        If some blog quotes some book that makes some claim about Microsoft being worse than Linux, that's good enough evidence for me!

      • Re: (Score:3, Interesting)

        by jc42 (318812)

        There can be some serious "methodology" problems in many of the definitions of "bugs", that can seriously confuse the bug counters.

        An example that I like to use is a project I worked on in the late 1990s. An important part of the package that I delivered included a directory of several hundred C source files, mostly small, with at least one bug in each. The project's leaders got some chuckles out of mentioning this at meetings, commenting that they had no intention of letting me fix any of the bugs, since

    • by MathFox (686808)
      What I heard from a Coverity employee doing a presentation is that the best closed source/commercial projects score as well as the best Open Source projects; bad commercial projects do as badly as bad Open Source projects.

      In other words, the variation in both categories is so big (more than a factor of 10!) that one cannot say either side is better with statistical relevance.

    • Re: (Score:1, Funny)

      by Anonymous Coward

      Actually, we did test our code here at Microsoft, we have 4200 defects by line of code, which is much better than the 4000 of open-source projects.

      wait a second...

    • by Strake (982081)

      The question, of course, is "Is 4000 good, average or bad?" It can't be answered because closed source companies just aren't going to publish this sort of information.

      This is part of the reason that OSS is better than closed-source competitors - the bugs are widely-known, and therefore can be more readily fixed.

      This is also part of the reason that the quality of OSS is trending upwards.

    • by bloodhawk (813939)
      You can't even really say that quality of OSS is trending upwards. The same company using the same tools is doing the analysis. This brings a certain degree of bias, as many of the flaws they point out the first time are fixed and therefore artificially lower the error count. The kind of analysis needed is one that covers stuff this code inspection tool doesn't cover to see how the error rates are really trending.
    • They probably wouldn't make a good representative sample, but you could take the source code of projects that were formerly closed and subsequently opened to see how many errors they averaged. The id Software engines come immediately to mind.

  • ... or less effective bug checking?

  • That's some good coding... Makes me feel like a n00b. I'm not sure what my bug to code ratio is, but I'm sure it's a lot higher than that.
    • It is not so much that 1 line out of 4000 is average for each programmer. It is just that they fix the bugs before release.

      • by jc42 (318812)

        They also decrease the bugs-per-line count in their coding standards. That's why you see lots of blank lines in the code, lines that contain just a single brace, etc. The more lines you can spread your code over, the fewer bugs you have per line.

        If you don't like this observation, you shouldn't be measuring bugs-per-line. But nearly every company does just that.
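A small sketch of the effect described above: the same one-bug function laid out two ways, measured per physical line and per "code line". The definition of a code line used here (not blank, not a lone brace) is a deliberately crude assumption for illustration, and `count_lines` is a hypothetical helper, not anything Coverity is known to use.

```python
def count_lines(source):
    """Return (physical_lines, code_lines) for a source string.

    'Code lines' is a crude, illustrative definition: lines that are
    neither blank nor just a single brace.
    """
    physical = source.splitlines()
    code = [ln for ln in physical if ln.strip() not in ("", "{", "}")]
    return len(physical), len(code)


# The same function (with the same single bug) in two layouts.
compact = "int f(int x) {\n    return x / 0;\n}\n"
padded = "int f(int x)\n{\n\n    return x / 0;\n\n}\n"

BUGS = 1  # one division by zero in either layout

for name, src in (("compact", compact), ("padded", padded)):
    physical, code = count_lines(src)
    print(f"{name}: {BUGS / physical:.2f} bugs per physical line, "
          f"{BUGS / code:.2f} bugs per code line")
```

Under these assumptions the padded layout halves the bugs-per-physical-line figure while the per-code-line figure stays the same, which is why the choice of line-counting rule matters when comparing densities.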

        There was also the funny thing a few years ago, when MS was claiming that some percent of linux code was stolen from Windows. Someone did a gre

        • I don't know.

          If someone uses retval for the return value they must be stealing my code.

          • by jc42 (318812)

            Then there's the infamous case of the AT&T /bin/true program, which was a shell script that contained nothing but a blank line and a copyright notice. So if you include blank lines in your code, you're violating AT&T's copyright.

            I had fun once (around 1990) by "publishing" the entire text of one of these on a newsgroup, and publicly challenging AT&T's lawyers to take me to court over this blatant copyright violation. For some unexplained reason, I never heard from them.

            (If you google for "/bin

    • by jimicus (737525)

      Remember the bug finding is automated. There are only some classes of bugs that can be automatically found.

      • And not just automated bug finding, but bug finding by static analysis. This is notoriously bad at finding bugs in programs that use shared libraries or indirection layers (e.g. code that calls other code via function pointers).
  • It seems logical that older security issues are more well-known and documented than newer ones. Is it possible that the results do not point to an improvement in coding quality so much as an inability to detect newer flaws as accurately as older flaws?
  • Survivorship bias (Score:5, Interesting)

    by vlm (69642) on Wednesday September 23, 2009 @03:04PM (#29519689)

    Survivorship bias

    http://en.wikipedia.org/wiki/Survivorship_bias [wikipedia.org]

    The projects that were alive back then, and now, are obviously more mature, thus would have fewer bugs. Unless you believe in spontaneous generation of bugs at a constant rate in unchanged code (in my experience, actually not too unbelievable for old C++ compiled by the newest G++ due to specification drift)

    • Re: (Score:1, Informative)

      by Anonymous Coward

      An old project doesn't necessarily mean old code. Currently, on average each day the Linux kernel adds 13K lines, deletes 5K lines, and changes 2.8K lines. Over a year, that works out to roughly 4.5M lines, 2M lines, and 1M lines.

      For a project with roughly 12M lines of code, that's a pretty significant amount of churn.
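The yearly figures can be checked with a few lines of arithmetic. This is only a sketch using the per-day numbers quoted above and an assumed 365-day year; the "touched share" at the end is a rough illustrative ratio, since the same line can be added or changed more than once in a year.

```python
DAYS_PER_YEAR = 365  # assumption: churn is spread evenly over the year

# Per-day figures as quoted in the comment above.
daily = {"added": 13_000, "deleted": 5_000, "changed": 2_800}
yearly = {kind: n * DAYS_PER_YEAR for kind, n in daily.items()}

total_lines = 12_000_000  # approximate project size quoted above
touched = yearly["added"] + yearly["changed"]

for kind, n in yearly.items():
    print(f"{kind}: {n / 1e6:.2f}M lines per year")
print(f"added or changed per year vs. total size: {touched / total_lines:.0%}")
```

The exact products come out slightly different from the rounded "4.5M, 2M, 1M", but the conclusion is the same: a large fraction of the code base is new or modified each year.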

  • I wonder if these code improvements lead to overall usability. That seems to be the biggest stumbling block for open source, not stability.
  • From the press release [coverity.com]:

    Since 2006, more than 11,200 defects in open source programs have been eliminated as a result of using the Coverity Scan service.

    While this is good for open source and demonstrates the value of static analysis, it is not surprising that if you fix the issues found, the number of issues remaining will go down.

    • Re: (Score:2, Informative)

      by chromatic (9471)

      If you fix the issues, Coverity moves the project to a new rung and performs stricter analysis to find more types of errors.

  • The bug-per-line count doesn't really give you a reasonable measure of product stability. A bug in the hotspot code is far more likely to be triggered by the end user than one in the rest of the software is.

    So why not correlate the bug distribution with profiling data? I don't think this should be too difficult. You don't even need to do the profiling yourselves; when you obtain the source code just ask for the data from developers.
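One way the idea above could be sketched: weight each reported defect by how often its enclosing function runs in a profile, then rank by that weight. Everything here is hypothetical: the function names, the defect descriptions, the call counts, and the `rank_defects` helper are invented for illustration and do not reflect Coverity's output format or any real project.

```python
def rank_defects(defects, call_counts):
    """Order defects by the share of profiled calls hitting their function.

    defects: list of (function_name, description) pairs
    call_counts: mapping of function_name -> number of profiled calls
    """
    total = sum(call_counts.values()) or 1  # avoid division by zero
    scored = [
        (call_counts.get(fn, 0) / total, fn, desc)
        for fn, desc in defects
    ]
    return sorted(scored, reverse=True)


# Hypothetical static-analysis findings and profiling data.
defects = [
    ("parse_header", "possible null dereference"),
    ("init_config", "resource leak"),
    ("log_debug", "unchecked return value"),
]
call_counts = {"parse_header": 90_000, "init_config": 1, "log_debug": 9_999}

ranked = rank_defects(defects, call_counts)
for weight, fn, desc in ranked:
    print(f"{weight:8.5f}  {fn}: {desc}")
```

Call frequency is only a proxy for how likely a user is to hit a defect; a rarely executed path can still hide the most damaging bug, so a ranking like this would complement the raw count rather than replace it.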

  • I love Coverity. I love other static analysis tools too -- I'm one of the lead developers for Perl::Critic, which performs static analysis on Perl code. They are enormously valuable tools.

    However, I've seen many cases where people read the issue report from the tool and fix the symptom rather than the problem. The improvement from 1 in 3333 to 1 in 4000 is fantastic, but that means 1 *Coverity issue* in 4000, not 1 *bug* in 4000 lines.

    My current closed source project has a Coverity count of 2 issues in 1
