Python Programming

Open-Source Python Code Shows Lowest Defect Density 187

Posted by Soulskill
from the errors-should-never-pass-silently dept.
cold fjord sends news that a study by Coverity has found open-source Python code to contain a lower defect density than any other language. "The 2012 Scan Report found an average defect density of .69 for open source software projects that leverage the Coverity Scan service, as compared to the accepted industry standard defect density for good quality software of 1.0. Python's defect density of .005 significantly surpasses this standard, and introduces a new level of quality for open source software. To date, the Coverity Scan service has analyzed nearly 400,000 lines of Python code and identified 996 new defects — 860 of which have been fixed by the Python community."

  • by Anonymous Coward on Tuesday September 03, 2013 @05:25PM (#44751007)

    "Coverity fails to detect errors in Python" would be my headline of choice here. Seems a much more reasonable explanation for the results.

    • by julesh (229690)

      "Coverity fails to detect errors in Python" would be my headline of choice here. Seems a much more reasonable explanation for the results.

      Or, to put it another way, "static analysis tool fails to detect many potential errors in code whose authors use the same static analysis tool to find and fix potential errors." Which is hardly surprising.

  • by OzPeter (195038) on Tuesday September 03, 2013 @05:27PM (#44751033)

    I read TFS and both TFAs, and all I can glean is that the Coverity Scan service is some sort of report that measures defects in code, but it never defines how such defects are determined. The articles also mention comparing open source code metrics, but the only project that is mentioned anywhere is Python.

    So what is a Coverity Scan service and why should I care? After all I can make up all sorts of metrics about my own software.

    • What is Coverity Scan service? It is a product they hope to sell you. Does advertising work? It just did!

    • by msauve (701917)
      "Coverity's code-scanning system for open-source projects... has been in place since 2006, when the effort was first funded by the U.S. Department of Homeland Security (DHS)."

      A defect is when the code uses encryption, and doesn't send the keys to the NSA, or uses smtplib, and doesn't bcc:archives@dea.gov.
    • by Krishnoid (984597) * on Tuesday September 03, 2013 @05:41PM (#44751165) Journal
      Here's the Python devs' own page [python.org] describing it and how to get to the results.
    • So what is a Coverity Scan service

      It's the same idea as the 'lint' command, it picks up potential bugs.

      These sorts of tools can help improve the quality of your code. Having said that, in my (20+) years of experience it's not common practice to use these things. I've worked on several large "mission critical" systems and the Y2K ordeal was the only time someone even asked if I used such a tool, let alone demanded it. At the end of the day (actually more like a month) the "Y2K lint" tool's only practical achievement was to tick a due-dil

      • by gl4ss (559668)

        if it's the same as lint then eh..

        of course it has less "defects" to complain about.. with all that whitespace shit defined in language and all.

        "style errors" aren't defects. they're just matter of deciding who decides the right style..

    • by kermidge (2221646)

      Warning: pdf
      http://wpcme.coverity.com/wp-content/uploads/2012-Coverity-Scan-Report.pdf [coverity.com]
      explains much if not all that you ask

      For a good article and a fun read that goes into the background of Coverity and what it does, see
      http://cacm.acm.org/magazines/2010/2/69354-a-few-billion-lines-of-code-later/fulltext [acm.org]
      it's written by some of the developers and founders

  • Where is the study? (Score:2, Informative)

    by achacha (139424)

    I could not find a link to the actual study, instead the company links lead back to the article and the article leads back to the company home page. Is this more "faith-based computing"? I am interested in the comparisons to other languages and in what type of code was analyzed.

  • Hmmm (Score:5, Informative)

    by Anonymous Coward on Tuesday September 03, 2013 @05:32PM (#44751075)

    TFA seems to be about the Python interpreter, also known as CPython (because it's implemented in C), rather than about code written in Python itself. So maybe it has nothing to do with the Python language, but everything to do with the fact that the Python authors are apparently awesome C programmers.

    That's great, but most people interpret "Open Source Python Code" to mean code written in Python that is Open Source, not code written in C (to implement the Python interpreter) that is Open Source.

    • by Laxori666 (748529)
      Oh that is extremely misleading. To be honest though I did some mental math and I thought, really, out of the entirety of all programs written in Python, there's only one defect per 200,000 lines of code? Unlikely. Now it all makes sense. My world view has been repaired by your astute observations. If only you had not posted AC so I could direct merit and praise to the appropriate username and so your karma would increase thus assuring you a better rebirth in a heavenly realm wherein you could hopefully mee
  • Does it mean better coders, or better language? Seems like the results are ambiguous in their meaning.

  • The Slashdot summary is confusing, as is the eweek.com headline. Reading the article, it is clear that it is about the code that powers the official Python interpreter, AKA CPython, AKA /usr/bin/python. When I clicked the link, I thought Coverity had surveyed the entire world of open source Python code and discovered that Python programmers as a whole publish higher quality code than people who e.g. program in Ruby. That's not what the article's about.

    It'd be great if the headline in Slashdot were to be fixed to say, "Python interpreter has fewer code defects compared to other open source C programs, says Coverity."

    • by jrumney (197329)

      That makes more sense. From the summary, I thought the most likely scenario was that Coverity does not handle Python code very well based on my experience of random buggy Python code. It is to be expected that a widely used VM/interpreter is going to be of better quality than your average code.

  • Math impairment (Score:5, Informative)

    by fava (513118) on Tuesday September 03, 2013 @05:36PM (#44751113)

    0.005 defects per thousand lines times 400,000 lines gives a total defect count of 2.

    So where did the other 994 defects come from?
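The arithmetic above is easy to check. This is a minimal sketch that uses only the figures quoted in the summary (400,000 lines, 996 defects found, 860 fixed, a density of 0.005 per thousand lines); the function and variable names are made up for illustration, and nothing here is taken from the Coverity report itself.

```python
# Figures as quoted in the story summary (assumed to be per 1,000 lines).
LINES_SCANNED = 400_000   # "nearly 400,000 lines"
DEFECTS_FOUND = 996       # "identified 996 new defects"
DEFECTS_FIXED = 860       # "860 of which have been fixed"
QUOTED_DENSITY = 0.005    # quoted defect density


def density_per_kloc(defects: int, lines: int) -> float:
    """Defects per thousand lines of code."""
    return defects / (lines / 1000)


def defects_implied(density: float, lines: int) -> float:
    """Defect count implied by a per-kLOC density."""
    return density * (lines / 1000)


implied = defects_implied(QUOTED_DENSITY, LINES_SCANNED)
found_density = density_per_kloc(DEFECTS_FOUND, LINES_SCANNED)
outstanding_density = density_per_kloc(DEFECTS_FOUND - DEFECTS_FIXED, LINES_SCANNED)

print(f"defects implied by {QUOTED_DENSITY}/kLOC: {implied:.0f}")
print(f"density if all {DEFECTS_FOUND} defects count: {found_density:.2f}/kLOC")
print(f"density of the unfixed remainder: {outstanding_density:.2f}/kLOC")
```

With these inputs the three values come out as 2, 2.49 and 0.34, so neither the full count nor the unfixed remainder reproduces the quoted 0.005 when "nearly 400,000 lines" is taken at face value.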

    • by Tumbleweed (3706)

      0.005 defects per thousand lines times 400,000 lines gives a total defect count of 2.

      So where did the other 994 defects come from?

      They were in comments.

    • I'm more interested in this software that detects bugs in code. Does it also solve the halting problem? Can it satisfy finite combinational logic in polynomial time?

      • I'm more interested in this software that detects bugs in code. Does it also solve the halting problem? Can it satisfy finite combinational logic in polynomial time?

        They don't claim to find all bugs. I have used Coverity, and they found quite a few bugs, and also found many instances of unclear code that wasn't really a bug but should be rewritten anyway. But they don't find most logic bugs, or flaws in your requirements, etc. You still have to use your brain for those. But you can use tools like Coverity and other dynamic and static analysis tools to flag the easy bugs so you can spend more time on the hard bugs.

        • Does it analyze source code or is it like a fuzz tester?

          • Re:Math impairment (Score:5, Informative)

            by ShanghaiBill (739463) on Tuesday September 03, 2013 @09:06PM (#44752449)

            Does it analyze source code or is it like a fuzz tester?

            It is static analysis of source code. It doesn't actually run the code, it scans it for patterns that might be bugs. I like Gimpel Lint [wikipedia.org] better, but it isn't either-or, so you can use both and they will find different bugs. You still need to do dynamic testing with something like Valgrind [wikipedia.org]. Tools are cheap compared to people, so you want to give your developers the best testing tools you can, and put your code through the wringer. We use six different tools for C/C++, and no code is shipped out the door till it passes them all (plus unit, usability, and requirements testing).
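To make "scans it for patterns that might be bugs" concrete, here is a toy sketch using Python's standard `ast` module. It is not how Coverity or Gimpel Lint work internally; the sample source and the single rule (a bare `except:` whose body is only `pass`) are assumptions picked for illustration. The point is only that the code under inspection is parsed, never executed.

```python
import ast
from typing import List

# Hypothetical code under inspection; it is parsed, never run.
SAMPLE = '''
def load(path):
    try:
        return open(path).read()
    except:
        pass

def parse(text):
    try:
        return int(text)
    except ValueError:
        return None
'''


def find_silent_excepts(source: str) -> List[int]:
    """Return line numbers of bare `except:` handlers whose body is only `pass`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.ExceptHandler)
            and node.type is None  # bare `except:` with no exception class
            and all(isinstance(stmt, ast.Pass) for stmt in node.body)
        ):
            hits.append(node.lineno)
    return hits


flagged = find_silent_excepts(SAMPLE)
print(flagged)
```

Only the handler in `load` is flagged. The `except ValueError` in `parse` names an exception and returns a value, so this rule leaves it alone.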

            • I wish I could mod you informative for your response.

            • and no code is shipped out the door till it passes them all

              I quite agree. I won't ship my code until it passes the test tool I use. My test tool is gcc. Once that runs without error, I ship.

  • by caffeinemessiah (918089) on Tuesday September 03, 2013 @05:38PM (#44751131) Journal
    So a private, for-profit company named "Coverity" has released a report that shows that their "Coverity Scan" software finds the fewest vaguely-defined "defects" in a programming language whose community has added the "Coverity platform" product to their development process? I was about to say "excellent marketing" by writing a fluff piece for free Slashdot traffic, but it's really not even excellent marketing.
    • by dkf (304284)

      So a private, for-profit company named "Coverity" has released a report that shows that their "Coverity Scan" software finds the fewest vaguely-defined "defects" in a programming language whose community has added the "Coverity platform" product to their development process?

      Their stuff does work at detecting certain kinds of problem, but it doesn't detect all possible bugs (nor does anything else I've encountered). It's better to say that it's an independent tool that can be used as well as other tools, and they provide free access to quite a few of the larger OSS projects. They surely don't have to; nobody's forcing them. They've also been doing it for years.

      For an example of the sort of thing they find, in a software package I know about their tool recently picked up that th

  • by dwheeler (321049) on Tuesday September 03, 2013 @05:39PM (#44751137) Homepage Journal

    Coverity sells software that does static analysis on source code and looks for patterns that suggest defects. E.G., a code sequence that allocates memory, followed later by something that de-allocates that memory, followed later by something that de-allocates the same memory again (a double-free).
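The double-free pattern can be sketched as a scan over a sequence of operations. This is a deliberately naive illustration, not Coverity's algorithm: the "program" is a hypothetical straight-line list of (operation, pointer name, line) tuples, and a real analyzer has to reason about branches, loops and aliasing, all of which this ignores.

```python
from typing import Dict, List, Tuple

# Hypothetical straight-line "program": (operation, pointer name, source line).
Program = List[Tuple[str, str, int]]

program: Program = [
    ("alloc", "buf", 10),
    ("alloc", "tmp", 11),
    ("free", "buf", 20),
    ("free", "tmp", 21),
    ("free", "buf", 30),  # second free of the same allocation
]


def find_double_frees(ops: Program) -> List[Tuple[str, int, int]]:
    """Return (name, first_free_line, second_free_line) for each double free."""
    freed_at: Dict[str, int] = {}  # pointer name -> line of the free still in effect
    findings = []
    for op, name, line in ops:
        if op == "alloc":
            freed_at.pop(name, None)  # a fresh allocation resets the state
        elif op == "free":
            if name in freed_at:
                findings.append((name, freed_at[name], line))
            else:
                freed_at[name] = line
    return findings


findings = find_double_frees(program)
for name, first, second in findings:
    print(f"{name}: freed at line {first}, freed again at line {second}")
```

Here only `buf` is reported, freed at line 20 and again at line 30.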

    The product is not open source software, but a number of open source software projects use it to scan their software to find defects: https://scan.coverity.com/ [coverity.com] It's a win-win, in the sense that Coverity gets reports from real users using it on real code, as well as press for their product. The open source software projects get reports on potential defects before users have to suffer with them.

    • Re: (Score:3, Interesting)

      by Anonymous Coward

      We've run Coverity on several very large projects where I work. For C++ it did a decent job of finding little and simple things that Visual Studio missed, like variables that were never initialized before use, subtle type violations Visual Studio missed, or accessing past the end of a statically allocated array. These aren't the sorts of bugs that we worry about. The evil bugs - like those created by programmers that don't know enough about multithreading but were assigned because some offshore contracto

      • Re: (Score:2, Informative)

        by Anonymous Coward

        You should try TSAN. See: https://code.google.com/p/thread-sanitizer/

  • by greg1104 (461138) <gsmith@gregsmith.com> on Tuesday September 03, 2013 @05:40PM (#44751147) Homepage

    Coverity's services have been useful to a number of open-source projects. But this article is carefully picking its terms to get a headline worthy result. Compare against the Coverity scan of PostgreSQL [postgresql.org] done in 2005 for example, and CPython's defect rate isn't very exciting at all. But that was "Coverity Prevent" and this is "Coverity Scan"...whatever that means.

  • The title is misleading as hell, again. It appears they are talking about the C code included in the Python compiler/interpreter project, and it is to be compared against other open source software projects, not against other languages. All that it shows is that the Python project developers are eager to fix the problems that this particular verification software finds. If they have fixed all those bugs, then they will have exactly zero known defects. Good for them, but most probably there will remain unknown defects, and
  • by sgt scrub (869860) <saintiumNO@SPAMyahoo.com> on Tuesday September 03, 2013 @05:49PM (#44751225)

    They counted my C++ features as bugs?

  • Numbers like .69 or 1.0 or 0.005 mean nothing if you don't know to what it relates.

    Usually defect counts are based on 1k LOC (one thousand lines of code, and no: a line of code is likely not what you consider a line of code).

    I doubt that 1.0 is an accepted industry standard defect density [...] for good quality software of ...

    1 defect per 1 kLOC is absurdly high; luckily, in the last 20 years I was never on a project with such a high defect rate.

    • 1 defect per kLOC is pretty good. The question is, however, *what* is exactly a defect? It is one thing to define a defect as an error that manifests itself when a piece of code is passed what ought to be a valid input, but we all know that no program will ever be handed any significant subset of all valid input during anyone's lifetime. Even that 1 defect per kLOC may never be triggered because even though the function is defective in terms of not handling all possible inputs from what one would consider t
      • Yeah, what exactly is considered a defect varies.
        In the Personal Software Process by Watts Humphrey(sp), even a line that does not compile is already considered a defect and is added to the defect log.
        Bottom line: everything that comes up in an issue tracker with the aim of fixing it later is a defect.

        In that regard, sleeping defects that are never discovered, because no invalid data ever triggers them, are not defects.

        Regarding 1 error per kLOC. Serious tools count something like this:

        /**
        * @param in, the amount

        • Regarding your Input example: I disagree. Most enterprise systems are very good in rejecting invalid input.

          I think you probably misunderstood what I meant by "invalid inputs". Take the infamous example of the 32-bit version of Java's binary search in a sorted array [blogspot.cz]: the problem was in the overflow of the midpoint computation. While (a+b)/2 looks like a reasonable way to do it, even if a, b, and (a+b)/2 are all within the range of the integer type used, the intermediate sum a+b doesn't have to be. But since few people used multi-GB arrays to even potentially get the <a,b> tuple into an invalid range,
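The overflow described above can be reproduced in a few lines. Python integers do not overflow, so this sketch emulates Java's signed 32-bit `int` with a hypothetical `to_int32` helper; the rewritten midpoint, low + (high - low) / 2, is one commonly cited fix and is shown as an illustration rather than as the exact patch that went into Java.

```python
def to_int32(value: int) -> int:
    """Wrap a Python int to a signed 32-bit value, as Java's int arithmetic would."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


def midpoint_naive(low: int, high: int) -> int:
    """(low + high) / 2, with the intermediate sum wrapped to 32 bits."""
    total = to_int32(low + high)
    # Java's integer division truncates toward zero; int() on the true
    # quotient does the same and is exact at these magnitudes.
    return int(total / 2)


def midpoint_safe(low: int, high: int) -> int:
    """low + (high - low) / 2; the difference cannot overflow when 0 <= low <= high."""
    return to_int32(low + (high - low) // 2)


low, high = 1_500_000_000, 2_000_000_000  # both fit in a signed 32-bit int
print(midpoint_naive(low, high))  # negative: the sum wrapped around
print(midpoint_safe(low, high))   # 1750000000
```

Both indices are valid 32-bit values and so is their true midpoint, yet the naive version yields a negative index because only the intermediate sum leaves the range. For small arrays the two functions agree, which is why the defect stayed dormant for so long.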

  • Hey metric retards (Score:4, Interesting)

    by Sulik (1849922) on Tuesday September 03, 2013 @06:00PM (#44751311)
    While it can be useful in pinpointing common code defects, interpreting Coverity results as an absolute indicator of code quality is just retarded. 90% of Coverity's defects tend to be false positives that would be obvious to even the average code monkey... Not sure that massaging a code base to please Coverity and getting a 'high score' is really any kind of achievement; it may be more an indicator that you have way too much time on your hands...
    • by kermidge (2221646)

      According to their report (take it as you will) false positives as of 2012 were 9.7% of reported defects.

  • This is bullshit, but a great tactical conversion of non-informative data into marketable news by Coverity.

    Coverity uses lexical pattern matching to find bugs based on "tricks" discovered by Dawson Engler and his colleagues at Stanford University in the early 2000s. The tricks (find "malloc" not coupled with "free", cli() not coupled with sti(), dereferences of uninitialized pointers, etc.) were developed in the context of the C language used for operating system code.

    So they used tricks developed for one la

  • I once thought about learning Python. Then I combed Craigslist across the US looking for job opportunities doing Python programming. Relatively few out there by comparison to ASP.NET and Java. Sure, it's less buggy... but what's to motivate anyone to learn something they can't easily find work in?
    • On the other hand, there are also proportionally more Java and .NET programmers, so you'll be competing with fewer people in Python land.

      The right answer, anyway, is to learn all three - and a couple more (C++, in particular).

  • The code is so slow, they have lots of extra time to look for defects.

    • by Z00L00K (682162)

      When you look at analyzing defects - you can find coding defects pretty easily but you can't find design defects where the designer has misunderstood the goal of the product.

      One example of a pretty annoying design mistake is when you run Microsoft software where you can choose to send a document as an attachment from PowerPoint, Excel or Word. However, it will at the same time block all access to other windows in Outlook, preventing you from getting the list of names that you know were present in another message. No

  • ... couldn't find the languages compared? Curious to know how Ada fared and if Python was compared against it.
  • So what they are basically saying is "Don't use our product to scan Python code; it doesn't recognize all the defects".

    I know the truth is possibly somewhere in the middle, but this report just assumes the scanning product works equally well for all languages, which is at least somewhat unlikely.

    Also, what exactly is a defect in this context? Is it a security flaw, a functional error or just something that will crash your software? If the latter is the case, then any language that accepts shitty code and ju

  • This all seems very misleading. It took me quite a while to figure out that it is only talking about the code for the Python interpreter, not all open-source programs written in Python.
