Code Quality: Open Source vs. Proprietary
just_another_sean sends this followup to yesterday's discussion about the quality of open source code compared to proprietary code. Every year, Coverity scans large quantities of code and evaluates it for defects. They've just released their latest report, and the findings were good news for open source. From the article:
"The report details the analysis of 750 million lines of open source software code through the Coverity Scan service and commercial usage of the Coverity Development Testing Platform, the largest sample size that the report has studied to date. A few key points: Open source code quality surpasses proprietary code quality in C/C++ projects. Linux continues to be a benchmark for open source quality. C/C++ developers fixed more high-impact defects. Analysis found that developers contributing to open source Java projects are not fixing as many high-impact defects as developers contributing to open source C/C++ projects."
Not a surprise (Score:4, Insightful)
Re: (Score:1)
You can't fix a problem that you can't see!
Re:Not a surprise (Score:4, Insightful)
bugs, like DRM?
Re: (Score:1)
DRM is not a code bug. It's a human bug.
Re: (Score:2)
Without seeing it?
They patch at random and let evolution take care of the rest?
Re: (Score:1, Informative)
Sure you can. It's called binary patching. It's how people patch bugs in closed-source games.
Yes, it's possible: you just need to ignore the license, use a decompiler, and single-step through the code, fixing and patching. The only problem is that you are fixing *your copy only*. Now you can continue to ignore the license and spread your copy around, or you can make a binary patch and spread that around (ignoring the first instance of license violation by using a decompiler). The only other issue is that you
Re: (Score:1)
Yeah, tell that to the OpenSSL team, it will cheer them up.
Re: (Score:1, Interesting)
Only if there actually are enough people looking at the code. OpenSSL proves that there aren't.
Re: (Score:1)
You're assuming that Heartbleed was a bug and not an undocumented feature requested by a governmental sponsor; even unwashed libertarian programming hippies have to eat.
Re: (Score:1)
Wait, wut?..
So you're claiming that there _were_ enough people looking at the code, but they were all bought out by The Government (which, apparently, reached all developers around the world easily, but missed Google and Codenomicon)?
Re: (Score:3)
Proves nothing of the sort.
It would likely have remained hidden in a closed source equivalent.
Re: (Score:2)
Re: (Score:2)
Only if there actually are enough people looking at the code. OpenSSL proves that there aren't.
It's difficult to get enough eyes on the project when its design is such a mess that people who take a look at it have no idea what they're seeing.
Every major TLS software package available is crap, from what I've seen. OpenSSL only "proves" that with a sufficiently hyped marketing campaign, a bug in one package can ruin its reputation relative to others with similarly bad security issues that did not get the same marketing. In some respects, it could be argued that the recently discovered bugs in GnuTLS
Re: (Score:1, Redundant)
Given enough eyeballs, all bugs are shallow. --- Linux Torvalds
Actually that was Eric Raymond, and it is evident that in fact there never are enough eyeballs (at least ones that can comprehend what they are looking at). The theory is sound but in practice it is not.
Re: (Score:3)
Actually that was Eric Raymond, and it is evident that in fact there never are enough eyeballs (at least ones that can comprehend what they are looking at). The theory is sound but in practice it is not.
It's a fundamental truth that, the more of the system you have to comprehend to truly understand it, the harder it is to debug. Syntax problems? Trivial. Global liveness checking? Much harder. (There's just so many ways to screw up.)
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I just think it's beyond stupid to say "This bug was found, therefore the software's development methodology ensures we never find bugs!" or some such twaddle.
I agree, but nobody said or implied that, just that even if you do have "many eyes" (and the vast majority of projects do not), and even if you assume they are the right ones looking in the right place at the right time, you will likely find at least some bugs at some point in time. So the point is it's hardly a reliable and practical advantage, more of a shot in the dark. In theory it's a great idea but you so rarely have it actually work, in this case - and really, given the profile of the project, this is a
Re:Not a surprise (Score:5, Interesting)
Re: (Score:2)
but given an arbitrary acceptable error, there are usually acceptable sample numbers and sampling strategies.
Well, you need people who can fully understand a particular complex system to find the tough bugs, and you need a lot of them dedicated to it. I would say there are rarely ever enough, except maybe on the Linux kernel, where the critical error rate is pretty low (though critical errors do happen). This is demonstrated by the fact that the key advantage of free/open source software is that it is easier/quicker to fix bugs in it, not that it is necessarily more bug-free than proprietary software in general.
Re:Not a surprise (Score:5, Funny)
You can't even attribute a quote correctly.
Linus was the guy that said "Look what you did to my code! You @#$%&! I'm gonna @#)+-*&$! You. You &$(#*%+."
Re: (Score:2)
"My name is Linux Torvalds..." (Score:2)
"My name is Linux Torvalds... and I pronounce him 'Linus'...".
that's ESR, not Linus (Score:2)
That would be: ... the fix will be obvious to someone.
Given enough eyeballs, all bugs are shallow
Eric S Raymond
Although ESR called it "Linus' Law", it's ESR's writing, from CATB. Linus has a completely different concept that he calls "Linus' Law". Linus talks about motivations for what we do.
Re: (Score:2)
> When a bug does show and people look at the source
That's when the quote comes into play, "given enough eyeballs all bugs are shallow ... the fix will be obvious to someone".
A shallow bug, one where the fix is obvious, is obvious when eyeballs are looking. The question of how many bugs exist is a separate issue.
Also:
> When a bug does show and people look at the source
For most of my contributions to open source, at least three people looked at the code before it was distributed - me, the module mainta
Re: (Score:2, Insightful)
Well - at least it was found...
With closed source, on the other hand..
Well - let's say the bad guys still have great days for a loooong time...
Re: (Score:2)
Managed languages (Score:5, Insightful)
Java project developers participating in the Scan service only fixed 13 percent of the identified resource leaks, whereas participating C/C++ developers fixed 46 percent. This could be caused in part by a false sense of security within the Java programming community, due to protections built into the language, such as garbage collection. However, garbage collection can be unpredictable and cannot address system resources so these projects are at risk.
This is especially amusing in light of all the self-righteous bashing that C was getting over OpenSSL's problems. Seems it's true that using a "safe" language just makes the programmer lazy.
Re: (Score:1)
Resource leaks are hardly as critical as "undefined behavior" (read: buffer overflows and all kinds of other nastiness).
At best a resource leak gets you a DoS.
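To make that concrete, here is a contrived C sketch (not from any particular project; the function name is made up) of the kind of leak being described: the handle is only closed on the success path, so a long-running process eventually exhausts its file descriptors, which is a denial of service rather than memory corruption:

```c
#include <stdio.h>

/* Contrived example: the file is only closed on the success path, so every
 * "empty file" early return leaks one FILE handle and its descriptor.
 * Call this in a long-running loop and the process eventually hits its
 * descriptor limit: a DoS, not memory corruption. */
int count_lines(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int lines = 0;
    int c;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            lines++;
    }

    if (lines == 0)
        return -1;   /* BUG: early return without fclose(f) leaks the handle */

    fclose(f);
    return lines;
}
```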
Re: (Score:1)
You're right. Exploiting the swiss cheese of the JVM is a much better target.
Re: (Score:1)
Swiss cheese security, that is.
The JVM (Score:1)
Guess which language the JVM is mostly written in? Dumbass.
Re: (Score:2)
Guess which language the JVM is mostly written in? Dumbass.
Dumbass? Is that a dialect of Brainfuck [wikipedia.org]?
Re: (Score:3, Insightful)
Resource leak in Java = DoS, as mentioned already
Resource leak in C = Heartbleed.
Personally, I'd rather my application crash than expose my private keys and other data that was supposed to be encrypted.
Re:Managed languages (Score:4, Informative)
Apparently you missed the crypto flaws in Android's Java crypto library from last year that exposed private keys. Apparently writing things in Java guarantees jack and shit.
Re:Managed languages (Score:5, Funny)
Apparently you missed the crypto flaws in Android's Java crypto library from last year that exposed private keys. Apparently writing things in Java guarantees jack and shit.
No, writing things in Java guarantees your shit will be jacked.
Beta Sucks (Score:1)
Java calls C for anything performance-critical, anyway.
Re:Managed languages (Score:5, Interesting)
You mean this one [securitytracker.com], lol?
Solution: The vendor has issued a fix for the Android OpenSSL implementation and has distributed patches to Android Open Handset Alliance (OHA) partners.
Oh, that notorious piece of Java code, OpenSSL!
Re: (Score:3, Informative)
Re: (Score:1)
Remember that most programmers are far from half as good as the Linux coders, and thus should avoid writing code as much as possible.
FTFY
Re:Managed languages (Score:5, Interesting)
I also think that developers using a low-level language are more aware of potential problems than developers using high-level languages. In some sense I think this is also due to the types of programs being developed. C/C++ today is commonly used for embedded systems, operating systems, runtime libraries, compilers, security facilities, and so forth. So it's systems programmers versus application programmers versus apps programmers. The systems programmers are forced to take a close look at the code and must be mindful of how the code affects the system. I think that if such a comparison had been done back in the 80s, the numbers would be different, because many more application programmers were using C/C++.
I.e., an interview for a systems programmer: do you know about priority inversion, do you understand how the hardware works, do you know the proper byte order to use, what does the stack look like, etc.
An interview for the modern applications programmer: have you memorized the framework and library facilities?
Re: (Score:3)
I've never been in an interview where they asked for memorized framework and library facilities. As a web developer, I get questions about data normalization, graph theory, and complicated SQL JOINs.
Re: (Score:1)
Re: (Score:2)
The problem with low level languages isn't anything technical about the languages.
It is about a common attitude among programmers.
As kids, we learn things by taking steps up.
We walk/run, then we ride a bike, then we can drive a car. It is a simplistic way of viewing things: one is better than the other, and you need to be better to use the better method.
The same idea goes with programming languages. (I'll show my age here.)
You code in Basic, then you go to Pascal, then you can do C, finally you will be ab
Re: (Score:1)
Another explanation is that the leaks left in managed programs are likely to be the harder ones to fix, because a silly programming error (oops, I didn't free this pointer) isn't a source of leaks.
Or we could moralize it and pretend managed developers are lazy.
Not a surprise, but no reflection of O/S vs Prop. (Score:5, Insightful)
First, we shouldn't confuse Coverity's numerical measurements with actual code quality, which is a much more nuanced property.
Second, this report can't compare open source to proprietary code, even on the narrow measure of Coverity defect counts. In the open source group, the cost of the tool is zero (skewing the sample versus the commercial world) and Coverity reserved the right to reveal data. Would commercial customers behave differently if they were told Coverity might reveal to the world their Coverity-alleged-defect data?
Again, having good Coverity numbers can't be presumed to be causally related to quality. For example, Coverity failed to detect the "heartbleed" bug, demonstrating that the effect of bugs on quality is very nonlinear. 10 bugs is not always worse than 1 bug; it depends on what that one bug is.
Re: (Score:2)
First, we shouldn't confuse Coverity's numerical measurements with actual code quality, which is a much more nuanced property.
Yeah, but good quality might well correspond to some sort of measurable quantity anyway. Provided you've got the right measure. Maybe some sort of measure of the degree of interconnectedness of the code? The more things are isolated from each other, across lots of levels (in a fractal dimension sense, perhaps) the better things are likely to be.
Maybe that would only apply to a larger project, and I'm not sure what effect system libraries (and other externals) would have. Yet the fact that it might be a scale-invaria
Re: (Score:2)
The more things are isolated from each other, across lots of levels (in a fractal dimension sense, perhaps) the better things are likely to be.
Language has a lot to do with that.
If your project is written in a managed language, allocated memory is always initialised first, there is no pointer arithmetic, and array bounds are always checked, so it's impossible to read random data from memory.
If your project is written in C, all code has access to all memory.
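As a rough illustration of that difference (a made-up snippet, not from any real project), C will happily compile and run an out-of-bounds read; whether it crashes or quietly returns adjacent memory is undefined behavior rather than a guaranteed exception:

```c
#include <stdio.h>

int main(void)
{
    char secret[] = "hunter2";   /* may happen to sit near buf on the stack */
    char buf[8]   = "public";

    /* No bounds check: indexing past the end of buf is undefined behavior
     * in C. A managed runtime would throw an exception here instead. */
    for (int i = 0; i < 16; i++)
        putchar(buf[i]);         /* may print bytes of "secret", or crash */
    putchar('\n');

    (void)secret;
    return 0;
}
```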
Beta Sucks (Score:1)
"If your project is written in a managed language, allocated memory is always initialised first, there is no pointers arithmetic and array bounds are always checked, so it's impossible to read random data from memory."
Except when you forgot to remove some reference to an object, so it's still stitting around in a list somewhere because it can't be garbage-collected, and some code then uses whatever objects happen to be in that list.
No language is safe for an unthinking programmer to use.
Re:Not a surprise, but no reflection of O/S vs Pro (Score:5, Insightful)
are you sure about that?
That's valid C#; all you need to do is inject something like that into the codebase and let the JIT compile it (using all the lovely features they added to support dynamic code), and you're good to get all the memory you like.
Now I know the CLR will not let you do this so easily, but there's always a security vulnerability lying around waiting to be discovered that will, or an unpatched system that already has such a bug somewhere in the .NET framework; for example, this one [cisecurity.org] exploits a "buffer allocation vulnerability" and is present in Silverlight. [mitre.org]
The moral is ... don't think C programs are somehow insecure and managed languages are perfectly safe.
Unsafe code (Score:3)
Yes, so your argument is that you can, with great difficulty, cause a possible security issue in C#, but in order to do so, you have to basically say: I'm about to do something possibly bad, please don't check to make sure what I'm doing is bad. Then modify the compiler options from the default to allow said code to be compiled, then put it into a fully trusted assembly so it bypasses all security checks, and THEN you might have an issue.
And this is in comparison to C/C++, where you can write an exploit in 2 li
Re: (Score:1)
A managed language would not have protected against Heartbleed, because the program maintained its own freelist to prevent memory from being deallocated. If it did not do this then being written in a managed language would have prevented Heartbleed - but then again, if it did not do this then the C code wouldn't have been vulnerable either.
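A rough sketch of what such a caching freelist looks like (made-up code and names, not OpenSSL's actual implementation): recycled buffers are handed back still containing whatever the previous user left in them, so whatever protections the underlying allocator or runtime might offer never get a chance to apply:

```c
#include <stddef.h>
#include <stdlib.h>

#define BUF_SIZE 4096

struct buf {
    struct buf *next;
    unsigned char data[BUF_SIZE];
};

static struct buf *freelist;   /* recycled buffers; contents left intact */

unsigned char *buf_alloc(void)
{
    struct buf *b = freelist;
    if (b) {
        freelist = b->next;    /* reuse: old contents are still in b->data */
    } else {
        b = malloc(sizeof(*b));
        if (!b)
            return NULL;
    }
    return b->data;
}

void buf_free(unsigned char *p)
{
    /* Put the buffer back on the list instead of calling free(), so the
     * system allocator never reclaims, poisons, or unmaps it. */
    struct buf *b = (struct buf *)(p - offsetof(struct buf, data));
    b->next = freelist;
    freelist = b;
}
```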
Re: (Score:2)
Another problem with the comparison is that the average closed-source project is four times as big as the average open-source project. I'd expect defect density to go up with the size of the codebase. (Of course, this may not be an issue with what Coverity detects, but if so, that emphasizes that Coverity doesn't find all the important defects.)
Code Quality Sucks on Either (Score:1, Troll)
Yeah, I have seen the source code to the Windows 7 OS, CISCO's iOS and LINUX of course.
They all suck equally.
However, that being said, I am currenrlty running a version of the LINUX OS I built and modified for my customers use in a PostGRES server which is quite large.
Open Source wins again because I can correct the suck. :-)
Re: (Score:1)
I am currenrlty running a version of the LINUX OS I built and modified for my customers use in a PostGRES server which is quite large.
Assuming you mean Linux, that's a kernel and not an OS. Are you saying you run a custom kernel, or are you actually insane enough to run LFS on a server? If the latter, WHY?
Did you write your improvements in C+ ? (Score:5, Funny)
Your four-sentence comment has five glaring errors that make it obvious that you have absolutely no idea what you're talking about. You very much remind me of the job applicant who told me he has experience in C, C+, and C++.
Re: (Score:1)
good thought. This was spoken face-to-face (Score:2)
That's an interesting thought. Had it been typed, it might be a typo. I was thinking of a guy who said that, out loud, face-to-face. That's not the only comment that made it clear he was claiming four times as much as he in fact knew.
Of course, in an interview I give someone leeway - my mind went blank once in an interview when I was asked "what are the four pillars of object oriented programming?". At the time, I could have implemented objects in C using the preprocessor*, but interview stress caused a brai
Re: (Score:1)
Re: (Score:2)
Your four-sentence comment has five glaring errors that make it obvious that you have absolutely no idea what you're talking about. You very much remind me of the job applicant who told me he has experience in C, C+, and C++.
Well, since that time, I also learned C+++.
Re: (Score:2)
First of all, built means compiled and modified means I switched off u32 support as well as targeting a Xeon processor class.
I did not need to write code, the kernel had to be rebuilt and the binary replaced/modified for the target processor and memory architecture.
You use the .config for that and rebuild the kernel tree. You don't need to write code.
SO! The included Redhat kernel was way too generic for the application performance required.
Among other things. But the point is, you can't do that with an OS
Mod parent up (Score:1)
How can software that has any bugs be considered good quality??? I guess that if guns are legal in your country, then buggy software may be too.
Re: (Score:3)
"none, because someone might find out I've made the code worse, not better"
Polishing old code or writing good code (Score:5, Interesting)
The report doesn't really go into an important measure.
What is the defect density of the new code that is being added to these projects?
Large projects and old projects in particular will demonstrate good scores in polishing - cleaning out old defects that are present. The new code that is being injected into the project is really where we should be looking... Coverity has the capability to do this, but it doesn't seem to be reported.
Next year it would be very interesting to see the "New code defect density" as a separate metric - currently it is "all code defect density" which may not reflect if Open Source is *producing* better code. The report shows that the collection of *existing* code is getting better each year.
Re: (Score:2)
This is exactly what I would expect. Odds are that open source and closed source software start out with similar defect densities. The difference is that open source software, over time, is available for more people to
Re:Polishing old code or writing good code (Score:4, Interesting)
Actually it's the reverse. Fengguang Wu does automatic defect reporting so new bugs get found and reported within a week. We've had great success with this.
But if we delay then here is the timeline:
- original dev moves to a new project and stops responding to bug reports (2 months).
- hardware company stops selling the device and doesn't respond.
- original dev gets a new job and his email is dead (2 years).
- driver is still present in the kernel and everyone can see the bug but no one knows the correct fix.
- driver is still in the kernel but everyone assumes that all the hardware died a long time ago. (Everything in drivers/scsi/ falls under this category. Otherwise it takes 8 years).
Each step of the way decreases your chance of getting a bug fix. I am posting anonymously because I have too much information about this and don't want to be official. :)
Ah, Coverity (Score:5, Insightful)
Coverity: Hey you, proprietary software developer with the deep pockets. Yeah, you. We've got this great tool for finding software defects. You should buy it.
Proprietary software developer: get lost.
Coverity: Hey, open source dudes, we've got this great defect scanner. Want to use it? Free of course!
Open source dudes: Meh, why not?
Coverity: Hey proprietary software developer, did we mention those dirty hippie neck beards are beating the stuffing out of you in defect (that we detect)-free code?
PSD: Fine, how much?
Useless analogy (Score:5, Interesting)
This is a useless analogy. Code quality is a function of both skill and the stewardship of the team supporting the code. Tools help as well, but you can write elegant, high-quality code regardless of the language chosen. You can also write some real shit, but ultimately how many defects a piece of software has comes down to the design and testing that go along with it. Some bodies of work get rigorous testing, and OpenSSL's recent problem wasn't about deficient design; it was about a faulty implementation. Faulty implementations in logic happen all the time, and some bugs just take a while to become known. Even test-driven development and code-analysis tools probably couldn't have found this particular issue, but considering how long it was in the code base without somebody questioning it, that goes back to stewardship not only by the team but by the rest of the world using the code. If anything, this situation points out that FOSS can have vulnerabilities just like proprietary software; however, the advantage is that with FOSS you can get it fixed much more quickly, and because other people can see the implementation it can be scrutinized by folks outside the team that develops and maintains it.
In the case of Heartbleed the system worked. A problem was found, it was fixed, and it's now just a matter of rolling out the fix and putting regression tests in place to help ensure it doesn't happen again. The upshot is that another gaping hole in our privacy was closed and that "bad guys" may have stolen data, so roll out the fix ASAP. Your guess is as good as mine as to what was stolen; it's a matter of research and conjecture at this point, and I doubt the bad guys will tell us what they gained by exploiting it. Let's also remember that until the systems with the bug are patched they're still vulnerable, so cleanup on aisle 5.
To be honest, it's a bit naive to assume that FOSS software that handles security doesn't have potential vulnerabilities. Likewise, it's naive to assume that proprietary code has it licked, given the revelations of NSA spying over the past year. Given that there are numerous nefarious companies that sell vulnerabilities to anybody who can pay for them, unless you're buying them you probably will never know what is exposed until somebody trips over it. What this means is that only when those vulnerability-selling companies are out of business can you assume your software is free of the easier-to-exploit vulnerabilities; governments will always use all their tools to get intelligence, including subverting standards and paying off companies that can give them access to what they want.
What's up with Dice Developers (Score:1)
Re: (Score:2)
It's already open sourced, and that is what the soylent news guys did, but the community didn't follow.
Re: (Score:2)
Re: (Score:1)
It's already open sourced, and that is what the soylent news guys did, but the community didn't follow.
Yes, SlashCode is open source, but the latest public release is 5 years old and not at all what's running on slashdot now.
It would be very nice if Dice would release a newer version of the code, not only for SoylentNews [soylentnews.org], but also for the Japanese slashdot.jp [slashdot.jp] and the Spanish barrapunto.com [barrapunto.com], both of which are still using the old version.
FreeBSD looks just as good as Linux (Score:3)
with nearly 2x the LOC.
Re: (Score:2)
What if.... (Score:1)
Code repositories were compromised by the NSA (or other capable group)
proprietary: VBScript BEST EVER (Score:1)
Just kidding
Did they test OpenSSL? (Score:3)
With all the noise about OpenSSL lately, running this Coverity test on it (and other security software like GNUTLS) and sharing the results seems like it would be a good thing...
Re:Did they test OpenSSL? (Score:4, Informative)
They did examine heartbleed.
http://ericlippert.com/2014/04... [ericlippert.com]
Quality of people process (Score:2)
If you have good quality people, especially a good leader, your code will be good.
Even if the people are relatively inexperienced.
At this point, just about everything in IT/CS is a research project, not innovation.
So it's a matter of diligently doing the work based on past archetypes.
In My experience ... (Score:1)
There is no comparison (Score:3)
Re: (Score:2)
Re: (Score:2)
Yes, how could you ever compare two groups that might have a significant amount of overlap? It can't be done! There's no branch of mathematics that would allow us to do such a thing! It's impossible.
The only problem (Score:1)
Put your suit on for a meeting or sweatpants at... (Score:2)
...home?
Most people will put more effort into something that will be public (both out of positive motivation and the negative motivation of shaming.)
Open Source will always, in general, be better than closed source. Again - in general. There are people who will engineer things properly irrespective of whether or not someone will be browsing their github account or checking it out of the company's private server... Too bad there's not more of them ;).
Of Course Open Source is Better (Score:1)
Broken Metric (Score:2)
This is the same broken metric that Coverity has been mis-using year after year.
"Defect density (defects per 1,000 lines of software code) is a commonly used measurement for software quality, and a defect density of 1.0 is considered the accepted industry standard for good quality software."
In other words, if you double the size of the code base by adding no-op code, your measured quality improves: 100 defects in 100,000 lines is a density of 1.0, but pad the same code out to 200,000 lines and the density drops to a "better" 0.5.
Likewise, if you leave the bugs in but shrink the code base, your measured quality gets worse.
Re: (Score:1)
Crap code remains crap code.
People have to be able to provide updates... and feel appreciated by doing so. BSD licenses allow companies to "appropriate" code, and then sue the original author for copyright violations...
Re: (Score:2)
> People have to be able to provide updates... and feel appreciated by doing so. BSD licenses allow companies to "appropriate" code,
Very true
>and then sue the original author for copyright violations...
Say what!?!?! Source code typically must be published before being appropriated, making it trivial to prove who had the prior claim.
Of course that doesn't stop a lawsuit from being *filed*, but then nothing stops me from filing a lawsuit against you for stealing my pink unicorn either.
Patent violation
Re: (Score:1)
Re: (Score:1)
Most static analysis tools look for bugs and potentially buggy behavior. They must rely on limited pattern matching and data flow analysis. They can't find all bugs. See: The Halting Problem.
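For instance, here is a contrived C sketch (not from any real codebase) of the kind of defect that is hard for purely local pattern matching: whether the overflow actually happens depends on a value known only at runtime, so the tool has to reason about reachability rather than just spot a suspicious call:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Contrived example: the bad write only occurs for certain runtime inputs,
 * the kind of reachability question pattern matching and local data-flow
 * analysis cannot always settle. */
int main(int argc, char **argv)
{
    char buf[8];
    size_t n = (argc > 1) ? strtoul(argv[1], NULL, 10) : 0;

    if (n < sizeof(buf) * 2)          /* BUG: bound should be sizeof(buf) */
        memset(buf, 'A', n);          /* overflows buf when 8 <= n < 16 */

    printf("filled %zu bytes\n", n);
    return 0;
}
```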
Re:heartbleed (Score:5, Interesting)
I'm not going to yell about the openSSL guys.
I'm going to be honest here: they deserve yelling at, and I'm an open source fan. The error they made is exactly the same mistake that everyone else has made in years past when dealing with SSL: x509 and the SSL protocol demand [lengthofstring][string], "pascal" style. This is how everyone (open and closed source) got hit with that domain validation bug where the certificate said "(26)bank.com\0.blahblahblah.com". Certificate signers looked at the domain on the end of the string "blahblahblah.com" and validated it. Client programs treated it like a C string and thought it was a certificate for "bank.com". Not a single person anywhere said "whoa there, null bytes are not part of a valid hostname!"
The attack asks the server to respond with "(65535)Hello" and the server replies with 65535 bytes of data. Falling for this attack is exactly like the guy who points and laughs at the person who just fell off their bike, seconds before falling off their own bike. They should have known better, especially with how high-profile these attacks were in the past.
The bit about writing their own malloc implementation, poorly, was just icing on the cake.
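To make the pattern concrete, here is a minimal sketch of that class of bug (hypothetical names; this is not OpenSSL's actual code): the handler echoes back however many bytes the peer's length field claims, instead of checking the claim against what was actually received:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heartbeat-style handler. "msg" is what the peer actually
 * sent ("received" bytes): a 2-byte length field chosen by the peer,
 * followed by the payload to echo back. */
unsigned char *echo_payload(const unsigned char *msg, size_t received)
{
    if (received < 2)
        return NULL;

    uint16_t claimed = (uint16_t)((msg[0] << 8) | msg[1]);  /* peer-controlled */

    /* The FIX belongs here: reject requests whose claimed length exceeds
     * what was actually received, e.g.
     *   if ((size_t)claimed + 2 > received) return NULL;               */

    unsigned char *reply = malloc(claimed);
    if (!reply)
        return NULL;

    /* BUG: trusts "claimed" and reads past the real payload, copying up to
     * ~64KB of whatever sits after "msg" in memory into the reply that is
     * sent back to the peer. */
    memcpy(reply, msg + 2, claimed);
    return reply;
}
```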
Re:heartbleed (Score:5, Informative)
Disclaimer, I work for Coverity. There's a write-up on why Coverity didn't find it out of the box here:
http://security.coverity.com/b... [coverity.com]
Re: (Score:2)
But that's written in C, and C is the worst programming language ever!! How dare they be so dumb as to not write it in Python or some other "memory safe" language!
Re: (Score:1)
It's like letting someone else make the bricks, shingles, cement, planks, steel beams, while you stick to building stuff out of them, instead of making your own from scratch.
On a related note, perhaps you should let the smart ones do the commenting instead.
Re: (Score:3, Interesting)
I would expect "open source" code to be of approximately equal quality to proprietary code. In each ideology you will get people who care (about quality) and people who don't, in approximately equal proportions; the same with skill, ingenuity and passion for the work.
The difference is that proprietary software is constrained by the number of developers able to view and work on the code. An open source project may have a similar number, or smaller set of core developers, but a much larger pool of developers that can spot problems, suggest alternatives, fix the one bug that is affecting them, etc. Having a more diverse set of developers will increase the chances that the software improves.
You could also make an argument about the motivations of the developers. Open source
Re: (Score:2)
Re: (Score:1)
Riiiight.......I definitely feel more elite taking my own garbage out rather than having someone else do it for me.