Software Code Quality Of Apache Analyzed
fruey writes "Following Reasoning's February analysis of the Linux TCP/IP stack (putting it ahead of many commercial implementations for its low error density), they recently pitted Apache 2.1 source code against commercial web server offerings, although they don't say which. Apparently, Apache is close, but no cigar..."
Code defects appear to be a small part of the equa (Score:5, Insightful)
But here's the kicker: the vast majority runs Apache on either BSD or Linux. All of this code, from the kernel to the library that tells Apache how to use PHP, is open source. Every hacker on the planet has full access to the code - which means that they can review it and find vulnerabilities in it. Not many people have access to Windows or IIS code. So why do IIS and Windows come out as far less secure, and why are they exploited so much more?
I think the answer lies in the severity of the code defects, and the architecture and design of the operating system that powers the web server. And yes, I know that Apache can run on Windows.
Wait a second (Score:4, Insightful)
2.1 ? (Score:4, Insightful)
What does Reasoning do? (Score:5, Insightful)
This is probably a publicity stunt for them, although a good one. I think it would be a good idea for them to sell their product as a software suite if they don't already.
FACT: 3 is a larger number than 2 (Score:5, Insightful)
As far as I can see, this article says 'We have two arbitrary numbers, and one is bigger than the other. From this we deduce that Apache is not as good as commercial software.'
Apache 2.1...? (Score:5, Insightful)
Either way, to have only 31 errors in close to 60,000 lines of code is impressive!
"Defect Density"? (Score:5, Insightful)
Since LOC is a poor metric, a "defect density" measurement based on that will be just as poor.
Yes, I know there's not much else to go on, but something along the lines of putting the program through its paces, stress testing, load testing, etc. would be a much better measurement than a metric based on LOC.
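For what it's worth, the headline figure is easy to reproduce from the numbers quoted elsewhere in this thread (31 defects in roughly 59K lines, both approximate):

$\text{defect density} \approx \frac{31}{59{,}000} \times 1000 \approx 0.53 \text{ defects per KLOC}$

which is the 0.53 average cited below, and shows how completely the result hinges on the LOC denominator.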
more to it than # flaws-per-unit-"whatever" (Score:5, Insightful)
What bothers me about these articles is that there is more to software quality than the # of flaws-per-unit-"whatever".
Like design.
It seems to me most of the problems with Apache's main competitor in terms of software quality are the result of design and engineering choices made by MS's IIS development team.
In other words, it does exactly what they designed it to do, but what they designed it to do was a very bad idea.
No cigar, my ass. (Score:5, Insightful)
The problems with this are:
Re:Apache 2.1...? (Score:3, Insightful)
This is an ad for their software (Score:2, Insightful)
I am trying to get the main analysis downloaded now, but they must not have been prepared for a slashdot posting.
code defects? (Score:1, Insightful)
Automated tools can check source code for common programming errors, but how can such a system ever find semantic errors, such as complicated protocol handling issues?
It seems to me that those just happen to be strong points of open source software.
Re:Code defects appear to be a small part of the e (Score:1, Insightful)
Re:FACT: 3 is a larger number than 2 (Score:3, Insightful)
what is a "software error"? (Score:5, Insightful)
First, are all of IIS's issues "software errors" per se? I'm wondering if all security problems would have been caught, or if that was really the goal of the analysis. Perhaps it was, but I'm not sure. One could contend that IIS has a lot of things unprotected, but that this doesn't constitute a software error.
And as you say, severity would be another issue. It's always been typical open-source style to get the mission-critical parts hardened against nuclear attack while leaving the other bits a tad soft. I wouldn't be surprised to learn that was the case with Apache.
One thing I want to know - did MS (or whoever) give these guys source or were they analyzing the binaries?
Dubious (Score:5, Insightful)
If the company has developed proprietary tools to enable them to identify defects in medium-sized software projects, which of the following business models do you think is more effective:
1. Design proprietary tools to identify defects in medium-sized software projects.
2. Fix defects
3. Profit
or
1. Design proprietary tools to identify defects in medium-sized software projects.
2. Sit around mumbling about defects, Open Source software, closed source software and why farting in the bath smells worse
3. ???
4. Profit
Secondly, where on earth did they get hold of a closed-source, enterprise-level (which Apache undoubtedly is) web server codebase?
"Hi, is that BEA? Do you mind if we take a copy of your entire code base so that we can peer review it against Apache's? What's that? Yes, Apache might come out on top, and we will make the results public..."
How do they define a defect anyway? A memory leak? A missing overflow check? A tab instead of 4 spaces?
It just sounds like bullshit to me...
Different standards? (Score:5, Insightful)
automatically detected defects exclude security (Score:5, Insightful)
So the error level in pre-release Apache ... (Score:5, Insightful)
Bad Statistics... (Score:5, Insightful)
My general rule is that if someone is quoting statistics to you, they are lying. At least on average.
Actually the article suggests apache is better (Score:5, Insightful)
Wrong Math (Score:5, Insightful)
The longer your lines and the more content you have per line, the higher the likelihood of errors per line.
As an example: with one error in 100 lines you get a 1% error rate. Imagine you could do the whole thing in one line. Now you have a 100% error rate.
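In symbols, with a single error:

$\text{error rate} = \frac{\text{errors}}{\text{LOC}}: \qquad \frac{1}{100} = 1\% \quad \text{vs.} \quad \frac{1}{1} = 100\%$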
Re:So if they found them... (Score:4, Insightful)
I'm just glad I'm not the poor coder who has to go through the code to find and fix these few "errors."
Don't assume IIS (Score:5, Insightful)
It could also be Zeus, SunOne or one of the other lesser known web servers out there.
Apache 2 is not Apache 1 (Score:3, Insightful)
The test may be more interesting if applied to Apache 1. As someone who has had to migrate a mod_perl site from Apache 1 to Apache 2, I can tell you that Apache 2 is a very new beast, and it doesn't shock me at all that there are dozens of bugs that still need to be shaken out. Fewer users are running Apache 2 in a production environment as well, since it's considered a development branch. See: the "fewer eyeballs" rule.
Defects and maturity of code base (Score:5, Insightful)
I think the next step for these folks would be to take a project that has a long history, say Apache 1.x, and show defect rates over the life of the project.
Having read the reports.. (Score:5, Insightful)
Their automated checker also searched for out-of-bounds array accesses, memory leaks, and bad deallocations. It found none.
They also state that they ran the same checks against other codebases, and found that they did marginally better, on average.
In short, this report says that OLD development code for an unreleased open source project is nearly as good as current commercial offerings. That's at best, when you consider the huge gamut of possible defects that this checker won't pick up. That margin probably disappears in the +/- of the sampling if you were to do a proper statistical analysis.
The report is fairly useless. It certainly should not be taken as a reason not to trust Apache; to do so would be foolhardy, particularly given Apache's track record.
Oh, and Reasoning's webserver is being pounded into the ground. You can get my local copy of the reports from here [ic.ac.uk].
Re:Code defects appear to be a small part of the e (Score:5, Insightful)
The majority of the security holes come from the people setting up the web servers. The holes are usually abused by "wanna-be" hackers, or script kiddies. The problem is that people are not educated enough to run some of these programs. Being able to understand Apache, and how to make it operate correctly, is not everyone's top priority. As long as it works, people don't care how it works (as goes for many other things in this world).
It's all in how you calculate a defect (Score:4, Insightful)
Apache is just a webserver, and that's all. PHP, JSP, etc, are all separate applications treated separately. The integration does make things more efficient, yes, but also more prone to problems.
Something is wrong here... (Score:3, Insightful)
There is no magic "defect detector" for software. If there were such a thing, they would be making a helluva lot more money than they get for doing little defect tests.
It is very difficult to prove a program correct, and there are a lot of REALLY smart people who have tried.
Maybe these people have tools that can look for buffer overflows and the like, but actually being able to tell if Apache is returning the correct results requires far more than generic tests.
And I'll all but guarantee they didn't get together an entire development team to understand the code base and how it works, as Apache is a very large and complex code base.
Maybe they take what they find from their generic tests and extrapolate that if they find more generic problems there are probably more specialized errors as well, but they make it very clear in the report that the difference between the two is marginal.
Anyways, I'm not saying the entire thing is worthless, just not to read too much into it -- either this one that puts Apache slightly behind some unnamed commercial implementation, or the one that put the Linux TCP/IP stack ahead of some other commercial implementation (though I'd say it would probably be easier to test a TCP/IP stack for correct behaviour than a web server).
Re:Magic software (Score:2, Insightful)
"Fixing" requires understanding the code's intent.
Lies, damned lies, and statistics (Score:5, Insightful)
1) Apache 2.1 has more bugs than some unknown commercial competitor. If the version is correct, a development (not-ready-for-release) build was pitted against a released commercial build. Not a fair playing field.
2) Reasoning does not detail the severity or kind of the bugs. Certainly, a web server not being able to handle a type of format (pdf, csv, ogg vorbis) is less severe than a security hole. Pitted against IIS, I would trust Apache even if it had more bugs, because historically it has had fewer security patches. Check out Apache 2.0's known patches [apacheweek.com] vs IIS 5.0's [microsoft.com].
Re:So if they found them... (Score:5, Insightful)
I mean, yeah, it would be nice if code would explicitly check for a NULL before dereferencing, but if there's no earthly way for the pointer to actually BE a NULL pointer at that time (barring memory corruption -- in which case all bets are off and your code is doomed anyway) then I wouldn't call those errors.
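For instance, here's a minimal sketch (all names invented) of the sort of thing that gets flagged:

#include <stdio.h>
#include <string.h>

struct request { char *uri; };

/* Invented example: every request built in this program carries a
 * non-NULL uri, so the dereference below cannot fault - but a checker
 * that doesn't know the invariant flags the missing NULL test anyway. */
static size_t uri_len(const struct request *req)
{
    return strlen(req->uri);   /* "defect": no NULL check before use */
}

int main(void)
{
    struct request req = { "/index.html" };
    printf("%zu\n", uri_len(&req));
    return 0;
}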
This whole exercise seems very suspect to me.
Re:every program. (Score:2, Insightful)
Re:Apache 2.1...? (Score:3, Insightful)
Agreed. It would be interesting to know whether this low LOC is accomplished through good architecture that emphasizes simplicity and maintainability or "clever" hacks that compress a 10-line loop down into a three-line abomination of pointer arithmetic. I genuinely hope it is not the latter.
Regardless, 59K lines is a small enough program that, given a good architecture, it can be studied and debugged relatively easily by one or two people. I'd guess that this is why Apache is known for its low number of exploits in spite of its enormous web server market share.
Re:So the error level in pre-release Apache ... (Score:5, Insightful)
That, too, but I'm damn certain that they must have tried it on a recent stable 2.0.46ish release as well. The question is, why weren't those results made public?
I'm guessing it's because the results would've placed their "defect detection sw" in a bad light. I.e., nothing as fancy as the aforementioned "use of uninitialized variable" and "dereference of a NULL pointer" (which strikes me as really odd in the first place).
Naturally the other explanation is endorsement. It would be so much not-the-first-time that I don't even bother... but I wouldn't bet that this is the case here, because the defect counts were only compared to production release code averages (which strikes me as the other extremely dubious part of this whole "experiment").
Re:Code defects appear to be a small part of the e (Score:3, Insightful)
Do you know how long it takes to read someone else's code on something like an Apache-level webserver and understand it to the point where you can make useful changes and fixes? The big lie of the "all bugs are shallow" argument is that such a thing is simple, when in fact it is not.
Fixing a non-obvious bug in a 100k or so line C or C++ project is hard enough when you wrote the code yourself. If someone else wrote the code, it is harder still.
RTFAdvertising (Score:4, Insightful)
Heck, forget confidence - YOU CAN JUST CHECK.
The fact that Reasoning didn't have to go and get permission from Apache to run this test - coupled with the fact that we don't even know what Apache is being compared to - is the *real* point behind this "article".
ps: IANAL but don't they have to include a copy of the Apache License given that they publish fragments of the source code in their defect report?
Defect is too strong a word... (Score:5, Insightful)
I suspect the following code will be flagged as a defect:
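Something along these lines, where doOrDie() is a hypothetical allocate-or-abort helper:

#include <stdio.h>
#include <stdlib.h>

/* doOrDie() is hypothetical: it allocates or aborts, so by contract
 * it never returns NULL. */
static void *doOrDie(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    return p;
}

int main(void)
{
    char *buf = doOrDie(64);
    buf[0] = '\0';   /* flagged: dereference with no NULL check in sight */
    free(buf);
    return 0;
}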
as long as doOrDie() does its job and never returns a NULL, then where's the defect? The guys who wrote this tester seem to want you to check any pointer dereference against NULL before use. I might be doing this in my doOrDie() function; I don't want to have to do it twice.
OSS Standards (Score:2, Insightful)
When Open Source software is about the same quality as closed source, the developers consider it unstable and warn people that they may run into problems.
It shows a big difference, to me, in the quality standards that OSS developers (and users) expect.
Null pointers and uninitialized variables (Score:3, Insightful)
As a rather "stupid" example, I had to initialize a Map to an empty HashMap just last week to get Sun's Java compiler to accept my code, although the only two references to the Map were within two if-blocks, within the same function, both of which depended on the same boolean value, which wasn't changed in the whole function.
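The same thing is easy to reproduce in C - a sketch (demo() is made up; gcc's -Wmaybe-uninitialized can raise the analogous complaint):

#include <stdio.h>

/* 'result' is written and read only under the same unchanging
 * condition, but the compiler cannot always prove that. */
static void demo(int flag)
{
    int result;
    if (flag)
        result = 42;
    if (flag)
        printf("%d\n", result);  /* guarded by the same boolean */
}

int main(void)
{
    demo(1);
    return 0;
}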
There's a difference between a defect and a bug. Tools that help in finding problems are great, but after all, they can only point out possibly unsafe spots. Of course it's good to write code that doesn't trigger any such possibilities in the first place.
Coding errors & program logic errors (Score:3, Insightful)
The most worrying errors in programs are generally not coding errors, as those are either terminal (i.e., they crash) or benign (the error may cause memory corruption in a place where it does no harm). Of course, there are exceptions such as buffer overflows, but I'd class those, in general, into the logic error category.
Logic or algorithmic errors are far more dangerous, as they can be well hidden and are more likely to make the code do things unintended. The code itself may be perfect, but if the algorithm is faulty then there's a major problem.
Re:So if they found them... (Score:5, Insightful)
Note that current_provider is set to conf->providers on line 257. The loop starts and neither current_provider nor conf->providers changes. Then on line 287 there's a conditional break if conf->providers is NULL.
If current_provider is going to be NULL at line 291, then conf->providers must be as well, so the conditional break will happen and the NULL dereference will be skipped.
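In a sketch (not the actual Apache source; the comments mark the report's line numbers):

#include <stdio.h>

struct provider { const char *name; };
struct conf { struct provider *providers; };

/* If current_provider could be NULL at the dereference, then
 * conf->providers must be NULL too, so the break has already fired. */
static void walk(struct conf *conf)
{
    struct provider *current_provider = conf->providers;  /* "line 257" */
    for (;;) {
        if (conf->providers == NULL)                      /* "line 287" */
            break;
        printf("%s\n", current_provider->name);           /* "line 291" */
        break;  /* the real loop does more work before looping */
    }
}

int main(void)
{
    struct conf empty = { NULL };
    walk(&empty);   /* takes the break; never dereferences */
    return 0;
}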
Or am I missing something else?
Re:FACT: 3 is a larger number than 2 (Score:3, Insightful)
Suppose I had 100K lines of code with 100 defects. After reviewing my code I discovered that I could refactor it to 80K lines and suppose further that doing so had no effect on the defect count. Defects per line of code would look worse after an improvement.
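Concretely:

$\frac{100 \text{ defects}}{100 \text{ KLOC}} = 1.0 \text{ per KLOC} \quad \longrightarrow \quad \frac{100 \text{ defects}}{80 \text{ KLOC}} = 1.25 \text{ per KLOC}$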
Also, given that this is an automated program, I have to ask how they calibrate and validate its results. How many of the 32 errors found actually aren't errors? How many existing known bugs were not found by this program? I really can't accept these results as anything more than fluff with numbers.
Re:So if they found them... (Score:2, Insightful)
After reading the review I came away with the impression that the reviewers were trying to hide this very fact. No mention that this is a development version of Apache. No mention of what the "several commercial equivalents" are. Not much to back up their claim that "Apache http server V2.1 code has defect density rate similar to the average found within commercial applications - Findings differ from previous Open Source Study".
I dare say that at first glance this seems to be a case of FUD.
Development release (Score:4, Insightful)
Errors mean nothing... (Score:2, Insightful)
My wife, who is a lead QA tester, could vouch for that...
Hmm, the first claim seems to be wrong... (Score:2, Insightful)
Re:Code defects appear to be a small part of the e (Score:3, Insightful)
Nobody's saying that the information should be published - what they're saying is that you can't rely on that information being a secret.
Is Fort Knox secure? Probably. If so, then why don't they publish the blueprints, guard rotation schedule and security policies?
That's pretty much the point you're missing - even if that information was published, it wouldn't diminish the security of Fort Knox.
If the people in charge relied on the fact that they don't publish those details, that would be obscurity, because it would lead them to make errors elsewhere. (Oh, it's OK if we leave the main vault open tonight - nobody knows that there will be no guards around it for 10 minutes at 3:30 AM tonight.)
Re:So if they found them... (Score:1, Insightful)
The only place where I like to put NULL checks is where passing a NULL pointer has some sort of meaning in the API (in which case, it's obviously necessary). Doing so helps signal to anyone reading the code (mainly myself) that a NULL pointer value has significance beyond a possible segmentation fault. That would be drowned out if I put a NULL pointer check everywhere just to return a marginally useful error code, which I would also have to check for, rather than the program crashing in a clean and spectacular manner (the fail fast mentality).
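A small sketch of the distinction (lookup() and its contract are invented for illustration):

#include <stdio.h>
#include <string.h>

/* Here a NULL return is part of lookup()'s contract and means
 * "no such key", so the caller's NULL check carries meaning
 * instead of merely papering over a potential crash. */
static const char *lookup(const char *key)
{
    return strcmp(key, "host") == 0 ? "example.org" : NULL;
}

int main(void)
{
    const char *val = lookup("port");
    if (val == NULL)            /* meaningful: the key is absent */
        puts("not found");
    return 0;
}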
Re:So if they found them... (Score:3, Insightful)
That's what assert() exists for. And the 'preconditions' you are referring to are actually 'invariants', so if "suddenly that pointer can indeed be NULL" it means that someone broke a fundamental design assumption and should not be tweaking the code anyway.
And for those who haven't seen this trick before, a nice habit to get into is to write your checks like so:
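(Presumably the constant-on-the-left idiom; a sketch:)

#include <stddef.h>

void handle(const char *ptr)
{
    /* Constant on the left: mistyping '=' for '==' fails to compile. */
    if (NULL == ptr)   /* the typo "NULL = ptr" would not compile */
        return;
    /* Conventional order: "if (ptr = NULL)" compiles (perhaps with
     * a warning) and silently clobbers ptr - exactly the bug. */
}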
I found this trick pretty annoying. First of all, any decent compiler can catch this with a warning. Second, if you are in fact mixing up == and = so often that you need a special habit to fight it, then perhaps you should look at what you type. There are plenty of other typos that no such habit will save you from:
"xFF" vs "\xFF"
comma operator; for instance, f(param) vs f,(param)
misplaced structure initializers
etc, etc
It does not mean the programmer needs to guard against all of these too; it just means that the code must be proofread as it's being written, which is a reasonable thing to expect from a professional developer.
Guy, they're on your side (Score:2, Insightful)
Their conclusion is that while the INITIAL defect rate of Apache is roughly equivalent to a closed source product (since they are testing a development release), the Open Source methodology reduces the defects to a greater extent and results in code with fewer defects over time.
They are saying that Open Source coding methods are producing _better_ code in the long run.
Re:what is a "software error"? (Score:5, Insightful)
IMNSHO, that ought to be standard for any mission-critical software. Bugs and the places that bugs live in are not created equal. The beauty of Apache (at least 1.13) is that the overall system can be very robust and reliable with rather buggy modules. I suspect the problem with IIS is that everything assumes everything else is perfect, which overall doesn't quite work so well.
Re:So if they found them... (Score:5, Insightful)
The metrics report does mention the version number (dev-1/31/03), though the fact that this is development code is not explicitly noted. No mention is made of who commissioned this study. Perhaps the company is simply fishing for clients.
Re:So if they found them... (Score:3, Insightful)
So shut up, you little twerp.
Re:to be expected from Open Source (Score:3, Insightful)
That is: he is sure that *both* processes take into account severity and priority of bugs. The poster just felt that their priorities were different. (Polish being more important for commercial code, absolute correctness for open source. The question of the 'correct' balance is left up to the reader.)
Re:Development release (Score:4, Insightful)
Why did they use the development branch of Apache?
Let me restate this: why are they comparing pre-alpha software with production releases?
Most simple answer: because they wanted to find flaws. The second most popular web server software is IIS. This looks like a Microsoft tactic: anonymously hire this company to "evaluate" code so that the results look unbiased. Everyone will likely realize that the competitor is Microsoft's IIS, so it doesn't need to be stated bluntly. MS wins; another (small) battle for mindshare is won.
Re:It's not fair! (Score:2, Insightful)
what doesn't kill you makes you stronger (Score:2, Insightful)
Re:So the error level in pre-release Apache ... (Score:4, Insightful)
everyone is reading this wrong (Score:2, Insightful)
The article seems to indicate that this report is simply an attempt to prove a simple hypothesis about OSS: it gets increasingly refined as it matures.
Reasoning believes they've proved the hypothesis because Apache, a middle-aged project, I suppose, has an error density comparable to commercial software, while the TCP/IP stack, a mature project, has a significantly lower density.
This isn't intended to be a comparison of web servers (come on, people, *of course* they didn't have access to IIS); it is intended to be a mildly interesting observation about the life-cycle of open source software.
It would be a lot more interesting if we could see an analysis of whether or not commercial software goes through a similar maturing process. Maybe commercial products also grow refined with age. Maybe not. If so, which matures faster?
This sooo does not matter (Score:2, Insightful)
Re:So if they found them... (Score:2, Insightful)
A simple reductio ad absurdum from this: if you produce thousands and thousands of lines of harmless, simple code to do something that could be done in a line, then your more verbose code is "better" than the concise one by this metric.
This is assuming that it is possible to reliably statically test for errors in the first place, and that one "error" is equivalent to another... All seems a little suspect to me.
This signature is intentionally pointless.
within the statistical margin of error (Score:3, Insightful)
Furthermore, while presumably many commercial equivalents were used to generate the commercial average, only one project - Apache - was used to generate the FS/OSS average error density. Again, very crappy statistics.
Even if 100 different FS/OSS projects like Apache were used to generate that 0.53 average, and 100 different commercial equivalents used to generate the commercial average, it's probably still within the margin of error (or standard deviation).
In short, this study = completely insignificant. Likewise, so was their previous study showing that FS/OSS has a lower bug-density, as it only used one FS/OSS project. To get useful statistics, you need hundreds of data-points -- not one.
Re:So if they found them... (Score:2, Insightful)
In this case, the bad form in using early returns is that using them leads one to not look at the whole routine as a cohesive whole where all the antecedents and consequents are correctly considered and accounted for. It's similar to why:
if (a) {
    ...
}
else if (b) {
    ...
}

is bad form compared to

if (a) {
    ...
}
else {
    if (b) {
        ...
    }
}
From a tracing point of view, they are indistinguishable. They may even compile to the same set of instructions. The second, however, shows a level of diligence on the part of the engineer: all the possible routes are considered and there is no dangling consequent.
Disclaimer: The real reasons why these things are bad form are practically impossible to convey in an example that doesn't use real code; i.e., it's the "..." bit that provides the opportunity for the bad-form constructs to leak bugs.
Re:Apache 1.3? (Score:3, Insightful)
I keep hearing this, and I'm not convinced.
I didn't see anything in the article about what versions of closed-source codebases they used for comparison. But I'd hypothesize that it's code that they've been contracted to analyze. That means it's probably development code in that event, too.
We can't gritch about them using Apache 2.1-dev unless we have reason to believe they didn't compare against dev versions. We can gritch about not having this information.