Do Static Source Code Analysis Tools Really Work? 345
jlunavtgrad writes "I recently attended an embedded engineering conference and was surprised at how many vendors were selling tools to analyze source code and scan for bugs, without ever running the code. These static software analysis tools claim they can catch NULL pointer dereferences, buffer overflow vulnerabilities, race conditions and memory leaks. Ive heard of Lint and its limitations, but it seems that this newer generation of tools could change the face of software development. Or, could this be just another trend? Has anyone in the Slashdot community used similar tools on their code? What kind of changes did the tools bring about in your testing cycle? And most importantly, did the results justify the expense?"
Re:In Short, Yes (Score:3, Interesting)
Static analysis tools (Score:4, Interesting)
I've also used Polyspace. In my opinion, it is expensive, slow, can't handle some constructs well and has a *horrible* signal to noise ratio. There is also no mechanism for silencing warnings in future runs of the tool(like the -e flag in lint). On the other hand, it has caught a (very) few issues that PC-Lint missed. Is it worth it? I suppose it depends if you are writing systems that can kill people if something goes wrong.
They do work (Score:5, Interesting)
Even lint is decent -- the trick is just using it in the first place. As for expense, if you have more than, oh, 3 developers, they pay for themselves by your first release. Besides, many good tools such as valgrind are free (valgrind isn't static, but it's still useful).
Static analysis tools (Score:3, Interesting)
Of course they can work (Score:5, Interesting)
It would probably be more useful if you could state which kind of problem you are trying to solve and which tools you are considering to buy. That way, people who have experience with them could suggest which work best
MIT Site (Score:4, Interesting)
Very useful in .Net (Score:3, Interesting)
FxCop too has gone server-side too (for those familiar with
Not Yet, In My Personal Experience. (Score:4, Interesting)
Useful for planning tests (Score:5, Interesting)
I've since moved on, and I think the tool has since gone offline, but I think there's a real value to doing static analysis as part of the planning for everything else.
Trends or Crutches? (Score:4, Interesting)
For instance, we put men on the moon with a pencil and a slide rule. Now no one would dream of taking a high school math class with anything less than a TI-83+.
Languages like Java and C# are being hailed while languages like C are derided and many posts here on slashdot call it outmoded and say it should be done away with, yet Java and C# are built using C.
It seems to me that there is no substitute for actually knowing how things work at the most basic level and doing them by hand. Can a tool like Lint help? Yes. Will it catch everything? Likely not.
As generations of kids grow up with the automation made by generations who came before, and have less incentive to learn how the basic tools work, an incentive which will diminish, approaching 0, I think we're in for something bad.
As much as people bitch about kids who were spoiled by BASIC, you'd think that they'd also complain about all the other spoilers. Someday all this new, fancy stuff could break and someone who only knows Java, and even then checks all their source with automated tools will likely not be able to fix it.
Of course, this is more of just a general criticism and something I've been thinking about for a few weeks now. Anyway, carry on.
To a degree, yes (Score:5, Interesting)
However these things do work and are highly recommended. If you use other advanced techniques (like Descign by Contract),they will be a lot less useful though. They are best for traditional code that does not have safety-nets (i.e. most code).
Stay away from tools that do this without using your compiler. I recently evaluated some static analysis tools found that the tools that do not use the native compilers can have serious problems. One example was an incorrecly set symbol in the internal compiler of one tool, that could easily change the code functionality drastically. Use tools that work frrom a build environment and utilize the compiler you are using to build.
Re:Yes (Score:3, Interesting)
Buyer (User) Beware (Score:3, Interesting)
Re:In Short, Yes (Score:5, Interesting)
Re:Trends or Crutches? (Score:3, Interesting)
I whipped out my trusty slide rule and commenced to using it. The teacher wanted to confiscate it and thought that I was cheating with some sort of high-tech device... mind you it was just plastic and cardboard. I'm sure you've all seen one before.
I'm only just about to turn 24, so 7th grade was not long ago for me.
The point is, students should be required to know how to do things by hand. a PhD in physics 20 years ago clearly knew how to do calculus by hand. if he wants to use a Ti-92 to do it now, that's his business. Good that those aren't allowed in class (even if they are allowed on the SAT and AP exams, well the ti-89 is).
Intro to comp sci shouldn't be taught with Java any more than elementary school math should be "intro to the calculator." You just cripple people's minds that way.
I ended up getting a BA in English the first time around because of personal reasons I was too messed up to concentrate on maths. I'm now starting to take a 2nd degree in MechE - and I'm making a good faith effort to do as much by hand as possible, because I don't want to fuck something up in the future because I figured the calculator was giving me a right answer and I had no idea of where the proper answer should be.
summary: using a tool to check your answer is one thing. Relying on it to get the answer in the first place is lazy, stupid, and potentially dangerous.
Absolutely (Score:2, Interesting)
WHOA... nice timing (Score:2, Interesting)
make up for language deficiencies (Score:4, Interesting)
I've never gotten anything useful out of these tools. Generally, encapsulating unsafe operations, assertions, unit testing, and using valgrind, seem both necessary and sufficient for reliably eliminating bugs in C++. And whenever I can, I simply use better languages.
Watch the differences! (Score:4, Interesting)
Something that we've found incredibly useful here and in past workplaces was to watch the _differences_ between Gimpel PC-Lint runs, rather than just the whole output.
The output for one of our projects, even with custom error suppression and a large number of "fixups" for lint, borders on 120MiB of text. But you can quickly reduce this to a "status report" consisting of statistics about the number of errors -- and with a line-number-aware diff tool, report just any new stuff of interest. It's easy to flag common categories of problems for your engine to raise these to the top of the notification e-mails.
Keeping all this data around (it's text, it compresses really well) allows you to mine it in the future. We've had several cases where Lint caught wind of something early on, but it was lost in the noise or a rush to get a milestone out -- when we find and fix it, we're able to quickly audit old lint reports both for when it was introduced and also if there are indicators that it's happening in other places.
And you can do some fun things like do analysis of types of warnings generated by author, etc -- play games with yourself to lower your lint "score" over time...
The big thing is keeping a bit of time for maintenance (not more than an hour a week, at this point) so that the signal/noise ratio of the diffs and stats reports that are mailed out stays high. Talking to your developers about what they like / don't like and tailoring the reports over time helps a lot -- and it's an opportunity to get some surreptitious programming language education done, too.
Re:In Short, Yes (Score:3, Interesting)
In particular, I've found FindBugs has an amazing degree of precision considering it's an automated tool. If it comes up with a "red" error, it's almost certainly something that should be changed. I'm not familiar with any C/C++ tool that comes close.
Re:Coverity & Klocwork (Score:3, Interesting)
I did some work running Coverity for EnterpriseDB, against the PostgreSQL code base (and yes, we submitted all patches back, all of which were committed).
Based on my experience:
1) Yes, Coverity produced a LOT of false positives - a few tens of thousands for the 20-odd true critical bugs we found. However, the first step in working with Coverity is configuring it to know what can be safely ignored. After about 2 days of customizing the configuration (including points where I could configure it to understand that certain methods fixed its common complaints already), the bug list dropped to 200. 2 days of configuration by an inexperienced user dropped it to a total false positive rate of about 75%, if that, though only about half of those proved to be critical bugs. I think that's good enough to be useful.
2) Can't argue - horrendously expensive.
3) Postgres is smaller than what you're talking about... but still, the speed was a pain, but the system was useful enough to make up for it, particularly once we had the system configured to know what bugs were real and automated its running every few nights. While EnterpriseDB still had the license, it was a great supplement to nightly testing.
Re:In short, YMMV (Score:4, Interesting)
Halting problem bullshit (Score:4, Interesting)
Several posters have cited the "halting problem" as an issue. It's not.
First, the halting problem does not apply to deterministic systems with finite memory. In a deterministic system with finite memory, eventually you must repeat a state, or halt. So that disposes of the theoretical objection.
In practice, deciding halting isn't that hard. The general idea is that you have to find some "measure" of each loop which is an integer, gets smaller with each loop iteration, and never goes negative. If you can come up with a measure expression for which all those properties are true, you have proved termination. If you can't, the program is probably broken anyway. Yes, it's possible to write loops for which proof of termination is very hard. Few such programs are useful. I've actually encountered only one in a long career, the termination condition for the GJK algorithm for collision detection of convex polyhedra. That took months of work and consulting with a professor at Oxford.
The real problem with program verification is the C programming language. In C, the compiler has no clue what's going on with arrays, because of the "pointer=array" mistake. You can't even talk about the size of a non-fixed array in the language. This is the cause of most of the buffer overflows in the world. Every day, millions of computers crash and millions are penetrated by hostile code from this single bad design decision.
That's why I got out of program verification when C replaced Pascal. I used to do [acm.org] this stuff. [animats.com]
Good program verification systems have been written for Modula 3, Java, C#, and Verilog. For C, though, there just isn't enough information in the source to do it right. Commercial tools exist, but they all have holes in them.
Yes, they do - in the right hands (Score:2, Interesting)
Re:In Short, Yes (Score:3, Interesting)
The C/C++ version does MISRA-C [misra-c2.com] (the C used in the automotive industry) too.
There's also a version for Ada [wikipedia.org], of course.