Do Static Source Code Analysis Tools Really Work?
Posted by CmdrTaco on Mon May 19, 2008 12:19 PM
from the if-you're-stupid-they-do dept.
jlunavtgrad writes "I recently attended an embedded engineering conference and was surprised at how many vendors were selling tools to analyze source code and scan for bugs, without ever running the code. These static software analysis tools claim they can catch NULL pointer dereferences, buffer overflow vulnerabilities, race conditions and memory leaks. I've heard of Lint and its limitations, but it seems that this newer generation of tools could change the face of software development. Or, could this be just another trend? Has anyone in the Slashdot community used similar tools on their code? What kind of changes did the tools bring about in your testing cycle? And most importantly, did the results justify the expense?"
In Short, Yes (Score:5, Informative)
In short, YMMV (Score:5, Informative)
The thing is, these tools produce
A) a lot of "false positives": code which is really OK, and everyone understands why it's OK, but the tool will still complain, and
B) usually some metrics of dubious quality at best, to be taken only as a signal for a human to look at the code and understand why it's OK or not OK.
E.g., one such tool, whose salesman's hype session I had the misfortune of sitting through, seemed to be really little more than a glorified grep. It really just looked at the source text, not at what's happening. So for example if you got a database connection and a statement in a "try" block, it wanted to see the close statements in the "finally" block.
Well, applied to an actual project, there was a method which just closed the connection and the statements supplied as an array. Just because, you know, it's freaking stupid to copy-and-paste cute little "if (connection != null) { try { connection.close(); } catch (SQLException e) {
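A helper of the kind described might look roughly like the sketch below. The names `Resources`, `closeAll`, `FakeResource` and `CloseDemo` are made up for illustration; the project's actual method isn't shown in the post. The point is that the caller's "finally" block contains one call to the helper and no literal close(), which is exactly what a purely textual check would trip over.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: closes whatever it is handed, skipping nulls and
// collecting (rather than rethrowing) failures so every resource gets a turn.
final class Resources {
    private Resources() {}

    static List<Exception> closeAll(AutoCloseable... resources) {
        List<Exception> failures = new ArrayList<>();
        for (AutoCloseable r : resources) {
            if (r == null) {
                continue; // never opened, nothing to close
            }
            try {
                r.close();
            } catch (Exception e) {
                failures.add(e);
            }
        }
        return failures;
    }
}

// Stand-in for a connection or statement so the sketch runs without a database.
final class FakeResource implements AutoCloseable {
    boolean closed = false;

    @Override
    public void close() {
        closed = true;
    }
}

final class CloseDemo {
    static boolean run() {
        FakeResource connection = new FakeResource();
        FakeResource statement = null;
        try {
            statement = new FakeResource();
            // ... use connection and statement here ...
        } finally {
            // No literal close() in this block; the helper does the work.
            Resources.closeAll(statement, connection);
        }
        return connection.closed && statement.closed;
    }
}
```

A tool that follows the call into `closeAll` can see that both resources get closed; one that only greps the "finally" block for close() cannot.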
Other examples include more mundane stuff like the tools recommending that you synchronize or un-synchronize a getter, even when everyone understands why it's OK for it to be as it is.
E.g., a _stateless_ class as a singleton is just an (arguably premature and unneeded) speed optimization, because some people think they're saving so much by a singleton instead of the couple of cycles it takes to do a new on a class with no members and no state. It doesn't really freaking matter if there's exactly one of it, or someone gets a copy of it. But invariably the tools will make an "OMG, unsynchronized singleton" fuss, because they don't look deep enough to see if there's actually some state that must be unique.
Etc.
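To make the stateless-singleton case concrete, here is a minimal sketch. `PriceFormatter` is a hypothetical class invented for this example. It has no fields, so if two threads race through `getInstance()` and each creates an instance, the two copies behave identically and nothing breaks. The field is read once into a local so the method never returns a stale null.

```java
import java.util.Locale;

// Hypothetical stateless class used as a lazily created singleton.
final class PriceFormatter {
    // Deliberately neither volatile nor synchronized: the class has no state,
    // so a duplicate instance created under a race is harmless.
    private static PriceFormatter instance;

    private PriceFormatter() {}

    static PriceFormatter getInstance() {
        PriceFormatter local = instance; // single read of the shared field
        if (local == null) {
            local = new PriceFormatter();
            instance = local;
        }
        return local;
    }

    // Pure function of its argument; assumes a non-negative amount in cents.
    String format(long cents) {
        return String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
    }
}
```

A tool that only matches the "lazy init without synchronization" shape will flag this. Whether the warning matters depends on whether the class holds state that must be unique, and here it doesn't.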
Now taken as something that each developer understands, runs on his own when he needs it, and uses his judgment of each point, it's a damn good thing anyway.
Enter the clueless PHB with a metric and chart fetish, stage left. This guy doesn't understand what those things are, but might make it his personal duty to chart some progress by showing how many fewer warnings he's got from the team this week than last week. So useless man-hours are spent on uselessly morphing perfectly good code into something that games the tool. For each 1 real bug found, there'll be 100 harmless warnings that he makes it his personal mission to get out of the code.
Enter the snake-oil vendor's salesman, stage right. This guy only cares about selling some extra copies to justify his salary. He'll hype to the boss exactly the possibility to generate such charts (out of mostly false positives) and manage by such charts. If the boss wasn't already in a mind to do that management anti-pattern, the salesman will try to teach him to. 'Cause that's usually the only advantage that his expensive tool has over those open source tools that you mention.
I'm not kidding. I actually tried to corner one:
Me: "ok, but you said not everything it flags there is a bug, right?"
Him: "Yes, you need to actually look at them and see if they're bugs or not."
Me: "Then what sense does it make to generate charts based on wholesale counting entities which may, or may not be bugs?"
Him: "Well, you can use the charts to see, say, a trend that you have less of them over time, so the project is getting better."
Me: "But they may or may not be actual bugs. How do you know if this week's mix has more or less actual bugs than last weeks, regardless of wh
Re:In Short, Yes (Score:5, Interesting)
Re:In Short, Yes (Score:5, Insightful)
Re:In Short, Yes (Score:5, Informative)
My group at work recently bought one of these. They catch a lot of things that compilers don't -- for example, code like this:
... where invalid input causes arbitrarily bad behavior. They also tend to be better at inter-procedural analysis than compilers, so they can warn you that you're passing a short literal string to a function that will memcpy() from the region after that string. They do have a lot of false positives, but what escapes from compilers to be caught by static analysis tools tends to be dynamic behavior problems that are easy to overlook in testing. (If the problem were so obvious, the coder would have avoided it in the first place, right?)
Just like compiler warnings... (Score:5, Insightful)
It has found some real bugs that are hard to generate a testcase for. It has also found a lot of things that aren't bugs, just like -Wall can. Since I work in the virtual memory manager, a lot more of our bugs can be found just by booting, compared to other domains, so we didn't get a lot of new bugs when we started using static analysis. But even one bug prevented can be worth multiple millions of dollars.
My experience is that, just like enabling compiler warnings, any way you have to find a bug before it gets to a customer is worth it.
Re:Just like compiler warnings... (Score:5, Informative)
OSS usage (Score:5, Insightful)
Coverity Reports Open Source Security Making Grea (Score:5, Informative)
http://it.slashdot.org/article.pl?sid=08/01/11/1818241 [slashdot.org]
- doug
Coverity Prevent Rocks (Score:5, Informative)
* I really like Insure, but it is difficult to set up on a system composed of many shared libraries. However, there are some bugs that really need run-time analysis to catch.
They do work (Score:5, Interesting)
Even lint is decent -- the trick is just using it in the first place. As for expense, if you have more than, oh, 3 developers, they pay for themselves by your first release. Besides, many good tools such as valgrind are free (valgrind isn't static, but it's still useful).
Yes, they work. (Score:5, Insightful)
Static analyzers will catch the stupid things - edge cases that fail to initialize a var, but then lead straight to de-referencing it; memory leaks on edge-case code paths, etc. that shouldn't happen but often do, and get in the way of finding real bugs in your program logic.
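A small Java-flavored sketch of the "edge case forgets to set the var, then dereferences it" pattern follows. The class and method names are hypothetical. Java's compiler rejects a truly uninitialized local, so the usual Java form is a variable that starts as null and is only assigned on some paths.

```java
import java.util.Map;

// Hypothetical example of a null dereference that only happens on an edge path.
final class HeaderLengths {
    private HeaderLengths() {}

    // The pattern a static analyzer would typically flag: when the header is
    // missing, value stays null and value.length() throws.
    static int lengthUnsafe(Map<String, String> headers, String name) {
        String value = null;
        if (headers.containsKey(name)) {
            value = headers.get(name);
        }
        return value.length(); // NullPointerException when name is absent
    }

    // One possible fix: handle the missing case explicitly.
    static int lengthSafe(Map<String, String> headers, String name) {
        String value = headers.get(name);
        return value == null ? 0 : value.length();
    }
}
```

Ordinary tests that always supply the header never hit the bad path, which is why this sort of thing tends to survive until a tool (or a customer) finds it.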
Of course they can work (Score:5, Interesting)
It would probably be more useful if you could state which kind of problem you are trying to solve and which tools you are considering buying. That way, people who have experience with them could suggest which work best.
Testing cycle (Score:5, Informative)
Since we've had the tool for a while and have fixed most of the bugs it has found, we are required to run static analysis on new code for the latest release now (i.e. we should not be dropping any new code that has any error in it found via static analysis).
Just like code reviews, unit testing, etc., it has proved useful and was added to the software development process.
Yes (Score:5, Informative)
Add me to the Yes column
We use them (PMD and FindBugs) for eliminating code that is perfectly valid, yet has bitten us in the past. Two Java examples are unsynchronized access to a static DateFormat object and using the commons IOUtils.copy() instead of IOUtils.copyLarge().
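For anyone who hasn't been bitten by the DateFormat one: SimpleDateFormat keeps mutable internal state and its documentation says instances are not synchronized, so sharing one static instance across threads can produce garbage dates. A rough sketch of the flagged pattern and one common workaround is below. The class name `Stamps` and the date pattern are assumptions for illustration, not our actual code.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical utility class showing the flagged pattern and a workaround.
final class Stamps {
    private Stamps() {}

    // The pattern that gets flagged: one mutable formatter shared by every
    // thread. Shown only for contrast; nothing below uses it.
    static final SimpleDateFormat SHARED = utcFormat();

    // Workaround: each thread lazily gets its own formatter instance.
    private static final ThreadLocal<SimpleDateFormat> PER_THREAD =
            ThreadLocal.withInitial(Stamps::utcFormat);

    private static SimpleDateFormat utcFormat() {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    // Safe to call from any thread.
    static String day(Date d) {
        return PER_THREAD.get().format(d);
    }
}
```

On Java 8 and later, the immutable java.time.format.DateTimeFormatter is another way out, since it is documented as thread-safe.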
Most tools are easy to add to your build cycle and repay that effort after the first violation.
Useful for planning tests (Score:5, Interesting)
I've since moved on, and I think the tool has since gone offline, but I think there's a real value to doing static analysis as part of the planning for everything else.
Coverity & Klocwork (Score:5, Informative)
My comments would be:
(1) Klocwork & Coverity tend to produce a lot of "false positives". And by a lot, I mean, *A LOT*. For every 10000 "critical" bugs reported by the tool, only a handful may be really worth investigating. So you may spend a fair bit of time simply weeding through what is useful and what isn't.
(2) They're expensive. Coverity costs $50k for every 500k lines of code per year... We have a LOT more code than this. For the price, we could hire a couple of guys to run all of our tools through Purify *and* fix the bugs they found. Klocwork is cheaper; $4k per seat, minimum number of seats.
(3) They're slow. It takes several days running non-stop on our codebase to produce the static analysis databases. For big projects, you'll need to set aside a beefy machine to be a dedicated server. With big projects, there will be lots of bug information, so the clients tend to get bogged down, too.
In short: It all depends on how "mission critical" your code is; is it important, to you, to find that *one* line of code that could compromise your system? Or is your software project a bit more tolerant? (e.g., If you're writing nuclear reactor software, it's probably worthwhile to you to run this code. If you're writing a video game, where you can frequently release patches to the customer, it's probably not worth your while.)
To a degree, yes (Score:5, Interesting)
However, these things do work and are highly recommended. If you use other advanced techniques (like Design by Contract), they will be a lot less useful though. They are best for traditional code that does not have safety-nets (i.e. most code).
Stay away from tools that do this without using your compiler. I recently evaluated some static analysis tools and found that the tools that do not use the native compilers can have serious problems. One example was an incorrectly set symbol in the internal compiler of one tool, which could easily change the code's functionality drastically. Use tools that work from a build environment and utilize the compiler you are using to build.
Yes, absolutely (Score:5, Informative)
FindBugs is becoming increasingly widespread on Java projects, for example. I found that between it and JLint I could identify a substantial chunk of problems caused by inexperienced programmers, poor design, hastily written code, etc. JLint was particularly nice for potential deadlocks, while FindBugs was good for just about everything else.
For example:
At least in the Java world, I wish more people would use them. It would make my job so much easier.
My experience in the Python world is that pylint is less interesting than FindBugs: many of the more interesting bugs are hard problems in a dynamically typed language and so it has more "religious style issues" built in that are easier to test for. It still provides a great deal of useful output once configured correctly, and can help enforce a consistent coding standard.
Low startup cost and great benefits (Score:5, Insightful)
Could this be just another trend?
I don't worry about what's "trendy" or not. Just give the tool a shot in your group and see if it works for you. If it does, keep using it; otherwise, abandon it.
What kind of changes did the tools bring about in your testing cycle?
We use it _before_ the test cycle. We use it to catch mistakes such as "Whoops! Dereferenced a pointer there, my bad" before going into the test cycle.
And most importantly, did the results justify the expense?
Absolutely. The startup cost of adding static analysis for us was one developer for 1/2 a day to setup FindBugs to work on our CI build on a nightly basis to give us HTML reports. After that, the cost is our team lead to check the reports in the morning (he's an early riser) and create bug reports based on them to send to us. Some days there's no reports, other days (after a large check-in) it might be 5-10 and about an hour of his time.
It's best to view this tool as preventing bugs, synchronization issues, performance issues, you-name-it issues from getting into the hands of testers. But you can also extend several of the tools, like FindBugs, with new static analysis test cases. So if a tester finds a common problem that affects the code, you can go back and write a static analysis case for it, add it to the tool, and the problem shouldn't reach the tester again.
Many of never all (Score:5, Informative)
Short version:
There are real bugs, with huge consequences, that can be detected with static analysis.
The tools are easy to find and worth the price, depending on the customer base you have.
In the end, they cannot detect "all" bugs that could arise in the code.
Worth it?
Only you can decide, but after a few sessions learning why tools flag suspect code, if you take those suggestions to heart, you will be a better coder.
Linux kernel devs use sparse for static analysis (Score:5, Informative)
http://www.kernel.org/pub/software/devel/sparse/ [kernel.org]
Sparse has some features targeted at kernel development - for instance spotting mixing up kernel and user space pointers and a system of code annotations.
I haven't used it but I do see on the kernel mailing list that it regularly finds bugs.
Re:Yes. (Score:5, Informative)
Re:Yes. (Score:5, Informative)
Re:Trends or Crutches? (Score:5, Insightful)
I also don't think new languages help bad programmers much. Bad code is still bad code, so now instead of crashing it will just leak memory or just not work right.
On a software project I worked on before, our competition spent two years and two million dollars doing their code in Visual Basic and MSSQL, and they abandoned their effort when, no matter what hardware they threw at it, they couldn't get their software to handle more than 400 concurrent users. We did our project in C and, with a team of 4, built something in about a year that handled 1200 users on a quad CPU P III 400mhz Compaq. Even when another competitor posed as a client and borrowed some of my ideas (they added a comms layer instead of using the SQL server for communication), they still required a whole rack of machines to do what we did with one badly out-of-date test machine.
C is a fine tool if you know how to use it so I doubt it will go away any time soon.