
Do Static Source Code Analysis Tools Really Work?

Posted by CmdrTaco
from the if-you're-stupid-they-do dept.
jlunavtgrad writes "I recently attended an embedded engineering conference and was surprised at how many vendors were selling tools to analyze source code and scan for bugs, without ever running the code. These static software analysis tools claim they can catch NULL pointer dereferences, buffer overflow vulnerabilities, race conditions and memory leaks. I've heard of Lint and its limitations, but it seems that this newer generation of tools could change the face of software development. Or, could this be just another trend? Has anyone in the Slashdot community used similar tools on their code? What kind of changes did the tools bring about in your testing cycle? And most importantly, did the results justify the expense?"
  • Yes. (Score:4, Insightful)

    by Dibblah (645750) on Monday May 19, 2008 @12:24PM (#23463726)
    It's another tool in the toolbox. However, the results are not necessarily easy to understand or simple to fix. For example, see the recent SSL library issue, which exhibited minimal randomness due to someone "fixing" an (intended) uninitialized memory area.
  • by mdf356 (774923) <mdf356NO@SPAMgmail.com> on Monday May 19, 2008 @12:24PM (#23463734) Homepage
    Here at IBM we have an internal tool from research that does static code analysis.

    It has found some real bugs that are hard to generate a test case for. It has also found a lot of things that aren't bugs, just like -Wall can. Since I work in the virtual memory manager, a lot more of our bugs can be found just by booting, compared to other domains, so we didn't get a lot of new bugs when we started using static analysis. But even one bug prevented can be worth multiple millions of dollars.

    My experience is that, just like enabling compiler warnings, any way you have to find a bug before it gets to a customer is worth it.
  • OSS usage (Score:5, Insightful)

    by MetalliQaZ (539913) on Monday May 19, 2008 @12:25PM (#23463742)
    If I remember correctly, one of these companies donated their tool to many open source projects, including Linux and the BSDs. I think it led to a wave of commits as 'bugs' were fixed. It seemed like a pretty good endorsement to me...
  • Re:Yes. (Score:2, Insightful)

    by mdf356 (774923) <mdf356NO@SPAMgmail.com> on Monday May 19, 2008 @12:28PM (#23463784) Homepage
    If you're using uninitialized memory to generate randomness, it wasn't very random in the first place.

    Not that I actually read anything about the SSL "fix".
  • Re:In Short, Yes (Score:1, Insightful)

    by Anonymous Coward on Monday May 19, 2008 @12:28PM (#23463788)
    The proper answer would be: No. A fully working static code analyzer would be like solving the Halting Problem, which has been proven to be impossible. Essentially you can just try to catch as many potential problems as you can, but you can never catch them all.
  • Yes, they work. (Score:5, Insightful)

    by Anonymous Coward on Monday May 19, 2008 @12:32PM (#23463832)
    You will probably be amazed at what you will catch with static analysis. No, it's not going to make your program 100% bug-free (or even close), but every time I see code die on an edge case that would've been caught with static analysis, it makes me want to kill a kitten (and I'm totally a "cat person" mind you).

    Static analyzers will catch the stupid things - edge cases that fail to initialize a var, but then lead straight to de-referencing it; memory leaks on edge-case code paths, etc. that shouldn't happen but often do, and get in the way of finding real bugs in your program logic.
  • by BrotherBeal (1100283) on Monday May 19, 2008 @12:40PM (#23463914)
    the more they stay the same. Static code analysis tools are just like smarter compilers, better language libraries, new-and-improved software methodologies, high-level dynamic languages, modern IDE's, automated unit test runners, code generators, document tools and any number of other software tools that have shown up over the past few decades.

    Yes, static code analysis can help improve a team's ability to deliver a high-quality product, if it is embraced by management and its use is enforced. No, it will not change the face of software development, nor will it turn crappy code into good code or lame programmers into geniuses. At best, when engineers and management agree this is a useful tool, it can do almost all the grunt work of code cleanup by showing exactly where problem code is and suggesting extremely localized fixes. At worst, it will wind up being a half-assed code formatter since nobody can agree on whether the effort is necessary.

    Just like all good software-engineering questions, the answer is 'it depends'.
  • Re:Yes. (Score:3, Insightful)

    by moocat2 (445256) on Monday May 19, 2008 @12:41PM (#23463940)
    Assuming you are talking about the SSL issue in Debian - the original 'issue' they tried to fix was reported by Valgrind. Valgrind is a run-time analysis tool.

    While the parent makes a good point that results are not always easy to understand or fix - since the original post is about static vs run-time analysis tools, it's good to understand that they each have their problems.
  • Re:In Short, Yes (Score:5, Insightful)

    by Goaway (82658) on Monday May 19, 2008 @12:49PM (#23464038) Homepage
    You don't need to be perfect to be useful.
  • by iamwoodyjones (562550) on Monday May 19, 2008 @01:04PM (#23464208) Journal
    I have used static analysis as part of our build process on our Continuous Integration machines and it's definitely worth your time to set it up and use it. We use FindBugs with our Java code and have it output HTML reports on a nightly basis. Our team lead comes in early in the morning and peruses them, and either suppresses the issues or assigns them to be fixed. We shoot for zero bugs, either by suppressing them if they aren't bugs or by fixing them. FindBugs doesn't give too many false positives, so it works great.

    Could this be just another trend?

    I don't worry about what's "trendy" or not. Just give the tool a shot in your group and see if it helps/works for you or not. If it does, keep using it; otherwise abandon it.

    What kind of changes did the tools bring about in your testing cycle?

    We use it _before_ the test cycle. We use it to catch mistakes such as "Whoops! Dereferenced a pointer there, my bad" before going into the test cycle.

    And most importantly, did the results justify the expense?

    Absolutely. The startup cost of adding static analysis for us was one developer for 1/2 a day to setup FindBugs to work on our CI build on a nightly basis to give us HTML reports. After that, the cost is our team lead to check the reports in the morning (he's an early riser) and create bug reports based on them to send to us. Some days there's no reports, other days (after a large check-in) it might be 5-10 and about an hour of his time.

    It's best to view this tool as preventing bugs, synchronization issues, performance issues, you-name-it issues before going into the hands of testers. But, you can extend several of the tools like FindBugs to add new static analysis test cases. So if a tester finds a common problem that affects the code, you can go back and write a static analysis case for that, add it to the tool, and the problem shouldn't reach the tester again.

  • by mrogers (85392) on Monday May 19, 2008 @01:06PM (#23464238)
    Depends on whether one interprets "you should comment" as "you should document" or "you should comment out", I guess. :-)
  • by deep-deep-blue (1055812) on Monday May 19, 2008 @01:07PM (#23464242)
    Another good point for using lint is that after a while a programmer learns the right way to write things, and the outcome is better code in a shorter time. Of course, I also found that there are a few ways to avoid lint errors/warnings in a way that leads to some very ugly bugs.
  • by Lord_Frederick (642312) on Monday May 19, 2008 @01:08PM (#23464256)
    Any tool can be considered a "crutch" if it's misused. I don't think anyone who put men on the moon would want to return to slide rules, but a calculator is only a crutch if the user doesn't understand the underlying fundamentals. Debugging tools are just tools until they stop simply performing tedious work and start doing what the user is not capable of understanding.
  • by flavor (263183) on Monday May 19, 2008 @01:27PM (#23464478) Homepage
    Did you walk uphill in the snow, both ways, when you were a kid, too? At one point in time, high-level languages like ASSEMBLER were considered crutches for people who weren't Real Programmers [catb.org]. Get some perspective!

    Look, people make mistakes, and regardless of how good a programmer you are, there is a limit to the amount of state you can hold in your head, and you WILL dereference a NULL pointer, or create a reference loop, at some point in your career.

    Using a computer to catch these errors is just another flavor of metaprogramming. Get over it, and go be more productive with these tools, instead of whining for the days when you coded on bare metal with your bare hands and you liked it.

    Arrgh.
  • These tools require skill. Blindly fixing things that Lint flags can introduce new bugs, or conversely, using lint notation to shut the warnings off can mask bugs.

    I also don't think new languages help bad programmers much. Bad code is still bad code, so now instead of crashing it will just leak memory or not work right.

    On a software project I worked on, our competition spent two years and two million dollars doing their code in Visual Basic and MSSQL, and they abandoned their effort when, no matter what hardware they threw at it, they couldn't get their software to handle more than 400 concurrent users. We did our project in C, and with a team of 4 built something in about a year that handled 1200 users on a quad-CPU P III 400MHz Compaq. Even when another competitor posed as a client and borrowed some of my ideas (they added a comms layer instead of using the SQL server for communication), they still required a whole rack of machines to do what we did with one badly out-of-date test machine.

    C is a fine tool if you know how to use it so I doubt it will go away any time soon.
  • To summarize... (Score:3, Insightful)

    by kclittle (625128) on Monday May 19, 2008 @01:49PM (#23464708)
    Static analysis is a tool. In good hands, it is a valuable tool. In expert hands, it can be invaluable, catching really subtle bugs that only show up in situations unlike anything you've ever tested -- or imagined to test. You know, situations like what your customers will experience the weekend after a major upgrade (no joking...)
  • by zimtmaxl (667919) on Monday May 19, 2008 @01:49PM (#23464710) Homepage
    It may be the best tool in the world - I admit I do not know it. But the word "proved" makes me suspicious. To me this sounds like the typical - and widespread - management speak to make business decision makers and their insurers sleep well. Thank you! This is a perfect example of misleading wording being used even by educational bodies.
    Is this a proof, or do some mistakenly think they're safe?

    Who "proved" Astree to be error free in the first place?!
  • by Incster (1002638) on Monday May 19, 2008 @01:54PM (#23464764)
    You should strive to make your code as clean as possible. Turn on maximum warnings from your compiler, and don't allow code that generates warnings to be checked in to your source repository. Use static analysis tools, and make sure your code passes without issue there as well. These tools will generate many false positives, but if you learn to write in a style that avoids triggering warnings, quality will go up. You may be smarter than Lint, but the next guy that works on the code may not be. Static analysis tools are just another tool in the tool box. Also use dynamic analysis tools like Purify, valgrind, or whatever works in your environment. Writing quality code is hard. You need all the help you can get.
  • Re:signal to noise (Score:4, Insightful)

    by McGregorMortis (536146) on Monday May 19, 2008 @02:11PM (#23465002)
    If you're tuning it to ignore assignment within a test, i.e. "if( x=y ) {}", then you're missing one of the great points of using PC-Lint.

    That code is simply in poor taste, even if it works. What PC-Lint, and good taste, say you should do is change the code to "if( (x=y) != 0 ) {}". This will satisfy PC-Lint, and also makes your intention very clear to the next programmer who comes along. And, best of all, it doesn't generate a single byte of extra code, because you've only made explicit what the compiler was going to do anyway.
  • by Alphasite (1261864) on Monday May 19, 2008 @02:21PM (#23465104) Homepage
    Quite the contrary, I think he's got a point. That's mainly the reason why great companies (especially Google) try so hard to get the best coders they can.

    Of course, the best coders still make mistakes, but lousy coders make a lot of them; believe me, "I've seen things you people wouldn't believe." Every coder makes mistakes, but some coders - or so-called coders whose background is in fact a "learn XXX in 21 days" course - make lots of mistakes.
  • Re:In short, YMMV (Score:5, Insightful)

    by TemporalBeing (803363) <bm_witness@yahoo ... minus herbivore> on Monday May 19, 2008 @02:31PM (#23465226) Homepage Journal

    Enter the clueless PHB with a metric and chart fetish, stage left. This guy doesn't understand what those things are, but might make it his personal duty to chart some progress by showing how many fewer warnings he's got from the team this week than last week. So useless man-hours are spent morphing perfectly good code into something that games the tool. For each 1 real bug found, there'll be 100 harmless warnings that he makes it his personal mission to get out of the code.
    I've found that eliminating compiler warnings will do a lot for finding bugs. Sure, there may be a number of "harmless" ones, but cleaning them up will still do a lot of good to the code too, and make the other not-so-harmless ones stand out even more. It also gives good practice for resolving the issues so that you become more proactive than reactive to bugs in the code. Just 2 cents.
  • by Ungrounded Lightning (62228) on Monday May 19, 2008 @02:36PM (#23465264) Journal
    Me: "OK, but you said not everything it flags there is a bug, right?"
    Him: "Yes, you need to actually look at them and see if they're bugs or not."
    Me: "Then what sense does it make to generate charts based on wholesale counting of entities which may, or may not, be bugs?"
    Him: "Well, you can use the charts to see, say, a trend that you have fewer of them over time, so the project is getting better."
    Me: "But they may or may not be actual bugs. How do you know if this week's mix has more or fewer actual bugs than last week's, regardless of what the total is?"
    Him: "Well, yes, you need to actually look at them in turn to see which are actual bugs."
    Me: "But that's not what the tool counts. It counts a total which includes an unknown, and likely majority, number of false positives."
    Him: "Well, yes."
    Me: "So what use is that kind of a chart then?"
    Him: "Well, you can get a line or bar graph that shows how much progress is made in removing them."

    Your next line is:

    Me: "So you're selling us a tool that generates a lot of false warnings, and a measurement of how much unnecessary extra work we've done to eliminate the false warnings. Wouldn't it make more sense not to use the tool in the first place and spend that time actually fixing real bugs?"

    To work, this question must be asked with the near-hypnotized manager watching.
  • Re:In short, YMMV (Score:2, Insightful)

    by Anonymous Coward on Monday May 19, 2008 @02:38PM (#23465292)
    Also beware of programmers who believe the comments of inherited code without actually looking into it.
  • by SpryGuy (206254) on Monday May 19, 2008 @02:38PM (#23465296)
    If you're doing C# development, you should really check out JetBrains' ReSharper. Version 4.0 is due out soon, which supports all the C# 3.0 syntax (extension methods, LINQ, lambda expressions, etc), but even 3.x is a worthwhile tool. It does real-time syntax checking (as you type) so you don't have to compile to find out you have a syntax error, as well as tons of refactorings, and very useful static code analysis.

    Once you develop with ReSharper, you really can't go back to using VS without it... it's like coding with stone knives and bear skins.
  • by Chris Snook (872473) on Monday May 19, 2008 @02:38PM (#23465298)
    Because they cannot solve the halting problem, there are many instances where they will see a questionable piece of code, and have to decide whether they should flag it and risk a false positive, or ignore it and risk a false negative. This is where the magic happens, at least in the high-end commercial code analysis tools. If it always errs on the side of false positives, the output will be ignored in all but the most thoroughly audited fields. If it always errs on the side of false negatives, it's worthless. A lot of work goes into analyzing which practices commonly cause problems in the real world, and fine tuning the problem detection code to look for those, while perhaps passing up certain classes of bugs that are very rare and very computationally difficult to identify.
  • Re:In Short, Yes (Score:5, Insightful)

    by Anonymous Coward on Monday May 19, 2008 @02:45PM (#23465364)

    The proper answer would be: No. A fully working static code analyzer would be like solving the Halting Problem, which has been proven to be impossible. Essentially you can just try to catch as many potential problems as you can, but you can never catch them all.
    I hate it when the halting problem is trotted out as "proof" that formal verification is impossible. If you like to put intractable recursion in your code then you probably shouldn't be a programmer. (Maybe you could draft legislation instead.) In practice, you should be able to prove (at least informally) that your program halts when it's supposed to.

    The only real significance of the halting problem is to demonstrate that there can be some pretty absurd programs out there. It is not an indictment of static analyses. Nor is it an excuse to have less than total confidence in the correctness of your code.
  • Re:In Short, Yes (Score:5, Insightful)

    by pnewhook (788591) on Monday May 19, 2008 @03:08PM (#23465672)

    Would it not make sense to run this tool to catch these types of errors before wasting everyone's time in a code review?

    By the time you get to code review and test, you should be catching logic errors, not stupid syntactical and poor code style ones. If the tool helps a developer clean up and catch the obvious stuff, then testing can be much more productive catching the real problems.

    Basically, if the tool helps reduce errors then it is useful. Same comment goes for code complexity checkers. No tool will catch everything, but then again you shouldn't be depending on it to.

  • they are useful (Score:3, Insightful)

    by elmartinos (228710) on Monday May 19, 2008 @03:10PM (#23465700) Homepage
    It's not a trend; it is something developers have been doing for a long time. We have a build system here that automatically compiles and runs unit tests, and when something fails the developer gets an email. We try to automate as much as possible, so we also have several static code analysis tools like PMD, FindBugs, and Checkstyle installed. None of them is perfect, but they all detect at least some problems; it's better than nothing. It is also important that these tools can be switched off so that they don't get annoying. PMD does this very nicely: you can disable checks at method-level granularity with a simple annotation at places where appropriate.
  • Re:In Short, Yes (Score:4, Insightful)

    by _Swank (118097) on Monday May 19, 2008 @04:19PM (#23466700)
    Though, to be fair, "rgrep scanf" is a crude form of static analysis. So you haven't avoided it entirely.
  • Re:In Short, Yes (Score:3, Insightful)

    by just_another_sean (919159) on Monday May 19, 2008 @04:25PM (#23466780) Homepage Journal

    By the time you get to code review and test, you should be catching logic errors, not stupid syntactical and poor code style ones. If the tool helps a developer clean up and catch the obvious stuff, then testing can be much more productive catching the real problems.

    Sounds like a good way to teach developers about these stupid errors as well. As someone whose knowledge of programming is self-taught, I learned a long time ago to pay attention to all errors, warnings, and output from tools like lint, to add to my understanding of the correct way to do things.
  • Re:In Short, Yes (Score:4, Insightful)

    by pthisis (27352) on Monday May 19, 2008 @04:33PM (#23466880) Homepage Journal

    I disagree. Think of a loop where a break condition depends on the validity of, say, Goldbach's conjecture.


    That would be one of the absurd programs the GP was slamming. But a program where the break condition depends on, say, the user's input isn't amenable to static analysis and is perfectly reasonable and useful.

    But you don't need to be perfect to be decent. A lot of static analysis can't tell what will happen, but can warn you if some code is unreachable, if no path will ever free memory, if a loop runs off the end of a memory allocation, etc.

    The Linux kernel uses a lot of static checking tools to pretty great effect (sparse, for one, is extremely helpful, and the Stanford checker found a lot of problems too).
  • by Allador (537449) on Monday May 19, 2008 @07:26PM (#23468790)
    The 'me' in this case is missing the point.

    You don't just run the tool over and over again and never adapt it to your code.

    If it produces a bunch of false positives, then you go in and modify the rules to not generate those false positives.

    That's half the point of something like this: you need to tune it to your project.

    The flip side is that if you see some devs over and over making the same kind of mistake, well you can write a new rule in it to flag that kind of thing.

    If you have an endless number of false positives that doesn't ever go down, then you are either:

    1. Not using the tool correctly.
    or
    2. Not working on a project that is amenable to this tool.

    IME, the vast majority of the time it's #1. Now you may find that for certain small or narrowly scoped projects, or those worked on by 2 super-gurus, the overhead of learning and tuning the tool for that project isn't worth it. But that's something you'd have to find out yourself, and it differs from project to project.
  • by benhattman (1258918) on Monday May 19, 2008 @09:46PM (#23469898)
    Even if such a tool only catches a couple of errors, it is probably worth the investment. If there is one intermittent error on a subset of your target platforms, and this tool catches it, that can easily save hours of debugging work. Considering engineering rates, these tools will pay for themselves quickly.

    Unless engineers begin to rely on them! If I stop thinking about dereferencing null pointers because my tool catches 90% of them, I haven't gained a thing.
  • by benhattman (1258918) on Tuesday May 20, 2008 @12:09AM (#23470894)

    Me: "OK, but you said not everything it flags there is a bug, right?"
    Him: "Yes, you need to actually look at them and see if they're bugs or not."
    Me: "Then what sense does it make to generate charts based on wholesale counting entities which may, or may not, be bugs?"
    Him: "Well, you can use the charts to see, say, a trend that you have fewer of them over time, so the project is getting better."
    Me: "But they may or may not be actual bugs. How do you know if this week's mix has more or fewer actual bugs than last week's, regardless of what the total is?"
    Him: "Well, yes, you need to actually look at them in turn to see which are actual bugs."
    Me: "But that's not what the tool counts. It counts a total which includes an unknown, and likely majority, number of false positives."
    Him: "Well, yes."
    Me: "So what use is that kind of a chart then?"
    Him: "Well, you can get a line or bar graph that shows how much progress is made in removing them."
    Your next line is:
    Me: "So you're selling us a tool that generates a lot of false warnings and a measurement of how much unnecessary extra work we've done to eliminate the false warnings. Wouldn't it make more sense not to use the tool in the first place and spend that time actually fixing real bugs?"
    To work, this question must be asked with the near-hypnotized manager watching.
    Meh, I would respond with a second graph based on the first one titled "Static Code Analysis Warnings Before/After This Consultant Was Hired." It would show the warnings produced by the tool one week before your org started using it vs the warnings each week since the consultant was hired.

    Maybe that would be enough to convince him that more warnings from the tool do not necessarily mean he'll keep his job.
