Using Redundancies to Find Errors 338
gsbarnes writes "Two Stanford researchers (Dawson Engler and Yichen Xie) have written a paper (pdf) showing that seemingly harmless redundant code is frequently a sign of not so harmless errors. Examples of redundant code: assigning a variable to itself, or dead code (code that is never reached). Some of their examples are obvious errors, some of them subtle. All are taken from a version of the Linux kernel (presumably they have already reported the bugs they found). Two interesting lessons: Apparently harmless mistakes often indicate serious troubles, so run lint and pay attention to its output. Also, in addition to its obvious practical uses, Linux provides a huge open codebase useful for researchers investigating questions about software engineering."
Here's a text link (Score:4, Informative)
http://216.239.37.100/search?q=cache:yuZKW8CjTqIC
More details / PostScript version (Score:2, Informative)
Appeared in FSE 2002. Finds funny bugs by looking for redundant operations (dead code, unused assignments, etc.). From empirical measurements, code with such redundant errors is 50-100% more likely to have hard errors. Also describes how to check for redundancies to find holes in specifications.
Link to PostScript file for easy viewing/printing
File [stanford.edu]
html (Score:3, Informative)
For future woud be researchers (Score:1, Informative)
Re:I have no idea what this article means ! (Score:3, Informative)
That paper explored the hypothesis that redundancies, like type erros, flag higher-level correctness mistakes. They evaluated the approach using four checkers which they applied to the Linux operating system. These simple analyses found many superising (to them) error types. Further, those errors correlated well with known hard errors: redundancies seemed to flag confused or poor programmers who were prone to other error types. According to them, these indicators could be used to decide where to audit a system.
Re:Is this not Obvious (Score:5, Informative)
They used a custom checker that finds these things much more effiectivly than lint.
I actually remember the flood of bug reports and kernel patches that toy of theirs generated the first few months they put it to use on the kernel.
lclint pointer incorrect... (Score:3, Informative)
Larch FTP Site
January 28, 1999
Many files formerly on this site were moved elsewhere after a disk
crash in March, 1998.
The LCLint distribution can be found at
ftp://ftp.sds.lcs.mit.edu/pub/lclint
or http://www.sds.lcs.mit.edu/lclint
Linux for research (Score:2, Informative)
An example of such is http://plg.uwaterloo.ca/~migod/papers/evolution.p
Re:Saw his talk at FSE (Score:1, Informative)
I've long wished that Gimpel Software would release a Linux version of their lint -- it does everything described in the PDF article and much more.
Related research at Berkeley: CQual (Score:2, Informative)
It's been used to find security holes.
Eliminate Redundancies with AspectJ (Score:2, Informative)
Re:Parallel programming 101 (Score:2, Informative)
Gee, reread yourself... Sorry but they're 100% right. If they could acquire the lock, cam and cam->ops cannot be NULL due to the first check.
BTW, in 2.4.18, the code now looks like this for this same function:
struct cam_data *cam = dev->priv; int retval; if (!cam || !cam->ops) return -ENODEV; DBG("cpia_mmap: %ld\n", size); if (size > FRAME_NUM*CPIA_MAX_FRAME_SIZE) return -EINVAL; /* REDUNDANT! */
if (!cam || !cam->ops)
return -ENODEV; /* make this _really_ smp-safe */
if (down_interruptible(&cam->busy_lock))
return -EINTR;
Re:lint is horrible (Score:3, Informative)
You can find duplicated Java code... (Score:2, Informative)
http://pmd.sourceforge.net/cpd.html
CPD uses a variant on Greedy String Tiling to find duplicated code in Java programs. There's also a JavaSpaces version since finding duplicated code is fairly parallelizable....
Yours,
Tom
Re:Error in paper (Score:3, Informative)
Re:Parallel programming 101 (Score:2, Informative)
>wouldn't cam->busy_lock cause a segfault
I don't feel bad about being flamed by people that missed my point, but I do feel bad about this. You are 100% right - the way I read the code can't possibly be correct.
--
GCP