Properly Testing Your Code?

lowlytester asks: "I work for an organization that does testing at various stages, from unit testing (not XP style) to various kinds of integration tests. With all this, one would expect that defects in the code would be close to zero. Yet the number of defects reported is so large that I wonder: how much testing is too much? What is the best way to get the biggest bang for your testing buck?" Sometimes it's not the what, it's the how, and in situations like this, I wonder if the testing procedure itself may be part of the problem. When testing code, what procedures work best for you, and do you feel that excessive testing hurts the development process at all?
  • by Anonymous Coward on Wednesday June 19, 2002 @05:26AM (#3727830)
    ... is to not make the mistake in the first place. This may sound kind of stupid, but it's true. Don't skimp on sleep so you stay properly awake, don't run yourself on Coca/Pitr Cola, eat good food, go for walks, and you'll find yourself making far fewer mistakes and producing better quality stuff. And _double_check_ everything.
    • Of course properly written functionality test scripts (doing what the user does) will find most bugs. The downside is that it is boring to follow test scripts manually.

      My company has been successful implementing automated functionality tests with Rational Robot (part of TeamTest). If you just take the time to define proper test scripts, you can easily redo all functionality tests on various platforms (if you use VMware or similar software to simulate different platforms) at the click of a button.

      This saves time every release as the developers can focus on finding the really tough bugs instead of running boring functionality tests again.
    • by eam ( 192101 ) on Wednesday June 19, 2002 @06:24AM (#3727996)
      I agree. The whole concept is flawed. Ultimately the problem with too many bugs is not a "testing" problem, but a "design" and "implementation" problem.

      The flaw in the thinking is the assumption that all bugs are inevitable. You accept as given the idea that the bugs have to be found and corrected. It actually is possible to avoid introducing the bugs in the first place.

      The sad thing is, it is likely true in every case that *avoiding* the bugs is cheaper than *correcting* the bugs. Yet we keep introducing bugs & assuming they will be found & corrected later.
    • by sql*kitten ( 1359 ) on Wednesday June 19, 2002 @06:41AM (#3728037)
      ... is to not make the mistake in the first place. This may sound kind of stupid, but it's true. Don't skimp on sleep so you stay properly awake, don't run yourself on Coca/Pitr Cola, eat good food, go for walks, and you'll find yourself making far fewer mistakes and producing better quality stuff.

      The question is what type of mistake. Is your program crashing a lot? Then see the above poster. Is your program generating the wrong results? Then the problem is that you have not specified rigorously enough. With good engineering specs, the actual code is just data entry.
      • most of us actually got good specs? Been close to 3 years now for me. With half-assed specs derived from business users who A) don't know what they want and B) don't know when they're out of their league when talking about how something should work, you're pretty much screwed from the beginning.
        • by scott1853 ( 194884 ) on Wednesday June 19, 2002 @07:26AM (#3728194)
          I don't let customers dictate how programs should work. I make them tell me what information they have to enter, and what they want to get back out. I decide on mostly everything in the middle.
          • by sql*kitten ( 1359 ) on Wednesday June 19, 2002 @08:10AM (#3728438)
            I don't let customers dictate how programs should work. I make them tell me what information they have to enter, and what they want to get back out. I decide on mostly everything in the middle.

            Then you aren't writing particularly complex software. If your users need software that does sophisticated processing, mathematical or otherwise, then the programmer probably isn't the best person to work out how it should do it. This is true whether you're working on software for pricing derivatives, or for tracking shipments in a supply chain, or for controlling manufacturing machinery. That's why there are notations like UML, so that functional experts can communicate unambiguously to the software developers what a system should be doing. A good programmer knows about programming, a good analyst knows about business processes, some people are both, but only with years of experience, and even then, only within a single industry.

            The requirements, specification and analysis process is what separates software engineering from "hacking".
            • My original post was a little ambiguous. My company provides software for construction companies for different phases of their business operations. So yes, there are complex mathematical formulas. On the other hand, complex is an ambiguous term in itself. Something is only complex when you can't understand it.

              But I was referring more to the actual user interface and db support. The formulas are standard and fairly easy to write and debug. But any time a user requests that they have a button somewhere, I try to cut them short and just ask them what they want the end result to be. If they want some custom analysis, then yes, they need to let us know what the formula is.
        • Maybe this is a topic for another discussion, but I don't expect business user to supply me with specs. They aren't software guys, they are business users. I expect them to supply me with business requirements and business process definitions, from which we will together develop the software process definitions from which the specs evolve.
        • Sad but true.....

          Boss: Go write some code that does some stuff

          Me: Well what about this? I need this info.

          Boss: Well just start working and we will fill that stuff in later.

          How the heck can I write good code when I am hardly told what the application is supposed to do? So I write something, and it doesn't take into account the missing details that I asked about. Those get defined two weeks after the thing is supposed to be done. The app turns out terrible and then the powers that be want to know why it has problems. It is incredibly frustrating.

    • by Hater's Leaving, The ( 322238 ) on Wednesday June 19, 2002 @06:53AM (#3728083)
      Better than double checking everything is to have an external eye code review everything. It's probably a 10% overhead when it comes to the coding side, but a >50% decrease in the debugging side. Well worth it.

      I'm currently on sabbatical, but I consult 1 day a fortnight for a couple of small local companies who can't afford me full time - all I do there is code review, and they are of the opinion that I more than double the effectiveness of their less experienced programmers.

      THL.
      • by Xentax ( 201517 ) on Wednesday June 19, 2002 @07:32AM (#3728230)
        I agree -- our own company suffers from putting less effort into code reviews than most of us know we should. People try to save time by under-planning for code reviews, but that saved time is always lost at least twofold in uncaught bugs, extra time for optimization, and so on -- all things that would be identified in a solid code review.

        Identify the people in your company that have the best "critical eye" for each language you develop in -- and see to it that they get the time they need to really critique code, either during implementation or at least before starting integration testing (preferably before unit testing, actually). It may be hard to convince management the first time around, but if you account your hours precisely enough, the results WILL speak for themselves, in terms of hours spent overall and on testing/integration.

        Xentax
    • the best way to test code is to not make the mistake in the first place.

      But the way to make solid code is to get each bug out as soon as you put it in.

      Over my thirty-five years of professional programming I developed a coding/testing method that produces extremely solid code - to the point that one of my colleagues once commented that [Rod] is the only person he'd trust to program his pacemaker. B-) It goes like this:

      Design ahead of coding.

      Of course! If you haven't spent more time designing than you eventually spend coding, you likely haven't yet understood the problem well enough.

      This doesn't mean you have to understand every nitty-gritty detail before you write your first line of code - you'll discover things and change your design as you go. But you should know where you're going. And as you go, map the road ahead of you.

      Not coding until you understand where you're going is VERY scary to administrators. But it gets you to your destination MUCH sooner than striking out randomly.

      Get the modularity right.

      Think of the final solution as a knotted mass of spaghetti threaded through meatballs inside an opaque plastic bag. Squeeze it around until you find a thin place, where few strands (representing flows of information) connect the halves of the solution on each side of the bag. Cut the bag in half here and label the strands: You've defined a module boundary and interface. Repeat with each of the two smaller bags until the remaining pile of bags each represent an understandable amount of code - about a page - or a single irreducible lump of heavily-interconnected code (one "meatball"). Then tear them open one-by-one and write the corresponding code.

      Debug as you go:

      This is the key!

      Program top-down like a tree walk, stubbing out the stuff below where you're working. As you write the code, also write one or more pieces of corresponding test-code that produces output that is a signature of the operation of every bit of the code you write, and an expected-output file that is hand-generated (or computer-generated by a different tool, preferably written by someone else, or at least in a different language and style if you're alone).

      Use a system tool (like diff or cmp) to compare the results (in preference to writing programs to "check" it, so you don't have to worry whether a passing test means the code is right or just that the checking program is broken.)

      Run the test(s) every time you make a change or add code. Make a single change at a time between test runs, get it working and tested before you move on. (This is easy for procedural modules and subroutines. For instance: You can build the running of the test into your makefile, and fail the make if the test fails. GUI stuff is tougher, and I didn't have to deal with it myself. But tools are now available to perform similarly there.)
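
      A rough sketch of what such a signature test might look like in Java (the tokenize method, file names and output format are all invented for illustration; tokenizer.expected is the hand-generated file):

        // SignatureTest.java - drive the code under test, write a signature of
        // what it did, and let the system 'diff' compare it against the
        // hand-generated expected file. Exit non-zero so make can fail the build.
        import java.io.IOException;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.util.Arrays;

        public class SignatureTest {
            // Stand-in for the real module under test.
            static String[] tokenize(String s) { return s.trim().split("\\s+"); }

            public static void main(String[] args) throws IOException, InterruptedException {
                StringBuilder sig = new StringBuilder();
                for (String in : new String[] {"", "one", "one two", "  padded  "}) {
                    sig.append('"').append(in).append("\" -> ")
                       .append(Arrays.toString(tokenize(in))).append('\n');
                }
                Files.writeString(Path.of("tokenizer.actual"), sig.toString());

                // The comparison is done by a system tool, not by hand-rolled check code.
                int status = new ProcessBuilder("diff", "-u", "tokenizer.expected", "tokenizer.actual")
                                 .inheritIO().start().waitFor();
                System.exit(status == 0 ? 0 : 1);
            }
        }

      Wired into the makefile as a post-build step, a failing diff aborts the build, which is exactly the one-change-at-a-time discipline described above.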

      The result is that your bugs will generally be confined to the changes you just made, drastically limiting your search space and shortening your debugging time.

      Do COVERAGE testing, and TRACK it.

      Don't move on until you have exercised every bit of code where you were working. "Exercise" means your tests have been updated to execute every line or component of a line, driving it to its edge and corner cases and extracting a signature that shows they're doing their job correctly.

      Automated coverage tools are an inadequate kludge to try to do this after the fact. Unfortunately, they pass code once all the branches have been executed, but have no idea whether they did the right thing. They may test that you hit the edge conditions - but can they tell if the edge is off-by-one? Human intelligence, with its knowledge of intended function, is required.
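
      To make the off-by-one point concrete, here is a tiny invented example of driving an edge from both sides (run with assertions enabled, java -ea):

        class BoundaryExample {
            // Branch coverage is satisfied by any one age below 18 and any one above,
            // yet only probing the exact boundary catches '>' written where '>=' was meant.
            static boolean isAdult(int age) { return age >= 18; }

            public static void main(String[] args) {
                assert !isAdult(17) : "just below the edge";
                assert  isAdult(18) : "the edge itself";
                assert  isAdult(19) : "just above the edge";
                System.out.println("boundary exercised from both sides");
            }
        }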

      I developed a style of marking listings to document coverage, which is why I use hardcopy even in this day of glass terminals.
      - a vertical bar beside a line that is completely working. Cross-mark across its top (T) at the first of a set of working lines, across the bottom of the last of a set.
      - an "h"-shaped mark beside one that represents a partially-tested branching construct (if, for, do, "}", ...} with a cross across the right if the branch case is untested, across the left if the through case is untested. Switch to vertical bar when fully tested (including hitting the edge from both sides if applicable)
      - For compounds (e.g. "for( ; ; ) {" or " ? : "), underline the portions fully tested, put an "h" with crossbar through those partially tested.
      - Declarations "pass" when you've checked that the type is right for the job and at least one hunk of client code uses them.
      - Comments "pass" pretty much automatically when you think they're right.
      - A place where code is not yet present, or where the code above and below is tested but the flow across the gap is not, gets a break in the vertical line, with crossbars, as if there were an untested blank line "in the cracks". (But there should be a comment in there mentioning it. I start such comments with "MORE", so I can find any that are left with an editor. Similarly a MORE comment marks where I've stopped coding for now.)

      When the code is done-and-coverage-tested there's a vertical bar beside all of it. (Sometimes you have to add test code temporarily to make something visible externally, but #ifdef or comment it out rather than removing it when you're done.)

      The result is like growing a perfect crystal, with flaws only in the growing surface. When you've tested a part you're usually DONE. You never have to return to it unless you misunderstood its function and have to change it later, or if the spec changes.

      DOCUMENT!

      Co-evolve a document as the project develops if there's more than one on it, or if it has to be handed off to others later. If you're alone, you can get away with heavy comments.

      Put in the comments even if you have the document.

      Keep the comments up to date.

      Comment heavily even if it's just you and always will be. When you come back to the code (especially if you're following this methodology and only get back to it MUCH later) you'll have forgotten what you were thinking. So put it all down to "reload your mental cache" when you get back to it.

      The document should be a complete expression of the intended operation of the code - but in a very different and human-understandable form. (Especially not just pseudo-code for the same thing, or "i = i+1; add one to i". Use pseudo-code only for an illustration, not an explanation.) Remember: Testing can NEVER tell if it's RIGHT. It can only tell if two different descriptions match. "Self-documenting code" is an untestable myth - all that can be tested is whether the compiler worked correctly.

      There's more but I have to go now. I'll try to continue later. The above contains the bulk of the key stuff.

  • by rattler14 ( 459782 ) on Wednesday June 19, 2002 @05:28AM (#3727836)
    before each compile, one should make a small sacrifice to the debugging gods and ask them to forgive you for your syn(tax).
  • by cliveholloway ( 132299 ) on Wednesday June 19, 2002 @05:33AM (#3727849) Homepage Journal
    They use Beta testers - they have the largest group of Beta testers out there - otherwise known as their customer base :)

    boom boom

    cLive ;-)

  • Testing (Score:2, Informative)

    I think JUnit-style testing works great, and I plan to start using it more often.
    Testing is good to verify that your code does exactly what you think it does; a lot of the time I produce code that I "think" works, using JUnit allows me to verify that it actually does.
    Check out junit.org [junit.org].

    For those of you who are sceptical about unit testing, you should try it. Setting up the tests is not as tedious as one might think, they force you to think your problem through, and maybe most of all: they make your build look cool :)
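
    For a flavour of it, a JUnit (3.x style) test case is just a class of testXxx methods plus assertions; this sketch exercises java.util.Stack rather than anything project-specific, purely as an illustration (it only assumes junit.jar on the classpath):

      import java.util.EmptyStackException;
      import java.util.Stack;
      import junit.framework.TestCase;

      // Each testXxx method is run by the JUnit test runner; a failed
      // assertion marks the test red instead of crashing the run.
      public class StackTest extends TestCase {

          public void testPushThenPopReturnsLastElementFirst() {
              Stack s = new Stack();
              s.push("first");
              s.push("second");
              assertEquals("second", s.pop());
              assertEquals("first", s.pop());
              assertTrue(s.isEmpty());
          }

          public void testPopOnEmptyStackThrows() {
              try {
                  new Stack().pop();
                  fail("expected EmptyStackException");
              } catch (EmptyStackException expected) {
                  // the failure mode is part of the contract, so test it too
              }
          }
      }
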
  • by korpiq ( 8532 ) <`-.' `at' `korpiq.iki.fi'> on Wednesday June 19, 2002 @05:34AM (#3727853) Homepage

    One wonders how your development has been organized. Everybody here should know the basics of software engineering, including but not limited to:

    1) document APIs exactly, including definitions of legal and illegal data sets

    2) separate test group from programmers

    3) separate quality assurance from both API testing and programmers.

    Well, that's the theory. I've never worked in a place where that would have been implemented. Instead, people trying to bring this in have been kicked out. In practice, maybe one should try to get a feeling of each API: how is it supposed to be used? Use each piece of software only within the implicit limits of its programmer's idea to keep the number of bugs down. Not to mention the obvious coding style mantras.
    • by Twylite ( 234238 ) <[twylite] [at] [crypt.co.za]> on Wednesday June 19, 2002 @06:14AM (#3727968) Homepage

      This is excellent advice. In my experience, the most stable code comes from pragmatic design followed up by pragmatic coding.

      Design your system thoroughly. Identify every component, and the minimum interface required for that component. Carefully document that interface (API) - use Design By Contract (preconditions, postconditions and invariants) if possible.

      Moving targets mean that the API will almost certainly have to be extended - documentation on the design and intent of the component/API as a whole will reduce the pain of this process. The responsibility for this documentation is shared between the design and implementation phases. Pay careful attention to documenting assumptions made within the code, e.g. ownership of member/global variables.

      When it comes to coding, start with a skeleton. Put in the API function/method as defined, then check/assert every pre/post condition. Think about how any parameter could be out of range, or violate the assumptions you make. Once you are happy you're checking for all illegal use, you can go on to code the internals.
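
      A minimal sketch of such a skeleton, with an invented Account class (the contract checks are the point, not the banking; the asserts need -ea at runtime):

        // Preconditions reject illegal use loudly; postcondition and invariant
        // checks ride along as asserts that can be compiled out in production.
        public class Account {
            private long balanceCents;                  // invariant: never negative

            /** Precondition: 0 < amountCents <= current balance. */
            public void withdraw(long amountCents) {
                if (amountCents <= 0)
                    throw new IllegalArgumentException("amount must be positive: " + amountCents);
                if (amountCents > balanceCents)
                    throw new IllegalStateException("insufficient funds");

                long before = balanceCents;
                balanceCents -= amountCents;            // the actual work

                assert balanceCents == before - amountCents : "postcondition violated";
                assert balanceCents >= 0 : "invariant violated";
            }
        }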

      When coding internals, remember that you cannot trust anything (with the possible exception of other code in the same component). Check/assert the return values (and in/out parameters) of all calls you make. Have a well-defined system-level design for error handling, that doesn't allow the real error (or its source, if possible) to get lost.

      As for testing, I'm all for the XP method: write your test cases first. This helps you to think about what your API is doing, how you are going to actually use it, and what you can throw at it that may break it (helping you to lock down the pre/postconditions).

      You must use regression tests! Testing is useless if it's done once but the code is modified afterwards. Have a library of test cases, and use all of them. Every time a bug is found, add a test case for that bug, and ensure it is regression tested every time.
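
      In JUnit terms, that library might grow one test method per fixed bug; the date parser, its behaviour and the bug numbers below are all invented just to show the shape:

        import java.text.ParseException;
        import java.text.SimpleDateFormat;
        import junit.framework.TestCase;

        public class DateParserRegressionTest extends TestCase {

            // Tiny stand-in so the example is self-contained; imagine the real class.
            static boolean isValidDate(String s) {
                try {
                    SimpleDateFormat f = new SimpleDateFormat("dd/MM/yyyy");
                    f.setLenient(false);
                    f.parse(s.trim());          // the trim() is the "fix" for bug #1107
                    return true;
                } catch (ParseException e) {
                    return false;
                }
            }

            // Bug #1042: 29/02/2000 was once rejected although 2000 is a leap year.
            public void testBug1042LeapDayAccepted() {
                assertTrue(isValidDate("29/02/2000"));
            }

            // Bug #1107: a trailing space used to make parsing fail outright.
            public void testBug1107TrailingSpaceTolerated() {
                assertTrue(isValidDate("01/03/2002 "));
            }
        }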

      Code audits can detect and solve a lot of common implementation bugs. Use them to look for unchecked pointer/buffer use, for assumptions about the return values or success/failure of functions, and to check that asserts are used correctly and accurately.

      In my experience most bugs do NOT come from implementation errors, but from developer misunderstanding, especially late in a project or in maintenance, or even during bug fixing! A developer must fully understand the code (s)he is working on, and all the assumptions it makes. Never adjust a non-local variable without first checking all other functions that use or modify that variable, and understanding the implications. Never use a function or method without understanding all the side effects (on parameters and scope/global state). This is why all of this information should be documented, and audits performed to ensure that the documentation is accurate.

    • You seem to know what you're talking about, so are there any good books that cover the software design process? A book that covers what should be flowcharted and how detailed it needs to be, as well as writing good specifications and what should be contained in them?
    • Re (2) and (3), having separate groups is a *really* good way to get the old "them-and-us" battle going. The testers hate the programmers bcos they see themselves as covering up for the programmers' lack of skills; the programmers hate the testers bcos the testers keep telling them that they're screwing up; and everyone hates QA for telling them how to do their job.

      The problem is that very few bugs occur at the purely code stage, and most of them are easy to trace. The real problem is design bugs - they're the killers. The solution at our company is to (a) review, (b) separate development, and (c) follow a V-model quite strictly. We have one group of engineers, where everyone writes, codes, reviews and tests, and for each section, everyone will have a different role so no-one gets stuck just being the tester.

      Someone writes a requirements spec, and also writes the system test spec by which they'll prove that the thing works as expected. Writing the test spec forces you to put numbers to your requirements, basically making you self-review the requirements for errors, and it finds a lot of bugs. Someone else checks that the requirements spec makes sense, and also that the system test spec matches up with the requirements spec. And typically we spend over a quarter of our time at the requirements stage, bcos it's dead easy to edit a single line in a Word document but it's a helluva lot more difficult to change a zillion C files, test specs, etc.

      If the system is simple, we'll go straight to code. But if the system is complex, we do a detailed design (usually in Simulink and Stateflow these days, since we're coding for embedded control systems). The person who does the design MUST NOT be the same person who wrote the requirements - this effectively gives us another review of the requirements, to catch any oddities which can't be implemented. And someone will review that the design is actually meeting the requirements.

      The same person who's done the detailed design will also write a test spec to say how to test the code. This will cover all boundary conditions, so any time there's a comparison, for instance, we'd check that it gets ">=" instead of just ">". We've got an in-house tool which allows us to write test specs in Excel and run them on code automatically. And someone will review this to make sure it matches the design and covers all cases.

      Then someone writes the code. The coder MUST NOT be the designer - as with the requirements/design separation, this gives us a free review. They'll put that through Lint to check for obvious problems (we follow the MISRA coding standard, with a few exceptions where MISRA didn't get it right), and then run their code through the test spec. If it fails, they'll look for the bugs in the code. Sometimes they find bugs in the test spec; in that case the test spec gets modified. Having an automated test spec means we can run tests on code with zero overhead for repeating the tests.

      And then the code will be run against the system level tests, and hopefully everything passes. If it doesn't, the system level test has found a bug in the design, where the design isn't meeting the requirements. Rinse and repeat.

      It's worth saying that we're writing code for automotive control systems. How much testing to do is really a trade-off of the cost of testing against the cost of failure, and in our case, failure is not an option!

      Grab.
  • Need more data... (Score:4, Insightful)

    by joto ( 134244 ) on Wednesday June 19, 2002 @05:36AM (#3727858)
    It's impossible to tell what's wrong in your case, since all you've said so far is akin to "we find lots of errors, should we test less?"

    And the answer to that is of course: "No, you should test more, and fix the bugs". And of course, looking over your development model to see why you have so many errors might be a good idea (such as formalizing who can commit code, if you've got lots of programmers at various skill levels).

    But in real-life, many bugs are not that important, and time-to-market and development cost is more important.

    So unless you provide us with more data, such as ...

    development process, type of product, size of product, complexity of product, severity of bugs, age of most bugs found, bug-review process, testing schedule, testing process, how you manage your version control tree, whether you are on or behind schedule (and if behind, how much), how management deals with the problem, etc...

    ...I don't think anyone will be able to give any good advice either.

  • by JohnT ( 124054 ) on Wednesday June 19, 2002 @05:38AM (#3727862) Homepage
    My experience has shown that the number one way to find defects is code reviews performed by other developers who can read the code and also understand the intended functionality. This will catch 90% of all defects before they are even released to QA.
    For more information, the developer's bible (IMHO), Code Complete (available on Amazon and elsewhere), has some good information on testing strategies and some hard numbers on the effectiveness of testing. Good luck.
    • by Lumpish Scholar ( 17107 ) on Wednesday June 19, 2002 @06:27AM (#3728002) Homepage Journal
      My experience has shown that the number one way to find defects is code reviews performed by other developers who can read the code and also understand the intended functionality.
      Violent agreement, for the following reasons:

      Study after study has confirmed: Code reviews find more defects, per staff hour, than any other activity, including all kinds of testing.

      Aside from that, the benefits of having more than one person aware of each change to the code are significant. If George is sick, or quits, or wants to go on vacation, it's not just George who can make the next change.

    • by ssclift ( 97988 ) on Wednesday June 19, 2002 @06:47AM (#3728054)

      Again, violent agreement. Why? Testing is basically just writing the code again, only in a more restricted form. You take a known input, and then program the output expected (rather than derive it another way) and then compare the two implementations.
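
      A small invented illustration of that idea: check a supposedly clever routine against a dumb restatement of the same problem over a pile of random inputs.

        import java.util.Arrays;
        import java.util.Random;

        public class DifferentialTest {
            // "Production" code: meant to be the fast version.
            static int indexOfMax(int[] a) {
                int best = 0;
                for (int i = 1; i < a.length; i++) if (a[i] > a[best]) best = i;
                return best;
            }

            // Restatement used only for checking: slow but obviously right.
            static int indexOfMaxReference(int[] a) {
                int max = Arrays.stream(a).max().getAsInt();
                for (int i = 0; ; i++) if (a[i] == max) return i;
            }

            public static void main(String[] args) {
                Random rnd = new Random(42);            // fixed seed keeps runs repeatable
                for (int run = 0; run < 10000; run++) {
                    int[] data = rnd.ints(1 + rnd.nextInt(20), -5, 5).toArray();
                    if (indexOfMax(data) != indexOfMaxReference(data))
                        throw new AssertionError("disagree on " + Arrays.toString(data));
                }
                System.out.println("both statements of the problem agree on 10000 random inputs");
            }
        }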

      Inspection, on the other hand, compares the program with it written in another form: human language. Since human language is generally too vague to execute automatically, the only way to test the equivalence of the two is to inspect.

      By far the best inspection book is Software Inspection by Tom Gilb [amazon.com]. His very generous web site [result-planning.com] contains a ton of supplementary material [result-planning.com].

      Remember, proving that two arbitrary programs are equivalent is undecidable in general (it reduces to the halting problem), so in practice you are left checking particular cases. Testing is a measure of code against code, inspection a measure of code against requirements. Together they kill a lot of bugs because they find different discrepancies between three statements of the same problem.

    • I strongly disagree. It may have been true in the past, but you have to consider that languages, compilers and tools have evolved considerably since the idea of "code reviews" was first introduced.

      Designers should rely on their tools. We're a Java shop here, and we employ everything from JUnit testcases for whitebox testing to JProbe for memory leak testing. We also use JTest for lint-like output. This, combined with selective code reviews of trouble spots gives you the most bang for the buck.

      You must remember that code reviews will NOT catch the most insidious and costly mistakes, such as general architecture design flaws, race conditions, and memory leaks. You may catch a few, but people's eyes glaze over pretty quickly, and these are tricky bugs to catch.

      In short: USE THE TOOLS! That's what they're there for. They've come a long way. They're very good at what they do. And there's no substitute for firing the product up and just plain using it for a while.

  • by yogi ( 3827 ) on Wednesday June 19, 2002 @05:38AM (#3727863) Homepage
    Make sure that testing starts with each developer, so that they attempt to break all of their code before it goes anywhere.

    If you look at the guys with really low bug rates, like the NASA guys running the Shuttle control software, they have very separate test and development teams, and a competitive attitude. The test team "wins" if it finds a bug, and the developers don't want to look silly.

    Some Extreme Programming techniques, such as pair programming, may help too.

  • Maybe we all think Microsoft don't test enough... maybe they test too much.

    That's why they can't design a modular Windows. Too busy testing....
  • ... NOT tested in. If the product is poorly engineered, there should be no surprise at the vast number of bugs, no matter how much testing you do. Crap is crap.
  • Wrong problem.. (Score:4, Insightful)

    by psycho_tinman ( 313601 ) on Wednesday June 19, 2002 @05:42AM (#3727872) Journal

    The main thing: testing does absolutely nothing to minimize the number of defects in a particular application.. There are lots of other things that are as important.. ie: are these defect reports being seen by the appropriate developers and are they being acted on, what types of procedures and communication actually exists between the developer and the QA persons (assuming that they are not the same folk)..

    The last point isn't as bizarre as it sounds, I've seen lots of places where a QA person enters bugs, but the developers silently reject them ("its not a bug, that's how the program works")

    Testing just tries to discover the presence of defects, by itself, it cannot ensure that your product works perfectly (for an application of even moderate complexity, there may be an exponential number of cases and paths to check, most test cases are written for a percentage of those only).. Because of this, if you feel that you're spending too much time testing, perhaps you need to check if your test cases are appropriate to the situation and stage of development..

    Another point is that tests can be automated to some degree or the other, perhaps a scriptable tool might assist in lowering some of the drudgery associated with actually assuring the quality of your software...

    rant mode = on... Excessive testing ONLY hurts if it takes people away from development at the early or even middle stages of a project and forces them to run tests on incomplete sections of code.. otherwise, there is NO such thing as too much testing...

  • "I wonder how much testing is too much"

    Well, it's never too much, until you've done every possible thing. That's one of the advantages of open source development. A lot of people are working, playing, and coding with the beta version, which means a lot of things get tested. In my experience most errors show up with unexpected things, like missing error checks etc. A good point to start testing is to test against the developer. Ask him everything: what if I do this, what if I do that? If you think crazy enough (like normal users), you will receive a dozen 'euh, don't know' answers. That's where your errors are ;-) Most tests are based on what the system is supposed to do, not on what one can do. Given a few users, anything can be done to the program, things you likely didn't test because you understand the program too well. Try putting your mother behind the prog ;-)
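
    A crude way to automate some of that "what if I do this?" poking, sketched in Java with an invented parseQuantity standing in for whatever the UI hands user input to:

      import java.util.Random;

      public class JunkInputTest {
          // Invented example of input handling under test.
          static int parseQuantity(String userInput) {
              if (userInput == null) return 0;
              String t = userInput.trim();
              if (!t.matches("\\d{1,6}")) return 0;   // anything that isn't a plain number
              return Integer.parseInt(t);
          }

          public static void main(String[] args) {
              String[] junk = {null, "", "  ", "-1", "ten", "3.5", "1e9",
                               "999999999999", "<script>", "12 34", "\t\n"};
              for (String s : junk)
                  if (parseQuantity(s) < 0)           // must never throw, never go negative
                      throw new AssertionError("bad quantity from: " + s);

              Random rnd = new Random(1);             // plus some random keyboard mashing
              for (int i = 0; i < 1000; i++) {
                  char[] buf = new char[rnd.nextInt(10)];
                  for (int j = 0; j < buf.length; j++) buf[j] = (char) (32 + rnd.nextInt(95));
                  parseQuantity(new String(buf));
              }
              System.out.println("survived all the junk input");
          }
      }
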
  • It's all in the same package...

    If you have a well thought out and well structured design... and if every single function is documented at least mostly before coding begins... and if all of the theory of interactions fits in... then only a small amount of testing is required

    The main problems which require more testing come when corners are cut in the initial design stages, or the stages are not done as fully as they should be, or when people don't actually think about what input users can give

    For one of my projects I did ages ago for college, half of the class (which was split to give the same spread of coding ability in both halves) was asked to just write this program, and the other half was asked to design it, review the design, make a proper testing strategy, document, and THEN start writing it

    The second half took a little longer to get their program working, but the first half had more bugs and spent more time testing ... My bet is that the second half would spend less total time on that program and its bug fixes in a real-world situation (which was the whole point of the exercise)
  • For me (Score:2, Interesting)

    by pkplex ( 535744 )
    Well.. showing your work to someone important always brings up bugs, every time :)

    I suppose it's just a case of using the product like an actual user who knows nothing about it would, right from the first step.

  • by tgd ( 2822 ) on Wednesday June 19, 2002 @05:43AM (#3727880)
    If you really want to generate bug free code, you have to keep one rule in mind at ALL times. A bug occurring in the code is a failure in the methodology you are currently using to avoid them. Sounds very basic, but a lot of companies forget that.

    When you have a problem with bugs, you need to figure out where in the process the problem happened. Was the unit spec wrong? Documentation? Implementation? Unit testing procedures? Was it a correctable problem caused by the engineer involved?

    If you really want to be bug-free, every time one shows up you have to figure out why it happened, and change things.

    Personally, I think the biggest one is to make engineers work 9-5. Not 9-7, or 11-9. Tell them to go home at 5, even if they're in the middle of something. Software engineering is a very complex task that takes a lot of energy and concentration to do right. Just like Doctors who work long hours make mistakes (resulting, often, in people dying), engineers who work too long make mistakes too.

    Being in the "zone" is often the death of good code. You get lots of cool code written, but none of it is double-checked, none of it is verified to match spec, and it often ends up afterwards difficult to understand.

    Now don't get me wrong, I don't do any of this, and my crap is FULL of bugs, but that's what you need to do if you really want to help it. Writing buggy code is like a public works program for QA people. Who wants a hundred thousand unemployed anal-retentive QA people nitpicking something else, like your car's inspection or your tax forms? Better to keep them in the software where they can't do any harm ;-)
  • Design by Contract (Score:2, Interesting)

    by Yousef ( 66495 )
    Another good way to reduce errors is to follow the principles of Design by Contract.

    State, using assertions, what is expected of the code: preconditions and postconditions.
    If any of these fail, throw an appropriate exception.

  • and force feed it to others.

    Mathematical proofs and unit testing aside, make sure that your program doesn't crap out when it's being used or abused. So get some regular people to bang on it..

  • Never.. (Score:2, Insightful)

    by kb ( 43460 )
    ... trust the programmers when it comes to testing. You may find some obvious bugs in your own code, but when it comes to runtime testing most programmers tend to focus on "correct" user behaviour, or maybe the few wrong input sets they've taken care of, and completely neglect the perverse fantasy of lusers *g*

    Dedicated testers OTOH are perfect in letting a program die in the most obscure ways. Did you know that you can crash a Commodore 64's BASIC interpreter by just typing PRINT 5+"A"+-5 ? ;)
  • by dikappa ( 581761 ) <dikappaNO@SPAMexea.it> on Wednesday June 19, 2002 @05:45AM (#3727887)

    IMHO, programming and testing should be done at the same time in the development stage.

    While programming and "bugging" happen at the same time, programming and de bugging/testing should happen at the same time too.

    It is very well explained in Bruce Eckel's Thinking in Java [bruceeckel.com]. You should just test everything in the code itself, even if it happens to add some overhead. Once you've called that function, you want <something> to happen, so check it in the code.

    I know this is not the usual way procedural programming happens. It seems much more straightforward to drop the code as it comes and then check if it behaves correctly.

    But if you do so, you will often discover that the tests made afterwards are not comprehensive of all possible situations.

    And so you discover that testing and debugging are just unfinished tales, and it is even worse if the testers are not the programmers who did the work.

    Plus, I hate testing, so I force myself to do the work well and let the code (as long as possible) test itself, even if it makes development slower and more boring.

    Umhh... I'll preview this post 10 times, hoping it's free from bugs :)

    Obviously my code contains no ewwows ;)

    • The test code could well be three times the size of the normal code (IMHO, it could be much larger than that, but since I only have experience with the more standard way of debugging, I'll tone it down a bit). Doesn't this debugging code have to be bug-free as well? If the debugging code needs to go through the most of the whole process (at my place, the relevant steps would be design / code / unit-test / inspect), then it has the potential for doubling or tripling the time spent coding. It will reduce the number of bugs found later, but the number of bugs that pop up due to lack-of-regression-tests are pretty small overall. Perhaps it's not worth it to do this?
      • The test code could well be three times the size of the normal code (IMHO, it could be much larger than that, but since I only have experience with the more standard way of debugging, I'll tone it down a bit). Doesn't this debugging code have to be bug-free as well?

        Testing a solution for correctness, or probable-correctness, is usually easier than figuring out the solution in the first place. This means the test code will be smaller and less bug-prone than the code that does the actual work.

        An example: sorting a list of numbers. An efficient sorting implementation can be rather complex, but making sure it works is pretty simple, even in C: for(i = 1; i < length; i++) assert(list[i-1] <= list[i]);
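
        For what it's worth, the same kind of checker sketched in Java, which also confirms the output is a permutation of the input (sortedness alone wouldn't catch dropped or duplicated elements); the names are illustrative only, and the asserts need -ea:

          import java.util.Arrays;

          class SortChecker {
              // Leaning on the trusted library sort is fine here: this is only the
              // checker, not the implementation under test.
              static void checkSorted(int[] input, int[] output) {
                  for (int i = 1; i < output.length; i++)
                      assert output[i - 1] <= output[i] : "out of order at index " + i;

                  int[] a = input.clone(), b = output.clone();
                  Arrays.sort(a);
                  Arrays.sort(b);
                  assert Arrays.equals(a, b) : "output is not a permutation of the input";
              }
          }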

  • Literate Programming (Score:3, Interesting)

    by SWroclawski ( 95770 ) <.gro.ikswalcorw. .ta. .egres.> on Wednesday June 19, 2002 @05:50AM (#3727894) Homepage
    This may not address your current situation, but literate programming [literateprogramming.com] can often help reduce bugs and clean up errors found later.

    When a programmer is simultaneously coding and documenting their code, at both the high and low levels, the larger "thought" bugs will decrease in number and severity.

    Even if you don't use a literate programming system, often documenting the system before you write it can help make the code more clear.

    - Serge Wroclawski

  • I don't know how relevant this is, but I read somewhere long ago that newspapers ended up with more errors if they had multiple people proofreading the same text, because nobody was really taking responsibility. Even if it is not intentional, there is always a feeling of 'the other guy will pick that up'.

    I vaguely remember something similar being said about the space shuttle disaster.

  • Good design is always the best way to avoid buggy, hard to fix code.
    But for testing, it depends what you're testing.
    A good general test process for data processes (most functions can be thought of as data processes) is:

    Generate some input test data,
    Work out what the results should be by hand.
    this is your first stage regression.

    Now run the input test data through the application/function

    Diff the results against your hand-generated file.

    Any discrepancies should be resolved as
    a bug in the application/function
    or
    a bug in the hand-generated files.

    Fix all the problems in the hand-generated files.

    Repeat until your test files are perfect.

    You now have a second stage regression test,
    Known good inputs, and known good outputs

    Use the correct test files to fix the application bugs.

    If a bug is found that isn't in your second stage regression, then generate test files for it.

    Fix the bug in the application(using the test files).

    Then run your second stage regression and check that any differences are down to the bug that was found (and has been corrected).

    Following this process your application should always get better, and you should soon be able to build up a fairly large sample of test data.

    The test harness is simple enough (just a diff on the files and a bit of code to wrap up the functions) to prevent artifacts caused by the testing process.

    Anyhow, that's more or less what I do for most of my testing and bug fixing.

      • One problem that usually arises with this approach is that a very large quantity of data must be hand-created to test all the paths in the function. This is not, IMO, a flaw in the procedure you recommend. It points to a design weakness: it means that higher-level logic has failed to adequately sort different cases to be dispatched to different functions. This can be a result of abusing the notion of "hiding messy details" at the upper levels and pushing the problems down to the bottom. There must be a balance of hiding/deferring and exposing/processing of situations at every level to keep the complexity burden reasonable at each level.
      • The idea is to build up the level of test data you have,
        You could have something that generates loads of test data, and use a simple script to check that its data is OK, and even see what happens when bad data is fed in.

        If someone finds a bug, they generate test cases (and variants).

        UAT testing should generate loads of data.

        And live environments give you loads of data for full-cycle developments.

  • by James Youngman ( 3732 ) <jay&gnu,org> on Wednesday June 19, 2002 @05:55AM (#3727913) Homepage
    Well, I'm assuming that your systems start with analysis & design, followed by coding, redesign and more coding, go through unit testing (with more redesign if you are unlucky), followed by integration testing once unit testing is complete, and perhaps acceptance testing once integration testing is complete. This is more or less the traditional waterfall cycle when the deliverable is the finished code.

    This strategy works - lots of shops use it all the time. However, the real premise of the process is that you want to get through client acceptance testing as soon as possible, as long as the result is not dissatisfaction on the part of the client with the software after they've accepted it. As you have noticed this strategy doesn't actually produce bug-free code.

    This is not surprising. What you achieve is after all pretty much determined by what your goal was. You (shops in general) need to think hard about what your actual goal is. If your goal is nearly-zero-defects, then the traditional process isn't doing the right things for you. If however, your goal is to obtain milestone payments from your client, then it's pretty good. This is an area where the business goals determine the software engineering processes.

    Let's put another hat on and think about what the negative effects of this strategy might be (negative is really defined in terms of what your goals are, but let's be vague about that for a moment).

    • If your goal is not "zero bugs" then you will stop work before there are no bugs left
    • Your software is delivered to the customer with bugs in it, which the customer will find
    • Your development team will partly move on to other areas, probably leaving a smaller number of people to deal with the remaining bugs
    • Maintenance programmers are typically less skilled than some of the original team - because some of the original team have been pulled off to activities which are more important to your business (e.g. delivering another set of code in order to meet a payment milestone somewhere else).
    • Skills evaporate over time - after N years it gets very difficult to find authoritative information on why something works like it does.
    • As the code is fixed, it gets brittle. The emphasis is on just fixing bugs and making low-risk changes in order to avoid breaking the production code - hence refactoring is rare.

    All of the above factors are unpleasant for those left to maintain the code. Many of them also limit the longer term flexibility of the product and hence the useful life of the software. This feeds back into development processes because limited product lifetimes mean that there is less incentive to change your process to produce software which can persist (i.e. why make the effort to ensure that the system is flexible enough to last through 20 years of changing requirements when you expect the system to be retired after only 7 years?)

    You mentioned XP - it offers a lot of techniques that resolve these problems:-

    • Programming in pairs - this makes for very efficient skills transfer, hence you limit the extent to which the expertise boils off
    • XP testing aims for automation - this encourages more testing and the stronger testing ability allows you to contemplate high-risk-high-return activities like refactoring.
    • Refactoring - which prevents your code getting old, brittle and hard to change
    • Many more I'm sure (I'm not an XP practitioner)

    However, XP is best adapted to projects where a single team makes multiple frequent deliveries of code, can work closely with the client, and where the development project continues in the medium to long term. These characteristics allow many of the XP techniques, and this means that techniques taken out of XP may not help projects of a different style.

    Having said this, the automated testing angle is a real strength. If testing is done manually, it's time consuming and expensive. Hence people don't do it as much as they might otherwise think is appropriate. Maintenance deliveries often just undergo regression testing, and faults can creep in which might have been caught by the original unit or integration tests. Automated testing has many advantages :-

    1. Automated tests are faster, so people actually do it!
    2. You can redo all the tests after every change if you want.
    3. Automated testing allows you to refactor without danger
    4. Being able to re-run all the tests really does keep out the new bugs which would otherwise have been introduced during maintenance
    5. Your testing coverage grows with time (since new tests are introduced but tests are only retired when the relevant functionality is changed or dropped)
    6. You don't fail to spot errors (quite often with manual testing regimes errors can go unnoticed because the tester doesn't spot a small bit of incorrect behaviour that the original team might just have spotted).

    Just as a data point, I work on some software that has an automated test suite. The suite contains between 500 and 1000 test cases; the test suite conducts those tests in under 5 minutes on a very old machine. To do these tests manually would take one full-time person at least a week.

    The summary is :-

    • Understand your business's real goals
    • Cherry-pick techniques that will help achieve those goals (you might even be able to adopt a whole methodology if its processes are designed to achieve the kinds of goals that your business actually has).
  • by HiQ ( 159108 ) on Wednesday June 19, 2002 @05:57AM (#3727921)
    What I miss in this discussion is something about the people performing the tests. In my company we have a test team consisting mainly of people who don't know the first thing about coding, who cannot read sources, and who can only test 'through the UI'. And yet the system we work on has thousands of sources, only a percentage of which (20%) has a UI. Testing of all the underlying objects is a lot harder, and my experience is that with this many sources the total number of possible 'paths' in the system is so large that testing through the UI takes too much time, and therefore is never done properly. So now the developers are constantly asked to provide methods by which the testers can perform the tests.
  • I believe that the most bang per buck can be achieved if the organisation is not too fixated on one or two standard testing procedures. Projects differ a lot. Using 30 percent of the testing budget on testing the testing plan might well be worth the effort. If your company is making a set of applications for a fixed platform using fixed components and a fixed architecture, and these basics have been previously thoroughly tested, then of course what was said above might not be true.
  • by Anonymous Coward on Wednesday June 19, 2002 @05:59AM (#3727927)
    Having been on teams producing 24x7, bullet-proof code for communication servers and credit card processing, I have an idea about the increasing number of bugs found. In the Old Days(tm), we wrote every line of code ourselves and used time-tested libraries (C language). I quit using microsnot when their libraries started having bugs in their rush to C++. Now most coders use massive OOP libraries from who knows where, built by slackers, and GUI app builders that generate code and perform all sorts of actions under the hood. When something goes TU it is often hard to find all the conflicts.

    Even when using one of these app builders I read through all the code and put tests and logging into the generated code. Funny that these tools are supposed to make us more productive. My coding and testing every line still beats total time spent on a project since I don't have to go back and redo it later. When it's done, it's done. Next project. I've had comm programs run for over 5 years error free servicing 1000s of users per day. One specialized delivery, billing, and inventory system I wrote was used over 6 years error free and caused the owner to stay with hardware that was compatible with the software (not M$) because the programs always worked. And not a damn bit of it was OO or came from some automated builder tool.

    In short, the closer you get to the metal and the more familiar you are with the code that is executing, the better your chances of producing error free programs. Takes longer to market, but then you don't have to redo it forever until the next bug ridden version comes out. Saves time and coders to work on the next version and the customers are always pleased. Get back to the basics. Try it, you'll like it.
    • I agree 100%

      The only perverted problem with this approach is that if you are selling your code to some 3rd party company, for example, and you are competing with a set of other companies, you might have a terribly hard time trying to make the customer understand why your development takes 3 times longer than what is promised in other proposals. This is not so much of a problem when you have a proven track record clearly showing that the total time spent is less using your approach. Still, in some cases, the customer might be stuck with some bizarre outsourcing process which inherently excludes all proposals that exceed the shortest estimate by some magic percent.

      The point being that many current customers for software support this bizarre approach of first jumping into the lake and then seeing if you hit a stone or not. Of course it's not an easy job to decide to pay $100 for a candy if you can get the same thing, with just some dog dung added, for $33. Especially if the software provider does not say that the dung is included in the offering.
  • do you feel that excessive testing hurts the development process at all?

    Testing should always be a part of the development process. The wording here implies that testing somehow is considered to be outside the scope of development and I suspect this mindset is causing a lot of bugs to remain undetected.

    It's just like documentation or support; those are also (or damned well should be) integral parts of the development process. Sometimes I think that most programmers believe that the development process consists of the steps hack, compile, ship instead of the tedious iterative process of analyze, design, code, test.

    So what, then, is excessive testing?

    Well, as long as you find bugs doing it, it's not excessive.
    If your projections predict a bug or two in a specific piece of code and your tests fail to find them, then testing (provided that the test method isn't flawed) gave you a much desired quality assessment of that piece of code - meaning that the testing still wasn't excessive.
    Running the same tests over and over on the same code with the same data, now that's excessive, not to mention stupid.

  • When someone is told to implement feature A, they spend a little time sifting thru the 20yr old code, and do the minimum to get it done.

    They write test cases to test A, unit, system, etc. Their team leader approves it, and of course all the tests pass before it goes out.

    In the end, there's always some obscure way feat A interferes with feat B, but you're not going to write tests for every combination of keystrokes possible.

    If you have user testing (u should), they'll find a score of bugs u didn't. Of course, the users too have a deadline to get stuff shipping and they too want to do the minimum possible.

    In the end, nobody involved has a personal incentive to make it perfect. With proper testing procedures in place, everyone has a piece of paper with someone's signature on it which says they passed. They don't feel (too) guilty when there's a bug, (hey, my TL approved it!).

    Anyway, how are you going to ask the customer for $20k extra so you can test for an extra week?
  • Suggestion (Score:2, Insightful)

    by ceeam ( 39911 )
    I guess you should try to spend a part of your testing budget on improving your design and programming practices.
  • Close the loop (Score:5, Informative)

    by edhall ( 10025 ) <slashdot@weirdnoise.com> on Wednesday June 19, 2002 @06:12AM (#3727962) Homepage

    The object of finding bugs isn't to result in fewer bugs by fixing them. It's to result in fewer bugs by not writing them in the first place. The developers need to review found bugs on a regular basis, with the objective of changing development methods to avoid them in the future.

    It's all fine and good to say "don't write buggy code in the first place," but this sort of feedback is the only way to get there. What makes this so hard in many organizations -- aside from the usual disrespect many developers have for QA people -- is that developers fear that this process is some sort of performance evaluation. As soon as this happens, the focus shifts from finding better processes to defending existing processes: "It's not really a bug," "There isn't really a better way of doing that," "We just don't have time to do it the 'right' way," and so on.

    This is why the feedback needs to be direct from QA to the developers, who are then tasked to categorize bugs and develop recommendations for avoiding them. It's the latter that is the "product" required by management, not a list of bugs with developer's names on them. Management should otherwise get the hell out of the way.

    -Ed
    • Re:Close the loop (Score:3, Interesting)

      by jukal ( 523582 )
      > This is why the feedback needs to be direct from QA to the developers

      I agree. For some reason (maybe it's just me) developers are nowadays too full of false pride as well, thinking: I am the lead coder, analyzing bugs is the job of trainees. In my opinion the situation should be (at least in some cases) completely the opposite: only veteran coders can make correct assumptions, define precautions for the future, and fix this particular case in the correct way. Otherwise it might just lead to a decline of the original code - making things even worse.

      Working with bugs is a tough job, do it with pride! *with the allbugfixers unite anthem playing gently on background*
        developers are nowadays too full of false pride as well, thinking: I am the lead coder, analyzing bugs is the job of trainees. In my opinion the situation should be (at least in some cases) completely the opposite

        I don't know who your company is hiring, but where I work things are as they should be. It's the college kids who are hired for the experience who are the ones with cocky attitudes, while the advanced developers are trying to push management to put better processes in place rather than just "Fly to the moon by next Thursday."

        ... Then Wednesday comes and it's "No wait, fly to Mars instead. But still do it by tomorrow."

        At the moment our company isn't as structured as it should be. We don't have a QA team or a testing team. It's just management pushing the developers to get things done. But the developers are pushing back to say "hey, we need a process here. It's not just writing code. We need to design it first. And that takes time. We also need to implement code review and pick someone who's got the experience to decide what goes in CVS."

        So my point is that in my experience it's the inexperienced developers who want to just jump in and write code thinking "it's not my job to fix bugs". I think this has a lot to do with wanting to get that advanced status. But as they grow they realize that they're going about it wrong and smarten up with regards to processes.

        --
        Garett
    • Most of the 'worst' bugs I've come across are down to bad systems design, before a single line of code is written.
      If a system is designed well then you should have far fewer bugs, even if you are using code monkeys who don't know a quicksort from an n^2 bubble sort.

      Design your systems well and know your people: Bill's good at that kind of thing and likes it (but is hopeless at UIs, say),
      Jess loves doing data imports (may not be that quick, but always does them well),
      Fred always designs and produces good, fast system cores.

      Get your developers talking and sharing knowledge. 'I'm having a bit of a problem' or 'Who knows how to...' are good things for people to be saying, so encourage them to own up to their inadequacies, and they won't have them for long.

      If you can manage that then your productivity should rise, your bug counts should drop dramatically, and the bugs you do have should be easier to fix.
  • by fcrick ( 465682 ) on Wednesday June 19, 2002 @06:15AM (#3727971) Journal
    I've worked on both ends (dev and test), at M$ and other places, and I've come to one conclusion (I'm sure it's not the only correct one).

    Developers must test their code.

    With a test team backing you up, it becomes too easy to change something, run it once (if at all), and then push it into the next build so the test team can catch your errors. I've found that, as a tester, a huge proportion of bugs are simply features implemented where the developer just forgot something stupid. I end up wasting 5 minutes writing a report, my manager assigns the bug back to a developer (hopefully the one who made the mistake, but not always), and the developer comes back to the code a week later, spending 20 minutes just trying to figure out what s/he wrote a week back.

    My point: this wastes 30 minutes of people's time for every little stupid mistake. Pressure your developers to really give a thorough test to the code they write before they check it in, especially if you have a test team, because otherwise you just end up wasting more people's time.
    • ...and say, "Developers should write their test suites BEFORE they write their code."

      We have a fairly large open source project [faeriemud.org] with contributors coming in and going out all the time (well, not a lot going out; but any number is a problem there). Our experience shows that if you can't write a test suite, you're not ready for anything more than a crude prototype. The problem with test-after-coding regimes is that the testing gets short-circuited. You've already got working code. You "know" it works. You're just proving it works. So you test the obvious stuff that proves this.

      Since we instituted this policy, coding efficiency has actually improved. Coders who have tried to devise a complete set of tests have formalized their understanding of the requirements in a way that even the most complete requirements doc never will. We include the test suite in CVS. Nobody commits until their update passes the entire test suite. This results in an enormous (but complete) test of everything done so far. But you can't imagine the thrill of seeing your patch pass that many tests the first time.

      All of which is completely separate from what a QA process is for.
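
      To make the "tests before code" idea concrete, here is a minimal sketch in Python's stdlib unittest. The word_count helper is hypothetical - the point is that the tests exist (and fail) before the implementation does.

      import unittest

      def word_count(text):
          # Count whitespace-separated words; written only after the tests below existed.
          return len(text.split())

      class WordCountTest(unittest.TestCase):
          def test_empty_string_has_no_words(self):
              self.assertEqual(word_count(""), 0)

          def test_counts_whitespace_separated_words(self):
              self.assertEqual(word_count("properly testing your code"), 4)

          def test_collapses_repeated_spaces(self):
              self.assertEqual(word_count("two   words"), 2)

      if __name__ == "__main__":
          unittest.main()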

  • I often think the timelines have become so compressed that it gets harder and harder for companies to write clean code and still get it out the door in time to meet the expected cycle of upgrades - and this is something I find common to all companies, and even to open source software. It seems we as consumers have come to expect an upgrade to this or that every year, and so the dev cycle becomes one continuous thing - coders who are exhausted or working long hours write buggy code.

    I think many people would be happy to wait a bit longer for better products, but the industry has brainwashed them into thinking that it's almost easy to bung a new version out.

    The best way to avoid mistakes is not to make them in the first place, but that's not so easy when working on a compressed cycle with management on top of you - and it's not just programmers who deal with it: network designers, SOE architects, etc.

    Bugs exist - they always will - but minimising them requires time, and time is a commodity not readily found. I don't know what the solution is - as I say, these are just my thoughts.
  • by PinglePongle ( 8734 ) on Wednesday June 19, 2002 @06:43AM (#3728040) Homepage
    You say you're finding too many bugs - it sounds like there is something fundamentally wrong with either the product or the development process.

    First thing to do: look in your bug tracking software (you DO use bug tracking software, right?), and try to isolate hot spots. Is there a particular piece of code that generates more bugs than others? Is there a common pattern to the bugs (i.e. memory not being freed, off-by-one errors, etc.)? Are they _really_ bugs, or misinterpretations of the requirements or the design? In my experience, the 80/20 rule applies to bugs in spades - it is just hard to find the patterns.

    If you need to, make the bug categorisation in your bug tracking software more specific. Once you get an idea of what your hotspot is, you can work at fixing the cause of the bugs.
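
    As a rough sketch of that first step, here is a tiny Python script that counts bugs per component from a CSV export of the bug tracker. The file name and column name are assumptions - adjust them to whatever your tracker actually exports.

    import csv
    from collections import Counter

    def bug_hotspots(path="bugs_export.csv", component_field="component"):
        # Tally how many reported bugs land in each component of the system.
        counts = Counter()
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                counts[row.get(component_field, "unknown")] += 1
        return counts.most_common()

    if __name__ == "__main__":
        for component, n in bug_hotspots():
            print(f"{n:5d}  {component}")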

    If it's a particular piece of code, make sure it's reviewed by the best developers/architects you have, and consider refactoring it. At the very least, insist that it is reviewed and tested thoroughly before check-in to the source code control system, and consider adding a hurdle to jump prior to check-in (e.g. get the manager to sign it off).

    If the code was written by one developer, consider swapping them out and giving it to someone else - it may be they're in over their head.

    Make sure you increase the number of test cases for this piece of software, and check for "edge cases" religiously - if the code is broken at all, it is likely to be broken in more ways than you realize.

    If it turns out that the problems tend to have a common cause (memory leaks, off-by-one errors, etc.), consider a structure which forces developers to concentrate on those issues before checking in code; again, consider the hurdle ("software must be signed off by the off-by-one guru prior to check-in"), and hone your tests to check for these kinds of errors if possible.

    If the bugs stem more from misunderstood requirements or designs, beef up those areas. Work on your requirements and analysis processes; consider training courses for the developers to get them up to speed on interpreting these nebulous documents, and look at improving the review process by having designers present. Frequent "mini-deliverables" (another concept stolen from XP) will help here too - get your team to deliver a working piece of code - it need only be a minimal sub-system - and get it reviewed by designers and analysts. If the bugs tend to occur on the boundaries - i.e. invalid API calls, invalid parameters etc. - consider design by contract or aspects.

    Finally, there's a bunch of hygiene stuff :
    • coding standards (ideally with tools to check for compliance) help by allowing you to specify how to deal with commonly bug-prone operations
    • invest in a decent bug tracker, request tracker and source code control system - it really helps if you can trace a bug back to a particular requirement or previous bug fix if you're trying to find out where all your bugs are coming from
    • seating arrangements - I worked on a project once where the bug rate was halved when the developers were moved to sit in the same room as the database team. Get everyone in the same room if at all possible
    • automated test tools - already mentioned - help to ease the pain of testing. If practical, insist that the developers run the automated tests before checking in code
    • religiously review all work products - designs, specifications, documentation, project plans, code, tests, the works - before declaring them "ready for use". It's the single cheapest way of finding problems, and a great way to spread good practice.


    N
  • by jilles ( 20976 ) on Wednesday June 19, 2002 @06:45AM (#3728048) Homepage
    There's no one size fits all process for testing. How much effort you need to spend on testing depends on a lot of factors including but certainly not limited to: code size, amount of developers, customer requirements, life cycle of the system etc.

    That being said, here are some remarks that make sense for any project:

    In general, a testing procedure that gives you no defects just indicates your testing procedure is bogus: defect-free code does not exist, and no test procedure (especially no automated procedure) will reveal all defects.

    The XP way of determining when a product is good enough: write a test for a feature before you write code. If your code passes the test it is good enough. This makes sense and I have seen it applied successfully.

    A second guideline is to write regression tests: when you fix a bug, write an automated test so you can avoid this bug in the future. Regression tests should be run as often as possible (e.g. on nightly builds). All large software organizations I've encountered do this. Combined with the first approach this will provide you with a growing set of automated tests that will assure your code is doing what it's supposed to do without breaking stuff.
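
    A minimal sketch of such a regression test in Python's unittest, pinning a fix to the report that prompted it (the bug number and the parse_price function are hypothetical):

    import unittest

    def parse_price(text):
        # Parse a price string like "1,234.50" into a float (behaviour after the fix).
        return float(text.replace(",", ""))

    class Bug1234RegressionTest(unittest.TestCase):
        def test_thousands_separator_no_longer_rejected(self):
            # Before the fix, a comma in the input raised ValueError.
            self.assertAlmostEqual(parse_price("1,234.50"), 1234.50)

    if __name__ == "__main__":
        unittest.main()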

    Thirdly, make sure code is reviewed (by an expert and not the new script kiddie on the block) before it is checked in. Don't accept code that is not up to the preset standards. Once you start accepting bad code, your code base will start to deteriorate rapidly.
  • Testing is a start, but not enough. The results from testing (e.g. "300 buffer overflows") have to be identified and fed back into the organization to change how future software is developed for the better.

    Testing/debugging is like finding and putting out existing fires. If the organization can use test results to prevent future fires, then you're a step above. And probably more advanced than 90% of the software houses, too :-)

  • I program mostly in object-oriented languages, so I have separate files with separate classes. I start at the bottom of my UML and work my way up, testing each class as if it were its own program. When I know they all work individually, I can be fairly certain that (aside from the few bugs I inevitably overlooked) any remaining bugs are due to the way the classes interact. It takes a while, but in the end I'm mostly bug free.
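
    A small sketch of testing one class "as if it were its own program" by stubbing out the class below it in the dependency graph; OrderTotal and the stubbed TaxTable collaborator are hypothetical names used only for illustration.

    import unittest
    from unittest import mock

    class OrderTotal:
        def __init__(self, tax_table):
            self.tax_table = tax_table  # collaborator injected so it can be stubbed

        def total(self, net, region):
            return net * (1 + self.tax_table.rate_for(region))

    class OrderTotalTest(unittest.TestCase):
        def test_total_applies_region_rate(self):
            tax_table = mock.Mock()
            tax_table.rate_for.return_value = 0.10  # stub: 10% tax, no real lookup
            self.assertAlmostEqual(OrderTotal(tax_table).total(100.0, "EU"), 110.0)
            tax_table.rate_for.assert_called_once_with("EU")

    if __name__ == "__main__":
        unittest.main()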
  • If your testing is resulting in an inordinate number of bugs, then there are probably bigger problems than you think.

    Testing is necessary but not sufficient. There must be a way to capture requirements, convert requirements to a design, convert the design to an implementation, and finally test. At each transition it is a good idea to make an assessment of how well you accomplished your task.

    Skipping any of these steps and putting it off until test is pure folly - an extremely false economy.

  • by Sandmann ( 182819 ) <sandmann@daimi.au.dk> on Wednesday June 19, 2002 @07:02AM (#3728114)
    The image loaders in the gdkpixbuf library included with gtk+ 2.0 were tested with random data. This caught a lot of bugs, including some in the standard image manipulation libraries.

    Just generating random data and trying to load it caught a lot of bugs, but even more effective was to take a valid image and modify the bytes in it at random, and then try to load it.

    Of course, the reason this was so effective, is that the loaders would get mostly what they expect, and then suddenly something illegal. This is the kind of thing you tend to forget about when you write code.

    Since it is so easy to attack your program with random data, this kind of testing gives you a lot of bang for the buck; on the other hand, the bugs it finds may not always be those that are likely to occur in practice.
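
    A rough sketch of that byte-flipping approach in Python; load_image is a hypothetical stand-in for whatever parser you are hammering, and the only check is that corrupt input produces a clean error rather than a crash or a hang.

    import random

    def load_image(data):
        # Placeholder parser: a real loader (an image decoder, say) would go here.
        if not data.startswith(b"IMG1"):
            raise ValueError("bad magic")
        return data[4:]

    def fuzz(valid_sample, iterations=10000, flips=8):
        random.seed(0)  # reproducible fuzzing run
        for _ in range(iterations):
            corrupted = bytearray(valid_sample)
            for _ in range(flips):  # clobber a handful of random bytes
                corrupted[random.randrange(len(corrupted))] = random.randrange(256)
            try:
                load_image(bytes(corrupted))
            except ValueError:
                pass  # a clean, expected failure is fine
            # anything else (crash, MemoryError, hang) is the kind of bug we want to find

    if __name__ == "__main__":
        fuzz(b"IMG1" + bytes(range(64)))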

  • I am having a bit of a QA problem myself. After reading up (Steve McConnell, etc.), I'm looking to spend more time in pre-code work, and also to implement inspections (a code review technique).

    The disadvantage of testing is that you detect errors but then need to spend time finding their source. If you avoid an error by detecting it at design time or during code review, you will spend less time dealing with it, since you will know more about the root cause to begin with.
  • Just to emphasise how good design is the key to avoiding most bugs, not testing - there's a song that often gets sung at my place of work...

    Hundred and one little bugs in the code
    Hundred and one little bugs in the code
    Fix the code, compile the code
    Hundred and two little bugs in the code
  • by wowbagger ( 69688 ) on Wednesday June 19, 2002 @07:14AM (#3728157) Homepage Journal
    When I was an undergrad, one of the out-of-major classes I took was archery (I needed a PE credit, and I was interested in it). In archery (and in any other kind of marksmanship) the trick is:
    • Be consistent
    • Measure your error
    • Identify the cause of the error
    • Correct the cause
    • Repeat


    Programming is the same way. What kinds of bugs are you finding? Are they just stupid bugs, like buffer overflows or off-by-ones (good design, bad implementation), or are they unhandled errors, or are they API mis-matches or faulty algorithms (bad design)?

    Have you made any effort to go back and say "Gee, we are getting a lot of off-by-one errors. OK folks, we need to think about our loops."?

    And when you find one type of bug, do you go back and identify anyplace else a similar bug may exist?

    If you are hitting high and right, and you never adjust your sights, you will NEVER hit the target consistently. If you never feed back the CAUSE of the bugs, you will never eliminate them.
  • One thing I've found invaluable is to compile your program with a translator that inserts code to detect when branches have been followed. Then run the test suite and see that all the code was executed. Any code that was not executed has not been tested.

    It's amazing how poor coverage can be with a naively written set of tests. Ideally you want to write the tests so that the coverage comes out good, but in practice you may have to patch the tests with more tests to cover the parts you missed. You may also have to change the code to make it easier to cover.

    Rare error cases (like malloc failures) can be hard to cover.
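
    A minimal sketch of that kind of line-coverage instrumentation using Python's sys.settrace (a real project would use a dedicated coverage tool; the clamp function and the single, deliberately incomplete test are just for illustration):

    import sys

    executed = set()  # (filename, line number) pairs that actually ran

    def tracer(frame, event, arg):
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer

    def clamp(x, lo, hi):
        if x < lo:
            return lo
        if x > hi:
            return hi  # this branch is never exercised by the naive test below
        return x

    sys.settrace(tracer)
    assert clamp(-5, 0, 10) == 0  # a naively written test: hits only one branch
    sys.settrace(None)

    print(len(executed), "distinct lines executed; lines never hit were never tested")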
  • by ortholattice ( 175065 ) on Wednesday June 19, 2002 @07:31AM (#3728223)

    A number of years back I wrote test programs for printed circuit boards. First you created a model for the board that simulated the logic circuits. You then wrote test patterns that were applied to the board's inputs, and the simulator model predicted the board's outputs. The inputs together with the predicted outputs were applied to a real board that you wanted to test, and if this test program passed you assumed that the PC board was good with a high degree of probability.

    One mode of the simulator allowed you to simulate faults that might occur on the board. The simplest kinds of faults were physical IC pins "stuck-at-zero" and "stuck-at-one" (these were the most common faults in real life), and if you wanted to be thorough you could also simulate "internal" faults down to the gate level.

    I worked in a contract test programming house, where the contract with the customer required us to produce a test program with a specified minimum level of fault coverage, usually just at the physical IC pin level to minimize the cost of developing the program. This ranged from, say, 90% for cheaper commercial work to 99%+ for certain government contracts. With >95% coverage, the "real life" fault coverage was such that maybe one or two "dog pile" boards out of 1000 would pass the test program but fail a system test.

    The point of this is that in that business, there was a clear, objective measure of a test program's "quality". The measure wasn't perfect, but it was far better than just blindly writing a test program based on a "gut feel" for how the board should work. In addition, the test programmer had a clear, objective goal.

    I think a useful tool in the software business would be a measurement of the percentage of lines of code that were actually run during the QA process, along with a log of those lines that were never run. Often there are big chunks of code that only get triggered by very special conditions, and there is no way QA can guess those strange conditions. The standard QA process is very subjective; there is no objective measure of any kind as to how thorough the testing was, other than just documenting a list of features that were (often superficially) exercised.

    A more sophisticated tool could go beyond lines of code and log the various logic combinations exercised in "if" statements, etc.

    Several years ago I wrote an experimental tool that did this for a specialized database programming language. Basically it rewrote the program with a logging call after each statement (and yes, the "QA version" ran very slowly). The results were quite eye-opening, revealing chunks of "dead code" and conditions no one ever thought of testing. Unfortunately the project kind of died.

    Many languages have "code profilers" that are mainly intended to analyze performance, but many of them could be easily adapted to become QA quality measurement tools.

    Do these kinds of tools exist, and if so why aren't they more widely used?

  • Sure, build a big suite of tests to run and check for things that go wrong. Every bug-fixing exercise suggests its own test.

    Then you find out that you don't have the time and resources to run all the tests every time someone makes a change to the codebase.

    So, use smaller suites of the faster tests and weed out some of the ones that have been ironclad passes for the last 5 dozen code checkins. For frequent testing it makes sense to only shake what's new and rickety, not what's stood through 10 hurricanes.

    Run the exhaustive complete test suite infrequently, say when a release is imminent, or as often as you can afford to spare the resource cycles.

  • If you test the code as much as you say you do, and are testing for the correct things (which I don't know that you are), the problem may be the architecture.

    Code which is "forced" into a paper architecture is sometimes worse than code with no architecture at all. In many of my projects, parts of my architecture change partway through so that the code will work better. Sometimes not everything can be thought of beforehand. OO programs have a lot of information to fit into a human brain at one time, so problems are bound to show through. I don't have any high-end tools to help with the architecture either, which doesn't help.

    Also, the architecture itself may suck.

    What kinds of problems are you having? I think you need to design test routines geared towards not letting the types of problems you currently have through. It is hard to give any specifics, since the post was so vague.

    -Pete
  • by southpolesammy ( 150094 ) on Wednesday June 19, 2002 @07:53AM (#3728325) Journal
    A lot of the problem may depend on what methodology you are using to build the program, whether it is the traditional waterfall method, the spiral method, or perhaps M$'s old sync-and-stabilize method. Whatever methodology you use will drive how you should be testing.

    With the waterfall model, you really need to know way ahead of time that what you are coding is what will be desired in the end product. It forces you to have a clear picture in your model of what you are trying to build and with each step in the process, you must develop testing procedures that address that level of the code. For example, at a high level, you may say, let's build a compiler, and following that decision, you need to devise a test that proves that the compiler works. The next phase, you may say, let's build an assembler to produce machine code for the compiler. Then you need to build tests that prove that the assembler works. This methodology continues right down to the smallest module of code, and when all of the pieces have been written, integration testing begins, and you make sure that each larger piece can correctly function based on the output of the smaller piece.

    However, in the spiral model, it allows for a well-defined core code to be produced with tons of modules that evolve as the spiral expands. Integration is a function of the spiral, and testing occurs within each iteration of the spiral loop. Code produced with the spiral model also tends to be somewhat more difficult to test in later stages, IMHO, due to the nature of the testing that occurs at each cycle in the loop. Testing becomes more critical in later stages as the previous stages become more nested into the core of the program.

    Well, enough Software Engineering for one day. Back to work....
  • by John Hasler ( 414242 ) on Wednesday June 19, 2002 @08:02AM (#3728387) Homepage
    "When testing code, what procedures work best for you,..."

    Make sure it compiles and runs and then upload it to Debian/unstable.

    (Yes, I'm joking).

    "...and do you feel that excessive testing hurts the development process at all?"

    If it didn't hurt, why would you label it "excessive"?
  • The difference between testing expected inputs and possible inputs is that reality doesn't limit itself to expected inputs. Heck, sometimes it doesn't even limit itself to possible inputs.

    Larger tests don't test more. What the large tests do is make sure everything works together. You need the small tests to make sure each piece actually works.

    The bigger the test, the more likely that your testing platform doesn't resemble production.

    There is a big difference between getting your test to run successfully and having bug-free code.

    Chances are the test cases have nothing to do with how the users actually use the program. Chances are the programmer has never actually seen how a user uses the program. Chances are, the first time he does, he'll go back to his computer and start cursing the user for not doing things the "right" way.

    If something breaks and you don't add that something to the test cases, you're asking for it to break again.

    Testing deserves powerhouse machines and sadistic maniacs who like to break code. People who want the tests to be successful and don't run them that often are obviously not going to be as nasty as the real world. Even the sadistic maniac has a hard time being as nasty as the real world.

    Tests are less expensive than production errors - but only if they find errors. Tests that prove the code works perfectly usually don't.

    No programmer likes to be told he made a mistake. On the other hand, they love a challenge. Make testing fun and brutal and it'll be much more productive than if you make it boring and painless.

  • If you're not finding the bugs, then you're not doing a good job testing.
  • I find that.. (Score:2, Interesting)

    by MarvinMouse ( 323641 )
    When I code programs that are used by the general public, I find double-blind testing and black-box testing work best. With software where a failure means life or death or something similarly severe, I will also do white-box testing.

    Double-blind testing is when you give the code to a willing party and just let them work with it like they normally would for business purposes, without letting them know it is a beta test. You also have to include some kind of bug report that people can fill in if they wish, but encourage them not to go hunting for bugs and just to work with the program as normal. This lets you see whether any of the normal functions that people use every day are buggy.

    Black-box testing works great for just testing the program's function calls and modules. When I do BB testing I usually give it to another party with instructions as to how the functions are called and used. This party knows how to test the extremes and the common values, which gives me the best testing.

    White-box testing is testing that involves intimate knowledge of the code. When I do this it is usually during development. At the end, if I feel like I enjoy pain, I will do a thorough white-box test suite for the program, but that has only happened once or twice.

    In terms of expense, the cheapest form of testing is BB testing, followed by double-blind, and then WB, since white-box testing takes a long time to design, run, and analyze, I find.

    There are some thoughts for you, though.
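
    As a sketch of that black-box style: the tester only knows the published contract, so the cases cover the extremes, the boundaries, and the common values. The discount_rate function and its rules are hypothetical.

    import unittest

    def discount_rate(quantity):
        # Published contract: 0% below 10 units, 5% from 10-99, 10% from 100 up.
        if quantity >= 100:
            return 0.10
        if quantity >= 10:
            return 0.05
        return 0.0

    class DiscountRateBlackBoxTest(unittest.TestCase):
        def test_common_values(self):
            self.assertEqual(discount_rate(5), 0.0)
            self.assertEqual(discount_rate(50), 0.05)

        def test_extremes_and_boundaries(self):
            self.assertEqual(discount_rate(0), 0.0)
            self.assertEqual(discount_rate(9), 0.0)    # just below a boundary
            self.assertEqual(discount_rate(10), 0.05)  # exactly on a boundary
            self.assertEqual(discount_rate(100), 0.10)
            self.assertEqual(discount_rate(10**9), 0.10)

    if __name__ == "__main__":
        unittest.main()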
  • by Digital_Quartz ( 75366 ) on Wednesday June 19, 2002 @08:17AM (#3728476) Homepage
    There are two subjects I want to discuss here. First of all, I'm going to present the "jelly bean model" of defect discovery, then I'm going to talk about why the "testing to improve quality" model is fundamentally flawed.

    The Jelly Bean model goes like this: suppose you have a big vat of red and blue jelly beans. Your objective is to remove all the blue beans. You do this by reaching in, grabbing a handful of beans, throwing away all the blue ones, and dumping the red ones back in.

    At the beginning, it will be very easy to find the blue beans (assuming the blue-bean density is high), and towards the end it will be very difficult (since the blue-bean density will be low). If you graph the cumulative number of blue beans you remove each day, you'll get an exponential curve: quite steep at the beginning (high rate of discovery) and flattening out as you approach total bean removal.

    Software defect discovery follows this model exactly. Defects are easy to find at the beginning when there are a lot of them, and hard to find towards the end. This means that if your defect discovery rate is pretty much constant (with respect to the number of hours of testing you've done), then you're probably still way down in the very first part of the curve, and your number of defects is probably very high.

    Here's the important thing to remember though; the quality of your product has nothing to do with how many defects you find and fix during testing. The quality of your product is determined by the number of defects remaining! If you find and fix 10,000 problems, you might think you're doing very well, but if there are 10,000,000 defects remaining, your product is still crap.

    You can estimate the number of defects remaining by trying to fit the number of defects you've found so far onto that exponential curve I mentioned above. The most popular methods use a Weibull curve or quadratic regression.
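
    A quick simulation of the jelly-bean picture, just to show the flattening curve (the vat size and handful size are made-up numbers):

    import random

    def simulate(blue=1000, red=9000, handful=50, rounds=200):
        random.seed(0)
        vat = ["blue"] * blue + ["red"] * red
        removed_per_round = []
        for _ in range(rounds):
            random.shuffle(vat)
            grabbed, vat = vat[:handful], vat[handful:]
            blues = [bean for bean in grabbed if bean == "blue"]
            vat.extend(bean for bean in grabbed if bean == "red")  # red beans go back in
            removed_per_round.append(len(blues))
        return removed_per_round

    if __name__ == "__main__":
        total = 0
        for i, n in enumerate(simulate(), 1):
            total += n
            if i % 40 == 0:
                print("after", i, "handfuls:", total, "blue beans removed")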

    Now, why is testing to improve quality a bad plan?

    Let's say you worked at Ford, and roughly 50% of the cars you turned out had something wrong with them. You get lots of unhappy customers demanding their money back. Is your problem:

    a) That you have a design defect in your car.
    b) That you are introducing defects in production.
    c) That you are testing cars insufficiently.

    Most people realize that testing every car as it comes off the line is futile. There are too many of them, with too many potential points of failure. There's no way you can test them all. The root cause of the problem has to be either a or b, and if you're looking to improve the quality of your cars, this is where you would spend your money. This isn't to say that Ford doesn't test their cars - I'm sure they do - but testing should be a means of verifying quality (i.e., 1/1000 cars tested had a defect, our goal was 1/500, so we can stop spending money on finding design and production faults), and not a means of improving it.

    It's so easy to see this when we're talking about cars. Why does everyone get it backwards when we start talking about software?

    Not only is it impossible to test every possible combination of inputs to most software, it's also very expensive to find and fix problems this way. If you find a problem in design review, or code inspection, then you have your finger on it. You know EXACTLY where the defect is, and how to fix it. On the other hand, when you say "Microsoft Word crashes when I try to select a paragraph and make it Bold", you have no idea where the fault is. Any one of several thousand lines of code could be the problem. It can take literally days to track down and fix the defect.

    Your testing should not be a means of finding faults, but a means of verifying the quality of your product. Testing is not part of the development process.
  • Though remember, this is Slashdot :) Automated testing is common in embedded systems programming, and all but non-existent for any kind of open source desktop application (gcc is an exception).

    You write test cases as you go. You make sure you can run an automated regression test at any time. If you don't do this, then any time you change code you might break old code and you won't realize it. Just doing spot checks at the keyboard isn't good enough. And the programmers need to be writing these test cases first, and they need to be kept separate from tests written by external groups.
  • Tried Cleanroom? (Score:2, Informative)

    by CyberGarp ( 242942 )

    My personal recommendation is the "Cleanroom" methodology. You create a functional specification with a mathematical guarantee of completeness and consistency. Auditable correctness is also a part of the process. Then, when it comes to testing, you generate test cases that cover all states and all arcs, and then do statistical test-case generation based on a usage model. The overall cost of this process is a bit more up front, but studies have shown that it far more than pays for itself in greatly reduced maintenance/debugging costs.

    So to answer your question: to generate a decent set of test cases, you really have to understand the problem space and have mapped out the state space in some manner. Try to derive this without a methodical approach and your testing will be spotty. The worst I've seen so far was a random state-space walker (a la Brownian motion). Statistically, this approach avoids all the difficult cases in the far corners of the state space.

    Now for the bad news: Cleanroom is quite tedious for the programmer. The enumeration phase takes seemingly forever and can be mind-numbingly boring.

    Here's the amazon link on the layman's book on Cleanroom: Cleanroom Software Engineering: Technology and Process by Stacy J. Prowell, Carmen J. Trammell,Richard C. Linger, Jesse H. Poore [amazon.com]

    And now for the shameless self promotion bit with a long winded sales pitch for executives on Cleanroom: my own Cleanroom company: eLucidSoft [elucidsoft.net].

    Just chant over and over: "Hire eLucid, play golf."

  • At the time you are coding, every assumption is going through your head. This is the time to write it down: on paper, in a document, or in comments in the code. The mental state you are in when designing test conditions cannot come close to the state of mind you are in when coding (if you are concentrating :) . You are mentally closer to a problem when you are coding than when you are designing, and you can take the shortcomings of the platform you are working on and pair them with the shortcomings of the design.

    Any consideration you have during the writing of a single line of code is gold. And like a great dream, if you don't get it down when you think it, you will lose it in a day or two.

    My 2 cents.
  • MSDN used to have a column called Stone's Way or something, and in one of them they discussed user case testing: set up a video camera to record the user as he/she uses the program OR masquerade as a nondeveloper and spy on the user as he/she uses the program.

    If you're just looking for regular bug testing, assume it's a given that the user will not report bugs to you and have the program automatically email you a core dump and/or stack trace and/or any appropriate data if an unhandled exception occurs.
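
    A minimal sketch of the "report it automatically" idea using Python's sys.excepthook; writing the traceback to a local file stands in for emailing it, and the file name and any mailing step are assumptions, not part of the original suggestion.

    import sys
    import traceback

    REPORT_PATH = "crash_report.txt"  # hypothetical location

    def report_unhandled(exc_type, exc_value, exc_tb):
        with open(REPORT_PATH, "a") as f:
            f.write("Unhandled exception:\n")
            traceback.print_exception(exc_type, exc_value, exc_tb, file=f)
        # A real application might now email the file or post it to a bug tracker.
        sys.__excepthook__(exc_type, exc_value, exc_tb)  # still show the error normally

    sys.excepthook = report_unhandled

    if __name__ == "__main__":
        raise RuntimeError("demo crash")  # triggers the hook and writes the report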
  • I work for a large company with a large number of internally developed applications.

    I am shocked at how frequently our developers don't have a good understanding of their architecture, or sometimes even of the problem they are trying to solve. As a result, when they go to do "testing" they are frequently performing tests that are not valid.

    For example, they might create a new build and test that build only on their development workstation before full deployment of the application.

    Naturally the development box has different resources from that of a standard production machine. Many developers don't seem to understand this.

    Another example: frequently, boundary conditions or interfaces to other applications are not fully tested.

    Using bad methodology, all of the time that you spend testing is wasted.

    Management tends to feel that testing time is wasted because their experience is that the time that they have invested in the past has been fruitless.

    Please develop:
    valid test cases,
    valid test plans, then

    execute them,
    find gaps, then

    use the gaps to learn how not to make the same mistakes in the future!

    Phooey.

    Anomaly
    PS - God loves you and longs for relationship with you. If you would like to know more about this, please contact me at tom_cooper at bigfoot dot com
  • bebugging (Score:5, Insightful)

    by Martin Spamer ( 244245 ) on Wednesday June 19, 2002 @09:54AM (#3729124) Homepage Journal
    How can you be sure you are 'Properly Testing Your Code'?

    Actually, you can do this by adding more bugs - yes, adding them. The technique is called bebugging, and it goes basically like this:

    1) Produce code; it contains an unknown number (N) of real bugs.
    2) The programmer (or bebugger) seeds the code with a number (B) of known new bugs; the number and type of bugs should be determined from bugs found in previous debugging cycles.
    3) The code is submitted to testing and some bugs are found (F).
    4) The bugs found are examined and categorised as either real bugs (FN) or bebugs (FB), so F = FN + FB.
    5) The number of real bugs (N) can then be estimated from the seeding ratio: assuming the testers find the same fraction of real bugs as of bebugs, N is approximately FN * B / FB.
    6) Don't forget to remove all the bebugs.
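
    A tiny worked example of that estimate (the numbers are made up):

    B = 20    # bebugs deliberately seeded
    FB = 15   # bebugs the testers actually found
    FN = 45   # real bugs the testers found

    # Testing caught 15/20 = 75% of the seeded bugs, so assume it also caught
    # roughly 75% of the real bugs; the estimated total number of real bugs is:
    N_estimate = FN * B / FB       # 45 * 20 / 15 = 60
    remaining = N_estimate - FN    # roughly 15 real bugs still lurking

    print("estimated total real bugs:", round(N_estimate), "- still unfound:", round(remaining))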
  • Requirements (Score:3, Insightful)

    by andymac ( 82298 ) on Wednesday June 19, 2002 @01:46PM (#3730950) Homepage
    Hi there --

    I'm certain someone has already said this, but over 80% of defects come from crappy requirements. Forget about your design and analysis, your coding practices, inspection techniques, debugging and testing abilities - if your requirements are not CLEAR, CORRECT, ATOMIC, UNAMBIGUOUS, and CONSISTENT, you might as well start burning money.

    NASA found that a defect that costs $1 to correct at the requirements stage (here a defect is any requirement that does not meet all five attributes listed above) costs several hundred to several thousand times as much to address at the testing stage. Crappy requirements and crappy specifications are a big part of what makes your code buggy and expensive.

    LA Times posted a study last year that showed that the average US programmer only coded for 51 days a year. 51 days!! One fifth of your working year spent writing new code. The rest of the time? DOING REWORK.

    Biggest cause of rework?

    UNCLEAR AND AMBIGUOUS REQUIREMENTS.

    Spend the time and effort to beef up your requirements gathering and management processes. You'll get your ROI in ONE project cycle.

  • by richieb ( 3277 ) <richieb@g[ ]l.com ['mai' in gap]> on Wednesday June 19, 2002 @03:30PM (#3731810) Homepage Journal
    Building correct software from requirements is as easy as walking on water. As long as they are frozen. :-)
