The P.G. Wodehouse Method of Refactoring

Posted by kdawson on Sun Mar 23, 2008 03:02 AM
from the on-the-wall dept.
covertbadger notes a developer's blog entry on a novel way of judging progress in refactoring code. "Software quality tools can never completely replace the gut instinct of a developer — you might have massive test coverage, but that won't help with subjective measures such as code smells. With Wodehouse-style refactoring, we can now easily keep track of which code we are happy with, and which code we remain deeply suspicious of."

This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Grok it. (Score:5, Insightful)

    by symbolset (646467) on Sunday March 23 2008, @03:10AM (#22834864) Journal

    It's only 30k lines of code. This is no problem.

    First, take ownership. This is your project. Identify your resources, name the gates you must get through to succeed. If you have help make sure they understand their changes must hit the corner cases or it's junk, then give them ownership of their piece explicitly. Create a safe environment for testing changes, with forward and backward versioning.

    Define success. So many projects skip this essential step. If you cannot identify the destination you cannot tell when you've won.

    Skip the 50,000 foot view and proceed directly to "what does this do and how can it be done better"? Believe it or not flowcharts and Venn diagrams are not obsolete. Create tree views of function calls. Identify processes that should be libraried. Create policies like "maximum function call depth", "Maximum process share", etc.

    If you're the lead, look at issues like memory allocation and process management. Do your profiler due diligence.

    If you're the lone ranger on this just absorb the whole thing and integrate it. Force feed your brain huge quantities of what-ifs until it gives you the right answer in self defense - and then have somebody else check the result.

    30 days development and 60 days testing. Remember to give a nice presentation at the end and sell it!

    Good luck.

    • Re: (Score:3, Insightful)

      Skip the 50,000 foot view and proceed directly to "what does this do and how can it be done better"?

      While your post has many good and clearly expressed ideas, I'm not quite sure what you're driving at right here. The question "how can this be done better" can be asked at each node in the call graph, and the question is very broad.

      To ask this question near the root, for architectural purposes, I think what you want is exactly the 50 kilofoot view. There's of course utility in asking the same question closer to the leaves, but I think it's a mistake to overlook the big perspective in favour of going low

      • Re:Grok it. (Score:5, Informative)

        by Like2Byte (542992) <Like2Byte@nosPAM.yahoo.com> on Sunday March 23 2008, @07:34AM (#22835612) Homepage
        I totally agree. Even if you are a developer on a large, corporate, multi-vendor project, knowing how the components that feed the components you directly interface with work will allow you to become a better developer for the project and to point out problematic architectural design issues.

        And if I hear one more project manager say, "Let's not worry about this corner case" (usually said with no idea how this is going to negatively affect the entire process tree) I'm going to punch them in the colon.

        There are two schools of thought about corner cases (and the GP pointed out one).
        Thought #1) (GP) There's no such thing as a corner case. It is a requirement - it may be that fewer people/fewer processes use it; but, it is still a section of the total solution that must be designed to overcome some problematic section. Otherwise, why is the code being written?

        Thought #2) Corner cases only affect a small number of your user-base; therefore, code to satisfy 95%-99% of your customers. The underlying principle here is that the manager will wait for another release. This approach is usually taken when the project manager failed to account for something and says (and I quote), "We'll just re-design it after the first release."

        If you find yourself in an environment where #2 (hehe) permeates the thought structure of management you have few options available to you.
        a) Kindly (because wrapping your hands firmly around their neck is just not understood these days) explain to them the flaw in that kind of thinking. It usually involves educating the manager to a level they've never even considered before. Completion of this project will be long and arduous. Good luck to you.

        b) If step 'a' fails - inform management. Project Managers (in large corps) are not, usually, the final decision maker. Elevate this threat (to the project) to the PM's manager - a Director, perhaps.

        c) If you're able, move to a new project within the company where the project manager in case 'a' has no influence. I know that's not feasible in most segments.

        d) Find a new job.

        If the project is sufficiently high profile then recourse option 'a', above, is your only solution. Mitigate the damage by engaging the offending PM and try to keep them under thumb by sharing your expertise with them. Good luck with that brick wall. YMMV.
        • by Terje Mathisen (128806) on Sunday March 23 2008, @09:40AM (#22836352)

          There are two schools of thought about corner cases (and the GP pointed out one).
          Thought #1) (GP) There's no such thing as a corner case. It is a requirement - it may be that fewer people/fewer processes use it; but, it is still a section of the total solution that must be designed to overcome some problematic section. Otherwise, why is the code being written?

          Thought #2) Corner cases only affect a small number of your user-base; therefore, code to satisfy 95%-99% of your customers. The underlying principle here is that the manager will wait for another release. This approach is usually taken when the project manager failed to account for something and says (and I quote), "We'll just re-design it after the first release."
          I have taken part in a few optimization competitions, and each time #1 has been a crucial part of the solution:

          The usual approach is to optimize the 90-95% case, then bail on the remainder, but this will almost always be beaten by code which manages to turn everything into the "normal" case, with no if/else handling, no testing, no branching.

          When I was beaten by David Stafford in Dr.Dobbs Game of Life challenge, I had lots of specialcase code to handle all the border cases, while David had managed to embed that information into his lookup tables and data structures. (He had also managed to make the working set so much smaller that it would mostly fit in the L1 cache. :-)

          When my Pentomino solver won another challenge, being twice as fast as #2, the crucial idea was to make the solver core as tiny as possible, with very little data movement and the minimum possible number of tests.

          Terje
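
          The "turn everything into the normal case" idea can be sketched roughly as follows. This is not David Stafford's code or any contest entry, just a minimal illustration under assumed names (LifeGrid, kNextState and step are all hypothetical): the grid carries a permanent border of dead cells, so edge cells are handled by the same neighbour sum as interior ones, and the birth/survival rules live in a lookup table instead of if/else tests.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical sketch: the grid is stored with a one-cell dead border on every
// side, so the update loop never needs a bounds test for edge cells.
struct LifeGrid {
    std::size_t rows, cols;            // logical size, excluding the border
    std::vector<unsigned char> cells;  // (rows + 2) * (cols + 2); 0 = dead, 1 = alive

    LifeGrid(std::size_t r, std::size_t c)
        : rows(r), cols(c), cells((r + 2) * (c + 2), 0) {}

    // Access by logical coordinates; the +1 skips the border.
    unsigned char& at(std::size_t r, std::size_t c) {
        return cells[(r + 1) * (cols + 2) + (c + 1)];
    }
};

// Next state indexed by [currently alive][live neighbour count].
// The table replaces the usual if/else chain for the Life rules.
constexpr std::array<std::array<unsigned char, 9>, 2> kNextState = {{
    {0, 0, 0, 1, 0, 0, 0, 0, 0},  // dead cell: born with exactly 3 neighbours
    {0, 0, 1, 1, 0, 0, 0, 0, 0},  // live cell: survives with 2 or 3 neighbours
}};

// Computes one generation. Cells beyond the logical grid count as dead.
LifeGrid step(const LifeGrid& g) {
    LifeGrid next(g.rows, g.cols);
    const std::size_t stride = g.cols + 2;
    for (std::size_t r = 1; r <= g.rows; ++r) {
        for (std::size_t c = 1; c <= g.cols; ++c) {
            const std::size_t i = r * stride + c;
            // Same eight reads for every cell, border neighbours included.
            const unsigned n =
                g.cells[i - stride - 1] + g.cells[i - stride] + g.cells[i - stride + 1] +
                g.cells[i - 1] + g.cells[i + 1] +
                g.cells[i + stride - 1] + g.cells[i + stride] + g.cells[i + stride + 1];
            next.cells[i] = kNextState[g.cells[i]][n];
        }
    }
    return next;
}
```

          The trade is a little extra memory for the border in exchange for an inner loop with no special cases; a serious entry would presumably go much further (packing cells into bits, for instance), which this sketch does not attempt.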
    • Re:Grok it. (Score:4, Interesting)

      by blahplusplus (757119) on Sunday March 23 2008, @07:04AM (#22835518)
      "Believe it or not flowcharts and Venn diagrams are not obsolete."

      Believe it or not, I use mindmapping software to help plan out the structure of a program and draw relationship lines arbitrarily. I wish someone would make these mindmapping programs more accessible to programmers and programming.

      http://www.thebrain.com/ [thebrain.com]

      Also great flowchart drawing tools:

      http://www.smartdraw.com/ [smartdraw.com]
      • Re:Grok it. (Score:4, Informative)

        by cbart387 (1192883) on Sunday March 23 2008, @07:36AM (#22835632)
        Doxygen [stack.nl] is my favorite tool for C/C++/Java programming. It also handles some other random languages as well. Its main purpose is to create documentation (think javadoc but Open Source, handles more than just java, and better results). Here [usc.edu]'s an example of what it can do.

        Anyways, related to your post, doxygen can map out the call graph [usc.edu] from functions and dependency/include graphs [usc.edu] of files. It may be helpful in understanding the structure.
    • What? (Score:3, Insightful)

      Create policies like "maximum function call depth"
      I would rather have a policy of maximum function size (which may increase function call depth) than this policy.

      Do you want to encourage people to inline their functions manually, and not divide things into small, cute trivial functions?

      Is this a misguided attempt to increase efficiency?
  • Just burning up the comment threads on this one.
  • by Cecil (37810) on Sunday March 23 2008, @03:16AM (#22834898) Homepage
    Code that was written while drunk, high, or half-asleep I will be deeply suspicious of, and probably needs to be refactored immediately. Anything else probably needs refactoring as well, but less urgently.
      • It's suspicious if you're reading it and you don't realise it's your code because you keep thinking "What the hell is this guy doing?"
        • It's suspicious if you're reading it and you don't realise it's your code because you keep thinking "What the hell is this guy doing?"


          Ah, someone else who writes perl...

  • I hate the e.e. cummings [wikipedia.org] method of refactoring, which is to run all your code through a lower-case filter. Never seems to help very much.
    • I prefer the Raymond Chandler method - if you're having a problem with a section of code, have a man come through the door holding a gun in his hand.
      • How about the Joel Spolsky method of refactoring [joelonsoftware.com]?

        10 Never throw old code away.

        20 If code is broken, GOTO 10.

      • > if you're having a problem with a section of code, have a man come through the door holding a gun in his hand.

        Unfortunately, if you are having a problem with spaghetti code, like I am, your man would have to crawl on his belly for several miles in twisty passages all alike before reaching the actual problem.
      • I say Jeeves, cancel my engagements for the morning, Aunt Agatha has decided that I must refactor my code so the Drones Club annual 'ship without testing' party will have to wait.

        Adversity strikes when one least welcomes it Sir.

        She claims my code 'smells'. I'll have her know my code smells as spiffly as a, as a, well, as a whatnot Jeeves.

        Indeed sir.

        Yes, a whatnot. I check my code against the very latest coding practices, and sometimes I even run it through unit tests!

        Admirable qualities in a coder, if I may say, Sir.

        Yes you may Jeeves. Now. to work! beastly testing.

        Sir, perhaps one could use some automated tool or other method of achieving the requisite level of quality desired.

        You know Jeeves, you've hit it right on the head there. I'll get Bernie Smetherington-Smythe to do it, he's such a ghastly bore but, well, when it comes to code review testing, there's no-one that can cut the mustard quite like him. Zip the source up Jeeves, we're to go pay Bernie a visit.

        Certainly Sir, but what if Aunt Agatha finds out?

        Pish Jeeves, pish! The auditors won't be around for months, no-one'll be any the wiser, and I can go to the ship-without-testing party after all. Life just falls into place sometimes doesn't it Jeeves? After all, What could go wrong?

        Yes Sir.
        • There's someone at the door, Jeeves.

          Very good, sir. Mr. Fink-Nottle, sir.

          What ho, Gussie.

          Oh, Bertie, thank heavens you're here! Someone is appropriating the prose style of the greatest author the English language has ever produced, and doing it in the most dreadful manner! He's even capitalizing the word "sir", and having Jeeves make interrogatory rather than simple declarative statements!

          Sorry, Gussie, he did a simple what?

          Oh Bertie, you ass, Jeeves would never actually question you! He would never say, "Certainly Sir, but what if Aunt Agatha finds out?" because that's a flat out question! Besides, he certainly wouldn't refer to your relative as "Aunt Agatha"! He might say, "Certainly Sir, but I might draw attention to the fact that Mrs. Gregson would take a dim view of such an approach." Bertie, you have to do something!

          All well and good, Gussie old thing, but what am I to do? The hands of the Woosters are tied, as it were.

          Not you, you fathead. We want Jeeves for this sort of thing!

          Ah, of course. Jeeves?

          Yo, Mr. B, what up?

          Jeeves, if you could forego the anachronistic and inappropriate argot for the moment, we have a problem, or rather a sort of quandary which requires your attention.

          Word. I talk my talk, yo.

          Sharpen your wits, Jeeves, for this is unlike any you have faced before, and I fear that even you may not be up to the task.

          De nada, boss. I got yer solution right here.

          Jeeves? Do I hear correctly? We've not yet set the problem before you, and you have an answer for us?

          Damn, bitch, didn't I just say that? Can't I hear my own self talking? Sh*t, I know what the problem is and I got the answer. It's self-referential code, dude. The problem is the solution, and vice versa. Get the code to recognize its own faults, and set it to modify itself.

          And we would then end up with...?

          Undying prose, sir.

          Yes, Jeeves. How appropriate.

          With sincerest apologies to the Master, P.G. Wodehouse, whose writings gave me so much pleasure over the years, until I tried to write novels myself. Then they made me want to kill myself for my inadequacies as a writer.
  • by nullchar (446050) on Sunday March 23 2008, @03:31AM (#22834934)
    ...is a neat idea. Besides the mentioned practice of raising and lowering pieces of code that the developers are happy and dissatisfied with, hanging code encourages peer review.

    Perhaps not in-depth code review, but physically hanging code in your office might "scare" developers into adhering to their organization's standards for fear of their coworkers' mockery of poor code.

    It might be difficult to hide shitty code when anyone can walk by and look at what *you* think is good.
    (At least it might take just as much effort to hide bad code as it does to make it good.)
    • by symbolset (646467) * on Sunday March 23 2008, @03:46AM (#22834990) Journal

      But the screen resolution of fanfold paper hanging on the wall cannot be beaten by the best modern monitors.

      Sometimes just printing the stuff out, papering the floor with it and literally crawling over it yields answers that otherwise escape.

      If the line width won't fit on the paper at a reasonable pitch, there's a clue right there.

    • BTW (Score:5, Interesting)

      by symbolset (646467) * on Sunday March 23 2008, @03:57AM (#22835030) Journal

      I'm agreeing with you. 30k lines is 500 pages. That's roughly 8' high by 50' wide. Definitely doable.

      Not about the scaring though -- just about it being useful. Anxiety isn't something I'd want to deliberately introduce to a working programmer. Most of the ones I've known had enough performance anxiety issues of their own without adding any.

      Hanging the code makes some errors more visible. Not all errors are bugs. Some are structural. Structural fixes sometimes repair "pernicious" bugs.

    • by johannesg (664142) on Sunday March 23 2008, @03:59AM (#22835038)
      I have an even better idea: instead of printing the code on paper, maybe we could represent it by making corresponding holes in little cards. The cards you could hang in front of the window. As the classes get simpler, the holes can get bigger (because less total space is needed) and they get spread around more easily, so more and more light filters through. This way we can emulate the "sun rising on the project", "light at the end of the tunnel" feeling we all love so dearly.

      Need a status update? Just look into the room - if you can see sunlight, the work is done!

  • Big Visible Charts (Score:5, Interesting)

    by EponymousCoder (905897) on Sunday March 23 2008, @04:20AM (#22835094)
    I really like the concept, and it fits in with a bunch of techniques we've been using at work in line with the "Big Visible Charts" ideas. Things like this and Agile stories written on index cards and pinned to the wall do sound hokey. A number of people, like Johanna Rothman http://www.pragprog.com/titles/jrpm [pragprog.com], point out, however, that these techniques are a lot more inclusive and (as I've found) you get much more animated discussions than when the pm/architect/team lead writes a document "for discussion."
    If nothing else it's fun to watch management trying to cope with your walls being covered with sheets of paper, cards and string when they've paid all this money for MS Project and the Rational Suite.
     
  • by S3D (745318) on Sunday March 23 2008, @04:41AM (#22835144)
    My code is not ugly. It's battle-scarred
  • The art (Score:4, Insightful)

    by www.sorehands.com (142825) on Sunday March 23 2008, @04:48AM (#22835164) Homepage
    Most of the books and documents that I read in the last 20 years go towards metrics, statistical analysis of code. This ignores the Zen and art of coding and debugging. While much of coding is science, there is a part of it that is feel. If it is only science, then code generators would have already eliminated programmers.

  • When refactoring dirty code, avoid doing minor cleanups on the other code. This way the places you still need to work on stand out from the rest. In any case, as soon as minor cleanups go beyond layouting, it also means you're doing changes in code without test coverage. Even straightening out if/else clauses easily leads to errors.
    • the problem with that is without doing minor cleanup it is sometimes rather hard to work out what a piece of code is trying to do. I'm talking the kind of function that has a cryptic name, one or two letter parameter/variable names and no comments.

      • Re: (Score:3, Insightful)

        Agreed. And you have to change code anyway when you're moving functions that are defined elsewhere, so the code does change.
        The key idea though is, you have an array of visual cues that tell you instantly this code still needs to be refactored. These cues often can be removed in bulk, even automated with scripts. Indentation for example. Or use of deprecated functions. Certain types of comments. It's attractive to do these bulk cleanups because they give the overall code a healthier outlook. But they remove
  • The article highlights a principle which we all know (either explicitly or implicitly): we are highly vision-oriented creatures; visual perception is (relatively) easy for us. A quick convincer: coloured and neatly indented code is easier to read than monochromatic unindented code, right? So perception of colour and position is faster than that of symbols and their relationships.

    The methods in the article play right into this: by viewing the code zoomed out greatly, one can readily see the density of code, and get a visual "fingerprint" of each chunk. By coupling printout position to satisfaction with the printed code, one can readily see which piece of code needs the most work.

    Interesting additions: adding colour to each class and method based on how much memory they allocate (or how many objects they construct); or colouring functions relating to their position in the call graph, or their in-degree.
  • ...takes a very long time on the product of two large prime codes.
  • by IainMH (176964) on Sunday March 23 2008, @06:03AM (#22835324)
    Talk about decoupled classes..

    What ho.
  • by johannesg (664142) on Sunday March 23 2008, @06:10AM (#22835346)
    I was under the impression that "large projects" started somewhere around the million lines of code mark, not at a mere 30K lines. But here is what I do, and none of this requires any special insights into the source code (note that I do this primarily for C++):

    1. Ruthlessly delete lines. Get rid of ***anything*** that does not contribute to correct operation or understanding. Even including things like version history (that's why you have the damn tool, use it already (1)!), inane comments (but keep the stuff that actually helps with understanding), code that is commented out (if you really need it, it will be in the aforementioned version tool), code that is not called, and code that is not doing anything at all (such as empty constructors or destructors).

    2. Decrease the scope of everything to be as tight as you possibly can. Make everything that you can private, static, or whatever else your language offers to decrease scope. Declare variables in the innermost scope. Make them all const if possible.

    3. Anything that belongs together should be in one file (even if that file becomes 5000 lines long). Anything that *doesn't* belong together should be split into separate files (but don't make a file for just a single function - instead create a file with "leftovers").

    4. Anything that has a non-descriptive name is to be renamed to what it really represents. No more "int x; // x is the number of blarglewhoppers" - just use "int NumBlargleWhoppers" instead.

    5. Keep an eye open for duplicate code. Get rid of the duplicates.

    6. Any special insights gained, write them down as comments in the appropriate place. Anything you do NOT understand, also write them down as comments. Mark those with something you can grep for.

    7. Any homegrown version of something that is available in STL or boost, to be replaced by its "official" alternative.

    8. And that goes double for string operations! No more "char *" anywhere; it is the 21st century, use strings already! I'll make an exception for functions that allow "const char *" to be passed in, but only with the "const". If I find a "char *" without the "const", I *will* come to your office and bash your head against the wall. Repeatedly. Just so you know.

    9. Any error handling through error return codes, probably to be replaced by exceptions, unless it turns the calling code into a wild mass of try/catch blocks.

    10. Pointers, to be replaced by references where possible.

    11. Negative logic and names, to be replaced by positive logic and names. Don't have "if (!NoPrinterAvailable()) {A();} else {B();}" - instead do "if (PrinterAvailable()) {A();} else {B();}".

    12. Anything that looks like it was written by drunk lemurs or the French, to be deleted on principle and replaced by something sane.

    So there you have it. In my experience, doing this will remove about half of the lines of code (more if there was a significant number of lemurs on the team), at the gain of considerable clarity and usually performance.

    (1) And honestly, I don't give a flying fuck which one of you messed up on the 29th of february 1823 or why you thought it was a good idea in the first place. I'm concerned with what the code will be doing in the future, not how it came to be in this sorry state. Chances are, whatever you thought at the time is long obsolete anyway. Get rid of the cruft. Get rid of anything that doesn't help - it just clutters the mind.
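
    A few of the points above (2, 4, 10 and 11) can be shown together in a small sketch. Everything here is made up for illustration: PrintQueue, NumPendingJobs and DescribeQueue are hypothetical names, not taken from any real code base.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical printer queue, used only to illustrate points 2, 4, 10 and 11.
struct PrintQueue {
    std::vector<std::string> jobs;
    bool online = true;

    // Point 11: a positive name, so callers never write a double negative
    // such as !NoPrinterAvailable().
    bool PrinterAvailable() const { return online; }
};

// Point 10: the queue is taken by const reference instead of by pointer.
// Point 4: the name says what the value is, so no explanatory comment is needed.
std::size_t NumPendingJobs(const PrintQueue& queue) {
    return queue.jobs.size();
}

std::string DescribeQueue(const PrintQueue& queue) {
    if (queue.PrinterAvailable()) {
        // Point 2: declared in the innermost scope that needs it, and const.
        const std::size_t numPendingJobs = NumPendingJobs(queue);
        return "printer ready, " + std::to_string(numPendingJobs) + " job(s) pending";
    } else {
        return "printer unavailable";
    }
}
```

    Whether the pointer-to-reference swap in point 10 is always a win is debatable, since a reference hides at the call site that the argument might be modified; here the parameter is const, so that concern does not apply.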

    • Of course you could just chuck all this object-disoriented stuff and write in good, old fashioned C, like the rest of us.

      If it's too big to fit in the address space of a 6502, then you are doing it all wrong. (or maybe it should have been done in SNOBOL in the first place.)

    • by siride (974284) on Sunday March 23 2008, @08:08AM (#22835786)
      > 9. Any error handling through error return codes, probably to be replaced by exceptions, unless it turns the calling code into a wild mass of try/catch blocks.

      Exceptions should be used to mark, well, exceptional failure. I really really hate this pattern that Java (and perhaps from elsewhere) has foisted upon us where we get frickin exceptions because we reached the end of the file. That is technically an error condition in the reader function, but it is not exceptional and it shouldn't require me to write the "wild mess of try/catch blocks" just to read in data from a file. Exceptions say "we are really in a mess and have to abort this operation, and potentially the program." They do not say "could not find element x in array".

      If that's what you were saying, however, then I apologize.
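
      For what it's worth, standard C++ streams already lean this way, and a short sketch may make the distinction concrete. ReadAllLines is a hypothetical helper, and treating badbit as the "exceptional" case is an assumption made for this example rather than a rule from the post above.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <stdexcept>
#include <string>
#include <vector>

// Reads every line from the stream. Reaching the end of the input is an
// expected outcome, so it simply ends the loop: std::getline leaves the
// stream in a false state and nothing is thrown.
std::vector<std::string> ReadAllLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    // badbit signals an unrecoverable I/O error rather than plain end-of-file;
    // this sketch treats that, and only that, as exceptional.
    if (in.bad()) {
        throw std::runtime_error("I/O error while reading input");
    }
    return lines;
}
```

      A caller can then loop over the result with no try/catch at all, and only code that actually cares about broken devices needs a handler.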
    • by Enleth (947766) <enleth@enleth.com> on Sunday March 23 2008, @08:35AM (#22835950) Homepage
      I'd disagree on pointers and references. If you pass something in by reference, you need to know it goes in there by reference, it's not visible in the calling code. If something's not visible - well, that's a bug just waiting to crawl in there. If you pass something by pointer, the calling code shows it clearly and you know that whatever was passed is likely to be changed by the called function. That's the rationale used by Trolltech [trolltech.com] and it is quite convincing to me.

      Besides, using char * is a must sometimes, when using C libraries that accept, modify and return strings or just some chunks of arbitrary data as char *.
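
      A tiny sketch of the call-site difference, with made-up function names (ResetByReference, ResetByPointer and Demo are hypothetical and only illustrate the argument):

```cpp
// Both functions zero a counter; they differ only in how the caller must
// spell the call.
void ResetByReference(int& counter) { counter = 0; }

void ResetByPointer(int* counter) {
    // The price of the pointer style: the callee has to cope with null.
    if (counter != nullptr) {
        *counter = 0;
    }
}

int Demo() {
    int hits = 5;
    ResetByReference(hits);  // reads like pass-by-value; the mutation is invisible here
    hits = 7;
    ResetByPointer(&hits);   // the & tells the reader that hits may be modified
    return hits;
}
```

      The null check is the usual counter-argument: a reference cannot be null, so the pointer style buys visibility at the cost of one more state to handle.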
        • Re: (Score:3, Insightful)

          That means you are dependent on a big, clunky IDE for writing your code. Not everyone uses them even for big projects - for example, KDE's Kate is sophisticated enough to handle those, yet still lightweight. Even worse if you are writing an API for a library: you are forcing everyone using it to memorize where the references were or use a big, clunky IDE. And even if you use an IDE, you sometimes need to read a piece of code and see such things without retyping the paren to force a dumb IDE to display the p
    • Re: (Score:3, Insightful)

      12. Anything that looks like it was written by drunk lemurs or the French, to be deleted on principle and replaced by something sane.

      I'm sorry but I find all this French bashing racist. Unless, of course, you have some information on the coding tendencies of the French that I do not. But having worked with a handful of French people, I can say nothing bad about them. I know French "jokes" may be acceptable in the U.S. but this is the Internet and try to behave yourselves. As a rule of thumb: r

      • Re: (Score:3, Informative)

        I know French "jokes" may be acceptable in the U.S. ...

              only for a small conservative subset of the U.S., which as we know have proven themselves to be a joke.

          rd
    • by jsebrech (525647) on Sunday March 23 2008, @11:43AM (#22837016)
      So there you have it. In my experience, doing this will remove about half of the lines of code (more if there was a significant number of lemurs on the team), at the gain of considerable clarity and usually performance.

      I work on a 2 million line code base, written by a few dozen people, most of them off-shore, that is poorly commented, poorly documented, and has many modules of code that no one in our team understands well. In other words, a typical large commercial code base.

      At first, I would routinely aggressively clean up sections of code as I made changes in them. But then I started to notice a pattern: there were bugs in the functionality of that code that weren't there before I "cleaned it up". When you refactor highly convoluted code, it is seductive to make assumptions about the working of that code (especially in how the code interacts with the rest of the system), because it is hard work to actually figure it out completely. Those assumptions have a nasty tendency to be wrong.

      Nowadays I approach code changes like this: if I don't understand the code 100 percent, I make my changes as low impact as possible, even if it means uglifying the code. If some part of the code base needs refactoring to allow implementing a new feature, I first figure it out fully, document its existing behavior (often line-by-line, call-by-call, class-by-class), look at every place in the entire code base where it is called (and document those places), and only then do I refactor it.

      The point is this: if code is ugly and slow, but it works, it is better code than clean, fast, beautiful code with bugs. Better in the sense that it makes the user happier, and the user is one of only two metrics that truly matter in software development (the other being cost). Always resist modifying code just for the sake of cleaning it up. If it works, don't touch it.
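
      One way to make "document its existing behavior, and only then refactor" mechanical is a characterization check: run the old and new code side by side over a range of inputs and list every input where they disagree. The sketch below is purely illustrative; both shipping-cost functions and FindBehaviourChanges are invented for the example.

```cpp
#include <vector>

// Hypothetical legacy routine, kept exactly as found.
int LegacyShippingCents(int items) {
    int c = 500;
    if (items > 10) c = c + (items - 10) * 25;
    if (items > 50) c = c - 200;
    return c;
}

// Candidate replacement under review.
int RefactoredShippingCents(int items) {
    const int baseFee = 500;
    const int perExtraItem = 25;
    const int bulkDiscount = 200;
    int cost = baseFee;
    if (items > 10) cost += (items - 10) * perExtraItem;
    if (items > 50) cost -= bulkDiscount;
    return cost;
}

// Characterization check: returns every input in [firstItems, lastItems]
// for which the refactored code no longer matches the legacy behaviour.
std::vector<int> FindBehaviourChanges(int firstItems, int lastItems) {
    std::vector<int> mismatches;
    for (int items = firstItems; items <= lastItems; ++items) {
        if (LegacyShippingCents(items) != RefactoredShippingCents(items)) {
            mismatches.push_back(items);
        }
    }
    return mismatches;
}
```

      An empty result does not prove the two are equivalent, it only covers the inputs tried, but it turns "I assumed it did the same thing" into something that can be rerun after every change.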
  • Nope, it's not. And it's a stupid practice, the way some people try to define it.

    "What are you doing?"

    "I'm refactoring code."

    "Oh, you aren't doing anything."
  • by kurisuto (165784) on Sunday March 23 2008, @07:35AM (#22835616) Homepage
    From TFA:

    "The problem is that warty old code isn't always just warty - it's battle-scarred. It has years of tweaks and bug-fixes in there to deal with all sorts of edge conditions and obscure environments. Throw that out and replace it with pristine new code, and you'll often find that a load of very old issues suddenly come back to haunt you. So, a total rewrite is out. This means working with the old code, and finding ways to wrestle it into shape."


    There's a big difference between having code which just happens to somehow work, and having code which works because the code is clearly written and documented, where the person in charge of maintaining it actually understands what the code is doing.

    Whether you rewrite from scratch or work with the legacy code, it's your job as the programmer to understand and document all of the tweaks, bug fixes, edge conditions, and obscure environments. If there aren't comments in the existing code to explain these things, then it's your job to understand why the code is doing what it is doing, and add the comments as needed. If the code isn't clear, it's your job to make it clear.

    The author correctly points out that when you do a total rewrite, then the undocumented special cases handled by the old code will make themselves felt. As these problems present themselves, it takes time to fix them. However, you also get the opportunity to understand the undocumented special cases and get them clearly coded and properly documented, which reduces maintenance costs over the long term. Your judgment whether to maintain or to rewrite should take both of these factors into consideration.

  • by martyb (196687) on Sunday March 23 2008, @08:02AM (#22835764)
    FTFA:

    ... A better solution would be to print a class per page. At the start of the project, the application had about 150 classes, and the refactoring effort is focussed on about 80 of those. Initially, gigantic classes would be an incomprehensible smudge of grey, but as the refactoring process starts tidying the code and factoring out into other classes, the weekly printout would start to literally come into focus, hopefully ending up with many pages actually containing readable code (which happens roughly when the class is small enough to fit on no more than 3 pages at normal size).

    Brilliant! Absolutely brilliant! "Smell test?" Yah, right. But then I got to thinking, "Why are code formatting standards such a hot topic?" The computer doesn't care if indentation is expressed with 2 spaces, 3 spaces, or a tab. But, I do! Over time, I've learned how to see coding errors just from the slight aberrations in the LOOK of code. Couldn't tell you WHAT it was, at first, it just felt (or smelled) wrong. So call it what you will, but I could now see how "smell test" has some basis behind it. Then, I got to thinking of an age-old question:

    How do you find a needle in a haystack?

    1. Make the haystack smaller, and/or
    2. Make the needle(s) bigger

    The technique in the article accomplishes BOTH of these. I'd suggest running the code through a pretty printer [wikipedia.org] to get consistent layout throughout the whole project. The more the semantics of the project can be represented by syntax, the more visible the troublesome code becomes.
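To put rough numbers on the "one class per page" idea quoted above, here is a minimal sketch. The 60-lines-per-page figure and all function names are assumptions for illustration; only the "no more than 3 pages at normal size" threshold comes from the article.

```cpp
#include <cassert>

// Assumed value: lines that fit on one page at normal font size.
constexpr int kLinesPerPage = 60;

// Pages the class would need at normal size (rounded up).
int pages_at_normal_size(int lines_of_code) {
    return (lines_of_code + kLinesPerPage - 1) / kLinesPerPage;
}

// Shrink factor needed to squeeze the whole class onto a single page.
double scale_to_one_page(int lines_of_code) {
    if (lines_of_code <= kLinesPerPage) return 1.0;
    return static_cast<double>(kLinesPerPage) / lines_of_code;
}

// The article's rough threshold: readable at no more than 3 pages at normal size.
bool roughly_readable(int lines_of_code) {
    return pages_at_normal_size(lines_of_code) <= 3;
}
```

Under these assumptions a 1,200-line class prints at 5% of normal size (the "smudge of grey"), while anything up to 180 lines counts as roughly readable.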

      • Re: (Score:3, Insightful)

        I'd suggest running the code through a pretty printer to get consistent layout throughout the whole project.

        Umm... running it through a pretty printer wipes out the very details that printing out is supposed to bring out. After pretty printing, you are no longer seeing the 'native' code - but rather you are seeing the patterns hard coded into the pretty printer.

        I respectfully disagree. Consider a piece of code that has 8 levels of nesting. With a judicious use of short variable names, parentheses, an

        • Re: (Score:3, Insightful)

          Well the obvious problem with the analogy is that you're not finding needles in a haystack - you're looking for hay in a haystack.
  • I love the way the article uses the complete rewrite of Netscape as an example of why you shouldn't rewrite from scratch. Cause we all know how big a failure Firefox is </sarcasm>
    • Not to mention that Netscape was already doomed as a browser company well before that rewrite started.
    • Re:Netscape (Score:4, Informative)

      by balster neb (645686) on Sunday March 23 2008, @10:01AM (#22836472)

      I love the way the article uses the complete rewrite of Netscape as an example of why you shouldn't rewrite from scratch. Cause we all know how big a failure Firefox is
      Have to disagree with you.

      While the Mozilla story did have a happy ending, the rewrite resulted in IE getting a near monopoly of the browser market. The "new" Netscape was massively delayed, and was finally released as a rebranded version of the bloated Mozilla suite. It was in the period between about 1999 and 2004 that IE expanded its market share. In other words, Netscape lost as a result of throwing away the old code base.

      It was only from around 2004 onwards, with Firefox, that Mozilla was able to present a viable alternative to IE.
  • by joe_n_bloe (244407) on Sunday March 23 2008, @04:30PM (#22838868) Homepage
    I'd like to focus on the author's comments about rewriting vs. refactoring. From July 25, 2000 [perl.com]:

    Last Monday, nobody knew that anything unusual was about to happen. On Tuesday, the Perl 6 project started. On Wednesday, Larry announced it at his "State of the Onion" address at the Perl conference.

    It's one thing to decide to rewrite rather than refactor a product that is losing market share because it is not performing as well as its competitors. (E.g. Netscape.) It's another thing to decide to rewrite (and redesign) rather than refactor a wildly successful and popular product because its continued development has become difficult. Just shy of eight years later, Perl 5 is still creaking along nicely, and Perl 6 (White Elephant Service Pack) is still under design as much as development.

    Is Perl 5 so hard to refactor that a determined effort couldn't have made progress, or been completed twice over, in 8 years? Along the way, a lot of the cruft and inelegance in the language could have been removed, and more elegant features inserted.

    It happens over and over again - developers, even experienced ones, can't see the impracticality of what they're getting into, and can't see that they're doing work that isn't needed.
  • by mAx7 (137563) on Monday March 24 2008, @04:09AM (#22843094)
    In the Complexity Map [complexitymap.com], a slightly similar approach, a treemap is used to visualize the code's namespace hierarchy in a 2D landscape. Results from code metric tools are laid out in the treemap, either for individual metrics (e.g. cyclomatic complexity) or for aggregated metrics (anything that influences team productivity; e.g. errors that are not logged). Due to the Prefuse [prefuse.org]-based seamless zooming, combined with drill-down functionality, it's really easy to visualize and investigate hotspots in extremely large codebases [complexitymap.com].

    The website contains some more background and a nice interactive demo [complexitymap.com]. If you have the patience to wait for the applet to load, I guarantee you'll like it.

    Disclaimer: I am the author of this tool. The website mentions commercial interest, but to be honest: there's hardly any. I've found that the concept is just too difficult to sell over the web, so I'll probably open source it soon.
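For readers who haven't met treemaps, a minimal sketch of the general idea follows. This is the textbook "slice-and-dice" layout, not the Complexity Map's actual algorithm (which the comment doesn't describe). The `Node`, `Rect`, and `Tile` types and the `total` and `layout` functions are hypothetical names made up for illustration. It assumes C++17, since `Node` holds a `std::vector` of itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types for illustration; not taken from the Complexity Map tool.
struct Rect { double x, y, w, h; };

struct Node {
    std::string name;            // namespace or class name
    double metric;               // e.g. cyclomatic complexity (used for leaves only)
    std::vector<Node> children;  // empty for a leaf (a class)
};

struct Tile { std::string name; Rect rect; };

// Total metric of a subtree: a leaf's own value, or the sum over its children.
double total(const Node& n) {
    if (n.children.empty()) return n.metric;
    double sum = 0.0;
    for (const Node& c : n.children) sum += total(c);
    return sum;
}

// Slice-and-dice layout: split the rectangle among the children in proportion
// to their totals, alternating the split direction at each hierarchy level.
void layout(const Node& n, Rect r, bool horizontal, std::vector<Tile>& out) {
    if (n.children.empty()) {
        out.push_back({n.name, r});
        return;
    }
    const double sum = total(n);
    if (sum <= 0.0) return;  // nothing to draw for an all-zero subtree
    double offset = 0.0;     // fraction of the rectangle already handed out
    for (const Node& c : n.children) {
        const double share = total(c) / sum;
        Rect sub = horizontal
            ? Rect{r.x + offset * r.w, r.y, share * r.w, r.h}
            : Rect{r.x, r.y + offset * r.h, r.w, share * r.h};
        layout(c, sub, !horizontal, out);
        offset += share;
    }
}
```

Each class ends up with a tile whose area is proportional to its metric, so a class with three times the complexity of its sibling gets three times the space, which is what makes hotspots stand out.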
    • Re: (Score:3, Insightful)

      Elegant as it may be, that version of quicksort is so slow that (IIRC) even the Haskell documentation suggests against using it in "real" code.

      Personally, I think the C++ way is even easier to read, and it has the benefit of being really fast:

      sort(xs.begin(), xs.end());
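For anyone who wants to try that one-liner, a minimal self-contained sketch follows. It assumes `xs` is a `std::vector<int>` (the comment doesn't say what `xs` is), and `sorted_copy` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: takes xs by value, sorts that copy in ascending order
// with std::sort, and returns it, leaving the caller's vector untouched.
std::vector<int> sorted_copy(std::vector<int> xs) {
    std::sort(xs.begin(), xs.end());
    return xs;
}
```

`std::sort` sorts in place and needs random-access iterators, so it works on `std::vector` and `std::array` but not on `std::list`, which has its own `sort()` member.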