
Interview With Martin Fowler

Posted by timothy
from the fowling-piece dept.
Arjen writes "Artima has had a conversation with Martin Fowler, one of today's gurus of software development. It consists of six parts. Parts one, two, three, and four are available now; the rest will appear over the next few weeks."
This discussion has been archived. No new comments can be posted.

  • by stephanruby (542433) on Thursday November 28, 2002 @05:46PM (#4776937)
    Look, Martin Fowler is also complaining about duplication!

    hint...

    hint...

  • by dagg (153577) on Thursday November 28, 2002 @05:48PM (#4776946) Journal
    Martin Fowler's books that outline such things as extreme programming and refactoring are tops in my opinion. But in my experience, many middle-tier and upper-level managers think that such concepts are useless and timewasting.
    --

    Not sex (yes it is) [tilegarden.com]
    • many middle-tier and upper-level managers think that such concepts are useless and timewasting.

      Their problem is, it's very difficult for them to tell if refactoring was actually worth it - their developers start with a program that does A, they stop working on new features for the next release for 3 months, and they come back with a program that still does A ("but better!").

      As a developer you can explain how cleaning up code is preventative maintenance to make development easier in the future, but if you're not a developer that's a very hard thing to measure.

      Even if you trust your engineers implicitly, any good manager has to realise that developers love intellectual problems over more mundane tasks - and refactoring's biggest problem is that by design you end up looking like nothing happened.

      • That's nothing a good analogy couldn't explain. I don't know if I can think of a good one off the top of my head.

        Okay, say a new doctor had received his instruments one at a time and upon each delivery they were placed in a random spot in the operating room. During operations, the doctor found himself wasting a lot of time looking everywhere for the right instrument so one afternoon he took the time to organize them all into one place. While operating, he was still able to do the same things, but more efficiently.
        • Well one could ask how he ever became a surgeon in the first place, leaving scalpels on the floor.

          Coders who suck in round one of a project often suck in rounds two and three.

          • You certainly have a point about sucky coders creating sucky code. However, programs that are initially well designed often get bloated with all sorts of cruft over time as features are added and requirements change. Changing requirements tend to turn good designs into bad designs, since the original good design was for a different problem than the one the current program solves. This mess can often be cleaned up with refactoring. Often one is also forced to work with code created by someone else, and if that code is messy but not completely hopeless, wonders can often be achieved by refactoring it.
      • As a developer you can explain how cleaning up code is preventative maintenance to make development easier in the future, but if you're not a developer that's a very hard thing to measure.

        Even then it is often a wash. How many people "refactored" to support OLE? CORBA? All of those would be a waste of time now. How many people refactored to get OO into their code? Dubious gains there too.

        Refactoring is often about injecting the last good idea you had into working code.

        I won't dispute that there are legitimate cleanups, but if a developer couldn't do it right the first time, I have my doubts that round two is going to be much better.

        • Refactoring is often about injecting the last good idea you had into working code. I won't dispute that there are legitimate cleanups, but if a developer couldn't do it right the first time, I have my doubts that round two is going to be much better.

          I think you're probably right on both counts... :-) My experience has been that refactoring is most successful when applied to code that was in pretty poor shape to start with.

          Often this is because it was the first attempt by someone who didn't really know what they were doing, they kept hacking on it until it limped into life, and then it got shipped. "Refactoring" in these cases is really just a face-saving way of cleaning up someone else's mess...

          If you've been working professionally for more than 5 years, and you're reasonably good at it, then writing clean maintainable code should be second nature ("refactoring" as a new concept is really for people who didn't pick up "structured programming" the first time round...).
        • by Spruce Moose (1857) on Thursday November 28, 2002 @06:41PM (#4777116)
          Refactoring is often about injecting the last good idea you had into working code.

          Refactoring isn't about making your project buzzword compliant or supporting distributed OLE-foo++. That's adding new features. From the article:

          Martin Fowler: Refactoring is making changes to a body of code in order to improve its internal structure, without changing its external behavior.
          Refactoring is usually saying "hey, I implemented that function the wrong way so I'm going to rewrite it properly". The right way of doing something is often obvious after coding it up the wrong way.
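          To make Fowler's definition concrete, here is a minimal Python sketch. The pricing rule and every name in it are invented for illustration; nothing here comes from the interview. The point is only that the "before" and "after" versions return the same result for the same input.

```python
# Hypothetical example: the discount rule and all names are made up.

def total_before(items):
    # Original version: works, but the discount rule is buried in the loop.
    t = 0
    for name, price, qty in items:
        if qty >= 10:
            t += price * qty * 0.9
        else:
            t += price * qty
    return t


def line_total(price, qty):
    """Price for one order line, with a 10% discount from 10 units up."""
    discount = 0.9 if qty >= 10 else 1.0
    return price * qty * discount


def total_after(items):
    # Refactored version: same inputs, same outputs, clearer structure.
    return sum(line_total(price, qty) for _name, price, qty in items)


ORDER = [("widget", 2.0, 3), ("gadget", 5.0, 10)]

# External behaviour is unchanged; only the internal structure differs.
assert total_before(ORDER) == total_after(ORDER)
```

          If the final check ever failed, the change would have altered behaviour and would no longer count as a refactoring in Fowler's sense.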
          • Refactoring is also not just something you do (or not) after you've finished (i.e. shipped / deployed). It's something you do as you go along, during the process of building the thing. If you need to refactor substantially after you've finished, then it may be that you didn't refactor enough as you went along.

            Refactoring is a fancy word for what many good programmers who are not geniuses do already, a few good programmers who are geniuses don't need to do, and a lot of bad programmers who may or may not be geniuses don't do enough. Speaking as an OK programmer who is not a genius, I feel I need all the help I can get, be it from test-infection or from taking a considered approach to cleaning up after myself.

            Periodic refactoring helps me keep abreast of shifting requirements; it isn't about prettying-up something that's about to become obsolete anyhow, but about keeping a check on creeping duplication and tacit dependencies so that the code can absorb new requirements instead of being broken by them.

        • by Leeji (521631) <slashdot&leeholmes,com> on Thursday November 28, 2002 @08:21PM (#4777509) Homepage

          Refactoring is often about injecting the last good idea you had into working code.

          I've got to disagree with you on this one. Refactoring isn't about injecting good ideas into working code, it's about making sure that those new requirements you got last week didn't create spaghetti code. In a development world where you design from a static set of requirements, refactoring really doesn't have much of a place. In that world, you have a valid point: if a developer can't do it right the first time, round two won't be much better.

          However, that's not the world I live in, and it's not the world that anybody I know lives in. Then again, I don't have friends at NASA. We live in a more realistic development world that changes requirements after we've designed our framework. We live in a development world where the evolving system makes the customer reconsider their original decisions. Since you didn't design the system with these new requirements in mind, you're usually patching functionality on top of what you've got.

          Refactoring is about reviewing this patch-work to make sure that the code is written as though it were designed that way from the beginning.

          Certain agile methodologies (especially XP) treat refactoring as a tenet. Customers never give you a complete spec, and you never want one. They give you "stories" that you implement quickly and minimally. You don't over-engineer to support infinite extensibility via Web Services and a plugin architecture (etc,) unless the story asks for it.

          When your latest minimal implementation starts looking ugly or hacky (e.g. a four-page switch statement), you refactor until the code properly "expresses its intention."
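          As a rough illustration of the switch-statement case, here is a Python sketch that replaces a growing conditional chain with a lookup table. The shapes and function names are assumptions chosen for the example, and a table is only one of several ways to clean up such code.

```python
import math


def area_before(shape, *dims):
    # The "switch statement" version: every new shape adds another branch.
    if shape == "circle":
        return math.pi * dims[0] ** 2
    elif shape == "rectangle":
        return dims[0] * dims[1]
    elif shape == "triangle":
        return 0.5 * dims[0] * dims[1]
    else:
        raise ValueError(f"unknown shape: {shape}")


# After refactoring: one table states the intent (shape -> formula).
AREA_FORMULAS = {
    "circle": lambda r: math.pi * r ** 2,
    "rectangle": lambda w, h: w * h,
    "triangle": lambda base, height: 0.5 * base * height,
}


def area_after(shape, *dims):
    try:
        formula = AREA_FORMULAS[shape]
    except KeyError:
        raise ValueError(f"unknown shape: {shape}") from None
    return formula(*dims)
```

          Both functions accept the same calls and return the same values; adding a shape to the second version means adding one table entry rather than another branch.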

          You had some good points, but I think you are a little mistaken about when refactoring comes into play.

        • I won't dispute that there are legitimate cleanups, but if a developer couldn't do it right the first time, I have my doubts that round two is going to be much better.

          I have my doubts that you have ever developed a major program. As Fred Brooks said: "Plan to throw one away, you will anyway." In other words there is no way that we know of to get software right the first time. In reality you write something that sort of works, and then you know enough to say "Okay, now that I understand the problem I know how to do it right."

          I'll agree that sometimes the first solution that you wrote up is good enough. However it is ALWAYS hard to maintain. Often there are bugs that the design will not allow you to fix.

          Of course there is some question of whether I was advocating refactoring or rewriting. (And the line is a little fuzzy anyway.) However, round two will almost always be better than round one for any given set of requirements. Round three might not be much better, and could be much worse (if you pointlessly refactor to try some new ideas now that round two works well enough).

          You should always plan for two rounds on any software that you plan to support for any length of time. The first proves it can work, and the second makes sure that whoever has to look at the code in the future can understand it, and fix the bugs.

          • You should always plan for two rounds on any software that you plan to support for any length of time.

            I always plan for three: (1) do it wrong, but understand the problem at a deep level in the process, (2) do it right, (3) do it right, fast, and clean. (1)->(2) is a throw-away and rewrite, (2)->(3) is mostly refactoring-type work, but the term wasn't in use when I started coding so I didn't know what it was until today ;-)

        • No, adding OLE or CORBA support is not refactoring. It's adding new functionality. Refactoring is simplifying code and eliminating redundancies, period.

          Even if we accept your improbable claim that good developers always make their code completely clean on the first iteration, it's still the case that multiple developers on the same project often duplicate work. Every large code base I have worked on has had substantial scope for simplification through refactoring.
        • I won't dispute that there are legitimate cleanups, but if a developer couldn't do it right the first time, I have my doubts that round two is going to be much better.

          What do you mean by first time? Do you mean when the developer first got it compiled? Or do you mean when the developer first wrote the pseudo-code? And then when do you think we should stop refactoring? When the code has gotten to be 20 lines? Or when the code has gotten to be 20,000 lines?

        • You seem to have a somewhat different definition of refactoring than the one Fowler uses in his book on refactoring, in his other writings, and in the interview referenced above.

          First of all, adding OLE or CORBA would not be refactoring. Fowler described it like this: "Refactoring is making changes to a body of code in order to improve its internal structure, without changing its external behavior." [artima.com]

          Secondly, Fowler's book doesn't recommend refactoring for no reason. He lists specific design problems that a developer might see in a body of code they are working on (a method that is too long, two classes too tightly intertwined, etc.). In his book, he describes refactoring as being the flip side of design patterns. Design patterns can be used during the design phase to create a good design. Refactoring can be used during the construction phase to arrive at a good design.

          Thirdly, the developer who didn't create a good design initially can use refactoring and come up with something better, because there are catalogs of effective refactorings [refactoring.com]. The recipes that define these refactorings describe how to make these changes efficiently and safely without disturbing any more of the code than necessary.

          These aspects work together like this. A developer, while coding, finds that some problem is impeding their progress. For example, he discovers that every time he makes a change to one class, he needs to make a corollary change to another class. He then decides that it fits the description of "Feature Envy", and performs the Move Method refactoring. [refactoring.com]
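          A small Python sketch of what that might look like, taking the Feature Envy label at face value. The Account and Statement classes are hypothetical, invented only to show the shape of a Move Method step; they are not taken from Fowler's catalog.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: float
    overdraft_limit: float

    # After Move Method: the calculation lives next to the data it uses.
    def available_funds(self) -> float:
        return self.balance + self.overdraft_limit


class StatementBefore:
    """Feature Envy: this method only pokes at Account's fields."""

    def available_funds(self, account: Account) -> float:
        return account.balance + account.overdraft_limit


class StatementAfter:
    """After the move, the statement simply delegates to Account."""

    def available_funds(self, account: Account) -> float:
        return account.available_funds()
```

          With the logic moved, a change to how Account stores its limits no longer forces a matching edit in the statement class.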

          Basically, I see refactoring as a software developer's equivalent of building codes. Building contractors don't need to know, or at least calculate out every time, the physics involved to make a structure solid enough to support itself and its contents. The building codes are distilled instructions for what the physics calculations would indicate as appropriate action (with a bit of a margin of error). Performing refactorings based on well-known, tested refactorings is using the design tips of people who are much better software designers than you are.

      • Their problem is, it's very difficult for them to tell if refactoring was actually worth it - their developers start with a program that does A, they stop working on new features for the next release for 3 months, and they come back with a program that still does A ("but better!").

        S'funny, most of my refactorings take about three minutes, and come back with a program that still does A exactly the same (but better organised).

        Occasionally, as I've done over the past couple of days, I do spend several hours rewriting a piece of spaghetti into manageable form, just so I can sensibly work with it. I am, at this moment, the living embodiment of the poor soul who is fated to work with hack-on-hack code forever, such that every little change he makes that logically is fine actually causes a different result in the fifteenth decimal place.

        This is the sort of time when refactoring is essential, and of course, if the code has been refactored in small increments along the way, it would never have reached SpagCon 5 and required me to do a big refactor now...

        The only catch, alas, is that now my nice systematic code has to have special cases in to reproduce exactly the behaviour the old spaghetti had in relevant cases, right down to that error in the fifteenth place. :-(
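        One plausible source of that kind of last-place difference (an assumption; the comment doesn't say what caused it here) is that floating-point addition is not associative, so merely reordering a sum during cleanup can change the final digits. A minimal Python sketch:

```python
def sum_forward(values):
    # Accumulate left to right, as the old code might have done.
    total = 0.0
    for v in values:
        total += v
    return total


def sum_backward(values):
    # Same numbers, opposite order, as a tidied-up version might do.
    total = 0.0
    for v in reversed(values):
        total += v
    return total


VALUES = [0.1, 0.2, 0.3]

# Mathematically equal, but the rounding differs in the last place.
assert sum_forward(VALUES) != sum_backward(VALUES)
assert abs(sum_forward(VALUES) - sum_backward(VALUES)) < 1e-15
```

        This is why a "logically fine" rearrangement can still need special handling when downstream consumers expect bit-identical output.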

      • Even if you trust your engineers implicitly, any good manager has to realise that developers love intellectual problems over more mundane tasks - and refactoring's biggest problem is that by design you end up looking like nothing happened.

        Sometimes you can manage management expectations. The next time they complain about the padding you put on your time estimate, explain that the reason it will take so long to add a feature is that the code is brittle. That makes the results of cleanup visible and makes it clear that you aren't just doing science fair stuff, but solving an actual problem.

    • I don't think it's just the PHB crowd that is suspicious of refactoring. My own opinion is that you should simply try your very best in the first iteration and leave it at that. Constantly reworking code that does the very same thing that it did before is often a pointless exercise - by the time you have the internals "clean", new requirements demand a drastic addition or total rewrite anyway.

      Requirements tend to move rapidly. That has been my experience. Dwelling upon one particular snapshot of requirements presumes that your code will have a long active lifespan, which it often doesn't. In my own case, my code has lived on in productive use for about four years on average after it has been written (after which it is often junked entirely), with a creation time often near a year. Those kinds of stats don't lead me to believe that refactoring is worthwhile.

      • I think you're missing the point, which is that as well as meeting the current requirements, refactored code is probably much easier to change to fit the next set of requirements too. At least, that's how it should be...
      • by Ed Avis (5917) <ed@membled.com> on Thursday November 28, 2002 @07:22PM (#4777280) Homepage
        I think the point of refactoring is that it happens exactly when you add a new requirement. You have some code that does X. Now you have a requirement that it do Y as well. There is a clean way to add feature Y, but it would need restructuring the existing code first. So as a first step you refactor the code so it still does X - 'but better!' - and then you can add code to do Y more easily. Doing the two steps separately - the XP rule Do not refactor and add features at the same time - is probably less risky than both at once.
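        Here is a rough Python sketch of the two-step approach. The report functions and the CSV feature are invented for the example; the only claim is that step 1 leaves the output untouched and step 2 adds the new behaviour on top.

```python
# Step 0: the existing code does X (renders rows as plain text).
def report_v0(rows):
    out = []
    for name, value in rows:
        out.append(name + ": " + str(value))
    return "\n".join(out)


# Step 1: refactor only. Line formatting is pulled out into a parameter,
# but the default output is identical to report_v0.
def format_text(name, value):
    return f"{name}: {value}"


def report_v1(rows, format_line=format_text):
    return "\n".join(format_line(name, value) for name, value in rows)


# Step 2: add feature Y (CSV output) as a separate, now trivial, change.
def format_csv(name, value):
    return f"{name},{value}"


ROWS = [("a", 1), ("b", 2)]

assert report_v0(ROWS) == report_v1(ROWS)         # step 1 changed nothing
assert report_v1(ROWS, format_csv) == "a,1\nb,2"  # step 2 adds Y
```

        Keeping the steps apart means that if the first assertion fails, you know the restructuring broke X before any new feature code was involved.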
      • Code review combined with refactoring is, however, a great way to improve code in your project that was written by outside contractors, or programmers with less ability.
      • "drastic addition" "total rewrite". Those phrases are definitely not in my PHB's vocabulary. With great effort I've actually gotten him to hear "take the time to do it right" once or twice...

        Programmers (one would think) are supposed to know the code and the product better than anyone else. So, sometimes when they think "this code sucks" and a piece of the system ought to be reworked they're exactly right. Then again, sometimes not. So, should refactoring occur in any given situation? Maybe.

        Like the Mythical Man Month said: "There is no silver bullet."
      • "Constantly reworking code that does the very same thing that it did before is often a pointless exercise - by time you have the internals "clean", new requirements demand a drastic addition or total rewrite anyway."

        As long as the total rewrite also satisfies the old requirements (unit tests), a total rewrite is a recommended form of refactoring.

        "Dwelling upon one particular snapshot of requirements presumes that your code will have a long active lifespan, which it often doesn't. In my own case, my code has lived on in productive use for about four years on average after it has been written (after which it is often junked entirely), with a creation time often near a year."

        Dwelling upon one particular snapshot of requirements presumes that *some* of your old requirements will live on in the next iteration or in the next application. If the old requirements will not be used in the next application/iteration, or for that matter, if you're sure there won't be any new requirements, then there is no need to use refactoring.

      • "Constantly reworking code that does the very same thing that it did before is often a pointless exercise - by time you have the internals "clean", new requirements demand a drastic addition or total rewrite anyway. "

        One last point:
        Constantly reworking code is not about reusing "clean" code, it is about increasing your understanding of the code. Sometimes, the same piece of functionality that originally took you three months to write can be rewritten from scratch in just a few days.

      • by Big Sean O (317186)
        Martin Fowler is quite pragmatic when he's talking about refactoring. First of all, he doesn't think you should refactor for refactoring's sake.

        The best piece of advice I've read of his is "Three strikes and you refactor". In other words, if you duplicate some code, you hold your nose and live with it. But if you're putting in similar code three times, then you're better off refactoring out the duplication.
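        A tiny Python sketch of the third strike, with made-up field names. Before the extraction, the same strip-and-lowercase expression would have been written out once per field.

```python
# Before (hypothetical): the same cleanup written out three times.
#   username = username.strip().lower()
#   email = email.strip().lower()
#   city = city.strip().lower()


def normalise(text):
    """Extracted on the third strike: trim and lowercase one field."""
    return text.strip().lower()


def register(username, email, city):
    # After: each field goes through the single shared helper.
    return {
        "username": normalise(username),
        "email": normalise(email),
        "city": normalise(city),
    }
```

        If the cleanup rule changes later, there is now one place to edit instead of three.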

        The other piece of advice I keep with me when I code is (I'm paraphrasing here) "If you need to add feature A, and your code makes it difficult to add feature A, refactor the code until feature A is trivial to add". A well-factored program makes it easier to add new requirements and often prevents a total rewrite.

        I don't think much of the "code for a month, refactor for a month" and I don't think Fowler does either. Most everything I've read is "code for 5 minutes, refactor for 5 minutes".
    • Unfortunately most (thankfully not all) people in management and project management only think about dates.

      Most would rather release a product with a lot of defects rather than missing a date. Granted that missing a date is a pretty bad occurrence too. I would rather release a product with a lot of defects than miss a "marketed" date, but I will miss it for an "internal" date.

      A marketed date is a date that the marketing reps (and other forces that be who usually have no clue in programming) blurt out to the client and/or the public, and it is a date that must be kept for fear of reprisals and contract disputes.

      An internal date is basically an internal milestone, e.g. development stops on date A so testing can start on date A.

      I am willing to forgo the internal date A to reach internal date A+n, where n is usually a smallish number that would most likely not affect the marketing date, if it will allow us to create a product with fewer defects.

      Defects reported by test teams usually take longer to fix than spending some extra time in development to prevent it in the first place.

      Unfortunately most of these people that have the authority would rather announce that they released their component on-time at the expense of other groups that depend on their component. But then that's corporate politics and the larger the company/project the more politics you would have to deal with.

      That's how most of them were brought up and unless the next generation learns from their mistakes the cycle will go on.
  • This cannot POSSIBLY be interesting "Stuff that matters". It's only the first time it's been posted on Slashdot. :-)
  • Telling Point! (Score:4, Insightful)

    by Cap'n Canuck (622106) on Thursday November 28, 2002 @05:57PM (#4776976)
    I read the first part of the interview, and there was a point that Bill made that struck me as very profound. He said that refactoring can cause severe differences between a stream that has been refactored and one that has not. I think that there has to be a limit on refactoring, especially once a code base gets beyond a certain number of iterations (releases). For a Configuration Management person, or for CM software, refactoring can quickly turn into a nightmare.

    Just my $.02
    • allow me to retort! (Score:3, Informative)

      by tolan-b (230077)
      i think you're missing the point here, which is that good refactoring shouldn't affect surrounding code, it's typically about fairly small changes and simplifications.

      obviously there will always be some effect, but a proper well written test suite (primarily unit tests, but also higher level tests) should catch the vast majority of cascade effects.

      program to the interface and altering the internals shouldn't matter.
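      a rough python sketch of what i mean, with made-up class names: one contract check is written against the interface only, and it runs unchanged against the old internals and the refactored ones.

```python
from typing import Protocol


class Counter(Protocol):
    """The interface callers depend on (hypothetical example)."""

    def add(self, word: str) -> None: ...
    def count(self, word: str) -> int: ...


class ListCounter:
    # Original internals: keep every word and scan the list on demand.
    def __init__(self) -> None:
        self._words: list[str] = []

    def add(self, word: str) -> None:
        self._words.append(word)

    def count(self, word: str) -> int:
        return self._words.count(word)


class DictCounter:
    # Refactored internals: keep running tallies instead.
    def __init__(self) -> None:
        self._tally: dict[str, int] = {}

    def add(self, word: str) -> None:
        self._tally[word] = self._tally.get(word, 0) + 1

    def count(self, word: str) -> int:
        return self._tally.get(word, 0)


def check_contract(counter: Counter) -> None:
    # The test only uses the interface, so it works for both versions.
    counter.add("a")
    counter.add("b")
    counter.add("a")
    assert counter.count("a") == 2
    assert counter.count("missing") == 0


check_contract(ListCounter())
check_contract(DictCounter())
```

      if the second call fails, the refactoring leaked through the interface and the test suite has caught the cascade.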

      oh dear i'm beginning to sound like one of those evangelists!
      • No. This is a good point. If you work in an environment where there are many different active streams, then trying to merge these types of changes could be quite difficult.

        Often, however, once you get back to one live development stream, that is where it might make sense to do some refactoring.

  • by ssclift (97988) on Thursday November 28, 2002 @06:04PM (#4776989)

    What bothers me, is that the process being plugged here doesn't address methods for keeping all of the expressions of the program consistent. Tests are a (limited) alternate expression of the program's functionality that are trivial to compare to the original. Yes, they need to be good. What about documentation (which I define separately from annotated indexing, such as doxygen or Javadoc produce), which is another, separate expression (usually not in an executable language) of the program?

    Until these authors address how to keep the entire structure from docs down to code consistent during a refactoring I think they are missing an important point. I've pointed this way before on Slashdot, but the only good way I've found for ensuring that non-executable forms of the program are consistent with executable forms, is formal Software Inspection (see Gilb's stuff at Result Planning [result-planning.com] ). I've found that the more versions of the code there are (requirements, user docs, code itself, tests etc.) that are kept truly consistent, the more likely it is you will not make mistakes in the final expression (the code you ship).

    The refactoring process can be even less "like walking a tightrope without a net"; your net is a web built out of the relationship between all of these documents, not just the code and tests!

    • a good point, although it needn't necessarily be quite that bad.
      a friend of mine who works for a major financial institution in london was telling me the other day that their entire documentation (from code annotation, through javadoc/doxygen style documentation, and surprisingly even the final _user_ documentation) is all done from within the code.

      i was quite surprised to hear this, but he insists it works well, and they are working on _huge_ projects.
    • Why do you have multiple, non-executable expressions of a program in the first place? A large part of refactoring is to make the program text itself more expressive, make it very apparent to the reader what the design is all about. If you can do that, what's the use of having separate commentary?

      John Roth
    • by Anonymous Brave Guy (457657) on Thursday November 28, 2002 @11:14PM (#4778002)
      I've found that the more versions of the code there are (requirements, user docs, code itself, tests etc.) that are kept truly consistent, the more likely it is you will not make mistakes in the final expression (the code you ship).

      I've found that the fewer versions of the code there are at all, the fewer cock-ups happen.

      Most documentation that gets printed out is a waste of time. Make the damn code its own documentation, at least as far as implementation details and such. Keep the printed stuff to be higher level overviews of the design -- the big picture -- and for describing the key algorithms and concepts. This way, others on the team or new joiners have somewhere to start and a frame of reference when they check the details (which are described by comments in the code, or indeed the code itself) but you don't waste massive amounts of time writing a silly document that just describes what every method of every class does. We have tools to do that for us; they're called comments. :-)

      (By the way, notice that simple refactoring rarely changes any of these higher-level overviews. If you're doing something significant enough that it does, it probably justifies a quick review of the overall design and a suitable update to the document anyway.)

      Aside from higher level design docs and feature specs, about the only non-code documents that are often justified are requirements specs and acceptance test specs, which should usually be written in the language of your application domain and not in software terms anyway, and which should be completely independent of the code. On some projects, other forms of testing might usefully be documented as a kind of checklist, but often automated unit tests remove most of the need for this.

      So, there you go: I have exactly one version of the code, and it's the code. There might be high-level summaries of features, or some details of the algorithms the code implements, documented as well, but they are somewhat independent. Everything else is for the customer, and totally independent of the code. If you've never tried working this way, and come from a "heavyweight process" environment, try it some time. I'd bet good money you'd prefer it. :-)

      • Good points, I'm not talking about creating non-executable stuff for the sake of it, just where there is a genuine need for another expression of the system, keep it consistent.

        For example, the big-big-boss might say "We need a more efficient and faster transaction handling system." and might not say anything more. That expression of the "system requirements" is in "big-big-boss" language and must be executed by his staff. For a large system that means inevitable intermediate expressions of the code. I think the problem lies in how one keeps them consistent without increasing the workload, i.e. by making that consistency almost automatic.

        My comment was intended to imply not that there should be many redundant code expressions, just that where there are any, and they are truly required, keeping them consistent catches a lot of problems.

        I've heard your type of objection many times, usually by people who have been burdened by a heavyweight process that isn't delivering savings. It's a shame that poor process has given so many people a nasty taste in their mouth for efforts in that direction; I saw that a lot where I worked. I also saw a project that rejected my advice slip 3 years over on their 2 year schedule, swearing every quarter that they were only 6 months from release.

        My team of 2 cracked out a mathematical analysis system with 56K operating lines of low-redundancy analysis code, 40K lines of support code with about 80% of that in support documentation, fully tested, cross referenced etc. in 8 months. It ran 2 years in production and never failed. I credit low-effort design techniques that maintained consistency to being able to do that.

    • One way they've addressed it is in at least one of the principles of agile modeling [agilemodeling.com]: travel light. I quote:

      Every artifact that you create, and then decide to keep, will need to be maintained over time. [...] Every time you decide to keep a model you trade-off agility for the convenience of having that information available to your team in an abstract manner (hence potentially enhancing communication within your team as well as with project stakeholders). Never underestimate the seriousness of this trade-off. [...] a development team that decides to develop and maintain a detailed requirements document, a detailed collection of analysis models, a detailed collection of architectural models, and a detailed collection of design models will quickly discover they are spending the majority of their time updating documents instead of writing source code.

      Also check some of the practices [agilemodeling.com], particularly "Discard Temporary Models" and "Update Only When It Hurts". In brief they suggest that you learn to throw things away when you've finished with them, minimise the number of models that you keep, and don't fret about making sure all your models are consistent just for the sake of it.
    • Oddly enough, he's not missing anything. You're just missing how he addresses that.

      The critical point he makes is /avoid duplication/. Don't say the same thing twice -- or rather, if you've said the same thing three times, write a tool to automate the thing so that you only say it once, and the tool generates the three copies.
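      As a sketch of that idea (the option table and function names are invented for illustration, and this is only one way to do it), here is a Python example where a single definition generates the defaults, the help text, and a documentation table, so nothing is said twice by hand:

```python
# Single source of truth: (name, default, description). Hypothetical options.
OPTIONS = [
    ("verbose", False, "Print extra output"),
    ("retries", 3, "Number of attempts before giving up"),
]


def defaults():
    """Copy 1: the default settings used by the program."""
    return {name: default for name, default, _doc in OPTIONS}


def help_text():
    """Copy 2: the command-line help shown to users."""
    return "\n".join(
        f"--{name}: {doc} (default: {default!r})"
        for name, default, doc in OPTIONS
    )


def doc_table():
    """Copy 3: a plain-text table for the manual."""
    rows = ["option | default | description"]
    rows += [f"{name} | {default!r} | {doc}" for name, default, doc in OPTIONS]
    return "\n".join(rows)
```

      Adding or renaming an option now means editing one line of OPTIONS, and all three outputs stay consistent without anyone having to inspect them against each other.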

      Javadoc gives you a start at documentation, but as you say it's just an index. So, use it to generate your code index -- that's one less thing to duplicate.

      Try to find ways to say things without repeating yourself. Your unit tests, for example, will express your detailed design; so don't write detailed design, just write unit tests (and write them at the time you'd write your detailed design).

      Massive reviews of multiple views of the source will also work. I work at a company which uses that. I'd say that one crucial requirement is that the multiple versions not be simple duplications of each other, but rather each say something truly unique. In other words, your multiple views of the document really aren't just different views, but are actually parts of the program. Your user docs aren't just rehashing the design manual in different language; they're actually the part of the design manual which specifies how the user will see the program. The programmers and customers write a prototype of a section of the user docs before they agree to add a user-visible feature; it's called a 'use case'. Tech writers may later elaborate on it, and automatic processors may add screenshots to it, but in essence, the User Stories become a critical part of the user documentation.

      I'm talking too much. I'm also being needlessly specific. You don't have to do anything specific that I've said, but take my point: in most systems, redundancy is a problem, not a solution. In some systems redundancy is useful; but even in those I would suggest that the redundancy appear as forks of the design at design time, not at multiple levels of the software process (one version in user documentation, one version in design, another version in the code, etc...). What you have if you allow that redundancy is simple chaos -- you have no solid way to prefer one version of the software to another, you just have to choose one and hope you got it right.

      -Billy
      • I'm glad this comment has generated some debate!

        I don't think we're actually off that much on our opinion. Maybe I expressed the point "I've found that the more versions of the code there are..." a little unfortunately. I agree fully about keeping expressions such that they have a unique point to make, that content has to win over form (as noted in that Agile Modelling site).

        I'm just saying that where content is not directly testable, inspection fills that consistency checking gap. But you seem to agree with me there... :-)

  • by Ars-Fartsica (166957) on Thursday November 28, 2002 @06:21PM (#4777044)
    Most refactoring advocates seem to make two key assumptions:

    Your code will be needed for a long time

    Your requirements will not change drastically

    Even if they don't state these explicitly, they are assumed in the premise of refactoring. If code did not have a long productive life ahead of it, you wouldn't waste your time continually replumbing it. If requirements did in fact change rapidly, there would not be time to revisit completed working code.

    In my experience most code lives for about four years and then dies. During that time the requirements often shift by at least 25%. Given these observations, refactoring appears to be a waste of time.

    • by Dog and Pony (521538) on Thursday November 28, 2002 @06:57PM (#4777168)
      That is why refactoring by itself is no silver bullet, and no one is saying that it is. At least, no one that has any insight. It is, however, a great tool to meet the changed requirements.

      You have to Embrace Change [c2.com], and in that refactoring will really, really help you a lot.

      If requirements change so much it is not the same program anymore, well, then I'd not say that requirements have changed in the project. It is a whole new project, right? And then, of course, no rewrite will help you.

      But if they change a lot within the same functionality, you use refactoring to get to the new goal without breaking anything, and because you have been refactoring out good interfaces as you went, the changes are easier to implement. You do not code for two months, then refactor for two, then code, etc. You do both all the time.

      I think you are missing the point here.
    • by Anonymous Coward
      Your requirements will not change drastically

      You are wrong with this assumption. Refactoring (like other XP practices) is for software whose requirements change constantly.

      Your code will be needed for a long time

      You don't need the code for a long time because if you never refactor your design (not only your code), the system, with a lot of patches, becomes a Big Ball Of Mud [laputan.org], and then you need to throw away your code [laputan.org].

      The point isn't only about "Refactoring"; it is about evolutionary code (refactoring is only a small technique to allow future evolution in parts of the code, cleaning the code to allow more drastic design changes).

      The best example about software evolution is Smalltalk (note that the book "Refactoring" is based on ideas taken from Kent Beck's book "Smalltalk Best Practice Patterns").

      The base code of Smalltalk comes from the 70's. And the recent versions of commercial Smalltalks (with support for all the buzz tech like XML, WebServices..) still use a lot of objects that evolved from the original implementation.

      (offtopic - Talking about Smalltalk, I don't know why no company has made an object-based OS, based on the concepts of Smalltalk; for example Squeak [squeak.org] is a good example of a Smalltalk Environment -or Object Environment- that can be used as a complete OS desktop).
    • In my experience most code lives for about four years and then dies. During that time the requirements often shift by at least 25%. Given these observations, refactoring appears to be a waste of time.

      Yes, requirements most certainly do change. And refactoring can be a waste of time. And as a matter of fact, Martin Fowler himself may agree with you. To an extent, anyway.

      From the first section:

      Martin Fowler: Refactoring improves the design. What is the business case of good design? To me, it's that you can make changes to the software more easily in the future. Refactoring is about saying, "Let's restructure this system in order to make it easier to change it." The corollary is that it's pointless to refactor a system you will never change, because you'll never get a payback. But if you will be changing the system-either to fix bugs or add features-keeping the system well factored or making it better factored will give you a payback as you make those changes.

      So Mr. Fowler would say that it is a waste of time to refactor code if it will not be used in the future. However, refactoring doesn't ignore changes. In fact, according to Fowler, the opposite is true. I understand that refactoring makes code more maintainable and understandable, and that is what he seems to be saying. And considering that there are many software systems that use "legacy code", refactoring is certainly not always a waste of time. Refactoring is exactly what such systems need.

    • by alder (31602)
      From the book:
      Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure
      This process is not a rigid set of procedures one must perform in order to make a refactoring happen. A refactoring could be as simple as "introduce explaining variable", "extract method" or even "rename method", each of which takes less than a minute to complete. While they are simple and easy to complete, especially if your IDE/editor is equipped with refactoring support/tools, they provide tremendous value - the code becomes cleaner and more maintainable.
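      To make the size of such a step concrete, here is a rough sketch of "introduce explaining variable" followed by "extract method". The pricing rule and every name in it are made up for illustration; the only point is that the result is unchanged while the intent becomes readable.

```python
# Before: one expression hides three separate ideas.
def price_before(quantity, item_price):
    return (quantity * item_price
            - max(0, quantity - 500) * item_price * 0.05
            + min(quantity * item_price * 0.1, 100.0))


# After "introduce explaining variable": same arithmetic, named parts.
def price_after(quantity, item_price):
    base_price = quantity * item_price
    quantity_discount = max(0, quantity - 500) * item_price * 0.05
    shipping = min(base_price * 0.1, 100.0)
    return base_price - quantity_discount + shipping


# After "extract method": the named parts become small reusable functions.
def quantity_discount(quantity, item_price):
    return max(0, quantity - 500) * item_price * 0.05


def shipping(base_price):
    return min(base_price * 0.1, 100.0)


def price_extracted(quantity, item_price):
    base_price = quantity * item_price
    return base_price - quantity_discount(quantity, item_price) + shipping(base_price)
```

      All three versions return the same number for the same input, which is exactly the "does not alter the external behavior" part of the definition.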

      I have had to perform those tiny refactorings a zillion times on my own and other people's code to better express its purpose in the code itself. Sometimes without preliminary refactoring it was simply impossible to understand someone's code at all (one prominent example - a class had a method that consumed 12 Letter pages in 8pt Courier). It was my experience that most of the time a refactoring never takes more than 10 min. to complete. Then you switch your activity to coding, and only later may want to do some more refactoring. You may not have noticed this, but if you are a good software engineer you do refactor your code even if you do it subconsciously and think that you don't!

      Refactoring, like XP, is not a religion and does not attempt to be "all or nothing" for you. Use as little (or as much) from them as you find convenient. Spend as little (or as much) time on a particular activity as you find useful. Most importantly - use what you find useful and keep open minds for other parts as you may find them useful in the future. After all refactoring stems from the same roots as design patterns - "a core of the solution to the problem which occurs over and over again in our environment". They just provide value to different domains - design and implementation.

    • Often the code becomes obsolete because it got refactored out.
    • It's not a waste of time to the guy that got hired to update the project after all of the original developers quit.

      If a project has been properly refactored in step with requirement changes, odds are it will be easier to read and understand than it was when it was first deployed.
  • by Junks Jerzey (54586) on Thursday November 28, 2002 @06:24PM (#4777053)
    Pick up any Forth book from the early 1980s and a major theme is that of properly factoring your code. I was always surprised that no one ever picked up on the term--until the last several years, that is.
  • Heavy reading (Score:2, Offtopic)

    by jki (624756)
    I knew people consider his books, but could not have imagined this heavy

    Read More... | 15 of 20 comments | Developers

    That must be the lowest amount of comments to a frontpage story for a long time. Anyway, I think the UML distilled is one of the best organized books I have read. Works for a novice as well as someone who only knows something. Not my favourite though, but a piece of art in clarity :)

    • someone who only knows something

      s/only/already/ - sorry.

    • That must be the lowest amount of comments to a frontpage story for a long time.

      Well, for a change, this is a frontpage story that you can hardly comment on without actually reading the linked article first. Also something that hasn't happened in a long time...
  • by Anonymous Coward on Thursday November 28, 2002 @06:34PM (#4777084)
    I thought you meant that punk kid from Eastenders.
  • Automated Testing (Score:4, Insightful)

    by DoctorPhish (626559) on Thursday November 28, 2002 @06:38PM (#4777105) Homepage
    I agree with the author on the subject of writing a comprehensive test suite as you code, but I've found that in applications that need to process a significant variety of real-world data in large volumes, your mock-data will be far from sufficient.
    Often, the only real way to get good data for your tests is to have the software used in the field, and then use some of the more complex cases to test against. Corner cases are also a problem, esp. if you are relying on your tests to be comprehensive, and verify the correctness of your code. Tests are certainly valuable, but are by no stretch infallible. I've found that you don't get any really useful data until around the second revision of the code (assuming fairly wide use and access to end-user data). Sure, running tests against some custom widget might be pretty reliable, but once you run up against stuff that is inherently hairy, you need data that is representative of real-world usage patterns before you can be sure that changes you make won't break it out in the field.
    • in the case of OO code, surely if the code is well architected and implemented (high cohesion, loose coupling) then you should be able to test small parts independently?

      many small tests, written to define the interface. if you can define the interface you can write the test first, if you can't define the interface clearly before user testing then you're going to be in trouble anyway.
      • Small parts can indeed be tested independently of one another. It's when you need to tie a large amount of data together to come up with one or more results that things tend to come apart. Under loads/values/configurations/complexity that you didn't model in your test data, results are uncertain by definition. It's less a matter of poorly defined interfaces, and more one of a highly complex domain. I'm just warning against treating your test functions as an absolute measure of code correctness. Code that breaks a large number of real-world cases that were never considered when the data was designed, may test as good. Quality of data is hard to certify, except through actual use.
    • One way to try to improve this is to do some coverage analysis of your test suite. Basically you profile the code when you run the test suite and see how many basic blocks (chunks of code with only one entry point and exit point) you're hitting. You'd be surprised, but often a test suite that looks good will only hit 50% of the basic blocks in the code. The bugs that are found in the field will normally be in the basic blocks that weren't touched by the test suite. With a coverage analysis report you can try to write new tests that hit basic blocks that you haven't got already.

      Your comment about corner cases hints at this: a corner case will usually come down to a path through the code that hits a basic block that hasn't been tested before.
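      A toy illustration of the idea, using line coverage as a stand-in for basic-block coverage. It relies on Python's standard `sys.settrace` hook; `shipping_cost` and `lines_executed` are hypothetical names, and a real project would use a proper coverage tool rather than this sketch.

```python
import sys


def shipping_cost(weight_kg, express):
    """Hypothetical function under test."""
    cost = 5.0 + 1.5 * weight_kg
    if express:
        cost = cost * 2  # only reached when express is True
    return cost


def lines_executed(func, *args):
    """Run func(*args) and return the set of source line numbers it executed."""
    hit = set()

    def tracer(frame, event, arg):
        # Record 'line' events that belong to the function being measured.
        if event == "line" and frame.f_code is func.__code__:
            hit.add(frame.f_lineno)
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return hit


# A "test suite" that only ever ships non-express parcels...
covered = lines_executed(shipping_cost, 2, False)
# ...compared with what a run through the express branch touches.
everything = lines_executed(shipping_cost, 2, True)
untested = everything - covered
print(f"{len(untested)} line(s) never exercised by the non-express tests")
```

      The non-express tests can all pass while the doubling line is never run, which is the kind of untouched block where a field bug would hide.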
  • by Shackleford (623553) on Thursday November 28, 2002 @07:12PM (#4777234) Journal
    It's quite interesting to see an interview with Martin Fowler just shortly after attending a lecture in a software engineering course in which maintenance was discussed. The lecturer, in his discussion of software maintenance, compared software maintenance to other forms of maintenance. Usually, when people speak of maintenance, it is simply the act of ensuring that something is working as intended. In the context of software, however, when maintenance is done, so much about the software is changed that "maintenance" would be an inaccurate term. When I hear about refactoring, however, I think that "maintenance" could be a more accurate term for it, simply because it does not change what the system does, but improves the way in which it is built.

    Just as in any other case in which maintenance must be done, it is quite important that this maintenance be done. It may not change the functionality of the code, but it can help make the software more easily adaptable. It can also help developers understand their own code, view it differently, and find different ways of implementing their systems. It may be more popular with Dilberts than PHBs, but perhaps those in the latter category should understand that even small amounts of refactoring can help save much time later on.

    This [amazon.com] is one of my favourite books on programming/software engineering and one of the many topics it covers is refactoring. I'd say that it does a good job arguing the importance of refactoring and how to convince those PHB types to accept it. But if you're just interested in refactoring itself, I suppose that this [amazon.com] one is the best reference on the topic. I must say that for quite a few reasons, refactoring is something that should not simply be considered just another trend/buzzword, but an important part of maintenance, which in turn is an important part of the software development life cycle.

  • ...he's been missing from Eastenders for ages.
  • Q:- Why does MF find refactoring / works?

    A:- The clue was in the TEST bit

    How do you know your test TEST the code? esp. if you write them?

    A test could be:-

    Syntax, picked up by a good compiler

    logic, picked up by gigo test data

    response, getting someone to use your CORBA/DCOM/JE++/RPC piece of crap and ask why the new computer is sooo much slower than the old one, hey but nice graphic of a pussy cat licking its arse! I repeat TEST!!!

  • the boys at Microsoft should take a lesson from Fowler. Martin Fowler XP (extreme programming), Microsoft Windows XP (extreme prejudice).
  • by _wintermute (90672) on Thursday November 28, 2002 @10:16PM (#4777837) Homepage
    ~sigh~

    Some of you people simply have NO idea how code works in the real world, i am sure of it. Hacking perl scripts is so unlike developing the large OO software that drives most information systems.

    One of the fundamental issues with software architecture is that more often than not architecture is emergent. 'build one to throw away' is an old old adage (I believe it was Brooks who originally declared this) and neatly summarises the key problem with developing software architecture.

    "Even when one has the time and inclination to take architectural concerns into account, one's experience, or lack thereof, with the domain can limit the degree of architectural sophistication that can be brought to a system, particularly early in its evolution." From the Big Ball of Mud (link below).

    We design, we develop, we learn, and then we design again ... the secondary design phase can be called refactoring. There are a number of refactoring patterns (I recommend the 'Refactoring' book) and some of the coolest Java IDEs support refactoring (check IDEA and Eclipse) - you can do things like move methods up the object hierarchy into base/parent/super classes, extract methods from bodies of code, change names, merge classes, etc. These features let the savvy developer leverage the emergent aspects of design. Driven by time/cost/deadlines, we often do the thing that works rather than the thing that works elegantly. Refactoring lets us recapture some of the elegance of our code. Coupled with test-first methods, we have an incredibly powerful system.
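    For readers who have not seen it, "move methods up the object hierarchy" looks roughly like the sketch below. The class names are invented for the example, and it is written by hand here; an IDE would perform the same move automatically.

```python
# Before: two classes carry an identical method.
class SalesmanBefore:
    def __init__(self, name):
        self.name = name

    def badge(self):
        return f"Employee: {self.name}"


class EngineerBefore:
    def __init__(self, name):
        self.name = name

    def badge(self):
        return f"Employee: {self.name}"


# After pulling the method up: one copy lives in the parent class.
class Employee:
    def __init__(self, name):
        self.name = name

    def badge(self):
        return f"Employee: {self.name}"


class Salesman(Employee):
    pass


class Engineer(Employee):
    pass
```

    Callers see the same `badge()` result before and after; the duplicate is simply gone, so the next change to the badge format happens in one place.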

    Pretty much ALL modern software lifecycle models are iterative, simply because up front design does not work. The waterfall model is a naive concept of a bygone era.

    Refactoring is therefore a crucial aspect of an efficient design process. Typically, I would suggest that refactoring occurs at the end or beginning of each iteration ... our refactoring is informed by the evolution of the software - we don't refactor for fun, we clean up code SPECIFICALLY to target aspects of the product we know will change or

    To see refactoring in action, join an Open Source project. Most OS teams that I have witnessed in action will employ refactoring of some description, even if they don't call it that. It makes a great deal of sense in OS, because we have large, distributed teams working on software, refactoring helps consolidate changes by disparate team members.

    further reading: http://www.laputan.org/mud/
    • ~sigh~
      Some of you people simply have NO idea how code works in the real world, i am sure of it.
      Which ones, i'd like to ask?
      I do have an idea. Even couple of them :o)

      Hacking perl scripts is so unlike developing the large OO software that drives most information systems.
      The word 'most' used is inadequate!
      For two reasons:
      1. If You meant most by quantity then You couldn't be more wrong.
      2. If You meant most by significance then again You can't prove Your evaluation.
      Nah! I'm sure You have made or participated in at least one really big project. And thus You surely know that refactoring interfaces is a death sentence for the whole project. Hence we say "NO!" to interface refactoring. And we change interfaces only at the points of major architecture redesigns. Do You agree?
      We of course may pee with a boiling liquid each time we mended some improperly realized method. Or hit our chest in proud when we "refactor" wait-loop.

      Really, I'd like to know: does anybody here think that poorly designed architecture, module responsibility and interaction interfaces could be mended by stitches and patches? I don't think so...

    • Having some professional experience too, i'd like to point out a few things:
      • if the code works correctly, what's the point of refactoring ? You'll need to test _again_ the code, refix bugs, and so on
      • ok, maybe you can split code somewhere, or make a common procedure / function / class out of existing things, to reduce code complexity. But usually you do that because you need to (to implement new things, mainly)
      • from the management's point of view, refactoring is often seen as wasting time. After all what's the point of reviewing code when you could be coding other stuff, like new functionalities ?
      • usually, when you wanna fix a bug, you go for the quick & dirty way, because it's faster... why spend a week rewriting / testing when you can fix in 10mins ? Especially if you have bad time constraints

      I don't mean refactoring is bad, after all i think i do enough while coding, even if it's more adding features / correcting bugs that drives the refactoring process.
      I like 'nice' / 'elegant' (very subjective thing :-)) code, true... But usually if a part works correctly, no point in rewriting it.

      On the other hand, sometimes you have a few mins, start browsing the code, add a few comments on things so people can understand better... and decide to simplify the code because it's a pain to understand !!

      I think the bottom line is that the best way is to document while coding, so you (or others) can easily trace back why you are doing something, and why you are doing it that way instead of another. The mere fact of documenting clearly sometimes points out weird things you are doing, maybe soon enough to make you stop and think of a better way to implement them.
    • Of course open source software is amenable to heavy refactoring - there is no opportunity cost. The developers would not be working on otherwise productive code because they are often tied for some reason or another to one project.

      In a commercial setting, there are always new projects that need work, so it is almost always a waste of time to replumb code that won't be used in three years anyway.

  • circular? I'm not dumping on refactoring, as improving your code is never a bad thing. But some of these statements, especially since Fowler seems like an OO advocate, contradict each other.
    "Refactoring is making changes to a body of code in order to improve its internal structure, without changing its external behavior."
    "But if you will be changing the system--either to fix bugs or add features--keeping the system well factored or making it better factored will give you a payback as you make those changes."

    So if your classes don't change externally, there should be little merit to this argument. You can extend them just as well as before as you can after, performance considerations aside.

    For that reason I think there is a problem with this definition of refactoring, at least in the scope of OO and functional (and any others where you can wrap glue around code) systems. "Refactoring" is necessary in poorly written (ie - code that needs "refactoring", hehe) OO and functional code because it will be buggy, broken and slow (even MS gets around to rewriting their code). It can't be avoided in procedural code because it will sooner or later become impossible to maintain. This isn't as optional a process as he implies, and is not a newfangled invention.

    • So if your classes don't change externally, there should be little merit to this argument. You can extend them just as well as before as you can after, performance considerations aside.

      I think you confuse system external behaviour with class external behaviour. When you refactor you might very well change class external structure, but you don't change system external behaviour (that is add features).

      Then once the system is well factored it is easier to make changes that change the system external behaviour.
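      A small sketch of that distinction, with invented names and a made-up 20% tax rule. The class-level interface changes between the two versions (`total_with_tax()` disappears, `total(policy)` appears), yet the system-level result is the same.

```python
# Before: the tax rule is buried inside the order class.
class OrderV1:
    def __init__(self, cents):
        self.cents = cents

    def total_with_tax(self):
        return self.cents + self.cents * 20 // 100


def report_v1(amounts):
    """System-level behaviour: grand total in cents."""
    return sum(OrderV1(a).total_with_tax() for a in amounts)


# After: the class interface has changed, the system output has not.
class TaxPolicy:
    def __init__(self, percent):
        self.percent = percent

    def tax_on(self, cents):
        return cents * self.percent // 100


class OrderV2:
    def __init__(self, cents):
        self.cents = cents

    def total(self, policy):
        return self.cents + policy.tax_on(self.cents)


def report_v2(amounts):
    policy = TaxPolicy(20)
    return sum(OrderV2(a).total(policy) for a in amounts)
```

      Once the rule sits in `TaxPolicy`, adding a second tax rate is a feature change made in one small class, which is the payback being described.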
  • From the trenches... (Score:4, Interesting)

    by cheezehead (167366) on Friday November 29, 2002 @02:26AM (#4778677)
    Ok, a true story.

    Years ago, I worked on a big software project for the aerospace industry. This was in the early 90s, and I am proud to say we were a bit ahead of our time. Iterative design, unit tests automatically running overnight, peer reviews, refactoring (although we called it "rewriting"), collective code ownership, we did it all.

    After extensive integration testing (and finding and fixing many bugs), we installed the system at the customer site. After running for two weeks continuously, the system froze. Fortunately, the operating system was kind enough to report that it had run out of memory.

    The cause of the problem was obvious: we had a memory leak somewhere. We had never run our code through a memory leak detection tool. The reason for this was that management did not want to spend money on something like this (...). Fortunately, we happened to have an evaluation copy of such a tool when the problem was detected. Installing the tool took 20 minutes, finding the leak took 2 minutes. It also found a memory leak in one of the OS libraries, but that was a one time leak. Our problem leak was only 8 bytes in size, but since the leaking routine was called several times per second, the system ran out of memory eventually.

    Anyway, the leak was all my fault, and fixing it took about 20 seconds. Rebuild, and everybody was happy again.

    So, what did we do wrong?

    1. We should have had better checks for memory leaks in the first place. So, blame management...

    2. We should have tested for a longer period than we did. Our integration tests included stress tests, but we never ran tests for more than 24 hours or so without rebooting (rebooting obviously hid the memory problem). Running for two weeks would have revealed the problem, but that doesn't always cure everything (read on for part two...)

    Two years later. I had moved to another country, and was doing consulting work at a customer site. I got a phone call. The system had frozen again, but this time not after 2 weeks, but after 6 months running continuously. I investigated the problem, and after lots of debugging I isolated the problem to a (vendor supplied) device driver. Code inspection revealed that this driver incremented an internal counter every time it was called. This counter was a signed 32-bit integer. So, after 6 months of calling it several times per second, the counter rolled over, and the driver stopped working. Of course, a reboot fixed it, and the system was good to go for another 6 months. I'm not sure if this driver was ever fixed. You could very well argue that it's not worth the effort: just reboot once every 6 months, and you're fine.
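    The rollover itself is easy to reproduce on paper. The sketch below assumes the counter wraps in two's complement and goes up by one per call; the call rate of 100 per second is purely illustrative, since the only figure given above is "several times per second".

```python
INT32_MAX = 2**31 - 1
SECONDS_PER_DAY = 86_400


def increment_int32(value):
    """Add 1 the way a wrapping signed 32-bit counter would (assumed behaviour)."""
    return (value + 1 + 2**31) % 2**32 - 2**31


def days_until_wrap(calls_per_second):
    """Days of continuous running before a counter starting at 0 passes INT32_MAX."""
    return INT32_MAX / calls_per_second / SECONDS_PER_DAY


print(increment_int32(INT32_MAX))   # -2147483648: the counter goes negative
print(round(days_until_wrap(100)))  # 249 days at the assumed rate
```

    A test run of a day or a week never gets anywhere near the wrap, which is why only long uptime in the field exposes this class of bug.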

    What is the point of all this? Well, a lengthy testing period would have revealed our first bug. However, to find the second one, we would have had to do a stress test for more than 6 months. No matter how enlightened management or your customer are, they'll never agree to something like that in real life. Besides, there is a known defect in this system that will manifest itself in 2036 or so...
    Also, where should you stop testing? We trusted the 3rd party driver to be correct, and in this case we were wrong to do so. I see no practical solution for this, though.

    Lesson: you can do (almost) everything right, use proper methodologies, test early and often, have a team of brilliant people, etc., and still you can have problems.

    Ironically, the fact that we had stable software running on a robust operating system (UNIX), caused the bugs to manifest themselves. Had we been writing unstable software or running on Windows, reboots would have happened more than once every two weeks, and neither bug would have shown up...

    • Lesson: you can do (almost) everything right, use proper methodologies, test early and often, have a team of brilliant people, etc., and still you can have problems.

      Sad, but alas true.

      I don't think you give yourselves enough credit, though. OK, the memory leak was careless on someone's part, both in the fact that your techniques/tools allowed it in the first place, and in the fact that you didn't run a decent checker on the finished product. But if that's the worst you found in the first few months in an app on the scale I imagine you were dealing with, you did remarkably well, and I'd say your methods were a great success.

  • Design Patterns (Score:5, Interesting)

    by jtdubs (61885) on Friday November 29, 2002 @07:13AM (#4779189)
    Everyone all hopped up on Design Patterns that has any background in high-level programming needs to read:

    Revenge of the Nerds, by Paul Graham
    http://www.paulgraham.com/icad.html

    Paul Graham is a Lisp guru, among other things. His perspective on "patterns" is quite in contrast to that of the XP fab four.

    Specifically, I refer to this fabulous quote:

    "If you try to solve a hard problem, the question is not whether you will use a powerful enough language, but whether you will (a) use a powerful language, (b) write a de facto interpreter for one, or (c) yourself become a human compiler for one. We see this already beginning to happen in the Python example, where we are in effect simulating the code that a compiler would generate to implement a lexical variable.

    "This practice is not only common, but institutionalized. For example, in the OO world you hear a good deal about "patterns". I wonder if these patterns are not sometimes evidence of case (c), the human compiler, at work. When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I'm using abstractions that aren't powerful enough -- often that I'm generating by hand the expansions of some macro that I need to write."

    He's my hero. :-)

    Justin Dubs
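    For anyone who has not read the essay, the "Python example" concerns an accumulator generator. The sketch below is a reconstruction under that assumption, not a quotation: the class plays the "human compiler" role by holding the state by hand, while the closure version relies on `nonlocal`, which was added in Python 3 and so postdates the essay.

```python
# Hand-built state: an object stands in for a captured lexical variable.
class Accumulator:
    def __init__(self, n):
        self.n = n

    def __call__(self, i):
        self.n += i
        return self.n


# The same thing with a real lexical variable (needs Python 3's nonlocal).
def make_accumulator(n):
    def add(i):
        nonlocal n
        n += i
        return n
    return add


by_hand = Accumulator(10)
closure = make_accumulator(10)
print(by_hand(5), closure(5))  # 15 15
print(by_hand(1), closure(1))  # 16 16
```

    Both behave identically; the difference is only in how much of the bookkeeping the programmer writes out by hand.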
    • Yes indeed. Two more quotes for you:

      "anything that you can treat as construction can and should be automated"
      Martin Fowler, Refactoring (2000)

      "any [programming] processes that are quite mechanical may be turned over to the machine itself"
      Alan Turing, The ACE Report (1946)

      This is not a dig at MF - I think he's a cut above the GoF because he recognises many of the fundamentals.
    • The topic of Design Patterns (GOF-style) came up as a story a few weeks ago [slashdot.org] if anybody is looking for slashdot-style comments on the topic.
  • refactoring.

    "The Editing Trap [Substituting Writing for Thinking]

    Computers seem to tempt people to substitute writing for thinking. When they write with a computer, instead of rethinking their drafts for purpose, audience, content, strategy, and effectiveness, most untrained writers just keep editing the words they first wrote down. I have seen reports go through as many as six versions without one important improvement in the thought. In such writing, I find sentences that have had their various parts revised four or five times on four or five different days. Instead of focusing, simplifying, and enlivening the prose, these writers tend to graft on additional phrases, till even the qualifiers are qualified and the whole, lengthening mess slows to a crawl.

    Drawn in by the word processor's ability to facilitate small changes, such writers neglect the larger steps in writing. They compose when they need to be planning, edit when they need to be revising.

    [...]

    Reused Prose

    Writers easily become attached to what they have written, even when it serves the purpose badly. The computer frees many writers from this attachment by making the text fluid and continuously editable; for some writers, though, computers make this attachment harder to break. Typewriters challenge this attachment; in writing with a typewriter, writers typically retype each passage several times, which forces them to reread word for word and presents an excellent occasion to hear the passage and make changes. By contrast, a word processor enables writers to reuse passages from the developing piece so easily that reuse becomes a universal, invisible step in writing.

    Being pragmatic, professionals often reuse blocks of material from previous reports. A good writer can do this well, but a less accomplished writer easily succumbs to a clumsy kind of self-plagiarism. Most of the adult writers I have worked with reuse "boilerplate" materials in a simple, modular fashion, stacking blocks of self-contained material in the midst of new passages, having little sense of how to combine the different parts. Most of them are tone deaf to the lurches, shifts in convention, and changes in tone between new and old writings.

    I often advise authors to throw out these drafts and rewrite from scratch, but no one ever has. In part, they are always too busy; but more important, they are not writers. They are unaccustomed to taking responsibility for a piece of writing, devising an effective strategy, and seeing it through. Few of them have developed an effective writing process, and their approaches to writing lack flexibility. Such people do not need an editor; they need a writing instructor-something they lack but your students are fortunate enough to have.

    [...]

    Distortions of Length: Prolix and Telegraphic

    The ease of writing on a microcomputer liberates many writers: And though this liberation helps reticent students, aids brainstorming, and makes many professionals more productive, the very ease of writing can lead to problems. People who have little to say suddenly take a long time to say it. Word-inflation multiplies. Instead of saying it well one time, unfocused writers devise dozens of ways of coming close to saying what they mean. They continue writing. The words pile up. The results look impressive, but I never know quite what the writers meant to say.

    Computers have the opposite effect on other writers. Normally intelligible, they become cryptic. Each mysterious word stands for phrases, sentences, even whole pages of unwritten intentions. I have to pry the words apart to uncover the thoughts concealed between them.

    [...]"
    How Computers Cause Bad Writing [longleaf.net] by Gerald Grow, PhD

  • Are all two-man teams truly of equal skill? If so, you're luckier than I've been. If not - do you trust them to really be able to change any code in the system? Unit tests aren't the check and balance here, because unit tests are usually too low-level to help diagnose global code changes.

    How do you warn people not to refactor code that may be ugly but has been performance tuned (or tuned to meet a requirement that can't be easily reflected in a unit test)? Think about the atomicity and performance issues of creating a transaction log. Your unit tests aren't going to test the key characteristics of this code (as fast as possible and incorruptible in the face of failures). So your culture allows anyone to twiddle with this? This goes hand in hand with my previous question, really. Are all teams and all code created equal, as XP seems to claim? In my experience, no.

    Any anecdotes out there on refactoring wars within development teams?

    On making things explicit: it sounds like Fowler is arguing that static behavior is generally better than dynamic behavior (see his Dictionary vs. hard-coded class example). Is it just me, or do his comments completely ignore the fact that the Dictionary example has far more flexibility than a hard-coded class?
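
    For anyone who hasn't read the interview, here is a rough sketch of the trade-off in question. It is not Fowler's code; the field names and helper functions are made up for illustration. The dictionary takes new fields at runtime with no class change, while the explicit class states its fields up front and fails loudly on a misspelled name.

    ```python
    from dataclasses import dataclass

    # Dynamic version: any key can be added at runtime without touching a class.
    # All field names here are hypothetical.
    person_dict = {"last_name": "Fowler", "first_name": "Martin"}
    person_dict["number_of_dependents"] = 0  # new field, no schema change


    # Explicit version: the fields are declared in one visible place.
    @dataclass
    class Person:
        last_name: str
        first_name: str
        number_of_dependents: int = 0


    person_obj = Person(last_name="Fowler", first_name="Martin")


    def lookup_misspelled_key(d):
        # A misspelled key passes silently: dict.get returns None.
        return d.get("lastname")


    def lookup_misspelled_attribute(p):
        # A misspelled attribute raises AttributeError right away.
        try:
            return p.lastname
        except AttributeError:
            return "AttributeError"
    ```

    So the flexibility is real, but so is the cost: the dictionary version only tells you about a bad field name when something downstream trips over the None.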

    In all, I like a lot of XP's characteristics - but they're always taken to Extremes! I'm serious here - many of the ideas are great (and well-known), but to reap the XP benefit it seems you always have to do it in the most extreme way possible. Despite XP supporters' frequent use of the phrase "common sense", the entire practice seems to disallow it. In fact, many web pages on XP tell you to ignore your instincts and blindly follow XP for a sizable chunk of time before judging it. This is common sense?

