Rewrites Considered Harmful? 670
ngunton writes "When is "good enough" enough? I wrote this article to take a philosophical look at the tendency for software developers to rewrite new versions of popular tools and standards from scratch rather than work on the existing codebase. This introduces new bugs and abandons all the small fixes and tweaks that made the original version work so well. It also often introduces incompatibilities that break a sometimes huge existing userbase. Examples include IPv4 vs IPv6, Apache, Perl, Embperl, Netscape/Mozilla, HTML and Windows. "
Windows XP was a complete rewrite? (Score:4, Interesting)
Was it a "good idea" for Microsoft to rewrite Windows as XP and Server 2003? I don't know, it's their code, they can do whatever they like with it. But I do know that they had a fairly solid, reasonable system with Windows 2000 - quite reliable, combining the better aspects of Windows NT with the multimedia capabilities of Windows 98. Maybe it wasn't perfect, and there were a lot of bugs and vulnerabilities - but was it really a good idea to start from scratch? They billed this as if it was a good thing. It wasn't. It simply introduced a whole slew of new bugs and vulnerabilities, not to mention the instability. It's just another example of where a total rewrite didn't really do anyone any good. I don't think anyone is using Windows for anything so different now than they were when Windows 2000 was around, and yet we're looking at a 100% different codebase. Windows Server 2003 won't even run some older software, which must be fun for those users...
Design decisions (Score:5, Interesting)
Rewrite can be good, if done properly. (Score:3, Interesting)
Don't Forget To Include Winamp! (Score:4, Interesting)
I'm mighty happy sticking with Winamp2, thank you very much.
Re:Ego? (Score:3, Interesting)
Furthermore, all of the features that creep into v1.1,
Maintainability (Score:5, Interesting)
The other side of the rewrite issue is: how long can you continue to maintain code from a legacy system? I worked on a project a couple of years ago that had been migrated from assembler to COBOL and is now being rewritten (as opposed to being redesigned) for Oracle. Never mind for a moment the fact that the customers wanted to turn the Oracle RDBMS into just another flat-file system--which included designing a database that had no enabled foreign key constraints and that was completely emptied each day so that the next day's data could be loaded. . .
Some of the fields that are now in the Oracle database are bitmapped fields. This is done because there's no documentation for what those fields originally represented in the assembler code and because the designers are afraid of what they might break if they try to drop the fields or attempt to map the fields out into what they might represent. I had the good fortune to get out of the project last August. . . last I checked, they had settled for implementing a Java UI over the COBOL mainframe UI.
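For what it's worth, a bitmapped field like that usually just packs several booleans into one integer, and unpacking it is mechanical once (if!) the bit positions are known. A minimal sketch in Python -- the flag names and positions here are pure invention, since the whole problem in that project was that nobody documented the real ones:

```python
# Hypothetical sketch: unpacking a legacy bitmapped field whose flag
# positions were reverse-engineered rather than documented. The names
# and bit positions below are invented for illustration.

FLAGS = {
    "active":   0x01,  # bit 0 (assumed)
    "billable": 0x02,  # bit 1 (assumed)
    "archived": 0x04,  # bit 2 (assumed)
}

def unpack_flags(raw: int) -> dict:
    """Expand a bitmapped column value into named booleans."""
    return {name: bool(raw & mask) for name, mask in FLAGS.items()}

print(unpack_flags(0x05))  # active and archived set, billable clear
```

Once the fields are named like this, dropping the bitmap in favor of real boolean columns becomes a mechanical migration instead of a leap of faith.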
Anyway, my point is this: at some point, you have to decide whether the system you're updating is worth further updates. Can you fix everything that's wrong with the code, or are there some things you'll have to jerry-rig or just shrug your shoulders and give up on? Under circumstances like what I mentioned above, I truly think you're better off taking your licks and designing from scratch, because at least that way you can take advantage of the new features that more recent database software and programming languages have to offer.
as a programmer's skills increase (Score:5, Interesting)
Re:Windows XP was a complete rewrite? (Score:2, Interesting)
Structural bugs (Score:3, Interesting)
Take for example, Windows.
Sometimes you really do need to throw out the baby with the bath water, if the baby is that dirty. Besides, making a new one can be fun!
rewrite, but do it correctly (Score:2, Interesting)
Apple OS X Aqua (Score:1, Interesting)
Re:Design decisions (Score:5, Interesting)
LOL! Good grief man... the client I'm working with, their specs can't see past 4-6 weeks!
Over the last year and a half I've been working on building a "policy engine" that manages this company's various business policies... everything ranging from ordering, or communications to whatever.
Well, the ding-dang business users and their minions the "business analysts" can't see past a month or so... then oops... more functionality... change existing functionality... because "oops... we really need it to do this," to the point where I have to make this a unified system of "one-offs"
Yeah, ugh... and the idea of a "rewrite" has come up because right now... the codebase is huge... it's a mess and looks like, well, patchwork. We are trying to get management buy-in... and calling it "upgrading and refactoring" because we know full well that "rewrite" is a dirty word in these parts
rewrites are always a good idea. (Score:3, Interesting)
Why? I have better ways of doing things now, I need to be scalable to handle a worldwide company instead of simply a regional tool, and to increase speed, usability and stability.
a rewrite is the only way to achieve these things. anyone who has been with a project for an extended period of time and had to expand/modify it beyond its original capabilities knows this.
aha-time (Score:2, Interesting)
There comes a time in every software development project where the best thing to do is to delete all the code and start again; perhaps you are there now?
But that is usually when a student has started coding without knowing squat about how to solve the assignment and realizes it after a couple hundred lines of code.
Refactoring + Unit Testing (Score:2, Interesting)
A full rewrite may have a cleaner architecture, but often, those fixes for particular, tiny problems are lost in the rewrite ("what is this two-line if supposed to do?").
The solution? Redesign, but get to the new architecture from the old architecture by refactoring, one small step at a time. To do it quickly and with confidence, do unit tests. Lots of unit tests. Ideally, those tiny problems that prompted the fixes should have unit tests specifically built to trigger them.
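For example, a two-line fix can be pinned down with a test that fails without it, so the fix survives any later refactoring. A minimal sketch (the parse_amount routine and its comma edge case are invented for illustration):

```python
import unittest

def parse_amount(s: str) -> float:
    """Parse a currency string like '$1,234.50' into a float."""
    s = s.strip().lstrip("$")
    # The "two-line fix": strip thousands separators before converting.
    # Without this line, '1,234.50' raises ValueError.
    s = s.replace(",", "")
    return float(s)

class TestParseAmountRegressions(unittest.TestCase):
    def test_thousands_separator(self):
        # Pins the fix: this input is exactly the one that used to break.
        self.assertEqual(parse_amount("$1,234.50"), 1234.50)

    def test_plain_value(self):
        self.assertEqual(parse_amount("42"), 42.0)
```

Run it with `python -m unittest`; now anyone staring at that mysterious replace() line can delete it, watch the test fail, and learn exactly why it's there.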
Re:as a programmer's skills increase (Score:3, Interesting)
I was once working on a time card program for a business a few years ago. I was relatively new at programming in the real world (I was still in high school, actually) and after the initial program was 'finished' and we introduced it into the current setup, we had to fix bugs. After fixing bugs, and fixing the bugs that the previous fixes introduced, I realized that if I had written the program using a different layout, the entire system would just be better.
I didn't ever completely rewrite the program, but I did slowly migrate the codebase over to the new layout (all in all, I guess I probably rewrote the thing from scratch again... but it didn't seem that way).
Author ignores some important points (Score:3, Interesting)
Give and take (Score:3, Interesting)
I'm not sure that the author of the story really discusses the give and take of patching an old codebase, vs a complete rewrite. Instead, he focuses on a negative that isn't really there.
As soon as I read the headline, the first apps that sprang to mind were Sendmail, and WuFTPD. Both have been historically full of holes, and a complete mess. I haven't really looked at Sendmail code, but having to configure each option with regular expressions, while powerful, is just lame (IMO). The WuFTPD code is a mess. It's been passed on and passed on, and patched and patched. It eventually became a total whore that nobody really wanted to touch on any level.
Now, both of these (AFAIK) were not rewritten from scratch, and suitable replacements have been produced all over the place. However, would it have been so bad to rewrite those from scratch, while still maintaining the older versions? How would it be any different from, say, the Linux kernel? I run 2.4.x on my production machines. 2.6 is out, but I'm not going to run it until it's proven itself elsewhere (and is integrated into a mainstream distribution). 2.4 will be maintained for a long, long time -- and it's not even a complete rewrite (AFAIK). Usually code rewrites are adopted by the public... not right away, but eventually.
Finally, his gripes about Mozilla/Netscape are interesting, but not really warranted (and he does acknowledge this). The applications became more bloated as system resources became more plentiful. Software tends to do this -- it has to do with greater layers of abstraction as hardware gets better. But furthermore, it's because Mozilla had to be able to "compete" with the latest and greatest from Microsoft... which MSFT will always be updating as new standards are added.
The point is, it doesn't really matter. It doesn't do a disservice one way or the other, and since much of the software we're talking about is Free Software, it matters even less, since the code is out there -- if there are enough people using the older versions, there will always be someone to maintain it.
Maintenance programming can be done well (Score:3, Interesting)
1) The usual way: fix small sections of code in the same style and technique that it was originally written,
2) rewrite large sections of code that were _truly_ hard to maintain, taking great care to leave something much more maintainable behind. This route requires much more thorough testing than (1).
I remember another of us "programmers" who said he didn't do maintenance, he was a "development animal." Wrote abysmal code. When he rewrote a major module of our system he tried to make FORTRAN look like ALGOL, using GOTO statements in the righthand margin of the code.
I bet that module got a rewrite not long after that. Something that was maintainable and written for the language that was being compiled, not the language that didn't even exist on our system.
Re:speaking of IPv6 vs IPv4 (Score:2, Interesting)
1. This is a Cisco problem; the general problems are in the points below:
2. I didn't know there were "16.7 million addresses per square metre of the earth's surface, including the oceans" -- interesting. The problem with the present system is the 'chunking' of IP addresses... large groups of IP numbers lie dormant, reserved for academic/government institutions, while the portion of private addresses is being squeezed in places. The IPv4 problem is an allocation problem, but allocations are determined by committees, and as soon as disparate and opposed committees get going, progress and action stop for years. Yes, it is annoying bureaucracy, but a technical work-around may well be easier than fighting this pettiness.
3. A few years ago the internet was here and it did its job while the processing power of out-of-the-factory internet infrastructure was so much lower than it is today. Volumes are higher now, the dotcom boom of endless financing has gone (maybe?!), but I believe the internet with IPv6 and its higher volumes could be cheaper now than it was (I am happy to have my mind changed with hard data, but not supposition and not opinion based on loose facts and assumed weightings). Legacy equipment and software has to be replaced sometime; all that COBOL was rewritten pre-Y2K pretty easily.
4. See 3.
There may be "16.7 million addresses per square metre of the earth's surface," but if IPv6 can provide a solution for the next few decades, and deploying it is easier than overcoming the present bureaucracy in IPv4 allocation (which is huge and incredibly difficult to overcome, IMHO), then it is worth it.
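For what it's worth, the density arithmetic is easy to redo from scratch. Taking the full 128-bit IPv6 space over the Earth's surface (roughly 5.1e14 square metres) gives a much larger number than the quoted 16.7 million, which presumably assumes some coarser allocation unit than a single address:

```python
# Back-of-the-envelope IPv6 address density. Earth's surface area
# (land + oceans) is roughly 5.1e8 km^2 = 5.1e14 m^2.
EARTH_SURFACE_M2 = 5.1e14

total_addresses = 2 ** 128                       # full IPv6 address space
per_m2 = total_addresses / EARTH_SURFACE_M2

print(f"{total_addresses:.2e} addresses total")  # ~3.40e+38
print(f"{per_m2:.2e} per square metre")          # ~6.67e+23
```

Either way, the conclusion holds: address exhaustion under IPv6 would be an allocation-policy failure, not an arithmetic one.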
A similar article... (Score:3, Interesting)
Here's a much better article with a similar thesis: Joel on Software - Things You Should Never Do, Part I [joelonsoftware.com]
There are parts of it that I've never agreed with:
This should never happen! If you have all these bugfixes in your code and no way to know why they were put in, you've screwed up badly. You should have each one documented in your revision control logs, your bug tracker, or at the very least a comment in the code.
So the idea that you'd have all these important bugfixes without any way of knowing what they are should be laughable! Given a codebase like that, you probably would be better off throwing it out, because it was clearly developed without any kind of discipline.
Also, he's embellishing a lot. If it's just "a simple routine to display a window," it doesn't need to load a library, require Internet Explorer, etc., and thus can't possibly have bugs related to those things. He makes the situation sound a lot more extreme than it really is.
But in general, I think he's right. Refactor, don't rewrite. That's the same thing the XP people say to do. They also have extensive unit tests to make it easier to refactor with the confidence that you haven't screwed anything up. Which can help in situations like this [joelonsoftware.com]:
Ugh. I bet it would have been a lot less tuning if there were a decent way to test that the change to support #60 hasn't broken any of the previous 59 server types. Or that just a refactoring hasn't broken any.
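That kind of check is exactly what a parametrized test is for: run the same assertion against every known server quirk, so the fix for #60 can't silently break #1 through #59. A hypothetical sketch (the quirk table and normalize routine are invented for illustration):

```python
import unittest

# Hypothetical: one routine must handle greetings from many quirky
# server types. The quirk table below is invented for illustration;
# in real life each entry would come from a bug report.
QUIRKS = {
    "serverA": "HELLO\r\n",   # CRLF line ending
    "serverB": "hello\n",     # lowercase, LF only
    "serverC": "  HELLO  ",   # padded with spaces
}

def normalize(banner: str) -> str:
    """Canonicalize a server greeting regardless of quirks."""
    return banner.strip().upper()

class TestAllServerTypes(unittest.TestCase):
    def test_every_known_quirk(self):
        # subTest reports each failing server type individually,
        # instead of stopping at the first one.
        for name, banner in QUIRKS.items():
            with self.subTest(server=name):
                self.assertEqual(normalize(banner), "HELLO")
```

Adding server type #60 is then one new table entry, and one test run tells you whether the other 59 still pass.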
I don't think this advice always applies, though. I rewrote one major project from scratch at work: our personnel system. Our database schema was hopelessly denormalized and broken. That's not something you can refactor easily -- with a widely-used database schema, it's easier to make one big change than many smaller ones, because a lot of the work is just hunting down all the places that use it. That's easier to do once. So I believe there are situations where this advice does not apply, but I also believe they are rare.
Hard to write code that doesn't need rewrites (Score:5, Interesting)
I'm not a great programmer, and don't do it regularly, but when I have written fairly big projects, I find that the need for rewrites came out of poor design choices that I had made.
I typically start out with something small, that can handle the core functionality expected from the project. Then I try to add features and fix bugs.
Eventually, the code becomes very difficult to maintain, and ultimately, you get to the point where the ad-hoc architecture simply won't support a new feature.
To the user, everything looks fine, everything runs reliably, but under the hood, there are real problems.
My worst experience was with a web app. I started out with script-based pages in ASP (not my call), and kept writing new pages to do different things. It got to the point where I had about three hundred script pages and lots of redundant code.
When it would become necessary to change the db table structures for another app hitting the same data, I'd have a lot of trouble keeping up, fixing my code quickly in a reliable way.
The problem was that it just wasn't possible to stand still. I couldn't go to my boss and say, "I need a three month feature freeze, to rewrite this stuff."
Writing a new version in parallel was hard because maintaining the crummy but functional code was taking more and more time. It was a real problem, and caused me a fair amount of pain, and suffering.
After digging myself into that hole, I stepped back and tried to figure out how other people did it. I would have been a lot better off building on top of something like Struts.
The lesson I took from this is that it's important to study design patterns, and to use tested frameworks whenever possible. You have to think like an engineer, and not someone who codes by the seat of his pants. I'm not an engineer, so it's not easy for me to do that.
I'm not saying that the people who run the projects mentioned are in the same boat that I was. As programmers, they're in a different league.
But they're often working on problems that aren't well understood. Patterns and frameworks are ways to leverage other people's experiences. But if that experience doesn't exist, you have to guess on certain design decisions, and see how it comes out.
Top notch programmers are obviously going to guess a lot better than someone like me will. But they're still going to make mistakes. When enough of those mistakes pile up, you're going to need to do a rewrite.
You could make a point that's opposite of the one that the article makes by looking at the java libraries.
They made choices with their original AWT gui tools that were just wrong. They weren't dumb people -- they just didn't know, the experience necessary to make the right choice simply didn't exist. Once they tried it, they realized it wasn't working, and they came back with Swing.
Rewrites are always going to be necessary for new sorts of projects, because you can't just sit in your armchair and predict how complex systems will work in the real world. You have to build them and see what happens.
There's room for middle ground (Score:3, Interesting)
Once all of the old code has been either pasted back in, revised or deleted, I've usually got a program that does everything the old one does and more, but it is smaller, simpler and cleaner.
Most of the subtle features and knowledge embedded in the old code is not lost by using this approach; it gets pulled back in.
Re:Netscape 4.x fast? (Score:3, Interesting)
4.x is much faster than Mozilla. By a long shot. Its downfall, aside from having unmaintainable source code, was that it was unstable, did not follow any kind of standards, and had a tendency to screw up whatever *should* have worked right. I think Internet Explorer 1.0 is the only browser in existence to beat the general crappiness of Netscape 4.x.
Give me a slow and bloated, yet stable and standards-compliant web browser over the opposite (Netscape 4.x) any day.
my experience with J2EE rewrite from CGI (Score:2, Interesting)
But 4 years ago management had to jump on the J2EE bandwagon and introduce the "Java" version of our financial product. Here it is FOUR years later and our "new" Java app is still not in production because of spec changes, clustering issues, etc., etc., etc.
I kept telling them K.I.S.S., but sales said that we need the new buzzwords to get clients and everyone knows about "Java". Hell, the way I see it, every time management changes their mind, it just adds to my job security since we need to make more changes.
Re:Rewrite of the article (Score:2, Interesting)
Waaaaaaa!!
Excellent rewrite. I found this post to be much clearer and more concise than the original article, while still maintaining the same message. I'm now convinced that rewrites can be A Good Thing.
Re:Ego? (Score:3, Interesting)
I've done (and still do) regular rewrites. In my case, it frequently ends up being rewritten because Management didn't accept my initial recommendations on development and/or back-end technology. Their chosen technology, which I am compelled to use, fails to adapt to easily foreseen circumstances and I have to rewrite my stuff to target the new technology.
I can't blame them in the short run, though (even if the long run is certainly going to be very painful), as they had to get -something- up and running in just a few weeks, and the really bad technology was the only thing that was close enough to the desired target functionality to meet the deadline.
Sometimes I rewrite old applications because they cannot possibly adapt to desired criteria or just perform so badly that they just scream for a redesign. All my existing VB apps fall into this category.
Re:Windows XP was a complete rewrite? (Score:3, Interesting)
MS IE? (Score:2, Interesting)
I know a lot of it had to do with MS's business tactics, but Netscape/AOL took like 5 years to put out a new browser after 4.7. And do you guys even remember Netscape 6 Preview? What a god-awful browser that was. My friends started calling it Nutscrape cuz it was so painful to use
It _can_ be faster... (Score:3, Interesting)
Start adding tables and forms, trying to reflow the page when resizing (especially if it's a long one), and prepare for the wait of your lifetime.
At least mozilla can display part of a page while the rest renders, and resolve more than one domain name at a time when connecting to resources in parallel.
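Resolving several names at once doesn't require anything exotic; a thread pool over the standard resolver is enough to sketch the idea (the host names below are placeholders):

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Sketch: resolve several host names concurrently instead of one at a
# time, as a page pulling resources from many domains would want to.
# The host list is illustrative.
hosts = ["www.example.com", "www.example.org", "www.example.net"]

def resolve(host):
    """Resolve one name; return (host, None) on failure instead of raising."""
    try:
        return host, socket.gethostbyname(host)
    except socket.gaierror:
        return host, None

# Each lookup blocks in its own thread, so the slowest name no longer
# serializes all the others; map() still yields results in input order.
with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
    for host, addr in pool.map(resolve, hosts):
        print(host, "->", addr)
```

A blocked or dead nameserver then delays only its own lookup, which is exactly the behavior the old one-name-at-a-time browsers lacked.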
As long as you don't make this mistake... (Score:3, Interesting)
I usually find Jamie Zawinski to be an arrogant rude asshole, but occasionally our opinions overlap. In this brief rant [jwz.org] he describes the Cascade of Attention-Deficit Teenagers software development model, which often leads to rewriting code from the ground up. Over and over and over.
Stay out of that trap, and actually fix stuff during your rewrite, and there's nothing at all wrong with doing it over from scratch. Rewrite it just because you don't feel that modifying other people's code is sexy enough, or that your version will surely be bug-free -- because, hey, it's you -- or because "you would have done things differently," and you'll have failed.
Re:Windows XP was a complete rewrite? (Score:2, Interesting)
There is no OS/2 code in Windows NT. Microsoft made that very clear to those in the Windows NT 3.1 beta.
Windows NT 3.1 was a true first generation product.
Netscape's mistake was not rewriting soon enough (Score:1, Interesting)
http://ocw.mit.edu/NR/rdonlyres/Electrical
Netscape Story
For PC software, there's a myth that design is unimportant because time-to-market is all that matters. Netscape's demise is a story worth pondering in this respect.

The original NCSA Mosaic team at the University of Illinois built the first widely used browser, but they did a quick and dirty job. They founded Netscape, and between April and December 1994 built Navigator 1.0. It ran on 3 platforms, and soon became the dominant browser on Windows, Unix and Mac. Microsoft began developing Internet Explorer 1.0 in October 1994, and shipped it with Windows 95 in August 1995.

In Netscape's rapid growth period, from 1995 to 1997, the developers worked hard to ship new products with new features, and gave little time to design. Most companies in the shrink-wrap software business (still) believe that design can be postponed: that once you have market share and a compelling feature set, you can refactor the code and obtain the benefits of clean design. Netscape was no exception, and its engineers were probably more talented than many.

Meanwhile, Microsoft had realized the need to build on solid designs. It built NT from scratch, and restructured the Office suite to use shared components. It did hurry to market with IE to catch up with Netscape, but then it took time to restructure IE 3.0. This restructuring of IE is now seen within Microsoft as the key decision that helped them close the gap with Netscape.

Netscape's development just grew and grew. By Communicator 4.0, there were 120 developers (from 10 initially) and 3 million lines of code (up a factor of 30). Michael Toy, release manager, said:

We were in a really bad situation... We should have stopped shipping this code a year ago... It's dead... This is like the rude awakening

Interestingly, the argument for modular design within Netscape in 1997 came from a desire to go back to developing in small teams. Without clean and simple interfaces, it's impossible to divide up the work into parts that are independent of one another.

Netscape set aside 2 months to re-architect the browser, but it wasn't long enough. So they decided to start again from scratch, with Communicator 6.0. But 6.0 was never completed, and its developers were reassigned to 4.0. The 5.0 version, Mozilla, was made available as open source, but that didn't help: nobody wanted to work on spaghetti code.

In the end, Microsoft won the browser war, and AOL acquired Netscape. Of course this is not the entire story of how Microsoft's browser came to dominate Netscape's. Microsoft's business practices didn't help Netscape. And platform independence was a big issue right from the start; Navigator ran on Windows, Mac and Unix from version 1.0, and Netscape worked hard to maintain as much platform independence in their code as possible. They even planned to go to a pure Java version ('Javagator'), and built a lot of their own Java tools (because Sun's tools weren't ready). But in 1998 they gave up. Still, Communicator 4.0 contains about 1.2 million lines of Java.

I've excerpted this section from an excellent book about Netscape and its business and technical strategies. You can read the whole story there:

Michael A. Cusumano and David B. Yoffie. Competing on Internet Time: Lessons from Netscape and Its Battle with Microsoft. Free Press, 1998. See especially Chapter 4, Design Strategy.

Note, by the way, that it took Netscape more than 2 years to discover the importance of design.