Code Generation in Action

Posted by timothy on Thursday September 04, 2003 @12:40PM from the automation dept.

Simon P. Chappell writes "Now, I enjoy a good technical book more than the next geek, but it's been quite a while since one left me quite so excited with the possibilities that it presented. Code Generation in Action is beyond interesting, it is a masterful tome on its subject matter, written by one who is obviously an experienced practitioner in his craft." If "code generation" isn't a familiar term to you, this enthusiastic overview on devx.com is a concise introduction to what code generation is about, though it makes no pretense of ambivalence about its importance as a programming tool. Read on for the rest of Chappell's review.

Code Generation in Action
author	Jack Herrington
pages	342 (10 page index)
publisher	Manning
rating	9
reviewer	Simon P. Chappell
ISBN	1930110979
summary	A masterful tome.

Overview

Code Generation in Action, CGiA to its friends, is presented in two parts. The first part is four chapters long. It covers a code generation case study and the basic principles of code generation, including the different types of code generation strategies together with reasons why you would or would not use each strategy. The book's chosen toolset for building generators is presented next, and some walk-through examples of building simple generators wrap up the first part.

The second part is a cross between a cookbook and a list of engineering solutions. There are nine chapters, and the breadth of solutions covered is quite impressive, running the gamut of user interfaces, documentation, unit tests and data access code. Each chapter presents a couple of solutions within its topic area, often for different technologies within that topic. For example, the user interface chapter covers the generation of JavaServer Pages, Swing dialog boxes and then Microsoft MFC dialog boxes. No favouritism here!

What's To Like

There's a lot to like about this book. The writing is clear and the prose is good. I found the introduction to be very compelling, and I felt completely drawn in by the opening case study. The four chapters of part one make a concise case for code generation, and would be very useful in helping to persuade co-workers and management of the positive risk/benefit ratio of trying code generation on a live project.

It would be impossible to try enough of any solution from part two in a time-frame short enough to make this review useful, but in the solutions that match my areas of knowledge, I found myself admiring Herrington's straightforward and pragmatic approach.

What's To Consider

There are two aspects of this book that I want to flag. The first, which some will love and others will hate, is the choice of generator language for CGiA. The author has chosen to use Ruby as his working language. This is an interesting choice. Ruby is certainly a language that is inspiring a lot of admiration these days (in fact, it's hard to get Dave Thomas to stop talking about it :-), but with the majority of the code-generation examples being for Java-related technologies, I wonder why Java was not selected instead.

I also found myself wondering about the lack of discussion of how to integrate these Ruby tools into a typical Java build process. Many developers I know use Ant to bring automation and consistency to their builds, yet the book doesn't mention this. (JRuby anyone?) Certainly something to consider for the second edition or future code-generation authors.

Summary

This is a masterful tome that inspires and delights, although the two issues raised above did cost it a perfect score of ten.

Code generation fundamentals
1. Overview
2. Code generation basics
3. Code generation tools
4. Building simple generators
Code generation solutions
1. Generating user interfaces
2. Generating documentation
3. Generating unit tests
4. Embedding SQL with generators
5. Handling data
6. Creating database access generators
7. Generating web services layers
8. Generating business logic
9. More generator ideas

You can purchase Code Generation in Action from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

Something missing? Like a definition? (Score:5, Insightful)

by Trigun ( 685027 ) writes: <evil@evil e m p i r e . a t h .cx> on Thursday September 04, 2003 @12:46PM (#6870147)

Code generation is a time-saving technique that helps engineers do better, more creative, and useful work by reducing redundant hand-coding. In this world of increasingly code-intensive frameworks, the value of replacing laborious hand-coding with code generation is acute and, thus, its popularity is increasing.

Why not put a little bit about code generation in the review? Even a little blurb like "It builds the mundane portions of coding" would have helped out a bit.

Code generation == metaprogramming (Score:5, Insightful)

by Anonymous Coward writes: on Thursday September 04, 2003 @12:52PM (#6870221)

And Lisp has been doing it for years, as has, to a lesser extent, C++ (and Ada) templates. And the good thing about Lisp metaprogramming is that it is in the one language - Lisp.

Having 1 language generating another - Ruby and Java - is the recipe for confusion and complexity.

We didn't know we were doing anything special... (Score:5, Insightful)

by Fnkmaster ( 89084 ) writes: on Thursday September 04, 2003 @12:54PM (#6870240)

But we built a very large code generation system as part of my old company's electronic trading system. It used the JavaDoc system and a custom JavaDoc parser we built to generate oodles and oodles of very repetitive code from the base business objects, such as XML parsers and translators and the like. It was a big time saver, undoubtedly. The big problems we had were versioning and its interaction with our build system, and more importantly, the fact that the code generator itself becomes very complex to read, such that only one or two developers are capable of making changes to it.

My rule of thumb is if I find myself writing code that bores me silly and thinking "a frigging monkey could be taught to code this piece", I will strongly consider writing a program to write the program for me. Be warned about the maintenance and readability issues though in larger development projects where there are a lot of mediocre programmers around. You can always assign those mediocre programmers to hack on monkey-easy code, but you can't get them to hack on a code generator, so carefully consider the nature of the development organization you are dealing with, and the tradeoff between available "high-value" time and resources vs. "low-value" (i.e. monkey coder) time and resources. This perspective has been brought to you by my pointy-haired side.

Am I FUD? (Score:2, Insightful)

by mcc ( 14761 ) writes: <amcclure@purdue.edu> on Thursday September 04, 2003 @12:58PM (#6870283) Homepage

Code Generation is for people who don't understand or are too lazy for abstraction, and it will ALWAYS have the problem of, what if you want to go through all your projects and change one single thing about the generated part of your code? What if you have a hundred tiny projects, each of which contains the generated code snippet that needs to be changed? Let's hope either the change you want to make is very simple or you are very good at regular expressions.

If you are able to clearly separate your code into "You can edit here" and "You cannot edit here" chunks, you can DEFINITELY separate your code clearly into local chunks and delegated chunks-- i.e., "you cannot edit here" means you just do stuff, "you can edit here" means you talk to a delegate object or method. If EJBs are so frigging complicated that you have to do a bunch of repetitive grunt work that's the same every time you do it, you should somehow be building a slightly higher-level abstraction off of the things you do in common on each EJB and working from there. If EJB does not make this possible you should perhaps not be using EJB. There's always some way to do these things through abstraction, and it will ALWAYS in the end wind up more flexible than either generated or cut and paste programming.

If you've got a code generator sitting around, then sure, go ahead and use it. But I cannot think of any case in an object-oriented language where it would be both less work and more maintainable to write a code generator than to just abstract away the parts that would be autogenerated.

Re:Software Engineering is not just there yet (Score:2, Insightful)

by a_ghostwheel ( 699776 ) writes: on Thursday September 04, 2003 @01:04PM (#6870337)

Nobody is talking about completely automating whole development. But there are two things where code generation helps immensely: 1) "mundane code", like accessor/mutator methods, class/interface definitions, standardized header comments, object mapping - this is covered by generators built into Rose and similar products. Very helpful. 2) The ability to specify business rules in a more efficient form (be it XML, some proprietary language, etc.) and generate code appropriate for your framework. This is a technique which I personally used several times on large custom software development projects - code quality goes way up. To a certain extent, the same approach is used in any modern RAD tool.

not impressed. (Score:3, Insightful)

by Pinball Wizard ( 161942 ) writes: on Thursday September 04, 2003 @01:11PM (#6870402) Homepage Journal

As engineers we build time-saving applications for others but never think to apply the power of computers to our own problems.
Huh? Software engineers use more software than anyone else. We have tools for our tools. I found the above statement bordering on the ludicrous, and almost stopped reading at this point.
Code that is copied and pasted to multiple places is difficult to maintain properly across all of the copies. Active code generation does not suffer from the same maintainability issues as copy-and-paste coding. When you need to fix something, you apply the bug fix to the templates used to generate the code, which then propagates the fix to all of the code maintained by the generator. This design ensures that no code that needs fixing is left scattered around and forgotten.
That's why we use functions and classes. Then, when you change your function, the changes are magically propagated to all the places in your code where that function was called! Copy and paste programming has been frowned upon pretty much since the days when the goto was declared bad programming practice.
It really sounds like this book is just putting on a fancy name for an incomplete set of good programming practices. Really, what is covered here that Design Patterns doesn't cover in a much more thorough and professional way?

Code generation a necessity (Score:4, Insightful)

by russotto ( 537200 ) writes: on Thursday September 04, 2003 @01:12PM (#6870408) Journal

I work on a large Java project where we use entity EJBs. Code generation isn't an _option_ here, it's a necessity. We have hundreds of tables (over 500) and each of them has an EJB. Writing out the infrastructure for each and every one by hand would be a huge and boring waste of time. I think the necessity for code generation actually points to a problem with design of EJB itself, but that we're pretty much stuck with.

Swings and roundabouts (Score:5, Insightful)

by Rogerborg ( 306625 ) writes: on Thursday September 04, 2003 @01:17PM (#6870457) Homepage

I worked at a telco where we generated C code on the fly from high level Specification and Description Language [etsi.org] for the main call control processing.
It was a great idea... in theory. In theory, it was impossible for the implementation to get out of sync with the detailed design (the SDL). In theory, there's no difference between theory and practice, but in practice there is. Some of the features that we had to add simply couldn't be modelled in SDL, plus there were performance issues, and it produced ugly source.
It was used for fifteen years (yeah, pre-ANSI), but it eventually collapsed under the weight of all of the hacks that were required to work around the limitations. We eventually had to admit that the behaviour of the complete source (generated plus all the stuff around it) was now so different from that defined by the SDL that it was no longer worth putting up with the limitations of the SDL.
In the end, we just took a snapshot of the generated code and set developers free to actually fix it rather than hack around it. At that point, there were only a few people left who even knew SDL, so there were very few tears shed. The rest of us cheered, and the product got significantly cleaner as we refactored the bejeesus out of all the generated C and removed the hacks.
I'd recommend giving code generation a try, but don't be ruled by it. Once the product is mature, if the code generation is limiting you, then don't be afraid to drop it and fix the lower level generated code.

Re:Am I FUD? (Score:3, Insightful)

by querencia ( 625880 ) writes: on Thursday September 04, 2003 @01:22PM (#6870501)

Code Generation is for people who don't understand or are too lazy for abstraction

The article that timothy suggested as background reading (here [devx.com]) points out that code generation is most useful when you're forced to use a framework that requires lots of simple-minded "scaffolding-style" code. EJB is the prime example.

In other words, I agree with you --- if code generation is useful, it's probably because the infrastructure you're using was poorly designed. But that doesn't mean you don't have to use it. Managers who have no idea what J2EE is require you to use J2EE. So you use code generation.

I thought J2EE was supposed to simplify things (Score:3, Insightful)

by KenSeymour ( 81018 ) writes: on Thursday September 04, 2003 @01:28PM (#6870552)

A quote from the Sun web site:

J2EE technology and its component based model simplifies enterprise development and deployment.

But now I hear that we need code generation to keep up with all the mundane tasks made necessary by the use of EJBs.
So we build a code generator and we have to maintain that.

This is on top of all the J2EE design patterns you are supposed to do because the world would come to an end if you just accessed a database table using JDBC directly.

Once in a while someone should look at the assertion that it would be harder to maintain a lot of embedded JDBC code in your application than it would be to maintain the 5 or so classes you need for each business object in order to maintain architectural purity.

Re:Code generation a necessity (Score:4, Insightful)

by BigGerman ( 541312 ) writes: on Thursday September 04, 2003 @01:33PM (#6870617)

there is a _huge_ design problem with EJB.
To me it manifests itself in the ass-first design:
if I work with a framework, the typical scenario for OOP would be: framework provides interface, I implement it with my classes so they can play in the framework.
With EJBs, it is the other way around: you define an interface and the container/framework creates classes and implements your interface for you!
Code generation became so popular with J2EE for this very reason: there is _always_ a step where you need to produce a lot of redundant EJB artifacts so you might as well automate it.

Re:NOT about compiler code generators (Score:3, Insightful)

by naarok ( 102579 ) writes: on Thursday September 04, 2003 @01:40PM (#6870700) Homepage

A code generator is a compiler. It takes some source and produces an implementation of the source using a more primitive language.

I wrote a code generator for EJBs. The template that was passed in was very much the source as a very high level language. The output was Java code.

I wrote a Pascal compiler for a virtual machine. The source taken in was Pascal, the output was VM code.

The two were very similar except the grammar for the code generator was much simpler so I didn't need a complex lexical parser.

Don't dismiss code generation as some trick. It is a very valid approach to simplifying repetitive steps and is as much a compiler as what you are thinking of.

Re:good stuff (Score:3, Insightful)

by zero_offset ( 200586 ) writes: on Thursday September 04, 2003 @01:43PM (#6870734) Homepage

Ugh. XSLT is a nightmare.
We farmed out a project to a company which used a ton of "elite" off-shore resources, and they sent back a project which relied heavily on XSLT. Granted it made sense on paper -- prior to their involvement, the data was already available in XML format. But the net result was a nightmare to debug, maintain, and upgrade. XSLT reminds me of the old saying about APL -- it's a "write-only" language.
Ok, I concede it's not actually as bad as APL, but it isn't nearly as easy to debug as regular old XML DOM based code, and we've done some side-by-side tests that adequately prove (to us; hey, they're our tests) that the code isn't any more concise or easy to write. We already knew it was harder to read and debug. And the current crop of parsers seem to run XSLT a LOT slower than the equivalent DOM calls. It strikes me as a solution in search of a problem...

Re:Am I FUD? (Score:5, Insightful)

by smagoun ( 546733 ) writes: on Thursday September 04, 2003 @01:44PM (#6870745) Homepage

You're assuming you run the generator once, and that's it. That's what many people do, and it's wrong. The generator should be part of the build cycle. If you want to change the generated code snippet, change the generator!
While I agree with you in principle - that better abstraction is usually the way to go - that's not always possible in the real world. For example, sometimes you have to produce an API for someone else, or hook up to their API. In those cases, better abstraction isn't always an option. Sometimes your boss says "you're using EJB" and you're stuck with it. In those cases a generator can be a big help.
The cool thing about integrating a generator into your build is that you get the benefits of abstraction without many of the drawbacks. The generator becomes your abstraction, so you can make modifications in one place. Sure, using a generator requires a little more thought than cut-n-paste. So does proper abstraction. The two aren't all that different, and both approaches have their place.
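One way to picture a generator that lives inside the build cycle is a make-style freshness check: regenerate whenever the input is newer than the output. The following is only a minimal sketch in Python under that assumption. The names `is_stale` and `regenerate_if_stale` are hypothetical, and `generate` stands for any function from source text to generated text; none of this comes from the book or from a particular build tool.

```python
import os


def is_stale(source_path, output_path):
    """True when the generated file is missing or older than its source."""
    if not os.path.exists(output_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(output_path)


def regenerate_if_stale(source_path, output_path, generate):
    """Re-run `generate` (source text -> output text) only when needed.

    Returns True if the output file was rewritten, False if it was up to date.
    """
    if not is_stale(source_path, output_path):
        return False
    with open(source_path, encoding="utf-8") as src:
        text = generate(src.read())
    with open(output_path, "w", encoding="utf-8") as out:
        out.write(text)
    return True
```

A build script would call `regenerate_if_stale` for each spec file before compiling, so a change to the generator input (or a forced rebuild after changing the generator itself) flows into every generated file.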

Re:Something missing? Like a definition? (Score:5, Insightful)

by register_ax ( 695577 ) writes: on Thursday September 04, 2003 @02:00PM (#6870944) Journal

I concur.
[rant]
What the hell is up with these book reviews? I equate book reviews with SCO. All I see is a large body of reviewers making unsubstantiated claims to a book that tickled their fancy in a personal sort of way. They say basically, "this book tickled a nerve, but I will not say what I already know that led to that nerve being in place already." Of course it turns out then to be some haphazard therapy session where the reviewer begins to delve into themselves while completely ignoring their audience!!!
OK, I know /. may not be extremely high with the English majors, but how about we don't post such things submitted by such arrogant posters? (is it possible?) I don't claim myself as being a master of prose, but you won't see me trying to do something I knowingly can't.
One more point, the reviews most commonly given are little more than amazonian [www.amazon] reviews. They rave about how great something is, realize that only amounts to 2 sentences, and proceed with immediately filling in the blanks with worthless prose to create content.
And lastly, I don't receive book reviews on my main page because of their vapid nature. This means that /. is losing on my potential business. I would love to see them prosper, but they have to create something that is interesting. Slashdot is all about bringing the obscure to the masses, but I hardly see that through their book reviews. What is the freakin deal with O'Reilly's cookbooks? Are there any really, really bad O'Reilly books at all? We all have hordes of them or wish we had complete collections. We know what a cookbook is and they don't really differ that much between subject.
I am trying to prove a point here, that reputation and common sense through the title allow for little differentiation between your preconceived notions about the book and what the book is realistically about. I want books to come to the forefront from the dirges, such as the recent classic The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory [amazon.com] (Feb 2000) by Brian Greene, or "been around the block more than once" Men of Mathematics [amazon.com] (June 1937) by Eric Temple Bell. My point being, either book is known to those in the "know," but is rare to be known by budding scientists (high school students) and younger folk. I won't let this branch off into a rant on poor public schooling and child upbringing. I only want to see objective viewpoints of why the book may be helpful to me, not why it made you change your perspective on life! FSCK!
[/rant]
Thank you for being patient.

Re:NOT about compiler code generators (Score:4, Insightful)

by kmo ( 203708 ) writes: on Thursday September 04, 2003 @02:03PM (#6870969)

Maybe I'm being too fussy about this, but a code generator, traditionally has always meant a part of the compiler back-end which actually translates intermediate code to machine-level instructions.
I think you're being too fussy. Anything that generates code can reasonably be called a code generator. You are just most familiar with the backends of compilers.

Code generation can allow a developer to compensate for missing abstractions in the underlying language or architecture. For example, it's almost trivial to write a generator for a Java type-safe enum class [sun.com] with a couple of pages of Java tied to a Velocity [apache.org] template for an enum. You just have to be sure that your build process regenerates the class if your input changes. The input could be something as simple as
color {red, green, blue}

It's then easy to add features to your enum infrastructure that aren't in Bloch's class and have them show up in all the project enums (like correct serialization and localization).
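To make the enum generator idea concrete, here is a minimal sketch written in Python rather than the Java-plus-Velocity setup described above. It reads a spec in the `color {red, green, blue}` form and emits a Java type-safe enum class in the pre-Java-5 idiom. The function names `parse_enum_spec` and `render_java_enum`, and the exact shape of the emitted class, are assumptions for illustration only, not the poster's code.

```python
import re


def parse_enum_spec(spec):
    """Parse a spec such as 'color {red, green, blue}' into (name, values)."""
    match = re.fullmatch(r"\s*(\w+)\s*\{([^}]*)\}\s*", spec)
    if match is None:
        raise ValueError(f"not an enum spec: {spec!r}")
    name = match.group(1)
    values = [v.strip() for v in match.group(2).split(",") if v.strip()]
    return name, values


def render_java_enum(name, values):
    """Emit Java source for a type-safe enum class with one constant per value."""
    cls = name[0].upper() + name[1:]
    lines = [
        "// GENERATED FILE: edit the enum spec, not this class.",
        f"public final class {cls} {{",
        "    private final String name;",
        f"    private {cls}(String name) {{ this.name = name; }}",
        "    public String toString() { return name; }",
        "",
    ]
    for value in values:
        lines.append(
            f"    public static final {cls} {value.upper()} = new {cls}(\"{value}\");"
        )
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_java_enum(*parse_enum_spec("color {red, green, blue}")))
```

Adding a feature to every enum in a project (serialization support, say) would then mean editing `render_java_enum` once and regenerating, which is the point being made above.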

The GUI builders that spit out code fragments are just the tip of the iceberg when it comes to code generators. Imagine a utility that could generate the plumbing to make EJB or RMI methods directly accessible from a Windows DCOM client, without it knowing or caring that it's talking to Java on the other end.

Or automatically generating classes for Customer, Order, and Product from an existing database table, and seamlessly loading, caching and saving their instance data without the application knowing or caring that it is persistent.

Automating this sort of tedious programming is what code generation is all about.

Re:NOT about compiler code generators (Score:3, Insightful)

by Eponymous Coward ( 6097 ) writes: on Thursday September 04, 2003 @02:08PM (#6871027)

Don't get your panties in a bunch! :)
I remember when Microsoft was launching all of their visual programming products. Visual programming purists complained (correctly) that these products had nothing to do with visual programming -- they were just IDEs that included visual form designers.

The more obscure definition is going to lose. But guess what -- it doesn't really matter.

Like Juliet said, "What's in a name? That which we call a rose/By any other word would smell as sweet."

Not FUD, but not correct, either. (Score:5, Insightful)

by tmoertel ( 38456 ) writes: on Thursday September 04, 2003 @02:22PM (#6871157) Homepage Journal
mcc wrote:

Code Generation is for people who don't understand or are too lazy for abstraction ...

Baloney.
Code generation is a practical, efficient tool for solving many problems where OO-style abstraction need not enter the picture. One such class of problems is building interfaces and glue code from external specifications.
A few years ago, I wrote a simple code generator that reads the SQL DDL for a large database and generates an object-based interface to the database. Client coders could then use the object-based interface to access the database. The advantages of this approach proved to be numerous:
- Single, authoritative reference specification. The object interface was always in sync with the reference, which for this project was the database schema.
- Richer compile-time error detection. The projection of the schema into the object interface was fully available to the type system so that many kinds of client errors could be caught at compile time, not run time.
- Reduced opportunity for errors between subsystem boundaries. Because the object-based interface was generated by machine from the actual database -- and not derived from some programmer's understanding of the database -- there were fewer opportunities for impedance mismatch across the boundaries of the application code and the database. (Studies of errors in complex projects have shown that errors are more common between subsystem boundaries, and so this benefit is important.)
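The schema-driven approach in the list above can be sketched roughly as follows. This is a toy illustration in Python, not the generator described in the comment: it handles a single simple `CREATE TABLE` statement, the type mapping in `SQL_TO_JAVA` is an assumed and deliberately tiny one, and the names `parse_create_table` and `render_java_class` are hypothetical.

```python
import re

# Assumed, deliberately small mapping from SQL column types to Java types.
SQL_TO_JAVA = {
    "INTEGER": "int",
    "INT": "int",
    "VARCHAR": "String",
    "TEXT": "String",
    "DATE": "java.sql.Date",
}

# Leading keywords that mark table-level constraints rather than columns.
CONSTRAINT_KEYWORDS = ("PRIMARY", "FOREIGN", "UNIQUE", "CONSTRAINT")


def parse_create_table(ddl):
    """Extract (table, [(column, sql_type), ...]) from one simple CREATE TABLE."""
    match = re.search(r"CREATE\s+TABLE\s+(\w+)\s*\((.*)\)", ddl, re.I | re.S)
    if match is None:
        raise ValueError("no CREATE TABLE statement found")
    columns = []
    for part in match.group(2).split(","):
        tokens = part.split()
        if len(tokens) < 2 or tokens[0].upper() in CONSTRAINT_KEYWORDS:
            continue  # not a column definition
        sql_type = re.match(r"\w+", tokens[1]).group(0).upper()
        columns.append((tokens[0], sql_type))
    return match.group(1), columns


def render_java_class(table, columns):
    """Emit a Java class with one typed field and getter per column."""
    cls = table[0].upper() + table[1:]
    lines = [
        "// GENERATED from the database schema: do not edit by hand.",
        f"public class {cls} {{",
    ]
    for name, sql_type in columns:
        java_type = SQL_TO_JAVA[sql_type]  # unknown types fail at generation time
        getter = "get" + name[0].upper() + name[1:]
        lines.append(f"    private {java_type} {name};")
        lines.append(f"    public {java_type} {getter}() {{ return {name}; }}")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Because each column becomes a typed getter, a client that misspells a column name or misuses its type is rejected by the Java compiler, which is the compile-time benefit the list describes.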
mcc further states:

But I cannot think of any case in an object-oriented language where it would be both less work and more maintainable to write a code generator than to just abstract away the parts that would be auto-generated.

If you can't think of any such cases, it's because you're thinking too small. Look at the bigger picture. For starters:
- When the number of variables affecting the desired code characteristics is large enough to make hand-coding (at any level of abstraction) impractical. E.g., FFTW [fftw.org]: "FFTW uses a code generator to produce highly-optimized routines for computing small transforms."
- When your code must conform to an external reference specification that changes rapidly enough to make hand coding (at any level of abstraction) impractical. (See my example above.)
- When the requirement for correctness is so stringent as to make hand-coding methods impractical, mandating code generation from a formal specification.
- When you must target an output language whose native abstraction capabilities are too crude to capture directly the degree of abstraction that is merited. Believe it or not, most popular OO languages fall into this category for many commonly occurring problems. Hence the popularity of design patterns. (Compare, e.g., with the abstraction capabilities of modern functional programming languages like Haskell [haskell.org] and O'Caml [ocaml.org].)
Make no mistake about it, code generation is a practical, effective tool that every programmer should understand. To dismiss it out of hand is a costly mistake.

Re:Ruby not Java (Score:3, Insightful)

by spRed ( 28066 ) writes: on Thursday September 04, 2003 @02:35PM (#6871295)

PHP has nothing to do with text processing. It is highly specialized as a page-based web language.
That makes it great for small dynamic sites (which it is frequently and effectively used for) and crap for everything else.

Re:Software Engineering is not just there yet (Score:3, Insightful)

by amightywind ( 691887 ) writes: on Thursday September 04, 2003 @02:35PM (#6871296) Journal

Amen to LISP/Scheme macros. Java and C++ have reimplemented so many other Lisp ideas you wonder why attention never turned to the preprocessor. After 15+ years of C++/Java and OO we would still all be better off programming in Lisp.

Re:NOT about compiler code generators (Score:3, Insightful)

by p3d0 ( 42270 ) writes: on Thursday September 04, 2003 @02:39PM (#6871335)

I do compiler work too, and I think you need to relax a bit. The term "code generator" means "some device that generates code". Just because you misunderstood it at first (as I did) doesn't mean it's wrong.

Yes: it's about covering weaknesses (Score:3, Insightful)

by Anonymous Brave Guy ( 457657 ) writes: on Thursday September 04, 2003 @03:22PM (#6871802)

Code generation can allow a developer to compensate for missing abstractions in the underlying language or architecture.

Thank you; that was the most insightful comment I've seen here all day.

Code generation, like design patterns and such other trendy things, is just a technique you can use with weaker languages or designs to gain some of the power of stronger ones, if you don't have the option to use something more expressive directly. As such, it merits serious consideration as a tool in the toolbox, but if you find yourself writing generator code too often, you're probably using an underpowered and/or overcluttered language to start with.

The type-safe enum idiom in Java, which several people on this thread have mentioned, is a great example. In a language with native support for enumerated constants, such as C, the generator and idiom would be unnecessary; you'd simply write the code. In turn, in a language with stronger support for disjunctive types and pattern matching, a lot of the hackery you see with enums and switch in C is also unnecessary. But some people have to use Java and would find enums useful, and some people have to use C and would find pattern matching useful, and for these people, code generation can be a way to simulate the real thing acceptably well.

Re:Code generation == metaprogramming (Score:4, Insightful)

by Usquebaugh ( 230216 ) writes: on Thursday September 04, 2003 @04:44PM (#6872839)

It's a shame that the humor of your post will be lost on the majority of /. readers.