Java Books Media Programming Book Reviews IT Technology

Code Generation in Action

Simon P. Chappell writes "Now, I enjoy a good technical book more than the next geek, but it's been quite a while since one left me quite so excited with the possibilities that it presented. Code Generation in Action is beyond interesting, it is a masterful tome on its subject matter, written by one who is obviously an experienced practitioner in his craft." If "code generation" isn't a familiar term to you, this enthusiastic overview on devx.com is a concise introduction to what code generation is about, though it makes no pretense of ambivalence about its importance as a programming tool. Read on for the rest of Chappell's review.
Code Generation in Action
author: Jack Herrington
pages: 342 (10 page index)
publisher: Manning
rating: 9
reviewer: Simon P. Chappell
ISBN: 1930110979
summary: A masterful tome.

Overview

Code Generation in Action, CGiA to its friends, is presented in two parts. The first part is four chapters long and covers a code generation case study and the basic principles of code generation, including the different code generation strategies and the reasons why you would or would not use each one. The book's chosen toolset for building generators is presented next, and some walk-through examples of building simple generators wrap up the first part.

The second part is a cross between a cookbook and a list of engineering solutions. There are nine chapters, and the breadth of solutions is quite impressive: they run the gamut from user interfaces and documentation to unit tests and data access code. Each chapter presents a couple of solutions within its topic area, often for different technologies within that topic. For example, the user interface chapter covers the generation of JavaServer Pages, Swing dialog boxes and then Microsoft MFC dialog boxes. No favouritism here!

What's To Like

There's a lot to like with this book. The prose is clear and well written. I found the introduction to be very compelling, and I felt completely drawn in by the opening case study. The four chapters of part one make a concise case for code generation, and would be very useful information to help persuade co-workers and management of the favourable risk/benefit ratio of trying code generation on a live project.

It would be impossible to try enough of any solution from part two in a time-frame short enough to make this review useful, but in the solutions that match my areas of knowledge, I found myself admiring Herrington's straight-forward and pragmatic approach.

What's To Consider

There are two aspects of this book that I want to flag. One of these aspects, some will love and others will hate, and that is the choice of generator language for CGiA. The author has chosen to use Ruby as his working language. This is an interesting choice. Ruby is certainly a language that is inspiring a lot of admiration these days (in fact, it's hard to get Dave Thomas to stop talking about it :-), but with the majority of the code-generation examples being for Java-related technologies, I wonder why Java was not selected instead.

I also found myself wondering about the lack of discussion of how to integrate these Ruby tools into a typical Java build process. Many developers I know use Ant to bring automation and consistency to their builds, yet the book doesn't mention this. (JRuby anyone?) Certainly something to consider for the second edition or future code-generation authors.

Summary

This is a masterful tome that inspires and delights, although the two issues raised above did cost it a perfect score of ten.

Table Of Contents

  1. Code generation fundamentals
    1. Overview
    2. Code generation basics
    3. Code generation tools
    4. Building simple generators
  2. Code generation solutions
    1. Generating user interfaces
    2. Generating documentation
    3. Generating unit tests
    4. Embedding SQL with generators
    5. Handling data
    6. Creating database access generators
    7. Generating web services layers
    8. Generating business logic
    9. More generator ideas


You can purchase Code Generation in Action from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.


  • by gbvb ( 304328 ) on Thursday September 04, 2003 @12:55PM (#6870241)
    If you ever read the "Pragmatic Programmer" book about software developer practices, it mentions that the authors used code generation for many things. Code generation definitely helps when you have many similar but not identical implementations. But in Java or any other object-oriented language, inheritance can be used to avoid similar-looking code (which is what code generation would produce).
    Even so, I think generators for UI pages, or template-based generators, definitely help. There was a tool by DevelopMentor that used to generate code for ATL. It was based on templates.
  • Ruby not Java (Score:5, Informative)

    by Colonel Panic ( 15235 ) on Thursday September 04, 2003 @01:07PM (#6870363)
    The author has chosen to use Ruby as his working language. This is an interesting choice. Ruby is certainly a language that is inspiring a lot of admiration these days..., but with the majority of the code-generation examples being for Java-related technologies, I wonder why Java was not selected instead.

    I think the author makes it pretty clear why he chose Ruby instead of Java. Essentially, in order to parse text (which is one of the primary functions in code generators) you would have to write 2 to 3X more code in Java than you would in Ruby. Java is not an optimal text parsing language - first off you have to find a regex engine for it. That leaves you with choosing one of the scripting languages: Ruby, Perl or Python.

    Here's what the author says about the cons of using Java for code generation (page 41):

    * Java is not ideal for text parsing. (I would agree)
    * Strong typing is not ideal for text processing application (again, I would tend to agree, strong typing only gets in your way)
    * The implementation overhead is large for small generators. (you'll be writing a lot more java code than you would in Ruby to get the same thing done)

    Overall, I'm finding it to be a great book, and the use of Ruby for implementing the examples is a plus as far as I'm concerned.
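
As a rough illustration of the "parse text, then emit code" loop this comment describes, here is a minimal sketch in Python, standing in for the scripting languages mentioned. The one-line-per-field spec format and the names `SPEC` and `generate_java` are invented for this example; they are not from the book.

```python
import re

# Hypothetical spec format, invented for this sketch: one "class" line,
# then one "field <type> <name>" line per attribute.
SPEC = """
class Book
field String title
field int pages
"""

CLASS_RE = re.compile(r"^class\s+(\w+)$")
FIELD_RE = re.compile(r"^field\s+(\w+)\s+(\w+)$")


def generate_java(spec):
    """Parse the spec with regular expressions and return Java source text."""
    class_name = None
    fields = []
    for raw in spec.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = CLASS_RE.match(line)
        if match:
            class_name = match.group(1)
            continue
        match = FIELD_RE.match(line)
        if match:
            fields.append((match.group(1), match.group(2)))
            continue
        raise ValueError(f"unrecognised spec line: {line!r}")
    if class_name is None:
        raise ValueError("spec has no class line")

    out = [f"public class {class_name} {{"]
    # Private members first, then one getter per field.
    for ftype, fname in fields:
        out.append(f"    private {ftype} {fname};")
    for ftype, fname in fields:
        cap = fname[0].upper() + fname[1:]
        out.append("")
        out.append(f"    public {ftype} get{cap}() {{ return {fname}; }}")
    out.append("}")
    return "\n".join(out) + "\n"
```

Calling `print(generate_java(SPEC))` writes the generated class to standard output; the parsing half is two regular expressions, which is the part the comment argues is cheaper to write in a scripting language.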

    As far as integrating Ruby into the build process goes, I recall hearing something about a project that uses Ruby to drive Ant.
  • by sircrown ( 82531 ) on Thursday September 04, 2003 @01:12PM (#6870409) Homepage
    You might want to check out XDoclet [sourceforge.net] next time rather than write your own parser/generator. It's pretty widely used now and has lots of tags for many common uses like EJBs, Hibernate, web.xml generation for all the major appservers, etc. It's also integrated with Ant and it looks like Sun is going to be borrowing some ideas for use in Java 1.5+.
  • by brooks_talley ( 86840 ) <brooks@@@frnk...com> on Thursday September 04, 2003 @01:15PM (#6870437) Journal
    ...it's really important to use them properly. "No pretense of ambivalence," indeed! What, are reference materials supposed to be ambivalent?

    From http://www.hyperdictionary.com/dictionary/ambivalent :

    1. [adj] uncertain or unable to decide about what course to follow; "was ambivalent about having children"
    2. [adj] characterized by a mixture of opposite feelings or attitudes; "she felt ambivalent about his proposal"; "an ambivalent position on rent control"

    Is that really what anyone would expect from a reference material? I think the poster wanted "...no pretense of objectivity."

    Not that I care that much, of course.

    Cheers
    -b
  • by johnnyb ( 4816 ) <jonathan@bartlettpublishing.com> on Thursday September 04, 2003 @01:18PM (#6870464) Homepage
    Actually, it's very much there. It's not a _replacement_ for development, but there are many parts of coding that benefit from code generation. I often write tools to write code for me. Once I had to write a color-picker in HTML (VERY repetitive code), so I wrote a code-generator in Emacs Lisp and it saved me several hours. Several implementations of this concept exist:

    * Templating (see Alexandrescu's book on Modern C++ design)

    * Macros for those in the Scheme/Lisp world (these are GREAT and AWESOME)

    * Compile-time programming (only available in LISP as far as I'm aware through the eval-when construct)

    * Custom program-generators

    And then there's the related concept of partial evaluation that, while excellent, has received very little attention by the commercial sector.

    Now, many code-generation facilities could be done better with good libraries, but this isn't universally the case. Delphi is probably the best at putting in libraries/properties what others put in code generators, and their software is much easier and better because of it.

    Macros and compile-time programming are two of the best ways to do this, but Scheme and LISP are the only ones that do this reasonably.
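
The color-picker anecdote in this comment is easy to picture with a small custom program-generator. The sketch below is in Python rather than the poster's Emacs Lisp, and the choice of the 216-color "web-safe" palette is an assumption made for the example; the original picker's layout is not described.

```python
# Six intensity levels per channel give the 216-color "web-safe" palette.
# Using this palette is an assumption of the sketch.
LEVELS = ("00", "33", "66", "99", "CC", "FF")


def generate_color_picker():
    """Return an HTML table with one cell per palette color."""
    rows = []
    for r in LEVELS:
        for g in LEVELS:
            # One row per (red, green) pair, one cell per blue level.
            cells = "".join(
                f'<td style="background:#{r}{g}{b}" title="#{r}{g}{b}">&nbsp;</td>'
                for b in LEVELS
            )
            rows.append(f"  <tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>\n"
```

A dozen lines of generator stand in for 216 hand-typed table cells, and changing the cell markup means editing one template string.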
  • by SWPadnos ( 191329 ) on Thursday September 04, 2003 @01:34PM (#6870626)
    For those who are saying that the term "Code Generator" isn't applicable - it is. Consider a C++ compiler. It may generate asm code, which then gets converted into machine code.

    (generic) C++ -> (specific) asm -> executable bits

    (obviously, the C/C++ compiler doesn't need to generate asm, but it's still code generation if it does)

    Code generators just take this a level higher, so the code "life cycle" looks like this:

    (generic) Diagram / CG description -> (specific implementation) C++ -> (specific machine) asm -> machine code.

    Code generators have a great potential for easing coding and documentation. Just like GCC has many backends to generate code for different processor architectures, the code compilers can have different backends to make source code in different languages (C, C++, Fortran, whatever). Even better, you can run a different translator and get documentation out of the "source" - in HTML, DocBook, XML, or any other format you want.
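
The "one description, several backends" idea can be sketched as follows. This is a minimal illustration in Python, not how any of the tools named in this thread work; the `MODEL` dictionary and the functions `emit_c` and `emit_html` are hypothetical names made up for the example.

```python
from html import escape

# Hypothetical in-memory model: a record name, a summary, and
# (field name, C type, description) triples.
MODEL = {
    "name": "sensor_reading",
    "doc": "One sample from a sensor.",
    "fields": [
        ("timestamp", "unsigned long", "Seconds since boot"),
        ("value", "double", "Measured value"),
    ],
}


def emit_c(model):
    """Backend 1: render the model as a C struct declaration."""
    lines = [f"/* {model['doc']} */", f"struct {model['name']} {{"]
    for name, ctype, desc in model["fields"]:
        lines.append(f"    {ctype} {name}; /* {desc} */")
    lines.append("};")
    return "\n".join(lines) + "\n"


def emit_html(model):
    """Backend 2: render the same model as an HTML reference table."""
    lines = [f"<h2>{escape(model['name'])}</h2>",
             f"<p>{escape(model['doc'])}</p>",
             "<table>"]
    for name, ctype, desc in model["fields"]:
        lines.append(
            f"  <tr><td>{escape(name)}</td><td>{escape(ctype)}</td>"
            f"<td>{escape(desc)}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines) + "\n"
```

Both outputs come from the same `MODEL`, so the struct and its documentation cannot drift apart; adding a third backend is one more function.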

    There are tools to let you make UML diagrams (Google for "Executable UML" for great goodness), and generate real-time C code for an application, a C++ app simulator that runs on a PC, and documentation for the system, all from the same diagram. The tools are expensive (like $15k-$30k), but for large projects, they can be a great savings.

    I saw a program called BridgePoint (from Project Technologies [projtech.com]), which was able to generate embedded, real-time code that was as efficient (more in some areas, less in others, but it averaged out the same) as hand-optimized code done by expert programmers. It all depends on how good your translator is (and this program lets you write your own).

    Some books on the subject:
    "Executable UML: A Foundation for Model-Driven Architecture", by Stephen J. Mellor and Marc J. Balcer
    "Executable UML: The Models are the Code", by Leon Starr
    "Real-Time UML: Developing Efficient Objects for Embedded Systems, Second Edition", by Bruce Powel Douglass
  • by Raffaello ( 230287 ) on Thursday September 04, 2003 @01:46PM (#6870769)
    Yes, of course you can do this in lisp.

    The point your parent poster was making, is that a lisp program has the full power of the language available at macro expansion time, just prior to compilation. This means you can redefine the syntax of the language at will to create any language you like on top of lisp. Lisp macros should not be confused with c-style macros, which are merely token substitutions, not redefinitions of language grammar.

    You only have this in perl in a very crude and hackish sense. You have to write your own parser; you have to write your own code generator; you have to run every piece of code through your home-brew parser/code-generator before you send it to the perl interpreter. Debugging? Heaven help you if any of:
    1. Your code written in your new-language-built-on-perl has errors
    2. Your parser has errors.
    3. Your code generator has errors.

    In Lisp, the macro facilities come for free, are part of the standard (so macros are portable). Vendors are responsible for correctness, so debugging is a simple matter of using the built in functions macroexpand and macroexpand-1.

    Saying that Perl has the same sort of code generation capabilities as Lisp is rather like saying that, since all languages are Turing equivalent, assembly language has the same macro capabilities as lisp.

    The power of a language comes from its expressiveness, the things it lets you do easily without having to resort to the 21st century equivalents of a Turing machine. With Perl, you only get this level of expressiveness by using convoluted, error prone, home-brew substitutes for real macros.
  • Re:Ruby not Java (Score:2, Informative)

    by aziegler ( 201013 ) <`halostatue' `at' `gmail.com'> on Thursday September 04, 2003 @01:54PM (#6870860) Homepage
    Earlier this year, I helped review this book during the publication process. At one point, the question was raised whether Ruby was the "ideal" language for this. In my opinion, the answer is "absolutely yes." Ruby -- and Python, if you can get past its syntactic oddities that I can't get past -- is "executable pseudo-code."

    -austin
  • by evilpenguin ( 18720 ) on Thursday September 04, 2003 @02:16PM (#6871105)
    Metaprogramming can be a useful and time-saving technique. It can also really mess up maintenance and future refactoring of a project. The time saved one developer isn't a good measure of the utility of a technique.

    When I first learned lex and yacc, I got tempted to turn every single useful C/C++ library into a scripting language. "Think how much time I could save!" I thought.

    Well, while I still think developing an application-specific language (to basically make pseudocode functional) is an occasionally useful technique, what I found was that it usually made project transfer and maintenance more difficult and more expensive.

    Using XML and XSLT to do the same thing as lex and yacc doesn't inherently add much. The exception would be the evolution of an industry standard DTD for, for example, common UI constructs. I can see value there. But rolling your own metaprogrammer strikes me as rarely of real benefit. The metaprogram becomes another thing that must be documented, explained, maintained, and transitioned. It adds something that may not be easy to integrate into a present or future automated build process.

    I guess I'm coming down firmly on both sides here. My point is that the cost/benefit analysis for a single developer doesn't necessarily align well with the cost/benefit analysis for the project as a whole. I think we have all seen projects that are in "tool and library hell" where developers have included their favorite libraries and tools willy-nilly (a technical term I like very much -- so concrete, so precise) so that no one can actually get the project to build. The GnuCash project was like that for the longest time (and it is still a bit messy if you ask me).

    Faster isn't always better or cheaper.

    In other words, I have seen metaprogramming do more harm than good in my experience. And the few successes come when the metaprogrammed portion was well analyzed and understood, and a standard could be made that would apply to an entire enterprise and not merely to a single project. More often than not, the inclusion of metaprogramming became the first reason to rewrite an application -- no one wanted to figure out or maintain the metaprogram. So they chucked it.
  • Re:Am I FUD? (Score:3, Informative)

    by ketan ( 3574 ) on Thursday September 04, 2003 @05:09PM (#6873190) Homepage
    Code Generation is for people who don't understand or are too lazy for abstraction, and it will ALWAYS have the problem of, what if you want to go through all your projects and change one single thing about the generated part of your code?

    I take it you've never used a compiled language? In an abstract sense, that is a code generation. Actually, so are interpreted languages: you give them a high level expression and it turns it into executable code that it runs for you instead. Have you ever set up a build environment to embed a build number and timestamp into the executable? That is code generation. What about a template language? What about something like JSP, which generates Java code from templates? Code generation is everywhere. You use it implicitly. Anything that's not machine code could be considered a code generation template language. Hell, at this point, even that's not true. With both AMD and Intel decomposing x86 instructions to internal RISC-like sub-languages, x86 assembler can be viewed as a template for the decode stages of CPU to generate micro/macro-ops code for the backend. These just happen under your radar. Don't dismiss a universal and useful tool simply because you've seen it done badly. Most of the time it works so well you don't even know it's there.
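
The build-number example mentioned in this comment is probably the smallest code generator most projects already have. Here is a hedged sketch in Python that writes a C header; the header name, the macro names and the function `generate_build_header` are assumptions for illustration, not taken from any particular build system.

```python
from datetime import datetime, timezone


def generate_build_header(build_number, built_at):
    """Return the text of a C header carrying a build number and UTC timestamp."""
    stamp = built_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        "/* Generated at build time: do not edit by hand. */\n"
        "#ifndef BUILD_INFO_H\n"
        "#define BUILD_INFO_H\n"
        f"#define BUILD_NUMBER {int(build_number)}\n"
        f'#define BUILD_TIMESTAMP "{stamp}"\n'
        "#endif\n"
    )
```

A build script would call this with the current build counter and `datetime.now(timezone.utc)`, write the result to a file such as `build_info.h`, and compile as usual. Taking the timestamp as a parameter keeps the generator itself deterministic.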

    Your example with EJB reflects a half-baked design for EJBs. Sun is working to fix that with metadata in Java 1.5, but until then, code generation tools are too useful to ignore. Besides, the only difference is how explicit the code generation step is; it's going to be there regardless.

  • by heironymouscoward ( 683461 ) <[moc.oohay] [ta] [drawocsuomynorieh]> on Thursday September 04, 2003 @05:29PM (#6873397) Journal
    Is GSL (aka GSLgen), part of the RealiBase OSS toolset from iMatix [imatix.com].

    Yes, I'm biased, I use it extensively. Extensively.

    Write your metamodel in XML, build code generation scripts, generate anything from interfaces to database layers to entire applications.

    I took some of the examples from CGIA (which is an excellent book, I read it and I like it and I recommend it heartily) and converted them to GSL - simpler, clearer, more obvious.

    If you are a professional programmer you need code generation. This is simply a basic technique, like editing text with a visual editor and not edlin. And of all the code generation tools out there, GSL is by far the most flexible and powerful, mainly because it was designed from the ground up, and has been used and evolved over about 10 years specifically as a code generation tool (unlike XSLT which does the job but with more weight and less elegance).

    In my journal, I include a GSL script that generates a complete C interface layer for MySQL, turning a simple description like this:

    <table name = "history" description = "Message History" >
    Holds all messages received and sent. The command and body are parsed
    from the smstext.
    <field name = "id" domain = "recordid" >Record id</field>
    <field name = "groupid" domain = "id" >Parent group</field>
    <field name = "userid" domain = "id" >Parent user to/from</field>
    <field name = "incoming" domain = "boolean" >Incoming message?</field>
    <field name = "appl" domain = "msisdn" >Application MSISDN</field>
    <field name = "text" domain = "smstext" >Message text contents</field>
    <field domain = "audit" />
    </table>

    Into a complete abstract interface.
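
To give a feel for the step from that XML description to code, here is a minimal sketch. It is written in Python with the standard `xml.etree.ElementTree` module and is not GSL; the mapping from domains to C declarations in `C_TYPES` is hypothetical, since the real domain definitions are not shown in the comment, and the unnamed `audit` field is left unexpanded for the same reason.

```python
import xml.etree.ElementTree as ET

TABLE_XML = """
<table name = "history" description = "Message History" >
Holds all messages received and sent. The command and body are parsed
from the smstext.
<field name = "id" domain = "recordid" >Record id</field>
<field name = "groupid" domain = "id" >Parent group</field>
<field name = "userid" domain = "id" >Parent user to/from</field>
<field name = "incoming" domain = "boolean" >Incoming message?</field>
<field name = "appl" domain = "msisdn" >Application MSISDN</field>
<field name = "text" domain = "smstext" >Message text contents</field>
<field domain = "audit" />
</table>
"""

# Hypothetical domain -> C declaration templates (assumed sizes and types).
C_TYPES = {
    "recordid": "unsigned long {name}",
    "id": "unsigned long {name}",
    "boolean": "int {name}",
    "msisdn": "char {name}[21]",
    "smstext": "char {name}[161]",
}


def generate_struct(xml_text):
    """Turn one <table> description into a C typedef struct."""
    table = ET.fromstring(xml_text.strip())
    name = table.get("name")
    lines = [f"/* {table.get('description')} */", "typedef struct {"]
    for field in table.findall("field"):
        fname = field.get("name")
        domain = field.get("domain")
        if fname is None or domain not in C_TYPES:
            # Composite or unknown domains are only noted, not expanded.
            lines.append(f"    /* domain '{domain}' not expanded in this sketch */")
            continue
        decl = C_TYPES[domain].format(name=fname)
        comment = (field.text or "").strip()
        lines.append(f"    {decl}; /* {comment} */")
    lines.append(f"}} {name}_t;")
    return "\n".join(lines) + "\n"
```

`print(generate_struct(TABLE_XML))` emits a commented struct for the `history` table. A fuller generator would add the access functions on top of the same parsed model.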

    Whatever: code generation is a cult technique that deserves a place at the center of every serious developer's toolbox, and this book is possibly the first one that I've seen that may achieve this.

    Enjoy.

  • by Clifford.H ( 532557 ) on Friday September 05, 2003 @04:30AM (#6877384)
    If the CG is well written, the generator and the resultant code are more functional, perform as well or better, and are much more maintainable.

    An example we're using at present takes an XML description of a database schema (our main database is described in 1200 lines without the embedded documentation, 58 tables). It generates perfectly-formatted, commented, readable, and eminently predictable code in five different programming languages, totalling 40,000 lines, as well as high-quality printed documentation. In fact the code is indistinguishable from code produced by a human, except that it lacks human quirks and inconsistencies.

    The generator itself is around 4000 lines of Ruby including all the templates, and has been significantly extended by at least six developers, each of whom found it took them less than thirty minutes to learn enough Ruby to read, understand and extend the generator. The generator itself isn't a simple template expander; it understands and analyses the structure of the database, taking only a few hints to decide exactly which operations are required on each table, relationship and index.

    The generated code includes:

    • SQL DDL (so far only for 1 database product, but additional ones with ease),
    • SQL stored procedures for a range of data access operations including all the bulk operations needed for performance (things like efficient subset-replace from a temporary table, etc),
    • Beans-style C# objects, collections, and nicely segmented access (CRUD) APIs,
    • Composite data type conversion code for the same tables in C++,
    • Business logic base classes
    • Consistent naming, logging and error management across all generated code.
    A further generator in C# augments the generated (and manually subclassed) APIs to produce (sorry, no line count):
    • WSDL interface definitions for all these APIs,
    • Web Service implementations with location and version brokering,
    • Web service client proxy APIs supporting location transparency (including local linkage),
    I think you'll agree that to generate all this from such a small description file adds significant value over hand-crafting the same. And that's before we start generating user interface, performance predictions and analytical models, test code and data, etc.

    The generator is integrated into our build environment so that the resultant code never gets checked-in (so is not susceptible to modification) and yet it's always available for debugging. We can subclass all the objects where hand-tweaking and additional methods may be necessary.

    Because it does the grunt work of handling all the bulk operations on the database, those higher-performance features are more likely to get used - developers wouldn't always bother. In these cases, the performance will exceed that of hand-written code.

    Since introducing the generator, we almost instantly found other applications for it (other databases), and also have become much more able to make changes to the schema during the project as needed. Adding a field is a one-line change in one file followed by a simple recompile, not a series of dozens of changes spread across many files. Adding an access method is as simple as adding the index which supports it - the retrieval APIs are implied by the existence of the index.
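
The point that retrieval APIs are implied by the indexes can be sketched like this. The example is in Python and is only an illustration of the idea, not the poster's Ruby generator; the `SCHEMA` layout, the finder naming scheme and `generate_finders` are all invented here.

```python
# Hypothetical schema description: one table, its columns, and its indexes.
SCHEMA = {
    "table": "customer",
    "fields": ["id", "email", "last_name", "city"],
    "indexes": [["id"], ["email"], ["last_name", "city"]],
}


def generate_finders(schema):
    """Return {finder_name: parameterised SQL}, one entry per index."""
    table = schema["table"]
    columns = ", ".join(schema["fields"])
    finders = {}
    for index in schema["indexes"]:
        unknown = [col for col in index if col not in schema["fields"]]
        if unknown:
            raise ValueError(f"index uses unknown columns: {unknown}")
        name = f"find_{table}_by_" + "_and_".join(index)
        where = " AND ".join(f"{col} = ?" for col in index)
        finders[name] = f"SELECT {columns} FROM {table} WHERE {where}"
    return finders
```

Adding a column to `fields` changes every generated query at once, and adding an index creates a new finder with no other edits, which mirrors the one-line changes described above.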

    A couple of times an additional method has been needed, and it's been added to the generator - this method is immediately available for all relevant tables in the database. Again, adding such methods has been as little as a 2-20 line change in the generator.

    The simple fact is our project is literally months ahead of schedule, and future projects will be even more efficient. As far as I'm concerned, you might as well have asked me what Java or C++ could possibly do that assembly code couldn't have done - the comparison matches at almost every point, with the possible exception of performance.
