Java Programming

Java Native Compilation Examined

An Anonymous Coward writes: "DeveloperWorks has an interesting article about compiling Java apps to run as native applications on the target machine without the need for a JVM."
  • by Anonymous Coward
    What's the point of taking a language that jumps through hoops to be "cross-platform" and cutting its legs off?
    • What's the point of taking a language that jumps through hoops to be "cross-platform" and cutting its legs off?


      Huh? You still have the Java source. You can compile it to a native executable for whatever platform you need, or compile it to Java bytecode. Obviously, compiling to a native executable is not applicable for applets served from websites, sending objects over the network, or Remote Method Invocation (RMI). The point of the article is that if you have a large, slow Java application, you can compile it to run natively on a given platform to increase its speed and reduce the disk and memory requirements.

      • by thomas.galvin ( 551471 ) <slashdot AT thomas-galvin DOT com> on Wednesday January 30, 2002 @10:16PM (#2928922) Homepage
        The point of the article is that if you have a large, slow Java application, you can compile it to run natively on a given platform to increase its speed and reduce the disk and memory requirements

        Well, yes and no. There are some very slick VMs out there, and some very lazy compilers, though in most cases you are correct: native code will execute faster than interpreted code.

        The disk requirements, though, can in the long run be larger for native code apps. The VMs have a lot of common tools (i.e. all of the java.* classes) available for any app that needs them. If you plan to compile to native code, you have to link these in. You can save space by only linking in the libraries you need, but you will still end up losing space when you start installing multiple applications that all include the Swing packages.

        I have been a big fan of Java for some time, but the need for a VM held me back from a full-out endorsement for a while... it seemed like I was writing toy code because I needed another program to do anything with it, and I didn't like leaving all of those class files lying around. I have gotten over a lot of this, especially once I learned how to make an executable JAR file, and considering the widespread use of DLLs. Plus, I really like being able to work on my text editor on my home WinXP box, and take it in and use it on our Sun machines.

        Still, I'm downloading one of those compilers right now. Why? Because native Java would just be neat. I probably won't even use it that much, but it will be nice to see that it can be done.

      • There is nothing in theory stopping RMI from working even if the two ends are running two different languages. As long as there is a well-defined API, where each end can understand the parameters sent and received, it makes no difference what the languages are. A good example of this sort of thing working would be CORBA, which allows basically any language to call methods in remote subsystems.
    • by MisterBlister ( 539957 ) on Wednesday January 30, 2002 @09:47PM (#2928811) Homepage
      Not everyone cares too much about binary cross-platform. Many people would be happy just with 100% source cross-platform porting, which doesn't exist with C/C++, etc.

      Further, not everyone even cares about the cross-platform nature of Java to begin with. I've worked on a few projects where the OS requirements were completely fixed but Java was chosen anyway -- for its rapid-design features (built-in garbage collector, nice network access classes, etc) rather than its cross-platform nature.

      All in all, it's good to have a choice. Just because you can native-compile Java doesn't mean you have to do it. And in situations where cross-platform is not needed, why not compile to native and get the extra efficiency? Choice is good.

      It's a shame Sun spreads so much FUD about native-compiled Java.

      • I agree. I think there are still a ton of benefits to Java without it being interpreted.

        It seems to me that most discussions of Java lose track of the fact that the key to the "write once run anywhere" idea is that the Java source code is translated to bytecodes and then executed in a non-ambiguous way. In other words, the language definition doesn't have all the "implementation defined" behavior that C/C++ language definitions have.

        It takes a lot of discipline to write truly portable C/C++ source code. It seems a lot easier to achieve this source code portability with Java. I think having bytecode portability is a big plus in some cases but not very important in others.

        So there are many ways to execute a given set of byte codes: strict interpretation, JIT compilation, native compilation, etc. This flexibility seems pretty good to me.
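
A small Java sketch of what "no implementation-defined behavior" means in practice. The class and method names are made up for illustration; the three behaviors shown are fixed by the Java language definition, whereas the C/C++ equivalents (signed overflow, evaluation order of operands, width of basic types) are left to the implementation or are undefined.

```java
// Illustrative only: three results that are the same on every conforming
// Java implementation, whether interpreted, JIT-compiled, or natively compiled.
class DefinedSemantics {
    // int is always 32-bit two's complement, so overflow wraps around.
    static int overflow() {
        return Integer.MAX_VALUE + 1;
    }

    // Operands are evaluated left to right, so this is always 1 + (2 * 10).
    static int evaluationOrder() {
        int i = 1;
        return i++ + i * 10;
    }

    // char is always an unsigned 16-bit value.
    static int charWidthBits() {
        return Character.SIZE;
    }

    public static void main(String[] args) {
        System.out.println(overflow() == Integer.MIN_VALUE);
        System.out.println(evaluationOrder());
        System.out.println(charWidthBits());
    }
}
```

This is the property that survives native compilation: a compiler that goes straight to machine code still has to produce these same answers.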
      • Java for servers (Score:2, Insightful)

        by Nick Mitchell ( 1011 )
        Java, in the form of EJBs and servlets, is becoming fairly common on "big iron". One reason is indeed the cross-platformability of it. But, frankly, most often a change in platform accompanies a change in framework. For example, a switch from UNIX servers to a mainframe may accompany a switch from CGIs to an application server (such as Websphere [ibm.com]). The dramatic switch here is probably not from UNIX to OS/390 (after all, OS/390 has a POSIX layer), but from CGIs to Websphere. So, as you said, it's not really cross-platformability which is driving Java in the server realm. I think the principal reason is that Java provides (at least some semblance of :) a language-level security model. By security, I don't mean crypto and viruses and external attacks. I mean an application is secure from itself, and its (all too human) programmers.
        • Re:Java for servers (Score:4, Interesting)

          by CrazyLegs ( 257161 ) <crazylegstoo@gmail.com> on Thursday January 31, 2002 @10:33AM (#2930660) Homepage

          I agree with your points, mostly. The company I work for has a TON of function implemented on Websphere running on AIX. We really haven't cared too much about cross-platform features (do a bit of sharing between NT PC apps and Websphere AIX). We were an early adopter of Java because it seemed to address a lot of issues we had in developing C++ and Smalltalk apps (e.g. packaging, garbage collection, change mgmt, etc.).

          However, we are planning to move our Websphere AIX implementation (for EJBs anyways) to an IBM390 box (or Z Series these days I guess) and we've found the proof-of-concept effort easier than we anticipated, simply because we can pick up our code (in JARs) and run it on big iron. Very nice! It also gives us some encouragement that we can really start to share more Java classes/components between browser-delivery and fat-client environments if we're smart about it.

          Anyways, just one experience that's worked out...

    • What's the point of taking a language that jumps through hoops to be "cross-platform" and cutting its legs off?

      It seems like you might not have read the article all the way through. It doesn't recommend only native compilation, but it does make a nice comparison between the two solutions. It points out when you might want to take advantage of native compilation, and when it doesn't help. You are always free to generate byte code in addition.

      Much of the work I do would be nicer and easier in Java than in other languages, but Java is just too slow and large for my purposes. This gives me the chance to use a language that has a little elegance, without giving up the speed of execution that I require. C++ just doesn't cut it, and it's tough trying to write this kind of code in C (but not impossible, it just takes a little discipline).

      Java isn't really cross platform anyway. Where's the JVM for MVS?

  • by Ryu2 ( 89645 ) on Wednesday January 30, 2002 @09:10PM (#2928645) Homepage Journal
    Runtime bounds checking, typecast checking, etc... do those get included in the native executable as well (and if so, then wouldn't the performance hit negate the advantages gained thru native compilation?), and if not, then it could be dangerous.
    • Actually, I believe that runtime bounds checking, typecast checking and so on wouldn't be easy to include in the native executable... as said in the article:
      Diagnostic tools can be somewhat thin on the ground, which makes it potentially more difficult to diagnose problems that occur in natively compiled Java apps (particularly if the error doesn't occur in the Java bytecode version!)
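
For reference, here is a minimal Java sketch of the two checks being asked about. The class and method names are invented for illustration. On a JVM both methods catch the exception shown; whether a given native compiler keeps these checks is a per-compiler (and per-flag) matter, and another comment in this thread notes that GCJ turns array bounds checking on by default.

```java
// Illustrative only: the runtime checks the language requires.
class RuntimeChecks {
    // Reads one element past the end of an array and reports what happened.
    static String outOfBounds() {
        int[] data = new int[4];
        int index = data.length; // one past the last valid index
        try {
            return String.valueOf(data[index]);
        } catch (ArrayIndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Casts a String to Integer through an Object reference.
    static String badCast() {
        Object value = "not a number";
        try {
            Integer n = (Integer) value; // checked at run time, not compile time
            return String.valueOf(n);
        } catch (ClassCastException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(outOfBounds());
        System.out.println(badCast());
    }
}
```

If a native build compiled these checks away, the first method would read stray memory instead of raising an exception, which is the "dangerous" case raised above.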
  • This would solve my greatest beef with Java - that messy JVM which piles into the processor when it compiles bytecode. Of course, the idea of running Java natively seems to defeat the purpose of Java entirely: runs the same on any computer. Interesting idea, though...
    • You simply can't run most Java programs without either a JVM or a bundled compiler. The reason for this is dynamic class loading. You can load classes by arbitrary name, from urls, from byte codes you've created on the fly in memory, etc, etc. These classes cannot be precompiled. So you either need a JVM to interpret them, a compiler to compile them, or both in the form of a jitted JVM.

      Why do people complain about the size of the JVM? Every scripting language has to have the equivalent in the form of an interpreter. The *base* perl installation on Linux is 21MB. That's without docs and a UI toolkit (Java includes both). I don't hear complaints about the size of Perl's interpreter or the heft of the Tk or Gtk toolkit libraries.

      Would people like Java more if it didn't have byte codes? If we had a Java shell like Perl, Tcl and Python? Skip the byte codes and let it be tokenized each and every time? Or is it just the cross-platform talk that rankles everyone? Maybe it's Sun's stupid open/not-open antics. Or perhaps it's just language l33t1sm.
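
A minimal sketch of the dynamic class loading described above, using only standard reflection calls. The wrapper class name and the default class name are arbitrary choices for illustration. The point is that the class to load is a string that may not be known until run time, so an ahead-of-time compiler cannot know in general which classes it will need.

```java
// Illustrative only: load and instantiate a class chosen at run time.
class DynamicLoading {
    // Looks up a class by name and calls its no-argument constructor.
    static Object instantiate(String className) {
        try {
            Class<?> type = Class.forName(className);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // The name could just as well come from a config file or the network.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        Object instance = instantiate(name);
        System.out.println(instance.getClass().getName());
    }
}
```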
  • Not new ... ho hum (Score:5, Insightful)

    by Evil Pete ( 73279 ) on Wednesday January 30, 2002 @09:19PM (#2928684) Homepage
    Let's see, TowerJ has been out since when? 1999. Having tried my hand at this, I have some reservations. The project I was on ... a large dinosaur of a thing which will remain nameless ... had 12,000 classes which TowerJ turned into C files which were then compiled by GCC. Resulted in 50 megabyte executables on a Sun. Didn't really solve the problem, which wasn't really about speed but throughput. The solution was a better design using servlets and Java Webserver ... result DRAMATICALLY faster without any need for native compilation.

    Mind you I noticed in the IBM article that the memory footprint was much smaller. That might be nice.
  • AWT support a must (Score:5, Insightful)

    by Coulson ( 146956 ) on Wednesday January 30, 2002 @09:23PM (#2928695) Homepage
    Most notably, there is very little support for AWT, making GCJ unsuitable for GUI applications.

    That's the real shame of the matter. Java shines most in its ease-of-use for creating complex GUIs; unfortunately that's also where the worst memory/performance problems appear. For instance, Swing is good for client apps if you can ship with minimum RAM requirements of 64+ MB (and even that's cutting it close). Performance is most important in the UI, where the user notices any lack of responsiveness. Hopefully some Java native compilers will help out here.

    Different compilers support differing levels of class library; Excelsior JET is one compiler that claims to completely support AWT and Swing.

    Maybe there's hope yet!
    • by addaon ( 41825 ) <addaon+slashdot@gma i l .com> on Wednesday January 30, 2002 @10:09PM (#2928900)
      Having used JET somewhat extensively, I can say that Swing works beautifully in it... exactly as you'd expect, and at a reasonable speed, to boot. It doesn't make small applications, but it optimizes relatively heavily. There's more than hope: it's already there!
  • Benchmarks (Score:3, Interesting)

    by Andrewkov ( 140579 ) on Wednesday January 30, 2002 @09:23PM (#2928698)
    The program used in the benchmarks was very simplistic. I would rather have seen a program creating and destroying many objects of various sizes. In the natively compiled version, how does garbage collection work? Is it possible to write AWT apps? Swing?
    • The program used in the benchmarks was very simplistic. I would rather have seen a program creating and destroying many objects of various sizes. In the natively compiled version, how does garbage collection work? Is it possible to write AWT apps? Swing?
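
A rough sketch of the kind of allocation-heavy test asked for here. It is not from the article; the class name, sizes, and iteration count are arbitrary assumptions, and a single timed run like this is exactly the sort of microbenchmark that proves little on its own. It only shows the shape of such a test: many short-lived objects of varying sizes, with a checksum so the work is not trivially discarded.

```java
// Illustrative only: churn through short-lived arrays of varying sizes.
class AllocationChurn {
    // Allocates `count` arrays of 16 to 1039 bytes and returns a checksum
    // built from each array's first byte and length.
    static long churn(int count) {
        long checksum = 0;
        for (int i = 0; i < count; i++) {
            byte[] block = new byte[16 + (i % 1024)];
            block[0] = (byte) i;
            checksum += block[0] + block.length;
            // `block` becomes garbage at the end of each iteration.
        }
        return checksum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long checksum = churn(1_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("checksum=" + checksum + " elapsedMs=" + elapsedMs);
    }
}
```

Running the same source under a JVM and as a native executable would compare the two garbage collectors as much as the two code generators.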

      And string processing too! It's a well-known performance handicap for Java. Sun is heading in the right direction, improving it with each version. I wonder how well IBM would perform in this battle? :)
    • Re:Benchmarks (Score:2, Informative)

      by snowman153 ( 555473 )
      Some links to more extensive native compiler benchmarks:

      Osvaldo Pinali Doederlein.
      The Java Performance Report - Part IV: Static Compilers, and More. August, 2001 [javalobby.org]

      The Java Performance Report is an independent study of Java performance, where both virtual machines and static compilers are evaluated. Part IV compares Excelsior JET and two other static compilers with Sun's HotSpot Server VM.

      Volano, LLC.
      The Volano Report. Dec 26, 2001 [volano.com]

      The Volano Report covers VolanoMark(tm) - a widely recognized pure Java server benchmark characterized by long-lasting network connections and high thread counts. This report includes Excelsior JET a native compiler.

      Excelsior JET Benchmarks [excelsior-usa.com]

      From Caffeine and SPECJVM through XML Transformations to Java2Demo and VolanoMark.

  • The big advantage of GCJ isn't speed, it's the vastly better interface to C++ code (CNI vs. JNI). Of course, using that really does make your code non-portable.
  • by Bigger R ( 131370 ) on Wednesday January 30, 2002 @09:28PM (#2928726) Homepage
    Good progress is being made in native compiler for Palm OS. See http://sourceforge.net/projects/jump/

    and also http://www.superwaba.org for info on the related JVM for PDAs that it replaces.

    Good stuff.
  • by adamy ( 78406 ) on Wednesday January 30, 2002 @09:31PM (#2928740) Homepage Journal
    Seems to me the problem with Java is that it waits until memory is full to garbage collect. In C++ code, if you allocate and free a lot of memory, eventually you are going to fragment your memory to the point where you may not be able to allocate large enough memory blocks for your purpose. GC is supposed to get around that. But if C++ doesn't GC, how can a C++ app run indefinitely? I would really appreciate it if someone who understands the subtle nuances could explain. If there are better memory allocation schemes, couldn't you use them until you were forced to use GC?
    • I would really appreciate someone who understands the subtle nuances to explain.

      Bruce Eckel explains some different approaches to GC, pros and cons, etc., especially as it relates to Java. Check out Thinking In Java 2nd Edition, pp. 207-219. You can download it here [mindview.net].
    • Seems to me the problem with Java is that it waits until memory is full to garbage collect.
      Your information is out of date. See the information on the HotSpot VM and its use of generational GC. Essentially, short-lived objects are GC'd frequently.
    • A good source for starting to address all of your garbage collection questions is Jones & Lins' book, Garbage Collection : Algorithms for Automatic Dynamic Memory Management. Another good source is The Memory Management Reference [memorymanagement.org]

  • by Quizme2000 ( 323961 ) on Wednesday January 30, 2002 @09:33PM (#2928747) Homepage Journal
    I was a little disappointed in the JVMs they selected for their example, especially now that PDAs and cell phones are powerful enough to run real applications on the client. I have yet to see a StrongARM device ship with the Sun, IBM, or Kaffe JVM. Why? Because they are slow. The JVMs listed in the subject line run 10-200x faster on devices. Soon we will see EJB and other J2EE components on PDAs. As a developer of client Java applications/applets, I would never distribute a device-specific application; it would be a nightmare to have 160 different compiled versions of the same application. The good JVMs have already done that work, and code I compiled years ago works on the newest PDAs... So there.
  • by Alea ( 122080 ) on Wednesday January 30, 2002 @09:37PM (#2928768)
    I'm not going to try to champion Java, JITs, or native compilation, I'm just going to point out what's wrong with this "study".

    This has to be the third or fourth weak study of Java performance I've seen over several years. Issues such as whether or not all Java features are in place in the native compilations (e.g. array bound checking, but note that GCJ turns this on by default) or what sort of optimizations are performed by the native compiler and JVMs are completely ignored. The author also suggests that compiling from bytecode to native code is the natural route when it's quite possible that gains could be made by compiling directly from source to native. While GCJ accepts either Java source or bytecode, it's not clear from the documentation I've read whether or not it first translates source to bytecode or goes straight from source to native.

    When comparing disk space, the author comments that an installed JVM can be up to 50 MB vs. 3 MB or so for libgcj. This is a ridiculous comparison since those directories probably include all of the tools, examples and libraries, and as far as I know, libgcj doesn't include the whole J2SE API set, so it's nowhere near a fair comparison. It's a pretty limited set of benchmarks for making these judgements too.

    I played around with native compilation of Java a few years ago. At one point (probably around 1996/7?) native compilers could offer noticeable gains over Sun's JVM on a numerically intensive application (neural network stuff). However, after the initial versions of HotSpot and IBM's JIT, I couldn't find a native compiler that gave competitive performance on similar apps. I think this is largely due to underdeveloped native compilers with poor optimization (HotSpot is quite insane in its runtime optimizations).

    Anyway, I sure hope IBM doesn't pay too generously for studies this poor. Its final message is essentially "who knows?" - not terribly useful.

    Alea
    • by dvdeug ( 5033 ) <dvdeug@emailMENCKEN.ro minus author> on Wednesday January 30, 2002 @10:30PM (#2928975)
      While GCJ accepts either Java source or bytecode, it's not clear from the documentation I've read whether or not it first translates source to bytecode or goes straight from source to native.

      It goes straight from source to native; for one thing, the source format exposes stuff that allows it to be more heavily optimized by GCC's optimizer than the bytecode format does.
  • With the renewed interest in BeOS thanks to the leaked 5.1/Dano release, developers for BeOS should seriously consider trying for a native compiler, since the various JVM projects aren't going anywhere fast. If BeOS could run Java apps, I could see this renewed interest continue to grow at amazing rates.
  • "...compiling Java apps to run as native applications on the target machine without the need for a JVM."

    In other news, they have also decided on a name for this wonderful new technology: C.
    • they have also decided on a name for this wonderful new technology: C.
      In other news, computer scientists have developed a way to preserve all the flaws of assembly language without having to remember those pesky 3-letter mnemonics. They have also decided on a name for this wonderful new technology: C.
  • VM: a definition (Score:5, Informative)

    by fm6 ( 162816 ) on Wednesday January 30, 2002 @09:43PM (#2928787) Homepage Journal
    In a previous life, I sat in a corner taking notes while around me, engineers designed Java VMs. The experience didn't make me into a real expert, but it did make one thing clear: there's no such thing as running Java without a VM.

    People think of the VM as an interpreter that executes the bytecodes. That's a particular implementation of a VM. And not a very good one -- which is why no production VM works that way.

    The simplest optimization is to use a JIT [symantec.com]. This gives you native execution speed once the class files are loaded -- but loading is slower, because it includes compiling the byte codes. You can end up wasting a lot of time compiling code you'll only execute once -- most programs spend 90% of their time in 10% of their code. Depending on the application, you can end up wasting more time on unnecessary compilation than you save by running native code.

    Intuition suggests that the most efficient thing to do is to "get rid" of the VM by compiling everything to native code before you distribute your app. But that doesn't get rid of the VM -- it just converts it to a different form. There are some VM features you can't compile away, such as garbage collection. Some experts claim (not me, I get dizzy when I even read benchmarks) that "pure" nativeness is illusory and not that efficient. Plus you lose a lot of the features of the Java platform when you run the program that way. Might as well stick with C++.

    Some VM implementations use a sophisticated compromise between interpreters and JIT compilers [sun.com]. If you can identify the small part of the program that does most of the actual work, you know what parts of the program really need to be compiled. How do you do this? You wait until the program actually starts running!

    Advocates of this approach claim that it has the potential to be faster than C++ and other native-code languages. A traditional optimizing compiler can only make decisions based on general predictions as to how the program will behave at run time. But if you watch the program's behavior, you have specific knowledge of what needs to be optimized.

    Computer science breakthrough, or illogical fantasy? Don't ask me, I'm just a spectator.

    The engineers I picked this stuff up from were very contemptuous of "microbenchmarks" like those described in the developerWorks article. Nothing to do with the real world.
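
A toy model of the "wait until the program is running, then compile only the hot parts" idea described in this comment. It is not how HotSpot or any real VM is implemented; the class name, method names, and the threshold of 1000 calls are all made up for illustration. It only shows the bookkeeping: count invocations, and treat a method as worth compiling once it crosses a threshold.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: invocation counting as a stand-in for hot-spot detection.
class HotMethodTracker {
    private final int threshold;
    private final Map<String, Integer> calls = new HashMap<>();

    HotMethodTracker(int threshold) {
        this.threshold = threshold;
    }

    // Records one call; returns true once the method has reached the threshold.
    boolean recordCall(String method) {
        int count = calls.merge(method, 1, Integer::sum);
        return count >= threshold;
    }

    // True if the method has been called at least `threshold` times so far.
    boolean isHot(String method) {
        return calls.getOrDefault(method, 0) >= threshold;
    }

    public static void main(String[] args) {
        HotMethodTracker tracker = new HotMethodTracker(1000);
        for (int i = 0; i < 5000; i++) {
            tracker.recordCall("innerLoop"); // the 10% of code doing 90% of the work
        }
        tracker.recordCall("startupOnly");   // runs once, never worth compiling
        System.out.println("innerLoop hot: " + tracker.isHot("innerLoop"));
        System.out.println("startupOnly hot: " + tracker.isHot("startupOnly"));
    }
}
```

In this model a method below the threshold would keep being interpreted, which avoids the wasted compile time a plain JIT spends on code that runs once.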

    • Intuition suggests that the most efficient thing to do is to "get rid" of the VM by compiling everything to native code before you distribute your app. But that doesn't get rid of the VM -- it just converts it to a different form. There are some VM features you can't compile away, such as garbage collection.

      Where did you pull this assertion out? It's rubbish. Eiffel and Ada are garbage collected out of the box. C and C++ can be by just linking code to a different library. The 'V' in VM stands for Virtual. There is no virtual machine being used, it is using the real machine. Real registers, real opcodes. In fact, gcj compiles down to what amounts to C++ (the object layout is the same)

      Some experts claim (not me, I get dizzy when I even read benchmarks) that "pure" nativeness is illusory and not that efficient. Plus you lose a lot of the features of the Java platform when you run the program that way.

      Partly true on both counts. A VM has some known constraints that lend themselves to optimization. Pointers in C and C++, for example, create potential aliasing problems that just don't exist in Java. However, a good native compiler (not a merely adequate one like gcc) can do instruction ordering that a VM cannot. Bytecode has advantages like portability, and with clever classloaders, you can do all kinds of wizardry to objects by rewriting the bytecode. Not every app needs these features, however, and so native compilation is also a compelling option. Native compilation also gains you the capability of using more native features efficiently. Case in point, JNI is slow: Macromedia Generator is largely a Java scaffold around C++ components, so it uses JNI extensively. Turns out to be slower than an alternative written in pure Java.
      • While I agree with "Where did you pull this assertion out?" the follow up doesn't go far enough. People are forgetting that languages have models of data and regardless of translator and execution style you need to implement the language's data model. For Java this means objects with GC. VM or real-M.
    • Some VM implementations use a sophisticated compromise between interpreters and JIT compilers [sun.com]. If you can identify the small part of the program that does most of the actual work, you know what parts of the program really need to be compiled. How do you do this? You wait until the program actually starts running!

      Advocates of this approach claim that it has the potential to be faster than C++ and other native-code languages. A traditional optimizing compiler can only make decisions based on general predictions as to how the program will behave at run time. But if you watch the program's behavior, you have specific knowledge of what needs to be optimized.

      Err - but you can do that with C++. Ever heard of Feedback Directed Program Restructuring (FDPR)? You run your code under whatever load you feel like optimizing and use the FDPR tools to rebuild the code, changing the ordering and ensuring that cache misses are minimized as much as possible. Try taking a look here [ibm.com] for some more details.

      Cheers,

      Toby Haynes

  • Wow, GCJ produced executables that were, on average, slower and used more memory than the bytecode versions running on the IBM JVM. That's just sad. I always expected that native code compilation would really bring Java into the mainstream, but obviously, GCJ isn't up to that task yet.

    I wonder how the commercial compilers would compare?

    -Mark

  • by kaladorn ( 514293 ) on Wednesday January 30, 2002 @09:46PM (#2928808) Homepage Journal
    First, I have to identify that we (my company) do use the JET byte->native compiler by Excelsior. Good product, I've recommended it to others and they've had success with it too. In our case, it produced a 10-15% speed increase, some in-memory size savings, and it had one huge advantage missing from the byte code: SECURITY!

    After experimentation, I'm pretty convinced that the decompilers on the market that work on obfuscated byte code KICK THE CRAP OUT OF THE OBFUSCATORS. The long and the short of it is the decompiled code is pretty decipherable.

    If you want to protect your IP (Intellectual Property), that's not a good thing. In fact, that might be (if you are in a competitive arena) a VeryBadThing(TM). The native code (especially optimized native code) is far harder to effectively decompile into something usefully readable which crackers and script kiddies can abuse or which competitors can peruse. This benefit alone makes it worth going this route if you can.

    One of the other things the article missed:
    It didn't devote much thought to the settings and optimizations these compilers provide. The Excelsior compiler (as an example; I looked at Bullet Train and some others before we picked Excelsior) provides ways to disable certain safety checks (like array bounds checks) for extra speed. If you're in a fairly aggressive environment with some pretty significant timing issues (I won't say hard realtime, because anyone doing hard realtime should be using something like QNX [qnx.com]), you will find that even these small gains may be useful (and the risks they introduce acceptable). But the article didn't even hint at these possibilities.

    So, if you want to build something that is less likely to be cracked or examined, this type of tool is the way to go. Excelsior, for example, is fairly easy to set up. I did get some help from their guys, but only because our product includes OpenGL, QuickTime, a bunch of XML parser stuff, DirectX, sound support, UI, etc. - a whole pile of external dependencies. The buddy I recommended it to had his project up and going in half an hour or so, with a more modest project (but still a useful, full-fledged app with GUI).

    Undoubtedly, these won't solve all your ills and they may introduce some new difficulties in bug hunting (though some of the new debuggers coming out with these products are very neat also). So you will want to look at what you need, what your concerns are (security, speed, cross platform deployment, etc) and decide accordingly.

    • After experimentation, I'm pretty convinced that the decompilers on the market that work on obfuscated byte code KICK THE CRAP OUT OF THE OBFUSCATORS. The long and the short of it is the decompiled code is pretty decipherable.

      That's probably because there's really no way to obfuscate Java byte code, since it's all java.lang.reflectable anyway - you can use Java code to dynamically load a class name at run time and then discover methods it contains and public variables it has.

      As far as I know, the .class format in essence requires methods and class variables to be determinable, which makes it fairly hard for Java to be "secure" when it comes to making code hard to disassemble. Especially because your classes wind up one class per file and by default end up with the names of the classes and the package structure laid out in the directories created...

      Sun may want to consider creating a secure JAR file which is loaded by a "secure" class loader that prevents reflecting, but a lot of cool stuff can be done with the reflect interface (like loading plugin classes at runtime).

      Bottom line is that every .class file contains a list of all methods and public variables as well as being one instance of an original class definition - not useful for making your program structure hard to disassemble. However, the ability to determine methods and load classes at runtime can be useful too. (As well as required for Java's RMI, I think - I might be wrong, though.)
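
A short sketch of the reflection point made in this comment: given nothing but a class name, standard `java.lang.reflect` calls will list its methods. The wrapper class and method name are invented for illustration. The same metadata is what a decompiler starts from, although an obfuscator can rename most identifiers that nothing looks up by name.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: discover a class's public methods from its name alone.
class ClassInspector {
    // Returns the sorted, de-duplicated names of the public methods
    // declared directly on the named class or interface.
    static List<String> publicMethodNames(String className) {
        try {
            Class<?> type = Class.forName(className);
            TreeSet<String> names = new TreeSet<>();
            for (Method method : type.getDeclaredMethods()) {
                if (Modifier.isPublic(method.getModifiers())) {
                    names.add(method.getName());
                }
            }
            return new ArrayList<>(names);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("unknown class " + className, e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "java.lang.Runnable";
        System.out.println(publicMethodNames(name));
    }
}
```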

    • KICK THE CRAP OUT OF THE OBFUSCATORS.

      This is not the case anymore. An example is Zelix Klassmaster. We use it for our Java web application. No decompiler can cope with it. The decompiled code is littered with byte code that it couldn't work out what to do with. I wrote the code, and I cannot for the life of me work out which methods are which in the decompiled classes after obfuscation. They also do things like String encryption, so even string constants are unrecognisable.

      Java is certainly not perfect in this respect, but my experience with obfuscation (this is very recent) has been very good.


    • After experimentation, I'm pretty convinced that the decompilers on the market that work on obfuscated byte code KICK THE CRAP OUT OF THE OBFUSCATORS. The long and the short of it is the decompiled code is pretty decipherable.


      Oh man, you are not kidding. One department at the shop I work at holds their source code very close to their chest. It is easier to JAD [tripod.com] the class files into java code than it is to go through channels to get a copy of the source. Try it - even on big commercial code like an app server. You will be shocked.
  • Back when I first learned Java (1996?) I thought that native compilation would solve some of the speed problems that came with Java. This article makes it look like the speed-up might not even exist. But let me get to my idea.

    I also realized that native compilation would destroy the cross-platform capabilities of Java. So I always thought it would be cool to distribute Java apps with both native compiled code for a specific platform and Java bytecode too. That way if you happened to be running the target platform you could get the speedup. If not you could still run the app, and maybe even compile the bytecode to a native app for your platform. This is similar to the fat-binary idea that Macs used when they switched from 68k chips to PPC, allowing a binary to run on either platform.
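
A loose analogue of this "ship both" idea, sketched with JNI rather than a whole-program native compiler (that substitution is mine, not the poster's). The library name `hypothetical_fastsum` and the native method are placeholders that do not exist anywhere; on a machine without that library the code falls back to the portable bytecode path, which is the behavior being proposed.

```java
// Illustrative only: prefer a platform-specific build, fall back to bytecode.
class FastSum {
    // True only if the (hypothetical) native library was found and loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad("hypothetical_fastsum");

    private static boolean tryLoad(String library) {
        try {
            System.loadLibrary(library);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // no native build for this platform
        }
    }

    // Would be implemented in the native library, if one were shipped.
    private static native long nativeSum(int[] values);

    // Pure Java version that runs on any JVM.
    static long portableSum(int[] values) {
        long total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Uses the native version when present, otherwise the portable one.
    static long sum(int[] values) {
        return NATIVE_AVAILABLE ? nativeSum(values) : portableSum(values);
    }

    public static void main(String[] args) {
        System.out.println("native available: " + NATIVE_AVAILABLE);
        System.out.println(sum(new int[] {1, 2, 3}));
    }
}
```

The cost is the one the Mac fat binaries had: the distribution carries every variant, and each one has to be tested.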

  • by DaveWood ( 101146 ) on Wednesday January 30, 2002 @10:14PM (#2928913) Homepage
    I found it interesting that this author, an IBM researcher, chose to test only a single Java-to-native compiler, GCJ (a GNU product). This is an immature open-source package that I would not expect much performance from. His paper rehashes a lot of really basic info, then gives some performance results which show IBM's JVM spanking Sun, Kaffe, and GCJ. This is no great surprise; IBM is tooting its own horn. Fine, they deserve to, IMHO. But as an exercise in "the state of native compilation" it's useless. What would actually be really useful is a comparison that also included at least a half-dozen other major players in the Java native compiler market. I suspect you'd see some different results.

    As an aside; I see people call Java "painfully slow," but in my experience it's not that painful post 1.3. I'm not giving you benchmarks, and anti-Java people will just "no" me, but these are my experiences after a few hundred thousand lines of Java code over the past few years. Anyway, it's a good exercise to ask naysayers what _their_ basis is; they often have none.

    Also, as other posters have pointed out, the speed loss must be seen in the runtime safety context, as bounds checking and garbage collection yield stability and security dividends and, at the end of the day, we almost always want those things and are willing to wait the extra few ms for the guarantees.

    All these complaints about speed are especially ironic given how many massive wasters there are in the modern computer, _especially_ in Windows NT/2k/XP.

    But the biggest flaw in this Java vs. C debate is that often you don't get a choice between writing code in Java vs. C/C++, since you don't have the extra 50% in your time budget to do C/C++, and your real choice is between Java and VBScript...

    All the people shouting "I can write C++ 10 times as fast as you can write Java, loser" please form a line in an orderly fashion, you will be spanked in the order you arrive...
    • >All the people shouting "I can write C++ 10 times as fast as you can write Java, loser" please form a line in an orderly fashion, you will be spanked in the order you arrive...

      I have yet to be proven wrong that developing using C++ is any harder or more time-consuming than writing in Java. Arguments for usually revolve around Java's libraries and its garbage collection.
      To that I say Java's libraries are outclassed and outnumbered by the amount of source for C++ and the number of libraries like Boost, Dinkum, etc. As for garbage collection, I myself have 7 different conceptual collection systems implemented that can be easily integrated into any code you want.
      I can even make the case that developing for C++ is easier given the use of templates. Those who say that Java is easier to program in really don't want to learn the advantages of programming in C++.
      • by Roy Ward ( 14216 ) <royward770&actrix,co,nz> on Thursday January 31, 2002 @12:42AM (#2929382)
        I've used both C++ and Java extensively (although I haven't used garbage collected C++).

        For ease of coding, I find that Java simply outshines C++ because it doesn't leave me dealing with low level stuff, like pointers.

        An occasional big time-killer with C++ is trying to debug something that corrupts memory. This doesn't happen with Java (although you can muck up the synchronization with threading and get unpredictable results, which is just about as bad).

        On the other hand, if I want performance (such as writing image processing software), I'll go with C++ (or assembler), as there is no way that Java can compete on speed for low level stuff.

        And even the awful C++ templates are better than no templates at all.
      • by FastT ( 229526 ) on Thursday January 31, 2002 @01:04AM (#2929421) Homepage Journal
        I have yet to be proven wrong that developing using C++ is any harder or time consuming than writing in Java.
        From this I deduce that you are either a student or someone in academia, or someone working at a small shop somewhere writing non-commercial or only minorly-commercial software. There is just no comparison when writing large, real world commercial (or even non-commercial IMHO) software. You wouldn't need it proved to you if you saw the misery C++ causes in these situations.
        I myself have 7 different conceptual collection systems implemented that can be easily integrated into any code you want.
        You know, this is exactly what I would expect guys like you to say. You have 7 different ways, the guy in the next cube has 3, my last company had N. This is the problem. Java has this capability built into the system, it works one way, and everyone understands it, both its strengths and its flaws. Same goes for all the other class libraries you mention in your argument.

        The point is, I can come in and maintain someone else's code with far less trouble if it's written in Java than I can if it's written in C++. Same goes for the support engineers. Same goes for customers. If it's written in C++, the application can essentially be a language unto itself. You have different mechanisms for just about any major feature between development teams; none of them are standard, and none of them are even remotely as easy to learn and use as the equivalent Java API. Maintaining these applications costs a huge amount, and is fraught with support issues exactly because there are so many incompatible ways of doing the same thing.

        Your argument is so tired because it doesn't take into account the actual cost of developing real applications--maintenance and support.

        Those who say that Java is easier to program in really don't want to learn the advantages of programming in C++.
        Wrong: Those who say that Java is easier to program in (and who don't know C++) really don't want to learn the disadvantages of programming in C++.
      • I have yet to be proven wrong that developing using C++ is any harder or time consuming than writing in Java.

        Hmm. Do you include debugging and maintenance as part of development? I'm sure you can type C++ about as fast as you can Java, but the work doesn't end there. One of Java's advantages is the vast reduction of memory leaks (not elimination, some guys can leak resources in any language...). In my opinion memory mismanagement is largely responsible for the typical C/C++ bug.
      • I know I'll get flamed for this, but ...

        If you want to use templates as well as garbage collection, you might like Managed C++ [gotdotnet.com], the .NET version of C++.

        It'll let you code in C++ (with templates etc) but also use garbage collected arrays etc, and the .NET class library [microsoft.com].
    • The biggest speed issues in Java are not from the JVM being an interpreter or JIT, they are from the API and the very high level of safety guarantees provided by Java.

      Java code can be very fast, but you have to take the API and the nature of Java into account in your design. If you want to do nothing but sling strings around, you're going to pay a massive penalty for doing it in Java, because all of the Java APIs (at least in 1.3 and prior) require String objects for any string activity, and String objects are immutable due to a concern for thread-safe API calls. If you want to change the third character in a String, you either create a new String with that change in place, or you create a StringBuffer that will spew an immutable String copy each time you want to use the characters in your StringBuffer.

      A C program would never ever do things this way, and so you get insanely better performance. But if you write a C++ program that does do things this way, you'll actually get performance that is very close to Java's, except you have no guarantees as to the integrity of any of your objects because some random piece of code can scribble all over anything it wants to. Plus no universal set of exception classes that everything knows about, no portable thread synchronization primitives, etc., etc., etc.
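
      A minimal sketch of the String vs. StringBuffer point above. The class and method names are made up for illustration; only String, StringBuffer, and their standard methods come from the JDK.

```java
// Illustrative only: StringMutationDemo and its methods are hypothetical names.
class StringMutationDemo {

    // Strings are immutable, so "changing" one character means
    // building an entirely new String from the pieces.
    static String replaceViaNewString(String s, int index, char c) {
        return s.substring(0, index) + c + s.substring(index + 1);
    }

    // StringBuffer is mutable: setCharAt edits the buffer in place,
    // but toString() still hands back a fresh immutable String.
    static String replaceViaBuffer(String s, int index, char c) {
        StringBuffer buf = new StringBuffer(s);
        buf.setCharAt(index, c);
        return buf.toString();
    }

    public static void main(String[] args) {
        String original = "native";
        System.out.println(replaceViaNewString(original, 2, 'X'));
        System.out.println(replaceViaBuffer(original, 2, 'X'));
        System.out.println(original); // the original String is never modified
    }
}
```

      Either way you end up allocating at least one new String, which is the cost the comment is complaining about.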

  • by stph ( 541287 ) on Wednesday January 30, 2002 @10:15PM (#2928914) Homepage

    These kinds of articles raise more questions than they answer. I have to ask what does Java native compilation gain me? Martyn writes in the article that performance and memory consumption were basically a wash on the more complex app, so what are the other considerations that might drive me to use a native compiler?

    • Does it make programming in Java easier? Not really. Most of my development environments for Java would take serious tweaking to get something like gcj to work. And some things, like WebObjects, where much of my stuff has to run, might never be made to work with native code.
    • Does it make debugging easier? Now this might be a useful avenue to explore. Debugging an app that works in one JVM but not the other(s) can be a serious pain. I do a fair bit of developing on a PC and deploying on a Linux server, where the former has Sun's JVM and the latter uses IBM's JVM. Maybe native compilation would help solve that, especially if you could hook back to the source code. Without source support, though, it would be troublesome.
    • Can I support my customers more efficiently with native compilation? I don't see how native compilation would make this easier. Instead of JVM differences, now I have hardware differences.
    • Does it reduce the load on my servers when I fire up these applications? We get a lot of individual JVMs running on our application servers when they are loaded up with lots of multithreaded apps and other such things. It is possible that the total footprint across a bunch of threaded apps would make this a compelling reason to explore, but Martyn's article doesn't really address that issue. Of course, our JVM proliferation could be the result of the various frameworks we're plugged into: WebObjects talking to DB2 and SQL Server databases.

    Native compilers have been here for a long time and they haven't really taken off. They either need to offer something absolutely necessary that I can't get via regular Java compilation and runtime, or they need to offer performance improvements that are orders of magnitude better than what we already have. If the vendors can do that, then I want to talk to them. Otherwise it's just another experiment in an already too busy world. Stph

  • by trance9 ( 10504 ) on Wednesday January 30, 2002 @10:31PM (#2928981) Homepage Journal
    Lots of experts here.

    Some experts who have never used Java want to tell me that it's no good, and will never be any good--why? They don't know, but they know!

    And some experts who want to tell me all about Java compilation, and why it is hard or easy, even though they really don't know anything about compilers.

    And some experts on Java's market share who really don't know anything about who uses Java.

    And some experts who sat in a room where Java was... gosh gee... being implemented, telling me... well I don't quite know what, but gosh!

    So many experts here--I must be reading slashdot!
  • Hold on. (Score:3, Informative)

    by Anonymous Coward on Wednesday January 30, 2002 @10:36PM (#2929003)
    There's a thread about this article over at the gcj's mailing list: here [gnu.org].

    The author chimes in, BTW.
    • That thread also references this page [shudo.net], which has a somewhat more comprehensive set of performance tests.

      A quick scan seemed to indicate that Sun's HotSpot, IBM's JVM and GCJ all do very well.

  • Bad review (Score:5, Insightful)

    by Animats ( 122034 ) on Wednesday January 30, 2002 @10:45PM (#2929038) Homepage
    The simple case, the prime number finding loops, should have been followed by an analysis of the object code to find out why the native compilation is so slow. Look at the inner loop:
    • for (long test=2; test < i; test++)
      { if (i%test == 0) { return false; }
      }
    If the compiler generates slow code for that, something is very wrong in the compiler.

    On the safety front, subscript checking is almost free if done right. Subscript checking can usually be hoisted out of loops. Old studies on Pascal optimization showed that 95% of subscript checks can be optimized out at compile time without loss of safety. GCC, though, being a C compiler at heart, isn't likely to know how to do that. Merely putting a Java front end on it doesn't make it a Java compiler deep inside.
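
    To make "hoisted out of loops" concrete, here is a hand-written sketch of the transformation. It only simulates in source code what a compiler would do internally (a real JVM still applies its own checks to a[i]), and the class and method names are hypothetical.

```java
// Illustrative only: BoundsCheckDemo is a hypothetical name.
class BoundsCheckDemo {

    // Per-iteration check: roughly what a naive compiler emits for every a[i].
    static long sumChecked(int[] a, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            if (i < 0 || i >= a.length) {
                throw new ArrayIndexOutOfBoundsException(i);
            }
            sum += a[i];
        }
        return sum;
    }

    // Hoisted check: i only takes values in [from, to), so one range
    // test before the loop covers every access inside it.
    static long sumHoisted(int[] a, int from, int to) {
        if (from < to && (from < 0 || to > a.length)) {
            throw new ArrayIndexOutOfBoundsException("range " + from + ".." + to);
        }
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += a[i]; // known to be in range at this point
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sumChecked(data, 1, 3));
        System.out.println(sumHoisted(data, 1, 3));
    }
}
```

    The hoisted version throws before doing any work rather than partway through, which is only safe here because the loop has no side effects visible to the caller.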

    • for (long test=2; test < i; test++)
      { if (i%test == 0) { return false; }
      }
      If the compiler generates slow code for that, something is very wrong in the compiler.

      According to this message [gnu.org] to the GCJ mailing list, it's because GCJ doesn't produce particularly good 64-bit integer math code.
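
      If you want to check the long vs. int effect yourself, a rough sketch like the following runs the quoted loop both ways. The harness, names, and the limit of 20000 are my own assumptions, not the article's benchmark, and the timings will vary by compiler and machine.

```java
// Hypothetical harness: PrimeLoopDemo is not the article's benchmark code.
class PrimeLoopDemo {

    // The quoted inner loop, using 64-bit long arithmetic.
    static boolean isPrimeLong(long i) {
        for (long test = 2; test < i; test++) {
            if (i % test == 0) { return false; }
        }
        return true;
    }

    // Identical logic using 32-bit int arithmetic.
    static boolean isPrimeInt(int i) {
        for (int test = 2; test < i; test++) {
            if (i % test == 0) { return false; }
        }
        return true;
    }

    // Count primes in [2, limit) with the long version.
    static int countPrimesLong(int limit) {
        int count = 0;
        for (long i = 2; i < limit; i++) {
            if (isPrimeLong(i)) { count++; }
        }
        return count;
    }

    // Count primes in [2, limit) with the int version.
    static int countPrimesInt(int limit) {
        int count = 0;
        for (int i = 2; i < limit; i++) {
            if (isPrimeInt(i)) { count++; }
        }
        return count;
    }

    public static void main(String[] args) {
        int limit = 20000; // arbitrary size, chosen only to keep the run short
        long t0 = System.currentTimeMillis();
        int viaLong = countPrimesLong(limit);
        long t1 = System.currentTimeMillis();
        int viaInt = countPrimesInt(limit);
        long t2 = System.currentTimeMillis();
        System.out.println("long: " + viaLong + " primes in " + (t1 - t0) + " ms");
        System.out.println("int:  " + viaInt + " primes in " + (t2 - t1) + " ms");
    }
}
```

      Both versions must report the same count; only the elapsed time should differ, and by how much depends on how well the compiler handles 64-bit division on your platform.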

    • Just as a quick test, I built his Java program with my version of javac and also wrote a C version that is basically identical, except for the time/print calls and variable ordering. I compiled the C version with both GCC and G++ under MacOS X (PPC).

      Java version: 51250 ms
      C version: 6392 ms
      C version compiled with G++: 6412 ms

      If GCJ can approach even half the performance of G++ in the future it looks like it'll be a complete win over Javac. As I understand it, Apple's version of Java is quite good, but the two just don't compare.
    • AFAIK, the GNU Ada compiler is not too bad at eliminating unnecessary bounds checking (array bounds checking usually only adds a penalty of a few percent), so something can be done even in the GCC context.

      BTW, I would expect that gcj-compiled Java applications start much more quickly than entire VMs. The comparison doesn't seem to take this into account.
  • by jsse ( 254124 ) on Wednesday January 30, 2002 @10:51PM (#2929066) Homepage Journal
    It's being done by IBM; wouldn't anyone think it's biased? :)

    Don't get me wrong, I'm a big fan of IBM, and IBM has really, really made a JDK that outperforms Sun's JDK many times over.

    However, I believe the selection of 'opponents' is simply... unfair. :)

    Kaffe is surely an easy pick. Yes, it's the only GPL JDK out there, but many people (at least the Java developers here) avoid Kaffe because it has fatal security flaws [kaffe.org] that the Kaffe team doesn't seem to want to fix. :/ (We call Kaffe the 'MS JDK' [openresources.com], not just for humor, but because it's really the case. ^_^)

    Even the GNU people know gcj is slow, but speed is not a real issue for gcj; it's basically a starting point for all implementations, or a reference implementation as we like to call it. We would pick a commercial Java compiler if we needed one.

    Sun's JDK, well... you know what I think is just what you think: IT'S SLOW! Yes, we all know that. :D

    Nevertheless, I think developerWorks is doing a great job here; it confirms something everybody knows: that IBM's JDK is fast, and that's it. I don't see how it could settle the performance of natively compiled Java programs. Unless they include the other commercial Java compilers in the testing, I wouldn't think we have reached a conclusion yet. :)
  • by PRR ( 261928 ) on Wednesday January 30, 2002 @11:06PM (#2929133)
    One of the best articles on native compilation is this performance report [javalobby.org]

    Also see on Javalobby the "native compilation" myth [javalobby.org] and RFEs for JDK1.5 [javalobby.org] threads, which discuss native compilation.

    Yes, there are some companies like Jet and Jove (and GCJ) making native compilers, but I'd like to see a company with the clout of Sun (or IBM) make native Java compiling more accessible by having a "javac -native" option with their JDK and a smaller standardized runtime (instead of the full JRE) specifically for native compiled apps. The most likely target for native compiled apps running w/o a JVM would probably be the client/desktop side, which doesn't have the resources of server boxes. Remember... it would be an OPTION in the JDK... the old bytecode compiling for VM/JIT would still be there.

    It's arguable whether the runtime performance of a native compiled app would be substantially better than JIT compiled (if some of the HotSpot tricks were moved upstream into the JDK compiler?), but there are also other advantages, such as faster startup time (no need to start up a VM process and JIT compile) and better memory usage (no need for a VM process or for the metadata and bytecode that must be held in memory). There would also be some drive space savings, as the full JRE wouldn't be needed (though at this time a developer can't include a partial runtime; Sun wants the full thing if they include one. Sun needs a change of heart and should make a standardized smaller runtime for native apps). Also, with native binaries, an obfuscator (used for bytecode classes) isn't needed for commercial apps by those who want to keep their code proprietary.

    There are lots of suggestions going around for the VM/JIT issues, such as loading the VM at boot time, a single shared VM, and JIT caching. These workarounds aren't necessary with JVM-less native compiled apps.

    Of course, there are WORA extremists who don't like this idea of native compilation because it goes against the dogma of only needing to compile ONCE and run on any OS with a JVM. IMO, as long as the source stays the same, compiling natively for different OS's is still pretty near to the WORA ideals. The Java language is VERY tweaked for portability as-is. It's the OPTION of native compiling that should be available in the Sun or IBM JDK.
  • For anyone that wants to learn more about Java performance tuning you might want to check out my book. You can read it online here:

    http://java.sun.com/docs/books/performance/ [sun.com]

    -Steve

  • Excelsior JET (Score:2, Insightful)

    by LadyLucky ( 546115 )
    For my own curiosity, I compiled an interpreter I had written in Java using this, just to compare the performance.

    It took forever to compile, but once it was done, I had a (large) executable for my native platform, Windows.

    It ran about 30% slower than the JDK. There are a number of things that are still pretty slow in Java, but in general, it's a pretty fast language these days. JIT compilers and HotSpot do a good job. It can never be as fast as C++ due to things such as GC, but the performance tends to be close for most applications.

    This is indeed interesting, but I think it is entirely irrelevant. Speed of execution is usually about 7th on the list of important-stuff-I'm-thinking-about when choosing a language and starting a project. There are so many more important things, such as maintainability, scalability, code reusability, code robustness (the amount of stuff you get so easily compared to C++ just leaves you wondering how you could ever program in C++ again), you know the stuff. These things are often far more of an issue than raw performance. Look at Slashdot: it's written in Perl, presumably because they thought it was easier to write the website in, not because it was the fastest thing around.

  • Some gcj facts (Score:3, Informative)

    by tromey ( 555447 ) on Thursday January 31, 2002 @12:23AM (#2929334) Homepage
    Some facts about gcj and the IBM article.
    • They tested an old version of gcj. gcj 3.1 beats the Sun and IBM JDKs on SciMark. It also wins on the "primes" test once you change it to use int and not long; this is a known gcc weakness.
    • In general we haven't done a lot of performance tuning. There is still a lot of room for us to improve.
    • You can see a much better (IMHO) comparison of gcj with other VMs here [shudo.net].
    • Contrary to what one poster said, my understanding is that gcj has better I/O performance than the Sun JDK.
    • It is true that gcj is missing AWT (though much progress has been made on that front recently, we still aren't there) and some other things. However, it is still useful for many things.
  • but it [native compilation] can result in the faster execution and smaller footprint essential to so many of today's applications

    WHEN are we going to get over these Enterprise Application Beanie Stick-Some-Code-In-An-Amorphous-Cloud-And-Forgettaboutit Framework fooey articles? I was hoping this would die with the dot bombs.

    I've done *a lot* of Java programming and I am very disappointed (no, I am pissed!) with the direction Sun, IBM, and the others are taking the Java language. They have surrounded it with its own little economy and marketed it to the point where clients insist on using some bloated J2EE Super Servlet Slinger (read: drooler) when all you need is something like

    env | $JAVA_HOME/bin/java -cp $CLASSPATH proj.CgiMain $CGI_PROPERTIES

    and a couple of utility functions.

    Native compilers are not going to solve this problem. Java code is just inherently bloated. If you just wrote Java code like you would C, we wouldn't be having this conversation. Instead, some jackass has to throw in a DB thread pool that doubles as a major object sink for an app that gets 20 hits an hour.

    The problem here is that Java is much more forgiving than other languages. But it doesn't relieve the developer of choosing proper data structures and algorithms. I think I should become a DBA. That's where the heavy lifting should be done anyway.
  • Honest, there's only one time where the VM's performance has bit me in the rear. Other than that, it's never been an issue. Then again, I'm not writing a lot of real-time code.

    The bigger problem is programmers' lack of understanding of how "Java as a whole" works. Here are the top questions I ask when interviewing a potential Java programmer. Generic questions spawn thoughtful generic answers.

    ) What's the JVM?
    ) What's a pointer and how do they work in java. (This one gives you looks.)
    ) How and when does GC work.
    ) What happens in the VM when I write "new Object()"
    ) What does your typical compiler break the following line into?
    String s = "A"+"2"+new Integer(5)+new String(new byte[] { 66,65,68});
    ) *Insert ob thread question*
    ) vi or emacs? :)
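
    For the string concatenation question, here is a rough sketch of one plausible answer, assuming an older javac that lowers + into StringBuffer calls (newer compilers use StringBuilder or other strategies, so treat this as illustrative, not as what your compiler emits). I swapped in Integer.valueOf(5) and an explicit US_ASCII charset so the example compiles cleanly and behaves the same everywhere; the class name is made up.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: ConcatExpansionDemo is a hypothetical name.
class ConcatExpansionDemo {

    // The interview line, lightly adapted (see the note above).
    static String viaOperator() {
        return "A" + "2" + Integer.valueOf(5)
                + new String(new byte[] {66, 65, 68}, StandardCharsets.US_ASCII);
    }

    // Roughly what the compiler turns it into: the two literals are
    // folded into one constant, then everything else is appended.
    static String viaBuffer() {
        StringBuffer sb = new StringBuffer();
        sb.append("A2");                 // "A" + "2" folded at compile time
        sb.append(Integer.valueOf(5));   // append(Object) calls toString()
        sb.append(new String(new byte[] {66, 65, 68}, StandardCharsets.US_ASCII));
        return sb.toString();            // one more String allocated here
    }

    public static void main(String[] args) {
        System.out.println(viaOperator());
        System.out.println(viaBuffer());
    }
}
```

    Counting the objects created by that one innocent-looking line is the real point of the question.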

    It's not that Java is god-awful slow; it's because it has been hyped so much that non-programmers are getting into the game (not on the scale of VB) and bad code is being written and distributed. I'm not here to harp on non-programmers, but I wouldn't hire one for the work I do.

    Too many people depend and rely on Java's special features such as GC. The programmer who understands "Java as a whole" can program with and against the features.
    For example, take a routine that really does need to 'create' 1E+6 objects. In C you can malloc and free on demand, but in Java you have to deal with the GC. You can either help the GC out by nulling object refs after use, so the GC's scope and reference scans are quicker, or better yet, use an ObjectPool. It's all about optimizing the 10% of the code that does 90% of the work. But PHBs bitch about performance when they insist on RAD tools. (Have you ever analyzed the shit code that JBuilder produces??)
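
    A bare-bones sketch of the object pool idea, assuming single-threaded use. SimpleObjectPool and its methods are hypothetical names, not a standard JDK class, and whether a pool actually beats the collector depends on the JVM, so measure before relying on it.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Illustrative only: a minimal, non-thread-safe pool with hypothetical names.
class SimpleObjectPool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    SimpleObjectPool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuse a released object if one is available; otherwise allocate.
    T acquire() {
        T obj = free.pollFirst();
        if (obj == null) {
            obj = factory.get();
            created++;
        }
        return obj;
    }

    // Hand an object back for reuse. The caller is responsible for
    // resetting its state before using it again.
    void release(T obj) {
        free.addFirst(obj);
    }

    // Number of objects the pool has actually allocated.
    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimpleObjectPool<StringBuilder> pool = new SimpleObjectPool<>(StringBuilder::new);
        for (int i = 0; i < 1_000_000; i++) {
            StringBuilder sb = pool.acquire();
            sb.setLength(0); // reset state left over from the previous use
            sb.append(i);
            pool.release(sb);
        }
        System.out.println("objects allocated: " + pool.createdCount());
    }
}
```

    The loop "creates" a million logical objects while the pool only allocates when nothing has been released yet, which is the whole trick.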

    There's a hundred ways to skin a cat, but 90% of those ways are still going to leave fur on the little guy.
  • Hasn't the world realised that java bytecode is yet another attempt to re-write the p-code system of old?

  • Some points glossed over in the article:
    • Memory footprint can be a major factor. Very nice that one program performs better than another given infinite memory (340M---large enough anyway). My computer, however, has considerably less memory, and most other people I know have 64M--128M; moreover, there are often several applications/processes competing for that memory. How well would these VMs/compilers work on IBM's own NetVistas, standard with 128M? How about with a benchmark that actually needs significant memory (as opposed to a prime sieve, which has minimal memory requirements)? Once you start to run low on memory, things slow down dramatically, and once you start swapping you have 0 performance---I do not believe their results reflect a realistic usage scenario.
    • Tiny, micro-loop benchmarks are of little value for most Java users. Programs that spend the majority of their time in a few small, simple loops are extremely sensitive to the exact optimizations performed---depending on the exact instruction sequence, branching setup, cache layout, etc., your otherwise great compiler may look like crap, or vice versa. If your program consists of a few very tight loops then these results are meaningful; for the other (100-e)% of Java users, they're of minimal value.
    • Their benchmarks do not test the features people rely on most---how much garbage collection does SciMark really require? How many string operations, virtual function calls, etc.?
    • In a small benchmark it is just far too easy to identify (off-line) the few APIs called, and then make sure your VM/compiler spends disproportionate effort optimizing those APIs. These are programs with trivial hot spots; locating such bottlenecks in a real application is difficult for many reasons, so again these results do not transfer well.
    • When do they start measuring time? It looks fair, using the Java code itself, but there are subtleties. My user experience with an application begins when I press enter on the command line (or click on an icon); the native app has to load its libraries, etc. The interpreted version has to load all the VM libraries and start compiling the startup code. There's a lot of work done before it ever gets to the currentTimeMillis() call---I do not see this accounted for anywhere. Further, their results are 'average of 3'---are these 3 runs after 3 cold boots, or 3 runs seeing how much the cache still holds?
    • Their "pros and cons of native compilers" section is much heavier on the cons than the pros. They ignore packaging simplicity (a single executable or one with a few DLLs might be easier to give to a customer than an entire VM), interaction with other applications (uh oh, user installed jdk1.3 but we need 1.4, and that other app needs 1.2, ...), and security (already mentioned by others here; it's a fair bit harder to decompile binaries than bytecode).
    • I'm astounded they list VAJ in their resources section for native compilers. VAJ used to contain HPJ, but it hasn't been there since java went to version 1.2.

    I expected a little better from their 'developerWorks' people---this is just marketing drivel.
