Linux Number Crunching: Languages and Tools

Follow Slashdot stories on Twitter

Linux Number Crunching: Languages and Tools 322

Posted by chrisd on Thursday January 02, 2003 @05:08AM from the hyperthreading-seems-neat dept.

ChaoticCoyote writes " You've covered some of my past forays into benchmarking, so I thought Slashdot might be interested in Linux Number Crunching: Benchmarking Compilers and Languages for ia32. I wrote the article while trying to decide between competing technologies. No one benchmark (or set of benchmarks) provides an absolute answer -- but information helps make reasonable decisions. Among the topics covered: C++, Java, Fortran 95, gcc, gcj, Intel compilers, SMP, double-precision math, and hyperthreading."

This discussion has been archived. No new comments can be posted.

Linux Number Crunching: Languages and Tools

Load All Comments

Search 322 Comments Log In/Create an Account

Comments Filter:

0th post (Score:3, Funny)

by Anonymous Coward writes: on Thursday January 02, 2003 @05:10AM (#4997419)

At least if you're developing in C....

Share
twitter facebook
Octave (Score:5, Interesting)

by sql*kitten ( 1359 ) writes: on Thursday January 02, 2003 @05:13AM (#4997425)

Interesting numbers. Have you considered benchmarking Octave or rlab also? (Or is there a native MATLAB for Linus now?)

Share
twitter facebook
- Re:Octave (Score:2)
  
  by StandardDeviant ( 122674 ) writes:
  
  Well, I don't know if there is a swedish localization for MATLAB for Linus' usage, but I know for a fact there is a native binary for Linux (I worked as a junior admin at my college's math department for a while back in '99). :-)
- Re:Octave (Score:2, Informative)
  
  by bezza ( 590194 ) writes:
  
  I have been using native MATLAB binaries for Linux for a while now...and unfortunately they are still much slower than the windows versions (both in number crunching and interface).
- Re:Octave (Score:2)
  
  by joib ( 70841 ) writes:
  
  is there a native MATLAB for Linus now?
  
  Matlab has been available for Linux since 1995.
- Matlab for Linux (Score:2)
  
  by Kunta Kinte ( 323399 ) writes:
  
  Matlab for linux http://www.mathworks.com/products/system.shtml/Uni x [mathworks.com]
He didn't include K. (Score:5, Interesting)

by Jayson ( 2343 ) writes: <jnordwick @ g m a i l . com> on Thursday January 02, 2003 @05:23AM (#4997441)

K [kx.com] is a high-performance array language. It is based on APL and Lisp. It really shines when crunching obscene amounts of data. This seems like something that would be perfect for the language. The proof of K's speed lies in KDB, a database written entirely in K. On TPC benchmarks [kx.com] is spanks Oracle and other leading databases (including some amazing scaling [kx.com] across processors: simple table scans with 2.5 billion rows take 1 second and multi-dimensional aggregations take 10-20 seconds).
There is a quick and dirty intro to K [kuro5hin.org] over at Kuro5hin.
Some more links for more inforation:
Kernigan's benchmark test [kx.com]
more examples [kx.com]
Kx [kx.com]: the people who make K and KDB

Share
twitter facebook
- Re:He didn't include K. (Score:3, Interesting)
  
  by sql*kitten ( 1359 ) writes:
  
  The proof of K's speed lies in KDB, a database written entirely in K.
  
  It looks like an interesting product - I'll definitely take a closer look once it goes MT and 64-bit. Seems a little strange to me that it wasn't built like that from the ground up, since it seems to rely so heavily on clever data structures and virtual memory caching. (Altho' I do note that the slave processes share memory, which is the way Oracle does it if you don't want MT).
  
  Also, I'm unconvinced the inverse design will work well on sparse data. In every deal there are usually plenty of unused fields on the ticket, unless you fully normalize. It works well enough with rows, you just place all the nullable columns after the non-nullable, and Oracle will simply skip over them to the next-row marker in physical storage. Inverse tables will be fast for simple aggregates, not so sure how well they would perform for complicated multi-table joins and groups with many predicates.
- Kernigan's k and java and perl (Score:2)
  
  by Camel Pilot ( 78781 ) writes:
  
  From Kernigan's benchmark test
  
  k is faster (sum of times)
  k(32) perl(95) java(300) tcl(1400+)
  [k is much faster than c on strings and memory management.]
  
  k is shorter (lines of code)
  k(9) awk(66) perl(96) tcl(105) scheme(170) vb(200) java(350)";
  
  Interesting comparison. It took 350 lines of java to compare to 9 lines of k and 96 lines of Perl. also Java was 10 times slower than k and 3 times slower than Perl.
- nor matlab, nor ... (Score:3)
  
  by g4dget ( 579145 ) writes:
  
  There are plenty of "high performance array languages". Matlab is one of them, and so is Numerical Python, and A+. I don't see any particular reason to push a commercial product like "K". Languages like that derive their speed from excellent underlying libraries; there is nothing amazing or special about that.
  On this particular benchmark, "K" would probably perform very poorly--because it doesn't involve any big arrays. But, since you like "K" so much, why don't you try for yourself and report back?
- - Very true (Score:2)
    
    by Jayson ( 2343 ) writes:
    
    KDB performs as well as it does because of this. It inverts the tables and store them by columns instead of rows. The K language goes along with that principle and has special meta-operators and operators for dealing with large amounts of data of the same type.
    
    The language is interpreted, too.
    - Re:Very true (Score:5, Interesting)
      
      by homb ( 82455 ) writes: on Thursday January 02, 2003 @06:22AM (#4997530)
      
      Yes, but indeed if you're really looking to benchmark only, comparing a row-based database engine with a column-based one is like comparing an apple to an orange. Both are fruits, both give you calories, but they're quite different.
      
      Now as we're going off-topic from the original submission, one could benchmark KDB with Sybase IQ Multiplex. Here you're talking about 2 column-based db engines. In my testing, KDB is indeed up to an order of magnitude (10x) faster than Sybase IQ which is itself 2 orders of magnitude (100x) faster than row-based database engines.
      
      However, as the article in the post says, benchmarks don't give the whole story.
      
      Apart from the usual learning curve issues and available management tools (which KDB sadly lacks compared to Sybase IQ), there is one fundamental difference between the 2 db engines (and Oracle, DB2, Sybase ASE, etc...):
      
      KDB is single-process, and does not pool memory. I'm not saying this is bad, but it makes for very interesting architectural issues when designing a system. For example, if you're going to use KDB, you're better off with the fastest possible single-CPU system. The best platform for KDB is probably the fastest Intel P4 Xeon, dual-processor, and as much RAM as possible on the machine. One processor will be used exclusively for KDB, the other for the OS. To grow, you'll implement a farm of those.
      
      On the other hand, the other major DB engines generally perform much better in multi-CPU systems such as 16-way Sun servers. They pool the memory and use all the CPUs you'll give them. This makes for a more expensive single system, but an easier implementation if your application is larger than what a single dual Intel box can provide. In such a case, KDB will need one write engine and multiple read engines, significant storage pooling issues, etc...
      
      Anyway, one last point regarding column-based database engines: they are certainly amazing for reporting and most read commands. Where they lose to row-based engines is in inserts, and in selects that return data from a large number of columns.
      In the former case, you trick KDB and Sybase IQ into performing batch inserts (where the loading of columns will only be "wasted" once per batch). In the latter case, you're going to be hurt with KDB and Sybase IQ whatever you do, as they'll have to load in memory all the columns out of which you need the data.
      
      Bottom line:
      
      If you need OLTP (lots of inserts/updates) and aren't worried about extreme speeds, go for Oracle, Sybase ASE, DB2, etc...
      
      If you need fast reporting with very quick time to market, go for Sybase IQ Multiplex.
      
      If you need the absolute ultimate in reporting speed and have the time and resource to apply to it, go for KDB.
      
      Parent Share
      twitter facebook
- - Re:He didn't include K. (Score:5, Insightful)
    
    by DGolden ( 17848 ) writes: on Thursday January 02, 2003 @08:37AM (#4997785) Homepage Journal
    
    You should consider the readability of the language for someone WHO KNOWS THE LANGUAGE, dammit.
    
    I don't go round claiming japanese or arabic is unreadable - I just don't know the language.
    
    The analogy extends further - it is possible to construct almost unreadable drivel in natural languages, and it is possible to construct almost unreadable drivel even in python.
    
    However normal code written in python, forth, common lisp, or even, god forbid, c++ or perl is readable to someone who knows the language.
    
    Now, some programming languages are closer to english in appearance than others. However, for long-term use, that doesn't matter so much - it's just a barrier to entry for lazy people.
    
    I don't happen to know K. I do know APL, though - APL didn't look like K, since it had its own non-ASCII symbol set. I do find it difficult to read the relatively new asciified line-noise APL-derived languages. But that's because I haven't bothered to learn 'em! I do suspect they would be harder to learn than APL, since the ASCII symbols are already overloaded with so many other meanings already - but once I'd learned them, I would expect that problem to fade - just like I'm not confused that "gift" in German means "poison" in english.
    
    Actually, now that Unicode is widely supported, I would love to see a resurgence in APLs that use APL symbols, since they're much clearer to me - but so many people have been using the ASCIIfied APLs for so long now that that may never happen.
    
    Parent Share
    twitter facebook
    - Re:He didn't include K. (Score:2, Insightful)
      
      by master_p ( 608214 ) writes:
      
      Now, some programming languages are closer to english in appearance than others. However, for long-term use, that doesn't matter so much - it's just a barrier to entry for lazy people.
      
      I don't think so. Using symbols for expressions is not the same as using English, especially if one is developing in more than one language.
      
      You should also consider other facts:
      
      the introduction of new members in the development team, especially if the new members don't know the language
      maintenance and service after installation; it may not be the original developer (who was so efficient in K) that maintains the project
      readability; after long coding hours, ',' is easily mistaken for '.', for example; and with so many symbols packed together in such tiny space, the problem only gets worse.
      learning curve; much higher than a language based on English.
      relevance to documentation/pseudocode; for example, it is much easier to make someone understand that the developed code follows the pseudocode defined in a project's specifications when the code is as close as possible to the English language than when the code is a bunch of symbols thrown together.
      Although Python/Lisp/C++/whatever are readable, this is because they are based on English. The more a language is based on English, the better it is for big project development. That's why every coding style says "use readable variable names"...If things were as you claim, we all be programming in assembly or with 0 or 1s; after all, how hard to make a mistake with 2 symbols only :-) ?
      
      Don't forget that Hypercard has been claimed as one of the best programming environments because of its ability to program almost in English.
      
      The attitude "symbols are ok, as long as I understand them" shows that you are ignorant of the issues of real development (with managers, deadlines, multiple and heterogenous environments, different coders, testers, bug reports and bug databases, etc).
      
      Finally, the obfuscated C code contents would not exist if code readability did not play an important role!!!
      - start make an object (Score:2)
        
        by kfg ( 145172 ) writes:
        
        containing an item stop Now make another stop Now put them together in a new object stop Duplicate this object stop Now make a new object containing the two objects that were the result of the earlier operations stop Count the items in the latest object stop There are four of them stop stop
        
        2+2=4
        
        *Sometimes* purely symbolic typographical notation is *far* more readable than natural language.
        
        A wise programer will use each method appropriately. First learn *math,* THEN learn programming.
        
        KFG
      - Re:He didn't include K. (Score:3, Insightful)
        
        by DGolden ( 17848 ) writes:
        
        Firstly, I would like to echo the sentiment in kfg's reply to your post.
        
        Secondly, all html formatting seems to have stopped working- dunno why, so I apologise for the poor formatting of this post.
        
        Thirdly:
        
        *** Now, some programming languages are closer to english in appearance than others. However, for long-term use, that doesn't matter so much - it's just a barrier to entry for lazy people.
        ***
        
        *** I don't think so. Using symbols for expressions is not the same as using English, especially if one is developing in more than one language.
        ***
        
        O.K. I may have been oversimplifiing - but: ENGLISH IS SYMBOLS TOO. Using English is exactly the same as using symbols - since using english, or any language, *is* using symbols. That's how humans abstract. In written english you have letters, composed into compund symbols (aside: these compund symbols, "words", are often treated as primitve symbols by fast readers, whose symbol-recognition wetware seems to recognise them in one swoop.)
        
        Now, one could argue "then why not use familiar symbols" - but I find using familiar symbols just because they're familiar to be a bit silly and often dangerously misleading. Think of the emotive symbols "theft" or "piracy" applied to "violation of copyright law", in reality a quite different concept. Or "=" used for both assignment and equality testing (arrgh!!!!), neither of which correspond to mathematics-=, which is a statement of equality.
        
        ***You should also consider other facts:
        ***
        
        ***the introduction of new members in the development team, especially if the new members don't know the language
        ***
        
        It is often better to just budget for bringing the new member up to speed in the language. Or just decide to only hire "speakers" of the language in question. You'll have to bring the new member up to speed on all the little pecularities of your codebase anyway, a much harder task for the new member than merely learning another new computer language.
        
        ***maintenance and service after installation; it may not be the original developer (who was so efficient in K) that maintains the project
        ***
        
        No, but one would hire or train someone capable of maintaining it. Here's usually where the strong arguments for using COBOL or Java come in - "what if one can't hire a K developer in 3 years?", and so on. But learning a new language should be VERY EASY for anyone who calls himself a "programmer" - the hard part will be understanding the code, not the language.
        
        ***readability; after long coding hours, ',' is easily mistaken for '.', for example; and with so many symbols packed together in such tiny space, the problem only gets worse.
        ***
        
        Yes, that is true - but, interestingly, the number of lines of code written by a programmer in a day stays roughly constant - regardless of language used - so the more verbose the language, the less your programmer is doing. And non-ASCII APLs, for example, have much easier to distinguish symbols than #\, and #\. :-)
        
        ***learning curve; much higher than a language based on English.
        ***
        
        Somewhat higher.
        
        Note that I consider "A language based on english" very different to a language in which symbols^Wkeywords happen to correspond somewhat to english words. There are very few programming languages based on english, in which the grammatical structure of the language corresponds closely to english. Many programming languages, however, have keywords based (loosely - printf ??? like "print" followed by a stifled sneeze...) on english words - but the symbols are strung together in ways that are very different to english, and have their own meanings in that language that are typically very different to their english meanings. You can english-open an english-door, but you can mainly only unix-open a unix-file. I won't mention "ontological commitments" right now.
        
        Once you know what "printf" does, you manipulate it as a whole in a c program, you don't spell it out each time you write it "p-r-i-n-t-f" spells "print". A Chinese person can write in C quite easily without knowing much English - once he knows what the symbol "printf" does in the context of C, he doesn't *need* to know that there is an english word "print" or that "f" is the first letter of the english word "file".
        
        So english-like symbols can indeed help in the discovery phase, when you are trying to guess what a symbol does - but so could spanish-like. Or chinese-like. RTFTM (Read-the-fucking-translated-manual) can help here - as a programmer you are manipulating symbols when you are writing a programs, and one of the things programmers spend most of their timing doing is looking up definitions of the meanings of symbols in a particular context - if the definition is in french, and you're french, you can use the symbol, even if the symbol doesn't look french.
        
        ***relevance to documentation/pseudocode; for example, it is much easier to make someone understand that the developed code follows the pseudocode defined in a project's specifications when the code is as close as possible to the English language than when the code is a bunch of symbols thrown together.
        ***
        
        Not if the documentation or pseudocode is in any language other than english. Note that I am European, so I am probably more used to a multilingual environment than you if you are American.
        
        ***Although Python/Lisp/C++/whatever are readable, this is because they are based on English. The more a language is based on English, the better it is for big project development.
        ***
        
        I would disagree e.g. Lisp is SO not based on english. The symbols defined by the language spec are. The "sentence structure" is almost completely alien.
        
        ***
        That's why every coding style says "use readable variable names"...***
        
        Yes - but readability depends on the reader. And I'd prefer "use meaningful variable names". So english variable names make sense on a big project, since chances are your reader speaks english. But would you suggest using "BEGIN" and "END" instead of "{" and "}" on the same project ???
        
        *** If things were as you claim, we all be programming in assembly or with 0 or 1s; after all, how hard to make a mistake with 2 symbols only :-) ?
        ***
        
        Not at all - you misrepresent me. That would be antithetical to forming abstractions via new symbols. (aside: don't forget, most stuff is in fact written in "portable assembler" like languages like C - most good assemblers allow macros and therefore the beginnings of higher-level languages -.e.g Amiga m68k macro assembler .i header includer files had a near 1:1 mapping to the Amiga C .h header include files, including macros for structs and so on.)
        
        ***Don't forget that Hypercard has been claimed as one of the best programming environments because of its ability to program almost in English.
        ***
        
        (a) Hypercard was claimed as one of the best programming environments for "non-programmers".
        (b) Hypercard was one of those few languages where the grammatical structure of the language, not just the keywords, are english-like.
        
        ***The attitude "symbols are ok, as long as I understand them" shows that you are ignorant of the issues of real development (with managers, deadlines, multiple and heterogenous environments, different coders, testers, bug reports and bug databases, etc).
        ***
        
        I assure you, that not my attitude. I am a professional developer and encounter all of the above issues on a daily basis). English-like symbols do indeed help when trying to understand a system. They can also mislead - a variable called "applecount" that holds a count of "all oranges, pears and apples since last tuesday" are very annoying, for example.
        
        My attitude is "symbols are ok, so long as the intended readership (including the computer :-) ) understands them". I include English-like symbols in that.
        
        ***Finally, the obfuscated C code contents would not exist if code readability did not play an important role!!!
        ***
        
        True. To make the point in my previous comment more concrete: But have you ever seen legalese or a patent document? They're supposed to be in english - all legalese looks obfuscated because they're trying to fit a precise layer on top of an imprecise natural language, and patents are deliberately obfuscated as a matter of course.
Mirror (Score:3, Informative)

by elnerdoricardo ( 637672 ) writes: on Thursday January 02, 2003 @05:26AM (#4997447)

Here, I put up a temporary mirror in case this site melts...
Coyote Gulch Mirror [elnerdoricardo.com]
Be gentle! I'm sure my server is meltable, too! ;)

Share
twitter facebook
Good. (Score:4, Interesting)

by neksys ( 87486 ) writes: <grphillips AT gmail DOT com> on Thursday January 02, 2003 @05:29AM (#4997455)

No one benchmark (or set of benchmarks) provides an absolute answer -- but information helps make reasonable decisions
Ah ha! Someone who understands what benchmarks are for and how to use them - it sometimes seems like the corporate world uses numbers from benchmarks only when they prove their claims. Of course, that's the difference between open source and the business world - open source (ideally) looks at every benchmark result and asks "now how can we get all of these numbers better than the competition?" while more traditional businesses ask, "Which of these numbers make our product look the best?". *shrug* its just nice to see benchmarks used properly, is all.

Share
twitter facebook
Intel C++ (Score:2, Interesting)

by jsse ( 254124 ) writes:

I always find Intel C++ shines in all benchmarks. I wondered if anyone has ever tried to compile linux out of it? I know it might hurt your ideology but just for the fun of it. :)
- Re:Intel C++ (Score:3, Informative)
  
  by Anonymous Coward writes:
  
  Intel have released a separate version of their C++ compiler for Linux, which they claim has good GNU C/C++ compatibility http://www.intel.com/software/products/compilers/c lin/. They say it can be used to build the Linux kernel with few modifications.
  - Re:Intel C++ (Score:2)
    
    by axxackall ( 579006 ) writes:
    
    Each time some people forget that Linux is not only for Intel platform - many users use Linux on Mac/IBM /Amiga PPC, on Sun Sparcs and on other hardware platforms.
    Intel C++ is only for x86, therefore it's for Linux/x86, not for Linux in general. Therefore, Intel C++ should not be used by developers who write the code for other linux users (for Linux in general). GCC must be used instead.
- Re:Intel C++ (Score:3, Informative)
  
  by 0x0d0a ( 568518 ) writes:
  
  It doesn't work. Linux uses gcc extensions. Plus, the number of compiler bugs Linux exposes means that running it under icc would probably involve fixing a bunch of icc bugs.
  
  And you'd probably have to fix about a zillion Linux bugs...
SPEC-FP for different compilers? (Score:3, Informative)

by billstewart ( 78916 ) writes: on Thursday January 02, 2003 @05:44AM (#4997473) Journal

The SPEC [spec.org] benchmarks are the descendants of the late-80s SPECmark benchmarking projects that did performance comparisons across a wide variety of machines and architectures, using code derived from real applications rather than purely synthetic little benchmarks like Dhrystone. Their benchmark suites were roughly 10 programs, with weightings on each program's results and scaling to compare with some popular architectures. They now have a variety of different benchmarks [spec.org], covering a range of types of applications, including floating point. The benchmarks have tended to be used by hardware manufacturers, so they'll usually have just one result for a given machine, with the options obviously tweaked for maximum performance, but the details are provided and sometimes there'll be tests using different compilers (e.g. because it's a compiler maker doing the test.)

The benchmark programs aren't free - this is a non-profit industry association that charges money to cover its costs, but there are a number of universities that are members or associates which may be able to do testing that could explore some of the compiler differences; poke around their website to see who's reported what kinds of results.

Share
twitter facebook
- weighted? (Score:2)
  
  by SHEENmaster ( 581283 ) writes:
  
  So how much will the beowulf project have to pay for parrallel tasking to rank over serial tasking in the tests!?
  
  This is a joke, not libel. Beowulf is too poor to bribe.
Numbers of some Interest. Conclusions a yawn. (Score:2)

by CresentCityRon ( 2570 ) writes:

Its nice that the writer went to all the trouble to work on this and share it.

His conclusions are not very revealing. Anyone doing Java programming will quickly discover how slow it is esp. in regards to floating point. You don't need a benchmark for that.

That C++ performed as well as Fortran on the author's examples is interesting to me. Is the C++ implementations getting better or is the Fortan complier gone soft?

His reasons for staying with Fortran made me chuckle. Those are the same answers programmers gave back in 1983 when I asked them why they didn't convert after learning the new fangled languages C and Lisp. (Well new for me anyway at the time.) When it is right it is right. I guess Fortran is still alright.
Java is slow? (Score:5, Interesting)

by nsample ( 261457 ) writes: <nsample AT stanford DOT edu> on Thursday January 02, 2003 @05:59AM (#4997497) Homepage

I'm hardly a Java junky, but I've spent a lot of time recently with the language and I've heard a lot of complaints from my peers about Java being slow. Most of the time, just like this author, they're wrong! Java isn't slow, but sometimes you do have to program more thoughtfully to make Java fast.

First things first, though. No one would ever claim that JDK 1.4 is the ultimate Java speed demon. Even the "HotSpot" in server mode is going to be slow if your code isn't written well. But the author fails to do any profiling, and fails to give anyone even a hint as to why Java doesn't perform well. But I shouldn't get on him about his coding, or lack of profiling... neither issue is the reason his test showed Java to be slow.

The real problem: Firs, I'll cut him some slack for not profiling. However, I won't cut him any slack for using an interpreter instead of a JIT compiler. Java's been shown time and time again to be as fast as FORTRAN/C++ when using a good compiler, rather than an interpreter. *sigh* When will the madness end? A 0.07 second query to Google should explain that one to even a novice. Java IS fast. Interpreted byte-code is slow. Java != interpreted byte code; Java is a language.

Anyway, here's a link to a weak, biased, and not so rigorous argument backing up that statement. But, it's an easy read for Java newbies, so I'll risk posting it anyway: Java is Fast for Dummies(tm) [javaworld.com]

Share
twitter facebook
- Re:Java is slow? (Score:2, Informative)
  
  by fobef ( 541536 ) writes:
  
  What do you mean by "using an interpreter"? He used Sun's run time, even with the -server switch, which does some for java quite serious optimizations. The real disadvantage java has (in terms of performance, it is an advantage otherwise) isn't the IEEE requirements, but rather the extensive use of runtime binding of classes. In C code, the compiler can inline pretty much anything for you, and get long runs of instructions to schedule as it likes. The other big performance disadvantage compared to C++ is that in C++ you can often have complete control over how arrays are laid out in memory, and test different ways to see which ones work best with your cache hierarchy. In Java you just 'new' it and then you have no control over it. I often hear that Java is about half the speed of C++, but my own tests put it more in the 10% to 20% range. Maybe half is true for poorly optimized C++ code vs poorly optimized Java code, but for well optimized C++ vs well optimized java the difference seems closer to five times. Which is fine to me most of the time, so I generally use Java.
  - Re:Java is slow? (Score:5, Informative)
    
    by AG ( 3175 ) writes: on Thursday January 02, 2003 @06:59AM (#4997586)
    
    gcj really is within 10% of g++ on this benchmark, unfortunately he built the gcj program without the all important -ffast-math option (and -funroll-loops). This is a huge penalty for gcj - more than 2x slower without.
    
    I sent him a note and hopefully he's update his page.
    
    Parent Share
    twitter facebook
  - Re:Java is slow? (Score:4, Insightful)
    
    by X ( 1235 ) writes: <x@xman.org> on Thursday January 02, 2003 @07:17AM (#4997622) Homepage Journal
    
    The -server option actually will imposive significant overhead for this benchmark. The -server option is not going to do any of it's significant optimisations without a TON of work.
    
    All your statements about C++ having an advantage over Java in terms of memory management are silly of course, since the Java runtime performs these exact kind of optimisations with Java programs. Because the decision is made at runtime rather than compile time it is actually possible for the Java runtime to make better optimisations than the C++ compiler/developer (who's decisions all have to be made a priori). I'm not saying this means Java always wins, because it most certainly does not, but I'm just saying that the "disadvantage" you are talking about is actually a misunderstanding of the conceptual differences between these two models.
    
    Parent Share
    twitter facebook
  - Re:Java is slow? (Score:4, Interesting)
    
    by X ( 1235 ) writes: <x@xman.org> on Thursday January 02, 2003 @07:23AM (#4997632) Homepage Journal
    
    For the record, I actually worked with the JPL evaluating Java's floating point performance. This was in the JDK 1.3 era, when HotSpot was still new. They had initially ported a highly optimised C library to Java and found the performance about in line with what this guy got (4-10x slower... actually it was an order of magnitude worse than this until they used the JIT ;-). The Java code showed many of the same performance errors that this guy's code has, as is common when you just do a line-by-line translation to Java, rather than rewriting the code from scratch. I did rewrite the code base, and managed to get the performance within the 10%-30% range. Using JDK 1.4 I'd have a few other tricks available to me which would probably get it even closer (maybe even faster).
    
    Parent Share
    twitter facebook
    - Re:Java is slow? (Score:2, Interesting)
      
      by fobef ( 541536 ) writes:
      
      This is interesting, but I'm still very sceptical.
      
      First of all, the possibility to align data to suit the cache is very very real and not silly. Of course it might not make a difference if you're memory bandwidth bound or whatever else is limiting performance, but for critical working sets close to the cache size it can, and most likely will, make tremendous difference.
      
      The advantage that Fortran has over C++, except for lots of well tested libraries, is the fact that pointers cannot interfere with each other. For example, writing to memory through pointer PA cannot affect a read from pointer PB, so the compiler can rearrange the read and write in any order. This thing alone made fortran faster than C++ on most math benchmarks.
      
      There now is a restrict keyword in C++ that does exactly the same thing, but what does java have? Nothing like that ofcourse, and IMHO it shouldn't. Because java is a language that makes for bug free software with good-enough performance.
      
      I have only written two performance critical programs both in java and c++, so my numbers are anecdotal evidence at best, but c++'s victory was devastating both times. And they where no line-by-line conversions, no unneccesary news etc. Of course they where a far cry from the "loop one million times and make a double precision fdiv" of the article linked in the original post in this thread where they found out that java and c++ was about equal. But then, so was the code you wrote, which makes it interesting.
      
      Is this thing you ported freely available somewhere?
      
      Also, how do you know that the -server switch hindered performance? It was better than -client, right? I assume it performs its optimizations when loading the class, and the only disadvantage compared to -client would be when inlined, runtime-bound methods have to be un-inlined because of a new class being new'ed in the place of an old object with different class (did that make sense?:-) So is this assumption wrong, or do you know that this is in fact happening in this benchmark?
      - Re:Java is slow? (Score:3, Interesting)
        
        by X ( 1235 ) writes:
        
        Okay, first of all, the Java VM is fully capable of doing cache friendly memory management... in fact it can actually discover things about the cache's behavior at run time that are not likely to be known a priori by the compiler/programmer. So it's actually quite reasonable for a Java VM to excel in this regard.
        
        The instruction reordering issues you're talking about tends to only factor in when you have vector-processing CPU's. The limited vector-processing on the CPU's being tested is not enough to break the compiler's back, so it's not surprising Fortran is performing similarly to C++.
        
        Sorry, but my code isn't available elsewhere. However, I point out a few things this guy got wrong in another post.
        
        The difference between -client vs. -server is actually probably the opposite of what you are imagining. -client actually performs more optimisations at class load time, but fewer optimisations overall. This is in recognition of the fact that client programs tend to run for shorter periods of times.
        
        The -server switch actually REDUCES the number of optimizations performed when loading a class. Instead, it adds extensive profiling to all code execution, and then the runtime will gradually optimise frequently executed code based on analysis of how it executes. The optimisations are introduced gradually, almost on an "as needed" type basis. This approach allows the JVM to perform optimisations which simply aren't practical for a compiler to do, because the compiler doesn't have access to the profiling information gathered at runtime. When faced without a JIT, many programmers effectively do peep-hole optimisations of this nature by hand, but they ultimately can't predict what future runtime environments are like, nor can they effectively perform global optimisations on large programs. Of course, there are other differences between -server and -client, but this is by far the most important one.
        
        Re:Java is slow? (Score:2)
        
        by X ( 1235 ) writes:
        
        Actually, some VM do cache friendly management. All of the VM's take care of basic byte-alignment issues.
        
        The instruction ordering for which you need explicit declaration of read/write barriers for are large matrix math operations, and even then it only matters if you've got huge amounts of data parallelism from a vector processor. Everything2 [everything2.com] has a better description than I can write. All of the modern JVM's currently do basic instruction reordering just like C++ compilers, and it's enough to keep the execution units in modern CPU's pretty busy. Things like the "restrict" keyword aren't going to help much with that.
        
        As for "-client vs. -server". This is really basic Java stuff. Check out Sun's own docs on the matter [sun.com]. Where to start. *BOTH* VM's are HotSpot VM's. Both behave the way you describe the client. The difference is that the client VM does less profiling of the code and performs fewer optimisations (actually, there are many more differences, but this is what's relevant to what you said). The benchmark taking a "couple of minutes" to run is not enough for the -server VM to really shine. It's designed for programs that at least run for hours.
        
        Finally, it's actually NOT the number of minutes that the code runs that really determines what optimisations are good/bad for performance. It's the number of times the same code is run. And while some of the code is undoubtedly getting optimised, it is not enough for it to get all the optimisations, and it is certainly not enough for the optimisations to outweigh the penalty for profiling that was incurred during the earlier execution.
        
        Re:Java is slow? (Score:2)
        
        by X ( 1235 ) writes:
        
        Sorry, to clarify, I understand that execution units in modern CPU's aren't very busy. However, this is for the most part not due to deficiencies in the expressiveness of programming languages/compilers. In general, most computer programmers have a fairly linear flow, and even if you wrote the code in assembler by hand you'd be hard pressed to keep the execution units busy all the time.
        
        The original poster was specifically talking about read/write barriers. Fortran has special features which allow the compiler to make some very impressive assumptions about read/write dependancies. These features allow for much more powerful instruction reordering than the kind of stuff you are talking about. Indeed, as he said, these optimisations aren't possible in C/C++ either without the use of the restrict keyword. As I said, the stuff you are talking about is entirely possible to do in a JIT, and is in fact being done in all the major JVM's.
- JIT? (Score:4, Informative)
  
  by EnglishTim ( 9662 ) writes: on Thursday January 02, 2003 @06:34AM (#4997543)
  
  The article claims that he *is* using a Just-In-Time compiler. What makes you think otherwise?
  
  Parent Share
  twitter facebook
  - Re:JIT? (Score:2)
    
    by Mansing ( 42708 ) writes:
    
    This statement from the author indicates he doesn't understand the JVM versus a JIT:
    
    "Perhaps Java's Just-in-Time compiler could be enhanced to perform processor-specific run-time optimizations; on the other hand, doing so would require different JVMs (Java Virtual Machines) for different architectures, or a single bloated JVM with run-time architecture detection."
    
    The Java compiler produces byte-code, and the JVM interprets the byte-codes. A JIT *is* platform specific, and tied to a specific OS and hardware architecture. That's why I think the author was not using a JIT.
- Re:Java is slow? (Score:3, Interesting)
  
  by nsample ( 261457 ) writes:
  
  Btw, I am dating myself with the griping about JIT versus purely interpreted and all that, but there is something important here! I decided that my first post was decidedly unclear, and that I should actually profile the dang that and get some real numbers.
  
  The almabench program spends a lot of time in library routines that the author has no control over, and aren't always written the same way that they are in FORTRAN/C/C++. For instance, almabench makes 5,844,000 calls to java.lang.Math.asin(D), which then calls java.lang.StrictMath.asin(D) 5,844,000 times. The same is true of the 11,688,000 calls to atan2()... they're also passed along to StrictMath (only abs() is called as many times as atan2()). The beauty of writing java code is *not* knowing that these sorts of things are going on, no? For best *performance*, however, we have to work a little harder.
  
  I really enjoyed the following paper on using some OO programming-style optimizations coupled with a smart runtime to get almost identical FORTRAN and Java performance for linpack: linpack9 [rice.edu].
  
  A look there will validate the other comment about why OO designs can unecessarily kill performance and why the study's author should have used a JIT with identical libraries for math functions rather than Math.X(). Comparing otherwise is like comparing apples and oranges, or even Apples(r) and BillG.
  - Re:Java is slow? (Score:2)
    
    by Trinition ( 114758 ) writes:
    
    For instance, almabench makes 5,844,000 calls to java.lang.Math.asin(D), which then calls java.lang.StrictMath.asin(D) 5,844,000 times. The same is true of the 11,688,000 calls to atan2()... they're also passed along to StrictMath (only abs() is called as many times as atan2()). The beauty of writing java code is *not* knowing that these sorts of things are going on, no? For best *performance*, however, we have to work a little harder
    Are you crazy?!. You think Java is supposed to let you get away with NOT knowing how your code works and interacts with the code it relies upon?
    Java, like any other language, needs to be scrutinized for performance problems. If you profiled and found that you code called java.lang.Math.asin, which in turn called java.lang.StrictMath.asin, then you should've re-written your code to use the second method directly. If it was in a third party piece of code, then you should've severely questioned the authors, and yoruself for using it. If it was in a benchmark,t hen you have to ask yorself if this benchmark is a good benchmark to use.
    I've encountered this myself several times. For instance, Ineeded to parse a big file based on lines and pipes (|). I figured it'd be easiest (i.e. fastest for me to get the code done) to just use java.util.StringTokenizer. When it turned out to be far too slow for what I needed, I profiled it. It turns out StringTokenizer is optimized to separate Strings into tokens when there are several possible tokens to separate on. In my case,once I rea da line, I just needed one delimitter. Hand-writing my own tokenizer for handling the special case of just one character provided far supoerior results and is in use today.
- Picking nits with that Java article (Score:2)
  
  by MichaelCrawford ( 610140 ) writes:
  
  The article compares loading graphics in BMP format for C++ vs. GIF or JPG format for Java.
  It doesn't seem to acknowledge the widespread availability of file format libraries that are suitable for use with C++, such as libjpeg from the Independent Jpeg Group [ijg.org].
  It also repeats the commonly stated claim that Java is free from memory leaks. Nothing could be further from the truth. While it is possible to carefully write a Java program that doesn't leak, I don't think it's any easier than making a leakproof C++ program.
  How does Java leak memory? Simple. Just hold references to memory you don't need anymore. Holding a reference to any node in a DOM Tree will prevent the entire tree from being garbage collected. That's an easy way to leak tens of megabytes.
  Garbage collection is no substitute for responsible memory management.
  It also says that in C++, you have to implement graphics calls by coding to the native API of the OS. It says in Java, you have a portable way to do that.
  But that's just bullshit. The article makes no mention of cross-platform application frameworks, a number of which were around before Java was ever dreamed off. If you use ZooLib [sourceforge.net], for example, to code in C++, you can do everything you want with graphics with no need for platform-specific code. There's lots of other cross-platform frameworks, such as wxWindows and Mozilla, even GTK+.
  - Re:Picking nits with that Java article (Score:3, Interesting)
    
    by X ( 1235 ) writes:
    
    There is no question that you can have a memory leak in a Java program. However, Java does remove a number of potential ways to create a memory leak that do occur in C++ (heck, 90% of C++ programs I see will leak heap memory if you have exceptions). Anyone who claims that "While it is possible to carefully write a Java program that doesn't leak, I don't think it's any easier than making a leakproof C++ program" either hasn't written a lot of C++, or hasn't written a lot of Java. ;-)
    - I've written a lot of both (Score:3, Informative)
      
      by MichaelCrawford ( 610140 ) writes:
      
      I prefer programming in C++ because I find it easier to just take responsibility for my memory.
      Using reference counting is one big help. And knowing how to write exception safe code is another. Yes, it's a difficult subject, but it is something you can learn.
      It may be easier to write leak-free Java than C++, but I suspect that because many Java programmers blithely assume their code can't leak, there may well be more leaky Java programs than C++ ones.
      I used to work at a web shop where we used the Enhydra Java application server. Enhydra is pretty well written, but the applications that the company originally developed for it were pretty sad. As a result they had to restart their servlet process every few hours because the JVM ran out of memory!
      - Re:I've written a lot of both (Score:2)
        
        by sql*kitten ( 1359 ) writes:
        
        I prefer programming in C++ because I find it easier to just take responsibility for my memory.
        
        auto_ptr [www.gotw.ca] is your friend :-)
    - Sun's Wireless toolkit (Score:3, Interesting)
      
      by FyRE666 ( 263011 ) writes:
      
      Anyone who's spent any time working with Sun's Wireless Toolkit to develop for mobile devices will have witnessed pretty serious memory leakage firsthand. I know I do! After starting an emulator 10-20 times to test code it's using so much RAM that it's necessary to kill and restart the Toolkit to get anything like reasonable performance back again!
      
      I'll agree that Java makes memory management much simpler, (I've spent a lot of time hacking x86 assembler, Pascal, C and C++ over the years) but bad programming can lead to leaks just as well. You tend to discover leaks pretty quickly with a mobile phone that has only 200K of RAM to play with though ;-)
    - Leaks in C++ (Score:3, Interesting)
      
      by Antity ( 214405 ) writes:
      
      (heck, 90% of C++ programs I see will leak heap memory if you have exceptions)
      
      In case any fellow beginning C++ programmer was wondering: This is because objects are only guaranteed to be destructed when an exception is thrown if and only if this exception is caught.
      
      So no matter what you're doing in your program, its main() should always look a bit like this to make sure every exception is caught:
      
      #include <stdexcept> #include <cstdlib> int main (int argc, char** argv) { int ret = EXIT_FAILURE; try { ret = do_something_in_my_code (argc, argv); } catch (const std::exception& e) { std::cerr << "Caught exception '" << e.what() << "' in main().\n"; } catch (...) { std::cerr << "Caught unknown exception in main(). Sorry.\n"; } return ret; }
      
      (Sorry, I can't seem to save indentation with /. HTML here)
      
      Please note that this way, you also always assure to have a valid shell return code (EXIT_SUCCESS from your function() would be the OK code).
      - Re: Leaks in C++ (Score:2)
        
        by Antity ( 214405 ) writes:
        
        This is utter nonsense. Any decent OS releases all memory used by a program when that program exits.
        
        This is utter nonsense. First, we're not only talking about memory here, the problem is all about freeing resources. Suppose your object acquired some resource on a remote host - maybe using RPC/UDP. When your program exits, no OS will free the object on the remote host and it will stay allocated until it times out. Bad.
        
        Second, if we're talking about freeing memory, you want to free it as soon as possible and not when the program ends.
        
        Then:
        
        it just shows the programmer ignorence of how memory handling in an OS works./
        
        No, this is how a PROGRAMMER should work. He shouldn't give a damn apple about how the underlying OS works.
        
        OS memory handling is irrelevant to the programmer.
      - Re: Leaks in C++ (Score:2)
        
        by X ( 1235 ) writes:
        
        Sorry, I misread your post. The problem you are talking about is real, but usually not an issue (because usually if you don't catch the exception the program exits, and all memory is recovered). I thought you were talking about a different problem.
- Java is really, really slow (Score:5, Insightful)
  
  by 0x0d0a ( 568518 ) writes: on Thursday January 02, 2003 @07:45AM (#4997673) Journal
  
  ...I've heard a lot of complaints from my peers about Java being slow.
  
  Allow me to join the chorus.
  
  Java isn't slow, but sometimes you do have to program more thoughtfully to make Java fast.
  
  No. Java *can* be made less mind-bogglingly slow by avoiding certain things...preallocating a pool of objects and using primitive types (like int) whenever possible helps. The way the language is designed makes it *easy* to be mind-bogglingly slow. That doesn't mean that going out of your way to avoid these things makes Java fast. It makes it only "slow".
  
  Java is Fast for Dummies
  
  Ah, yes. A link tellings how Java isn't *really* that slow on "javaworld.com". I took a skim.
  
  The first two pages say basically "Java isn't that slow". They then start rambling about various features that make Java a good language.
  
  They claim that Java programs load faster than native programs. (The article was written, BTW, in Feb '98, to give an idea of how full of BS they are). This is stupid. JVM startup and load time *dwarfs* application link time. Write "hello world" in C++ and in Java.
  
  First, they laud the small executable size of Java as being a performance boost based on binary format. Everything I've read points the *other* way...Java *is* fairly compact, but can contain data that isn't nicely aligned along host boundaries.
  
  Second, what they're talking about, if it's even accurate these days, which I doubt, has a lot to do with the lousiness of the Windows runtime linker. This isn't really an issue for Linux.
  
  Third, while insinuating that minimizing code size provides a performance boost, they talk about how great it is that Java lets you use *built in* libraries, whereas C++ progams need to *bundle* libraries. What? That's stupid. They're shifting the libraries around, but it sure as hell isn't decreasing total amount of data that needs to get loaded.
  
  Fourth, this gem: Finally, Java contains special libraries that support images and sound files in compressed formats, such as Joint Photographic Expert Group (JPEG) and Graphics Interchange Format (GIF) for images, and Adaptive m -law encoded (AU) for audio. In contrast, the only natively supported formats in Windows NT are uncompressed: bitmap (BMP) for images and wave (WAV) for audio. Compression can reduce the size of images by an order of magnitude and audio files by a factor of three. Additional class libraries are available if you want to add support for other graphics and sound formats.
  
  They're billing this as *improving* performance? Yeah, I'd love to have my app blow CPU time decompressing a JPEG image instead of reading a slightly larger BMP image if I'm trying to minimize load time. Oh, and have it load all the JPEG loading code, too.
  
  They then proceed to ramble about selective loading, and try to imply that Java's runtime linking is faster than C++'s.
  
  They *then* show off smaller binary sizes by embedding a BMP in the C++ binary and a GIF in the Java binary. Impressive.
  
  They then claim that claims of poor Java performance are based on non-JIT implementations. This neatly lets them avoid actually citing numbers. Sure, I'll agree that Java went from "Performance Hideous" to "Performance Bad". Everyone uses JIT these days, and damned if Java isn't *still* slow.
  
  They then try to talk about how JIT allows code to be optimized just like C++. Wow. Yup, JIT sure is known for impressive optimization, isn't it?
  
  They then use the most artificial, contrived benchmarks I've ever seen (which conveniently avoid almost all of the Java pitfalls...they don't need to do array access, they're trivial to implement without heap allocation...)
  
  They finish up talking about how C++ RTTI performance sucks compared to Java (ignoring the fact that Java hits RTTI code *far* more often than C++ does, like every time it yanks something out of a generic container class).
  
  Finally, they finish up by talking about a bunch of random Java features that they think are great, like garbage collection "First, your programs are virtually immune to memory leaks." Hope you don't use hash tables, buddy.
  
  Next, they talk about how a JVM can defrag memory. I'm going to have to just crack up at that. This isn't a performance boost unless you're using a language that *hideously* fragments memory and eats memory like a *beast* (granted, Java is the best candidate I know of). Runtime memory defragmentation went out of fashion with the classic Mac OS...it's pretty much a bad idea as long as you have a hard drive available. VM systems are pretty damn good these days...if you're trying to maximize performance, there are almost always better things to be doing than blowing cycles and bandwidth defragging memory. There's a reason we don't do it any more.
  
  Basically, my conclusion is that "Java is Fast for Dummies" is primarily aimed at, well, dummies.
  
  Parent Share
  twitter facebook
  - Re:Java is really, really slow (Score:4, Insightful)
    
    by 0x0d0a ( 568518 ) writes: on Thursday January 02, 2003 @07:51AM (#4997680) Journal
    
    Oh, and a follow-up to my previous post. These clowns spent a long time talking about how people can "ignore JIT overhead" because it's almost completely insignificant "most of the time". Fine. Then they spend 80% of the article talking about *binary load time*, which is essentially only an issue under the *exact* same conditions that the JIT is -- once for a single chunk of code. If they're pimping launch time, they sure as hell shouldn't be ignoring JIT time.
    
    Parent Share
    twitter facebook
  - Re:Java is really, really slow (Score:2)
    
    by Trinition ( 114758 ) writes:
    Here's some the "better" parts of Java:
    
    Rapid Development: because there is a huge core set of standard APIs, as compared to C++, any programmer can sit down with an expansive toolkit when writing a new app. This saves a lot of time by not having to reimplement those tools, or find 3rd party tools. Even if you have a set of 3rd party C++ libraries you like to work with, there is no guarantee that every 3rd party C++ library is going to work on every OS (I once worked in a shop compiling their code to 6 UNOIX platforms and NT -- what a pain!)
    
    HotSpot: As someone else pointed out, HotSpot will improve things at runtime, where you have a lot more information available to you, instead of just compile-time. The author of the article used HotSpot Server which is made to optimize under very different circumstances (for example, extremely long runtimes).
    
    Secure Software: By this, I mean avoiding things like buffer overruns. When you check your array bounds before accessing an array, you can be sure you won't overflow the array. Of course, it is slower. But even HotSpot optimizes a lot of this way (so I've heard, but I don't know exactly how). Sure, you *could* do this in any other language, but opbivously people don't. They get arrogant and assume that they'll never have bad data and just read to the end of a stream. Voila, you've just created a potential exploit!
    
    Everything Else: One of the big factors remaining is just HOW you write code in Java. As someoneelse pointed out, there are a ton of Objects in Java, (almost) every function is virtual, everything is linked dynamically, etc. These things slow Java down, btu also make it more uniform which makes it easier (faster) to learn, in my opinion. If you made every function in your C++ classes virtual, used RTTI and Strings to do runtime linking, etc. your C++ programs would be slower too!
    - Re:Java is really, really slow (Score:5, Interesting)
      
      by 0x0d0a ( 568518 ) writes: on Thursday January 02, 2003 @09:45AM (#4998001) Journal
      
      Here's some the "better" parts of Java:
      
      Don't get me wrong. Java's fine for certain applications. For lightweight networking stuff, I think it's almost unparalleled. It's also pretty good for prototyping C++ stuff. It's good for lightweight tasks that break down logically into threads -- Java has nice threading support.
      
      My beef is that Java is not, despite its supporters' loud claims (which have been going on for years), remotely performance-competitive with C.
      
      The language simply has some foundational performance limitations in it. It was designed that way, and tweaking implementations cannot get around that.
      
      I agree that there are some nice things about Java.
      
      Rapid Development
      
      Damn straight. Java is a great prototyping language.
      
      Hotspot
      
      Not bad, but not that incredible, either. The benchmarks I've seen haven't shown HotSpot to be incredible, and besides, competitors like C (gcc) have branch-profiling code of their own.
      
      Secure Software
      
      True. There are some improvements. But buffer overflows are less and less common in C (due to *excellent* libraries like glib), and have been fixed in other languages without anywhere near the performance hit of Java (like Ocaml).
      
      One of the big factors remaining is just HOW you write code in Java.
      
      It may be a personal thing, but I have a deep dislike of languages where you have to modify your regular coding style to get decent performance at a given point. It used to be BASICs...you'd use some nasty trick and you could actually get decent performance out of the thing. Then MATLAB. *God* I hate vectorizing operations. I expect that a MATLAB guru simply does this in his sleep, but I find it incredibly frusterating to totally rethink code in an any areas where performance matters.
      
      These things slow Java down, btu also make it more uniform which makes it easier (faster) to
      
      A fair number of the uniformity improvements in Java could have come from simply tweaking syntax (int[50] x instead of int x[50], for example).
      
      I'm all for modern language features...I just think that doing anything that implies a necessary performance hit is a bad idea. If someone wants a given feature, they can slap it on top. I can make C++ have a virtual function, but I can't make Java run quickly.
      
      If you made every function in your C++ classes virtual, used RTTI and Strings to do runtime linking, etc. your C++ programs would be slower too!
      
      Ya, but Stroustroup went to a lot of work to ensure that you only "pay for what you use".
      
      So, I'm not out to bash Java as a usable language. It has some major pluses. However, specifically in the performance arena, Java definitely has issues.
      
      Parent Share
      twitter facebook
  - - Re:Java is really, really slow (Score:3, Insightful)
      
      by 0x0d0a ( 568518 ) writes:
      
      Okay, for embedded systems there might be some argument. I actually considered consoles, but decided that no one in their right minds would be coding console games in Java. However, I suppose there are some embedded devices that run Java (don't some cell phones, IIRC), so I guess that isn't totally out of line.
      
      However, I stand by my claim that (at least on non-embedded systems) runtime memory defragmentation is pretty silly for general purpose stuff.
      
      1) Computers don't sit there and block on I/O any more. The world isn't Windows 95 anymore, and when something is paging, another app can be doing some processing. The cost of hitting the disk is far less significant.
      
      2) The bottleneck for current desktop systems is generally not CPU cycles, but memory bandwidth. Memory defragmentation eats bandwidth like there's no tomorrow.
      - Re:Java is really, really slow (Score:3, Informative)
        
        by X ( 1235 ) writes:
        
        Okay, first of all, for embedded systems this is actually very important. For any embedded system that has to run for years, one of the key issues is that it absolutely must use a heap manager that avoids long-term fragmentation (or have a static heap... but that's another story). Java happens to have a prepackaged solution for this. This, plus small executable size due to the use of byte codes is one of the reasons Java is used for embedded projects.
        
        Actually, the cost of hitting the disk is typically the most important impact on system peformance once you have an app who's working set can't fit into available RAM. That's because access for swapped virtual memory is orders of magnitude slower than virtual memory that in RAM. As you said, most systems today are not CPU limited, but rather memory limited, so switching to another app which also needs to access virtual memory is actually going to make things worse (just try running two programs who's working sets are larger than available RAM and you'll get the idea).
        
        As for memory defragging eating bandwidth like there's no tomorrow, that's only half of the picture. Good GC systems aren't constantly defragging memory at top speed, so the penality is frequently invisible. Smart systems GC when there is available CPU time and memory bandwidth (check out BEA's JVM for example). More importantly, by defragging intelligently you actually increase the chances that sequential memory accesses will hit the various memory cache's on the system (it's really nice when the cache lines are all filled with actual related data instead of unrelated fragments with gaps).
        
        The thing to keep in mind is that the people who write GC systems (or at least the good ones) aren't stupid about how they use memory. They spend a lot of time very carefully analysing memory management issues and put their findings into their system. As a result, most GC systems are better at doing memory management than your average programmer (although there are always special problems where a particular GC will perform particularly poorly).
  - - Re:With every GHz milestone... (Score:3, Interesting)
      
      by 0x0d0a ( 568518 ) writes:
      
      But with 90% of the code these days in business (which is the majority of custom code written in the world), Java's speed is acceptable and its portability and memory garbage collection outweigh any speed advantages of C/C++ in a Gigahertz world.
      
      Sure. Java's fine for doing a front end or simple database accessor. Hell, Hypercard was great for this.
      
      It's just not so great to do mainstream apps in, and people claiming that Java is "just as fast as C", which sometimes gets these people to waste time trying to implement their app in Java, gets my gander.
      
      Look at Corel's suite. They implemented the *entire stupid thing* in Java because it was going to be the "next big thing" and eventually be as "fast as C", according to the frenzied shouts of some of the Java supporters. Then they had to throw the whole thing out because of performance issues. I can't even imagine the cost, both direct and strategic (cost in time sitting on your ass while your competitor does something) to Corel. Same thing happened to Mozilla...the thing was originally going to be in Java. That idea got nixed...
      
      The other thing that vaguely pisses me off is that almost all of the things that Java does that make performance *suck* really aren't necessary. You can design a fast language that has most of its security model. Ocaml is quite safe -- moreso than Java -- but the Ocaml people imposed almost no overhead from C, because they avoided adding features that required runtime overhead, but went all out on things that could be somehow finagled into compile-time work. They *do* do most array bounds checking, but that's about it. Ocaml has GC, portability, and type safety...it just does almost all its work at compile time. I'd be willing to use Ocaml (well, actually I don't like functional languages much, but I'd endorse its performance) for almost any of the situations where I'd use C or C++.
      
      I'm not a huge fan of Ocaml itself, but it's a model for what *should* have been done with Java. If a feature is going to nail you performance-wise -- if users are going to have to do contortions to get decent performance -- you should think long and hard before adding it. Everything that sounded "cool" ended up in Java, which resulted in countless wasted CPU cycles for users around the world.
      
      I don't see why, as processors get faster, people feel the need to keep computers equivalently slow. Why not take advantage of the constant-time improvement? We can do better optimizations, do caching, because now we have the memory and extra constant time to do so. We have better body of knowledge about compilers, so why can't we *improve* performance instead of hurting it? It's *stupid*.
- He DID use a JIT compiler (Score:2)
  
  by joss ( 1346 ) writes:
  
  > using an interpreter instead of a JIT compiler
  
  No, he wasnt.
  
  > Java's been shown time and time again to be as fast as FORTRAN/C++ when using a good compiler, rather than an interpreter. *sigh*
  
  No it hasn't, because it isn't.
  
  You can sigh all you like, it doesnt change a thing.
  
  You acknowledge your link is weak and biased, why not try posting a reasonable link, particuarly one which seems to show that java is as fast as fortran or C++ for anything meaningful.
- Java is slow. PERIOD. (Score:2)
  
  by giel ( 554962 ) writes:
  As the author of the article states languages as Java and C++ contain a lot of functionality which has no additional use to the problems to be solved. I can imagine however some functional languages do.
  
  Object oriented languages do come with a lot of overhead eg. memory management and exception handling, where a language as Fortran is optimized for and has proven its power when it comes to complex calculations.
  
  As we all know should select the right language for the right job. Last year I've been working on an application which performs complicated flow calculations and presents the results through a web frontend. The easiest and (IMHO best) solution was to use three languages: Java, C and Fortran.
  
  Frontend Java (servlet), sending requests to a deamon (running on an another machine).
  
  Deamon C, receives requests, calls a Fortran function (which it was linked to) and returns the result.
  
  Calculations Fortran, for calculation precision and because the calculation libraries existed and were certified.
  
  Ah, I forgot to mention it uses SQL to gather data from Oracle.
  Seemed a reasonable solution to me. Offcourse I could have used Perl, assembler and Lisp too and Prolog to solve queries, or Word97 macro's combined with Occam (which is very good at parallel computing) or whatever.
  
  Sometimes discussions about Java start to look like discussions about religion. Bad. Scares the hell out of me.
Compiler for AMD processors? (Score:2, Interesting)

by kghougaard ( 315693 ) writes:

The article deals with Pentium systems. Is there an AMD optimized compiler out there also?
Can anyone suggest a good compiler for floting point number crunching on Athlon based systems? (For Linux, and preferably free or not too expensive)
- Re:Compiler for AMD processors? (Score:2)
  
  by guybarr ( 447727 ) writes:
  
  try intel's compiler ... it's rumored to get better results than gcc on AMD's machines as well.
  
  you can do a 30-days free evaluation, IIRC, so my advice would be:
  devellop using gcc/g++, then when you need to go to production, evaluate intel's compiler and see if it's good enough for your needs.
dubious (Score:2)

by guybarr ( 447727 ) writes:

He benchmarked gcc 3.2.1 and did not benchmark gcc 2.95.x . This in itself makes me doubt his work, especially if his C/C++ code is just C code "in disguise".

Plus, parallelization should not have been a part of such a benchmark.
Parallelization for a cluster _or_ a supercomputer has many issues
that have nothing to do with compiler/language performance, and has much to do with IPC, memory and cache hit/misses, and other pains, who provide many people with PHDs and careers. For a good reason.

I know he says it himself, but if you admit your results should not be taken seriously, don't publish them.

Object-oriented programming adds very little to the core functionality of a number-crunching application -- and performance is adversely affected by the overhead entailed in objects, exception handling, and other object-oriented facilities.

this means the man does not choose the right tool for the job. I agree that exception handling should not be used with numerical code, asserts work just fine. But objects and higher-level abstractions should very much be used in the 90% of the code which takes 10% of the runtime but still, say, >80% of development time ...

And don't let me get started on templates, which really made my work much easier when used properly.

And in many cases, function-oriented code exceeds the clarity of corresponding object-oriented code

as for function-oriented code supposedly exceeding clarity of OO code, I find this remark, Ahem, embarrassing. If your OO code looks less clear than functional code, you have quite a problem. It means you do not use abstractions to ease design, reduce code-redundancy and reduce module interdependancies.

It means, in short, that you are going through the motions, but do not think on what you do.
- Re:dubious (Score:2)
  
  by joib ( 70841 ) writes:
  
  He benchmarked gcc 3.2.1 and did not benchmark gcc 2.95.x...
  
  Why should he have benchmarked gcc 2.95? What critical information would that provide? Other reports seem to indicate that compared to gcc 3.2, 2.95 is slightly slower, especially with processor-specific optimizations, and it has poor support for C99 and C++98. Nothing new or especially interresting here.
  
  Plus, parallelization should not have been a part of such a benchmark.
  Parallelization for a cluster _or_ a supercomputer has many issues...
  
  He didn't claim to test parallalization for supercomputers, he just tested to see if the automatic parallelization and openmp directives helped with the intel compiler on the p4 hyperthreading cpu. Automatic parallelization has a lot to do with the compilers ability to analyze the structure of the code and determine which parts can be run in parallel and which parts cannot. And as you can see from the results, in this case at least, there was little benefit from it.
  
  OO blah blah..
  
  I agree with you that OO is beneficial (for real life projects). OTOH, if you look at his benchmark code it's just a few pages. For a code which is that small there is probably little benefit from OO. And thus the c++ and java versions use no OO features, rightly so IMHO.
IBM java runtime and different platform. (Score:2, Informative)

by Anonymous Coward writes:

Here are some results from an AMD system and the IBM java runtime:

Athlon XP 1600
-mcpu=athlon-xp
-O3
-mmmx
-m3dnow

IBM java - build 1.4.0, J2RE 1.4.0 IBM build cxia32140-20020917a
58.8 seconds
58.9

Sun java - build 1.4.1_01-b01 Client VM
94.2
94.1

Sun java - build 1.4.1_01-b01 Server VM
82.1
81.6

gcj - version 3.2
82
82.1

gcc - version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
31.4
31.3
26.6 -ffast-math
26.6 -ffast-math
26.4 -ffast-math -funroll-loops
26.4 -ffast-math -funroll-loops
gcj results incorrect - 2x worse than truth (Score:4, Insightful)

by AG ( 3175 ) writes: on Thursday January 02, 2003 @06:52AM (#4997572)

Compiling his benchmark with -ffast-math and -funroll-loops more than doubles the performance of the gcj built benchmark on my P3.

This brings it within spitting distance of g++.

Share
twitter facebook
- Re:gcj results incorrect - 2x worse than truth (Score:2)
  
  by EmagGeek ( 574360 ) writes:
  
  -ffast-math gives up IEEE compliance and some accuracy for speed.... which is probably why most folks don't use it for real applications..
He didn't include python (Score:5, Interesting)

by more ( 452266 ) writes: on Thursday January 02, 2003 @06:55AM (#4997577)

Results on My P4 1.5 GHz, RedHat 8.0, gcc 3.2
time python -O almabench.py user: 22m19.354s
gcc -ffast-math -O3 almabench.cpp -lm time ./a.out user: 0m50.348s
C++ is only 27 times faster than Python for planetary simulations.
Almabench.py is my own conversion from the cpp source. I will send it to the author for possible addition to the benchmark.

Share
twitter facebook
- GCC optimization (Score:2)
  
  by Antity ( 214405 ) writes:
  Since more x86 benchmarks have been posted, people might have a look at some hints for optimization. I will only list some options important for Athlons ATM:
  
  -march=athlon-xp Yes, gcc-3.2 not only includes "athlon" but also "athlon-xp" and even more finer-graded pentium architectures. Use them! (implies -mcpu)
  
  -O9 Always use it. I've yet to see miscompiled code with gcc-3.2 and -O9.
  
  -fno-rtti / -fno-exceptions Only if you know the program isn't using them.
  
  -ffast-math "Simplifies" some assumptions the compiler makes over your FPU code. Could lead to slightly inaccurate results (down the commata..), but usually works great, even with xaos and co and gives a huge speed boost
  
  -fno-math-errno If you don't use errno after math operations at all
  
  -funroll-loops / -funroll-all-loops Well.. unrolls some (1st) or all (2nd) of your loops. The first one is more sane and already included by many -O levels
  
  -fprefetch-loop-arrays Important! On newer chips (Athlon/Pentium) this will generate instructions to prefetch memory before / while in a loop. This really works.
  
  -malign-double Aligns doubles on 8 byte boundaries (instead of 4 byte boundaries). It's a must on newer x86s but it breaks compatibility with many precompiled libraries. Just try it. If it breaks, you have to recompile all libraries used with this option (which is a pain in the ass with Glib/GDK/GTK+ etc)
  
  -mfpmath=sse Lets the compiler use SSE fp math instead of i387 math. pros and cons... it's worth a try for your application.
  
  Last, but not least: -fssa This is still experimental! in gcc-3.2, but after applying all other options for almabench it gave me another full 1% of speed increase. Worth a look, but expect broken code to be generated.
  
  More in your favorite GCC docs.
Fortran compilers and Linux (Score:4, Interesting)

by EmagGeek ( 574360 ) writes: on Thursday January 02, 2003 @07:16AM (#4997620) Journal

Here [polyhedron.com] is a more in-depth comparison of Fortran 90 compilers for linux. They [polyhedron.com] compared Intel, NAG, Lahey, and a couple of other compilers. Here [polyhedron.com] is a comparison of Fortran 77 compilers from the same folks. GNU g77 is actually the slowest of them all, and I've actually confirmed that it is the slowest of a group consisting of DEC/Win32, Lahey/Linux, and g77. I've always dreamed of the day that open source developers would throw some real brainweight at a really well optimized Fortran compiler for linux, but it looks like I'll just have to keep dreaming. Lahey is only $199 or so, but they place some HORRIBLE licensing restrictions on the binaries that are created with their compiler. The DEC/Win32 compiler is also really nice, but since I'm not in school [gatech.edu] anymore, I'm not licensed to use it, and even if I _wanted_ to whore myself out to Micro$oft, I couldn't afford to.
Just to put some things into perspective, here are some numerical results. These were obtained on my dual-athlon 1.4GHz w/ 1GB of RAM. The task was to compute the TE and TM surface currents induced on a circular cylinder 10 wavelengths in circumferece and having a relative permittivity equal to 4-j2. The program simultaneously solves the perfect electric conducting case. The surface was discretized into 60 cells using 120 unknowns (way overkill, but just to prove the point) using the Integral Equation Asymptotic Phase [ie-ap.org] method.
g77 Compiler (-O2 -malign-double -funroll-loops): 24.11s
Lahey Compiler (equivalent paramters): 16.45s
As you can see, there's really no comparison, except that the lahey-created binary uses about 10% more RAM than does the one created with g77. This is just a summary comparison as I did not go into measuring the difference in the error of the two results compared to a reference solution. I'm assuming that both solutions are about the same with regard to accuracy.

Share
twitter facebook
IBM JDK 1.3.1 (Score:3, Informative)

by MarkoNo5 ( 139955 ) writes: <MarkovanDooren@gm[ ].com ['ail' in gap]> on Thursday January 02, 2003 @07:33AM (#4997650)

When I run my QR decomposition "benchmark", IBM's virtual machine always comes out about 20% faster than Sun's. It would still be a lot slower than C++ or Fortran, but the gap should be smaller. On top of that, IBM's license does not require you to accept a license which says the VM may install any software on your machine and that you automatically accept the license of that new software. See http://hal.trinhall.cam.ac.uk/~nrs27/java_eula.htm l [cam.ac.uk] for all the fun.

Share
twitter facebook
Some java results (Score:2)

by LadyLucky ( 546115 ) writes:
I thought I would try out a few results on my machine, and got some interesting results. I would encourage other people to do the same. I have used his java code, unmodified, save for a System.out.println the time taken.
My computer is a 900MHz Athlon, 256Mb RAM, running WinXP. Using J2SDK 1.4.1_01.
javac -g:none -O *.java
java -server almabench Took a total of 107555ms
jview almabench Took a total of 51073ms
Compare this with 76 seconds on his 2.8 GHz machine. JView (that's Microsoft's VM) was able to run in 51 seconds.
The two things of note:
- My results are significantly faster than his.
- Jview ran at twice the pace of Sun's latest offering. Euch.
Unfortunatly I have no c compiler on this machine to run them, that would be really interesting.
- - Re:Some java results (Score:2)
    
    by ChaoticCoyote ( 195677 ) writes:
    
    Just because Java is a great tool doesn't mean it is the best tool for every job.
    
    I only included the -server option because people would've bitched if I hadn't. I'm well-aware that the server JVM is optimized for persistent processes.
    
    Everyone wants me to include the IBM JDK, so I'll do that later today.
    - Re:Some java results (Score:2)
      
      by X ( 1235 ) writes:
      
      Hey, I'd be the first to admit that for the job you're doing in this benchmark Java is a poor tool for the task, and that there are plenty of other tasks for which it is ill suited for.
      
      However, the conclusions you draw from this are that Java is a poor tool for a much broader set of applications and circumstances than your benchmark accurately measures.
      
      Certainly, lots of floating point calculations are highly persistent, and take *days* to execute. In many cases, even if they don't take days to execute, the same code is executed with different datasets. These are the kind of problems for with the Sun's Java VM is capable of approaching Fortran and C++ performance. It's known that the optimisations which are applied in these circumstances can improve execution speed by several times. Based on that it's totaly unreasonable to use your benchmark to draw conclusions about Java's overall abilities in high-performance settings.
      
      Your test also times everything including load time for the program. Because you don't isolate out just the floating-point calculations of the code, you don't have adequate controls on the other variables in the execution of your program.
      
      You conclude that Java does not take advantage of processor specific optimisations (it actually does have processor specific optimisations). I don't see how this conclusion could be based on your tests, particularly since the Sun JVM demonstrated significantly more of a performance improvement on the Pentium IV box than gcc or gcj were able to show (indeed they were even better than the performance improvements with the "default" builds with the Intel compilers).
      
      The infamous "lies, damn lies, and benchmarks" (paraphrasing a bit) is not so much about the benchmarks themselves, but how they tend to be used innappropriately to make generalisations well beyond what they accurately demonstrate.
      - Re:Some java results (Score:2)
        
        by ChaoticCoyote ( 195677 ) writes:
        
        You incorrectly assert:
        
        ...the conclusions you draw from this are that Java is a poor tool for a much broader set of applications and circumstances than your benchmark accurately measures.
        
        If you read the conclusion in my article, you'll see the following paragraph:
        
        This benchmark does not imply that Java is useless, or that Fortran is the world's best programming language, or that compiled languages are superior in all cases to byte-code interpreters. I would no more write a sophisticated graphical application in Fortran than I would analyze physics experiments in Java. Java is a terrific tool for building user interfaces; I enjoy working with Swing, and am in the midst of developing a large vertical-market database application in Java. Be that as it may, I'm exploring number crunching in this article, not buttons and mouse movements.
Development time vs. run time (Score:2, Interesting)

by Harald74 ( 40901 ) writes:

In the article the author writes:

Object-oriented programming adds very little to the core functionality of a number-crunching application -- and performance is adversely affected by the overhead entailed in objects, exception handling, and other object-oriented facilities.

Using object orientation in C++ adds a very small overhead (in the vicinity of 10% if you're using virtual functions). Now I understand that the article was mostly concerned with benchmarking the languages, and I applaud the writer for specifiying that benchmarking is no "silver bullet". But I really want to stress:

Somebody has to develop and maintain the freaking program too!

Object orientation helps anything but the most trivial of applications to acvhieve better modularity and reuseability. Anyway your program is going to spend most of its time in development, not running, so anything that can help that process along is going to be a big help to your project. Please check out the benefits of the SW industries "best practices" [google.com] and apply them to your project.

You will save days and months of development time, during which you can run your finished program to your heart's content.
- Re:Development time vs. run time (Score:4, Insightful)
  
  by ChaoticCoyote ( 195677 ) writes: on Thursday January 02, 2003 @10:36AM (#4998277) Homepage
  
  Using Object-Oriented constructs is no guarantee that a program is maintainable or even readable. I have seen some horrifying OOP code in my life, written by people so enamoured of syntax that they drown theircode in it.
  
  In numerical applications, and extra 10% can be the difference between success and failure. I'm corresponding with a fellow who works in meteorology; his company uses commodity boxes to compete with government-funded monopolies. For him, the ability to gain 10% is crucial.
  
  I am all in favor of object-oriented programming -- but my philosophy matches that of Bjarne Stroustrup, who refers to his language as a having "multiple paradigms." Use OO when it makes sense -- but use the right tool for the task at hand. C++ does not force you to use OOP when it doesn't make sense.
  
  Many numerical applications make mroe sense when using short variable names (that match formulas in texts) and a function-based approach (again, matching mathematical idiom).
  
  Parent Share
  twitter facebook
- Re:Development time vs. run time (Score:2)
  
  by budGibson ( 18631 ) writes:
  
  Anyway your program is going to spend most of its time in development, not running, so anything that can help that process along is going to be a big help to your project.
  
  This is true, but in production use, code performance is critical. If you can cut execution time by a factor of two or more for any given hardware configuration, it can mean the difference between feasible and infeasible. For instance collaborative filtering on the web.
  
  A lot of optimizations in numerical computing languages involve removing OO features (e.g., matlab). Numerical computing seems to focus on vectorization (re-representing all problems in terms of vectors), a sort of one-size-fits-all data representation, that can than be used in highly optimized algorithms. At the end of the day, one has to wonder about the usefulness of the OO paradigm for heavy duty numerical computing.
My Duron 1.1GHz must have a built-in Java VM ! (Score:2, Interesting)

by MarkoNo5 ( 139955 ) writes:

Using both Sun's and IBM's Java virtual machines I get the following results from 'time'.
real 0m35.173s
user 0m35.140s
sys 0m0.030s

The results vary from 35.130 to 35.160. When I run the c++ test with the following compiler options : -march=i686 -mcpu=i686 -O (somehow the mmx and sse option are rejected)
real 0m43.467s
user 0m41.790s
sys 0m0.010s

Can somebody explain this please ?
This guy just doesn't get Java (Score:2)

by X ( 1235 ) writes:

He complains about the Java compilers complete lack of optimisation flags (duh... making optimisations to platform independant byte code is pretty useless, and generally counter productive... it's all done by the JRE, which actually has a number of flags for tweaking things, and there are many JRE's to choose from). He talks about how the JIT doesn't perform processor specific optimisations (it does in fact perform many processor specific optimisations, unfortunately his benchmark is written in such a way that none of them will get used). He talks about there being no interest in high-performance Java (the Java Grande [javagrande.org] group would beg to differ). Best of all, he keeps calling Java an interpretted language... even though he used gcj as part of his own benchmarks... Sigh.

Worst of all he uses gcj without trying out TowerJ, which is a much more established Java-binary compiler. Sigh.
- I can only try... (Score:4, Informative)
  
  by ChaoticCoyote ( 195677 ) writes: on Thursday January 02, 2003 @10:28AM (#4998236) Homepage
  
  ...the tools I have at hand. I have nothing against TowerJ, but can't test it if I don't own it. As for Java, I made note of a lack of flags, which is different from complaining. The -O flag is no longer supported by Sun's JDK according to the documentation.
  
  Parent Share
  twitter facebook
  - Re:I can only try... (Score:3, Informative)
    
    by X ( 1235 ) writes:
    
    Actually, you can test it even if you don't own it. They provide a free trial download right off their home page [towerj.com].
    
    The -O flag wasn't doing anything that improved performance on a JIT'd system, which is why it was removed. The problem is you're looking for optimisation flags in the wrong space. As long as the compiler is outputing byte codes any performance optimisations it might do are actually quite likely to make things worse. Admittedly, the SunVM doesn't have much in the way of floating point optimisation flags, but that is where performance optimisations kick in for JIT'd runtimes.
    - An old policy (Score:4, Informative)
      
      by ChaoticCoyote ( 195677 ) writes: on Thursday January 02, 2003 @11:29AM (#4998556) Homepage
      
      Back when I wrote reviews for print magazines (back when there were print magazines, that is), it was standard policy to limit reviews to the actually, shipping, commercial product. Demos often lack critical features (like optimizers) or are tuned for benchmark tests, so I've kept to that policy now that I write reviews for my own web site.
      
      I own licenses for the Intel compilers, for example -- and, of course, gcc and Sun Java don't cost anything in the first place. I'm considering my options in this case; long gone are the days when a dozen Fortran or C compilers would arrive at my door for a magazine review. Heck, there aren't a dozen compiler vendors left... ;)
      
      Parent Share
      twitter facebook
      - Re:An old policy (Score:2)
        
        by X ( 1235 ) writes:
        
        In TowerJ's case, the only difference from the production product is the limited time period for use (15 days). However, I checked it out myself and it appears that if you click through they've actually recalled their downloads program. Now you have to register to get to the code (and they may not even give it to you then).
        
        I can tell you from past experience though that it pretty much killed gcj in performance for this kind of stuff.
Comments by the Author (Score:5, Informative)

by ChaoticCoyote ( 195677 ) writes: on Thursday January 02, 2003 @09:54AM (#4998043) Homepage

Almost *ALL* of my email is related to Java. I'll be adding the IBM JDK and older versions of the Sun JDK later today, as per reader request.

I'm making minor updates to the article as the day passes. I appreciate comments from everyone; once I'm through with my e-mail, I'll respond to these Slashdot comments.

Additional benchmarks will be added to the article with time; I'm putting together a single-precision ("float") benchamrk, for example.

Share
twitter facebook
awk (Score:4, Funny)

by Charles Dodgeson ( 248492 ) writes: <jeffrey@goldmark.org> on Thursday January 02, 2003 @11:16AM (#4998463) Homepage Journal

Surely people remember that most excellent O'Reilly book, Numerical Recipes in AWK. Unfortunately the 1998 review of it has disappeared [segfault.org].

Share
twitter facebook
Which is faster, a Ferrari or a snail? (Score:2)

by kfg ( 145172 ) writes:

The answer is:

You haven't got an fscking clue Sparky.

I haven't given you any indication as to what sort of *conditions* performance will be measured under.

The author of this benchmark test seems to understand this. Did anybody wailing on him for failures of the tests bother to read his benchmark caveat?

All a benchmark test does is give you some understanding of how various things perform under the *precise* conditions applied during the test.

The real value of benchmarking only comes after performing hundreds of tests under hundreds of variants of enviromental conditions.

Doing so will increase your *human* wisdom about how certain things perform. Performance is a *value* judgement, and any human who abrogates their value judgment to a machine probably gets what they deserve.

Too bad that sort of shit usually rolls downhill though.

KFG
- Re:Which is faster, a Ferrari or a snail? (Score:2)
  
  by ChaoticCoyote ( 195677 ) writes:
  
  Thank you.
The guy is ignorant about Java (Score:3, Interesting)

by mike_malek ( 82254 ) writes: on Thursday January 02, 2003 @12:57PM (#4999070)

First off, I have interned three times at Sun, working on virtual machines. So I know a fair bit about VM's, and their run-time and processor-specific optimizations.
"it has always been clear that Java is inferior to native code applications in terms of raw power. Computational performance depends on the effective use of a target processor's instruction set; such optimization can not be expected of Java given its goals of universality and portability."
This statement is simply not true! The portability nature of Java does not conflict with a virtual machine that does processor-specific optimizations. Take a look at Sun's HotSpot VM source code (it's publicly available!) In the IA32-specific code, you'll see lots of run-time switches to enable specific P4 optimizations, for example.
"Perhaps Java's Just-in-Time compiler could be enhanced to perform processor-specific run-time optimizations; on the other hand, doing so would require different JVMs (Java Virtual Machines) for different architectures, or a single bloated JVM with run-time architecture detection."
This already exists in the current HotSpot VM. There's an IA32 binary, which includes optimizations for several versions of IA32. It does not include PowerPC or SPARC code, as that's in a different binary.
" The "ia32" world is already fragmented between Pentium III, Pentium IV, Itanium, Athlon, and Opteron architectures, each having unique requirements for optimization;"
That's the challenges a VM writer has to deal with. And the HotSpot team did a great job in managing this complexity.
In the future, if you (or anyone else, for the matter) takes the time to write a paper, you should do more research. Some of the statements above are simply misleading.

Share
twitter facebook
A Few Other Fast Ones (Score:3, Informative)

by Lucas Membrane ( 524640 ) writes: on Thursday January 02, 2003 @06:21PM (#5002018)

OCaml -- if you can do tail-recursive list processing instead of array processing, this is just about neck and neck with anything else. Arrays are slower.
Corman Lisp and MLton -- Other fast functional languages.
GNAT Ada -- Runs pretty close to C/C++ speed. Sometimes faster if the compiler can do expeditious compile-time optimizations.

Share
twitter facebook
Article Updated; My Comments on Your Comments! (Score:3, Informative)

by ChaoticCoyote ( 195677 ) writes: on Thursday January 02, 2003 @09:56PM (#5003491) Homepage

The original article generated an exceptional collection of interesting and helpful of responses. In this section, I'll run down the important points people made.

Note: This is a duplicate of a new section I added to the review; I'm posting it here for posterity.
An overwhelming number of people suggested that I include results for IBM JDKs -- and I have. In fact, I've added results for Sun JDK 1.3.1_06, IBM JDK 1.4.0, and IBM 1.3.1 RC3. Adding these JVMs made a significant difference in the results, cutting runtimes in half. On the other hand, almabench runs as an infinite loop with Sun's 1.3.1 JVM (it starts, but never finishes). Note that I recompiled almabench with the corresponding javac for each JDK, so the JVMs were executing code generated by their corresponding compiler.

The problem with Java's performance is not my code or my lack of Java skills -- what real problem is that Java 1.4 is slow. Both the Sun and IBM JVMs lost significant performance in the move from 1.3.1 to 1.4, whether due to a new language requirement or other factors. My faith in Java is severely shaken when applications lose significant performance by upgrading to the current release of Java.

Java 1.4 added many new features to the language and packages; however, changing from version 1.3 to 1.4 should not double run-times! Nor am I comforted by the problems of Sun's 1.3.1 server JVM.

Given the nature of the problem, my conclusions about Java stand (albeit slightly softened). By Sun's own definition, JDK 1.3.1 is obsolete; the fact that it performs better than the most current JDK is indicative of a serious problem with Sun's improvements to their language. Since Java 1.4.1 is what Sun is promoting, so that is what I base my conclusion on. I can say that IBM's product is superior, and have already set it is my default JDK. It's no wonder Sun is upset about IBM usurping Java -- IBM is producing better tools.

Some people asked if my Java results were biased by the amount of time it takes to load the JVM. I've tested several empty and near-empty applications; a Hello, world program, for example loads and runs in less than 2/10ths of a second -- hardly significant. The start-up time increases with the number of imports -- but almabench.java imports only one small class package, java.math.*, which (in my tests) does not impose measurable overhead.

I did not include any commercial Java compilers. Most, like the oft-cited Excelsior JET [excelsior-usa.com], are Windows-only; this article is about Linux. I don't benchmark Visual C++ or C# for the same reason (although I will look at Mono [go-mono.com] and C# some time in the future). The free version of Borland C++ does not include a complete optimizer, so I don't think it fits in this review.

How do I know that the programs are producing the correct output? Each program includes code to display results; I run the programs with I/O to ensure that all calculations are being performed, then I comment out any header inclusions, imports, and print statements for actual testing.

How am I timing the results? With the Linux time command. Table 3 reports the real value reported by time (the elapsed time of execution.) Embedding timers in the actual code is fraught with problems; for example, each language implements different time scales and abilities. I'm sure someone will tell me that time is full of problems too, but it works for me and is consistent across all programming languages.

Amid the barrage of Java-related comments, a few people actually noticed the Fortran code. I am looking at other Fortran compilers for future updates. As for GNU g77 -- I wrote the code in Fortran 95 because I find Fortran 77 to be annoying. I wrote piles of Fortran 77 back in my CDC and VAX days, but these days I'm writing for environments where Fortran 95 is more appropriate. Believe it or not, Fortran 95 is a very clean, orderly language that eliminates many Fortran 77 idiosyncrasies while adding features important for high-performance coding.

Share
twitter facebook
- Re:Java performance (Score:2)
  
  by LinuxGeek ( 6139 ) writes:
  
  Hmmm.. On WinXP with J2SDK1.4.1, almabench ran in 80.13 seconds on a 1.4GHz AMD TBird. I'll have to boot back into Linux to compare jdk speed. That compares nicely to the P-IV 2.8GHz numbers for java.
- Re:What is it with windows-using losers (Score:2, Interesting)
  
  by InfiniteWisdom ( 530090 ) writes:
  
  With numerical code there will be little difference regardless of which version of Linux your use or if you use Windows for that matter
- Re:fp (Score:3, Interesting)
  
  by mvw ( 2916 ) writes:
  
  Yes, indeed, what about fp (functional programming) and numerics? :)
  
  Funny thing is that the fp people have invested lots of brainpower into advanced functional programming techniques. Symbolic and logics math, yes I have seen software for it. But numerics?
  
  Is it possible that a functional language beats FORTRAN eg in eigenvector calculations?
  
  Regards,
  Marc
  - Re:fp (Score:3, Interesting)
    
    by tigersha ( 151319 ) writes:
    
    Yes, look on Google for the language SISAL. That apparently beats Fortran.
    
    Sisal was developed at Los Alamos IIRC and those dudes have some of the fastest supercomputers in the world so they should know.
    
    The thing with FL's is not to use a lazy language, but a stract one (SISAL being a good example) should be able to be optimized quite heavily.
  - Functional Programming (Score:2)
    
    by ChaoticCoyote ( 195677 ) writes:
    
    I'm very interested in investigating the efficacy of functional programming for numerics. I've dabbled in Erlang and Haskell a bit, but haven't had an opportuntiy to study them in depth.
    
    So many tools, so little time to understand them...
    - Re:Functional Programming (Score:2)
      
      by certron ( 57841 ) writes:
      
      Well, I was searching around for a good place to put this, making it a reply to the author of the article seems as good a place as any :-)
      
      When I was poking around for something to let me learn octave, I came across this:
      http://www.bagley.org/~doug/shootout/bench/ matrix/
      
      Perhaps it will be useful. It contains some code, too.
  - Re:fp (Score:3, Informative)
    
    by JoeBuck ( 7947 ) writes:
    
    For the specific case of algorithms that can be expressed strictly in terms of bounded loops where the loop bounds can all be determined at compile time, so that there are no run-time tests needed to determine if some computation must be performed or not, functional programming styles can be near-optimal. Analysis techniques can radically restructure such programs, completely reorganizing the loop nesting.
    There have been a variety of stream-oriented or single-assignment languages to make such things possible: Silage and DFL, Lucid, Lustre, Sisal, and others. You can get very good code from such languages, but they aren't very general.
- Re:Lies, damn lies, and people who don't know... (Score:2)
  
  by toriver ( 11308 ) writes:
  
  For starters, he's using 2-dimensional arrays, so of course Java is going to be slower. 2-dimensional arrays in C++ are actually 1-dimensional arrays with some syntactic sugar thrown in.
  Also, the arrays are initiated element by element since Java doesn't have a "data segment". If he measures time outside the program from start to finish he will get VM startup time etc. in addition to that.
  As such, it doesn't play very well against the JVM's optimiser, which is oriented towards Java code with much different structure.
  Well, at least the "meat" of the code is in methods which are called often enough that some optimizations can be thrown in. There is also a curious absence of the "strictfp" keyword that might improve things.
  Now, if he had taken System.currentTimeMillis() before the loop and after and computed the difference, a more relevant number would emerge...
- People who don't know (Score:4, Insightful)
  
  by Raedwald ( 567500 ) writes: on Thursday January 02, 2003 @10:13AM (#4998144)
  The C++ code does not use any object-oriented or exception-handling features of C++; essentially, this is a C program with minor C++ convenience features
  
  God, it feels like I've spent most of my professional life arguing with Fortran programmers. These people are ignorant, but arrogant. They think that because they have a Phd in Engineering (or Physics, or whatever) and can produce a syntactically-correct Fortran program, they know how to program, and can ignore advice backed up by thirty years of software engineering research and experience. Bizarrely, what little knowledge they have is about 35 years out of date, even for those in their twenties. They live in a ghetto.
  
  As anyone with even the slightest real computing knowledge knows, what gives you performance is the algorithm chosen, not the implementation. Therefore, what matters is how easy it is to implement a good algorithm. Which means, how easy it is to write a program that implements a difficult to understand algorithm (because an inobvious algorithm-- of course, there are some exceptions). Which means that support for modern programming techniques that help you produce easy to understand programs is important for producing high performance programs. You know, things like the following that are absent from the still widely used Fortran-77:
  
  Requiring all variables are explicitly defined before use (accounts for one third of all coding errors in even carefully written Fortran progams). A requirement enforced by all other compiled languages now in common use.
  
  Programmer-defined data structures (struct in C). Widely available elsewhere since the late 1960s.
  
  Structured control structures (while, etc.). Old timers might be aware that that goto battles were fought and won in the rest of the world by about 1970.
  
  Aggregation of subroutines into modules or packages. Widely available elsewhere since the mid 1970s.
  
  Support for abstract data types. Widely available elsewhere since the early 1980s.
  
  Support for polymorphism and inheritence (object orientation). Widely available elsewhere since the mid 1980s.
  
  So, comparing the performance of toy a Fortran program with its translation into C++ or Java shows nothing.
  
  What has happended is a second Software Engineering Crisis. The first 'Crisis was in the mainstream, data processing, part of the software industry. The introduction of more powerful computers resulted in large, complex programs that were failures because they were complicated (See The Mythical Man Month). Since then, we have developed software engineering techniques to deal with their problems, so now large programs can be much more complex (composed of many parts) without being excessively complicated (difficult to produce and understand). Since about twelve years ago, the increasing performance of computers means that number-crunching programs (e.g. CFD programs) don't merely process large amounts of data; they are also large and complicated in their own right. The Software Engineering Crisis has caught up with engineers and scientists. The sad thing is, many don't know it, or ignore the advice (and screamingly obvious signs) that it is here.
  Parent Share
  twitter facebook
- Re:Lies, damn lies, and people who don't know... (Score:2)
  
  by ChaoticCoyote ( 195677 ) writes:
  
  Considering that I make a rather healthy living from Java development, I must be getting something right. ;) And I've written at least one successful book on Java development, Java Algorithms (JDK 1.1-based and a bit old now). Ignore the J++ book I wrote for Microsoft; it was butchered by political concerns in Redmond.
  
  For my clients, I create Swing-based GUI apps that have no numerical code whatsoever; the current one's running about 50K lines at the moment. Performance on such applications is database-bound, so Java's performance is a non-issue.
  
  Almabench was not a standard Java program; it is a translation directly from the Fortran, and, as such, it has a few inconsistencies with the usual Java style. Here's the Catch-22: Had I customized the code to be very "Java-like", people would have complained that I was comparing apples and organges -- yet when I keep the code consistent across languages, people such as yourself argue that I should be more language-specific.
  
  It's a good thing I put on my asbetstos underware this morning. I hadn't expected this to hit /.'s front page....
  
  Given that numerical code often uses intrinsic functions like sin and atan2, testing those functions is quite valid, given the inability to chose my floating-point library under Java (byond changing the entire VM).
  - Re:Lies, damn lies, and people who don't know... (Score:2)
    
    by X ( 1235 ) writes:
    
    Seriously, if you know Java, you had to know that the stuff you were writing is Java was semantically different (and significantly so) in many ways from the Fortran and C++ code (in terms of what the program was telling the language to do). You also had to know that the way you were benchmarking the code was going to get anywhere near an accurate measurement of Java's overall thoroughput for executing this problem. Knowing that, how can you really feel that this is a reasonable performance assessment of Java?
    
    I agree it's a Catch-22 problem. You will be criticised no matter what you do.
    
    That being said, I'm not sure that you can effectively measure the performance characteristics of a language if you apply idioms to it from another language. It merely measures one language's ability to effectively execute another's runtime model. While this is an interesting measurement in and of itself, it is not an effective measurement as to whether a given language can solve the original problem in an efficient manner.
    
    How's this for a solution: write the program in Java with a focus on the Java execution model, then translate it back in to Fortran. Now you'll have a version in both languages of both programs. This would make you pretty much safe in both camps.
    
    I agree that the trig tests are valid for testing the performance of a given VM, but they aren't a valid reflection of the language. From a language perspective, there is no reason why Java's sin and atan2 functions can't execute as quickly as their Fortran equivalents. For that matter the same is true of C++ (and as you rightly suspect, the Fortran and C++ code that *is* using the same floating point library gets very similar results).
- - A Very Clear Title (Score:2)
    
    by ChaoticCoyote ( 195677 ) writes:
    
    The title of the article CLEARLY states the limits of the tests: Linux Number Crunching: Benchmarking Compilers and Languages for ia32 That doesn't leave much room for doubt as to what I'm testing. I can add Solaris/SPARC numbers if anyone is interested.
- - Re:slashdot benchmark (Score:2)
    
    by ChaoticCoyote ( 195677 ) writes:
    
    My biggest surprise review was the excellent performance of the C++ code. In the past, C++ compilers introduced overhead -- even in code that isn't object-oriented.
- - Re:meaningless benchmark (Score:3, Interesting)
    
    by g4dget ( 579145 ) writes:
    
    Indeed, I think this is the root of gcc's difficulties. It doesn't invalidate the test by any means
    Oh, yes, it does. If you want GNU C/C++ to inline trig functions and/or use the machine instructions, you can get it to do that, too. But there are good reasons why it doesn't do that by default. There are other options you can give GNU C/C++ to tell it to make assumptions that let it optimize more. It's really for you to decide what tradeoffs you want.
    The point of using a high-level language is to avoid the need to play in assembly-land.
    For day-to-day programming, I agree. But if you do benchmarks, you have to understand assembly language and look at it. There are a lot of other low-level details you need to look at and understand as well (e.g., caching). That's not so that you can tune your benchmarks for it, it's so that you can determine whether your benchmark is actually doing what you think it is doing.
    The trick is picking the right level of hardware abstraction -- do you write for an assembler
    Unfortunately, your code doesn't test the efficiency of abstraction at all--your code is something that could have been written in Fortran 77. Once you actually start moving to higher levels of abstraction, Fortran 95's capabilities are limited, and Java's performance becomes abysmal. Pretty much only C++ has all the necessary hooks to write efficient high-level numerical code at this point.
    Note, incidentally, that even if your benchmarks are representative, the Pentium4 is probably still not the best solution in terms of price/performance. That's another important factor to be taken into account when benchmarking: how much does that performance actually cost.
    (Incidentally, I hope JPL isn't using this kind of code for actual navigation.)

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

0th post (Score:3, Funny)

Octave (Score:5, Interesting)

Re:Octave (Score:2)

Re:Octave (Score:2, Informative)

Re:Octave (Score:2)

Matlab for Linux (Score:2)

He didn't include K. (Score:5, Interesting)

Re:He didn't include K. (Score:3, Interesting)

Kernigan's k and java and perl (Score:2)

nor matlab, nor ... (Score:3)

Very true (Score:2)

Re:Very true (Score:5, Interesting)

Re:He didn't include K. (Score:5, Insightful)

Re:He didn't include K. (Score:2, Insightful)

start make an object (Score:2)

Re:He didn't include K. (Score:3, Insightful)

Mirror (Score:3, Informative)

Good. (Score:4, Interesting)

Intel C++ (Score:2, Interesting)

Re:Intel C++ (Score:3, Informative)

Re:Intel C++ (Score:2)

Re:Intel C++ (Score:3, Informative)

SPEC-FP for different compilers? (Score:3, Informative)

weighted? (Score:2)

Numbers of some Interest. Conclusions a yawn. (Score:2)

Java is slow? (Score:5, Interesting)

Re:Java is slow? (Score:2, Informative)

Re:Java is slow? (Score:5, Informative)

Re:Java is slow? (Score:4, Insightful)

Re:Java is slow? (Score:4, Interesting)

Re:Java is slow? (Score:2, Interesting)

Re:Java is slow? (Score:3, Interesting)

Re:Java is slow? (Score:2)

Re:Java is slow? (Score:2)

JIT? (Score:4, Informative)

Re:JIT? (Score:2)

Re:Java is slow? (Score:3, Interesting)

Re:Java is slow? (Score:2)

Picking nits with that Java article (Score:2)

Re:Picking nits with that Java article (Score:3, Interesting)

I've written a lot of both (Score:3, Informative)

Re:I've written a lot of both (Score:2)

Sun's Wireless toolkit (Score:3, Interesting)

Leaks in C++ (Score:3, Interesting)

Re: Leaks in C++ (Score:2)

Re: Leaks in C++ (Score:2)

Java is really, really slow (Score:5, Insightful)

Re:Java is really, really slow (Score:4, Insightful)

Re:Java is really, really slow (Score:2)

Re:Java is really, really slow (Score:5, Interesting)

Re:Java is really, really slow (Score:3, Insightful)

Re:Java is really, really slow (Score:3, Informative)

Re:With every GHz milestone... (Score:3, Interesting)

He DID use a JIT compiler (Score:2)

Java is slow. PERIOD. (Score:2)

Compiler for AMD processors? (Score:2, Interesting)

Re:Compiler for AMD processors? (Score:2)

dubious (Score:2)

Re:dubious (Score:2)

IBM java runtime and different platform. (Score:2, Informative)

gcj results incorrect - 2x worse than truth (Score:4, Insightful)

Re:gcj results incorrect - 2x worse than truth (Score:2)

He didn't include python (Score:5, Interesting)

GCC optimization (Score:2)

Fortran compilers and Linux (Score:4, Interesting)

IBM JDK 1.3.1 (Score:3, Informative)

Some java results (Score:2)

Re:Some java results (Score:2)

Re:Some java results (Score:2)

Re:Some java results (Score:2)

Development time vs. run time (Score:2, Interesting)

Re:Development time vs. run time (Score:4, Insightful)

Re:Development time vs. run time (Score:2)

My Duron 1.1GHz must have a built-in Java VM ! (Score:2, Interesting)

This guy just doesn't get Java (Score:2)

I can only try... (Score:4, Informative)

Re:I can only try... (Score:3, Informative)

An old policy (Score:4, Informative)

Re:An old policy (Score:2)

Comments by the Author (Score:5, Informative)