Performance Benchmarks of Nine Languages
ikewillis writes "OSnews compares the relative performance of nine languages and variants on Windows: Java 1.3.1, Java 1.4.2, C compiled with gcc 3.3.1, Python 2.3.2, Python compiled with Psyco 1.1.1, Visual Basic, Visual C#, Visual C++, and Visual J#. His conclusion was that Visual C++ was the winner, but in most of the benchmarks Java 1.4 performed on par with native code, even surpassing gcc 3.3.1's performance. I conducted my own tests pitting Java 1.4 against gcc 3.3 and icc 8.0 using his benchmark code, and found Java to perform significantly worse than C on Linux/Athlon."
Trig functions... (Score:4, Interesting)
Why are the Microsoft languages so fast with the Trig functions?
Accurate? (Score:5, Interesting)
32-bit integer math: using a 32-bit integer loop counter and 32-bit integer operands, alternate among the four arithmetic functions while working through a loop from one to one billion. That is, calculate the following (while discarding any remainders)....
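One plausible reading of that description, sketched in Python (the loop bound is parameterized here, and the exact order of operations is an assumption, since the article's source isn't quoted):

```python
# A minimal sketch of the 32-bit integer benchmark described above.
# The constants and operation order are assumptions; the article runs
# the loop to one billion.

def int_arithmetic(max_n=1_000_000_000):
    value = 1
    for i in range(1, max_n + 1):
        value += i      # addition
        value -= i      # subtraction
        value *= i      # multiplication
        value //= i     # division, remainder discarded
    return value        # the operations cancel exactly, so this is
                        # always 1; only the loop cost is measured
```

Because each pair of operations undoes the other, the benchmark measures raw arithmetic and loop overhead rather than any useful computation.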
It also relies on the strength of the compiler, not just the strength of the language.
Why did VB do so badly on IO? (Score:4, Interesting)
.NET Languages and IL (Score:4, Interesting)
Why benchmark the various ".NET languages" (those languages whose compilers target the CLR)? Every compiler targeting the CLR produces Intermediate Language, or more specifically MSIL. The only differences you'd find are in the optimizations performed by each compiler, which usually don't amount to much (for example, VB.NET allocates a local variable for the old "Function = ReturnValue" syntax whether you use it or not).
Look at the results for C# and J#. They are almost exactly the same, except for the IO, which I highly doubt. Compiler optimizations could squeeze a few more ns or ms out of each procedure, but nothing like that. After all, it's the IL from the mscorlib.dll assembly that's doing most of the work for both languages in exactly the same way (it's already compiled and won't differ in execution).
When are people going to get this? I know a lot of people who claim to be ".NET developers" but only know C# and don't realize that the class libraries can be used by any language targeting the CLR (and each has its shortcuts).
Would like to see... (Score:4, Interesting)
As any games/DSP programmer will tell you, there are a million ways to speed up trig, provided you don't *really* care beyond six decimal places or so.
OK, maybe I'm just bitter because I was expecting gcc 3.1 to wipe the floor.
Sitting on a Benchmark (Score:3, Interesting)
Oh wait! C# only runs on one operating system. Can you name any other development languages that only run on ONE OS, boys and girls? Neither can I.
Re:Java Performing worse than C (Score:3, Interesting)
I would consider myself part of that "anyone," and I disagree with you. Other than load times (which aren't as bad as they used to be), Java can perform as fast or faster than C code. The main thing is to use a good VM - IBM's J9 VM significantly outperforms Sun's.
Re:Trig functions... (Score:3, Interesting)
They should benchmark development time (Score:2, Interesting)
What about coder's performance? (Score:4, Interesting)
On the other hand, the time and cost required by the coder is a bigger issue (unless you outsource to India). I would assume that some languages are just easier to design for, easier to write in, and easier to debug. Which of these languages offers the fastest time to "bug-free" completion for applications of various sizes?
Speed or accuracy? (Score:5, Interesting)
Re:Trig functions... (Score:5, Interesting)
In the case of Java, you find that the Intel floating point trig instructions don't meet [naturalbridge.com] the Java machine spec. So they had to implement them as a function.
It all depends if you want accuracy or speed.
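The tradeoff is easy to see with range reduction: the x87 fsin instruction uses only a 66-bit approximation of pi, so accurate library implementations must do extra work for large arguments. A pure-Python sketch of the same phenomenon (assuming the platform libm does accurate reduction, as glibc does):

```python
import math

# Naive range reduction: fold x into [0, 2*pi) using the rounded
# double-precision value of 2*pi. The rounding error in the stored
# 2*pi is multiplied by the number of periods, so for large x the
# result drifts away from an accurately reduced math.sin(x).
def naive_sin(x):
    return math.sin(math.fmod(x, 2.0 * math.pi))

x = 1e14
print(math.sin(x), naive_sin(x))   # the two typically disagree
```

Getting the fast answer is easy; getting the last few bits right is what costs time.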
Re:They should benchmark development time (Score:5, Interesting)
The difference between Java and C++ would be dwarfed by the difference between Java and Python. Java may be 30-40% more productive than C++, but Python is 1000% more productive than Java. And yes, this applies to larger projects. J2EE may come into its own with projects that have hundreds of mediocre programmers, but if you have a mid-size team of highly skilled developers creating something new and unique (something like Zope or Chandler), Python will trounce the competition.
Re:Trig functions... (Score:3, Interesting)
Otoh, the P4 SSE2 uses a vectorized software model that doesn't have these; I don't know whether the MS compiler generates x87 hardware code or SSE2 vectorized software.
Java specifies using a 32-bit model for these functions, and is probably doing them in software. But what software? And does it use the vectorized SSE2?
IBM Java (Score:3, Interesting)
The application I benchmarked is data, network, and memory intensive, though not math intensive, so that's what I can speak for. We consistently use 2 GB of main memory and pump a total of 2.5 TB (yes, TB) of data through the application over its life cycle (doing a whole bunch of AI-style work inside the app itself), and we cut our total runtime from 6 days to 2.8 days by switching to the IBM VM.
Re:Wow (Score:5, Interesting)
The short of it is that GCC 3.2.1 is highly competitive with ICC 7.0, except for two cases:
FP-intensive code on the Pentium 4
Code that allows Intel C++ to auto-generate SSE vector code for it
Re:Java Performing worse than C (Score:2, Interesting)
I code in a plethora of languages (mainly Python, C, C++, and Java) and trust me, execution speed is not an area I look at anymore when deciding between a C or Java implementation (unless, of course, you're dealing with cross-platform graphics). It hasn't been for 3 years.
I think anyone who has done much work with either developing or running large scale java programs knows that speed can definitely be an issue.
I think it may be time for you to crawl out from under that rock.
Re:Wow (Score:5, Interesting)
Re:Trig functions... (Score:3, Interesting)
I, for one, would _never_ trust Java in a mission-critical embedded environment. In fact, you still see assembly in those environments from time to time. Imagine using Java for a fly-by-wire system. Would you fly on a plane that was using Java for fly-by-wire? I, for one, would not.
Considering that the EULA forbids using Java to operate nuclear plants and air traffic control systems, you will never fly in a Java-powered Boeing. But that's in the license, so your rant is useless.
Java is great for some things. But you get too many cases where companies use a new technology without adequate due diligence simply because it's the NTOW (New Technology Of the Week). I still say that a server written in C (written properly, of course) will outperform a server written in Java.
Of course, I could write every application using only 1s and 0s, avoiding all the bloat in current programming languages and compilers. That's fine and dandy for hobbyists and educational purposes. But when there's big $$ on the line, you don't want to do that. Java suits those needs perfectly.
In fact, I have a feeling that after the release of project Barcelona (which will allow just one full set of Java classes to be loaded, with every additional VM using the shared classes, reducing the memory used by every new application) it will be very reasonable to rewrite every network service available on a Linux machine in Java. With buffer overflows, integer overflows, and the other insecurities inevitably present in pointer-based languages removed, you could have an even more secure Linux system.
Re:Under Windows... (Score:3, Interesting)
Python's huge win is not in speed, but in the ability to express the program in a very concise and easy to understand way.
The fact that Psyco can provide huge speed ups via a simple import is just icing.
Comparison of ikewillis' Linux results to Windows (Score:2, Interesting)
Conveniently I have the same system configuration as ikewillis (dual 2.0 GHz Athlon MP), but am running Windows XP instead of Linux. I also have Intel C++ 8.0, which he used on Linux to generate his results [fails.org].
So I ran the same tests that he ran under Linux under Windows. Here are my results from Intel C++ 8.0, with Profile Guided Optimization turned off (comparing to his with PGO on):
Running the same tests under Windows with PGO turned on, the numbers did not change except on the least-significant digits, so I won't bother to list those too. Before running the tests, I set the program to run at high priority on one processor to avoid unnecessary interference from other running applications, or unnecessary processor-jumping--although when I tried it without, there wasn't much of a difference (< 1%).
Conclusions? First, it seems the 64-bit integer performance problem is something that exists only for Intel C++ 8.0 on Linux, not Windows. Second, it seems stdlib I/O performance is significantly higher under Linux than under Windows for this benchmark.
Re:Which Java VMs were used? (Score:2, Interesting)
I would also like to see benchmarks of the same JVM across different operating systems on the same processor, namely Windows, Linux, BSD, and (if it matters) Solaris x86. The question is how do other JVMs stack up against the Windows 'standard'.
It would also be nice to see a 'leveling' benchmark across different processors, specifically comparing a suite of Java benchmarks on WinTel and MacOS.
Re:Trig functions... (Score:4, Interesting)
Re:Trig functions... (Score:2, Interesting)
I have knowledge of mission-critical systems, and I will tell you right now: just because the EULA forbids something, or it's not certified, doesn't mean it doesn't happen. VxWorks is a very popular RTOS and is used on flight systems, but it's not "certified for flight systems". It was used on the Pathfinder mission, but it was not space certified... etc.
I'm not bashing Java here. I'm just saying that not enough due diligence is performed before people jump at a new technology (yes, I know that Java has been around for many years). Java, IMHO, is used for applications it should not be used for. There are different tools for different applications. Yes, you can use a crescent wrench as a hammer, but why not use a hammer?
I have seen many instances of technologies getting deployed because they're new and cool (and there are quite a few new and cool technologies that fit their applications), but sometimes things are deployed that try to fit a round peg into a square hole.
In fact, I have a feeling that after the release of project Barcelona (which will allow just one full set of Java classes to be loaded, with every additional VM using the shared classes, reducing the memory used by every new application) it will be very reasonable to rewrite every network service available on a Linux machine in Java. With buffer overflows, integer overflows, and the other insecurities inevitably present in pointer-based languages removed, you could have an even more secure Linux system.
It would probably not be a good idea to rewrite every network service available on Linux in Java. Linux _is_ used in mission-critical systems! Yes, you may reduce the number of buffer overflows, but VMs are not immune to buffer overflows either. Nothing can make up for good programming practices. Besides, C is still faster than Java and, in my experience, works very well for OS-level networking routines. Some people forget to cross their t's and dot their i's simply because we are human. That's why it helps to have a second set of eyes look at things and test your code (and test more even after that!).
I'm not trying to bash you or Java here, I'm just trying to say that different tools have different uses.
I doubt there will ever be one thing that fits all applications. I'm reminded of my programming languages class, when the prof said "Here is assembly language; you will never have to use it." Guess what: 2 years later I was using it.
Re:Trig functions... (Score:5, Interesting)
Eclipse is nice; I love Eclipse. But I don't mistake it for a Swing replacement. AWT has a purpose, as do Swing and SWT; they are all different.
I believe AWT should be as fast as SWT because it's also natively implemented.
Python numbers (Score:3, Interesting)
Python did pretty badly in the tests. The reason is that in Python it takes a relatively long time to translate a variable name into a memory address (it happens at runtime instead of compile time).
The benchmark code has stuff that basically looks like this:
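The quoted code didn't survive the posting; a minimal reconstruction of the kind of loop being described (the exact body and bound are assumptions):

```python
# Sketch of the benchmark's hot loop: trivial arithmetic repeated
# many times, so per-iteration interpreter overhead dominates.
def count_up(limit=1_000_000_000):
    i = 1
    while i < limit:
        # Each pass dispatches bytecodes to fetch and rebind "i";
        # the addition itself is trivial.
        i = i + 1
    return i
```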
Adding 1 to i takes no time at all, but looking up i takes a little time. In C this is going to be a lot faster.
Python did really badly when "i" from the example above was a long, compared to when it was a long in C. That's because Python has big-number support, but in C a long is limited to just 4 bytes.
Python did OK in the trig section because the trig functions are implemented in C. It still suffers because it takes a long time to look up variables, though.
In real life, variable lookup time is sometimes a factor. However, for programs that I've written, getting data from the network or database was the bottleneck.
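When lookup time does matter, one classic mitigation is to bind a frequently used global or attribute to a local name once, outside the loop. A hedged sketch (the function names here are made up for illustration):

```python
import math

# Resolves the name "math" and the attribute "sin" on every iteration.
def sum_sines_global(n=100_000):
    total = 0.0
    for i in range(n):
        total += math.sin(i)
    return total

# Binds math.sin to a local once; the per-iteration lookup disappears.
def sum_sines_local(n=100_000):
    sin = math.sin
    total = 0.0
    for i in range(n):
        total += sin(i)
    return total
```

Both compute the same sum; the local-binding version typically shaves a measurable fraction off the loop time in CPython.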
Re:Trig functions... (Score:3, Interesting)
Last time I did similar benchmark on Windows, the MSVC runtime library set the FPU control word to limit precision to 64 bits. Other environments on x86 used 80 bits precision by default, increasing computation time for some operations.
Re:Sitting on a Benchmark (Score:3, Interesting)
Boy, that's gotta be embarrassing
--
Mando
Why no ActivePerl? (Score:3, Interesting)
In the article it rather sounds like they just assumed Python performance would be an indicator of interpreted-language performance generally, but is there anything to back this up?
Problem: Java not portable (Score:3, Interesting)
I actually use C++ for portability, not speed or generic programming (which are nice to have).
If you avoid platform, compiler, and processor specific features, C++ is even more portable than Java. Java on the other hand tends to drag all platforms down to the least common denominator, then requires the use of contorted logic and platform extensions just to attain acceptable performance.
People seem to have forgotten the original intention of C: portable code.
Performance not important? Umm , not quite... (Score:4, Interesting)
"Even if C did still enjoy its traditional performance advantage, there are very few cases (I'm hard pressed to come up with a single example from my work) where performance should be the sole criterion when picking a programming language."
I can only assume from this that he has never done, or known anyone who has done, any realtime programming. If you're going to write something like a car engine management system, performance is the ONLY criterion, hence a lot of these sorts of systems are still hand-coded in assembler, never mind C.
Portable.net vs MONO (Score:2, Interesting)
First, I compiled Benchmark.cs using cscc (portable.net) and mcs (mono).
I then ran each binary with mono and ilrun (portable.net). Results are interesting.
Portable.net compiler: cscc -O3 Benchmark.cs
$ ilrun Benchmark.portable.exe
Int arithmetic elapsed time: 12996 ms
Trig elapsed time: 28700 ms
$ mono Benchmark.portable.exe
Int arithmetic elapsed time: 16235 ms
Trig elapsed time: 4534 ms
Mono Compiler: mcs Benchmark.cs
$ ilrun Benchmark.exe
Int arithmetic elapsed time: 13784 ms
Trig elapsed time: 27939 ms
$ mono Benchmark.exe
Int arithmetic elapsed time: 15994 ms
Trig elapsed time: 4596 ms
As you can see, Portable.net has slightly faster Int math, but crumbles under the trig functions. There is no significant difference between the compilers.
The Portable.net runtime had a serious bug where the time calculated was an order of magnitude out, so I used the Unix time command to get a more accurate result.
It would be interesting to do this comparison using Microsoft .NET as well. I would assume Microsoft .NET would trounce these results.
n.b. Please note this was not a comprehensive benchmark. I disabled some of the tests because I didn't feel like waiting (So sue me), while X, xmms, xchat, etc were running.
Re:These kind of benchmarks are so 1970s (Score:3, Interesting)
If cars followed Moore's law we'd all be driving at the speed of light about now. And guess what -- that's completely unnecessary.
No, I wouldn't want a 20 HP engine in my car. But I don't feel the need for a 1.6e9 HP engine, either.
Re:They should benchmark development time (Score:3, Interesting)
Amazon.com runs completely on HTML::Mason, which is 100% Perl. All development for the site is done in Perl. Would that qualify as a "large project"?
Why is it that people always make the same kinds of blanket statements about Perl? "It's not for big projects." "It's not for ecommerce systems." Why is it not for large projects?
Re:Java Performing worse than C (Score:3, Interesting)
Performance and scale are two different beasts.
However, there are definitely times when some languages are more appropriate than others.
Being mostly a one-language person (used to be guru level on others, but skills have lapsed) I restrict my development to the areas that language is strong in. And let people fluent in the other languages do the other work.
For the things most people use Java for, speed isn't that important compared to the reasons they're using it.
For people who need speed, maybe Java isn't the right choice. So pick something else and get on with it.
For the example environment shown, i.e. writing software to run on windows, I'd pick Delphi anyway - all the speed advantages of C, many of the programming language niceties of Java, all of the front-end simplicity of VB. Delphi rocks for Windows development. Shame it wasn't also benchmarked..
~Cederic
I just sped up the Python version by 7x and 1.5x (Score:5, Interesting)
Changing this to 'linesToWrite = [myString] * ioMax' dropped time on my system from 2830ms to 1780ms (I'd like to note that I/O on my system was already much faster than his *best* I/O score, thank you very much Linux)
In the trig test, I used numarray to decrease the runtime from 47660.0ms to *6430.0ms*. The original timing matches his pretty closely, which means that numarray would probably beat his gcc timings handily, too. Any time you're working with a billion numbers in Python, it's a safe bet that you should probably use numarray!
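numarray itself was long ago folded into NumPy, so here is the same idea sketched with NumPy as a stand-in (an assumption — the original post used numarray and a different workload):

```python
import math
import numpy as np

# Per-element Python looping pays interpreter and lookup overhead on
# every iteration.
def trig_loop(n):
    total = 0.0
    for i in range(n):
        total += math.sin(i) + math.cos(i)
    return total

# One vectorized call does all the trig in compiled code.
def trig_vectorized(n):
    x = np.arange(n, dtype=np.float64)
    return float(np.sum(np.sin(x) + np.cos(x)))
```

The two agree to near machine precision; the vectorized version is what turns a billion-element trig workload from minutes into seconds.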
I didn't immediately see how to translate his other mathematical tests into numarray, but I noted that his textual explanation in the article doesn't match the (python) source code!
(My system is a 2.4GHz Pentium IV running RedHat 9)
Re:Trig functions... (Score:3, Interesting)
And yeah, best way to improve performance would be writing a better virtual machine.
Re:Trig functions... (Score:3, Interesting)
Don't go there. it provides no real benefits [slashdot.org].
String, list, dictionary, and function benchmark? (Score:1, Interesting)
Almost all my programming work is function calls and string, dictionary, and list manipulation in single- and multi-threaded apps. A little integer, very little float (money), no trig. In this case the only useful benchmark in the article is I/O.
My guess is that a majority of programmers aren't developing apps heavy in integer, float, or trig work. So this benchmark article suggests only that we not develop the next iteration of Quake in Python.
Does anyone know of a multiple language benchmark relevant to the rest of us lumpen proletariat programmers?
Re:Trig functions... (Score:4, Interesting)
- Algorithms for allocating memory in a manual management scheme can be quite complicated. Look up how large glibc's memory allocator is. Memory allocation algorithms for GCs tend to be much simpler, often as simple as a pointer increment.
- Deallocation algorithms for manual memory management are often even more complicated than the allocation algorithms. They are nearly always slower. Plus, objects are deallocated one at a time. Deallocation algorithms for GC can be simpler, but most importantly, the GC can deallocate large numbers of objects at once. This is, of course, more efficient.
- Copying GCs can compact holes in memory, which makes for better cache utilization.
Depending on the problem at hand, a GC can be a little slower or about the same. For a functional programming style (which has a particular pattern of memory usage) GC is usually faster than manual management. For programs that tend to operate in phases, allocating large numbers of objects gradually and freeing them at once, GC will also be faster.
The real problem with GC is that it affects latency. Modern GCs only freeze the app for a fraction of a second, but that's a long time for something like a game or video playback. There are some workarounds for this, though. Latency-sensitive apps can disable GC and use manual memory management. Or they can use a real-time garbage collector, which has guaranteed latency but incurs a large fixed overhead. These problems can be worked around, as evidenced by the fact that major PS2 games like Jak and Daxter were written in a GC'ed Lisp dialect (GOAL). Most proponents of GC will tell you that such workarounds are a good deal easier than hunting down memory leaks and dangling pointers!
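The "allocation is a pointer increment" point above can be sketched as a toy bump-pointer arena. This is an illustration only: real collectors do this in native code over raw memory, and the class and method names here are made up.

```python
class BumpArena:
    """Toy bump-pointer allocator: allocate by advancing a pointer,
    free every object at once by resetting it."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.top = 0                     # the "bump pointer"

    def alloc(self, nbytes):
        # Allocation is just a bounds check plus an increment --
        # this is the fast path GCs with compacting heaps enjoy.
        if self.top + nbytes > len(self.memory):
            raise MemoryError("arena exhausted")
        offset = self.top
        self.top += nbytes
        return offset                    # "address" of the new object

    def reset(self):
        # Deallocating *every* object is a single assignment, which
        # is why phase-oriented allocation patterns favor GC.
        self.top = 0

arena = BumpArena(1024)
addrs = [arena.alloc(16) for _ in range(10)]   # ten 16-byte objects
arena.reset()                                   # free all ten at once
```

Contrast this with a free-list allocator, which must search for a fitting hole on every malloc and walk metadata on every free.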
Useful and useless benchmarks (Score:2, Interesting)
Anyway, even when remaining within the same language or language family, the benchmarks are still quite meaningless, for instance when you want to compare the performance of MSVC++ and GCC. The benchmark has several flaws:
- the code is too trivial. It doesn't show how good the compilers really are at optimizing
- the code is too library dependent. For instance, in the trig benchmark, only the runtime library is really benchmarked and not the code generated by the compiler itself
- for the floating point benchmarks, the options chosen for both compilers do not match. For MSVC++, the options chosen favour speed over accuracy, while the GCC options favour accuracy over speed.
The last point can very easily be illustrated with the trig benchmark.
On my computer (P4, 2.8GHz), I get the following results:
1) Options from the article: 10.9s
2) additional option -ffast-math : 6.9s
(this option is also a significant win for the double benchmark)
3) options above plus linking with CRT_fp8.o : 2.8s
The last option may need some explanation:
Programs compiled by MSVC++ set the math coprocessor to 64-bit precision by default, while GCC programs set it to 80-bit. Linking with CRT_fp8.o on Windows makes GCC programs behave like MSVC++ programs and use only 64-bit precision. For arithmetic operations this makes no difference, but the built-in transcendental functions become much faster when you reduce the coprocessor's precision. So all in all, we were able to reduce the runtime of the trig benchmark by a factor of 3.9 just by changing the compilation options. This is almost exactly the difference seen in the article between the MSVC++ and GCC results for the trig benchmark.
All in all, for trivial benchmarks like this, if you choose matching compilation options, different compilers give you almost the same results.
The only real weakness GCC is showing is 64-bit integer arithmetic, which is badly implemented in GCC and could be vastly improved.
Marcel
Athlon versus Pentium 4? (Score:3, Interesting)
Conclusion would be: the JIT-compiled Java on an Athlon is poorly optimized, or not optimized at all.
Sidenote: the original author of the original benchmark wanted to compare
Further: obviously the trig functions (which could be compiled to a single math-coprocessor opcode) are not optimized at all in Java. At the language level, calling a trig function is a call to a static method in the class Math. If that is "mapped" one-to-one to machine code, it results in a JSR to the C function, which contains only a few opcodes; but what a C compiler will compile to one opcode becomes, in trivially "mapped" Java, about 10 opcodes.
angel'o'sphere