Java Performance Urban Legends
An anonymous reader writes "Urban legends are kind of like mind viruses; even though we know they are probably not true, we often can't resist the urge to retell them (and thus infect other gullible "hosts") because they make for such good storytelling. Most urban legends have some basis in fact, which only makes them harder to stamp out. Unfortunately, many pointers and tips about Java performance tuning are a lot like urban legends -- someone, somewhere, passes on a "tip" that has (or had) some basis in fact, but through its continued retelling, has lost what truth it once contained. This article examines some of these urban performance legends and sets the record straight."
To what extent does this exist in other languages? (Score:5, Interesting)
I remember in my Turbo Pascal programming days (heh) that a lot of people said that using Units would degrade performance. So I tried it both ways and it never really made a difference, for my applications anyways.
I'd say before taking someone's word for it on a performance enhancing technique, test it out. Because not everything you read is true, and not everything you read will apply to every environment or every application.
Source (Score:1, Interesting)
It doesn't help... (Score:5, Interesting)
...when one of the first issues of Java magazine published an article explaining the Java object runtime model, but made it little more than a FUD-filled advertisement. (What killed it for me: claiming that C++ vtbls are always, in every implementation, done as a huge array of function pointers inside each object. It wasn't a typo, either; they had glossy color diagrams that were equally deliberately false.)
I think Java's a decent language, but it invented nothing new. Every language feature had been done before, and without the need for marketing department bullshit.
Times change (Score:4, Interesting)
What the article said is true - JVMs have improved a lot. They are getting better and better, even today. My friend likes to fool around with all these little 3d demos in Java and even the latest JDK (1.4.2 beta) suddenly offers big performance boosts over the previous JDK. The fact that I refuse to ever use a beta Java SDK is another story, though...so I won't see those performance gains for a little while.
Java's memory usage (Score:2, Interesting)
Isn't the memory usage one of the negatives of Java?
While I don't care too much for Java's wordy, verbose syntax, anything that competes with Microsoft is A-OK in my book.
If any of you think Sun is making a ton of money on Java, check this link out. [com.com]
I am beginning to feel sorry for Sun. They are also in some economic hard times, laying off a lot of people.
Urban MySQL vs. Urban PostgreSQL (Score:5, Interesting)
And I'm sure MySQL DBAs all know PostgreSQL is slow, bloated, and is only good for huge database rollouts.
Except, well. You get the gist. I'm replying to this article because I now know first-hand that both camps are getting a lot of it wrong.
I've written up what began as a final in-depth studied proof that MySQL wasn't ready for the corporate environment (because I'm a PostgreSQL guy, see?) but ended up reluctantly having to conclude MySQL is slightly more ready for the corporate environment than PostgreSQL!
The writeup [faemalia.org] is on a wiki, so feel free to register and add your own experience. Please be ready to back up your opinions with facts.
Java is slow (Score:3, Interesting)
If there is an "inner loop" of your application that needs performance above all else, and you need to program it in Java for whatever reason, there are two things you should get rid of:
I've just found that you can't trust the garbage collector, no matter how good people say it is. People have been saying it's great since the beginning of Java, and now they say, "It wasn't good before, but it is now." And they'll be saying the same thing in 3 more years. No matter what, the opportunistic garbage collection of C/C++ simply leads to better performance than any language that tries to do the garbage collection for you.
java vm sucks (Score:2, Interesting)
Fact: Valenz and Leopold did a survey (summarized in the most recent issue of Dr Dobbs) whereby Java bytecode programs were converted to MSIL, and the .NET, Rotor, MONO, Sun, Blackdown, etc. VMs were compared. In each case, Microsoft's .NET VM had a 5-20% speed increase over the best java VM, and even the MONO and Rotor VMs had a 2-15% speed increase.
Re:Times change (Score:5, Interesting)
No matter what, there will always be applications that strain our machines to the cusp of their abilities. And there will always be things we want to do that our machines cannot handle. It's only by performance tuning that these tasks can go from impossible to possible.
If John Carmack was forced to program in Java, for example, Doom would only now be possible. And Doom III wouldn't be possible for many more years. Performance matters. Not always, but often.
Performance is ok, but memory footprint... (Score:4, Interesting)
javaw.exe - Mem usage 70,244K
java.exe - Mem usage 9,808K
According to task manager. Granted, now I got 512 to take from but it's still eating up much more memory than anything else.
Kjella
Re:Java for Applications.... (Score:5, Interesting)
The first mistake, of course, is that people think that (a == b) == a.equals(b) which is, of course, only true if a and b are constant strings or one has invoked intern() on them.
The second is to not realize that string concatenation with the "+" operator is a special case and only syntactic sugar for StringBuffer operations. Thus, someone not familiar with Java may accidentally generate a huge number of StringBuffer objects in loops.
However, both these things are very fundamental Java knowledge and among the first things you learn when studying Java. It's obvious that you don't start coding serious Java without knowing how try..catch..finally works, and equally obvious that you should know about the quirks of the String class.
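The first pitfall above can be shown in a few lines. This is only a minimal sketch: the class and method names are made up for illustration, and `new String(...)` is used purely to force a second object with the same contents.

```java
// Hypothetical illustration class; not taken from the article or the thread.
final class StringEqualitySketch {

    // Reference comparison: true only if both variables point at the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison: true if the character sequences match.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "java";            // compile-time constant, lives in the string pool
        String built = new String("java");  // deliberately a distinct object with equal contents

        System.out.println(sameReference(literal, built));          // false
        System.out.println(sameContent(literal, built));            // true
        System.out.println(sameReference(literal, built.intern())); // true
    }
}
```

In ordinary code the rule of thumb is to compare strings with equals() and to treat == as an identity check only.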
Re:Java for Applications.... (Score:5, Interesting)
But I can tell you that almost every administration application that runs Java sucks out my soul. Trying to run Java applications over X at long distances makes me want to commit suicide. (Luckily there's VNC, so it's almost usable...) (I think it's a shared memory problem with the way it works with X Windows.)
Then there are the damn JVMs that each app needs, and how I end up with multiple versions loaded so that each application works correctly. Java 1.1/1.2/1.3/1.4, and now 1.5 should take even more disk space. Doesn't seem anything is upgradable in Java.
And let's not forget how Java likes to interact with all the custom window manager replacements on Windows. For some reason the screen flickers every time you run a Java app. (Haven't seen any answers, but it messes with LiteStep, Blackbox, GeoShell, and even Stardock applications.)
Hmm, and cut/paste sucks. Yes, you can use key combos, but sometimes in Windows it's nice to select all and copy. (Minor bitch, but still annoying when you have to switch ways of doing things...)
If you can't have a command line, and you must have a GUI, for God's sake use HTML. I now make it a point to go with vendors without Java interfaces; they clearly don't use their own products on a day-to-day basis.
BTW, I said Java interfaces, not Java Beans, etc. We have Java running on Solaris, works fine, other than the memory leaks. It's the admin applications that use Java that are crap.
Re:-1, bloody obvious (Score:1, Interesting)
I assume you know (Score:3, Interesting)
For example, suppose I have a program that has 8 threads and the whole thing uses 28 MB of memory.
A process listing shows 8 entries each using 28 MB of memory, when in reality only 28 MB and not 224 MB (8 * 28) of memory is being used.
Before you blame this one on JAVA too, you might want to know that it's a bug in the concept of process memory reporting (i.e. the OS), not JAVA. The OS lists 8 schedulable programs (the Java threads), looking up the amount of memory each has access to (28 MB) without ever hinting that they are all using the same 28 MB.
Truth AND Consequences (Score:5, Interesting)
What's sloppy about the article? Well first of all, Goetz asserts "even though we know they are probably not true, we often can't resist the urge to retell [urban legends]". Where in Hades did he get such a silly idea? Some half-remembered sociology book? Everybody who's ever told me an urban legend really believed in that exploding microwave poodle or the dead construction workers concealed in Hoover Dam. I myself remember feeling rather peeved when I heard the sewer alligator legend debunked.
Second, performance myths are not "urban legends". ULs are third-hand stories that are difficult to debunk because the actual facts are hard to get at. "Facts" that people can check but don't are just myths or folklore.
Anyway, here's my favorite performance myth: "more is faster". The most common variation of this is "application performance is a function of CPU speed". Ironically enough, I encountered this one when I was working at JavaSoft. Part of my job was to run JavaDoc against the source code to generate the API docs. The guy who did it before me ran it on a workstation, where a run took about 10 hours. Neither of us had the engineering background to really understand why this took so long. He just took for granted that it was a matter of CPU cycles. I knew a little more than him -- not enough to understand what was actually going on, but enough to be skeptical of his explanation.
Eventually, I put together the relevant facts: (1) JavaDoc loads class files to gain API data, using the object introspection feature; (2) the Java VM we were using visited every loaded class frequently, because of a primitive garbage collection engine; (3) forcing a lot of Java classes into memory is a good way to defeat your computer's virtual memory feature...
Eureka! I tried doing the JavaDoc run on a machine with enough RAM to prevent swapping. Run went from 10 hours to 10 minutes.
Another variation of this myth is "interpreted code is slower than native code". That bit of folklore has hurt Java no end. If your application is CPU-bound, you might get a performance boost by using native code. But, with the obvious exception of FPS games, how many apps are CPU bound?
Here's another variation: "I/O performance is directly proportional to buffer size". At another hardware vendor I worked for, one of our customers actually filed a bug because his I/O-bound application actually got slower when he used buffer sizes greater than 2 meg. It was not, of course, a coincidence that 2 meg was also the CPU cache size!
What about numerics (Score:3, Interesting)
I notice that while the article mentioned deals with a couple of nit-picky optimizations, it doesn't tell us anything useful about how to make Java rock on the numerics, which is the place performance matters most to me. For instance, how would you write FFTW in Java?
Re:Enough about the things Java can never fix. (Score:2, Interesting)
this is what I dislike about it. I don't like languages built to force me away from a bad habit, it means I've been forced into (someone else's idea of) the perfect habit. I have yet to find the perfect habit.
The real solution for bad C/C++ code (I slipped C++ in there on purpose) is to learn what you are doing. If you can't learn the ins and outs of procedural logic, then don't create an imperative language, where the real pitfalls are still present; create a new paradigm altogether at a higher level, like Zope.
There are good problems that VMs address, but they are niche areas (dynamically distributed code that can benefit by using heterogeneous networks, and certain idioms like self modifying code... somehow I don't think the joy of self modifying code is what Java is about, but then, I'm told again and again how Java will be as fast as C++ when the JVM's dynamically recompile and optimize code as it runs... loluck), and not general purpose areas.
Re:Times change (Score:5, Interesting)
In those days, I hated Java and Macromedia Flash, because even then, they were only used to do the exact things that scripted mouseovers could reproduce. Those two programs accounted for most of the slowest web page loads...
Now I have a P4 1.5 GHz with gobs of RAM running XP, and I have a hard time running enough tasks to slow it down. With a cable modem, I don't care about huge binary applets. I guess Java just needed some hardware upgrades for it to become useable...
Re:Worst...disproof...ever (Score:1, Interesting)
Re:To what extent does this exist in other languag (Score:5, Interesting)
Probably lots. Everywhere.
As a crude approximation, 90% of the time is due to 10% of the code. Improving the "efficiency" of the 90% of the code that is responsible for only 10% of the time tends to be counter-productive. Of course there are no easy magic rules for how to improve the 10% of the code that is responsible for 90% of the time, or even identify exactly what that 10% really is.
What does work is to have a sense of how long things should take and find and cure whatever is taking much longer than it should.
Immutable objects do suck. (Score:1, Interesting)
public String indent(int indent)
{
    String output = "";
    for (int x = 0; x < indent; x++)
    {
        output += "\t";
    }
    return output;
}
If you use this function you are wasting CPU power like crazy because strings are immutable. This has nothing to do with the JVM, but is inherent in immutable strings, no matter what language they are in. Here's the reason... output starts as an empty string, then the first "\t" is added to it. What the compiler MUST do is create a 3rd string of the length of both output and the "\t" string. Then it copies all of output into the new string, then the "\t". On the next loop output = "\t" and we add another "\t". To do this we make a third string again and start copying. BUT we already copied everything in output to a new string once already, so we're copying it for a second time. This is where we waste CPU cycles, and this is why not using immutable strings properly can cause problems. By the end of the function, the first "\t" is copied indent number of times, to indent number of temporary objects. If we change the function to..
public String indent(int indent)
{
    StringBuffer output = new StringBuffer();
    for (int x = 0; x < indent; x++)
    {
        output.append("\t");
    }
    return output.toString();
}
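No temp strings are created inside the loop. No "\t"'s are recopied on every pass (the buffer only copies its contents when it has to grow, and once more in toString()). Thus we don't waste CPU cycles.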
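The copying cost described above can be put into numbers. The sketch below is a simplified model, not a measurement: it assumes each `+=` copies exactly the characters of the newly built string and ignores any intermediate buffer the compiler may create, so it is a lower bound. The class and method names are invented for illustration.

```java
// Hypothetical cost model for the naive indent() loop; names are illustrative only.
final class ConcatCostSketch {

    // On pass k the loop builds a fresh string of length k, so at least k
    // characters are copied. Summing over all passes gives 1 + 2 + ... + n.
    static long charsCopiedByConcat(int indent) {
        long copied = 0;
        for (int length = 1; length <= indent; length++) {
            copied += length;
        }
        return copied;
    }

    // Closed form of the same sum: n * (n + 1) / 2.
    static long closedForm(int indent) {
        return (long) indent * (indent + 1) / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " tabs: at least " + charsCopiedByConcat(n)
                    + " characters copied (closed form " + closedForm(n) + ")");
        }
    }
}
```

Under this model the work grows quadratically with the indent level, while the buffer-based version appends each tab once and copies only when the buffer grows.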
Re:Java is slow (Score:2, Interesting)
Setting references to null in Java is not what I would expect from a professional programmer.
Where are the Java desktop applications? (Score:1, Interesting)
Don't moderate - post an example of a useful java desktop application.
Re:Antidote (Score:3, Interesting)
This issue is debatable. The example the author gives is a bad one.
What small objects? For me these are iterators. I use a lot of them in my designs. Someone else may use complex numbers. A 3D programmer may use a vector or a point class. People dealing with time series data will use a time class. Anybody using these will definitely hate trading a zero-time stack allocation for a constant-time heap allocation. Put that in a loop and that becomes O (n) vs. zero. Add another loop and you get O (n^2) vs. again, zero.
Don't create a new object each time through the loop. Reuse the object to sidestep allocation/deallocation. In a tight-loop where performance matters this will help. In a situation where performance doesn't matter, then this doesn't matter at all.
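A rough sketch of the reuse idea, assuming a small mutable value class. `Vec2` and both methods are hypothetical names, and whether reuse actually pays off depends on the VM, since a JIT may be able to remove short-lived allocations on its own; measure before relying on it.

```java
// Hypothetical example of reusing one scratch object instead of allocating per iteration.
final class ReuseSketch {

    // Small mutable value type of the kind the comment mentions (points, vectors, ...).
    static final class Vec2 {
        double x;
        double y;

        void set(double newX, double newY) {
            x = newX;
            y = newY;
        }

        double length() {
            return Math.sqrt(x * x + y * y);
        }
    }

    // Allocates a fresh Vec2 on every pass through the loop.
    static double sumLengthsAllocating(int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            Vec2 v = new Vec2();
            v.set(i, i + 1);
            total += v.length();
        }
        return total;
    }

    // Allocates once and overwrites the same object on every pass.
    static double sumLengthsReusing(int n) {
        Vec2 scratch = new Vec2();
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            scratch.set(i, i + 1);
            total += scratch.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumLengthsAllocating(1000));
        System.out.println(sumLengthsReusing(1000));
    }
}
```

Both methods compute the same value; the only difference is how many objects the loop creates.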
2. Lots of Casts
Java 1.5 will have generics.
3. Increased Memory Use
Well let's look at the three points the author tries to make.
3.1. Programs that utilize automatic garbage collection typically use about 50% more memory than programs that do manual memory management.
What's a typical program? This one can be ignored.
3.2. Many of the objects that would be allocated on stack in C++ will be allocated on the heap in Java.
See above. This can be minimized.
3.3 Java objects will be larger, due to all objects having a virtual table plus support for synchronization primitives.
I'll admit ignorance on this issue, although my gut reaction is that hard facts need to be presented on the issue. As the original article said, Java has come a long way since 1.0.
4. Lack of Control over Details
This is a mixed bag. You could say that a language without pointers, and consequently without direct memory access, will never be as powerful as a language with them. But we know this isn't true. Functional languages are as powerful (from a programmatically expressive point of view, not computationally expressive), if not moreso considering the other features they offer: closures, anonymous functions, first-class functions, etc.
5. No High-Level Optimizations
The whole concept of template-metaprogramming is entirely orthogonal to the intended style of C++ programming. Meshing template-metaprogrammed code with regular code is a daunting task (on the large scale). It's a hack to get features that aren't in the language, such as a more computationally inclined macro system, lazy evaluation, etc. With that said, it is useful. However, I would rather see those features actually put into C++ and/or Java rather than having to resort to the abuse of C++'s generic programming facilities. A counter to this could be that templates allow for unbridled extension to C++, but that is most definitely not the case. Until C++ has a macro system that rivals Common Lisp's that assertion cannot be made.
Re:Where are the Java desktop applications? (Score:3, Interesting)
OpenOffice at the very least uses Java within it.
Hotjava is a functional web browser.
Java just isn't popular on the desktop, because you never know what crazy JVM version someone's going to have on their system. But there's definitely a place for it.
Re:Where are the Java desktop applications? (Score:1, Interesting)
Re:Where are the Java desktop applications? (Score:4, Interesting)
Re:Java is slow (Score:2, Interesting)
When you say "I'm not complaining about the implementation so much as the architecture." are you suggesting GC'd languages make programming harder? If so, let me suggest that the real problem is that Java doesn't have adequate tools for detecting accidentally retained objects.
Re:Java is slow (Score:3, Interesting)
I said you can control the initial size of the memory. Many JVMs will normally not go outside that size unless they have already GC'd and still don't have enough memory.
I should be able to tell the GC that I am done with an object right here and now, rather than waiting for the low priority GC thread to take too long to pick it up.
Yes, that would certainly be useful; however, you can usually (depending on the VM) kick the GC off at sensible rates or tune the GC to run more often.
Still, there are alternatives; and I generally still prefer using Java in spite of these kinds of relatively minor shortfalls in the language because it is still a more powerful language than C++ (powerful in the sense of my programs being shorter in Java than C++).
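For readers who want to poke at the heap sizing and GC behaviour discussed in this comment, here is a minimal sketch. It assumes a Sun-style VM started with something like `java -Xms64m -Xmx256m HeapSketch` (initial and maximum heap size); the class name is made up, and System.gc() is only a request that the VM is free to ignore.

```java
// Hypothetical probe for heap size and a requested collection; not a tuning recipe.
final class HeapSketch {

    // Bytes currently in use on the heap, as reported by the runtime.
    static long usedBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        System.out.println("max heap (bytes):     " + runtime.maxMemory());
        System.out.println("current heap (bytes): " + runtime.totalMemory());
        System.out.println("used before request:  " + usedBytes());

        // Asks the VM to collect now; there is no guarantee that it does.
        System.gc();

        System.out.println("used after request:   " + usedBytes());
    }
}
```

There is still no way to free one specific object on demand, which is the limitation the comment is complaining about.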
Re:It doesn't help... (Score:3, Interesting)
It's there in String, isn't it? They declared it good enough for themselves, but not good enough for us.
Java's nothing new and exciting if you program in more languages than what the corporate world tells you is popular. They just had a marketing machine behind them.
Re:Why do poor coders have tunnel vision? (Score:3, Interesting)
Java offers two main advantages: a beefy class library, and enough of the bondage-and-discipline nature to herd legions of mediocre programmers into typing a lot and doing a lot of important-looking work. Other than that it seems to combine the disadvantages of other languages, producing the old saw "Java: All the power of C++ with the blinding speed of Smalltalk".
The thing about knowing many languages is that you can evaluate them and choose a better language, even than the conventional default. The idea that programmers should know many languages has been invoked far more often to justify the incumbency of bad languages rather than promote the adoption of good ones.
Re:Times change (Score:5, Interesting)
Actually, Carmack considered Java for Quake 3, but decided against it because he was worried about the quality of JVMs (something he couldn't control), not because of their speed. He's said on many occasions that optimization in game code isn't even important anymore, since the vast majority of the work done by the CPU is code inside the video card driver. He's said that for Quake 3, even doubling the speed of the game code would only give a 10% improvement in framerate.
Thread synchronization (Score:3, Interesting)
What I think the author is trying to say is that "Premature optimization is the root of all evil in programming". Most of the stuff enumerated in the article usually has a minor impact on performance and no programmer should worry about them during coding.
However, when all the coding is over, the system will have to meet some performance criteria. If it crawls like a quadriplegic snail, a programmer will have to get his hands dirty and tweak his code to remove the bottlenecks.
It is very possible that one of those bottlenecks will be rooted in these so-called "urban legends". Gross over-allocation of immutable objects and synchronized methods may impact performance.
It happened to me a while ago. I was working on a system that was designed to use lots of threads and message passing. We had completed the development and were ready to move on to testing. The system worked pretty well on the developers' workstations (1 CPU) but when we deployed it on our much more powerful servers, the throughput went down. At first, we thought that it was a thread contention problem but after some testing, we realized that the cost of obtaining a lock on multiprocessor systems is orders of magnitude higher than on uniprocessor systems.
This is because on uniprocessor machines, thread synchronization simply amounts to doing an atomic if/set. However, on multiprocessor machines, complex mechanisms have to be used so that the lock becomes effective for both processors. It involves a lot more overhead because the required extra-cpu operations cost a lot of cycles.
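One way to check this kind of claim on your own hardware is a small harness like the sketch below. It is not the system described in the comment: the class name, thread counts and iteration counts are all invented, and the timings it prints will vary with the VM, the OS and the number of processors.

```java
// Hypothetical harness: several threads hammer one synchronized counter.
final class LockCostSketch {
    private long counter = 0;

    synchronized void increment() {
        counter++;
    }

    synchronized long get() {
        return counter;
    }

    // Starts the given number of threads, each doing perThread synchronized
    // increments on a shared counter, waits for them, and returns the total.
    static long run(int threads, int perThread) {
        LockCostSketch shared = new LockCostSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        for (int threads : new int[] {1, 2, 4}) {
            long start = System.nanoTime();
            long total = run(threads, 1_000_000);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): total=" + total + ", " + elapsedMs + " ms");
        }
    }
}
```

Running it on a single-CPU box and on a multiprocessor box, and comparing time per increment, gives a crude picture of how much contended locking costs in each case.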
Insignificant optimizations (Score:3, Interesting)
I don't put much stock in performance tips that are offered without explanation. And in deciding whether to use a tip, I weigh not only the performance trade-offs (near call vs. far call) but also the programming trade-offs (single source file vs. modular code). End-users want reliable functionality, and efficient programming practices often make more difference than code tweaks.
Re:Java not always slower (Score:3, Interesting)
I tried the simple and stupid
int fib(int i) {
if(i<=2) return 1;
else return fib(i-1)+fib(i-2);
}
without optimization on javac and gcc (the latter was slowed down by it so I figured it wouldn't be fair). Calculating up to 45 on my P3 800MHz took, according to 'time', 1m5.554s. Java used 0m51.807s (and that's including the jvm loading).
Pretty neat.
java -Xint (no JIT) is still running though.
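For anyone who wants to repeat the Java half of this comparison, a sketch of a timing wrapper around the same function follows. The class name and the default argument are my own choices; note that with int arithmetic the result overflows beyond fib(46), and that timing from inside the program excludes JVM startup, unlike 'time'.

```java
// Hypothetical timing wrapper around the naive recursive fib from the comment.
final class FibTimingSketch {

    // Same definition as above: fib(1) = fib(2) = 1.
    static int fib(int i) {
        if (i <= 2) return 1;
        return fib(i - 1) + fib(i - 2);
    }

    public static void main(String[] args) {
        // Default kept small so the run finishes quickly; pass 45 to match the comment.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 35;

        long start = System.nanoTime();
        int result = fib(n);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("fib(" + n + ") = " + result + " in " + elapsedMs + " ms");
    }
}
```

Running the same class with `java -Xint` shows the interpreter-only case mentioned above.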
Re:Urban MySQL vs. Urban PostgreSQL (Score:1, Interesting)
I can easily corrupt MySQL with concurrent transactions in the same table with at least 2 indexes. Thanks but no thanks. MySQL, let me know when your transactions don't corrupt under real load.
It is a trade-off (Score:3, Interesting)
Of course, some people interpret the statement to be a comparison to C or C++. Now, Java has a lot of behaviors that are slower than C/C++.
For example, consider array access. Java implicitly checks the bounds of an array, whereas in C/C++ that is left as an exercise for the programmer. Unfortunately, most programmers are lazy and don't exercise that. Hence with C/C++ you have buffer overruns where nasty clients can execute arbitrary code. In Java, you'd have an ArrayIndexOutOfBoundsException, which would prevent the malicious data from being pushed into memory. Thus, it was a trade-off between security and speed.
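A small sketch of what that bounds check looks like from the program's side. The class and method names are hypothetical; the only real API involved is the ArrayIndexOutOfBoundsException the VM throws on an out-of-range index.

```java
// Hypothetical demonstration of Java's implicit array bounds check.
final class BoundsCheckSketch {

    // Attempts the write and reports whether the VM allowed it.
    // An out-of-range index never touches memory outside the array.
    static boolean tryWrite(int[] buffer, int index, int value) {
        try {
            buffer[index] = value;
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] buffer = new int[4];
        System.out.println(tryWrite(buffer, 2, 7));  // true: index inside the array
        System.out.println(tryWrite(buffer, 4, 7));  // false: one past the end
        System.out.println(tryWrite(buffer, -1, 7)); // false: negative index
    }
}
```

The equivalent C write past the end of the buffer would compile and run, with undefined results.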
Garbage collection is another one of these. Ever seen a C/C++ program with memory leaks (why, I even remember the X11 libraries leaking)? With garbage collection, your memory consumption is slower and your memory freeing slower (since Java has to determine using an algorithm what isn't used anymore, whereas in C/C++ it's coded into the logic). Java also seems slower because that GC overhead is generally experienced as "pauses", whereas in C/C++ the object deletion occurs throughout the execution of the program. But this was a trade-off. A trade-off between making developers' lives easier and the programs more stable versus the speed and risk of developer-coded memory deallocation.
Java also has immutable Strings. With a mutable String class, I know I could eliminate a lot of Object creation. But the String class was made immutable so everything could be final, and thus optimized better. This was a trade-off between the speed of Strings themselves and the speed of creating a new String every time you need to concatenate.
There are many more cases, but I think you get the point. Java does things in ways that are slower. But many of these are trade-offs -- trade-offs to make the programs more secure, development faster and syntax/API simpler. Then they go and address the speed in other ways by improving the VM (HotSpot, incremental/concurrent GC, etc.)
In my opinion, I would've accepted a 100% Java version of Microsoft Outlook, even if it was slower, if I didn't have to worry about the next buffer overrun exploit hijacking my computer.
bogus (Score:2, Interesting)
Goetz is in denial and just waves away problems using straw men without providing a truly balanced view of cases where these things cause problems. It depends on the VM, if things are done in a tight loop, and so on.
Suffice it to say I did not like this article. As always, you need to measure application performance for yourself to find true bottlenecks.
-Kevin
Endless confusion (Score:2, Interesting)
The "final is faster" stuff is totally irrelevant, even if it is true for some cases, particularly static final methods. However, final is not Java's answer to c++ inline. Final is there just to say "do not override this." If the reason it's there is for "performance," it shouldn't be there.
The immutable object thing is equally irrelevant. Strings are a particularly pleasant illustration, taking the argument about them to its logical conclusion leaves you with an array of char. If that's what you want to work with, what you probably want is a C compiler. You can look under the hood at StringBuffer and String and try to dope out what the compiler and runtime are doing. The better approach is to think about what you're doing, and make sure you're thinking in Java. Often if strings are actually the bottleneck it's because the coder wants their perl or c approach to a problem to work, not because garbage collection is more efficient one way or the other.
In many ways, I wish primitives weren't exposed in the language. It would be a subtle hint to those who still think with pointers, arrays and free() in the back of their head "this isn't C" and reduce the stupid performance tricks people try. I also wish prior to 1.4 javac had issued a "this isn't perl" warning if you used more than a couple StringBuffers and StringTokenizers per class. Alas, with java.util.regex those who approach every problem with "I need a regular expression that will..." can wrap their bad habits in Java code, and 1.5 seems to be devolving to c++ with crappy pseudo-templates and precompiler-ish directives.
Re:Java is slow (Score:3, Interesting)
Simpler? Sure, if you have just a few objects. But with a lot of objects, it's much, much, much more complicated. As anybody who has spent hours running down a malloc/free error in somebody else's code can tell you.
If you can't use new and delete or free and malloc correctly, then there's probably a lot of other things you can't do well either.
Welcome to the human condition.
I have a limited amount of attention and effort I can spend. The whole point of computers is to take the boring parts and have a machine do them, freeing me to think about the important, interesting parts. For the kind of stuff I code, memory allocation is in the boring category about 99.99% of the time. For the 1 time in 10,000 where it really matters, then fine, I'll write native code. But the rest of the time, let the machines do the donkey work.
For example, I do not see how things would become much more dangerous in Java if you added a delete operator to complement the new operator.
You can make the same argument about a lot of things in Java. E.g., multiple inheritance or platform-dependent code. The theory behind Java is that there are some features that a) take an expert to use properly, b) are dangerous when novices try to use them, and c) complicate things a lot when they exist. So Java doesn't allow those, or at least makes it hard enough to get to that only the experts bother (e.g., JNI).
This sort of daddy-knows-best behavior is annoying, and absent business considerations, I wouldn't put up with it. But if I'm going to have to inherit the code base of J. Random Programmer, I'd rather it be in Java, because although it's impossible to write really brilliant code, it just can't end up as bad as in C or Perl.
Re:To what extent does this exist in other languag (Score:3, Interesting)
I think with respect to web programming, this is itself a myth. This rule of thumb seems to have reached the popular consciousness of developers in the 80s when desktop apps ruled. This was a time when each additional user added a CPU. And it's true; in such a world, you don't worry about that other 90%. But when you have a fixed number of CPUs shared by vastly more clients, you need to worry about more than just the 10% most offending code.
In addition, I've found that programmers can be Soooo lazy that even the 90/10 isn't true in practice. I've seen the same expensive mistakes happen all over nearly every page of a web app.
This is why so many intranet and internet applications seem slow. People put off worrying about performance until the last step (just like they are told to). And then it might be too late.
Developers get lulled into thinking everything's fine. Seems fast enough to them. But they are one user. Hundreds or thousands of real users will hit their app. If it's just OK for one, it's probably not OK for hundreds. Even if things seem lightning quick to you, they may not be for the horde.
In a lot of cases, performance can't be gained just by optimizing the little things here and there. In these cases, you often have to restructure how you approach things app-wide; you find yourself tweaking sections of almost every module. Or yanking out nice abstractions in favor of going bare metal. That takes even more time to do after the fact.
My rule of thumb with web apps is actually to:
dave
Re:Truth AND Consequences (Score:3, Interesting)
For example, we recently "tuned" our transaction engine to the extent that it was truly CPU bound in a multiple tiered architecture with hundreds of remote clients using network comms to add transactions. Until this time, we had never been able to drive the engine at full utilisation of a single CPU (it is single threaded) because of various other inefficiencies in the system as a whole. Now that we have reached the point where the overall throughput is constrained by CPU, any enhancements we make to specific algorithms in the system will go straight to the bottom line, of increasing our functional payload per unit time.
What does this have to do with Java? Well, it is my belief that the vast majority of Java's problems are like the ones we used to have, i.e. not algorithmic weaknesses in the implemented software but structural impediments to fully loading the CPU, and so even if you try to optimise your code you can't guarantee that the improvements will appear in the bottom line in terms of the application's performance.
It is for this reason that I cannot use Java for my real world applications (oh, by the way we were 100% loading one of the processors on a dual 750MHz SparcIII [Starfire I think], so it's not like we were trying to squeeze blood from a stone)
It is the same reason why I cannot use functional/logic programming for the complex algorithms that we use for parts of the system: I need to control the execution path because I do not have the luxury of parallelising that path and so every action is on the critical path of the overall system performance. This is a fact lost on the FP, LP zealots (IMHO) [btw I like LP/FP FWIW]
Re:Java is Slow (Score:2, Interesting)
Actually, I'd be willing to bet you considerable amounts of money (euros instead of dollars even, they're currently worth more) that the performance you are perceiving in that test is not Java pur sang but to a far greater extent the load time of the JVM (which, for a job as simple, and with code as limited, as 'ls', is undoubtedly quite significant). The JVM has a nasty tendency, you see, to preload lots of classes -- which, often, don't get used in small programs and so are deadweight. I was playing around with the JNI the other day under JDK1.4.1 on Linux and just wrote a simple "Hello World" -- a C program that embeds the JVM (pretty much what the actual "java" command does), loads a class (two, since the VM must load java.lang.Object in response to my loading my own class), creates an instance, calls a method, ends. Now, I can't claim a perceived performance of "instantaneous" like with a compiled C program, but certainly fast enough that you'd miss it if you blinked. Much faster than the "java" command, which loads a whole slew of stuff.
Also, as another anecdotal hint towards preloading overhead, I was also playing around recently with java.math.BigInteger and wrote a program that calculates, for input n, the nth Fibonacci number. This is done recursively and it creates a new BigInteger object in each recursive call, plus two to start out with. I also put the "input-calculation" part in a loop, so I have a pretty good idea of performance minus startup overhead. And that, I can assure you, is perceived as instantaneous up until F-number 5500 and "fast" up until 17000-20000 (depends on the person). After that it slacks off until you reach 45065 (where I hit the maximum recursion depth) -- which I assure you will be "slow" pretty much everywhere. By comparison, Mathematica crawls when calculating number 50. And there's no perceived difference with Python, but my Python implementation doesn't do F-numbers beyond 998.
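The poster's program isn't shown, so the following is only a guess at its general shape: a recursive Fibonacci over java.math.BigInteger that starts from two values and creates one new BigInteger per call. The class and method names are invented, and the indexing (fib(0) = 0, fib(1) = 1) is an assumption.

```java
import java.math.BigInteger;

// Hypothetical reconstruction of a recursive BigInteger Fibonacci; not the poster's code.
final class BigFibSketch {

    // Returns the nth Fibonacci number with fib(0) = 0 and fib(1) = 1.
    static BigInteger fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        return step(n, BigInteger.ZERO, BigInteger.ONE);
    }

    // Each call creates exactly one new BigInteger (current + next) and recurses
    // once, so the recursion depth equals n and very large n will overflow the stack.
    private static BigInteger step(int remaining, BigInteger current, BigInteger next) {
        if (remaining == 0) {
            return current;
        }
        return step(remaining - 1, next, current.add(next));
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 100;
        System.out.println("fib(" + n + ") = " + fib(n));
    }
}
```

Because the depth of recursion grows linearly with n, a limit like the one reported above (somewhere in the tens of thousands) is what you would expect from the default thread stack size, though the exact number depends on the VM and its settings.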