Java IO Faster Than NIO 270

Posted by kdawson on Tuesday July 27, 2010 @04:26PM from the question-conventional-wisdom dept.

rsk writes "Paul Tyma, the man behind Mailinator, has put together an excellent performance analysis comparing old-school synchronous programming (java.io.*) to Java's asynchronous programming (java.nio.*) — showing a consistent 25% performance deficiency with the asynchronous code. As it turns out, old-style blocking I/O with modern threading libraries like Linux NPTL and multi-core machines gives you idle-thread and non-contending thread management for an extremely low cost; less than it takes to switch-and-restore connection state constantly with a selector approach."
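For readers unfamiliar with the blocking style being compared, here is a minimal sketch of a thread-per-connection server built on java.io. The class name BlockingEchoServer, the echo protocol, and the roundTrip helper are illustrative assumptions, not code from Tyma's analysis, and the sketch makes no attempt to reproduce his measurements.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one blocking thread per connection, echoing lines back.
final class BlockingEchoServer implements AutoCloseable {
    private final ServerSocket server;

    BlockingEchoServer() {
        try {
            // Port 0 asks the OS for any free port; loopback keeps the demo local.
            server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return server.getLocalPort();
    }

    private void acceptLoop() {
        while (true) {
            final Socket client;
            try {
                client = server.accept(); // blocks until a client connects
            } catch (IOException e) {
                return; // server socket was closed
            }
            // Each connection gets its own thread, which simply blocks on reads.
            Thread worker = new Thread(() -> serve(client), "connection");
            worker.setDaemon(true);
            worker.start();
        }
    }

    private static void serve(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) { // blocks; the loop state lives on this thread's stack
                out.println(line);
            }
        } catch (IOException e) {
            // Client went away; the try-with-resources block has closed the socket.
        }
    }

    // Small client helper for trying the server out: sends one line, returns the echoed line.
    static String roundTrip(int port, String message) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            server.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Creating a server with new BlockingEchoServer() and calling BlockingEchoServer.roundTrip(server.port(), "hello") should return the same line. A selector-based version would replace the per-connection threads with one loop that has to look up each connection's state on every readiness event.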

Re:And this is news? (Score:3, Interesting)

by slack_justyb ( 862874 ) writes: on Tuesday July 27, 2010 @04:34PM (#33050340)

Mod parent up! There is no better way to sum up this article, other than "Yes, we knew that already, but we don't do it that way anymore because we're all lazy."

Re:And this is news? (Score:1, Interesting)

by jellomizer ( 103300 ) writes: on Tuesday July 27, 2010 @04:49PM (#33050546)

Perl is not new-fangled. I am sorry to say Perl is one of those .COM languages that sparked people's interest for a few years but has settled down to a niche language. So it is now an Old School Language... Sorry...
Fewer lines of code doesn't equate to easy. I could write almost any program in one line of APL. However, there is a steep learning curve to APL, debugging APL code is near impossible, and what is worse is trying to add updates to it. Perl can be coded very densely as well, with heavy use of regular expressions... However, regular expressions come at a cost of readability and upgradability as well.

Re:Old news. (Score:5, Interesting)

by bill_kress ( 99356 ) writes: on Tuesday July 27, 2010 @05:25PM (#33050928)

I had a problem where the customer wanted to discover a class-b network in a reasonable amount of time.
Aside from Java's lack of ping causing huge heartaches, the limitation was that old Java IO needed a thread per connection while waiting for a response.
This limited me to 2,000-4,000 outstanding connection attempts at any time. Since most didn't connect, I needed at least 3 retries on each with progressive back-off times--the threads were absolutely the bottleneck.
I reduced the time for this discovery process from days (or the machine just locked up) to 15 minutes. With nio I probably could have reduced it significantly more (although at some point packet collisions would have become problematic).
NIO may not be defective, it just may be solving a problem you haven't conceived of.
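For illustration only (this is not the poster's code), here is a sketch of how java.nio lets a single thread keep many connection attempts outstanding at once. The class name NioConnectProbe and its probe method are hypothetical; retries and progressive back-off are left out to keep it short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: starts every connect without blocking, then waits on one Selector.
final class NioConnectProbe {
    /** Returns the targets that accepted a TCP connection within timeoutMillis. */
    static List<InetSocketAddress> probe(List<InetSocketAddress> targets, long timeoutMillis) {
        List<InetSocketAddress> reachable = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            int pending = 0;
            for (InetSocketAddress target : targets) {
                SocketChannel ch = SocketChannel.open();
                try {
                    ch.configureBlocking(false);
                    if (ch.connect(target)) {
                        // Connected immediately (can happen on loopback).
                        reachable.add(target);
                        ch.close();
                    } else {
                        // Still in progress: remember the target and wait for OP_CONNECT.
                        ch.register(selector, SelectionKey.OP_CONNECT, target);
                        pending++;
                    }
                } catch (IOException e) {
                    ch.close(); // refused or unroutable right away
                }
            }
            long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
            while (pending > 0) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    break; // timed out; whatever is left counts as unreachable
                }
                selector.select(remainingMillis);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    SocketChannel ch = (SocketChannel) key.channel();
                    boolean finished = true;
                    try {
                        if (ch.finishConnect()) {
                            reachable.add((InetSocketAddress) key.attachment());
                        } else {
                            finished = false; // not done yet; keep waiting on this key
                        }
                    } catch (IOException e) {
                        // Connection refused or host unreachable.
                    }
                    if (finished) {
                        ch.close();
                        pending--;
                    }
                }
            }
            // Close any channels that never completed before the deadline.
            for (SelectionKey key : selector.keys()) {
                key.channel().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reachable;
    }
}
```

Each pending attempt costs one open channel rather than one thread, so the practical ceiling becomes the process's file-descriptor limit. Scanning a whole class-B network this way would presumably still need to work in batches.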

Re:Should be using Scatter/Gather +IOCP on windows (Score:3, Interesting)

by PhrostyMcByte ( 589271 ) writes: <phrosty@gmail.com> on Tuesday July 27, 2010 @05:58PM (#33051242) Homepage

Actually, they are better for different things. In Linux you get notified when you can perform an I/O, perform a bunch of non-blocking I/O, and then wait for another notification. In Windows you perform an I/O, and it will either complete immediately or notify you when it does. This means async I/O on Linux can use less memory, while on Windows it can give higher throughput.
Of course, these are merely API advantages -- if the implementation is poor, that won't matter. I'm not aware of any serious tests on this. And even then, Windows lacks an equivalent to Linux's splice(), and its equivalent of sendfile() is crippled on desktop versions to discourage using a desktop as a file server.

True for JAVA, but not generally true... (Score:5, Interesting)

by grmoc ( 57943 ) writes: on Tuesday July 27, 2010 @06:30PM (#33051506)

This may be true for Java.
It isn't true for C/C++.
With C/C++ and NPTL, the many-thread blocking IO style yields slightly lower latency at low IO rates, but offers significant latency variability and sharply decreased throughput at higher IO rates.
It seems that the Linux scheduler is much to blame for this-- the number of times that a thread is scheduled on a different CPU increases dramatically with more threads, and this thrashes the caches.
I've seen order-of-magnitude decreases in performance and order-of-magnitude increases in latency as a result of what appears to be the cache thrashing.

Re:And this is news? (Score:3, Interesting)

by Entrope ( 68843 ) writes: on Tuesday July 27, 2010 @06:57PM (#33051696) Homepage

The extra stuff to take care of is why asynchronous I/O applications tend to have lower throughput than synchronous I/O if you have good OS threading.
There have only ever been two good reasons to use application-multiplexed I/O: Your OS sucks at threading (like Windows and Solaris the last time I looked at them), or you have more clients than memory. Languages like C and Java require applications to dedicate multiple kilobytes per thread for the thread's stack -- but usually default to megabytes per thread, so if you have thousands of concurrent clients, you will soak up memory in fairly large quantities.
Applications like IRC and Jabber can have tens of thousands of clients on a single server (until some jackoff decides to DoS it), so there is strong pressure to minimize per-client memory use.
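To put rough numbers on the per-thread stack cost described above, here is a small sketch. The class ThreadStackBudget and the example sizes (10,000 clients, 1 MiB and 64 KiB stacks) are assumptions chosen for illustration. The figures are reserved address space in the worst case, not necessarily resident memory.

```java
// Hypothetical helper for estimating thread-per-client stack reservations.
final class ThreadStackBudget {
    /** Worst-case stack reservation in bytes when every client gets its own thread. */
    static long reservedBytes(int clients, long stackBytesPerThread) {
        return Math.multiplyExact((long) clients, stackBytesPerThread);
    }

    /**
     * Starts a thread with a requested stack size. The stackSize argument of this
     * Thread constructor is only a hint; the JVM may round it or ignore it.
     */
    static Thread startWithStack(Runnable task, long stackBytes) {
        Thread t = new Thread(null, task, "client", stackBytes);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // 10,000 clients at 1 MiB each: 10,000 MiB of stack reservation.
        System.out.println(reservedBytes(10_000, mib) / mib + " MiB");
        // The same clients at 64 KiB each: 625 MiB.
        System.out.println(reservedBytes(10_000, 64L * 1024L) / mib + " MiB");
    }
}
```

The gap between those two figures is why servers with tens of thousands of clients either shrink their thread stacks or drop the thread-per-client model.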

Re:True for JAVA, but not generally true... (Score:5, Interesting)

by grmoc ( 57943 ) writes: on Tuesday July 27, 2010 @07:27PM (#33051922)

Unfortunately, nothing I can publish without permission.
I can say that I'm in charge of maintaining the software that terminates all HTTP traffic for Google. Draw your own conclusions.

Re:Old news. (Score:1, Interesting)

by Anonymous Coward writes: on Tuesday July 27, 2010 @08:41PM (#33052476)

Absolutely.
First, the paper does say that synchronous NIO is just as fast as (synchronous) IO. So it really is about synchronous vs asynchronous.
Second, asynchronous is inherently slower than synchronous: in addition to the OS having to figure out what to wake up, the application also has to restore the per-connection context in the asynchronous case. In the synchronous case it doesn't, since that state is then more correctly called the stack. The exception is when, due to caching issues or OS limitations *and* the state being so small (as in your case), it is faster to, basically, work around the OS. This obviously is not the general case.
Last, it's kind of funny to hear asynchronous being presented as the "modern" way, and synchronous/multithreaded as the "old" way, since asynchronous was basically invented to work around the fact that OSes didn't support threads... Which hasn't been true for a while.
Cheers.
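The point about state living on the stack can be made concrete with a sketch that reads the same length-prefixed frame two ways. The class FrameReaders, the 4-byte big-endian length prefix, and the Decoder type are assumptions invented for this example. Neither reader validates the length field.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical example: one frame = 4-byte big-endian length, then that many body bytes.
final class FrameReaders {
    /** Blocking style: progress is tracked by local variables and the call stack. */
    static byte[] readBlocking(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int length = data.readInt();   // blocks until 4 bytes have arrived
            byte[] body = new byte[length];
            data.readFully(body);          // blocks until the whole body has arrived
            return body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Event-driven style: the same progress must be stored in an object between calls. */
    static final class Decoder {
        private final ByteBuffer header = ByteBuffer.allocate(4);
        private ByteBuffer body; // null until the header is complete

        /** Consumes whatever bytes are available; returns a frame, or null if more are needed. */
        byte[] feed(ByteBuffer chunk) {
            while (header.hasRemaining() && chunk.hasRemaining()) {
                header.put(chunk.get());
            }
            if (header.hasRemaining()) {
                return null; // still waiting for the rest of the length prefix
            }
            if (body == null) {
                body = ByteBuffer.allocate(header.getInt(0));
            }
            while (body.hasRemaining() && chunk.hasRemaining()) {
                body.put(chunk.get());
            }
            if (body.hasRemaining()) {
                return null; // still waiting for the rest of the body
            }
            byte[] frame = body.array();
            header.clear(); // reset the saved state for the next frame
            body = null;
            return frame;
        }
    }
}
```

The blocking reader needs no fields at all, while the Decoder has to save and reload its position on every call. That bookkeeping is the context restore the comment refers to.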

Re:Should be using Scatter/Gather +IOCP on windows (Score:5, Interesting)

by dr2chase ( 653338 ) writes: on Tuesday July 27, 2010 @08:42PM (#33052478) Homepage

I'm afraid I have to disagree. No fan of Microsoft, but I helped build a the-Java-Programming-Language-TM Virtual Machine on Windows, with M:N threads, back before Java 1.4, and IO Completion ports worked well, and we got good performance out of them. We rewrote the network IO to work behind the curtain with threads, with the result that the one-socket-per-thread model actually did the I/O completion port thing, with as many as 32k Java threads running in a grand total of about a dozen Windows threads (stacks were small, stacks grew on demand. Certain things were tricky.).
The largest wins of doing it this way were:
1) got to use the underlying OS's preferred way of doing async IO (on another OS, we might do it differently)
2) lots of threads allowed
3) because Java "context switches" were extremely lightweight, lots of "expensive" stuff got faster (e.g., lock contention).
I also accidentally (really -- I had to choose one of two threads to go first, and chose the right one, on a whim) built-in an anti-convoying heuristic for contended locks, that was really useful when code contained a hot lock.
But, the rest of the system was not especially Microsoft-y; all of us came from a Unix background, and when we were done, we did Unix again. IO Completion ports, at least on Windows, were the best choice (and I tried it 2 or 3 other ways, and they sucked).

Re:And this is news? (Score:3, Interesting)

by LionMage ( 318500 ) writes: on Tuesday July 27, 2010 @09:12PM (#33052658) Homepage

Java may not be "sexy" anymore (or "all the rage" as you put it), but it is not exactly a niche language. It still runs in surprisingly many places, like cell phone apps (yes, a lot of us still use regular cell phones, and Android is Java-ish but with some tweaks), and more importantly just about every corporate data center uses Java. That last "niche" is pretty huge, and the only thing that threatens Java in that space is dot-Net, the Java platform clone.
Java, like it or not, has become the COBOL of the 21st century. It's ubiquitous.
I agree that Perl makes code hard to maintain (especially in the sense that one developer won't necessarily readily understand another's Perl code, since everyone has his own favorite idiom), but you make a lot of claims that I don't see supported by facts. Perl CGI might be frowned on these days in some circles, but there are plenty of sites that use Perl as a basis -- including this one, Slashdot. So saying Perl is no longer used for CGI scripts is probably false, as there are plenty of folks who clearly think it's "good enough."
You're trying to make Perl and Java both sound like fads, but the truth is neither language is going away anytime soon, as each is too useful for too large a segment of the developer population.

Re:Should be using Scatter/Gather +IOCP on windows (Score:3, Interesting)

by dr2chase ( 653338 ) writes: on Tuesday July 27, 2010 @10:41PM (#33053108) Homepage

The goodness of this strategy assumes some sort of linear-in-delay metric. If there's a deadline, with high penalties for exceeding it (say, if you are serving web pages), you don't want to be stochastically fair, you want to be fair.
The scheduler I wrote was 100% fair, EXCEPT in the case where a thread exited a critical section that other threads were competing for (i.e., were blocked on). In that case, the exiting thread would give up its quantum to the head (longest waiter) of the queue, who would do the same, until the quantum expired or the queue of blocked threads was empty, in which case, the last thread through the gate would get the remainder of the quantum (not fair). The result of this is that the lock is left in the unowned state, and threads will get a better chance at blowing right through the critical section in the future.
You could see how different VMs approached the problem, running things like TP benchmarks, or just a beating-on-a-lock benchmark. We blocked threads in FIFO, another VM did LIFO, another VM did something bimodal and weird. And as far as throughput went, when this sort of badness (a hot lock) occurred, we were far and away the fastest, mostly because of the user-mode context switches, but also because of the no-convoy heuristic that kept locks clean.
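The FIFO wake-up order mentioned above can be observed with the standard library's fair lock. This is only a loose analogy: it does not implement the quantum hand-off or the user-mode context switches of the VM described, and the class FifoLockDemo is a hypothetical name for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: records the order in which queued threads obtain a fair lock.
final class FifoLockDemo {
    /** Queues the given number of threads on a fair lock and returns their acquisition order. */
    static List<Integer> acquisitionOrder(int waiters) {
        ReentrantLock lock = new ReentrantLock(true); // true = fair: longest waiter goes first
        List<Integer> order = new ArrayList<>();      // only touched while holding the lock
        List<Thread> threads = new ArrayList<>();
        lock.lock();
        try {
            for (int i = 0; i < waiters; i++) {
                final int id = i;
                Thread t = new Thread(() -> {
                    lock.lock();
                    try {
                        order.add(id);
                    } finally {
                        lock.unlock();
                    }
                });
                t.start();
                threads.add(t);
                // Wait until this thread is in the lock's queue before starting the next,
                // so the arrival order is known.
                while (!lock.hasQueuedThread(t)) {
                    Thread.yield();
                }
            }
        } finally {
            lock.unlock(); // release the hot lock; waiters now drain in queue order
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return order;
    }
}
```

Calling FifoLockDemo.acquisitionOrder(5) should give the arrival order 0 through 4. The JDK documentation warns that fair locks can have lower throughput than the default non-fair mode, so the speed advantage the comment attributes to its heuristic should not be assumed to carry over to this class.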
