Scalable Nonblocking Data Structures
An anonymous reader writes "InfoQ has an interesting writeup of Dr. Cliff Click's work on developing highly concurrent data structures for use on the Azul hardware (which is in production with 768 cores), supporting 700+ hardware threads in Java. The basic idea is to use a new coding style that involves a large array to hold the data (allowing scalable parallel access), atomic update on those array words, and a finite-state machine built from the atomic update and logically replicated per array word. The end result is a coding style that has allowed Click to build 2.5 lock-free data structures that also scale remarkably well."
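For readers who want to picture the coding style the summary describes, here is a minimal, hypothetical sketch (not Click's actual code) of an insert-only set built the same way: one big array, a single CAS per mutation, and a tiny per-word state machine (null -> value). The class name and the insert-only restriction are assumptions for illustration; the real structures also handle resizing, deletion, and richer state transitions.

import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch: a fixed-size lock-free "insert-only set".
// Each array word moves through a tiny state machine via CAS:
//   null --CAS--> value   (slot claimed)
// Every mutation is one atomic word update, so threads never block each other.
public final class WordStateSet<T> {
    private final AtomicReferenceArray<Object> slots;

    public WordStateSet(int capacity) {
        slots = new AtomicReferenceArray<>(capacity);
    }

    /** Returns true if the value was inserted or was already present. */
    public boolean add(T value) {
        int h = value.hashCode() & 0x7fffffff;
        for (int i = 0; i < slots.length(); i++) {
            int idx = (h + i) % slots.length(); // linear probing
            Object cur = slots.get(idx);
            if (cur == null) {
                // Attempt the state transition null -> value atomically.
                if (slots.compareAndSet(idx, null, value)) return true;
                cur = slots.get(idx); // lost the race; re-read and fall through
            }
            if (value.equals(cur)) return true; // someone else inserted it
        }
        return false; // table full (a real implementation would resize)
    }

    public boolean contains(T value) {
        int h = value.hashCode() & 0x7fffffff;
        for (int i = 0; i < slots.length(); i++) {
            Object cur = slots.get((h + i) % slots.length());
            if (cur == null) return false;
            if (value.equals(cur)) return true;
        }
        return false;
    }
}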
false sharing? (Score:2, Interesting)
1. A common killer in parallelization is false sharing: threads on two processors fight over a cache line even though they are accessing independent variables. A cache line is typically bigger than an individual variable, so the approach of using adjacent elements of an array for parallelism sounds naive. One needs to pad the array (see the sketch after this list).
2. Updating a shared variable, especially a non-scalar, in an inner loop is naive. When parallelizing, reference local scalars in inner loops and have each thread update the shared variable just once, as in the sketch below. Don't reference non-scalars or shared variables in an inner loop, and don't lock in the inner loop either, if you can avoid it.
If it is not an inner loop, then the locking is probably not a big issue anyway.
3. Java, really, Java?
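A hypothetical sketch of points 1 and 2 together (the class name and the stride constant are made up for illustration): each thread accumulates into a local scalar, writes the shared array exactly once, and the partial results are spaced a cache line apart so no two threads contend for the same line.

import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch of padding plus thread-local accumulation.
public final class PaddedSum {
    // Assume 64-byte cache lines: 8 longs per line, so stride the slots by 8.
    static final int STRIDE = 8;

    public static long parallelSum(long[] data, int nThreads) throws InterruptedException {
        AtomicLongArray partials = new AtomicLongArray(nThreads * STRIDE);
        Thread[] workers = new Thread[nThreads];
        int chunk = (data.length + nThreads - 1) / nThreads;
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                long local = 0; // local scalar: nothing shared in the inner loop
                int from = id * chunk, to = Math.min(data.length, from + chunk);
                for (int i = from; i < to; i++) local += data[i];
                partials.set(id * STRIDE, local); // one shared write per thread
            });
            workers[t].start();
        }
        long total = 0;
        for (int t = 0; t < nThreads; t++) {
            workers[t].join();
            total += partials.get(t * STRIDE);
        }
        return total;
    }
}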
Re:Java???? (Score:3, Interesting)
Umm, because Azul runs Java in hardware. It *is* optimized by being in Java.
Re:I think you have it wrong (Score:1, Interesting)
The new Vega3 (announced last week) takes it to 54 cores per chip, and 864 cores in a system (16 x 54 = 864).
Re:scalable noNBLocking data sTRructures .. :) (Score:5, Interesting)
Your turn.
No, it is Lock Free (Score:2, Interesting)
Re:Java???? (Score:2, Interesting)
In any case, the author of that test failed: the test includes start-up time in the results. Java is always going to lose micro-benchmarks where start-up time is included, because loading the JVM gets counted as part of the benchmark. As anyone who knows Java well can tell you, Java's startup is extremely costly; once the JVM is running, however, it performs much better. The end result is that the startup cost eclipses the actual benchmark being run.
I'm willing to bet you'll have a hard time finding a modern benchmark for Java that doesn't make that (simple!) mistake.
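To make that concrete, here is a hand-rolled sketch of the fix (the workload and iteration counts are arbitrary assumptions): run the code enough times for the JIT to compile the hot path, discard those results, and only then start the clock, so JVM startup and compilation never enter the measurement.

// Hypothetical sketch: separate warm-up from the measured steady state.
public final class WarmedBenchmark {
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 1000000; i++) sum += i * 31L;
        return sum;
    }

    public static void main(String[] args) {
        long sink = 0; // accumulate results so the JIT can't dead-code the loop

        // Warm-up phase: results discarded, JIT compiles the hot path.
        for (int i = 0; i < 10000; i++) sink += workload();

        // Measured phase: only steady-state performance is timed.
        long start = System.nanoTime();
        for (int i = 0; i < 10000; i++) sink += workload();
        long elapsed = System.nanoTime() - start;

        System.out.println("ns/iteration: " + (elapsed / 10000) + " (sink=" + sink + ")");
    }
}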
However, I will admit that I was overextending my reach by suggesting that no one runs benchmarks anymore because of Java's strong performance. The reality is (as one might expect) much more complex. It's not that Java performed poorly in benchmarks; quite the contrary. By 2004, Java was cleaning up in benchmarks. The problem is that micro-benchmarking sucks: all you end up getting is the two sides fighting over who can produce the most optimized benchmark possible, which invariably defeats the purpose of benchmarking for "real-world" code.
The long and short of it is that Java was proven to be more than fast enough, with the average case easily keeping pace with or beating C++ code. Thus the statement by the author of the article you linked to:
"Unfortunately, there's also a third conclusion. It seems that it's much, much easier to create a well performing program in Java. So, please consider it for a moment before you start recoding your Java program in C++ just to make it faster..."
Can also be done with a clean cache (Score:3, Interesting)
But, as you said, the lock doesn't require a bus lock. Please correct me if I'm wrong on this; it's been a few months since I've read that piece of code, and I'm a little hazy on the exact details.
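For what it's worth, here is a hypothetical sketch of the kind of lock being discussed, built from a single compareAndSet on one word. As I understand it (consistent with the parent's point), on modern x86 that CAS becomes a cmpxchg on the owning core's cache line rather than a global bus lock, as long as the word doesn't straddle cache lines.

import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: lock acquisition is one CAS on a single word.
public final class CasSpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    public void lock() {
        // Test-and-test-and-set: spin on plain reads, which stay in the
        // local cache, and only attempt the CAS when the lock looks free.
        while (true) {
            while (held.get()) Thread.yield();
            if (held.compareAndSet(false, true)) return;
        }
    }

    public void unlock() {
        held.set(false); // volatile write releases the lock
    }
}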
Re:Java???? (Score:3, Interesting)
Of course, when your boss/customer requires Java, use Java.
I'm mixed on Lisp: Lisp is kind of high level and kind of not... I guess I'm just too lazy. Perl has CPAN, which means lots of wheels I don't have to reinvent (or document).