Panic in Multicore Land 367
MOBE2001 writes "There is widespread disagreement among experts on how best to design and program multicore processors, according to the EE Times. Some, like senior AMD fellow, Chuck Moore, believe that the industry should move to a new model based on a multiplicity of cores optimized for various tasks. Others disagree on the ground that heterogeneous processors would be too hard to program. The only emerging consensus seems to be that multicore computing is facing a major crisis. In a recent EE Times article titled 'Multicore puts screws to parallel-programming models', AMD's Chuck Moore is reported to have said that 'the industry is in a little bit of a panic about how to program multicore processors, especially heterogeneous ones.'"
Self Interest (Score:3, Informative)
So take it all with a grain of salt
--Q
OpenMP? (Score:2, Informative)
Languages (Score:2, Informative)
Comment removed (Score:5, Informative)
What about... (Score:2, Informative)
Re:Self Interest (Score:5, Informative)
Re:Panic? (Score:5, Informative)
Re:Self Interest (Score:2, Informative)
It's worth noting that Intel will also be going down this route in a similar timeframe, integrating an Intel graphics processor onto the CPU die.
Re:Languages (Score:5, Informative)
I think the wailing we're about to hear is the sound of thousands of imperative-language programmers being dragged, kicking and screaming, into functional programming land. Even the functional languages not specifically designed for concurrency do it much more naturally than their imperative counterparts.
Re:Panic? (Score:1, Informative)
I have an old AMD-XP-something running windows XP at home, it is at 5 years old. I have a Core2Duo machine is sometimes use. I dont see much difference in day-to-day usage. Even if there is one, i would attribute most of that to faster drives and i/o.
Re:Languages (Score:5, Informative)
What it *doesn't* do is make it easy to write verifiably immutable types, and code in a functional way where appropriate. As another respondent has mentioned, functional languages have great advantages when it comes to concurrency. However, I think the languages of the future will be a hybrid - making imperative-style code easy where that's appropriate, and functional-style code easy where that's appropriate.
C# 3 goes some of the way towards this, but leaves something to be desired when it comes to assistance with immutability. It also doesn't help that that
APIs are important too - the ParallelExtensions framework should help
I don't think C# 3 (or even 4) is going to be the last word in bringing understandable and reliable concurrency, but I think it points to a potential way forward.
The trouble is that concurrency is hard, unless you live in a completely side-effect free world. We can make it simpler to some extent by providing better primitives. We can encourage side-effect free programming in frameworks, and provide language smarts to help too. I'd be surprised if we ever manage to make it genuinely easy though.
Re:Multicores, but not on a chip (Score:5, Informative)
Re:Panic? (Score:5, Informative)
Re:Panic? (Score:5, Informative)
And frankly, it helps a lot to write code that is microprocessor-friendly to begin with:
If the node-code is bad enough, it can make any parallelism look good to the user. But writing good node-code is hard;-( As a reviewer, I have recommended rejection for a parallel-processing paper that claimed 80% parallel efficiency on 16 processors for the author's air-quality model. But I knew of a well-coded equivalent model that outperformed the paper's 16-processor model-result on a single processor -- and still got 75% efficiency on 16 processors (better than 10x the paper-author's timing).
fwiw.
Re:Multicores, but not on a chip (Score:3, Informative)
yep, here it is : http://api.kde.org/4.0-api/kdelibs-apidocs/threadweaver/html/index.html [kde.org]
Re:My heterogeneous experience with Cell processor (Score:3, Informative)
I find the most useful parts of the STL, don't even use exceptions. It just has a lot of undefined behaviors. There is only one call, at, for vectors and deques that will throw a exception directly. The STL is mostly concerned with being exception safe. Do you have a reference for C++ programming the cell processor concerning the exceptions?
Re:Panic? (Score:3, Informative)
What I did mean to imply is that something fundamental needs to change in the rest of the system as well before this becomes important, though, since right now most of the time I'm not waiting for the CPU, I'm waiting for the hard disk. That guy waiting for the address bar in IE? I'd bet a dollar that he is really waiting for his harddisk. Possibly IE is scanning some history file each time he types a character, and there might be some paging going on, and he might have some severe fragmentation issues, and some torrents open, and all those would combine to making something that should be lightning fast, unbearably slow.
My dualcore, 2.4GHz machine with a staggering 3GB of RAM, occasionally feels slower than my ancient Amiga 500 (7.14MHz, 512KB of RAM, and no hard disk - and no paging file!). As soon as your application swaps out (and that is an activity Windows does as a hobby, just to spite you), you will lose significant time when you want it to come back to life.
And as long as systems remain mostly limited by the harddisk, rather than the CPU, adding threads will not help. Even those massively parallel monster applications of tomorrow will just be spending their time waiting to be paged in.
Re:Panic? (Score:3, Informative)
But massive instruction per clock improvements do not happen very often in the x86 chip industry. In fact, I can count all the major improvements for the last 15 years on one hand:
1993: Intel Pentium Pro (approximately 2 INT, 2 FP operations per clock, best case) introduces real time instruction rescheduling to the x86 world. The design can decode 3 instructions per clock. Yes, I am disregarding the Pentum, because you got NO performance improvement without an optimizing compiler.
1997: MMX increases maximum number of integer instructions to 8 per cycle. But, because of the 64-bit data size, you really see little improvement unless using 16-bit or 8-bit types.
1998/1999: 3DNOW! and SSE double the potential throughput for 32-bit floating point, again not all that impressive.
2001: the Pentium 4 actually REDUCES performance per clock, with a single instruction decoder, and heavy reliance on trace cache to make up for this. SSE2 gives the potential to increase FP thoroughput to 4 instructions per clock, per SSE unit, but a half-assed implementation by both Intel and AMD means nothing changes.
2006/2007: the addition of more decode units on the Core 2, packed SSE instructions for both the Core 2 and the Phenom, and TWO 128-bit SIMD units means we see the first improvements in instructions per clock in years.
Re:Languages (Score:3, Informative)
An object oriented AND functional language built on top of the JVM, with seamless interoperability with Java, immutable collections and an Actors framework pretty much the same as in Erlang. I think it is the solution to the multi-core problem.
Re:Panic? (Score:3, Informative)
Re:Panic? (Score:3, Informative)
1) Is there a programming language that tries to make programming for multiple cores easier?
2) Is programming for parallel cores the same as parallel programming?
3) Is anybody aware of anything in this direction on the C++ front that does not rely on OS APIs?
1) Yes.
2) Maybe.
3) Yes [openmp.org].