Intel Updates Compilers For Multicore CPUs
Posted by
kdawson
on Tue Jun 05, 2007 02:46 PM
from the what-about-gcc dept.
Threaded writes with news from Ars that Intel has announced major updates to its C++ and Fortran tools. The new compilers are Intel's first that are capable of doing thread-level optimization and auto-vectorization simultaneously in a single pass. "On the data parallelism side, the Intel C++ Compiler and Fortran Professional Editions both sport improved auto-vectorization features that can target Intel's new SSE4 extensions. For thread-level parallelism, the compilers support the use of Intel's Thread Building Blocks for automatic thread-level optimization that takes place simultaneously with auto-vectorization... Intel is encouraging the widespread use of its Intel Threading Tools as an interface to its multicore processors. As the company raises the core count with each generation of new products, it will get harder and harder for programmers to manage the complexity associated with all of that available parallelism. So the Thread Building Blocks are Intel's attempt to insert a stable layer of abstraction between the programmer and the processor so that code scales less painfully with the number of cores."
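For readers unfamiliar with the term, a loop of roughly this shape is a typical auto-vectorization candidate. This is only a minimal sketch: the function name `saxpy` is an illustration and does not come from Intel's announcement, and whether a given compiler actually emits SSE instructions for it depends on its flags and target.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the article): y[i] = a * x[i] + y[i].
// Every iteration is independent of the others and memory is walked
// contiguously, which is the pattern auto-vectorizers look for.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const float* xs = x.data();
    float* ys = y.data();
    for (std::size_t i = 0; i < n; ++i) {
        ys[i] = a * xs[i] + ys[i];
    }
}
```

Calling `saxpy(2.0f, x, y)` updates `y` in place; the source code stays scalar, and any SIMD code is generated by the compiler.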
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Anyone want to... (Score:5, Funny)
Re:Anyone want to... (Score:5, Informative)
Re: (Score:2)
:)
Soooo, at the risk of sounding really stupid, wasn't this sort of thing happening with previous compilers?
Re:Anyone want to... (Score:4, Funny)
FYI, not a programmer/developer/etc., not even PHP, just interested in tech, but love the attitude anyway, AC
Re:Anyone want to... (Score:5, Informative)
They found a way to make the computer determine how to use its many CPU cores automagically when you compile a program. It is cool, since it is really hard to figure out how to share a given workload evenly 16 ways.
Re: (Score:2)
I can't speak for Fortran, but what standard C++ mechanisms are there for threading? If they added stuff to the CLR, shouldn't it have gone through the organizations that maintain them? Weird compiler extensions are bad for cross-compatibility. (Which I guess is the point since Intel compilers -> Intel CPUs -> No other CPU manufacturers).
Besides, threading is still an OS specific venture. Do these optimizations just work by looking for calls to fork() or the Windows alternati
Re:Anyone want to... (Score:5, Informative)
A more interesting question is "Is that good enough?" For vectorization, the answer is 'usually' - so some additional work/headaches happen when it isn't enough. For parallelization - the answer is at best 'sometimes.' So I'll get flamed two ways: (1) by people very happy with it - who will say that I've understated how good it is - and it is all they need, (2) by people with programs which don't get magical auto-parallelism to solve their needs. There are more people in #2 than #1 - but this ain't a 1-size-fits-all-world. Not a bad deal if it solves your problems - otherwise - you got work to do... but that ain't the compiler's fault... parallelism requires work for most of us.
About languages...
Virtually every Fortran, C and C++ compiler these days supports OpenMP, which is not part of the official language standards - but is there to use. It is loop oriented, and is very Fortran-like and fits into C well enough... but is definitely not C++ like.
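To illustrate what "loop oriented" means here, a minimal sketch follows, assuming a compiler with OpenMP support switched on (for example `-fopenmp` on GCC). The function name `sum_of_squares` is made up for this example.

```cpp
#include <vector>

// Hypothetical example: sum of squares with an OpenMP reduction.
// With OpenMP enabled, the loop iterations are divided among threads
// and the per-thread partial sums are combined at the end. Without
// OpenMP the pragma is ignored and the loop simply runs serially.
long long sum_of_squares(const std::vector<int>& values) {
    long long total = 0;
    const int n = static_cast<int>(values.size());
    #pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        total += static_cast<long long>(values[i]) * values[i];
    }
    return total;
}
```

The parallelism is expressed as an annotation on a plain counted loop rather than through classes or templates, which is why it feels natural in Fortran and C but not very C++ like.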
Fortran and C/C++ don't support threading in the language: you need to write your code to be thread-safe, and you need to use a threading package like Windows threads or POSIX threads (pthreads). Boost.Thread offers a portable interface that hits the key threading needs (essentially wrappers for pthreads, Windows threads, etc.), and the standards are likely to add a portable interface officially in the future. That is one thing Java did from the start.
Intel compilers -> Intel CPUs -> all compatible processors
The Intel compilers and libraries aim to beat other compilers and libraries regardless of the processor they are run on. No one will get it right all the time - so this is not a dare to find single examples of little code samples to prove me wrong. But if a real program doesn't get the best results from Intel - we want to know. (yep - I work at Intel - I post for myself)
Re: (Score:2, Informative)
Re: (Score:3, Interesting)
Threading Building Blocks is a good op
Re:Anyone want to... (Score:5, Funny)
See, it's not that hard to understand.
Re: (Score:2)
Re:Anyone want to... (Score:5, Interesting)
The Threading Building Blocks are yet another attempt to make writing multithreaded code easier. Frankly I don't find pthreads hard but maybe I am just odd.
Threading is very important because we are not going to see an endless increase in clock speed anymore. Intel, AMD, and IBM are all pushing multiple cores. Adding an extra core or three really does help modern systems at least a little, since we are often running multiple tasks, but current software will not scale as well when the core count starts growing in a Moore-like fashion. Right now we are at four cores; if Moore's law holds, in two years we might see eight, then 16, then 32... As you can see, it gets out of hand pretty quickly. Your average desktop will not use four cores very well, much less eight, until software is written to take advantage of more cores.
Yes, I know that Moore said 18 months, but I was going for nice round numbers.
Re: (Score:3, Funny)
Intel - The Software Company (Score:5, Insightful)
And as much as they develop compilers to optimize code for Intel CPUs, the code will most of the time see a speed increase on AMD CPUs as well. Who else do you want developing a compiler but the people who made the hardware it's running on?
Re: (Score:3, Insightful)
You mean like nvidia making nvidia drivers for linux?
Re:Intel - The Software Company (Score:4, Funny)
Re:Intel - The Software Company (Score:5, Interesting)
Re: (Score:3, Funny)
Anyways, it's not like MMX/SSE are really used for much of anything but benchmarks and voice synthes
Re: (Score:2)
Re:Intel - The Software Company (Score:4, Insightful)
This matters because the whole purpose of IPP is to take advantage of newer instructions. If you say "new instructions don't matter because no one uses them" it becomes a self-fulfilling prophecy. Optimized libraries could break out of that cycle, but only if they aren't used as competitive weapons.
Re: (Score:3, Insightful)
They should also let you build binaries without those fallback code paths, as a lot of code will never run on older machines (e.g. x86 Macs, which all have at least SSE3).
If someone's system locks up because AMD claimed to support a feature which they don't actually support, that's AMD's fault, and Intel could claim the moral high ground instead of the other way round.
Re:Intel - The Software Company (Score:5, Informative)
My favorite complaint: Intel checks "CPUID"
No duh - that's where the feature information is.
Next favorite: Intel checks for "GenuineIntel".
Another "no duh" - RTFM from Intel or AMD - the feature flag checking has to come AFTER you determine the manufacturer AND family of the processor...
unless you don't care about running on all processors
(spare pointing out to me that you can skip the first two checks - look at the SSE flag - and it is usually right - unless say you pick just the right older processor)
We do the checks the way Intel and AMD manuals say we have to... if that is evil... so be it.
We even start by testing if the CPUID instruction exists (it didn't before Pentium processors).
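The check order described above (vendor, then family, then feature flags) can be sketched as follows. This is an illustration only and not Intel's actual dispatch code: the struct and function names are hypothetical, and the step of executing CPUID itself is compiler-specific, so it is left out and the raw register values are passed in.

```cpp
#include <cstdint>
#include <string>

// Raw register values as returned by one CPUID invocation.
// How they are obtained is compiler-specific and omitted here.
struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

// Leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX (in that order).
std::string vendor_string(const CpuidRegs& leaf0) {
    std::string out;
    const std::uint32_t parts[3] = {leaf0.ebx, leaf0.edx, leaf0.ecx};
    for (std::uint32_t reg : parts) {
        for (int shift = 0; shift < 32; shift += 8) {
            out.push_back(static_cast<char>((reg >> shift) & 0xFFu));
        }
    }
    return out;
}

// Leaf 1, EAX: base family in bits 8-11; the extended family
// (bits 20-27) is added only when the base family is 0xF.
unsigned cpu_family(const CpuidRegs& leaf1) {
    const unsigned base = (leaf1.eax >> 8) & 0xFu;
    const unsigned ext = (leaf1.eax >> 20) & 0xFFu;
    return base == 0xFu ? base + ext : base;
}

// Leaf 1, EDX: bit 25 is the SSE flag, bit 26 is the SSE2 flag.
bool has_sse(const CpuidRegs& leaf1) { return ((leaf1.edx >> 25) & 1u) != 0; }
bool has_sse2(const CpuidRegs& leaf1) { return ((leaf1.edx >> 26) & 1u) != 0; }
```

A dispatcher built on these helpers would compare `vendor_string` against a known vendor, consult `cpu_family`, and only then trust `has_sse` or `has_sse2`.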
Re:Intel - The Software Company (Score:4, Interesting)
It's really useful for a CPU company to develop an optimizing compiler for their hardware. It forces them to understand how their CPU features actually speed up software, and it gives them the opportunity to prove that certain hard optimizations actually work. It would probably be best for everyone if the compiler were open source, but if Intel thinks they need to sell it as a commercial product to justify it financially we still get all of the benefit on their future processor designs.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
My goodness... you can't mean... that the company which developed the hardware is in a strong position to get a few people from the hardware dev team onto the team developing software for it?! And that these people are well placed to know what's worth optimising, where and how?
No shit, Sherlock.
The only amazing thing about this is that it is such a novel insight that it is necessary for you to be modded as such.
GCC (Score:3, Insightful)
No and yes (Score:3, Informative)
Re:No and yes (Score:5, Informative)
Now, if you're Intel then you have the time and the money to work out exactly how to exploit these tradeoffs to schedule threads effectively. But you don't want to give that away for free. From the (very scanty) marketing bullshit that was linked to, it would appear that they've added an Intel-specific threading library (probably with a POSIX interface). Separate from this are a profiling tool and a multi-threaded debugger (the latter of which is non-trivial). While any debugger will let you skip across threads, allowing you to do it in a deterministic manner to look for race hazards is much harder.
The analysis tools sound nice, but the bolt-on library is nothing special. It's purely to win a few synthetic benchmarks and gain some market share for ICC and therefore more "Made for Intel" applications in the market. I'm cynical about the library because what is broken about the threading model in C/C++ would take more than a library to fix. It would require redesigning the language down to the ground and choosing a different set of control constructs.
So finally, when you claim that it's because Intel has "better" coders, you don't know what you're talking about. I know a few guys who code GCC for a living, and they are grade A coders. It is because Intel has moved the goalposts. It's not so much that GCC targets multiple architectures, it's that they are trying to stick to (relatively) standard C, whereas Intel is willing to redefine where the semantic gap sits if they can squeeze out a little more performance. Their attitude is screw portable code - talking across different compiler vendors here, rather than chip vendors. If what they need to squeeze into their compiler is no longer "C" strictly speaking, then they don't care. The gcc guys do.
Ah yes, and portable code can be a smaller window than you expect. That weighty 1000 page Intel document is sitting comfortably next to the AMD equivalent, which differs in surprising places.
learn better parallel programming techniques? (Score:3, Interesting)
Re: (Score:2, Insightful)
If compilers keep abstracting away the interface between the programmer and the cpu, programmers will be less likely to write better code or learn new techniques that take advantage of all the power a few extra cores can provide right?
If compilers keep abstracting away the interface between the programmer and the cpu, and getting better at optimization, programmers won't need to write better code or learn new techniques to take advantage of all the power a few extra cores can provide.
Instead the programmer can concentrate on writing more understandable code.
Re: (Score:2, Insightful)
Even more so for interpreted/compiled on the fly languages. They can be dynamically compiled to take advantage of whatever hardware is available on each machine, without the developer having to code for it.
Re: (Score:2)
Re: (Score:3, Insightful)
These days, a similar thing is happening with vectorization. If programmers try t
Re: (Score:2)
Good programmers can write good, highly optimized and mostly bug-free code.
Unfortunately, good programmers are like good car drivers. Everyone says they are good, but very few really are.
Looks like something they rushed out (Score:4, Informative)
"The Intel Threading Tools automatically finds correctness and performance issues" (The tools finds?)
"Along with sufficient task scheduler and generic parallel patterns" (Who has insufficient task scheduler?)
"automatic debugger of threaded programs which detects many of thread-correctness issues such as data-races, dead-locks, threads stalls" (Sarcasm fails me...)
And that's just in the first few paragraphs, I haven't even gotten to the real meat of the article!
I'm used to informative, well-written and reasonably complete technical documentation from Intel — WTF is this?
Re: (Score:2)
No, the "Intel Threading Tools" is a product, in the singular -- it finds. Maybe Intel threading tools would find, but notice the subtle difference?
OK, so it's a bit awkward to parse, but isn't it obvious by the grammar that "sufficient" modifies both "task scheduler patterns" and "generic parallel
Re: (Score:3, Informative)
Probably the source of the less than optimal text.
How's the documentation on -your- compiler coming along?
Re: (Score:2)
WTF? No, this is SPARTA!
OK, I'll Byte (Score:3, Interesting)
As a programmer, I already have abstractions such as Active Objects [wustl.edu]. While this may make it easier for compiler writers or kernel hackers, what benefits does it bring to us ordinary mortals?
The inevitable... (Score:4, Funny)
30
20
10
Re:The inevitable... (Score:5, Insightful)
intel's product page (Score:4, Informative)
I dont understand this statement: (Score:5, Insightful)
I'm very surprised and disappointed by the pervasiveness of the incorrect myth that's being promoted, even amongst supposedly technically knowledgeable groups, that:
a) Writing multithreaded code is terribly difficult
b) You need to implement code to have the same number of threads as your target hardware has cores
Both of these are completely untrue, at least for the PC marchitecture.
The way to develop multithreaded code is to exploit the natural parallelism of the problem itself. If the problem decomposes down most neatly into one, three or 6789 threads, then design and write the implementation that way. Consequently the complexity of the problem does not increase as the number of cores available increases.
In the PC architecture case, attempting to design your code based on the number of cores in your target hardware just leads to a twisted and therefore bad and also non-portable design.
I'm surprised how few developers seem to understand that in fact it's OK, normal and often desirable to have more than one application thread running on the same core. In fact you really can't even ensure or even assume that your multi-threaded app will get one core per thread even if the hardware has enough cores, or that it will work best if it does, as core/thread allocation is dynamically scheduled by the OS depending on loading. Not to mention there are all sorts of other apps, drivers and operating system tasks running concurrently too, so depending on each core's load, one app-thread per core may not actually be the optimal approach anyway.
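As a rough illustration of letting the problem pick the thread count, here is a minimal sketch. It uses `std::thread` from C++11, which postdates this discussion, and the names `Stats` and `compute_stats` are hypothetical. The problem has three independent sub-tasks, so it gets three threads and never asks how many cores exist.

```cpp
#include <algorithm>
#include <numeric>
#include <thread>
#include <vector>

struct Stats {
    long long sum = 0;
    int min = 0;
    int max = 0;
};

// Hypothetical example: three independent sub-tasks, three threads,
// whatever the core count happens to be. Each thread writes only to
// its own field, so no locking is needed. Assumes `data` is non-empty.
Stats compute_stats(const std::vector<int>& data) {
    Stats s;
    std::thread t_sum([&] { s.sum = std::accumulate(data.begin(), data.end(), 0LL); });
    std::thread t_min([&] { s.min = *std::min_element(data.begin(), data.end()); });
    std::thread t_max([&] { s.max = *std::max_element(data.begin(), data.end()); });
    t_sum.join();
    t_min.join();
    t_max.join();
    return s;
}
```

On a single-core machine the OS time-slices the three threads; on a machine with more cores it may spread them out. The code is the same either way.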
Re: (Score:3, Insightful)
1 - If the communication or thread switching overhead exceeds the thread computation, it is not worth threading.
2 - It is (unfortunately) easy to build in "lock stepping" into otherwise independent threads. These systems scale from 1..n cores; after n cores no further scaling is seen.
3 - It *is* difficult to build correct parallel systems. Especially with points 1 and 2 in mind (and, yes, I *have* built parallel high-speed device drivers that are lock-free to avoid switching).
4 - *Proving
Would the OS benefit from using this? (Score:4, Interesting)
As an aside, Linux is obviously compiled using GCC but I wonder if Microsoft compiles Windows using the Intel compilers?
Re:Umm.. (Score:5, Funny)
Re:Umm.. (Score:5, Funny)
Re:Umm.. (Score:5, Funny)
You're thinking of IBM [com.com].
Re: (Score:2)