Intel Compiler Compared To gcc 101
Screaming Lunatic writes "Here are some benchmarks comparing Intel's compiler and gcc on Linux. Gcc holds its own in a lot of cases. But Intel, not surprisingly, excels on their own hardware. With Intel offering a free (as in beer) non-commercial license for their compiler, how many people are using Intel's compiler on a regular basis?"
Re:slow down timothy (Score:1, Funny)
Slashdot. News if you haven't seen it on some other site already. Or maybe if it's really old and you forgot about it.
Re:Huh? (Score:5, Informative)
c++ programs (Score:4, Interesting)
I would like to see a test with real desktop applications and desktops, ie. gcc GNOME/KDE vs. icc GNOME/KDE. Would these projects see significant performance improvements from the Intel compiler?
Re:c++ programs (Score:4, Interesting)
And, no, I suspect not really. Intel's compilers are designed for number-crunching work - eg: finite element analysis, engineering simulations, that sort of thing. They perform optimizations designed to improve CPU-bound processes. I suspect that interactive / IO-bound processes wouldn't be so affected.
Secondly, it depends where the bottleneck is - it could be the runtime linker, or the X Window System itself, or who knows.
Those projects should see some level of improvement, but I wouldn't imagine it's twice as fast. (Things like a paint program might though - as the Intel compilers can take existing "normal" C code and generate SSE and MMX using code.)
Re:c++ programs (Score:4, Informative)
I like icc, esp since I'm using a lot of floating point and gcc isn't too good with that on the Pentium III & 4. But so far I haven't had the time to unit test every component of my C++ project, and you can't just drop in icc-compiled classes, it's all or nothin' (or lots of hacks and C code, but I'd rather put the work into a proper port at some point.) gcc 3.2 is also better than those benchmarks show, I've gotten a doubling in speed on some code compared to gcc 2.x. It's often a matter of trying different flags on each unit and rerunning your benchmark, I think the -Ox's aren't finely tuned yet on the gcc 3.x series.
There is a real problem with compilation speed on gcc 3.2. I thought it had hung when I ran a "-g3" compile and it got stuck on one short file for 10 minutes; nope, just REALLY slow. I modified my makefiles to do a non-debug compile to check for errors before doing a "-g". Then I only "-g3" the files I need, when I need them. I mention it mostly because it may explain why the -Ox flags aren't optimal yet.
Re:c++ programs (Score:5, Informative)
Re:c++ programs (Score:2, Interesting)
Integer performance? (Score:3, Informative)
Also, the benchmarks used are probably much more loop-oriented than most real-world code, but that's typical of benchmarks.
What I would find interesting would be to compile glibc, apache, and something like perl or mysql with both sets of compilers and see what difference you can get with some web server benchmarks. Or compile X and some game and see how the frame rate compares between the two compilers. Or compile X and Mozilla, and find some really complicated pages to see what gets rendered the fastest (possibly using some trick to get it to cycle between several such pages 1000 times).
More benchmarks... (Score:2)
Re:More benchmarks... (Score:1)
Total absolute time: 3.21 sec
Abstraction Penalty: 0.95
So the more abstracted code is _faster_?!?!?
(I reran half a dozen times, I never got any results >1.0 from any of the 12 tests)
THL
Re:More benchmarks... (Score:1)
(Duron 900, only gcc)
Complex 20000 0.2 1.5 640.0 105.3 6.1
The 6:1 ratio of C/C++ Complex on gcc is partly because operator+ and operator* take 2 Complex parameters by value. I changed them to take Complex const& and the 6:1 becomes 2:1:
Complex 20000 0.3 0.5 615.4 296.3 2.1
Then if you write an operator+= rather than the
"a = a + b" operator+ in the code you get
Complex 20000 0.2 0.2 640.0 695.7 0.9
There you go - a factor of 7 speed increase.
THL, available for hire as a freelance programmer.
Could this replace gcc ? (Score:3, Insightful)
Could we see versions of linux distributed with intel compiler instead of gcc? Can the intel compiler compile the kernel?
Clue me in!
--noodles
Re:Could this replace gcc ? (Score:2)
Try reading the article before posting.
Intel does not support all gcc language extensions; while it has been used to compile the Linux kernel and other free software projects, it is not a drop-in replacement for gcc.
Re:Could this replace gcc ? (Score:2, Insightful)
I'm somewhat disappointed with the kernel hackers (and other open-source developers) with respect to this issue. The issue is that the kernel is not ANSI C compliant, not that icc isn't compliant.
It annoys me when MS does not support standards, whether it's OpenGL, MSVC6, .doc files, etc.
I'm not trying to troll here, but standards are a Good Thing(TM). But who am I to complain, Linus' tree is Linus' tree and he is allowed to do whatever he wants with it. Although, I'd like to see a hacker pick it up and port it to ANSI-C.
Re:Could this replace gcc ? (Score:5, Informative)
Any piece of software as large, complex, and critical as an OS kernel is going to, at the very least, be tested against a specific compiler. Linux was developed primarily with free tools, ie GCC. So Linus and his cohorts have taken the test-on-gcc mindset one step further and used GCC extensions.
So what? What do they lose? No functionality; they could implement things in ASM if need be, so what they gain is convenience. And convenient things are probably understandable things, and understandable things mean less buggy code.
If people never used compiler extensions, then you would never have to run ./configure :)
Re:Could this replace gcc ? (Score:1)
WRONG! gnu configure checks which compiler and version you're using, but it spends most of the time checking for #include files, libraries, and functions. Those are dependent on the OS and libraries installed, not the compiler.
GCC proper doesn't use any extensions (since it may be compiled by a non-gcc compiler).
Re:Could this replace gcc ? (Score:2)
Portability. They are just as locked in as any other development team using a single proprietary compiler with its own custom extensions. As a result, they are stuck using a tool that, with all due respect, produces pretty mediocre output compared to the best in the field. That might not matter too much for an OS, since chances are it doesn't take much advantage of either the things the other compilers optimise better or the features GCC doesn't support properly. In general, though, it's a very serious point (said the guy who writes code that compiles on 15 different platforms every day).
Re:Could this replace gcc ? (Score:1)
If you did any serious kernel development at all, you'd realize how stupid your complaint is. It's not possible to optimize an operating system kernel using straight ANSI C. There are just too many specialized operations that a kernel needs to perform. And since gcc is available for a variety of platforms and architectures, it's no less of a standard than ANSI C is.
Re:Could this replace gcc ? (Score:1)
Re:Could this replace gcc ? (Score:1)
I agree in general about standards, but to be disappointed with the kernel hackers over this is a bit much.
Re:Could this replace gcc ? (Score:1)
Re:Could this replace gcc ? (Score:1)
Re:Could this replace gcc ? (Score:2, Interesting)
I do not know if this is still true, but I imagine it is.
The kernel developers use gcc - I wouldn't entirely trust using a different compiler. Besides, there probably isn't a huge performance penalty.
I've looked at using the Intel compilers (they have a Fortran one) and their main advantage is in number-crunching applications. I suspect the differences aren't so important in interactive / non-crunching applications.
Re:Could this replace gcc ? (Score:2)
Re:Could this replace gcc ? (Score:1)
Re:Could this replace gcc ? (Score:2, Informative)
I have the impression that a significant point is the difference in assembler syntax. GCC uses the AT&T syntax, where the register you want to store into comes last, while the Intel compilers (and just about any other x86-native tool) use the Intel syntax, where the destination register is the first one in the list. There are other differences as well, regarding the way type information and indirection are handled.
My impression is that Intel does not want to implement an AT&T style assembler parser, and the GCC folks got bothered so much about Intel syntax by all the x86 newbies that they'd rather jump off a cliff.
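For reference, the same two instructions written in each syntax (just an illustration of the difference described above; each line uses its own assembler's comment character):

```asm
; Intel syntax (icc, MASM, NASM): destination first, bare register names
mov  eax, 5          ; load the constant 5 into eax
mov  ebx, [ecx+4]    ; load from memory at ecx+4

# AT&T syntax (gcc, GNU as): source first, % on registers, $ on immediates
movl $5, %eax        # same load of 5 into eax
movl 4(%ecx), %ebx   # same memory load, displacement(base) form
```

Note also the operand-size suffix (`movl`) that AT&T syntax carries on the mnemonic, where Intel syntax infers the size from the operands.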
heh. (Score:2)
Gentoo supports icc (Score:2)
It worked great; thanks! (Score:1)
Re:heh. (Score:1)
That's about all you really need to do to install the latest icc7 ebuild. If you don't have rpm, portage will download and install it so it can extract the stuff in the icc RPM file.
An oldie but a goodie! (Intel joke) (Score:2, Funny)
I am Pentium of Borg. Division is futile. You will be approximated.
gcc and Intel compilers (Score:3, Insightful)
But Intel, not surprisingly, excels on their own hardware.
Do you mean to imply that Intel knows something about the Pentium architecture or instruction set that the authors of gcc don't? Does the code emitted from the Intel compiler use undocumented instructions? Intel's compiler is newer than gcc and wasn't developed with the "many eyes" that have looked at gcc over the years. It looks like Intel's engineers wrote a better compiler, simple as that.
These benchmarks give gcc a black eye, but I doubt Intel was using undocumented secrets of their chip to defeat gcc. Sometimes the open source community has to admit that not every open source project represents the state-of-the-art.
Re:gcc and Intel compilers (Score:4, Insightful)
I do believe that Intel engineers probably have a better understanding of branch prediction and cache misses on Intel hardware.
I don't think these benchmarks give gcc a black eye at all. gcc aims to be a cross-platform compiler first, optimizing compiler second. icc aims to be an optimizing compiler first, cross-platform compiler second.
And chill with the conspiracy theories.
Re:gcc and Intel compilers (Score:2)
Re:gcc and Intel compilers (Score:2)
Re:gcc and Intel compilers (Score:3, Informative)
Re:gcc and Intel compilers (Score:1)
gcc Error at line 1 - too many levels of indirection. Compile aborted. Try again after less beer.
Re:gcc and Intel compilers (Score:1, Insightful)
A lost sentence! (Score:2)
Aarrrgggghhhhhh! I'd made that point in an earlier incarnation of the article, and it got lost when I rewrote the conclusions. Thanks for bringing this to my attention; I'll restore the lost text.
Re:gcc and Intel compilers (Score:2, Interesting)
But they do know which instruction would be the fastest in each particular situation, how to organise things to reduce the chance of a cache miss, and that sort of thing. So, yes, Intel know more about their chips than anyone else does.
(However, AMD know more about AMD chips than anyone else does...)
Re:gcc and Intel compilers (Score:2)
Am I using undocumented "KDE+Bash+Linux knowledge"? Hardly. Do I have a 'home turf advantage'? Yup.
Who said that Intel was doing anything nefarious? I'd say it's pretty *obvious* that the engineers who designed the chip would have an advantage at designing an optimizing compiler, even when things are completely documented. And so it is.
--
Evan
Intel compiled code is faster on IA AND AMD (Score:1)
1) Intel compilers improve code performance (over GNU compilers) on both Intel (PIII and P4) and AMD (Athlon) processors due to supporting SSE and SSE2 instructions and other extensions, although this perf gain will be greater on Intel CPUs.
2) gcc maintainers have been unwilling to put Intel- or AMD-specific optimizations in the code - there's no secret instructions, just unwillingness to use the published stuff (check out the 100s of docs, forums and other material at developer.intel.com, where you can also get your non-commercial compiler downloads).
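As an illustration of the point about SSE/SSE2 support, the era's command lines looked roughly like this (treat the exact option spellings as assumptions; they shifted between compiler releases):

```
# gcc 3.x: tune for the Pentium 4 and do scalar FP in SSE registers
gcc -O2 -march=pentium4 -mfpmath=sse -o app app.c

# icc: -xW targets SSE2 on the Pentium 4; -xK targets SSE on the Pentium III
icc -O2 -xW -o app app.c
```

The point is that neither compiler vectorizes for SSE by default at -O2; you have to ask for it.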
Re:gcc and Intel compilers (Score:2)
This tool alone probably gives them a huge edge in developing compilers.
Not my experience. (Score:4, Informative)
More relevant is how the performance of C7 is markedly worse on the P3 platform than C6. Very disappointing, makes me wonder what they've done.
Dave
Re:Not my experience. (Score:2)
C7 defaults to -mcpu=pentium4, I bet he'd get different results with -mcpu=pentiumpro
These benchmarks aren't really for those who truly need the fastest code; those people will benchmark their own code. But they are valid for deciding what to compile everything else with. With gcc holding its own on most of those benchmarks, the ubiquity gcc gets through its license outweighs the small performance benefit, at least in C. Hopefully someone will look at that wacky place in the C++ benchmarks where icc outperformed it by over 1000%; perhaps the fix for that could be pulled into -O3 or maybe -O6, like with pgcc. gcc 3.0 was mostly a standards release, and 3.1 and 3.2 were mostly bug fixes; hopefully 3.3 will iron out the ABI interpretation differences between gcc and icc, and then 3.4+ can be performance oriented.
Everything is relative... (Score:2)
My article is a guideline, not a pronouncement. Your mileage is guaranteed to vary.
Re:Not my experience. (Score:2)
It's optimising for P4 by default, which is missing the barrel shifter the P3 uses to generate immediate operands. On P4 it uses a 3rd cut-down ALU to handle them. Hence P3 code will run slowly on P4 CPUs (on top of the fact that P4 only gets ~80% performance clock-for-clock compared to P3) and vice-versa.
Jon.
some practical issues (Score:5, Informative)
that said, icc does a lot of things that really irritate me. for one, its diagnostic messages with -Wall are, well, 90% crap. note to intel: i don't care about temporaries being used when initializing std::strings from constants --- the compiler should be smart enough to do constructor eliding and quit bothering me. the command line arguments are somewhat cryptic, as are the error messages you get when you don't get the command line just right. the interprocedural optimization is very *very* nice; however, be prepared for *huge* temporary files if you're doing ipo across files (4+mb for a 20k source file adds up very quickly).
this all said, i don't think that i'm going to give up either compiler. gcc tends to be faster on the builds (especially with optimization turned on) and has diagnostics that are more meaningful to me. fortunately, my makefiles know how to deal with both.
Re:some practical issues (Score:2)
On the other hand, icc supports OpenMP, which means that on an SMP machine you might be able to parallelize a loop by inserting just a single line of code, like: ...
#pragma omp parallel for
Why temporaries matter (Score:1, Informative)
Re:Why temporaries matter (Score:4, Interesting)
class C {
public:
C(const string& s = "some string");
};
icc wants code that looks like this:
class C {
public:
C(const string& s = string("some string"));
};
The only real difference I see between the two is the explicit creation of a temporary. Now, as to why GCC doesn't complain is another issue --- maybe its diagnostics for temporaries aren't turned on with -Wall (perhaps -pedantic fixes that); however, I have this feeling that GCC's constructor elision is the trick here. To be honest, I'm very curious to find out why this happens. As an interesting aside, Stroustrup tackles the issue of overloading operators in a "smart" way so as to avoid unnecessary copies.
Personally, I think Java (and whomever it "borrowed" these particular semantics from) got it right. Unfortunately, Java isn't exactly a good language for talking to hardware.
Re:Why temporaries matter (Score:2, Informative)
class C {
public:
C(const string& s = "some string");
};
icc wants code that looks like this:
class C {
public:
C(const string& s = string("some string"));
};
The only real difference I see between the two is the explicit creation of a temporary. Now, as to why GCC doesn't complain is another issue --- maybe its diagnostics for temporaries aren't turned on with -Wall (perhaps -pedantic fixes that); however, I have this feeling that GCC's constructor elision is the trick here. To be honest, I'm very curious to find out why this happens.
Constructor elision trick? The code
const std::string& s = "some string";
implicitly constructs a temporary std::string and binds it to the reference s. I don't know how the compiler could eliminate the construction of the temporary each time the function is called, unless it compiled it to something like the following:
#include <string>
class C {
static const std::string S_DEFAULT;
public:
C(const std::string& s = S_DEFAULT);
};
#include "C.hpp"
#include <iostream>
const std::string C::S_DEFAULT("some string");
C::C(const std::string& s) {
std::cout << "C::C() called with: " << s << std::endl;
}
You may wish to rewrite your code in this manner because it virtually guarantees that the std::string for the default parameter is constructed once and only once. It also provides an added benefit: if the value of the default changes (from, say, "some string" to "some other string"), then only the C class's translation unit needs to be recompiled.
Re:Why temporaries matter (Score:1)
Mail sent to author (Score:5, Informative)
Re:Mail sent to author (Score:2)
You need to re-read the article, which has changed. The "15%" sentence was an artifact that should have been deleted (and now has been) from an earlier article.
The text you found objectionable is replaced by the following:
Many "numerical applications" involve integer calculations; last time I looked, integers were numbers, too. ;)
Re:Mail sent to author (Score:2)
If you remove the Monte Carlo test, the P4 composite result turns out to be 9.3% better for icc, quite a different figure than 20% (even if icc is of course still better on 3 out of 4 tests).
Well you can obviously play with words. Why did SPEC dudes bother splitting between SPECint and SPECfp after all? :).
SPECint vs. SPECfp (Score:2, Interesting)
Why did SPEC dudes bother splitting between SPECint and SPECfp after all?
Because encryption and other heavy number theory doesn't use floating-point.
Because analog modeling of physical systems such as circuits doesn't use integers except as loop counters and pointers.
Because floating-point hardware draws a lot of power, forcing makers of handheld devices to omit the FPU.
Re:Mail sent to author (Score:2)
Interesting... (Score:3, Interesting)
Under the versions of GCC that I have used, I've always found that -fforce-addr -fforce-mem gives a slight speed boost when combined with -O3 -fomit-frame-pointer.
Under GCC 3.2, it looks like -fforce-mem is turned on at optimization -O2 and above, but -fforce-addr does not appear to be turned on, and it seems like it may be of some help in pointer heavy code.
A Practical Problem (Score:2)
I didn't use -fforce-addr because I didn't think of it! ;)
Based on some work suggested by someone in e-mail, I'm going to see if it's possible to write a "test every option combination" script. Given the hundreds of potential options, we're looking at a REALLY BIG test... ;)
In my view, gcc has far too many options and virtually no real documentation about how those options interact, or even what options go with what circumstances. Very messy, and very hard for people to figure out.
I hope to alleviate that problem, given time and resources.
Re:A Practical Problem (Score:2)
On the version of GCC I normally use, there are 25 -f options for controlling optimization. There are also a couple of other options that will affect code efficiency as a side effect.
To test every combination of 25 options, you'd have to recompile and re-execute your tests 33,554,432 (2 to the 25th power) times, which will probably exceed your patience.
With a little clever winnowing of options, you might be able to cut that down to a reasonable set of options. Presumably, some options will always be a win, in nearly every situation. If you take those as fixed, that'll cut down the set of permutations significantly.
-Mark
Re:A Practical Problem (Score:1)
Head Start (Score:4, Interesting)
Historically, Intel has always been ahead of the competition in terms of code generation; I've used their Windows compiler for years as a replacement for Microsoft's less-than-stellar Visual C++.
On the Pentium III, gcc and Intel C++ run neck and neck for the most part, reflecting the maturity of knowledge about that chip. The Pentium 4 is newer territory, and Intel has a decided edge in knowing how to get the most out of their hardware.
I have great faith in the gcc development team, and as my article clearly states:
Two good things (Score:1)
(1) Competition. This is OSS versus a compiler from the largest CPU maker, both designed to work on this CPU. I think quality will rise.
(2) Standards. Now that we have at least 2 worthy compilers, developers on both sides will try harder to stick to standards to be able to bite into each other's markets. Intel's compiler will try to compile the Linux kernel and glibc2, while GCC should make attempts at the Borland and VC++ IDEs, possibly building on MinGW32.
If only AMD came out now with an open-source compiler for the Athlon XP and Athlon 64.
If you don't like gcc's x86 code, FIX it yourself! (Score:1)
Re:Glibc 2.3 issues? (Score:3, Interesting)
I'm running Intel C++ and Fortran 95 with Debian "unstable" as my distro (though I provide my own kernel), and it's currently using glibc 2.3.1.
Intel has stated on their web site forum that their compilers don't work with the glibc provided with Red Hat 8.0. I don't have an installation of Red Hat here, so I can't verify the problem.
Re:Glibc 2.3 issues? (Score:1)
file size differences between icc and gcc (Score:1)
gcc may perform well on x86, but... (Score:2)
And the more code is made that only compiles with gcc, the more performance wastage on these architectures.
Re:gcc may perform well on x86, but... (Score:2)
Installing a commercial compiler takes several months in large corporations, assuming you get permission for the expense at all, and often that is not an acceptable option. So gcc is what you get (which has the added advantage of not needing to deal with a PITA license server).
If that means the performance will suck on Solaris, Tru64, HP-UX or IRIX that just means we are more likely to migrate applications to x86 linux machines instead, not that we buy the compiler...
Re:gcc may perform well on x86, but... (Score:2)
x86 is still really limited to a 4 GB address space, the scalability is poor, and not all applications are appropriate for clustering. Often a high-end 64-bit multiprocessor system is the only option, and in these cases a 10% speed increase could result in hours of time saved, or even more.
Re:gcc may perform well on x86, but... (Score:2)
If, on the other hand, they're interested in remaining competitive at the low to midrange server end, they'd do well to make sure that GCC has the best code generation possible for their platform, because GCC is quite often the de facto compiler installed, despite not being the best for the platform. (Even if the platform-specific compilers were free, gcc would remain immensely popular, and probably the most common compiler on those systems, merely because it works close to the same between platforms.)
relativity on the brain. (Score:2)
You know, Einstein once said his biggest mistake was naming his theory 'special relativity'.
His theory is not that everything is 'relative', it is that if you specify all the variables, everything is precisely understood.
Moreover, it does not apply to anything else. How can the postulates of the constancy of the speed of light and the relativity of simultaneity be applied to the speed of ICC over GCC???
Quote from the article:
"Like Einstein, I have to say the answer is relative."
DOH!