A Review of GCC 4.0

Posted by Hemos on Mon May 02, 2005 11:44 AM
from the only-time-will-tell dept.
ChaoticCoyote writes " I've just posted a short review of GCC 4.0, which compares it against GCC 3.4.3 on Opteron and Pentium 4 systems, using LAME, POV-Ray, the Linux kernel, and SciMark2 as benchmarks. My conclusion: Is GCC 4.0 better than its predecessors? In terms of raw numbers, the answer is a definite "no". I've tried GCC 4.0 on other programs, with similar results to the tests above, and I won't be recompiling my Gentoo systems with GCC 4.0 in the near future. The GCC 3.4 series still has life in it, and the GCC folk have committed to maintaining it. A 3.4.4 update is pending as I write this. That said, no one should expect a "point-oh-point-oh" release to deliver the full potential of a product, particularly when it comes to a software system with the complexity of GCC. Version 4.0.0 is laying a foundation for the future, and should be seen as a technological step forward with new internal architectures and the addition of Fortran 95. If you compile a great deal of C++, you'll want to investigate GCC 4.0. Keep an eye on 4.0. Like a baby, we won't really appreciate its value until it's matured a bit. "
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by Anonymous Coward on Monday May 02 2005, @11:45AM (#12408863)
    Well, clearly the problem is that you compiled GCC 4.0.0 with GCC 3.4.3! What I did was go through the GCC 4.0 source code in two separate windows, fire up hexedit in another, and go through line by line "compiling" GCC 4.0 with the GCC 4.0 source, in my head. I wouldn't recommend doing this with -funroll-loops; my hands started cramping up.

    Or you could wait to compile 4.0 until the 3.0 branch makes it to 3.9.9, then it will be close enough anyway. YMMV, people say I give out bad advice, go figure...
    • by Anonymous Coward on Monday May 02 2005, @12:12PM (#12409230)

      This was meant as a joke, but for those who took this too seriously: if you have ever tried building GCC yourself, you should know that it always recompiles itself.

      A gcc "stage 1" build is gcc compiled with your old compiler. The "stage 2" build is gcc compiled with the compiler created in the previous stage. This is the one that gets installed. The "stage 3" build is optional and verifies that the "stage 2" compiler creates the same output as the previous one.

      • But then the gcc 4 you compiled with gcc 3.4.3 would produce tainted compilations, and the second 4.0.0 compilation would lean towards 3.4.3 because it was compiled with a compiler that was compiled by 3.4.3. You would have to then take the second compilation of 4.0.0 and compile 4.0.0 with it, at which point the similarity to 3.4.3 would make it somewhere along the lines of 3.7.0. If you continue to compile it, while it will never reach 4.0.0, it will approach closely enough that for all intents and purposes, it will be 4.0.0. The formula is as follows:

        V3-V1~2(V3-V2)

        where V1 was used to compile V2, and V2 was used to compile V3.
            • by Anonymous Coward on Monday May 02 2005, @03:29PM (#12412012)
              Yes, as long as it wasn't miscompiled.

              Historically, GCC tends to bring out the worst in compilers. That is why when you build GCC, the system compiler will be used once, /without optimizations/ to produce a slow GCC 4.0 which can be used to compile itself. This is done twice (stage 1 compiles stage 2 and stage 2 compiles stage 3) so that 2 and 3 can be compared to ensure that there were no miscompilations, as it is unlikely that a miscompiled compiler will produce correctly executable machine code that replicates exactly.

              Unlikely but possible. Look for the paper "Reflections on trusting trust" for a beautiful hack involving intentional miscompilations. The author basically changed the compiler so that when "login" was being compiled, the compiler inserted a back door. And when a new compiler was being compiled, the compiler would insert the code to insert the back door and to change the next compiler. And then no matter how much you checked the source to either login or the compiler, you would never notice the back door.
      • by rsidd (6328) on Monday May 02 2005, @12:54PM (#12409837)
        The GP was a joke, but since you're serious, this is exactly what the "bootstrapping" build of gcc does: it builds a stage 1 build with the system compiler, then a stage 2 build with the stage 1 build, then -- if you want -- a stage 3 build with the stage 2 build, and verifies that the stage 2 and stage 3 builds are the same.
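The staged build described in this thread can be pictured with a toy model. This is only a sketch, not how GCC's build system is implemented: `CompilerBinary`, `build`, and `bootstrap` are hypothetical names, and a "binary" is reduced to two strings. The assumption is that a compiler's behaviour comes from its source, while its own machine code comes from whatever compiler built it.

```cpp
#include <string>

// Toy model only: a "compiler binary" is described by two properties.
//   behaviour: which compiler source it was built from (decides the code it emits)
//   built_by:  which compiler's code generator produced its own machine code
struct CompilerBinary {
    std::string behaviour;
    std::string built_by;
};

inline bool operator==(const CompilerBinary& a, const CompilerBinary& b) {
    return a.behaviour == b.behaviour && a.built_by == b.built_by;
}

// Building `source` with `builder`: the result behaves like the source,
// but its machine code reflects the builder's code generator.
inline CompilerBinary build(const CompilerBinary& builder, const std::string& source) {
    return CompilerBinary{source, builder.behaviour};
}

struct Bootstrap {
    CompilerBinary stage1, stage2, stage3;
};

// Three-stage bootstrap: system compiler -> stage 1 -> stage 2 -> stage 3.
inline Bootstrap bootstrap(const CompilerBinary& system_compiler, const std::string& source) {
    Bootstrap b;
    b.stage1 = build(system_compiler, source);
    b.stage2 = build(b.stage1, source);
    b.stage3 = build(b.stage2, source);
    return b;
}
```

In this model, stage 1 still carries the system compiler's code generation, while stages 2 and 3 come out identical. That is why comparing stage 2 against stage 3 is a useful sanity check, and why there is no gradual "3.7.0" drift: the old compiler's influence is gone after one extra round.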
  • Expected (Score:5, Interesting)

    by Hatta (162192) on Monday May 02 2005, @11:46AM (#12408887) Journal
    It was a long time before GCC 3 got better than 2.95. I expect the same thing will happen here.
    • Re:Expected (Score:5, Insightful)

      by Rei (128717) on Monday May 02 2005, @12:11PM (#12409208) Homepage
      I think the problem is that, if I'm not mistaken, he's testing all C code except Povray. The biggest reported improvements in 4.0 were for g++, so using such a small C++ sample base (Povray - one purpose, one set of design principles, few authors) seems bound to produce inaccurate benchmarking.

      Further, on his most reasonable C benchmark (the Linux kernel), he only records compile time and binary size, but no performance. I call it the most reasonable benchmark because it has thousands of contributors and covers a wide range of code purposes and individual coding habits - and yet, performance is omitted.

      In short, I wouldn't trust this benchmark. Probably the best benchmark would be to build a whole Gentoo system with both, with identical configurations, and check build times and performances ;)
      • Re:Expected (Score:5, Funny)

        by markov_chain (202465) on Monday May 02 2005, @12:48PM (#12409765) Homepage
        Is it that surprising that a Gentoo user thinks of compiling time as the performance metric? :)
      • Re:Expected (Score:5, Informative)

        by The boojum (70419) on Monday May 02 2005, @01:05PM (#12409961)
        Sorry to tell you this, but the review is even mistaken with respect to Povray. Povray is not a C++ program - it's good ol' C. So in fact, none of the programs he benchmarked were C++. The test was exclusively on C code.

        As nice as C is, a lot of the improvements in GCC seem to have been targeted at improving its handling of C++ code. I'd particularly like to know how it fares with respect to modern C++ style code - massively templated stuff with STL, Boost, traits and policies, smart pointers, lots of small inlined methods, etc. This test tells me nothing about that, and that's where a lot of development is these days.
    • Re:Expected (Score:5, Interesting)

      by ajs (35943) <ajs@noSpAm.ajs.com> on Monday May 02 2005, @12:17PM (#12409309) Homepage Journal
      I'm not convinced that this test shows that gcc4 is less effective than gcc3, though.

      First off, all of the programs tested are programs that use hand-tooled assembly in the most performance-sensitive code. That has to mean that the compiler is moot in those sections.

      A better test would be to compare three things: the hand-optimized assembly under gcc3 vs the C code (usually there's a configure switch that tells the code to ignore the hand-tuned assembly, and use a C equivalent) under gcc3 vs that same C code under gcc4.

      I think you'd see a surprising result, and if the vectorization code is good enough, you should even see a small boost over the hand-tuned assembly (since ALL of the code is being optimized this way, not just critical sections).
    • Re:Expected (Score:5, Interesting)

      by Rimbo (139781) <rimbosityNO@SPAMsbcglobal.net> on Monday May 02 2005, @01:00PM (#12409912) Homepage Journal
      I think the author of the article misunderstands just what happened with GCC 4.0.

      The main improvement in GCC 4.0 is implementing Static Single Assignment (SSA).

      SSA is not an optimization. It is a simplification. If you can assume SSA, then it opens the door to an entire class of optimizations that can help improve your performance without affecting your code's correctness.

      That last bit -- optimizing code without affecting correctness -- was a big problem in the days before SSA.

      In that regard, SSA is a similar technology to RISC -- it does not speed things up by itself, but it enables speedups for later on.

      The lack of SSA is one thing that kept gcc out of the hands of compiler researchers. Now that gcc has it, academia can start hacking away with gcc, and the delay you expect is the time between implementing SSA and implementing all of the optimizations that really will improve code performance.
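To make the "simplification, not optimization" point concrete, here is a small sketch of what SSA form looks like, written as ordinary C++ rather than as GCC's internal representation. The function names and the numbered variables (`x1`, `x2`, ...) are made up for illustration, and the conditional expression merely stands in for what compiler texts call a phi node.

```cpp
// Ordinary form: the variable x is assigned several times.
int ordinary(int a, int b) {
    int x = a + b;
    x = x * 2;
    if (b > 0) {
        x = x + 1;
    }
    return x;
}

// SSA-style form: every name is assigned exactly once, so each use
// points at exactly one definition.
int ssa_style(int a, int b) {
    const int x1 = a + b;
    const int x2 = x1 * 2;
    const int x3 = x2 + 1;             // value on the "b > 0" path
    const int x4 = (b > 0) ? x3 : x2;  // stands in for phi(x3, x2) at the merge point
    return x4;
}
```

Both functions compute the same result. The second is no faster by itself, but since no name is ever redefined, an optimizer can reason about each value without first working out which assignment reaches which use.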
      • Re:Expected (Score:5, Informative)

        by ma_luen (798746) <marronNO@SPAMcs.unm.edu> on Monday May 02 2005, @01:34PM (#12410334)
        I think you are overestimating the interest of the research community in working on gcc. The move from the intentionally underdocumented and ill-defined intermediate representations to tree ssa is a huge step for gcc. Unfortunately, there is still no real effort to make the platform attractive to do experimental work on.

        The McCat compiler from McGill (which is what gcc borrowed the ssa rep from), C-- or the LLVM project all provide a much nicer platform. The internal representation is clearly documented, there are frameworks and examples for writing new passes and most importantly they all allow for whole program compilation.

        Until gcc decides to support some of this the project will continue to be ignored by research groups. This might be fine since research compiler work can be fairly ugly and it is just easier to port what works.

        Otherwise I agree that the move to ssa form is a critical step for gcc to take and it will enable it to become a "modern" compiler. More importantly it will enable the inclusion of the large body of compiler work that is based on ssa forms.

        Mark
  • by pclminion (145572) on Monday May 02 2005, @11:49AM (#12408921)
    This has always bugged me.

    Some people spend 10 hours tweaking compiler settings and optimizations to get an extra 5% performance from their code.

    Other people spend 2 hours selecting the proper algorithm in the first place and get an extra 500% performance from their code.

    To semi-quote The Matrix: One of these endeavors... is intelligent. And one of them is not.
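As a rough illustration of the algorithm-versus-flags argument, here are two ways to check a list for duplicates. This is a generic sketch, not code from the review or from any program mentioned in the thread, and both function names are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Quadratic approach: compares every pair of elements.
bool has_duplicate_quadratic(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = i + 1; j < v.size(); ++j)
            if (v[i] == v[j]) return true;
    return false;
}

// O(n log n) approach: sort a copy, then any duplicates sit next to each other.
bool has_duplicate_sorted(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return std::adjacent_find(v.begin(), v.end()) != v.end();
}
```

For large inputs the gap between these two grows with the input size, which is the kind of win no compiler switch is likely to match. For small or rarely run inputs the difference may not matter at all.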

    • by kfg (145172) on Monday May 02 2005, @11:59AM (#12409064)
      And in both groups you will find people who believe that execution speed is the measurement of code quality.

      KFG
    • by mattgreen (701203) on Monday May 02 2005, @12:11PM (#12409212)
      It is because it is easier to delve into needlessly technical aspects afforded by compiler settings and 'optimizations' than it is to admit that one's algorithm is not sound. Kids running Gentoo delude themselves into thinking that omitting the frame pointer on compiles is going to make a massive difference in terms of performance, and fail to remember it makes bug hunting far more difficult when applications crash. Additionally, the 5% gain mentioned can be a severe overstatement. I frequent a game programming board, and the widespread use of C++ has led to an abundance of nano-optimization threads, the most amusing of which was an attempt to optimize strlen().

      Optimizing every single line of code is a complete waste of time, since the 80/20 rule generally applies. Use a profiler to determine where that 20% is.
      • by BlurredWeasel (723480) on Monday May 02 2005, @12:43PM (#12409685)
        I run gentoo (not for performance, but mainly because I am familiar with it, and it is easy), and you know what...I don't bug hunt. And adding -fomit-frame-pointer or whatever the hell the option is (it's in my flags somewhere) doesn't cost me anything, makes my system say (made up stat) 5% faster and I am happy. It makes no sense why you should deride me (read: gentooers) as an idiot. We're just end users, and if we can get a little bit of performance for free, well why not.
    • by Brandybuck (704397) on Monday May 02 2005, @02:58PM (#12411478) Homepage Journal
      If you have a choice of algorithms, then of course use the better algorithm. But for most of the day-to-day code we deal with, we don't have that choice, because we're not dealing with code that has any grand algorithms to it. For example, if I'm writing a GUI frontend to a command line app, what are my choices of algorithms? Not much.

      In my real life coding work, the places where algorithm efficiency makes a difference are far outweighed by those places that don't. And of those places that do make a difference, the performance is rarely a critical need. For example, I just coded up some RAMDAC lookup tables, and a difference of algorithm would make a huge difference in efficiency. But this particular routine was triggered by a user event (clicking a button in a config dialog), so that my dogslow but highly readable/understandable algorithm wasn't a bottleneck for anything. In this case tweaking the compiler settings would have given a 5% boost to everything, but a change in algorithm would only have given a 1/10 second boost for an event that would happen approximately once a week or less.
      • by pclminion (145572) on Monday May 02 2005, @11:54AM (#12408982)
        At some point you've got the best algorithm, you've profiled, you've hand-optimised, you've got the fastest hardware you can afford... and you *still* need that last 5%. That's when you spend 10 hours tweaking compiler settings...

        If you really, positively need an extra 5% performance, you might as well just buy a computer that's 5% faster.

        • by Stiletto (12066) on Monday May 02 2005, @11:57AM (#12409031) Homepage
          If you really, positively need an extra 5% performance, you might as well just buy a computer that's 5% faster.

          You work at Microsoft, right? No? Intel?
        • by Minwee (522556) <dcr@neverwhen.org> on Monday May 02 2005, @11:59AM (#12409055) Homepage
          Unfortunately, including a faster computer with every copy of the code you distribute may be prohibitively expensive.
          • by pclminion (145572) on Monday May 02 2005, @12:03PM (#12409106)
            You're obviously a small box user. Have you ever worked in the real world where huge batch runs can take weeks?

            Yes.

            You think companies should splash out another million or two on new hardware, just because you use a pissy little machine?

            I think that companies should re-evaluate their "need" for an extra 5% performance. Here's an idea -- if you need something 10 minutes faster, why not start the process 10 minutes sooner?

            5% just gets lost in the noise. You beef up your system, making it 5% faster... And then some retard in production makes a mistake and sets you back six weeks.

            • by The Snowman (116231) * <john@johngaughan.net> on Monday May 02 2005, @12:13PM (#12409249) Homepage

              I think that companies should re-evaluate their "need" for an extra 5% performance. Here's an idea -- if you need something 10 minutes faster, why not start the process 10 minutes sooner?

              In any large organization, the process gets in the way. Some suit decides the product needs a new feature, or needs to ship sooner, or whatever, and this slowly trickles down to the developers who suddenly are put in crunch time where every minute counts. Schedules and deadlines may change daily. People's jobs may be at risk. Shit happens.

              Nobody really likes it, but that is sometimes how we arrive at the point where we "need" an extra 5% performance, where we "need" the program to finish ten minutes sooner. Starting earlier is not always an option, usually because you don't know you even have to start *at all* until the last minute.

  • by Anonymous Coward on Monday May 02 2005, @11:51AM (#12408943)
    "Like a baby, we won't really appreciate its value until it's matured a bit."

    Does this mean I have to wait until it's 18?
  • Fast KDE compile. (Score:4, Informative)

    by Anonymous Coward on Monday May 02 2005, @11:52AM (#12408965)
    It's damn fast at compiling KDE, as someone tested [kdedevelopers.org].
    • Re:Fast KDE compile. (Score:4, Informative)

      by badfish99 (826052) on Monday May 02 2005, @11:56AM (#12409016)
      Well, the article you link to starts with the words:
      KDE sources now blacklist gcc 4.0.0 because it miscompiles KDE
      It must be easy to compile fast if you don't mind getting the wrong answer.
  • by Tim Browse (9263) on Monday May 02 2005, @11:53AM (#12408977)
    Like a baby, we won't really appreciate its value until it's matured a bit.

    Is that what you say to new parents? :-)

  • by jmcneill (256391) on Monday May 02 2005, @11:54AM (#12408983) Homepage
    Where are the screenshots?
  • by Anonymous Coward on Monday May 02 2005, @11:55AM (#12408992)
    http://www.kdedevelopers.org/node/view/1004

    Qt:
                -O0     -O2
    gcc 3.3.5   23m40   31m38
    gcc 3.4.3   22m47   28m45
    gcc 4.0.0   13m16   19m23

    KDElibs (with --enable-final):
                -O0     -O2
    gcc 3.3.5   14m44   27m28
    gcc 3.4.3   14m49   27m03
    gcc 4.0.0    9m54   23m30

    KDElibs (without --enable-final):
                -O0
    gcc 3.3.5   32m56
    gcc 3.4.3   32m49
    gcc 4.0.0   15m15

    I think KDE and Gentoo people will like GCC 4.0 ;)
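To put the timings above in relative terms, a small sketch: convert the minutes-and-seconds figures to seconds and compute the fraction of build time saved. The helper names `to_seconds` and `reduction` are made up for this example, and the inputs are the KDElibs -O0 numbers without --enable-final quoted above.

```cpp
// Convert a "MMmSS" timing into seconds.
constexpr int to_seconds(int minutes, int seconds) {
    return minutes * 60 + seconds;
}

// Fraction of build time saved when going from `before_s` to `after_s` seconds.
constexpr double reduction(int before_s, int after_s) {
    return 1.0 - static_cast<double>(after_s) / before_s;
}

// KDElibs without --enable-final at -O0: gcc 3.3.5 vs gcc 4.0.0.
constexpr double kdelibs_saving =
    reduction(to_seconds(32, 56), to_seconds(15, 15));
```

For that row the saving comes out at a bit over half of the original build time. The same two helpers can be applied to any other row of the table.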
  • by Laxitive (10360) on Monday May 02 2005, @12:02PM (#12409099) Journal
    "Like a baby, we won't really appreciate its value until it's matured a bit."

    Seriously, this is why I don't appreciate babies. At least after about 4 or 5 years, they're useful for mild manual labour. Sure they'll complain and cry, but all you gotta do is tie their dishwashing to the number of fish heads they're allotted that week. Works pretty well, I gotta say. Anyway, at least they're not a net productivity drain like babies are.

    Anyway, what I mean to say is: from your description, it looks like I'll be staying away from GCC 4 for a while, too. Goddamn babies.

    -Laxitive
  • by Just Some Guy (3352) <kirk+slashdot@strauser.com> on Monday May 02 2005, @12:03PM (#12409109) Homepage Journal
    As far as I'm concerned, unless you're using "-Os" because you're deliberately building small binaries at the expense of all else - say, for embedded development - the resulting binary size is completely irrelevant as a compiler benchmark. What if the smaller result uses a slower, naive algorithm (which in this case would mean choosing an obviously-correct set of opcodes to implement a line of C instead of a less-obvious but faster set)?

    Second, the runtime benchmarks were close enough to be statistically meaningless in most cases. The author concludes with:

    Is GCC 4.0 better than its predecessors?

    In terms of raw numbers, the answer is a definite "no".

    My take would have been "in terms of raw numbers, it's not really any better yet." It's close enough to equal (and slower in few enough cases that I'd be willing to accept them), though, that I'd be willing to switch to it if I could do so without having to modify a lot of incompatible code. It's clearly the way of the future, and as long as it's not worse than the current gold standard, why not?

  • by shreevatsa (845645) <shreevatsa.slash ... m ['ail' in gap]> on Monday May 02 2005, @12:07PM (#12409155)
    The worst part is that they now say that the <?, >?, <?= and >?= operators are deprecated, and will be removed. Damn, I liked them so much. Sure, they weren't part of the standard, and only a GCC extension, but it's just so much more fun to say a = b <? c than to say a = min(b,c) or even a=b<c?b:c. The best use was saying a<?=b instead of the painful a=min(a,b).
    • by Heywood Jablonski (543761) on Monday May 02 2005, @02:51PM (#12411356)
      The worst part is that they now say that the <? , >? , <?= and >?= operators are deprecated...

      That's because they were conflicting with the new gphp front-end.

      • by shreevatsa (845645) <shreevatsa.slash ... m ['ail' in gap]> on Monday May 02 2005, @01:30PM (#12410284)
        I don't get it. I was very serious when I wrote that, still this comment has 60%Funny, and even 20%Troll.
        In case you were wondering why anyone would want to use a=min(a,b), you really haven't programmed enough. To take a simplistic example, how would you find the largest integer in an array? (Sure, you can just #include <algorithm>, then say *max_element(a,a+N) and be done with it, but let's suppose you don't want to do that...)
        Well, the way to do it would be to write a loop like this:
        int largest = 0; // assumes no element of a is negative
        for(int i=0;i<N;++i)
            largest=max(largest,a[i]);
        I really think it's faster and better to code the last line as largest>?=a[i]. There is less unnecessary clutter.
        Oh well. I guess this comment will now lose all its Funny mod points, but what the heck.
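For anyone who wants the brevity of `a <?= b` without the deprecated extension, one possible stand-in is a pair of tiny helper templates on top of `std::min` and `std::max`. The names `min_assign`, `max_assign`, and `largest_of` are invented here; they are not part of GCC or the standard library.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helpers standing in for the `a <?= b` and `a >?= b` extensions.
template <typename T>
void min_assign(T& a, const T& b) { a = std::min(a, b); }

template <typename T>
void max_assign(T& a, const T& b) { a = std::max(a, b); }

// Largest element of a[0..n-1]; requires n > 0.
// Starting from a[0] rather than 0 keeps the result right for all-negative input.
int largest_of(const int* a, std::size_t n) {
    int largest = a[0];
    for (std::size_t i = 1; i < n; ++i)
        max_assign(largest, a[i]);
    return largest;
}
```

With these, `max_assign(largest, a[i])` reads nearly as tersely as `largest >?= a[i]` and compiles with any standard C++ compiler. `*std::max_element(a, a + n)` remains the shortest option when a plain maximum is all that is needed.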
  • by expro (597113) on Monday May 02 2005, @12:07PM (#12409159)

    I agree that this compiler is a cornerstone of free software.

    But it was very frustrating to me to try to port the compiler to a new platform by modifying existing back ends for similar platforms.

    After spending a few months on it (m68k in this case), I could not escape the layers of hack upon cruft upon hack upon cruft, that made it extremely difficult to make even fairly superficial mods because everyone seemed to be using the features differently and all the power seemed lost in hacks that made it impossible to do simple things (for me anyway). I am quite familiar with many assemblers and optimizing compilers.

    I hope that the new work makes a somewhat-clean break with the old, otherwise, I would fear yet another layer to be hacked and interwoven, with the other ones that were so poorly fit to the back ends.

    I suspect that not all backends are the same and perhaps the same experience would not be true for a more-popular target, but it seems to me it shouldn't be that hard to create a model that is more powerful yet more simple. Such would seem to me to be a major step forward and enable much greater optimization, utilization, maintainability, etc.

  • -ftree-* (Score:5, Interesting)

    by Anonymous Coward on Monday May 02 2005, @12:08PM (#12409172)
    The whole point of gcc4.0.0 is the tree-ssa thing. The author of this test didn't seem to notice that this stuff doesn't get enabled in -O2 nor -O3, but does have to be enabled by hand. This includes autovectorization (-ftree-vectorize) among other things which may make a difference.

    If I was him, I'd repeat the tests again enabling the -ftree stuff when building with gcc4.0.0.
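For reference, this is the sort of loop an auto-vectorizer is meant to pick up: a simple counted loop where no iteration depends on another. The function name `add_arrays` is made up, and the command line in the comment is an assumption about a typical x86 setup rather than something taken from the review. Whether GCC 4.0.0 actually vectorizes this particular loop would have to be checked on the machine in question.

```cpp
#include <cstddef>

// Element-wise sum of two arrays into a third.
// Assumed build line for trying the vectorizer on x86 (not from the review):
//   g++ -O2 -ftree-vectorize -msse2 file.cpp
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];  // each iteration is independent of the others
}
```

Loops with data-dependent branches, function calls, or pointers the compiler cannot prove are distinct tend to be much harder for a vectorizer, so real-world gains depend heavily on how the hot loops are written.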
  • by diegocgteleline.es (653730) on Monday May 02 2005, @12:16PM (#12409290)
    I found this in the osnews announcement

    "Before we get a bunch of complaints about the fact that most binaries generated by GCC 4.0 are only marginally faster (and some a bit slower) than those compiled with 3.4, let me point out a few things that I've gathered from casually browsing the GCC development lists. I'm neither a GCC contributor nor a compiler expert.

    Prior to GCC 4.0, the implementation of optimizations was mostly language-specific; there was little or no integration of optimization techniques across all languages. The main goal of the 4.0 release is to roll out a new, unified optimization framework (Tree-SSA), and to begin converting the old, fragmented optimization strategies to the unified framework.

    Major improvements to the quality of the generated code aren't expected to arrive until later versions, when GCC contributors will have had a chance to really begin to leverage the new optimization infrastructure instead of just migrating to it.

    So, although GCC 4.0 brings fairly dramatic benefits to compilation speed, the speed of generated binaries isn't expected to be markedly better than 3.4; that latter speedup isn't expected until later installments in the 4.x series."
  • by DoofusOfDeath (636671) on Monday May 02 2005, @12:27PM (#12409418)
    There was one test case I did for my own use. I've got a small C++ program that's computationally heavy and has a small working set of memory.

    On that program (on a P4) I got an 11% reduction in runtime using GCC 4 vs. GCC 3.3.5. This was actually a big deal for my work.

    The lesson here: Your mileage with GCC 4.0's improvements may vary from the benchmarks, and you might want to try it on your own code.
  • by Gannoc (210256) on Monday May 02 2005, @12:36PM (#12409542)
    Like a baby, we won't really appreciate its value until it's matured a bit.

    Are you kidding? Babies are worth $15,000-$20,000 easily, even if they're female. Once e-Bay stops being a bunch of pussies and we get some open bidding started, I expect their value to go up even higher.


    Once again, we see that the ./ editors have no idea what they're writing about.

  • by cpghost (719344) on Monday May 02 2005, @12:52PM (#12409808) Homepage

    Recently, a discussion took place on a FreeBSD mailing list about whether the project wanted to use GCC 4.0.0 as the system compiler. Some objections were:

    • KDE would not compile cleanly [freebsd.org]
    • Most of the 12.000+ ports would need manual tweaking because of other incompatibilities.
    • Some C constructs have been obsoleted, requiring huge sweeps over the existing BSD code base.

    If I understood it right, we won't have a GCC 4.0.0 system compiler on FreeBSD anytime soon. Installing the gcc40 port is, of course, always possible.

    • by chrysalis (50680) on Monday May 02 2005, @01:14PM (#12410077) Homepage
      gcc 4.0 just tries to follow standards. If something doesn't compile with gcc 4, don't blame the compiler. The source code was broken in the first place.
      • by JoeBuck (7947) on Monday May 02 2005, @03:25PM (#12411931) Homepage
        In the case of the KDE problem, there were two bugs, one of which was in KDE, but the other of which was in 4.0.0. So it sometimes is appropriate to blame the compiler. That bug is fixed in CVS and will be fixed in 4.0.1.

        It should not surprise anyone that the first 0.0 release has some bugs. It's the first release of a compiler with a completely new optimization structure (tree-ssa). I would advise waiting for 4.0.1 for a production-quality release, or go with vendor patches (by making Fedora Core 4 with a 4.0.0 based compiler, Red Hat will probably shake out a few more bugs).

  • by Paradox (13555) on Monday May 02 2005, @01:30PM (#12410286) Homepage Journal
    It isn't a huge deal for most people, but it seems like the new GCC is significantly better at optimizing for the PowerPC now.

    I've been working with the GNU GSL on my Mac a lot, and I recently updated to Tiger. The first thing I noticed when I recompiled the GSL with Apple's modified GCC 4.0 is the significant and noticeable speed increase. With this intense math stuff, doing SVD on 300x200 matrices, it's shocking how much faster it is. I went from 3-5 seconds down to less than one.

    I am not going to post any hard numbers because I haven't rigorously compared them yet, but I'll make some formal comparisons this week.
    • Re:What about... (Score:5, Informative)

      by scotlewis (45960) on Monday May 02 2005, @11:56AM (#12409023)
      Yes and no. The default compiler is GCC4, however, the kernel and much of the OS (pretty much all of Darwin, in fact) are still compiled with GCC3 because they haven't completely cleared the codebase of GCC3-isms.

      That said, remember that the submitter is talking about GCC4 on x86 platforms, and remember that Apple is putting a lot of work into making sure the PowerPC optimizations are as good as possible. Not to mention things like GCC4's auto-vectorization of code to take advantage of the Altivec unit (which has a more noticeable effect than MMXing x86 code).

      It would be nice to see some test results for Apple's GCC versions 3 and 4.
    • Re:kettle? black? (Score:5, Informative)

      by Dan Berlin (682091) on Monday May 02 2005, @12:01PM (#12409082)
      Significant difference. If you ask gcc folk (like me), we'd happily tell you that 4.0 will probably be, performance-wise, a win in some cases and a loss in others. Anytime you add large numbers of optimizations, it takes a while to tune everything else so that we get good generated code. 4.0 is more a test of the new optimizers than something that is supposed to produce spectacular results in all cases.
    • Re:intel compiler (Score:5, Interesting)

      by MORB (793798) on Monday May 02 2005, @12:07PM (#12409153)
      The reason Intel's compiler generates faster code is that it does auto-vectorisation (i.e., it automatically finds out how to transform some code patterns to take advantage of native vector operations, such as those provided by SSE). They started to implement this in gcc 4.0, but it's a very first iteration that, as far as I know, is still kinda limited. I'm not even sure it's enabled by default, even at -O3. There are lots of improvements there targeted at gcc 4.1.
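A hand-written picture of the transformation being described may help. This is only a sketch in plain C++ of the shape a vectorizer aims for, assuming a four-float vector width as with SSE. It is not what either compiler literally emits, and the names `scale_scalar` and `scale_blocked` are invented.

```cpp
#include <cstddef>

// Scalar reference: one multiply per iteration.
void scale_scalar(float* data, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i)
        data[i] *= k;
}

// Blocked version: four elements per step, then a scalar loop for the leftovers.
// A vectorizing compiler would try to turn each four-element step into a single
// vector multiply.
void scale_blocked(float* data, std::size_t n, float k) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        data[i]     *= k;
        data[i + 1] *= k;
        data[i + 2] *= k;
        data[i + 3] *= k;
    }
    for (; i < n; ++i)  // remainder when n is not a multiple of 4
        data[i] *= k;
}
```

The point of auto-vectorisation is that the programmer writes the first version and the compiler derives something like the second, with real vector instructions in place of the four separate multiplies.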