Genetic Algorithms for GCC Optimization 45
captain igor writes "For the power users in the house that enjoy taking the time to squeeze every last drop of performance out of their programs, here's an interesting little program I ran across today call Acovea. Out since 2003, Acovea's main function is using genetic algorithms to determine an optimal set off Gcc optimization flags to squeeze the most performance out of a given program. Certainly an interesting concept, definitely worth a look. Some nice results on a P4 and Opteron can be found here "
Even more interesting for OSS (Score:3, Interesting)
It's a feature on top of the existing advantages of "compiling into" instead of "installing onto" a system, and a feature that is pretty much esclusive to OSS.
Re:Even more interesting for OSS (Score:3, Funny)
Re:Even more interesting for OSS (Score:2)
Free beer to the first person to install it!
Correctness? Safety? (Score:2)
Ofcourse it's different if your a developer too, but...
Humm~~~ (Score:5, Informative)
I know that the listings in the manual are fairly accurate, not perfect though. If you want to know exactly what flags are activated when you compile you use the -Q flag (undocumented, AFAIK
gcc -v -Q file.c
Output:
[...]
options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
-freg-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
-fbranch-count-reg -fcommon -fgnu-linker -fargument-alias -fident
-fmath-errno -ftrapping-math -m80387 -mhard-float -mno-soft-float
-malign-double -mieee-fp -mfp-ret-in-387 -mstack-arg-probe -mcpu=pentium
-march=i386
[...]
One thing I noticed with one of my GCCs is that -fomit-frame-pointer was not activated on -O3, even though the manual says it is...
Re:Humm~~~ (Score:2)
Re:Humm~~~ (Score:1)
Didn't do -mfpmath=sse,387 (Score:4, Interesting)
I would question just using SSE for floating point if the code is written just using double values - IIRC SSE doesn't like doing doubles very well. Allowing the compiler to use both the 387 mathco registers as well as the SSE registers might be a win here.
The other point about using SSE for floating point would be to use simple floats and see what difference the math options make.
Re:Didn't do -mfpmath=sse,387 (Score:5, Informative)
Major problems (Score:1, Informative)
Ouch (Score:5, Insightful)
Q: How can you tell you have perhaps gone slightly overboard in making compiler optimization options available?
A: Your users have not just given up trying to reason out what they should do, or even brute forcing every possible combination, they're inventing fucking genetic algorithms to find out what works best.
Basically the act of "calling gcc from the command line" is know officially a murky problem space to attack with pseudo-random hill climbing stuff...
Re:Ouch (Score:4, Funny)
It's a good idea in theory, but it seems much better to have a well thought out approach.
Re:Ouch (Score:5, Informative)
That's a bit unfair, another EP idea is to refactor endlessly, always improving the code as much as possible. In the situations where EP is good, the alternative is "code something, anything, until you think your hack does what the customer/boss wants done within an hour, and besides it did work with the two things you tried out, so let's put it live". In that sort of environment (taken from my job...), your description of EP would be a marked improvement.
But anyway, this isn't about EP. My opinion of genetic algorithms is "something you try when you don't understand the problem". Instead of solving the problem you try out a bunch of randomly generating tries, slightly systematically, and hope it works out. It's applicable to situations with a lot of variation in different elements, where you don't know much about relations between the choices for different elements, etc.
So I'd say it's a bad idea in theory, but even so it sometimes works when you can't find a solution by thinking.
And the idea that finding the right command line options for gcc is somehow a good problem domain for these things, well... :-)
Re:Ouch (Score:1)
Or... write test cases and then run an genetic algorithm until the right code is generated
Of course, you'd need to write a whole lot of test cases, it would probably be easier to write the actual function yourself.
Re:Ouch (Score:5, Insightful)
It's a good idea in theory, but it seems much better to have a well thought out approach.
Figuring out how different optimization passes interact is too complicated for humans to do correctly. The interactions tend to be subtle, unexpected, and sometimes completely counterintuitive. The only way to get it right is to run lots of real-world testcases with different combinations of optimizations and tune the defaults based on the results. Acovea just automates that for the nice folks working on gcc.
Re:Ouch (Score:2)
A: Your users have not just given up trying to reason out what they should do, or even brute forcing every possible combination, they're inventing fucking genetic algorithms to find out what works best.
Actually, this isn't intended for users, really. It's intended to help the compiler writers figure out what combinations of passes work best, so that the compiler writers can give the users simple
Speaking as a GCC maintainer... (Score:5, Informative)
...I strongly agree. No other compiler in history has this many knobs and dials. They interact in strange and unpredictable ways. Even worse, the -f (functionality/features) options can't be tested with all the possible -m (machine-specific) ones due to the number of platforms required. So we get bug reports complaining that -ffoo combined with -mbar on the Foonly 3000 causes random explosions killing a busload of nuns, and all we can do is shrug and say, "the -ffoo designers didn't have a Foonly, so just Don't Do That."
One of the things I'm pushing for GCC 4.x is to take an axe to the -f switches. It won't get consensus, of course, but it'll raise interesting discussion.
There's an insult in there somewhere, but I'm not sure what you're trying to attack.
Optimization options have always been a murky problem space in GCC. No other compiler targets as many processors as we do. The Right Thing on one chip will not be the Right Thing on another; that's why we have this mess.
As for what you call "peudo-random hill-climbing stuff," probably with little Dogbert-like waves of your hand, I suggest you take some formal courses in evolutionary algorithms. For solving non-linear discontinuous problem spaces, they are extremely effective, and go well beyond "hill climbing stuff."
One of the original goals of Acovea was to find better combinations of switches for popular platforms, and then make those the default for future major versions. I haven't followed the project as closely as I'd wished, so I can't say whether that's still the goal. Hope so.
Re:Speaking as a GCC maintainer... (Score:1)
There's an insult in there somewhere, but I'm not sure what you're trying to attack.
No, no, no, I love gcc, it wasn't meant as an insult at all. I just couldn't quite come to grips with the whole idea of using genetic algorithms for something like this. It's surreal to me. The world has become a bit weirder for me and I can't really manage to put the feeling into words :-)
I can imagine joking about this over some beers in a pub with some other nerds, but some people actually made this and people are ser
Re:Speaking as a GCC maintainer... (Score:2)
I disagree with that. The Texas Instruments Code Composer Studio compiler for the C67xx series DSP's is WAY more complicated and truely scary in how amazing the optimizer is - Even the assembler has optimization flags....
see docs here [ti.com]
I belive now there are more tools to graphically analyze optimization bottlenecks.
I love GCC though and appreciate the multi-platform optimization issues that arise. I believe what we need are computer languag
Performance by chance? (Score:5, Interesting)
Re:Performance by chance? (Score:2, Interesting)
Re:Performance by chance? (Score:3, Informative)
Re:Performance by chance? (Score:2, Informative)
Re:Performance by chance? (Score:5, Insightful)
That said, you don't take their results and just run with it. I find you that once a solution is found I have to go back and understand what and why it did what it did. In the case of circuit optimization I need to make sure that the optimizer isn't just exploiting a modeling problem (this has happened a few times). In the case of optimizing for complier flags I'd imagine the same thing applies.
After the optimizer finishes selecting flags the developer should go back and make sure that all the flags make sense and aren't risky. Its at this point a developer can decide what he/she wants to use.
Re: What about profiling? (Score:2, Interesting)
In addition to just trying some new flags on critical sections, perhaps you can actually try to understand what parts of these algorithms may be improved instead of just trying a bunch of opti
Gentoo (Score:4, Interesting)
Of course it would be unreasonable to have the each single ebuild compile and get benchmarked several times on each user's PC, but these genetical algorithms could be used to predetermine the optimal compiler settings for each architecture/ebuild-combination, store this information in a central database and have portage automatically select the optimal compiler setting from that database, each time it compiles an ebuild.
No more figuring out what the best compiler options are: the ebuild maintainers will take on that job for you!
Re:Gentoo (Score:4, Informative)
To sum it up: there is no single set of optimal compiler flags that result in the best performance in every situation. (If there was, then it would most certainly have been made the default setting).
Re:Gentoo (Score:2)
It would be really cool if this technology could somehow be integrated into the Gentoo [gentoo.org] project.
I'm a gentoo fan, but I'd definitely be in favor of this being done in a distributed kind of way, where the user can feed back a profile mix of usage patterns (eg, Web server, desktop, video encoding, scientific applications, etc.) and hardware and find the best compiler flags from a server. In that database that you described.
Otherwise, can you imagine doing NFLAG factorial complete builds with s
More to the point (Score:2)
That database should be called "the GCC source tree".
Once a particular set of flags has been shown to be consistently better for a majority of sample real-world code, those settings should become the new defaults for that platform on, say, -O2.
Doing this is slightly tricky, given the current GCC codebase, but that's why God
Dupe ... (Score:5, Informative)
Lots of pessimism flying around about this (Score:2)
Optimisation features of all processors are documented, but not all compilers know how to us
Re:Lots of pessimism flying around about this (Score:3, Informative)
Brute Force (Score:2)
PostgreSQL (Score:4, Interesting)
A code optimizer seems like a natural application for GAs. If you can prove a piece of code's logical equivalence to another's, you can have a code generator produce random versions of the same code (functions, loops, blocks) and then run that as a GA to find the best-performing version. On the other hand, compilation might take ages to run.
It's not a dupe (Score:2)
It is too a dupe: (Score:2)
I find it more than (Score:2, Funny)
Brute forcing all permutations is a very silly way to do this, IMHO. Compare an EA solving close to optimally the travelling salesman problem in a short time with the np-completeness of finding the absolute peak. 99% is good enough if 100% takes longer than the age of the universe to figure out for large inputs.
Al
New news? (Score:1)