
Comparing G++ and Intel Compilers and Vectorized Code

Posted by timothy
from the different-lenses dept.
Nerval's Lobster writes "A compiler can take your C++ loops and create vectorized assembly code for you. It's obviously important that you RTFM and fully understand compiler options (especially since the defaults may not be what you want or think you're getting), but even then, do you trust that the compiler is generating the best code for you? Developer and editor Jeff Cogswell compares the g++ and Intel compilers when it comes to generating vectorized code, building off a previous test that examined the g++ compiler's vectorization abilities, and comes to some definite conclusions. 'The g++ compiler did well up against the Intel compiler,' he wrote. 'I was troubled by how different the generated assembly code was between the 4.7 and 4.8.1 compilers—not just with the vectorization but throughout the code.' Do you agree?"
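For reference, the kind of loop at issue looks roughly like the sketch below. The function name `scale_add` is hypothetical and not taken from Cogswell's article. With g++, `-O3` turns on the auto-vectorizer, and recent versions accept `-fopt-info-vec` to report which loops were vectorized. Whether this particular loop ends up vectorized depends on the compiler version and target flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: y[i] = a * x[i] + y[i] over contiguous arrays.
// Independent iterations, unit stride, and a simple trip count make this
// a typical candidate for auto-vectorization.
void scale_add(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Comparing the assembly emitted for a function like this (for example via `g++ -O3 -S`) across compiler versions is the sort of exercise the article describes.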

  • by symbolset (646467) * on Thursday December 19, 2013 @12:05PM (#45736847) Homepage Journal
    Asking any audience larger than about 20 to compare the qualitative differences of object code vectorization is statistically problematic as the survey group is larger than the qualified population.
  • by david.emery (127135) on Thursday December 19, 2013 @12:10PM (#45736909)

    Mod parent up +1 insightful.

    Unless you suspect and are trying to debug a code generator error (one of the least pleasant/most difficult debugging experiences I've had), the base assertion that you should understand your compiler's code generation is at best unrealistic, and probably just dumb. Code generation is extremely complex, requiring deep knowledge of both this specific compiler's design and this specific computer's instruction set architecture, how the caches work, pre-fetching approaches, timing dependencies in instruction pipelines, etc, etc. If you do suspect a code generator error, you're best off hiring a compiler expert at least as a consultant, and be prepared for a long hard slog.

    Maybe 30 years ago, for a PDP-8, you could assert that the C code you wrote bore some resemblance to the generated machine code. That hasn't been true for a very long time, and C++ is most definitely not C in this regard.

  • by zubab13 (3445193) on Thursday December 19, 2013 @12:17PM (#45736985)
    Just use something like libsimdpp and you can be sure that your code stays vectorized between compiler versions. As a bonus, this and similar wrapper libraries let you produce assembly for multiple instruction sets (say SSE2, AVX and NEON) from the same code.
  • by Curupira (1899458) on Thursday December 19, 2013 @12:27PM (#45737089)
    Yeah, on Intel processors. What about AMD and other x86 processors? Don't ever forget that ICC was once caught red-handed disabling important features when the CPUID did not return GenuineIntel...
  • by Mr Z (6791) on Thursday December 19, 2013 @12:31PM (#45737129) Homepage Journal
    Actually, the scope of int i changed in C++. Previously, the scope would extend beyond the for. If you enable warnings, G++ will tell you all about it.
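A minimal sketch of the scoping rule described above. The helper name `first_zero` is hypothetical. In standard C++ a variable declared in the for-init statement lives only inside the loop, so code that needs the counter afterwards has to declare it outside.

```cpp
#include <cassert>

// Hypothetical helper: returns the index of the first zero in data,
// or n if there is none.
int first_zero(const int* data, int n) {
    assert(data != nullptr || n == 0);
    int i;                      // declared outside so it outlives the loop
    for (i = 0; i < n; ++i) {
        if (data[i] == 0) break;
    }
    return i;                   // valid: i is still in scope here
}

// The pre-standard pattern below relied on the old, wider scope and is
// rejected by a conforming compiler:
//
//     for (int i = 0; i < n; ++i) { if (data[i] == 0) break; }
//     return i;   // error: i is not declared in this scope
```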
  • by drawfour (791912) on Thursday December 19, 2013 @12:54PM (#45737389)
    This is why all code should be compiled with the highest warning level enabled, and all warnings should be treated as errors. The compiler can have a very hard time guessing at what you meant, so it's best to be as explicit as you can. If, for some reason, you're positive the code needs to be written a certain way, and it is correct, you can always use a "pragma warning(disable)" (with appropriate push/pop semantics) to keep your code compiling clean.
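The "pragma warning(disable)" spelling is the MSVC one. With g++ (and Clang) the rough equivalent is `#pragma GCC diagnostic`. The sketch below assumes one of those compilers and a build that enables `-Wfloat-equal`; the function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: exact comparison against zero is intended here,
// so -Wfloat-equal is silenced for this one function only.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
bool is_exact_zero(double x) {
    return x == 0.0;
}
#pragma GCC diagnostic pop
// From here on, -Wfloat-equal is back to whatever the command line set.

// Hypothetical caller that depends on the exact check.
double safe_inverse(double x) {
    assert(!is_exact_zero(x));
    return 1.0 / x;
}
```

Keeping the push/pop pair tight around the one intentional spot means the rest of the file still gets the full warning level.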
  • by jmac_the_man (1612215) on Thursday December 19, 2013 @01:02PM (#45737463)
    Intel isn't providing the optimizations for free to their competitors.* Intel provides the compiler, along with all its optimizations, to its customers in exchange for payment.

    *Except the academic, evaluation and Linux-only non-commercial use versions, which could theoretically be downloaded by AMD employees, I guess.

  • I write code in Machine Code with a bootable hex editor (446 bytes, fits in a HDD boot sector). It's the easiest way to bootstrap an OS from scratch now that MoBos don't have boot from serial port anymore...
    Here, run it in a VM: "qemu-system-i386 hexboot.img", if you want.
    Or, "dd if=hexboot.img of=/dev/sda bs=1 count=446 conv=notrunc", if you want to preserve the partition table on a bootable drive.
    Arrows,PgUp,PgDn,Home,End = navigate; Tab = ASCII/Hex, Esc = jump to segment under cursor, F8 = Run code at the cursor. (this is a real-mode version)
    When it boots you'll be looking at the code that booted, there's only two variables that didn't fit into the registers, you can see them changing at the bottom of the code as you stroll around.

    That's all you need to create an OS, compiler, etc. from scratch. You'll probably destroy your system though if you're not careful, so keep it in the VM if you're a noob; Lock-up is a mild danger, but corrupting the CMOS, etc. can leave your system bricked. You can replace the BIOS too if you know what you're doing. Maybe some day I'll publish a path to go from zero to OS while avoiding the Ken Thompson Compiler Hack... Folks are only just beginning to get interested in having actual system security, so maybe we'll lick the problem some other way. There's still chip microcode to worry about, but programmable hardware may allow us to route that exploit vector out too some day.

    Screw your bullshit optimized compiler crap. It's stupid and far slower than you think, esp. since the binaries are bigger (1 cache miss and I've already beaten you in most cases). Besides, next year or so the system will run twice as fast. My need for speed is tempered by my greater need for security and readable machine code. If I identify a patch of code that needs to be optimized or vectorized, I can do it myself.

    Premature optimization is the root of all evil.
    - Donald Knuth

    I don't care about my lawn, it's just there to keep the dirt intact.

  • by 0123456 (636235) on Thursday December 19, 2013 @02:52PM (#45738609)

    There is a reason for warnings -- it's because you're doing something wrong.

    Uh, no. It's because you're doing something that may be wrong. If it was wrong, the compiler would give an error, not a warning.

    'if ( a = b )' for example. The compiler warns because you probably meant 'if ( a == b )'. But maybe you didn't.

    There's little reason to write such C code on a modern quad-core 3GHz CPU which spends 90% of its time idle and where the compiler will probably generate the same machine code anyway, but that doesn't make it wrong.
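A sketch of the intentional case, assuming g++ or Clang with `-Wall` (which includes `-Wparentheses`). The function name `copy_string` is hypothetical. The usual convention is that an extra pair of parentheses around the assignment tells the compiler it is deliberate.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: copy a C string, using the value of the assignment
// as the loop condition. The doubled parentheses signal that '=' rather
// than '==' is intended, which keeps -Wparentheses quiet.
std::size_t copy_string(char* dst, const char* src) {
    assert(dst != nullptr && src != nullptr);
    std::size_t n = 0;
    while ((*dst++ = *src++)) {
        ++n;
    }
    return n;  // characters copied, not counting the terminator
}
```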

  • by Mr Z (6791) on Thursday December 19, 2013 @03:34PM (#45739071) Homepage Journal
    Why isn't that just a lookup table? My point in mentioning factorial is that there's no point in vectorizing that thing. Even a simple loop would be small compared to the cost of a single L2 cache miss.
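A sketch of the lookup-table approach, assuming results are wanted as unsigned 64-bit integers. Under that assumption 20! is the largest value that fits, so the whole domain is 21 entries. The names `kFactorial` and `factorial` are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// 0! through 20!. 21! no longer fits in 64 bits.
static const std::uint64_t kFactorial[21] = {
    1ULL, 1ULL, 2ULL, 6ULL, 24ULL, 120ULL, 720ULL, 5040ULL, 40320ULL,
    362880ULL, 3628800ULL, 39916800ULL, 479001600ULL, 6227020800ULL,
    87178291200ULL, 1307674368000ULL, 20922789888000ULL,
    355687428096000ULL, 6402373705728000ULL, 121645100408832000ULL,
    2432902008176640000ULL
};

// A single bounds check and one array load; nothing to vectorize.
std::uint64_t factorial(unsigned n) {
    assert(n <= 20);
    return kFactorial[n];
}
```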
  • by Darinbob (1142669) on Thursday December 19, 2013 @04:08PM (#45739505)

    Sometimes warnings are false positives as well. Especially when turning warning levels up high, they will warn about things that may be indicators of a bug or typo but which actually aren't problems, or in some cases are even intentional. Take unused variables or parameters: is that a bug, or a stylistic choice to avoid littering the code with extra #ifdefs? An unused parameter in general seems an odd thing to complain about; usually the parameter list is fixed in an API or design document whether or not the actual implementation needs all the parameters.
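One common way to handle the fixed-signature case in C++ is to leave the unused parameter unnamed. The sketch below assumes a callback-style API; `handler_fn`, `echo_handler`, and `dispatch` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical callback signature fixed by some API: every handler must
// accept (event_id, user_data), whether or not it uses both.
typedef int (*handler_fn)(int event_id, void* user_data);

// Leaving the parameter unnamed tells the compiler, and the reader, that
// ignoring it is intentional, so -Wunused-parameter has nothing to flag.
int echo_handler(int event_id, void* /*user_data*/) {
    return event_id;
}

// Hypothetical dispatcher that calls a handler with no user data.
int dispatch(handler_fn handler, int event_id) {
    assert(handler != nullptr);
    return handler(event_id, nullptr);
}
```

The signature still matches what the API requires, and no #ifdef or pragma is needed.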
