
 



Open Source Programming Upgrades

LLVM 3.7 Delivers OpenMP 3.1 Support, ORC JIT API, New Optimizations 84

An anonymous reader writes: LLVM 3.7 was released today as the newest six-month update. LLVM 3.7 adds OpenMP 3.1 support via Clang, a new On-Request Compilation (ORC) JIT API, a Berkeley Packet Filter back-end, AMDGPU back-end support for OpenGL 4.1 when paired with Mesa 11.0, and many other functional changes. Early benchmarks show its performance to be quite competitive with GCC 5, even superior in some workloads, and it should become more competitive in scientific applications now that OpenMP is supported.
  • by turkeydance ( 1266624 ) on Tuesday September 01, 2015 @06:33PM (#50440419)
    news for nerds, indeed.
    • I really wonder what people like you are doing on this site... My 8 year old daughter knows how to use Google and Youtube...
    • But the real issue is that not all geeks and nerds are the same. You could say something like "C++ compiler LLVM was released." At least then you have some context on what we are talking about.

  • Swift is currently crippled in performance by the requirement to use ARC (at least code that actually uses reference types). Just wondering if anything new in this might affect that.

    • In what way does that cripple performance? I can't think of any.
      • I'm talking mostly about high performance numerical computing, games, etc. Right now if you look at the object code generated by swift you'll see that even a trivial method call may generate dozens of retain/release calls on seemingly innocuous code. ARC is fine for most things but you pay a small penalty for it every time you reference or pass a reference to an object... as opposed to a garbage collected language (e.g. Java) where you expect referencing long lived objects to be essentially free, pointer o

        • Sounds like Objective-C is the way to go lol.
          • Sounds like Objective-C is the way to go lol.

            Modern Objective-C for MacOS and iOS automatically generates ARC retain/release just like Swift. Swift, Objective-C and Java are for UI code. The core code should be written in C/C++, written once, and re-used/shared on iOS, MacOS, Android, Windows and Linux.

            • That's true, as a bonus, any code written like that in C/C++ can be also used in Python, Ruby, TCL, etc.
            • Objective-C++ also works pretty well now (including in the open source implementation), to the point that I generally prefer C++ containers to Objective-C ones. std::unordered_map seems to be faster than NSMutableDictionary for most things, and has the added advantage that you can have primitive types as keys or values without resorting to boxing. The big problem for Swift is that the FFI to C is fine, but the FFI to C++ is basically nonexistent.
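To make the boxing point concrete, here is a minimal C++ sketch. The helper name `count_ids` is made up for illustration, and nothing here measures speed against NSMutableDictionary; it only shows that integer keys and values sit in the map directly, with no wrapper objects.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical helper: count how often each integer ID occurs.
// Both the key and the value are plain integers stored inside the map's
// own nodes, so no NSNumber-style wrapper object is allocated per entry.
std::unordered_map<std::int64_t, std::size_t>
count_ids(const std::vector<std::int64_t>& ids) {
    std::unordered_map<std::int64_t, std::size_t> counts;
    counts.reserve(ids.size());  // avoid rehashing while inserting
    for (std::int64_t id : ids) {
        ++counts[id];            // operator[] value-initializes a missing entry to 0
    }
    return counts;
}
```

Calling `count_ids({7, 7, 42})` yields a map with two entries. In an Objective-C++ file the same container can be used next to Foundation types, which is the mixing the comment describes.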
        • I'm talking mostly about high performance numerical computing, games, etc. ... Right now the only way to write high performance code in Swift is to essentially abandon classes and work only with structs.

          You don't write the high performance part of your code in Swift or Objective-C or Java on Android; you write it in C/C++. You only write the user interface code in Swift, Objective-C or Java. Matter of fact you write your core application functionality in C/C++, high performance or not. You separate your core code from user interface code. Your core code will be portable and can be shared between iOS, Android, Mac OS X, Windows, Linux, etc.

          • That hasn't been productive advice since about 1998 :) Modern languages with runtimes like Java, C# (and presumably Swift when it gets its act together) can actually be *faster* than C/C++ in some cases because they have more optimization information at runtime than exists statically at compile time. In particular garbage collection in Java is just about optimal and you really can't beat it with hand crafted memory management. I assume that the ARC people have some plan for how to eliminate the overhead

            • That hasn't been productive advice since about 1998 :) Modern languages with runtimes like Java, C# (and presumably Swift when it gets its act together) can actually be *faster* than C/C++ in some cases ...

              Not in the cases you mentioned earlier, high performance numerics and games. I've worked in both areas. To be fair I am assuming you are not including casual video games.

              ... because they have more optimization information at runtime than exists statically at compile time. In particular garbage collection in Java is just about optimal and you really can't beat it with hand crafted memory management.

              Actually it is beaten quite routinely by game devs, again non-casual.

              I assume that the ARC people have some plan for how to eliminate the overhead for special cases like this eventually, but they just aren't there yet. More generally - yes, I could just rewrite my entire app in objc ...

              No, you could not because modern objective-c for Mac OS and iOS has ARC retain/release just like Swift. ARC is not specific to Swift.

              ... or C/C++ to work around the current problems with Swift but then I'd have 25k lines of ugly code instead of 10k lines of pretty code that I actually want to maintain and work on :)

              Having done quite a bit of C/C++, a fair amount of Objective-C and some Swift I'm not sure how you came up with any such ratios. We must ha

              • by DrXym ( 126579 )

                Not in the cases you mentioned earlier, high performance numerics and games. I've worked in both areas. To be fair I am assuming you are not including casual video games.

                The general advice for writing games in Java is avoid creating temporary objects - use long lived objects, don't create objects in the scope of a loop, avoid for-each mechanisms (temporary iterators), reuse buffers and arrays, store as much state in values and buffers instead of objects and only release during transitions (game over, new level) etc.

                Everything to reduce the duration and frequency of GCs in the middle of the action. Java GC works fine in general but it's very disruptive for the game world t

                • The general advice for writing games in Java is avoid creating temporary objects [...]

                  That's like saying "do not use classes or templates in the C++".

                  If you have a library or an interface, you inevitably end-up with temp objects to accommodate the other interface. (Heck, even the Java standard library on its own creates piles of temps.)

                  Literally everything these days is in libraries and wrapped in the interfaces, there is no way in hell a sane Java programmer can reduce drastically the number of temp objects.

                  Practical example. In one project, few devels spent several weeks optimizing t

                  • by DrXym ( 126579 )

                    That's like saying "do not use classes or templates in the C++".

                    No, it's like saying if you want a performant game written in Java then you must avoid doing certain things.

                    This is just ridiculous.

                    Yes [devahead.com], you're [stackexchange.com] completely [stackoverflow.com] right [war-worlds.com].

                    • This is just ridiculous.

                      Yes [devahead.com], you're [stackexchange.com] completely [stackoverflow.com] right [war-worlds.com].

                      Rrrrright.

                      A pile of generic performance optimization tricks definitely solves real world problems in real world applications. Or probably it does for you, the whole world is reduced to games and Android.

                      Try to write some business logic which crunches 100 million entities, and then come back. Or a networking application which serves 10K+ requests/s in real time. But why go so far - an Eclipse-like text editor without C, in pure Java. All that is routinely done in C/C++ - and still generally fails in Java.

                    • by DrXym ( 126579 )
                      Perhaps you're dense or something because I wasn't referring to writing business logic or network applications. I was referring specifically to what games have to do to avoid GCs in Java. The context is extremely clear. And yes I've developed lots of Java software.
                    • Try to write some business logic which crunches 100 million entities, and then come back. Or a networking application which serves 10K+ requests/s in real time. But why go so far - an Eclipse-like text editor without C, in pure Java. All that is routinely done in C/C++ - and still generally fails in Java. I know it, because I have tried.

                      Ok, first - you know that Eclipse is written in Java don't you? :)

                      Beyond that - a) The biggest financial institutions in the world use Java to crunch numbers on larger sets of entities than that every day (I have written some of these systems). b) Tomcat is a pure Java application server and it can easily scale to 10k requests per second on a reasonable server... and c) The best IDE in the world, JetBrains IDEA, is pure Java and I use it every day.

                      I don't know why this thread has become a bash-Java thread

                • The general advice for writing games in Java is avoid creating temporary objects - use long lived objects, don't create objects in the scope of a loop,..

                  Exactly. And the problem with Swift / ARC as compared to Java / GC is that the Swift compiler has no idea how long-lived objects are and so it has to do this super paranoid retain/release every time any reference type is touched. What is basically free in Java (referencing long lived objects on the heap) is relatively costly in Swift or Objc with ARC. At minimum this is unexpected behavior for most people and makes writing high performance code in Swift awkward right now.
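A rough C++ analogy of that cost, not what the Swift compiler actually emits: copying a `std::shared_ptr` bumps a reference count (typically with atomic operations), while passing it by const reference does not. The `Entity` type and both function names are invented for this sketch.

```cpp
#include <memory>

// Stand-in for any reference type.
struct Entity { double x = 0.0; };

// Pass by value: the shared_ptr is copied, so the reference count goes up
// on entry and back down on return. This is the "retain/release" pair.
long refs_seen_by_value(std::shared_ptr<Entity> e) {
    return e.use_count();
}

// Pass by const reference: no copy is made, so the count is untouched.
long refs_seen_by_reference(const std::shared_ptr<Entity>& e) {
    return e.use_count();
}
```

With a single owner `auto e = std::make_shared<Entity>();`, the by-value call observes a count of 2 and the by-reference call observes 1. A C++ programmer chooses between the two by hand; the complaint above is that Swift code has no equally direct way to say "this object is known to outlive the call".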

                  What I would desperately love is

              • No, you could not because modern objective-c for Mac OS and iOS has ARC retain/release just like Swift. ARC is not specific to Swift.

                Yes, I could because Objective-C has a lovely switch called -fno-objc-arc that would allow me to decide on a chunk-of-code by chunk-of-code basis where ARC was acceptable and where it was not.

                As for your other comments without getting into particular examples this isn't very productive. My original question again was about whether there is anything in the latest clang that might help Swift.

                • No, you could not because modern objective-c for Mac OS and iOS has ARC retain/release just like Swift. ARC is not specific to Swift.

                  Yes, I could because Objective-C has a lovely switch called -fno-objc-arc that would allow me to decide on a chunk-of-code by chunk-of-code basis where ARC was acceptable and where it was not.

                  Not quite, at least on Mac OS. ARC is required for Mac App Store apps. Not sure when the iOS App Store goes this way too. Apple has been telling developers to convert their code to ARC for years, there are even tools to do much of the work automatically.

                  In any case the point remains, modern objective-c is ARC based. Code generated by Xcode has been assuming ARC for a while.

            • > Modern languages with runtimes like Java, C# (and presumably Swift when it gets its act together) can actually be *faster* than C/C++ in some cases because they have more optimization information at runtime...

              Except that high performance code does NOT use OOP; it uses DOD (Data-Oriented Design), which is far faster.

              * Pitfalls of Object Oriented Programming [cat-v.org]

              * Mike Acton: Code Clinic 2015: How to Write Code the Compiler Can Actually Optimize [youtube.com]

              • Ok, first, those articles are about statically compiled C/C++ and in particular targeting game systems that only support those types of applications. Runtime platforms like Java and .NET can do optimizations that those cannot, like optimistically inlining methods where there is not enough information to prove that they never need dynamic dispatch, and making memory management for short-lived objects almost free by putting them on special parts of the heap.

                But to take a step back - yes, it's usuall

                • The problem with the OOP philosophy is that it tends to encourage architecting/designing for the uncommon case: 1 object. DOD instead designs for the common case: many objects.

                  DOD is the 3rd tier of optimization.

                  1. Low-level bit twiddling [stanford.edu]
                  2. Algorithm
                  3. Data cache access and usage patterns
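The third tier, and the "many objects" point above, can be sketched as array-of-structs versus struct-of-arrays. The `Particle` and `Particles` types and both function names are illustrative assumptions, not taken from the linked talks.

```cpp
#include <vector>

// Array-of-structs: the OOP-style layout, one object per particle.
// The four fields of each particle are interleaved in memory.
struct Particle {
    float x, y, z;
    float mass;
};

float total_mass_aos(const std::vector<Particle>& ps) {
    float sum = 0.0f;
    // Each iteration pulls a 16-byte Particle through the cache
    // but only uses the 4-byte mass field.
    for (const Particle& p : ps) sum += p.mass;
    return sum;
}

// Struct-of-arrays: the data-oriented layout, one array per field.
struct Particles {
    std::vector<float> x, y, z, mass;
};

float total_mass_soa(const Particles& ps) {
    float sum = 0.0f;
    // Masses are contiguous, so every byte fetched is a byte used.
    for (float m : ps.mass) sum += m;
    return sum;
}
```

Both functions return the same sum for the same data. Whether the second layout is actually faster depends on the access pattern and the hardware, so it should be measured rather than assumed.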

                  > where there is not enough information to prove that they never need dynamic dispatch

                  The other mantra of DOD is "Know Thy Data." Instead of having a generic container (because one is under the delusion this is "symm

            • Modern languages with runtimes like Java, C# (and presumably Swift when it gets its act together) can actually be *faster* than C/C++ in some cases because they have more optimization information at runtime than exists statically at compile time.

              People have been saying that for as long as I have been in software development (~25 years now, counting from the first programming courses I took).

              The dreamers keep telling us that the compilers, which would be able to magically optimize the code, are just around the corner. So that even an idiot can write a program - and let the smart compiler reduce it to the substance of what the user wanted. There would be no need for the highly educated specialists to write software anymore and software developme

            • Modern languages with runtimes like Java, C# (and presumably Swift when it gets its act together) can actually be *faster* than C/C++ in some cases because they have more optimization information at runtime than exists statically at compile time.

              They can but in fact they aren't. Performance must always be measured.

              No amount of wishful thinking will change the fact that the C and C++ implementations gcc and icc are generally faster than implementations of other current languages (mostly due to smart compiler optimizations) except that Fortran tends to be faster than C for numerical stuff and GNAT Ada can sometimes beat C++ and even C or at least be on a par with it. I'm not saying that there are no occasional outliers or that speed is everything (of

  • by UnknownSoldier ( 67820 ) on Tuesday September 01, 2015 @07:57PM (#50440847)

    I've been forced to manually install gcc 5.x on OSX simply because clang didn't support OpenMP.

    This is great news. Now I can support both compilers on OSX.
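For context, this is the kind of code that needs OpenMP support from the compiler: a minimal sketch of a parallel reduction using only OpenMP 3.1-level features. The function name `dot` is an assumption for illustration. It assumes OpenMP is switched on at build time, typically with `-fopenmp`; without that flag the pragma is ignored and the loop simply runs serially.

```cpp
#include <cstddef>
#include <vector>

// Dot product of two vectors, using the shorter length if they differ.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    const long n = static_cast<long>(a.size() < b.size() ? a.size() : b.size());
    double sum = 0.0;
    // Each thread accumulates a private partial sum; OpenMP combines the
    // partial sums with + when the loop finishes.
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because the same source builds with or without OpenMP, it is a convenient way to check that two compilers on the same machine agree on the result.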

    • I've been forced to manually install gcc 5.x on OSX

      Oh dear god, I feel your pain. I had to install it from source once on a system a few years old and quickly discovered that the old adage that installing GNU-anything requires installing GNU-everything-else still holds true, there were so many dependencies on other tools and "your version of A is out of date, you need to update A before you can update B and use that to update C which needs D and E and F and then you can finally build G which will allow you to install H" that in the end I gave up.

      • Guess I got lucky then. But yeah, the GNU toolchain definitely can end up in dependency-hell like you mention! For some reason my Ubuntu box is much more susceptible to this than OSX. :-/ Then again I wasn't using `brew` -- which has its own set of problems.

  • The current 6.4 "gcc -v":

    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
    Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
    Target: x86_64-apple-darwin14.5.0
    Thread model: posix

  • The compilation tests they ran are completely pointless because all it measures is the amount of time required to build XYZ which is not a measure of a good compiler. What they should be looking at is what is actually being generated for its size, efficiency and most importantly, accuracy.

    Compiling code with -O3 on GCC will get you in trouble yet they still use it. However, I noticed some of their tests use -O2 instead which I presume is because some optimization resulted in an incorrect result.

    However,

    • The compilation tests they ran are completely pointless because all it measures is the amount of time required to build XYZ which is not a measure of a good compiler.

      It's one of many measures of a good compiler.

      What they should be looking at is what is actually being generated for its size, efficiency and most importantly, accuracy.

      They test efficiency, that's what all the benchmarks are for.

      Compiling code with -O3 on GCC will get you in trouble

      No it won't, at least not more often than incredibly rarely,

      • -Os frankly is of little interest to desktop developers. Heck, I spend quite a bit of my time on 8 bitters these days, and I think you're being pedantic.

        You might want to tell Apple that, as they compile everything with -Os. It turns out that instruction cache pressure still matters, and matters a lot more if you're in the kind of environment where multiple applications are competing for space.

        • I care quite a lot about high performance stuff and scientific computing. My stuff runs hours faster on my Linux luggable and cluster using -O3, and yes I did benchmark it. I don't really care what Apple do, and I doubt they'd listen.

          Besides the gp was trying to be smug and superior by claiming that because they didn't do his pet test it's obviously crap and you shouldn't listen. I'm kind of sick of people like that, because they don't say anything to inform or offer insight, they're simply trying to make themse

          • If you're doing HPC, then you're definitely not the kind of 'desktop user' that the grandparent was talking about. For a single compute-bound application consuming all of the system resources, -O2 or -O3 will almost always win (unless they manage to blow out L1 i-cache on a hot loop, which does happen but is quite rare). When you benchmark systems with a lot of active processes, then the numbers become very different, because cache contention starts to matter (so does TLB contention, though on x86 with th
            • If you're doing HPC, then you're definitely not the kind of 'desktop user' that the grandparent was talking about.

              How do you know he's a desktop user? All the OP did was state the benchmarks are useless and you shouldn't read the article.

              When you benchmark systems with a lot of active processes,

              Modern desktops are putting a lot of effort into reducing the number of wakeups per second in order to reduce power draw. This means that on most systems, there are a lot of processes, but very few running at any give

              • How do you know he's a desktop user?

                Because (in the part of the post that I quoted in my reply), he said:

                -Os frankly is of little interest to desktop developers

                And I replied that -Os is relevant to desktop users, which you then disputed by saying that it's not relevant to HPC.

                Modern desktops are putting a lot of effort into reducing the number of wakeups per second in order to reduce power draw. This means that on most systems, there are a lot of processes, but very few running at any given time.

                Timer coalescing does the exact opposite. It means that you'll have a single wakeup and then a load of processes run, and then sleep. This increases i-cache pressure, it doesn't reduce it.

        • All of my code runs fastest with -O3 and slowest with -Os. Tested extensively. But it's Ada (GNAT gcc) on Linux.

          • Are you testing the performance of a single program, or of an entire system? The numbers generally change quite a lot when you look at interference.
  • OpenMP 4.0 (Score:4, Interesting)

    by tanderson92 ( 1636327 ) on Tuesday September 01, 2015 @09:36PM (#50441279)

    They *just now* implemented OpenMP 3.1, a standard 4 years old. OpenMP 4.0, which is now more than 2 years old, is unaddressed while GCC has had it for some time (indeed, they recently added support for OpenACC).

    Somehow I don't think scientific users are going to be lining up to use it
