Intel Releases Threading Library Under GPL 2

Posted by CmdrTaco on Wed Jul 25, 2007 10:02 AM
from the of-interest-to-some-of-you dept.
littlefoo writes "Intel Software Dispatch have announced the availability of the Threading Building Blocks (TBB) template library under the GPL v2 with the run-time exception, so this previously commercial-only package is now open for all to use, whether for open-source projects or commercial offerings (although they are explicitly encouraging open source use). The interface is more task-based than thread-based, but with a somewhat different view of things than, e.g., OpenMP. From the Intel release: 'Intel® Threading Building Blocks (TBB) offers a rich and complete approach to expressing parallelism in a C++ program. It is a library that helps you leverage multi-core processor performance without having to be a threading expert. Threading Building Blocks is not just a threads-replacement library. It represents a higher-level, task-based parallelism that abstracts platform details and threading mechanism for performance and scalability.'"
  • Woohoo (Score:3, Insightful)

    by jshriverWVU (810740) on Wednesday July 25 2007, @10:07AM (#19982825)
    If it's as smooth as the Intel C compilers this ought to be a treat. Now if only they'd release icc under a similar license.
  • GPL 2 (Score:3, Informative)

    by raffe (28595) * on Wednesday July 25 2007, @10:08AM (#19982835) Journal
    As the GPL 2 they link to says:
    "Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation"

    You can of course get it as GPL 3....
    • Re: (Score:3, Informative)

      The source explicitly says version 2. The "any later version" clause was left out.

      Threading Building Blocks is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation.
    • From the Development download src/tbb/Makefile:

      # Copyright 2005-2007 Intel Corporation. All Rights Reserved.
      #
      # This file is part of Threading Building Blocks.
      #
      # Threading Building Blocks is free software; you can redistribute it
      # and/or modify it under the terms of the GNU General Public License
      # version 2 as published by the Free Software Foundation.

      There's no "Or Later" in there. This is GPL v2 only.
      • Re:GPL 2 only (Score:5, Interesting)

        by networkBoy (774728) on Wednesday July 25 2007, @11:54AM (#19984289) Homepage Journal
        Which is perfectly fine. I have a friend at Intel and based on what I've heard of the corporate culture, open-ended licenses are a no-go. That doesn't mean they won't later release under GPL v3, just that they want their lawyers to have a chance to review any license they release under and don't want to be beholden to the unknown. Frankly I think that's a good thing. In theory GPLv4 could say "this can be used in closed source proprietary DRM schemes", and if they had the "or later" clause they would have to allow it.
        -nB
        • Re:GPL 2 (Score:4, Informative)

          by Aladrin (926209) on Wednesday July 25 2007, @11:02AM (#19983481)
          It depends on which version of the GPL you use. There's a 'runtime exception' version (That Intel chose for this project) that allows you MORE freedom than the LGPL in the case of libraries.

          Simply put, you can link in the code as a library without worrying about LGPL's library requirements. (Namely the need to be able to replace the library with an upgraded version.) Intel notes that this is necessary for C++ libraries because of the way they have to be linked.

          For the parent's code, I doubt he chose to have this clause in the GPL he chose, and it wouldn't be possible with his.
        • Re: (Score:3, Interesting)

          Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version"

          Actually it says (and "any later version"). The part of the program that says it is licensed under the GPL would have to say the "OR" version. The portion that you, and many others who don't know the GPL well enough to discern the intent, pick is the part outside the GPL entitled how to apply this, for reference. It isn't part of the GPL and it isn't anything ot

  • by malfunct (120790) * on Wednesday July 25 2007, @10:12AM (#19982875) Homepage
    I find it interesting that the original poster took the trouble to differentiate between open source and commercial offerings as if there has to be a difference.
  • I'm glad to hear it (Score:5, Informative)

    by ookabooka (731013) on Wednesday July 25 2007, @10:13AM (#19982889)
    I attended a seminar about this at GDC (Game Developers Conference) this year. It is really nifty stuff: it automatically parallelizes things for you and helps take the load off of the OS scheduler. It is also trivial to implement in many cases. For instance, there are parallel loops that execute things in parallel; all you have to do is write it like a normal loop but use a different keyword (ok, so it is a wee bit more involved, but you get the idea). If I recall correctly it is basically a thread pool that manages scheduling itself better than the OS because it knows ahead of time the needs of the code. You also don't have to know the # of cores or anything, as it handles that transparently. And it isn't limited to Intel processors; I'm pretty sure at GDC it was actually being demoed on some sparc machines. If I had the time and/or a reason to use it I would definitely investigate further.
  • by TomorrowPlusX (571956) on Wednesday July 25 2007, @10:15AM (#19982897)
    I looked at some of the tutorials yesterday, and I believe I'm going to dip my toes in this.

    But. As much as I love C++ ( and I do ) the real weakness is the lack of usable closures/lambda. The parallel_for example requires you to pass a functor to execute on ranges, which is fine, it makes sense, but since you can't define the closure in the calling-scope in C++ you end up filling your namespace with one-off function objects.

    This is not a critique of TBB, but rather of C++. In java I can make an anonymous subclass within function scope. In python and hell even javascript I can make anonymous functions to pass around. But in C++ I can't, and this means that my code will be ugly.

    Not that this is new news. I use Boost.thread for threading right now, and most of my functors are defined privately in class scope ( which is, at the very least, not polluting my namespace ) but it's too bad that I don't have a more elegant option in C++.

    That being said, Boost.lambda makes my brain hurt a little, so my complaints are really just a tempest in a teacup. If I were smarter and could really grok C++ I could probably use Boost.Lambda and this would be a non-issue.
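    To make the complaint above concrete, here is a minimal sketch of the one-off function object pattern being described. The names `IndexRange`, `apply_in_blocks`, `ScaleBody` and `scale` are hypothetical and written for illustration only; `apply_in_blocks` is a serial stand-in for a parallel loop template, not TBB's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a library range type: a half-open index interval.
struct IndexRange {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical serial stand-in for a parallel loop template: it hands the
// body one block of indices at a time. A real library would run the blocks
// on worker threads instead.
template <typename Body>
void apply_in_blocks(std::size_t count, std::size_t block, const Body& body) {
    assert(block > 0);
    for (std::size_t lo = 0; lo < count; lo += block) {
        IndexRange r = { lo, (lo + block < count) ? lo + block : count };
        body(r);
    }
}

// The one-off function object. It exists only to carry the loop body and
// the state the body needs, yet it has to be declared outside the function
// that uses it.
class ScaleBody {
public:
    ScaleBody(float* data, float factor) : data_(data), factor_(factor) {}

    void operator()(const IndexRange& r) const {
        for (std::size_t i = r.begin; i != r.end; ++i) data_[i] *= factor_;
    }

private:
    float* data_;
    float factor_;
};

// The calling code: short, but its actual loop body lives in ScaleBody.
inline void scale(std::vector<float>& v, float factor) {
    if (v.empty()) return;
    apply_in_blocks(v.size(), 64, ScaleBody(&v[0], factor));
}
```

    Calling `scale(v, 2.0f)` doubles every element of `v`. The point of the sketch is the distance between `scale` and the loop body it really runs, which is the readability cost the comment is about.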
    • But the thing is (Score:4, Informative)

      by Sycraft-fu (314770) on Wednesday July 25 2007, @02:06PM (#19986197)
      C++ (or C) is where all the fast code is still written. Thus it is the most relevant place for this kind of thing. If you look at Intel's page, you'll see they sell compilers, but only for two languages: C/C++ and Fortran. The reason is that their compilers are specifically to get as much performance as possible on an x86/x64 chip. So they target the languages people use when they are performance oriented. There are lots of other great languages out there, but face it, you aren't (or at least shouldn't be) using a managed language like Java when every last clock cycle counts.

      You'll find that this is rather evident in most games. While it is increasingly common to write large portions of the game in a scripting language since that makes it easier to write and, perhaps more importantly, easier to mod, you'll find that the high speed stuff is still C++. Take Civ 4 for example. They wrote almost the whole damn game in XML and Python. All data (like unit definitions, technology tree, etc) is stored in XML files, all the scripting necessary to make them work is Python. Makes the game extremely easy to mod. However, the AI code, which they also released to end users, is in C++. The reason is that the AI is highly intensive and would have run too slow in Python. Also, the core engine of the game (not released to users) is C++ as well.

      So it isn't surprising this is where Intel is targeting their optimisations. Also, I'd argue that to a large degree any of this kind of thing for a managed language is the responsibility of the runtime itself. If Java is to have better support for automatically threading things, the JRE is probably where that should be done.
      • by ray-auch (454705) on Wednesday July 25 2007, @11:46AM (#19984145)
        Erm, yes, C++ has local classes, however there is a "BUT" and it's a big one:

        Local classes / structs do not have external linkage and therefore can't be used as template arguments. So, for functors etc., which is precisely where you'd want something like a local class (ie. because you really want a closure), they are useless.

        Hence why we have Boost lambda. Except, and I agree with the GP, the syntax ends up so horrible (due to the constraints of C++, not in any way the fault of the Boost devs) that you end up not using it. Not a lot of point in trying to do something because it is technically cleaner and neater if it ends up unreadable and therefore unmaintainable (for that, there is always Perl).
        • Re: (Score:3, Informative)

          Local classes are definitely standard, section 9.8 I think.

          Local _functions_ aren't in C++, but may be a GCC extension - which might be confusing you.
  • GPLv2 only (Score:3, Informative)

    by starseeker (141897) on Wednesday July 25 2007, @11:30AM (#19983901) Homepage
    As near as I can tell, this is GPLv2 ONLY (without the "or any later version" clause). Checking a random source file in the distribution, there is no "later version" language present.

    This doesn't surprise me much, actually - I imagine Intel wouldn't want to commit their code to an unknown future license, and I expect they're still evaluating GPLv3. Even if they were done with that evaluation, the process for releasing this under v2 probably took a LONG time to complete - Intel is after all a large corporation. Restarting with GPLv3 probably would have just delayed it, although I suppose the only ones who would actually know that work for Intel.
  • by ohell (821700) on Wednesday July 25 2007, @11:45AM (#19984131)

    I read on their FAQ that TBB requires 512MB to run, though they recommend 1GB. This appears to be very high, especially when compared to Boost.Threads etc. I can't think of a reason why they need to allocate this much - and it would probably be a problem for consumer applications.

    Also from the FAQ, the so-called concurrent containers still need to be locked before access. So no change from normal STL containers there.

    But I will download it just for the memory allocator they supply, since it can be plugged into STL, and claims to hand out cache-aligned memory. It can apparently be built independently of the rest of TBB.
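    As a rough illustration of what "plugged into STL" means for an allocator like the one mentioned above, here is a minimal sketch of a cache-aligned allocator used with `std::vector`. It is not TBB's allocator: `CacheAlignedAllocator`, `is_cache_aligned`, `make_buffer` and the 64-byte line size are all assumptions made for the example, and it relies on POSIX `posix_memalign`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Assumed cache-line size. 64 bytes is common on x86 but not universal.
const std::size_t kAssumedCacheLine = 64;

// Hypothetical minimal allocator: every block it returns starts on a
// cache-line boundary.
template <typename T>
struct CacheAlignedAllocator {
    typedef T value_type;

    CacheAlignedAllocator() noexcept {}
    template <typename U>
    CacheAlignedAllocator(const CacheAlignedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        void* p = nullptr;
        // posix_memalign returns non-zero on failure (POSIX only).
        if (posix_memalign(&p, kAssumedCacheLine, n * sizeof(T)) != 0) {
            throw std::bad_alloc();
        }
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U>
bool operator==(const CacheAlignedAllocator<T>&, const CacheAlignedAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CacheAlignedAllocator<T>&, const CacheAlignedAllocator<U>&) noexcept {
    return false;
}

inline bool is_cache_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kAssumedCacheLine == 0;
}

// Plugging it into a standard container is just a second template argument.
inline std::vector<float, CacheAlignedAllocator<float> > make_buffer(std::size_t n) {
    std::vector<float, CacheAlignedAllocator<float> > v(n, 0.0f);
    assert(n == 0 || is_cache_aligned(v.data()));
    return v;
}
```

    A buffer from `make_buffer(1000)` behaves like any other `std::vector<float>`, except that its storage starts on a 64-byte boundary, which helps keep separately allocated buffers from sharing a cache line.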

    • by ookabooka (731013) on Wednesday July 25 2007, @10:20AM (#19982945)
      That's the thing, it makes programming easier by making the whole parallel thing a bit more transparent. Basically picture a foreach loop. This thing allows you to do the same thing but instead can do multiple instances of the loop at once and automatically uses the "optimal" number of threads based on the cores available, you just have to call parallel_for. It's not quite as simple as that but it certainly does take the grunt work out of parallelizing things.
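      The idea described above can be sketched with nothing but standard threads. This is not TBB's `parallel_for` and not how TBB schedules work: `chunked_parallel_for` and `double_all` are hypothetical helpers that simply split an index range into one contiguous chunk per reported core, which is the grunt work the library is said to take off your hands.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper, NOT TBB's parallel_for: splits [begin, end) into one
// contiguous chunk per worker and runs body(chunk_begin, chunk_end) on each.
template <typename Body>
void chunked_parallel_for(std::size_t begin, std::size_t end, const Body& body) {
    assert(begin <= end);
    const std::size_t count = end - begin;
    if (count == 0) return;

    // hardware_concurrency() may return 0 when the core count is unknown.
    std::size_t workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 1;
    workers = std::min(workers, count);

    const std::size_t chunk = (count + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> threads;
    threads.reserve(workers);
    for (std::size_t lo = begin; lo < end; lo += chunk) {
        const std::size_t hi = std::min(lo + chunk, end);
        threads.emplace_back([&body, lo, hi] { body(lo, hi); });
    }
    for (std::thread& t : threads) t.join();
}

// Example body: doubles every element in its sub-range. Chunks never
// overlap, so no locking is needed.
inline void double_all(std::vector<int>& values) {
    chunked_parallel_for(0, values.size(), [&values](std::size_t lo, std::size_t hi) {
        for (std::size_t i = lo; i < hi; ++i) values[i] *= 2;
    });
}
```

      The caller of `double_all` never mentions a thread count. A task-based library goes further than this sketch (it can rebalance uneven work, for example), but the calling code looks much the same.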
      • it makes programming easier by making the whole parallel thing a bit more transparent
        I'd argue that it makes things more opaque, by abstracting away the need to explicitly deal with threads. Instead, you just define "tasks" that can run concurrently, and the toolkit takes care of mapping the tasks to actual threads.

        Agreed it does look to take a lot of the grunt work out of writing parallel-processing code. There are supposedly Java and .NET versions under development, it'll be interesting to see if they're able to implement the concepts as cleanly as in C++. My guess is both implementations will be a little "clunky" (cumbersome and less efficient).
        • Re: (Score:3, Insightful)

          Well. . .c++ abstracts away from ASM, so is it bad too? Abstraction isn't a problem really, especially when it handles a bunch of grunt work correctly and efficiently. Yeah some programmers might not understand exactly what they are doing, but tools that add a layer of abstraction are OK in my book so long as they don't make things more complicated or grossly inefficient. Besides, if you really wanted to do it differently you could either modify the GPL code or write it from scratch. Hopefully, handling thr
    • by Holi (250190) on Wednesday July 25 2007, @10:42AM (#19983185)
      >There are 11 types of people in the world, those who know binaries and those who don't.

      Obviously you are in the those who don't group.
      • by dubbreak (623656) on Wednesday July 25 2007, @11:53AM (#19984283)
        I fixed it for him:

        There are 11 types of people in the world: those who know binaries, those who don't and those who don't.


        The then/than mixup is kind of funny though. Reminds me of something I read in the engineering faculty on a white board (I assume a first year engineer):
        "I'd rather be retarded then do my engineering homework.."

        Looks like he had the pre-requisite fulfilled and should have just got on with the homework.
    • Re:I'm thinking (Score:5, Informative)

      by hrieke (126185) on Wednesday July 25 2007, @10:49AM (#19983247) Homepage
      The AMD question was raised on their forums, and there are no issues with TBB running on AMD CPUs.
      And, if there was, well it's under the GPL now, and I'm sure someone would have added / corrected that mistake.

      • PS3? (Score:4, Interesting)

        by LinuxGeek (6139) * <linuxgeek@djan[ ]om ['d.c' in gap]> on Wednesday July 25 2007, @01:58PM (#19986077)
        I checked the site and forum, but no search results on PS3. Having just bought a shiny new 60gig PS3, this release makes me wonder just how easy it could be to take fairly good advantage of all the cores.

        Hmmm, it may be one of my first projects; six cores running @ 3.2GHz and an easy method of putting them to use. It would be interesting to parallelize pi calculation and see how long it would take to get one million digits.
        • Re:PS3? (Score:4, Informative)

          by Doctor Memory (6336) on Wednesday July 25 2007, @02:45PM (#19986725) Homepage

          Having just bought a shiny new 60gig PS3, this release makes me wonder just how easy it could be to take fairly good advantage of all the cores.
          That should be interesting, since the Cell is a non-orthogonal multi-core CPU (sort of like a PPC core with multiple AltiVec units). Opcodes for the main core (the PPE) are Power/PowerPC, while the satellite processors (the SPEs) run a vector (similar to the AltiVec or VMX) instruction set. I believe the PPE can also execute the vector instructions, so maybe it would be possible to just target that. I'm not sure how general-purpose those opcodes are, though, and since I don't believe the PPE has the SPE's complement of 128 registers, you might wind up just supporting whatever register set the PPE has.
    • by James_Intel (1082551) on Thursday July 26 2007, @01:47AM (#19993055)
      We've been supporting Linux, Windows and Mac OS X for x86, x86-64 and Itanium processors in the commercial product for a year. And, yes, those include Intel and AMD processors. The commercial product information only lists those.

      The commercial product information quoted does not include some ports which were completed for the open source project only days before the open source release.

      Preparing for open source, we were able to get G5 for Mac OS X as well as support for Solaris and FreeBSD (both x86 and x86-64) working before releasing on Tuesday. It was tight - but they made it. I wasn't sure until the week before what we would have - but the team got them working. I think it will be easier now that the project is started - and we can let others join in to help us.

      I should also say we got a bunch more Linux distributions working for builds too. We have tested them enough to see no issues - but we haven't enough experience to call them supported on the product pages (commercial product). Please look for the latest ports on the open source project threadingbuildingblocks.org. We'll work with anyone who has processors/system expertise and needs any advice we can offer. Understandably, we don't have a lot of non-Intel hardware inside Intel to test upon and we are hoping others can help a bit with that.

      For compilers - we have gcc, Intel, Microsoft and Apple (gcc in Xcode environment) compilers all working with the builds. It seems like we may have something to do to get Sun's compilers and/or environment working - some Sun engineers are in touch and helping us double check this. No schedule - just working together - which I have faith will get results to put out in an updated open source copy in the not too distant future - non-binding wish - this is not a promise ;-) We're talking about what to do together to add SPARC support too - which shouldn't be too hard but will take some work.

      The biggest issue from processor to processor is knowing how best to implement a few key locks and atomic operations in assembly language. Since we have support for processors with both weak and strong memory consistency models - we know TBB is up to the task.

      TBB is very strongly tied to shared memory, and so a port to a Cell processor (or a GPU) would be a bit more challenging - but might be doable for the Cell. We've had only a few discussions/thoughts - no progress I know of figuring out a good approach there. That will almost certainly take someone with more Cell experience than we have at this time. I'm open to learning - but I'd need a teacher for sure.
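      For readers wondering what the "key locks and atomic operations" mentioned above look like, here is a minimal sketch of a test-and-set spin lock. It is written with portable `std::atomic_flag` rather than per-processor assembly, and it is not taken from TBB: `SpinLock` and `count_with_lock` are hypothetical names. The acquire and release orderings are the part that matters on processors with weak memory consistency.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical minimal test-and-set spin lock, for illustration only.
class SpinLock {
public:
    void lock() noexcept {
        // Acquire ordering: reads and writes inside the critical section
        // cannot be reordered before the lock is taken.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // be polite while waiting
        }
    }

    void unlock() noexcept {
        // Release ordering: writes made inside the critical section become
        // visible to the next thread that acquires the lock.
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Several threads increment a plain (non-atomic) counter under the lock.
inline long count_with_lock(int threads, int increments_per_thread) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return counter;
}
```

      With the lock in place, `count_with_lock(4, 10000)` returns 40000. Without it, the unsynchronized increments would be a data race.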