Unleashing the Power of the Cell Broadband Engine

An anonymous reader writes "IBM DeveloperWorks is running a paper from the MPR Fall Processor Forum 2005 that explores programming models for the Cell Broadband Engine (CBE) Processor, from the simple to the progressively more advanced. With nine cores on a single die, programming for the CBE is like programming for no processor you've ever met before."

Comments Filter:
  • Cell architecture (Score:3, Informative)

    by rd4tech ( 711615 ) * on Sunday November 27, 2005 @01:37AM (#14122551)
    The Cell Architecture grew from a challenge posed by Sony and Toshiba to provide power-efficient and cost-effective high-performance processing for a wide range of applications, including the most demanding consumer appliance: game consoles. Cell - also known as the Cell Broadband Engine Architecture (CBEA) - is an innovative solution whose design was based on the analysis of a broad range of workloads in areas such as cryptography, graphics transform and lighting, physics, fast Fourier transforms (FFT), matrix operations, and scientific workloads. As an example of innovation that ensures the clients' success, a team from IBM Research joined forces with teams from IBM Systems Technology Group, Sony and Toshiba to lead the development of a novel architecture that represents a breakthrough in performance for consumer applications. IBM Research participated throughout the entire development of the architecture, its implementation and its software enablement, ensuring the timely and efficient application of novel ideas and technology into a product that solves real challenges. More [ibm.com]...
  • I just want to draw a flowchart and have the compiler and realtime scheduler distribute processes and data among the hardware resources. If we are getting a new architecture and new "programming models", and therefore new compilers and kernels, how about a new IDE paradigm.
    • Re:New Me (Score:5, Funny)

      by NanoGator ( 522640 ) on Sunday November 27, 2005 @01:44AM (#14122578) Homepage Journal
      "I just want to draw a flowchart and have the compiler and realtime scheduler distribute processes and data among the hardware resources. If we are getting a new architecture and new "programming models", and therefore new compilers and kernels, how about a new IDE paradigm."

      Bingo, sir.
    • Time to port the Lego Mindstorms development environment to the Cell processor!
    • I just want to draw a flowchart and have the compiler and realtime scheduler distribute processes and data among the hardware resources. If we are getting a new architecture and new "programming models", and therefore new compilers and kernels, how about a new IDE paradigm.

      Isn't this basically the dataflow [wikipedia.org] paradigm?

      I think it should be possible to make any programming language automatically spread its work across multiple processors simply by analyzing which operations depend on which to generate "wa

      • Your programming example would remove the need for mutexes, but that's about all. Spawning a thread, while faster than an entire process, has cost.

        In your example, the context of the current thread would have to be duplicated (how does the compiler know what data operation() will use?). They can't run from the same data - what if doSomeThing() modifies a field that operation() uses?

        Java's FutureTask requires extra work, but it protects you from concurrent data modification errors. (Or, it appears to. I'

        • Your programming example would remove the need for mutexes, but that's about all. Spawning a thread, while faster than an entire process, has cost.

          No, you'd still need mutexes. My example is simply a clean and simple way of expressing "this task can be run on a separate worker thread".

          In your example, the context of the current thread would have to be duplicated (how does the compiler know what data operation() will use?). They can't run from the same data - what if doSomeThing() modifies a field that
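
          To make the thread-spawning tradeoff above concrete, here is a minimal pthreads sketch in C (my illustration, not from either poster); operation() is the hypothetical function from the posts above, and it is handed a copy of its input rather than shared state, which is what sidesteps the mutex question:

              #include <pthread.h>
              #include <stdio.h>
              #include <stdlib.h>

              /* Hypothetical task from the thread above: works on a private
               * copy of its input, so the computation itself needs no mutex. */
              static void *operation(void *arg) {
                  long input = (long)arg;           /* passed by value, not shared */
                  long *result = malloc(sizeof *result);
                  *result = input * input;          /* stand-in for real work */
                  return result;
              }

              int main(void) {
                  pthread_t worker;
                  /* "Spawning a thread ... has cost": this create/join pair is it. */
                  pthread_create(&worker, NULL, operation, (void *)42L);

                  /* ... doSomeThing() could run here, overlapped with operation() ... */

                  void *result;
                  pthread_join(worker, &result);    /* the "future": block until done */
                  printf("result = %ld\n", *(long *)result);
                  free(result);
                  return 0;
              }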

    • Real geeks don't use IDEs! ;)

      I'll use an IDE when they invent one that is fast, flexible, and as chrome-free as a normal bash command prompt. I have hated every IDE I've tried due to slowness and their insistence on making an ugly bloated interface. I don't need the IDE to try to guess what I'm typing, to offer code debugging while I'm typing, to FTP my files for me, etc.

      Basic project management, checking my code for errors (when I ask... not constantly), good search and replace (with regular expressions and acr
      • Enjoy your wallow in the 1970s. My coding experience, which started there, includes typing hex machine code into the Apple ][+ machine language monitor. And it also includes working with customers, graphic artists and mathematicians. Which is why productivity is clearly highest throughout the cycle with flowcharts of reusable schematic objects. We've got lots of native topological intuition and skills we almost use while coding and debugging, but which we're not skilled in operating by typing.

        I want to keep
          • Individual idiosyncrasies are nice. I hate being forced to use a given environment to work on something. I keep working on my ideal environment, which is of course ideal only to me. Fancy tools with low overhead. ;)
          • I like subclassing and embedding :).
              • My biggest fetish is making lots of extensible small programs/libraries and building other programs out of them... so I guess I like subclassing and embedding too. I was once accused of creating my own custom programming language for everything I programmed. I do that less now, but I do like to expose the internals in ways that let my programs be modified in all sorts of interesting ways.
  • Damn you marketing droids! This has nothing to do with broadband at all.
  • Wow ... (Score:5, Interesting)

    by JMZorko ( 150414 ) on Sunday November 27, 2005 @01:45AM (#14122583) Homepage

    ... all those _registers_ make me salivate! One of the coolest things about the RCA1802 (the processor I learned on) compared to others in its time was that it had _loads_ of registers compared to a 6502 or 8085. It spoiled me, though ... when I started exploring those other CPUs, I always thought "Huh? Where are all of the registers?"

    So yes, I want a Cell-based devkit now, 'cuz this sounds like _fun_ :-)

    Regards,

    John

    • Umm, did the 6502 need many registers? If there were many more, and they were to be useful, command size might outgrow bus width; some fancy dancing might make it break even on speed. The best idea might be a built-in scratchpad area - still not much of a win over just using the zero page.

      I wonder how well an uber-multicore 6502 at appropriately modern clockspeeds would fare today...
    • What, an accumulator wasn't enough for you?

      Kids these days...
      • The 6502 had X and Y too.
        • X and Y are index registers, and they aren't used in calculations or any "real work" - primarily just for addressing memory. (Though, admittedly, some of the basic instructions provided for X and Y could be used for "real work" if you feel like juggling registers.)
    • You had to do subroutine call and return in actual code. It wasn't until the 1805 that you got a very slow microcoded call and return, and the ability to load and save those registers to the stack without hand coding (RSXA and RLDX). The trouble with the 1802 was that its microarchitecture was just too exposed to the programmer - like the PDP-8. I really don't want to go there again.

      Another horrible early processor was the TMS9900, which pretended to have 16 16-bit registers but they were just mapped

    • A quick calculation shows that the Cell has BUTT LOADS OF REGISTERS!!!

      SPE registers = 8 SPEs * 128 registers * 128 bits
      PPE registers = 2 threads * (32 integer + 32 floating point) registers * 64 bits
      PPE VMX-128 registers = 2 threads * 128 registers * 128 bits

      Grand total = 21+ Kbytes of registers, not to mention the 2 Mbytes of SRAM on the SPEs.

      This thing is going to be a lot of fun!
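
      For the curious, a quick back-of-the-envelope check of those figures (just the parent's arithmetic restated in C):

          #include <stdio.h>

          int main(void) {
              /* Figures quoted above, computed in bits. */
              long spe = 8L * 128 * 128;        /* 8 SPEs x 128 regs x 128 bits    */
              long ppe = 2L * (32 + 32) * 64;   /* 2 threads x 64 regs x 64 bits   */
              long vmx = 2L * 128 * 128;        /* 2 threads x 128 regs x 128 bits */
              long bits = spe + ppe + vmx;
              printf("%ld bits = %ld bytes\n", bits, bits / 8);
              /* Prints: 172032 bits = 21504 bytes - the "21+ Kbytes" above. */
              return 0;
          }
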
  • ps3 programming (Score:3, Insightful)

    by orlyonok ( 729444 ) on Sunday November 27, 2005 @01:47AM (#14122588)
    from the article and if the ps3 cell cpu is even half the processor that this monster is i say that game companies will need a lot of real programmers to make real good games (as if they cared).
    • I say no, they won't need lots of real programmers. They only need 1 or 2 per game team to do the overall design and let the compilers do the rest, since the real guts of it will be compiler optimization. If your lead designers do their job, the compiler will be able to do its job and everything will work like it should.

      It's when you take old code from previous projects and try to do a direct port that you will see some performance hits. But if designed from the ground up in terms of the code

    • Re:ps3 programming (Score:3, Interesting)

      by MikeFM ( 12491 )
      It'd seem to me that a lot of the development trickery will be in getting a proper compiler and specialized libs out there that take advantage of this parallelism without requiring massive changes to how the average developer has to write their code.

      Most of the bitching we've heard from developers so far hasn't been that the Cell sucks but that their existing codebases don't take advantage of its design, and they don't want to make a rewrite that locks them into the platform.

      As with every platform the reall
      • Re:ps3 programming (Score:3, Insightful)

        by iota ( 527 ) *
        It'd seem to me that a lot of the development trickery will be in getting a proper compiler and specialized libs out there that take advantage of this parallelism without requiring massive changes to how the average developer has to write their code.

        Certainly people are working on that very idea. However, it's a long way off and not likely to happen in the lifetime of this version of the processor. Both XLC (IBM's optimizing compiler) and GCC have a very difficult time vectorizing (i.e. taking advantage of
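
        One concrete reason C auto-vectorization is so hard is pointer aliasing; a small sketch (my illustration, not from the compilers mentioned):

            /* A vectorizer can't safely transform this loop: dst and src
             * might overlap, so one iteration may depend on the previous. */
            void scale(float *dst, float *src, float k, int n) {
                for (int i = 0; i < n; i++)
                    dst[i] = src[i] * k;
            }

            /* C99 'restrict' promises no overlap, freeing the compiler to
             * process several floats per iteration with SIMD loads/stores. */
            void scale_restrict(float *restrict dst, const float *restrict src,
                                float k, int n) {
                for (int i = 0; i < n; i++)
                    dst[i] = src[i] * k;
            }
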
        • Re:ps3 programming (Score:3, Interesting)

          by TheRaven64 ( 641858 )
          C is an incredibly bad language for programming a modern CPU. There are many parts of the C language which assume that the target machine looks a lot like a PDP-11. Trying to turn this code into something that runs on a superscalar machine, let alone a vector processor, is incredibly difficult - we can only do it at all because such a huge amount of effort has been invested in research. If you want a good language for programming something like the Cell, then you should take a look at more or less any fun
          • the first is that they don't deal well with resource contention. No language, or any other thing for that matter, does.

            When you fork N processes on N objects and you have N-M processors, it costs you computationally, which translates into lost efficiency.

            It's one thing to think of this situation as a bunch (N) of ball bearings going into a bunch of holes (N-M), with each ball bearing keeping its state information local to it. (Any kind of sieve can serve as a 'gedanken' experiment.)

            The situation becomes ho
            • I wonder if we will see new language designs emerge due to the relative increase in the number of multi-processor/core CPUs out there. I mean something at a high level, not compiler improvements that try to overlap CPU operations from original top-down style code, but something that encourages the actual developers to use producer consumer patterns, co-routines, whatever, at their level to maximize utilization of CPU resources. We keep using the same old languages and assuming the compiler is smart enough
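
              As a rough illustration of the producer/consumer pattern the parent mentions, here it is in today's C with pthreads rather than a hypothetical new language (a sketch of just the queue, not a whole program):

                  #include <pthread.h>

                  #define RING 16
                  static int buf[RING];
                  static int head, tail, count;
                  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
                  static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;
                  static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

                  /* Producer side: blocks while the ring buffer is full. */
                  void put(int item) {
                      pthread_mutex_lock(&lock);
                      while (count == RING)
                          pthread_cond_wait(&not_full, &lock);
                      buf[head] = item;
                      head = (head + 1) % RING;
                      count++;
                      pthread_cond_signal(&not_empty);
                      pthread_mutex_unlock(&lock);
                  }

                  /* Consumer side: blocks while the ring buffer is empty. */
                  int get(void) {
                      pthread_mutex_lock(&lock);
                      while (count == 0)
                          pthread_cond_wait(&not_empty, &lock);
                      int item = buf[tail];
                      tail = (tail + 1) % RING;
                      count--;
                      pthread_cond_signal(&not_full);
                      pthread_mutex_unlock(&lock);
                      return item;
                  }

              The point of a higher-level language would be to make this boilerplate disappear, not to change the underlying pattern.
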
            • Java and Smalltalk are both imperative languages and, while I am quite fond of Smalltalk, my post was about functional languages. Most functional languages don't permit aliasing, which dramatically reduces locking issues related to resource contention (and copy-on-write optimisations can make them very fast).
          • A Bluebottle port to the Cell would be ideal as far as I am concerned. Then, if a functional language like Ocaml or Moby was wanted, it could be written in Oberon. I've got a software realtime raytracer running on Bluebottle (dual-boot Linux/BB x86 box). Boy, would I like to port it to Cell. Of course the tracer could be converted to C++. But that would be ugly.
    • "from the article and if the ps3 cell cpu is even half the processor than this monster is i say that game companies will need a lot of real programmers to make real good games (as if they cared)."

      I'm not so worried about the programming aspect of it. Yeah, it'll cost more, and in the beginning it'll be nerve-wracking to get adjusted, but I expect that in the end it'll be the least of their concerns. Not only will the techniques be laid down, but I imagine there'll be a lot of engine licensing going on. I
    • Re:ps3 programming (Score:5, Insightful)

      by iota ( 527 ) * on Sunday November 27, 2005 @02:50AM (#14122770) Homepage
      from the article and if the ps3 cell cpu is even half the processor that this monster is i say that game companies will need a lot of real programmers to make real good games (as if they cared).

      1. Some of us do care, actually.
      2. The Cell processor described is exactly the processor in the PS3.
      3. Yes, regardless of what some would like to believe, there is no magic. It's different, but it's the way things are going, so some of us are adapting the way we develop. It'll take work, and maybe a little time, but that's always been our job - we get hardware and we figure out how to do something cool with it.
      4. It is actually really fun to work on and very impressive.
      • 2. The Cell processor described is exactly the processor in the PS3.


        Not quite. I heard a talk from a Sony representative and she said that the PS3's cpu has eight SPEs, but only seven of them are enabled. This is to increase yields.

        • True. But that may well be just software. Perhaps at power-on it loads a test program into each of the SPEs and chooses 7 that complete without error. Or perhaps it's done as part of a soak test at the factory, with the identity of the SPE that's not to be used stored in non-volatile memory on the PS3.
    • The article suggests to me that game companies will only need average programmers to make real good games, but they will need some absolute geniuses if they want to optimise them.

      Even if you forget the SPEs entirely, and just write for one of the two threads the PPC offers, then you'll have a lot more horsepower than the PS2 to throw around. I expect we'll see some pretty impressive early games that have the SPEs doing either nothing, or fairly minimal things like generating music, or doing little bits of a
      • The article suggests to me that game companies will only need average programmers to make real good games, but they will need some absolute geniuses if they want to optimise them.

        I would suggest that what game companies need to produce good games are good, creative game designers.

        Unless, of course, you are hacking together Virtua Strip Poker, in which case you have my full blessing to spend 95% of the budget on pretty graphics ;-)
  • 20 core die (Score:3, Funny)

    by Anonymous Coward on Sunday November 27, 2005 @01:56AM (#14122633)
    Amazing progress. So with 20 cores on a single die, we can play D&D in real time?

    It's Saturday night and I'm all alone here, cut me some slack...

  • yea but (Score:2, Funny)

    by rrosales ( 847375 )
    can it do infinite loops in 5 seconds?
  • Remind anyone... (Score:3, Insightful)

    by Kadin2048 ( 468275 ) <slashdot@kadin.xoxy@net> on Sunday November 27, 2005 @02:35AM (#14122744) Homepage Journal
    ... of the promotional material for the Sega Saturn from a few years back?

    I remember right about the time it came out, there was a lot of hype about its architecture. Two main processors and a bunch of dedicated co-processors, fast memory bus, etc., etc. I don't remember any more specifics, but at the time it seemed very impressive. Of course it flopped spectacularly, because apparently the thing was a huge pain in the ass to program for and the games never materialized. Or at least that's the most commonly cited reason I've heard.

    Anyway, and I'm sure I'm not the first person to have realized this, Cell is starting to sound the same way. The technical side is being hyped and seems clearly leaps and bounds ahead of the competition, but one has to wonder what MS is doing to prevent themselves from producing another Saturn on the programming side.
    • MS?

      Uhm. MS is the one with the 3-core PPC. This is a 1-core (dual-threaded) PPC with 8 coprocessors.

      And I want one.
    • At the time there was also the issue of the PS1 being more powerful, in addition to doing 3D, while the Saturn basically had 3D support added in response - and it wasn't very good.

      This same "difficult to program" issue came up with the PS2 but seems to not have had much of an impact overall on sales. :)
      • Well, that would mean the first PS3 games that are any good are at least 2 years away.

        Only recently (the past 1-2 years) have PS2 devs begun to really exploit the aging hardware. And given the PS2's teeny RAM, it really takes skill.
          • All consoles suffer from the problem of games only really exploiting the system's full power after a couple years of developer practice. Really, unless it's based entirely on an existing known system, you have to expect that.

            Underestimating RAM needs is a common problem too. At least since BG's "640K is enough for anyone."
    • they gave up... (Score:5, Interesting)

      by YesIAmAScript ( 886271 ) on Sunday November 27, 2005 @03:00AM (#14122795)
      Both Sony and MS realized they couldn't make a single true general-purpose CPU with the performance they wanted for a price they could afford to sell in their consoles.

      Sony went to a CPU, GPU and 7 co-processors (Cell).
      MS went to 3 CPUs with vector assist and a GPU.

      Both companies are going to need to spend a lot of time and money on developer tools to help their developers more easily take advantage of their oddball hardware, or else they will end up right where Saturn did.

      I guess the good news for both companies is that there is no alternative (like PS1 was to Saturn) which is straightforward and thus more attractive.

      PS2 requires programming a specialized CPU with localized memory (the Emotion Engine) and it seems to get by okay. So developers can adapt, given sufficient financial advantage to doing so.
      • No alternative? The Nintendo codename-Revolution will be comparatively "under"-powered, but will definitely be a simpler machine to code for and have novel (not novelty!) controller hardware that will afford the kind of possibilities Sony and Microsoft's idea of "next generation" don't offer. Just pushing more polygons isn't where it is at. There's been no growth in size of the gaming market since the SNES era, just more spending by those who do game. Nintendo's next generation model is at least looking to
        • Although Nintendo isn't even talking about the hardware specs, so we can't be sure.

          But I didn't include the Revolution because Nintendo is saying the same thing they did with the Gamecube, that they don't need 3rd party developers. Revolution seems largely like a platform for Nintendo to sell you their older games again. Additionally, if Revolution is sufficiently underpowered compared to the other two, it may be that 3rd parties just plain cannot port their games to this platform, or else have to "dumb dow
      • I guess the good news for both companies is that there is no alternative (like PS1 was to Saturn) which is straightforward and thus more attractive.

        It's called the Revolution.

    • by Sycraft-fu ( 314770 ) on Sunday November 27, 2005 @04:11AM (#14122956)
      Well, not quite. The odd processors were a problem for the Saturn, but not the major one. The really major problem was that it wasn't good at 3D. The Saturn was basically designed to be the ultimate 2D console, which it was. However 3D was kinda hacked on later and thus was hard to do and didn't look as good as the competition. This was at a time when 3D was new and flashy, and thus an all-important selling point.

      However, you are correct in that having a system with a different development model could be problematic. Game programmers (and I suppose all programmers) can be fairly insular. Many are already whining about the multi-core movement. They like writing single-threaded code, a big while loop in essence, since that's the way it's always been done. However, the limitations of technology are forcing new thinking. Fortunately, doing multi-threaded code shouldn't require a major reworking of the way things are done, especially with good development tools.

      Well, the Cell is something else again. It's multi-core to the extreme in one manner of thinking, but not quite, because the SPEs aren't full, independent processor cores. So programming it efficiently isn't just a matter of having 8 or 9 or however many cores' worth of tasks for it.

      Ultimately, I think the bottom line will come down to the development tools. Game programmers aren't likely to be hacking much assembly code. So if the compiler knows how to optimise their code for the cell, it should be pretty quick. If it doesn't and requires a very different method of coding, it may lead to it being under utilised.

      Now it may not be all that important. Remember this isn't like the PS2: the processor isn't being relied on for graphics transformations, the graphics chip will handle all that. So even if the processor is underutilised and thus on the slow side, visually stunning games should still be possible.

      However, it is a risk, and a rather interesting one. I'm not against new methods of doing things, but it seems that for a first run of an architecture, you'd want it in dev and research systems. Once it's been proven and the tools are more robust, then maybe you look at the consumer market. Driving the first generation out in a mass consumer device seems risky, especially given that the X-box has lead time and thus its development model is already being learned.
      • Re:Remind anyone... (Score:3, Interesting)

        by TheRaven64 ( 641858 )
        Many are already whining about the multi-core movement. They like writing single-threaded code, a big while loop in essence, since that's the way it's always been done.

        Meanwhile, those of us who have been advocating building large numbers of loosely coupled, message-passing components, each running in its own process space, have enormous grins on our faces at the thought of being able to do the message passing via a shared cache with only a cycle or two of penalty...

        • I imagine the multi-core design will improve AI code, weather simulation, physics simulation, etc. Anything that needs lots of small concurrent calculations.

          It seems that the cell design, once utilized, should make for games that feel better even if they look the same. Maybe the difference won't show in a screenshot but you'll be able to tell it's there when you play the game.

          I'm interested in the rumor that multiple Cell processors will be able to work together even over the PS3's built-in networking. That
        • Meanwhile, those of us who have been advocating building large numbers of loosely coupled, message-passing components, each running in its own process space, have enormous grins on our faces at the thought of being able to do the message passing via a shared cache with only a cycle or two of penalty...

          Yeah, but are you used to doing that in a DSP-like architecture, where most of your operations are going to be SIMD ones lest you hit massive branch penalties?
          • Interesting question. It probably could be done. Split the task into segments that can be done without conditional branching. Put each of these in a SPU task. Now your branching is not in the execution phase, it's in the part deciding where you send your results. Assuming that you have an OS that can multiplex the SPUs, this could work quite efficiently.
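
            The standard trick for moving branches out of the execution phase is to compute both candidates and merge them through a mask - the SPU ISA even includes a bitwise select instruction for exactly this pattern. A plain C sketch of the idea (mine, not the parent poster's):

                #include <stdint.h>

                /* Branchy version: a mispredicted branch stalls a deep pipeline. */
                int32_t max_branchy(int32_t a, int32_t b) {
                    if (a > b) return a;
                    return b;
                }

                /* Branch-free version: build an all-ones/all-zeros mask from
                 * the comparison, then select between the two candidates. */
                int32_t max_select(int32_t a, int32_t b) {
                    int32_t mask = -(int32_t)(a > b);  /* all ones if a > b, else 0 */
                    return (a & mask) | (b & ~mask);
                }
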
  • by Animats ( 122034 ) on Sunday November 27, 2005 @02:49AM (#14122769) Homepage
    The nCube, in the 1980s, was much like this. 64 to 1024 processors, each with 128KB and a link to neighboring processors, plus an underpowered control machine (an Intel 286, surprisingly.)

    The Cell machines are about equally painful to program, but because they're cheaper, they have more potential applications than the nCube did. Cell phone sites, multichannel audio and video processing, and similar easily-parallelized stream-type tasks fit well with the cell model. It's not yet clear what else does.

    Recognize that the cell architecture is inherently less useful than a shared-memory multiprocessor. It's an attempt to get some reasonable fraction of the performance of an N-way shared memory multiprocessor without the expensive caches and interconnects needed to make that work. It's not yet clear if this is a price/performance win for general purpose computing. Historically, architectures like this have been more trouble than they're worth. But if Sony fields a few hundred million of them, putting up with the pain is cost-justified.

    It's still not clear if the cell approach does much for graphics. The PS3 is apparently going to have a relatively conventional nVidia part bolted on to do the back end of the graphics pipeline.

    I'm glad that I don't have to write a distributed physics engine for this thing.

    • I'm glad that I don't have to write a distributed physics engine for this thing.

      Granted, it's harder to program on a multi-processor. But it's not that much harder, more just fear of the unknown.

      Programmers are already multiprocessing bigtime to handle multiple IO devices and to watch the wall clock time (independent of the processing time) and it's a rare real world programming problem that can't be easily partitioned, usually geometrically. In the case of the physics engine I'd initially just put th

    • IANAGP (game programmer), but it would seem to me that physics and lighting calculations should be easily parallelizable. Each processor can compute the physics for a separate set of objects / pixels / etc. Same for AI for each agent, if the companies actually bothered to put some effort into gameplay over graphics. On the other hand, I would guess that things like fluids (i.e. Far Cry) would be more difficult to do in parallel, due to the less local nature of the interactions.
    • I disagree that the cell architecture is "inherently less useful than a shared-memory multiprocessor".

      Shared memory is the cause of 80% of the nasty little race conditions programmers leave peppered through their code on parallel machines - it's just too easy to break discipline, particularly considering the crap programming-language support we have. C and C++ are just not up to the task because of their assumption that you may touch anything in the address space.

      Cell-like architectures have on
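
      For anyone who hasn't been bitten yet, here is the canonical demonstration of the kind of shared-memory race being described (my example, in C with pthreads):

          #include <pthread.h>
          #include <stdio.h>

          static long counter;             /* shared and unprotected: the bug */

          static void *bump(void *arg) {
              (void)arg;
              for (int i = 0; i < 1000000; i++)
                  counter++;               /* load, add, store: threads interleave */
              return NULL;
          }

          int main(void) {
              pthread_t a, b;
              pthread_create(&a, NULL, bump, NULL);
              pthread_create(&b, NULL, bump, NULL);
              pthread_join(a, NULL);
              pthread_join(b, NULL);
              /* Almost never prints 2000000 on a real multiprocessor. */
              printf("%ld\n", counter);
              return 0;
          }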

      • I wonder if they aren't considering inventing a new language, or modifying an existing language, that has constructs to make this kind of programming easier. Something that would let you create tasks that look like separate mini-programs but can communicate easily using some common concept such as pipes. There is really no reason that programmers need to write everything in C and C++.

        A Python-like language that had extra semantics for the multi-processor design and that compiled down to machine cod
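
        Unix has long had a low-tech version of those semantics; here is a sketch of two "mini-programs" talking over a pipe, in plain C standing in for the imagined Python-like language:

            #include <stdio.h>
            #include <unistd.h>
            #include <sys/wait.h>

            int main(void) {
                int fd[2];
                pipe(fd);                   /* fd[0] = read end, fd[1] = write end */

                if (fork() == 0) {          /* child: the "producer" task */
                    close(fd[0]);
                    for (int i = 0; i < 4; i++)
                        write(fd[1], &i, sizeof i);
                    close(fd[1]);
                    _exit(0);
                }

                close(fd[1]);               /* parent: the "consumer" task */
                int v;
                while (read(fd[0], &v, sizeof v) == (ssize_t)sizeof v)
                    printf("got %d\n", v);
                close(fd[0]);
                wait(NULL);
                return 0;
            }
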
  • The problem for much of the IT industry will be that those making the decisions ask only one question:

    Does it Run Windows?

    If the answer is no, then the manager will say something like:

    "I don't care if the processor is the most powerful ever developed, costs next to nothing to produce and will allow us to build a powerful computer the size of of pea. If it doesn't run Windows, then I'm not interested".

    And that sums up the total IT knowledge of that manager.
  • Mod me down if you wish but I think the CBE architecture is bound to fail. The reason is that you don't design your software model around a new processor. It should be the other way around. You first come up with a software model and then design a processor optimized for the new model. This way you are guaranteed to have a perfect fit. Otherwise, you're asking for trouble.

    The primary reason that anybody would want to devise a new software model is to address the single most pressing problem in the computer
    • Re:CBE = Failure (Score:5, Insightful)

      by plalonde2 ( 527372 ) on Sunday November 27, 2005 @11:00AM (#14123916)
      You're right - you don't design around a new processor.

      But you should design around the changes in architecture that have been coming at us for the last 5-10 years: the bus is the bottleneck, and the Cell makes this explicit. It goes so far as to deal with the clock-rate limits we've reached by taking the basic "bus is the limit" and exposing it in a way that lets you stack a bunch of processors without excessive interconnect hardware (and associated heat) into a more power-efficient chip.

      I've been working on Cell for nearly a year now, and it's been really nice being forced to pay attention to the techniques that will be required to get performance on all multi-core machines, which in essence means all new processors coming out. Our bus/clockrate/heat problems are all inter-dependent, and Cell is the first project I've seen that gets serious about letting programmers know they need to change to adapt to this new space.

  • if every ps3 was networked and sony rented out your redundant core to the DoD, how fast would the world's most powerful supercomputer be?
    • Since most of the inter-processor "interconnects" would be consumer-grade DSL/cable links, it'd have phenomenal capacity to process chunks of data but serious latency issues in distributing work units. Commercial cluster data-processing setups probably use gigabit Ethernet or faster connections to get around this.
      • i was fortunate enough to see first-hand a prototype of a broadband LAN link (i forget what it's called) that at the time was capable of 800Mb/sec data transmissions with high efficiency when 1Gb links were still a year or so away. it was a chained network that avoided the problem of collisions by providing a direct connection from one computer to any other. the cable was about 0.5m long, but was a 24-line ribbon cable, and claimed to provide "100% efficiency" for over 200 computers. you could tell it wa
      • Still, plenty of problems exist that can be attacked that don't need to send a lot of data back and forth quickly.
  • by RESPAWN ( 153636 )

    I haven't really done much programming since college and none of those programs have been multithreaded, so maybe I don't have the right background to comment. But, all I can say is wow. This is crazy compared to the Sparc processors that I learned assembly on. As somebody pointed out, not only do these processors have multiple cores, but apparently each one has 128 registers?! Processor design has come a long way.

    That said, I see a lot of comments reflecting on how hard it will be for programmers to ad

    • The compiler can't do a lot of the optimization this kind of processor needs. It's not a matter of instruction scheduling or selection, it's a matter of algorithm selection. To make use of the CBE you need to design your program to combine massive amounts of threading with massively parallel data processing within those threads. You're trying to get it so there's enough happening simultaneously to keep all those cores fed with instruction streams. The compiler can't rewrite your program to do that, the prog
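
      The flip side is that once the algorithm allows it, the part the programmer must do - carving the data into independent chunks so every core stays fed - is mechanically simple. A bare-bones sketch of that structure in portable C with pthreads (my illustration; real Cell code would drive the SPEs through IBM's SDK instead):

          #include <pthread.h>

          #define N        1000000
          #define WORKERS  8              /* one per SPE, say */

          static float data[N];

          struct slice { int lo, hi; };

          /* Each worker owns a disjoint slice, so no locking is needed. */
          static void *process(void *arg) {
              struct slice *s = arg;
              for (int i = s->lo; i < s->hi; i++)
                  data[i] *= 2.0f;        /* stand-in for the real per-element work */
              return NULL;
          }

          int main(void) {
              pthread_t tid[WORKERS];
              struct slice slices[WORKERS];
              for (int w = 0; w < WORKERS; w++) {
                  slices[w].lo = w * (N / WORKERS);
                  slices[w].hi = (w + 1) * (N / WORKERS);
                  pthread_create(&tid[w], NULL, process, &slices[w]);
              }
              for (int w = 0; w < WORKERS; w++)
                  pthread_join(tid[w], NULL);
              return 0;
          }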

    • Imagine(TM) a cluster of these processor boards (PCI, PCI-X, SLI) with a compiler and IDE able to make use of them all. The Transputer computing engine has (AFAIK) all but died off, but this could conceivably be the 21st century's replacement.

  • "programming for the CBE is like programming for no processor you've ever met before"

    Which is exactly why it will never take off.
  • cellular broadband coverage is spotty at best in my area, and the damned providers charge too much per minute for the airtime. :)
