Unleashing the Power of the Cell Broadband Engine
An anonymous reader writes "IBM DeveloperWorks is running a paper from the MPR Fall Processor Forum 2005 that explores programming models for the Cell Broadband Engine (CBE) Processor, from the simple to the progressively more advanced. With nine cores on a single die, programming for the CBE is like programming for no processor you've ever met before."
Cell architecture (Score:3, Informative)
New Me (Score:1)
Re:New Me (Score:5, Funny)
Bingo, sir.
Re:New Me (Score:1)
Mindstorms! (Score:1)
Re:New Me (Score:2)
Isn't this basically the dataflow [wikipedia.org] paradigm ?
I think it should be possible to make any programming language automatically spread its work across multiple processors simply by analyzing which operations depend on which to generate "wa
Re:New Me (Score:1)
Your programming example would remove the need for mutexes, but that's about all. Spawning a thread, while faster than an entire process, has cost.
In your example, the context of the current thread would have to be duplicated (how does the compiler know what data operation() will use?). They can't run from the same data - what if doSomeThing() modifies a field that operation() uses?
Java's FutureTask requires extra work, but it protects you from concurrent data modification errors. (Or, it appears to. I'
Re:New Me (Score:2)
No, you'd still need mutexes. My example is simply a clean and simple way of expressing "this task can be run on a separate worker thread".
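A minimal sketch of what this sub-thread is describing, in Python rather than the Java mentioned above: hand a task to a worker thread and get back a future, roughly the service FutureTask provides. The names `operation` and `do_something` echo the parent's example, but their bodies are made up for illustration. Passing an immutable snapshot sidesteps the shared-data problem here only because neither task writes to it; tasks that mutate shared state would still need locking.

```python
from concurrent.futures import ThreadPoolExecutor


def operation(values):
    # Hypothetical stand-in for the expensive call that can run elsewhere.
    return sum(v * v for v in values)


def do_something(values):
    # Hypothetical work the calling thread does in the meantime.
    return max(values)


def run(values):
    # Immutable copy, so the two tasks cannot race on the input.
    snapshot = tuple(values)
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(operation, snapshot)  # scheduled on a worker thread
        local = do_something(snapshot)             # runs on the calling thread
        return future.result(), local              # result() blocks until done


if __name__ == "__main__":
    print(run([1, 2, 3]))
```

The cost the grandparent mentions is still there: submitting the task and copying the input are not free, so this only pays off when `operation` is substantially more expensive than the hand-off.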
Re:New Me (Score:2)
I'll use an IDE when they invent one that is as fast, flexible, and chrome-free as a normal bash command prompt. I have hated every IDE I've tried due to slowness and their insistence on making an ugly, bloated interface. I don't need the IDE to try to guess what I'm typing, to offer code debugging while I'm typing, to FTP my files for me, etc.
Basic project management, checking my code for errors (when I ask.. not constantly), good search and replace (with regular expressions and acr
Re:New Me (Score:2)
I want to keep
Re:New Me (Score:2)
Re:New Me (Score:2)
Re:New Me (Score:2)
Has nothing to do with Broadband (Score:2)
Re:Has nothing to do with Broadband (Score:1, Informative)
Re:Has nothing to do with Broadband (Score:3, Informative)
Oh yeah. If you read their web page [ibm.com] they also mention the Cell processor will be able to handle broadband rich media applications and streaming content:
The first-generation Cell Broadband Engine (BE) processor is a multi-core chip comprised of a 64-bit Power Architecture pr
Re:Has nothing to do with Broadband (Score:5, Insightful)
"The Pentium III will make the Internet a much more consumer-friendly environment," says Jami Dover, Intel's marketing vice president. Surfing today, Dover maintains, is a limited experience because data-transfer rates over ordinary telephone lines do not allow for high-quality audio, video and 3D graphics. "You take people raised on TV and show them a flat, text [Web] page," says Dover. "It's quite a juxtaposition." [asiaweek.com] I guess Intel was hoping the world could go through a phone line with enough compression.
To us this is a nitpick, to the general public this is more confusion in a jargon filled marketplace.
Re:Has nothing to do with Broadband (Score:2)
Re:Has nothing to do with Broadband (Score:1)
thanks for making me smirk
Re:Has nothing to do with Broadband (Score:2)
Wow ... (Score:5, Interesting)
So yes, I want a Cell-based devkit now, 'cuz this sounds like _fun_ :-)
Regards,
John
Re:Wow ... (Score:1)
I wonder how well an uber-multicore 6502 at appropriately modern clockspeeds would fare today...
A couple of hundred registers... (Score:2)
Re:Wow ... (Score:2)
Kids these days...
Re:Wow ... (Score:1)
Re:Wow ... (Score:2)
I remember the 1802, too (Score:2)
Another horrible early processor was the TMS9900, which pretended to have 16 16-bit registers but they were just mapped
Re:Wow ... (Score:2)
SPE registers = 8 SPEs * 128 registers * 128 bits
PPE registers = 2 threads * (32 integer + 32 floating point) registers * 64 bits
PPE VMX-128 registers = 2 threads * 128 registers * 128 bits
Grand total = 21+ Kbytes of registers, not to mention the 2 Mbytes of SRAM on the SPEs.
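The arithmetic above can be checked with a few lines of Python. This is only a sketch that takes the register counts exactly as quoted in this comment (it does not verify them against the hardware documentation) and assumes 256 KB of local store per SPE, the figure mentioned elsewhere in this thread.

```python
BITS_PER_BYTE = 8


def register_bytes(units, registers, bits_each):
    # Total register-file size in bytes for `units` copies of a register file.
    return units * registers * bits_each // BITS_PER_BYTE


# Figures as quoted in the comment above.
spe = register_bytes(8, 128, 128)     # 8 SPEs, 128 registers of 128 bits
ppe = register_bytes(2, 32 + 32, 64)  # 2 threads, 32 int + 32 FP registers of 64 bits
vmx = register_bytes(2, 128, 128)     # 2 threads, 128 vector registers of 128 bits

total = spe + ppe + vmx

# Assumption: 256 KB of local store on each of the 8 SPEs.
local_store = 8 * 256 * 1024

if __name__ == "__main__":
    print(f"SPE {spe} B, PPE {ppe} B, VMX {vmx} B, total {total / 1024} KB")
    print(f"SPE local store: {local_store / (1024 * 1024)} MB")
```

With those inputs the three terms come to 16 KB, 1 KB and 4 KB, which is where the 21 Kbytes figure comes from.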
This thing is going to be a lot of fun!
ps3 programming (Score:3, Insightful)
Re:ps3 programming...no, not really (Score:3, Insightful)
It's when you take old code from previous things and then try to do a direct port that you will see performance hits. But if designed from the ground up in terms of the code
Re:ps3 programming (Score:3, Interesting)
Most of the bitching we've heard from developers so far hasn't been that the Cell sucks but that their existing codebases don't take advantage of its design and they don't want to make a rewrite that locks them into the platform.
As with every platform the reall
Re:ps3 programming (Score:3, Insightful)
Certainly people are working on that very idea. However, it's a long way off and not likely to happen in the lifetime of this version of the processor. Both XLC (IBM's optimizing compiler) and GCC have a very difficult time vectorizing (i.e. taking advantage of
Re:ps3 programming (Score:3, Interesting)
HLL like Java & Smalltalk have two faults (Score:2)
When you fork N processes on N objects and you have N-M processors, it costs you computationally, which translates into efficiency.
It's one thing to think of this situation as a bunch (N) of ball-bearings going through a bunch of holes (N-M) with each ball-bearing having its state information local to it. (Any kind of concept of a sieve can serve as a 'gedanken' experiment.)
The situation becomes ho
Re:HLL like Java & Smalltalk have two faults (Score:1)
Re:HLL like Java & Smalltalk have two faults (Score:3, Informative)
Re:ps3 programming (Score:1)
Re:ps3 programming (Score:2)
I'm not so worried about the programming aspect of it. Yeah, it'll cost more, and in the beginning it'll be nerve wracking to get adjusted, but I expect that in the end it'll be the least of their concerns. Not only will the techniques be laid down, but I imagine there'll be a lot of engine licensing going on. I
Re:ps3 programming (Score:5, Insightful)
1. Some of us do care, actually.
2. The Cell processor described is exactly the processor in the PS3.
3. Yes, regardless of what some would like to believe, there is no magic. It's different, but it's the way things are going, so some of us are adapting the way we develop. It'll take work, and maybe a little time, but that's always been our job - we get hardware and we figure out how to do something cool with it.
4. It is actually really fun to work on and very impressive.
Re:ps3 programming (Score:2)
Not quite. I heard a talk from a Sony representative and she said that the PS3's cpu has eight SPEs, but only seven of them are enabled. This is to increase yields.
Re:ps3 programming (Score:2)
Re:ps3 programming (Score:2)
Re:ps3 programming (Score:2)
Even if you forget the SPEs entirely, and just write for one of the two threads the PPC offers, then you'll have a lot more horsepower than the PS2 to throw around. I expect we'll see some pretty impressive early games that have the SPEs doing either nothing, or fairly minimal things like generating music, or doing little bits of a
Re:ps3 programming (Score:1)
I would suggest that what game companies need to produce good games are good, creative game designers.
Unless, of course, you are hacking together Virtua Strip Poker, in which case you have my full blessing to spend 95% of the budget on pretty graphics
Re:ps3 programming (Score:1)
Obligatory real programmer note: real programmers program only in FORTRAN, assembly language, or machine code.
Or in situations where use of lowest-level languages is deprecated, such as when one set of program logic has to work on a PC-class machine and a handheld device[1], they program in C and make sure to look at the generated assembly language code (gcc -O3 -S) to make sure that the compiler is in fact doing its job.
[1] This situation is common in multiplatform video game development.
Re:ps3 programming (Score:1)
20 core die (Score:3, Funny)
It's Saturday night and I'm all alone here, cut me some slack...
Faster Than Realtime - Just port Nethack (Score:2)
You can run a 68000 or 80386 emulator in each of the SPUs, or just run lots of native processes in parallel.
Re:Faster Than Realtime - Just port Nethack (Score:2)
Not to mention having no cache at all will be SO great in such a non-streaming application (and no, those 256K of RAM don't count)
Re:20 core die (Score:1)
I think that the D&D joke gave that away...
yea but (Score:2, Funny)
Re:yea but (Score:1)
Re:yea but (Score:2)
(Less stellar drift/varying relative velocities, etc, but they shouldn't count for much on the scale of parsecs, unless "a long time ago" is of the order of millions of years).
Remind anyone... (Score:3, Insightful)
I remember right about the time it came out, there was a lot of hype about its architecture. Two main processors and a bunch of dedicated co-processors, fast memory bus, etc., etc. I don't remember any more specifics, but at the time it seemed very impressive. Of course it flopped spectacularly, because apparently the thing was a huge pain in the ass to program for and the games never materialized. Or at least that's the most often spoken reason that I've heard.
Anyway, and I'm sure I'm not the first person to have realized this, Cell is starting to sound the same way. The technical side is being hyped and seems clearly leaps and bounds ahead of the competition, but one has to wonder what MS is doing to prevent themselves from producing another Saturn on the programming side.
Re:Remind anyone... (Score:2)
Uhm. MS is the one with the 3-core PPC. This is a 1-core (dual-threaded) PPC with 8 coprocessors.
And I want one.
Re:Remind anyone... (Score:2)
This same "difficult to program" issue came up with the PS2 but seems to not have had much of an impact overall on sales.
Re:Remind anyone... (Score:2)
Only recently (past 1-2 years) PS2 devs have begun to really exploit the aging hardware. And due to teeny RAM of PS2, it really takes skill.
Re:Remind anyone... (Score:2)
Underestimating RAM needs is a common problem too. At least since BG's 640K is enough for anyone.
they gave up... (Score:5, Interesting)
Sony went to a CPU, GPU and 7 co-processors (Cell).
MS went to 3 CPUs with vector-assist and a GPU.
Both companies are going to need to spend a lot of time and money on developer tools to help their developers more easily take advantage of their oddball hardware, or else they will end up right where Saturn did.
I guess the good news for both companies is that there is no alternative (like PS1 was to Saturn) which is straightforward and thus more attractive.
PS2 requires programming a specialized CPU with localized memory (the Emotion Engine) and it seems to get by okay. So developers can adapt, given sufficient financial advantage to doing so.
Re:they gave up... (Score:2, Interesting)
you're probably right (Score:3, Interesting)
But I didn't include the Revolution because Nintendo is saying the same thing they did with the Gamecube, that they don't need 3rd party developers. Revolution seems largely like a platform for Nintendo to sell you their older games again. Additionally, if Revolution is sufficiently underpowered compared to the other two, it may be that 3rd parties just plain cannot port their games to this platform, or else have to "dumb dow
Revolution (Score:1)
It's called the Revolution.
Re:Remind anyone... (Score:4, Insightful)
However you are correct in that having a system with a different development model could be problematic. Game programmers (and I suppose all programmers) can be fairly insular. Many are already whining about the multi-core movement. They like writing single-thread code, a big while loop in essence, since that's the way it's always been done. However the limitations of technology are forcing new thinking. Fortunately, doing multi-threaded code shouldn't require a major reworking of the way things are done, especially with good development tools.
Well, the Cell is something else again. It's multi-core to the extreme in one manner of thinking, but not quite, because the Cells aren't full, independent processor cores. So programming it efficiently isn't just having 8 or 9 or however many cores worth of tasks for it.
Ultimately, I think the bottom line will come down to the development tools. Game programmers aren't likely to be hacking much assembly code. So if the compiler knows how to optimise their code for the Cell, it should be pretty quick. If it doesn't and requires a very different method of coding, it may lead to it being underutilised.
Now it may not be all that important. Remember this isn't like the PS2, the processor isn't being relied on for graphics transformations, the graphics chip will handle all that. So even if the processor is underutilised and thus on the slow side, visually stunning games should still be possible.
However it is a risk, and a rather interesting one. I'm not against new methods of doing things, but it seems for a first run of an architecture, you'd want it in dev and research systems. Once it's been proven and the tools are more robust, then maybe you look at the consumer market. Driving the first generation out in a mass consumer device seems risky, especially given that the X-box has lead time and thus its development model is already being learned.
Re:Remind anyone... (Score:3, Interesting)
Meanwhile, those of us who have been advocating building large numbers of loosely coupled, message passing components, all running with their own process space, have enormous grins on our faces at the thought of being able to do the message passing via a shared cache with only a cycle or two penalty...
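For anyone who hasn't written in that style, here is a rough sketch of a share-nothing component that talks to the rest of the program only through messages. It uses Python threads and `queue.Queue` as a stand-in for separate process spaces; nothing in it is Cell-specific, and the `component` function and its doubling "work" are invented for the example.

```python
import queue
import threading

STOP = object()  # sentinel message telling a component to shut down


def component(inbox, outbox):
    # Owns no shared state: it only reads messages and sends replies.
    while True:
        msg = inbox.get()
        if msg is STOP:
            break
        outbox.put(msg * 2)  # placeholder for real work


def run_pipeline(items):
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=component, args=(inbox, outbox))
    worker.start()
    for item in items:
        inbox.put(item)
    inbox.put(STOP)
    worker.join()
    # One worker and FIFO queues, so replies come back in submission order.
    return [outbox.get() for _ in items]


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

Because the component never touches anything outside its inbox and outbox, there is nothing to lock, and that property is what makes the style map naturally onto cores with private memory.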
Re:Remind anyone... (Score:2)
It seems that the cell design, once utilized, should make for games that feel better even if they look the same. Maybe the difference won't show in a screenshot but you'll be able to tell it's there when you play the game.
I'm interested in the rumor that multiple Cell processors will be able to work together even over the PS3's built-in networking. That
Re:Remind anyone... (Score:2)
Yeah, but are you used to doing that in a DSP-like architecture, where most of your operations are going to be SIMD ones lest you hit massive branch penalties?
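One way to picture the "SIMD instead of branches" point: compute both candidate results and pick one with a bit mask, so the data never steers control flow. The sketch below does this on plain Python integers purely as an illustration; real SPE code would apply the same idea across vector lanes with a select instruction, and the function names here are invented.

```python
def select(mask, a, b):
    # mask is -1 (all bits set) or 0: returns a where the mask is set, else b.
    return (a & mask) | (b & ~mask)


def branchless_max(a, b):
    # The comparison yields a mask instead of choosing a code path.
    mask = -int(a > b)  # -1 if a > b, otherwise 0
    return select(mask, a, b)


def clamp_all(values, limit):
    # The same select applied to every element, as each SIMD lane would do.
    return [select(-int(v > limit), limit, v) for v in values]


if __name__ == "__main__":
    print(branchless_max(3, 7))
    print(clamp_all([1, 5, 10], 4))
```

The trade-off is that both alternatives are always evaluated, which is cheap for arithmetic but wasteful when one side is expensive.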
Re:Remind anyone... (Score:2)
Reminds me of programming the nCube (Score:4, Interesting)
The Cell machines are about equally painful to program, but because they're cheaper, they have more potential applications than the nCube did. Cell phone sites, multichannel audio and video processing, and similar easily-parallelized stream-type tasks fit well with the cell model. It's not yet clear what else does.
Recognize that the cell architecture is inherently less useful than a shared-memory multiprocessor. It's an attempt to get some reasonable fraction of the performance of an N-way shared memory multiprocessor without the expensive caches and interconnects needed to make that work. It's not yet clear if this is a price/performance win for general purpose computing. Historically, architectures like this have been more trouble than they're worth. But if Sony fields a few hundred million of them, putting up with the pain is cost-justified.
It's still not clear if the cell approach does much for graphics. The PS3 is apparently going to have a relatively conventional nVidia part bolted on to do the back end of the graphics pipeline.
I'm glad that I don't have to write a distributed physics engine for this thing.
Re:Reminds me of programming the nCube (Score:2)
I'm glad that I don't have to write a distributed physics engine for this thing.
Granted, it's harder to program on a multi-processor. But it's not that much harder, more just fear of the unknown.
Programmers are already multiprocessing bigtime to handle multiple IO devices and to watch the wall clock time (independent of the processing time) and it's a rare real world programming problem that can't be easily partitioned, usually geometrically. In the case of the physics engine I'd initially just put th
Reply to: Reminds me of programming the nCube (Score:1)
Re:Reminds me of programming the nCube (Score:3, Interesting)
Shared memory is the cause of 80% of the nasty little race conditions programmers leave peppered through their code on parallel machines - it's just too easy to break discipline, particularly considering the crap programming language support we have - C and C++ are just not up to the task because of their assumption that you may touch anything in the address space.
Cell-like architectures have on
Re:Reminds me of programming the nCube (Score:2)
A Python-like language that had extra semantics for the multi-processor design and that compiled down to machine cod
Does it run Windows? (Score:1, Offtopic)
Does it Run Windows?
If the answer is no then the manager will say something like:
"I don't care if the processor is the most powerful ever developed, costs next to nothing to produce and will allow us to build a powerful computer the size of a pea. If it doesn't run Windows, then I'm not interested".
And that sums up the total IT knowledge of that manager.
CBE = Failure (Score:2, Troll)
The primary reason that anybody would want to devise a new software model is to address the single most pressing problem in the computer
Re:CBE = Failure (Score:5, Insightful)
But you should design around the changes in architecture that have been coming at us for the last 5-10 years: the bus is the bottleneck, and the Cell makes this explicit. It goes so far as to deal with the clock-rate limits we've reached by taking the basic "bus is the limit" and exposing it in a way that lets you stack a bunch of processors without excessive interconnect hardware (and associated heat) into a more power-efficient chip.
I've been working on Cell for nearly a year now, and it's been really nice being forced to pay attention to the techniques that will be required to get performance on all multi-core machines, which in essence means all new processors coming out. Our bus/clockrate/heat problems are all inter-dependent, and Cell is the first project I've seen that gets serious about letting programmers know they need to change to adapt to this new space.
rent out your ps3s redundant cell core? (Score:1)
interconnect restrictions (Score:3)
Re:interconnect restrictions (Score:2)
Re:interconnect restrictions (Score:2)
Wow (Score:2)
I haven't really done much programming since college and none of those programs have been multithreaded, so maybe I don't have the right background to comment. But, all I can say is wow. This is crazy compared to the Sparc processors that I learned assembly on. As somebody pointed out, not only do these processors have multiple cores, but apparently each one has 128 registers?! Processor design has come a long way.
That said, I see a lot of comments reflecting on how hard it will be for programmers to ad
Re:Wow (Score:2)
The compiler can't do a lot of the optimization this kind of processor needs. It's not a matter of instruction scheduling or selection, it's a matter of algorithm selection. To make use of the CBE you need to design your program to combine massive amounts of threading with massively parallel data processing within those threads. You're trying to get it so there's enough happening simultaneously to keep all those cores fed with instruction streams. The compiler can't rewrite your program to do that, the prog
Re:Wow (Score:2)
Fail! (Score:2)
Which is exactly why it will never take off.
Won't work... (Score:2)
Re:PS3 Suggestion (Score:5, Informative)
Re:PS3 Suggestion (Score:2, Interesting)
Re:PS3 Suggestion (Score:1, Interesting)
People keep forgetting that Sony and Microsoft are in absolutely no way interested in providing you with a cheap computing platform for your linux cluster endeavours at their loss. They make money off of selling games for these th
Re:PS3 Suggestion (Score:5, Interesting)
Re:PS3 Suggestion (Score:2)
I sure hope that's true, but I'm pretty skeptical. Sony makes butt loads of cash off the software sales for the PS2 so if Sony actually does give us Linux it's going to have some strings attached. For example, I very much doubt that Sony will give us access to the GPU.
What gives me the most hope for a relatively unencumbered PS3 Linux distro is the Blu-Ray format. All Blu-Ray players will have a Java layer for interactive crud, which should be enoug
Mod parent down, FUD (Score:1)
The part of Sony that has been providing Linux kits [playstation2-linux.com] for the PS2 [ps2linux.com] since 2002.
The console homebrew scene is rather big, and Sony and Microsoft can do nothing about it.
Mambo development (Score:5, Informative)
IBM will also be releasing Cell-based Blade servers next year, so pick one up if you're serious about development!
MOD PARENT DOWN (Score:5, Informative)
Re:MOD PARENT DOWN (Score:2)
Re:PS3 Suggestion (Score:2, Informative)
Frankly, I don't see why they couldn't just use flash memory instead...everyone's doing it these days.
Re:PS3 Suggestion (Score:1)
Usually you are much more inventive. What the hell took you shitheads this long this time?
Oh baby! (Score:1, Funny)
Re:Task switching... (Score:3, Interesting)
The first core could be the main processor, handling processes, and the second core could just be there to be interrupted by dedicated threads executed on the SPEs, and to communicate with them. The main problem would come from the memory bandwidth used by the core which handles the 8 SPEs; it should be designed to minimize the impact on the first core.
A solution to this could be to have a cell processor and a traditi
Re:Task switching... (Score:1)
Well, it seems like it is (almost) your lucky day today.
The PPE is already hyperthreaded, posing as a dual core.