BrookGPU: General Purpose Programming on GPUs

Posted by CmdrTaco on Sunday December 21, 2003 @12:57PM from the bump-maps-make-me-giggle dept.

An anonymous reader writes "BrookGPU is a compiler and runtime system that provides an easy, C-like programming environment (read: no GPU programming experience needed) for today's GPUs. A shader program running on the NVIDIA GeForce FX 5900 Ultra achieves over 20 GFLOPS, roughly equivalent to a 10 GHz Pentium 4. Combine this with the increased memory bandwidth, 25.3 GB/sec peak compared to the Pentium 4's 5.96 GB/sec peak, and you've got a seriously fast compute engine, but programming GPUs has been a real pain. BrookGPU adds simple data-parallel extensions to C that let programmers specify which parts of their code should run on the GPU. The compiler and runtime take care of the rest. Here is the Project Page and Sourceforge page."
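
To give a rough feel for the data-parallel model the summary describes, here is a minimal Python sketch. It is not Brook code and does not touch a GPU: `run_kernel` and `saxpy_kernel` are hypothetical names invented for this illustration. The point is only that a "kernel" is a function applied independently to each element of its input streams, which is what would let graphics hardware evaluate all elements in parallel.

```python
from typing import Callable, List, Sequence


def run_kernel(kernel: Callable[..., float], *streams: Sequence[float]) -> List[float]:
    """Apply `kernel` to corresponding elements of equally long streams.

    Each output element depends only on the inputs at the same index,
    so in principle every element could be computed at the same time.
    """
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same length")
    return [kernel(*elements) for elements in zip(*streams)]


def saxpy_kernel(a: float) -> Callable[[float, float], float]:
    """Build a per-element kernel computing a*x + y (hypothetical example)."""
    return lambda x, y: a * x + y


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result = run_kernel(saxpy_kernel(2.0), x, y)
print(result)
```

In this sketch the loop runs sequentially on the CPU; the design choice worth noticing is that the kernel never reads its neighbours' results, which is the restriction that makes the parallel version possible.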

This discussion has been archived. No new comments can be posted.

High Performance for General Purpose? (Score:4, Interesting)

by tempfile ( 528337 ) writes: on Sunday December 21, 2003 @12:59PM (#7779605)

I suspect that this high performance is only attainable for the field the GPU is specialized for, i.e. graphics-related things. Or isn't it?

- Re:High Performance for General Purpose? (Score:4, Informative)
  
  by fidget42 ( 538823 ) writes: on Sunday December 21, 2003 @01:02PM (#7779634)
  
  Actually, since "graphics-related things" are all matrix operations, this would turn the GPU into a high-end vector (matrix) engine.
  
  - A real-world example - ray tracing (Score:4, Informative)
    
    by ron_ivi ( 607351 ) writes: <(moc.secivedxelpmocpaehc) (ta) (ontods)> on Sunday December 21, 2003 @06:43PM (#7781929)
    
    http://portal.acm.org/citation.cfm?doid=566654.566640
    http://www.theregister.co.uk/content/54/25312.html
    http://online.cs.nps.navy.mil/DistanceEducation/online.siggraph.org/2002/Papers/13_GraphicsHardware/purcell.ppt
    
- Re:High Performance for General Purpose? (Score:5, Insightful)
  
  by Anonymous Coward writes: on Sunday December 21, 2003 @01:04PM (#7779648)
  
  "Graphics-related" things include floating point mathematics, linear algebra, and vector operations. If you are doing anything computationally intensive, this might be useful. You don't have to actually use the hardware to do anything graphical if you are just interested in crunching numbers.
  
  - Re:High Performance for General Purpose? (Score:2)
    
    by JonnyRo88 ( 639703 ) writes:
    
    Aren't several cryptography operations related to matrix manipulation?
    - Re:High Performance for General Purpose? (Score:4, Interesting)
      
      by Directrix1 ( 157787 ) writes: on Sunday December 21, 2003 @06:24PM (#7781815)
      
      Yes, anything computationally intensive that works over a range of data can usually find a parallel solution, such as image/video manipulation/encoding/decoding, encryption, and cracking (and hopefully this will give us a platform for better software RF). I've always wondered why this stuff didn't just get worked into a coprocessor, because very little new stuff actually happened that was directly related to the video card (as in taking output from the machine and displaying it on a screen). I think the card manufacturers saw this, so they jumped on the 3D acceleration bandwagon, touting it as a new video card feature, when it should've just been in the domain of a new math coprocessor.
      
  - HP for GP?-AGP Bottleneck. (Score:2, Interesting)
    
    by Anonymous Coward writes:
    
    Wasn't there a Slashdot story about the slowness of reading back across the AGP bus? How will that affect the usefulness of GPUs?
    - Re:HP for GP?-AGP Bottleneck. (Score:5, Insightful)
      
      by Nexx ( 75873 ) writes: on Sunday December 21, 2003 @02:08PM (#7780067)
      
      WARNING: Lots of conjecture involved.
      
      That said, if you can fit your data sets and your program on to the video memory (128MB isn't uncommon on high-end), and you're doing lengthy calculations on these sets while being only interested in the results (again, not uncommon in HPC), then the relative slowness of reading these results back becomes a nonissue.
      
      Does that help? :)
      
    - Re:HP for GP?-AGP Bottleneck. (Score:2, Insightful)
      
      by Anonymous Coward writes:
      
      In 12 months, AGP will be obsolete.
      
      It will be replaced by PCI-Express, which as a general purpose bus supposedly won't have these issues.
    - Re:HP for GP?-AGP Bottleneck. (Score:3, Informative)
      
      by skookum ( 598945 ) writes:
      
      Yes, the AGP -> main memory transfer rate of most video cards is abysmally slow, because it's not something that's needed for gaming. Maybe newer cards have changed, but I don't see why they would. background article [tech-report.com]
      - Re:HP for GP?-AGP Bottleneck. (Score:3, Insightful)
        
        by skookum ( 598945 ) writes:
        
        You're missing the point completely. Main memory -> AGP is blazing fast, for the reasons you just stated. AGP -> main memory is painfully slow, because there's almost no requirement for much data to flow in this direction. 99.9999% of the output that the video card computes is displayed on the screen and then discarded with the next refresh.
  - Re:High Performance for General Purpose? (Score:5, Interesting)
    
    by BrainInAJar ( 584756 ) writes: on Sunday December 21, 2003 @02:34PM (#7780200)
    
    would the precision be enough though? as far as i know, GPUs do a lot of rounding off
    
    - Re:High Performance for General Purpose? (Score:3, Informative)
      
      by Anonymous Coward writes:
      
      NVIDIA's parts are OK, precision-wise. You get IEEE floats, more or less. ATI's parts don't quite get you there at the moment, but their next series is planned to.
      - Re:High Performance for General Purpose? (Score:3, Informative)
        
        by Viking Coder ( 102287 ) writes:
        
        Nope.
        
        Those are 4-component (RGBA) types, with 32, 16, and 24 bits per component, respectively.
        
        None of them are enough for double floats, and none of them are good enough for 80-bit reals that x87 uses.
    - Re:High Performance for General Purpose? (Score:3, Insightful)
      
      by sql*kitten ( 1359 ) * writes:
      
      as far as i know, GPUs do a lot of rounding off
      
      It depends. If you have a gaming card, it will sacrifice precision for speed to hit its price. If you're rendering 100 fps in a game and in a couple of noncontiguous frames the walls don't quite line up, no big deal. But with a professional CAD card, speed is sacrificed for precision - the risk of an engineer making a mistake or failing to spot one in an assembly alignment because of a rendering artefact is too high.
      
      In practice, a CAD card is just as fast as a gam
      - Re:HP for GP?-Fakeout. (Score:4, Informative)
        
        by sql*kitten ( 1359 ) * writes: on Sunday December 21, 2003 @06:40PM (#7781909)
        
        I thought the real reason to get a *professional level* card is to get a guarantee of reliability
        
        Well, ISV certification - a CAD vendor will assert "with this card, our software produces no rendering artifacts".
        
- Re:High Performance for General Purpose? (Score:4, Interesting)
  
  by Total_Wimp ( 564548 ) writes: on Sunday December 21, 2003 @01:32PM (#7779852)
  
  I can't help but notice the similarity between shader operations and how neurons interact. These processors might be a good platform for some AI tasks.
  
  I especially like the idea that the GPU and CPU can work together on the task. If the GPU was handling neuron tasks and the CPU was handling other necessary tasks we could get a very big boost to desktop AI
  
  TW
  
- Re:High Performance for General Purpose? (Score:4, Insightful)
  
  by axxackall ( 579006 ) writes: on Sunday December 21, 2003 @11:29PM (#7783389) Homepage Journal
  
  Matrix and vector calculations with floating point make the GPU an excellent place to host Neural Network (NN) computation.
  Of course NNs can be used for "graphics-related things" such as image recognition, but not only for images: voice recognition, for example. And not only for recognition: forecasting on huge sequences with explicit and implicit (hidden) side-factors, for example.
  Stock market trader on GPU, anyone?
  
Cool, but (Score:3, Interesting)

by MooCows ( 718367 ) writes: on Sunday December 21, 2003 @01:00PM (#7779608)

What kind of instructions does the GPU actually accept?
I mean, you probably just can't run any kind of algorithm on there can you?

- Good point. (Score:5, Insightful)
  
  by yoshi_mon ( 172895 ) writes: on Sunday December 21, 2003 @01:11PM (#7779698)
  
  After taking a quick peek at the language [stanford.edu] part of the project, it seems right now that most of its functions are all about sets of data and how to move them around.
  
  Makes sense of course, as that is what a GPU is all about. (Yes, I'm vastly oversimplifying here.) So I would gather that it might be used for types of data that are streamed a lot? Maybe video editing, real-time video, etc., where you're trying to deal with a lot of data at once that you're trying to move around, rather than just store it or perform more complicated types of functions on it.
  
  However, I'm no 3D programmer and I sure would love a more detailed analysis of the potential for this.
  
- Re:Cool, but (Score:4, Informative)
  
  by scrytch ( 9198 ) writes: <chuck@myrealbox.com> on Sunday December 21, 2003 @01:24PM (#7779792)
  
  > I mean, you probably just can't run any kind of algorithm on there can you?
  
  Probably. I should imagine it has local storage with the corresponding fetch and store instructions, basic math, and the ability to jump to arbitrary points in the shader program, which makes it very much Turing complete. Everything else is a matter of a compiler backend. Bus latency would be an issue, so it'd be painful for programs that need a lot of I/O, but that's not an issue for a lot of programs.
  
- GPU opcodes (Score:4, Informative)
  
  by Anonymous Coward writes: on Sunday December 21, 2003 @01:45PM (#7779929)
  
  Here is a Beyond3d link that has some opcode info [beyond3d.com]. Look around their site for a NV30 vs R300 architecture document that has lots of great stuff. If you are looking for the best s/n ratio, Beyond3d is one of the best. All meat, little fanboyism.
  
Basically like having two processors... (Score:4, Interesting)

by Anonymous Coward writes: on Sunday December 21, 2003 @01:00PM (#7779612)

I wonder how long till we see a (insert worthwhile cause here)-At-Home client that supports this?

- Re:Basically like having two processors... (Score:2, Offtopic)
  
  by Doc Ruby ( 173196 ) writes:
  
  Keeping the prehistoric Atari/Commodore flamewar alive, I point out that I used to program multiprocessing on my Atari 400. Syncing its ANTIC vertical blank server routines with 6502 client routines through its wonderfully generic SIO scheduler, I was multiprocessing in 1981! Harnessing fast GPUs to speed general logic is at least as old as the Roman spatial metaphor for value, where "superior" means both "higher" and "better", et cetera.
  - Re:Basically like having two processors... (Score:3, Interesting)
    
    by Chordonblue ( 585047 ) writes:
    
    Yeah, I remember that! Lucasfilm used it to animate a mothership in 'Rescue on Fractalus' (itself a marvel of tech for the Atari) while the game loaded. There were cool (de)compression routines that harnessed this as well.
    
    I also seem to recall certain music pieces that could play extra parts by blanking the screen. There was also a really cool 9 second sample of 'You really got me' - the Van Halen version - and it blanked the screen to play it.
    
    Wow! Them were the salad days!
  - Re:Basically like having two processors... (Score:2)
    
    by CTalkobt ( 81900 ) writes:
    
    Bah humbug. Just to continue the Atari/Commodore flamewar...
    
    I coded a basic multi-tasker that would allow different threads of a basic program to be run at the same time for the Commodore. It got confusing if you tried to modify a variable in more than 1 spot. It was more fun to play with than really practical.
  - Re:Basically like having two processors... (Score:4, Interesting)
    
    by cybergibbons ( 554352 ) writes: on Sunday December 21, 2003 @07:52PM (#7782321) Homepage
    
    Ha! The C64 disk drive had its own processor which you could use to run programs as long as you could deal with the painfully slow serial link. Beat that.
    
Cool ... (Score:5, Interesting)

by torpor ( 458 ) writes: <ibisum@@@gmail...com> on Sunday December 21, 2003 @01:01PM (#7779624) Homepage Journal

... can you say 'software synthesists' wet dream?

Oh, suddenly, that 'game investment' also gives you a few 100 extra voices of polyphony?

Sweet ... $5 to the first person to use Brook to make a synthesizer. :)

- Re:Cool ... (Score:2, Informative)
  
  by usrusr ( 654450 ) writes:
  
  think fx not synth... just use it as a bad-ass real time convolver, and _then_ get wet.
  
  isn't it much more interesting to do things that were not possible before, than to just do the same thing, but in increased quantity? Also convolution is the single most universal operation in audio dsp (fir filters, reverb), one well-built plugin would suffice for everything. synth development creativity would certainly suffer from the increased development costs.
  - Re:Cool ... (Score:2, Interesting)
    
    by torpor ( 458 ) writes:
    
    What does 'synth' mean to you?
    
    To me it doesn't just mean Virtual Analog, or subtractive... it can be anything that makes noise ... so yeah, filters, yeah, effects, yeah, a single monster filter...
    
    It's all good. Let's see what the GPUs can do ...
first link is incorrect (Score:5, Informative)

by 2.246.1010.78 ( 721713 ) writes: on Sunday December 21, 2003 @01:02PM (#7779626)

but the link to the project page [stanford.edu] is correct.

Like the good old days (Score:5, Funny)

by fiskbil ( 734457 ) writes: on Sunday December 21, 2003 @01:02PM (#7779631) Homepage

Reminds me of the good old days when you used the processors in the C64 tapedrive to compute stuff. Wouldn't want to waste those precious cycles.

I'm sure a lot of old farts will tell me how they used some serial controller to compute stuff back in the 60's and that I'm just a little kid. :)

- Re:Like the good old days (Score:3, Informative)
  
  by Trracer ( 210292 ) writes:
  
  I guess you mean in the C1541 floppydrive.
- Re:Like the good old days (Score:3, Informative)
  
  by tzanger ( 1575 ) writes:
  
  Reminds me of the good old days when you used the processors in the C64 tapedrive to compute stuff. Wouldn't want to waste those precious cycles.
  
  Actually it was the old 1540/1541 and later 1571/1581 disk drives. The tape drive did not have a processor in it.
wait a minute (Score:5, Interesting)

by Janek Kozicki ( 722688 ) writes: on Sunday December 21, 2003 @01:03PM (#7779637) Journal

A shader program running on the NVIDIA GeForce FX 5900 Ultra achieves over 20 GFLOPS, roughly equivalent to a 10 GHz Pentium 4.

wait, if there is a technology that allows construction of a GPU that is 3 times faster than the fastest CPUs, why don't Intel and AMD use this technology to build CPUs that are 3 times faster?

are you sure that you can compare the speed of GPU and CPU?
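
For reference, the "3 times" figure falls out of back-of-the-envelope arithmetic on the numbers in the summary. The sketch below is only that: it assumes a 3.2 GHz Pentium 4 as the fastest CPU (an assumption, not a figure from the story) and takes the summary's peak numbers at face value, so it says nothing about sustained performance on real programs.

```python
# Figures quoted in the story summary.
gpu_gflops = 20.0          # GeForce FX 5900 Ultra shader throughput
equivalent_p4_ghz = 10.0   # the "10 GHz Pentium 4" it is compared to
gpu_bandwidth_gbs = 25.3   # peak GPU memory bandwidth, GB/sec
p4_bandwidth_gbs = 5.96    # peak Pentium 4 memory bandwidth, GB/sec

# Assumption (not from the story): the fastest Pentium 4 runs at 3.2 GHz.
real_p4_ghz = 3.2

# The comparison implies this many floating-point operations per P4 clock cycle.
flops_per_cycle = gpu_gflops / equivalent_p4_ghz

# Peak GFLOPS of the real chip under the same per-cycle assumption.
real_p4_gflops = flops_per_cycle * real_p4_ghz

speedup = gpu_gflops / real_p4_gflops
bandwidth_ratio = gpu_bandwidth_gbs / p4_bandwidth_gbs

print(f"implied FLOPs per cycle: {flops_per_cycle:.1f}")
print(f"peak compute ratio:      {speedup:.2f}x")
print(f"peak bandwidth ratio:    {bandwidth_ratio:.2f}x")
```

Both ratios compare peak figures only, which is the caveat behind the question of whether the two kinds of chip can be compared at all.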

- Re:wait a minute (Score:2, Informative)
  
  by MooCows ( 718367 ) writes:
  
  The keywords are:
  A shader program
  
  The GPU is designed for CG, not for 'general purpose computing'.
  I guess the instruction set is pretty limited too.
- Re:wait a minute (Score:2, Informative)
  
  by AvitarX ( 172628 ) writes:
  
  You can compare their ability to run shader programs (see the example given).
  
  It does not mean you can use the GPU as a general-purpose processor effectively, or that it is even Turing complete.
  
  All it means is that certain types of programs could possibly run 3 times faster if ported to this system.
- Re:wait a minute (Score:2, Informative)
  
  by ankit ( 70020 ) writes:
  
  It's probably because the Pentium 4 needs to be more generic. It needs to support a far greater number of instructions.
  
  A GPU on the other hand can do only so much. But its strength lies in areas where the CPU lags: fast memory interfacing, extreme parallelization, etc.
  
  Now there exist computing problems that can be solved very efficiently on the GPU, even with its limited instruction set. This is what this project is all about - to provide a generic programming language that compiles to a vertex/pixel shader
- Re:wait a minute (Score:5, Informative)
  
  by the uNF cola ( 657200 ) writes: on Sunday December 21, 2003 @01:12PM (#7779703)
  
  You are assuming that the GPU's technologies can be used in a CPU. Just because something is applicable in one instance doesn't mean it is in all instances. Making some things efficient may take away from the efficiency of others, but in the case of such a specialized chip, it may not matter.
  
  It may be OK to compare the speed of a GPU and a CPU if they are in fact different. If a GPU was a CPU made with cheaper material, yeah, it would be unfair. But as life goes, they both have their merits.. so why not? A GPU is prolly best at some matrix math transforms.. or not. :)
  
- Re:wait a minute (Score:5, Insightful)
  
  by enigma48 ( 143560 ) * writes: <jeff_new_slash@jeffdom . c om> on Sunday December 21, 2003 @01:15PM (#7779728) Journal
  
  Definitely possible - general purpose CPUs have to do everything, whereas graphics cards can specialize and do what little they can, faster.
  
  Also, good point about comparing GHz to GHz - AMD CPUs do more per cycle than Intel, but are also clocked much lower. You could look at a subset of instructions (ie: FLoating-point OPerations (FLOPS)) but this only gives you a piece of the overall performance picture.
  
  Without having read the article, my guess is they extrapolated (educated, math-based guess) how fast a 10GHz P4 would perform and compared the results that way.
  
  I'd LOVE to see this tech built into a SETI or Folding@Home client (steroids version). (Imagine the kids - "Mom, I need the Radeon 9800XT to find a cure for Grandma's cancer!")
  
  - remember those 3dfx tv ads... (Score:2, Funny)
    
    by agent2 ( 628468 ) writes:
    
    Imagine the kids - "Mom, I need the Radeon 9800XT to find a cure for Grandma's cancer!"
    ...that went something like "we have the technology...blah...something....to save lives....but instead....we've used 'em for games!!"
- Re:wait a minute (Score:3, Interesting)
  
  by Jah-Wren Ryel ( 80510 ) writes:
  
  All the world is not a FLOP. GPU = Graphics Processing Unit, not General Purpose Unit.
- Re:wait a minute (Score:5, Informative)
  
  by Entropy_ajb ( 227170 ) writes: on Sunday December 21, 2003 @01:18PM (#7779753)
  
  Because CPUs are limited to running instructions (for the most part) in serial. GPUs get to run a large number of instructions in parallel. As some above posts mentioned, a lot of the stuff the GPU can do is vector and matrix multiplication, therefore the GPU is really good at multiplying a lot of numbers by a lot of numbers at once. But in everyday life you aren't multiplying a bunch of numbers by a bunch of numbers at once; you are multiplying one number by another, then multiplying the result by a number, and so on. GPUs are built for a specific task, and at that task they are very fast, but outside that task they won't be able to compete with a real CPU. And on top of all of that I can buy 3 2.4GHz P4s for the price of a GeForce FX5950.
  
  - Re:wait a minute (Score:4, Interesting)
    
    by mdpye ( 687533 ) writes: on Sunday December 21, 2003 @01:28PM (#7779819)
    
    And on top of all of that I can buy 3 2.4GHz P4s for the price of a GeForce FX5950
    
    But you forget the 256MB (at least) RAM on a steaming fast interface that you get with the GeForce... It makes the P4s' cache look pretty paltry in size by comparison.
    
    MP
    
- Re:wait a minute (Score:5, Informative)
  
  by Kjella ( 173770 ) writes: on Sunday December 21, 2003 @01:39PM (#7779894) Homepage
  
  wait, if there is a technology that allows construction of a GPU that is 3 times faster than the fastest CPUs, why don't Intel and AMD use this technology to build CPUs that are 3 times faster?
  
  are you sure that you can compare the speed of GPU and CPU?
  
  Well, yes and no. In the same way, you can take a render farm and say that "this provides the equivalent of a 100GHz Pentium", which might be true for that specific task. You see it already between CPUs: compare Pentium, Xeon, Athlon XP and Athlon 64. Do you get one benchmark saying "X is 3% faster than Y"? No. Faster at some, slower at others. For a specific benchmark, the difference can be pretty big already among "general" processors.
  
  A specialized processor like a GPU will show much greater variation. It might really shine on some, really suck on others. Which is why it's no good using a GPU as a CPU. Those numbers tell you that it can be much faster than the fastest CPU around. Or better yet, if you can make it run in parallel with the normal CPU, it can give you a total performance which may theoretically be about 13GHz (10 + 3), where 3 of those can be general-purpose operations. Or it may be a task the GPU runs like a dog, and isn't even worth the overhead.
  
  Kjella
  
- Re:wait a minute (Score:5, Interesting)
  
  by barik ( 160226 ) writes: on Sunday December 21, 2003 @01:54PM (#7779983) Homepage
  
  Are you sure that you can compare the speed of GPU and CPU?
  
  Professor Pat Hanrahan [stanford.edu], of Stanford University, made a stab at answering this question in his presentation 'Why is Graphics Hardware so Fast? [stanford.edu]'. The first half of the presentation focuses on this question, while the second half of the presentation covers programming languages that utilize this hardware. Specifically, the Stanford Real-Time Shading Language (RTSL) and Brook are discussed. Overall, it's a good presentation that should get you up to speed with the basics of what's happening in this area of research.
  
  - Re:wait a minute (Score:3, Funny)
    
    by larkost ( 79011 ) writes:
    
    *arrrg*!!
    
    PowerPoint-like presentation... going dumb... noooooo...
- Re:wait a minute (Score:3, Funny)
  
  by SirDaShadow ( 603846 ) writes:
  
  2 words: X86 architecture. Everyone who hated it told you it sucks. Now you see why.
- - DSPs = linear equation processors (Score:3, Interesting)
    
    by Doc Ruby ( 173196 ) writes:
    
    We used the AT&T DSP32, a 12.5MFLOPS DSP, 15 years ago at Array Technologies. Programmable in a native C source code, with multiply-accumulate (MAC) instructions optimized in microcode, the DSP32 was lightning fast at y = mx + b equations in its arithmetic logic unit (ALU), and its control logic unit (CLU) was also very fast at branching, including no-overhead looping. Linux runs on one of its many fascinating descendants, the Xilinx Virtex-2 Pro [commsdesign.com].
    - AT&T DSP32 Cluster Supercomputer in late 80s (Score:3, Interesting)
      
      by billstewart ( 78916 ) writes:
      
      The AT&T DSP32 definitely rocked. In addition to doing 32-bit floating point multiply and accumulate, it could simultaneously do 24-bit integer calculations. The supercomputer cluster was up to 128 of them (I forget if they were 8 or 16 per board), with communications structured as a tree, which could give you 1 GFLOPS sustained and up to 2 GFLOPS if you could keep them busy doing multiply-and-accumulate. Not bad for a desktop in the late 80s, though of course you can get that for $49 today:-)
      A typi
      - Re:AT&T DSP32 Cluster Supercomputer in late 80 (Score:3, Funny)
        
        by Doc Ruby ( 173196 ) writes:
        
        We got our first boards from the developers of an antiaircraft RADAR signature decoder/sight. We wound up using DSP32Cs, 25MFLOPS as I recall, by late 1990. We had an EISA card (PCI was in the future) with an FPGA for linearly scalable pluggable DSPs. We had experimented with a transputer, but found we could use the DSPs to preprocess the video sensor data during calibration, and load custom logic and buses into the FPGAs for maximum efficiency routing the data. When the company folded and reformed, the tech
        
        Re:AT&T DSP32 Cluster Supercomputer in late 80 (Score:4, Interesting)
        
        by billstewart ( 78916 ) writes: on Monday December 22, 2003 @10:44AM (#7785892) Journal
        
        Yes, they were 25 MFLOPS. The chip had a 12.5 MHz cycle rate (I think that was also the clock speed), and each cycle could do a 32-bit multiply, a 32-bit add, and a 24-bit simple integer operation (some integer ops took multiple clocks, I think?)
        Your music application sounds like fun. I didn't know anybody was still doing anything quite like that by 1990 - there was a whole range of people around John Cage's time who did lots of prepared piano stuff.
        
        Some of the people who were trying to sell our multi-processor supercomputer flavor came up with a music studio application, doing lots of audio processing and mixing, sort of like your device turned inside out. Don't know if they sold more than one of them before the Lucent spinoff took them away.
        
How does this look? (Score:5, Interesting)

by adrianbaugh ( 696007 ) writes: on Sunday December 21, 2003 @01:03PM (#7779642) Homepage Journal

I'm completely new to meddling with graphics cards, so apologies if this is a silly question: when programs utilising the GPU for arbitrary calculations are running, does the screen go weird, or is there a way of stopping the output being displayed? A screenful of junk might not matter to a scientist leaving their computer to crunch numbers for a few months but it wouldn't be good for a general-purpose program.

- Re:How does this look? (Score:5, Informative)
  
  by Anonymous Coward writes: on Sunday December 21, 2003 @01:12PM (#7779705)
  
  Nope. Nothing appears on your screen until the contents of the area of memory known as the "frame buffer" are rewritten by a program (on either the GPU or CPU). The GPU can execute math code all day and you won't see the results unless it deliberately modifies the frame buffer.
  
- The deaf leading the blind... (Score:5, Informative)
  
  by Kjella ( 173770 ) writes: on Sunday December 21, 2003 @01:14PM (#7779727) Homepage
  
  ...but I assume that in any advanced texturing/shading/bump mapping/other GFX function rendering, you apply all the different effects, and when you're done, specifically call that the frame is to be displayed on screen. (E.g. why your FPS != your monitor refresh rate)
  
  I would assume that this program simply never calls the drawing function, but instead gets the results back from the GPU. The normal screen should be able to run in the meanwhile (I assume you can e.g. build a 3D environment while showing a 2D cutscreen), so I would think you can have a plain GUI, as long as it doesn't need to use anything advanced.
  
  Kjella
  
I am not an EE, but... (Score:5, Interesting)

by unfortunateson ( 527551 ) writes: on Sunday December 21, 2003 @01:08PM (#7779669) Journal

It would seem to me that the GPU is not going to be as general-purpose as the CPU, but could still attain the high mathematical throughput with vector-oriented processing.

Doing string searches, complex logic analyses, etc. would probably suck, but big data manipulations, such as SETI-style wave transformations, molecular analysis, etc., might be able to take advantage of them.

A good example of how an OS should be programmed. (Score:2, Insightful)

by qualico ( 731143 ) writes:

"Brook is an extension of standard ANSI C and is designed to incorporate the ideas of data parallel computing and arithmetic intensity into a familiar, efficient language" I'll qualify that this is the first I've heard of Brook, however, the words: "efficient language", ring loud in my ears. If Operating systems were programmed as such, imagine how fast bootup and operation of a computer would be. Instead we have bloated software on all sides of the board, that can barely show us the differences of MHz t
Fast Fourier Transform (Score:4, Interesting)

by HalfFlat ( 121672 ) writes: on Sunday December 21, 2003 @01:12PM (#7779704)

I'd love to see an FFT implementation (maybe it's not so hard ... will have to download and play with it.)

A lot of scientific code is constrained by how fast you can do an FFT, perhaps of arbitrary size. And a fast graphics card is a lot cheaper than a high-end processor.

For embarrassingly parallel vector problems, this is just the sort of thing for cheap, powerful clusters based around a cheap PC and a fast GPU.
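
As a rough illustration of why the FFT looks like a good fit, here is a minimal radix-2 Cooley-Tukey sketch in plain Python. It runs on the CPU and is not taken from Brook or any GPU implementation; `fft` and `naive_dft` are names chosen for this example, and the input length is assumed to be a power of two. The thing to notice is the butterfly loop, where no iteration depends on another.

```python
import cmath
from typing import List, Sequence


def fft(samples: Sequence[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    # Every butterfly in this loop is independent of the others, so one
    # stage of n/2 butterflies maps naturally onto parallel hardware.
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def naive_dft(samples: Sequence[complex]) -> List[complex]:
    """O(n^2) reference transform used only to check the FFT."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 3.0]
spectrum = fft(signal)
reference = naive_dft(signal)
# Largest disagreement between the fast and the reference transform.
print(max(abs(a - b) for a, b in zip(spectrum, reference)))
```

A GPU version would more likely use an iterative formulation with one pass per stage; the recursive form is used here only because it is short.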

- Re:Fast Fourier Transform (Score:5, Interesting)
  
  by Kazymyr ( 190114 ) writes: on Sunday December 21, 2003 @01:16PM (#7779736) Journal
  
  Not to mention that you can put several PCI video cards in the same cheap PC. Multiply power by N.
  
  - Re:Fast Fourier Transform (Score:5, Funny)
    
    by BiggerIsBetter ( 682164 ) writes: on Sunday December 21, 2003 @03:01PM (#7780348)
    
    Multiply power by N.
    
    You work for Nvidia, don't you?
    
- Re:Fast Fourier Transform (Score:5, Informative)
  
  by jonsmirl ( 114798 ) writes: on Sunday December 21, 2003 @01:25PM (#7779795) Homepage
  
  http://www.cs.unm.edu/~kmorel/documents/fftgpu/
  
  The FFT on a GPU
  This page contains supplemental material for the following paper.
  
  Moreland, K and Angel, E. "The FFT on a GPU." In SIGGRAPH/Eurographics Workshop on Graphics Hardware 2003 Proceedings, pp. 112-119, July 2003.
  
- Re:Fast Fourier Transform (Score:2)
  
  by SharpFang ( 651121 ) writes:
  
  *sigh* I've seen so many more-or-less sophisticated pieces of code to do the bit mirror-reversing in FFT, so why haven't they made a CPU (ASM) instruction for that yet (or did they?) That's SO easy in hardware, just twist the bus 180 degrees.
Site is slow (Score:2, Informative)

by Anonymous Coward writes:

As the programmability and performance of modern GPUs continues to increase, many researchers are looking to graphics hardware to solve problems previously performed on general purpose CPUs. In many cases, performing general purpose computation on graphics hardware can provide a significant advantage over implementations on traditional CPUs. However, if GPUs are to become a powerful processing resource, it is important to establish the correct abstraction of the hardware; this will encourage efficient appli
Homepage of GPGPU research (Score:5, Informative)

by zymano ( 581466 ) writes: on Sunday December 21, 2003 @01:15PM (#7779731)

www.gpgpu.org [gpgpu.org]

Very cool. Vector/Graphics processors could one day overtake General processors. They are way more energy efficient too.

Drawing text with GPU shader units? (Score:4, Interesting)

by jonsmirl ( 114798 ) writes: on Sunday December 21, 2003 @01:16PM (#7779737) Homepage

Has anyone tried drawing text with GPU shader units? It would work something like this:
1) Each character would have its own shader program.
2) You would set the shader program, draw a rectangle, and the character would appear.
3) The shader programs would be automatically generated by processing TrueType files.
To implement:
1) Break the TrueType outline up into a number of convex curve segments.
2) Each of these curve segments would be represented as a set of constants in the shader program.
3) For each pixel, test a line from the pixel to an edge.
4) If the number of segments crossed is odd the pixel is black else white.
The algorithm can be refined to add antialiasing and hinting.
What you end up with is text that is clear at any resolution. The size of the text is controlled by the rectangle you draw it in. The text can also be clearly rotated and sheared.
An obvious optimization is to get the GPU vendors to add a shader instruction that calculates which side of a Bézier curve segment the current point lies on.
While not important for games, drawing text is critical for desktops. And we all know about the current trend toward drawing desktops with 3D hardware.
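
The per-pixel test in steps 3 and 4 above is the even-odd rule. A rough CPU-side sketch in Python follows, with two loud simplifications: the outline is approximated by straight edges rather than Bézier curve segments, and it runs as ordinary code rather than as a shader program. The names `crossings` and `pixel_is_black` are hypothetical.

```python
def crossings(px, py, outline):
    """Count outline edges crossed by a ray cast from (px, py) toward +x."""
    count = 0
    n = len(outline)
    for i in range(n):
        x0, y0 = outline[i]
        x1, y1 = outline[(i + 1) % n]
        # The edge can only be hit if its endpoints lie on opposite sides
        # of the horizontal line y = py (half-open test avoids double counts).
        if (y0 > py) != (y1 > py):
            # x coordinate where the edge meets the line y = py
            x_hit = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if x_hit > px:
                count += 1
    return count


def pixel_is_black(px, py, contours):
    """Even-odd rule: the pixel is inside if the total crossing count is odd."""
    total = sum(crossings(px, py, contour) for contour in contours)
    return total % 2 == 1


if __name__ == "__main__":
    # A square with a square hole, standing in for a glyph such as "O".
    outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
    inner = [(3, 3), (7, 3), (7, 7), (3, 7)]
    for point in [(1.5, 5), (5, 5), (12, 5)]:
        print(point, pixel_is_black(point[0], point[1], [outer, inner]))
```

In the scheme described above, the edge list would be baked into each glyph's shader as constants, and the crossing test would need a curve-aware version to handle Bézier segments directly.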

- Re:Drawing text with GPU shader units? (Score:2)
  
  by asparagus ( 29121 ) writes:
  
  It's a nice idea, but far simpler to rasterize the characters one time to a buffer, and then use them as 2d-textures. Then it's easier to code, optimize, and tweak said textures/characters if you don't like how they look. It eats memory, but that's one thing there's plenty of on modern graphics cards.
  - Re:Drawing text with GPU shader units? (Score:3, Interesting)
    
    by jonsmirl ( 114798 ) writes:
    
    Think about a compositing system where the window the app is being drawn into has been transformed into a non-rectangular shape by the compositing engine.
    The app thinks it is drawing into a flat rectangle. But the compositing engine distorts the font bitmap with its transform. With the shader approach the distortion doesn't happen. Same problem happens when the compositing engine does scaling.
    You only need one shader program per glyph no matter what point size you want to draw. There is a lot of overhe
- Re:Drawing text with GPU shader units? (Score:2)
  
  by Have Blue ( 616 ) writes:
  
  It might be more efficient to convert the fonts beforehand into polygons that can be processed by the GPU's hardware tessellator (I know the Radeons have this, not sure about the GeForces). Then they can be rasterized using a faster process.
  - Re:Drawing text with GPU shader units? (Score:2)
    
    by jonsmirl ( 114798 ) writes:
    
    With tessellation you can't perform hinting at tiny point sizes.
  - Re:Drawing text with GPU shader units? (Score:2)
    
    by Allen Akin ( 31718 ) writes:
    
    Yeah, I have code that does this (and I'm sure lots of other folks do, too).
    
    The plus is that it can be used to produce fairly nice antialiased text that intermixes well with other primitives, and rendering is very fast.
    
    The minus is that a single set of geometric primitives for a character won't work for all point sizes if you need to use hinting. (Whether this is important or not depends on your application -- especially whether you need very small text, or have so little graphics memory available that y
- Re:Drawing text with GPU shader units? (Score:2)
  
  by whovian ( 107062 ) writes:
  
  The Bézier stuff you mentioned whiffs of PostScript. Maybe the IT lawyers could chime in and say whether the extant patents that companies (like Adobe, IIRC) have also apply to GPU-rendered curves.
  - Re:Drawing text with GPU shader units? (Score:3, Insightful)
    
    by BiggerIsBetter ( 682164 ) writes:
    
    Why not? Printers have been doing this for years... There's no reason you couldn't make a graphics card to display postscript in hardware.
  - Re:Drawing text with GPU shader units? (Score:2)
    
    by One Louder ( 595430 ) writes:
    
    I don't think there are any patents on simply rendering the filled Bézier outlines. The patents usually have to do with adjustments to improve appearance at very small sizes, such as hinting or level-of-detail substitutions, or taking advantage of pixel geometry like in flat panels.
    - Re:Drawing text with GPU shader units? (Score:2)
      
      by whovian ( 107062 ) writes:
      
      Thanks. Self correction: I guess that should be Apple not Adobe, according to FreeType:
      
      Apple Computer owns three patents that are related to the processing of glyph outlines within TrueType fonts. This process is also called hinting or grid-fitting and is used to enhance the quality of glyphs at small bitmap sizes.
      ( http://freetype.sourceforge.net/patents.html )
  - - Re:Drawing text with GPU shader units? (Score:2)
      
      by whovian ( 107062 ) writes:
      
      It appears instead that Adobe is embracing SVG by releasing their own viewer [adobe.com].
- Re:Drawing text with GPU shader units? (Score:2)
  
  by EddWo ( 180780 ) writes:
  
  I believe Microsoft is doing this in Longhorn. They are reworking their ClearType code to use pixel shaders where available.
Brook (Score:5, Insightful)

by belmolis ( 702863 ) writes: on Sunday December 21, 2003 @01:24PM (#7779789) Homepage

This looks like a straightforward and clean extension that experienced C/C++ programmers won't find difficult to learn, but it isn't entirely clear to me whether just using this language, without any knowledge of GPU architecture, will lead to big improvements in performance. Granted, you don't need to know the details, but you've got to have an idea of what it is that you're trying to do and in a general way how the special constructs of the language allow you to do that. As with other such language extensions, you can nominally write in the language but not really use the extensions (how many "C++" programs have you seen that were really C programs with // comments and a few couts?) or use them in unintended ways that prevent the intended optimization. It seems to me that if the project really is aiming at programmers who are not familiar with GPUs, they need at least to provide a brief introduction to the special properties of GPU architecture and some guidelines as to how to use the features of the language to take advantage of them. At present I don't find this either on the web sites or in the distribution.

Excellent! (Score:3, Interesting)

by macemoneta ( 154740 ) writes: on Sunday December 21, 2003 @01:31PM (#7779841) Homepage

I had submitted an AskSlashdot on this subject:

2003-04-20 01:51:36 Using video processing as "attached processor" (askslashdot,hardware) (rejected)

But as you can see it was rejected. I was particularly interested in the use of the GPU for cryptographic functions (e.g., with a loopback encrypted filesystem), to offload the processing from the main CPU. Is anyone aware of any work in this area?

Is this even a viable implementation, or would the overhead of continually dispatching work to the GPU exceed the benefit derived?

- Re:Excellent! (Score:2)
  
  by Quixote ( 154172 ) writes:
  
  But as you can see it was rejected. I was particularly interested in the use of the GPU for cryptographic functions (e.g., with a loopback encrypted filesystem), to offload the processing from the main CPU.
  I'd say it won't work. The AGP bus is slow at pushing data out.
  - - Re:Excellent! (Score:4, Informative)
      
      by larkost ( 79011 ) writes: on Sunday December 21, 2003 @05:06PM (#7781309)
      
      2.1 GB/s is very nice, but it only refers to transfers in one direction: to the card. There is a (much) smaller bandwidth back to the motherboard. This is because for their designed purpose, graphics cards do not need to talk back to the system much, they just crunch the numbers and spit out the results to a monitor.
      
      With encryption you are usually looking at processing streams of data. If your encryption method involves a lot of floating point math (almost never) on every bit of information, then it would be nice. But encryption is almost always integer based (GPUs don't shine in integer like they do in floating point), and involves just as much data going in as coming back.
      
      If you are looking for a great coprocessor for integers, look at the Altivec section of the G4 (and the similar one in the G5... I forget the IBM name).
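
The bandwidth argument in this comment can be put as a back-of-envelope cost model. The Python sketch below is only an illustration: apart from the 2.1 GB/s upload figure quoted above, every number (readback bandwidth, GPU and CPU processing rates) is a made-up assumption, the function names are hypothetical, and it assumes upload, compute, and readback do not overlap.

```python
def offload_time(n_bytes, upload_bw, readback_bw, gpu_rate):
    """Seconds to send n_bytes to the card, process them, and read them back."""
    return n_bytes / upload_bw + n_bytes / gpu_rate + n_bytes / readback_bw


def cpu_time(n_bytes, cpu_rate):
    """Seconds for the CPU to process n_bytes in place."""
    return n_bytes / cpu_rate


if __name__ == "__main__":
    n = 1e9            # 1 GB of data to encrypt
    upload = 2.1e9     # bytes/s to the card (figure quoted in the comment)
    readback = 0.25e9  # bytes/s back from the card (ASSUMED for illustration)
    gpu = 4.0e9        # bytes/s GPU processing rate (ASSUMED)
    cpu = 0.5e9        # bytes/s CPU processing rate (ASSUMED)

    print("offload:", offload_time(n, upload, readback, gpu), "s")
    print("cpu only:", cpu_time(n, cpu), "s")
```

With these invented numbers the offload loses even though the GPU itself is assumed to be eight times faster than the CPU, because as much data has to come back over the slow path as went out over the fast one.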
      
Research (Score:5, Insightful)

by dfj225 ( 587560 ) writes: on Sunday December 21, 2003 @01:36PM (#7779869) Homepage Journal

I've always wondered why certain research programs (like Folding@home or SETI@home) don't use this type of code. My GPU sees more free time than my CPU, plus it would probably get the work done faster. Also, imagine the speed increase of utilizing both the GPU and the CPU to their fullest potential. Now that's some fast folding!

- Re:Research (Score:5, Interesting)
  
  by BiggerIsBetter ( 682164 ) writes: on Sunday December 21, 2003 @02:16PM (#7780112)
  
  I (and presumably others) have asked some project leaders about this, but it seems to come down to testing and support of various cards. Also, remember that this is relatively unknown technology - Amiga blitting aside ;-) - you have to be pretty sure it's going to give accurate and consistent results before using it seriously. Find-A-Drug [find-a-drug.com] was my project of interest, and they have a Linux version too.
  
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
I've always wondered when this would happen... (Score:3, Interesting)

by malakai ( 136531 ) * writes: on Sunday December 21, 2003 @01:38PM (#7779884) Journal

But what I'm really looking forward to is a physics-specific processor that sits alongside the graphics processor and is responsible for collision detection.

The last few SIGGRAPHs had numerous approaches to detecting collisions between complex volumes in real time using only the GPU. With some minor tweaking, graphics manufacturers can make this 100x more efficient and easier to implement.

With the 'shader' languages now able to create and modify meshes procedurally, this is the best place to detect collisions (sending the mesh data back to your motherboard so that your local CPU can figure out what collided is not efficient).

- Re:I've always wondered when this would happen... (Score:4, Interesting)
  
  by Animats ( 122034 ) writes: on Sunday December 21, 2003 @01:54PM (#7779981) Homepage
  
  But what I'm really looking forward to is a physics-specific processor that sits alongside the graphics processor and is responsible for collision detection.
  It's been done. The Havok [havok.com] game physics system is available for the Playstation 2, and the physics is running in the vector processors, where most of the PS2's compute power resides.
  Collision detection isn't that CPU-intensive. (This may surprise people not familiar with the field. But it's true. If collision detection is using substantial CPU time, you're doing it wrong.) Correct collision resolution is where the time goes.
  Physics code works better with double-precision FPUs. You need both dynamic range and long mantissas to do it well. Some of the game consoles, and most of the GPUs, only have single-precision FPUs. It's possible to make physics code work in single precision, but fast-moving objects that cover considerable distance may have problems.
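
The single-precision problem for fast-moving, far-travelling objects can be seen in a few lines. This is a hedged Python sketch: the scenario (an object 100 km from the origin advancing 1 mm per step) is invented for illustration, and single precision is emulated by rounding through the standard `struct` module rather than by running on real GPU hardware.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def advance(position: float, step: float, n_steps: int, single: bool) -> float:
    """Add `step` to `position` n_steps times, optionally rounding to float32."""
    for _ in range(n_steps):
        position = position + step
        if single:
            position = to_f32(position)
    return position


if __name__ == "__main__":
    start = to_f32(100000.0)  # 100 km from the origin, in metres
    step = to_f32(0.001)      # 1 mm of travel per time step
    print("double:", advance(start, step, 1000, single=False))
    print("single:", advance(start, step, 1000, single=True))
```

Near 100000 the spacing between adjacent single-precision values is about 0.0078, so a 0.001 step rounds away entirely and the single-precision object never moves, while the double-precision one ends up roughly a metre further on.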
  
Nvidia Cg (Score:4, Informative)

by Popsikle ( 661384 ) writes: on Sunday December 21, 2003 @01:46PM (#7779931) Homepage

Nvidia has this already!
"About Cg The Cg Language Specification is a high-level C-like graphics programming language that was developed by NVIDIA in close collaboration with Microsoft Corporation. The Cg environment consists of two components: the Cg Toolkit including the NVIDIA Cg Compiler Beta 1.0 optimized for DirectX(R) and OpenGL(R); and the NVIDIA Cg Browser, a prototyping/visualization environment with a large library of Cg shaders. Developers also have access to user documentation and a range of training classes and online materials being developed for the Cg language."

http://www.nvidia.com/object/IO_20020612_7133.html

- Re:Nvidia Cg (Score:2)
  
  by dimator ( 71399 ) writes:
  
  Cg is the first thing I thought of when I saw this post... I need to read the project page, but it seems to me Cg would be the technology to learn, given its strong corporate backing and maturity [barnesandnoble.com].
Interesting (Score:2)

by Stonent1 ( 594886 ) writes:

I've wondered about this very thing for a few years now. Good to see that it really was possible.
DivX (Score:2)

by iamacat ( 583406 ) writes:

Is there a prize for the first optimized encoder for some flavor of MPEG4? Imagine ripping a DVD in one hour. Hopefully ATI users on OSX are not left behind.
memory bandwidth is the key (Score:3, Insightful)

by peter303 ( 12292 ) writes: on Sunday December 21, 2003 @03:11PM (#7780408)

Even though general purpose CPUs approach the flop rate of GPUs, you can't feed the memory fast enough for many data-intensive computations. A GPU may give you 12 or so bytes of data per cycle, whereas very few commodity CPU buses can do that.

GPU use for scientific programming. (Score:4, Interesting)

by kiniry ( 46244 ) writes: on Sunday December 21, 2003 @03:13PM (#7780426) Homepage

Researchers at Caltech and other institutions have been looking at this for about three years. See "Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid" by Bolz, Farmer, Grinspun and Schröder (SIGGRAPH 2003), for example. The paper, illustrations, and movies are available from Dr. Grinspun's homepage [caltech.edu]. The primary problem with the approach at the time this work was done was the limited bandwidth of texture-related operations in OpenGL, based upon improper assumptions in pipeline optimization.

More speed for the Terascale cluster? (Score:2, Interesting)

by Anonymous Coward writes:

Weren't Virginia Tech's G5 supercomputer nodes all equipped with standard ATI cards? If used right, there could be 1100 more processors to use...
distributed.net (Score:3, Interesting)

by terminal.dk ( 102718 ) writes: on Sunday December 21, 2003 @03:53PM (#7780773) Homepage

When will the new client be out for this platform?

I know my PC eats 20 Watts more of power when in 3D mode, but still, I want the faster agent :=)

Crypto (Score:3, Interesting)

by Effugas ( 2378 ) writes: on Sunday December 21, 2003 @04:08PM (#7780882) Homepage

We've talked a decent amount about doing crypto on GPUs. The fundamental issue is that such processors are massively optimized for operating on floating point numbers, and almost all crypto is integer based: lots of bitshifts, MODs, and XORs, only the latter of which this gear handles correctly. Even if the problem with getting data back off the card was solved, the card itself couldn't do the job.

Indeed, I only know of one crypto hack that uses floats -- being from DJB, it's predictably brilliant. Basically, it's easy to compute the floating point error from a given operation, but computationally hard to find an operation that yields a given error. So you can effectively sign (or at least MAC) arbitrary content. Nice!

--Dan

Imagine a Beowulf Cluster... no, seriously (Score:5, Interesting)

by billstewart ( 78916 ) writes: on Sunday December 21, 2003 @05:24PM (#7781441) Journal

There's a cluster of Sony Playstations [uiuc.edu] at UIUC (BBC) [bbc.co.uk] that's using the Emotion Engine to do numbercrunching and running Linux on the main processors to do communications and I/O. It's probably not strictly Beowulf, because it's using the Playstation version of Linux.

This cluster has 70 Playstations (one article said that they'd ordered 100, but only 70 are in the cluster... Obviously the others are being used for "research".)

How long until (Score:4, Funny)

by Lord Kano ( 13027 ) writes: on Sunday December 21, 2003 @10:39PM (#7783163) Homepage Journal

Someone ports a GPU Linux and some asshole loads 8 PCI cards into his machine and makes a beowulf cluster inside of one case?

Ray tracing with a GPU? (Score:3, Interesting)

by Angst Badger ( 8636 ) writes: on Monday December 22, 2003 @03:22AM (#7784352)

So I have to wonder how much POVray could be sped up -- if any -- by modifying it so that suitable calculations were run on the GPU, in parallel, while the CPU took care of the rest.

- Re:The future is the past (Score:5, Interesting)
  
  by Total_Wimp ( 564548 ) writes: on Sunday December 21, 2003 @01:49PM (#7779948)
  
  PCI-X can fix this data bus in other ways as well. Motherboards come with one AGP slot, but PCI-X can and will provide many expansion slots.
  
  Picture five high end GPUs on the motherboard eclipsing the single high-end cpu for a fraction of the price. Intel and AMD would be forced to cut the asking price of their products to compete. We could finally see some real four-way competition for "processors".
  
  TW
  
- Re:multi gpu? (Score:2)
  
  by BiggerIsBetter ( 682164 ) writes:
  
  Hmmm. What's the fastest PCI graphics card you can buy these days?
  - Re:multi gpu? (Score:2)
    
    by BiggerIsBetter ( 682164 ) writes:
    
    And to answer my own question... maybe the GeForce FX 5200 is it? Radeon 9000 seems to be available too.
