Databases

Why Don't Open Source Databases Use GPUs?

An anonymous reader writes "A recent paper from Georgia Tech (abstract, paper itself) describes a system that can run the complete TPC-H benchmark suite on an NVIDIA Titan card, at a 7x speedup over a commercial database running on a 32-core Amazon EC2 node, and a 68x speedup over a single-core Xeon. A previous story described an MIT project that achieved similar speedups. There has been a steady trickle of work on GPU-accelerated database systems for several years, but it doesn't seem like any code has made it into Open Source databases like MonetDB, MySQL, CouchDB, etc. Why not? Many queries that I write are simpler than TPC-H, so what's holding them back?"
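
To make the question concrete: the reason analytic queries map well to GPUs is that they are mostly scans, filters, and aggregates over columns, which parallelize trivially. Below is a minimal illustrative CUDA sketch, not taken from the Georgia Tech paper; the column names and the query it evaluates (SELECT SUM(price) FROM t WHERE qty < 24) are invented for the example.

    // Sketch of a GPU scan-filter-aggregate, the core pattern behind
    // TPC-H-style queries. Illustrative only; column names are invented.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void filtered_sum(const float *price, const int *qty,
                                 int n, float *result)
    {
        float local = 0.0f;
        // Grid-stride loop: each thread scans a strided slice of the column.
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
             i += blockDim.x * gridDim.x)
            if (qty[i] < 24)          // WHERE qty < 24
                local += price[i];    // SUM(price)
        // Combine per-thread partials (a real engine would tree-reduce).
        atomicAdd(result, local);
    }

    int main()
    {
        const int n = 1 << 20;
        float *price, *result;
        int *qty;
        cudaMallocManaged(&price,  n * sizeof(float));
        cudaMallocManaged(&qty,    n * sizeof(int));
        cudaMallocManaged(&result, sizeof(float));
        for (int i = 0; i < n; ++i) { price[i] = 1.0f; qty[i] = i % 50; }
        *result = 0.0f;
        filtered_sum<<<256, 256>>>(price, qty, n, result);
        cudaDeviceSynchronize();
        printf("sum = %.0f\n", *result);
        cudaFree(price); cudaFree(qty); cudaFree(result);
        return 0;
    }

Each column is touched exactly once, with no control flow beyond the predicate, which is why scan-heavy workloads can see the large speedups quoted above.
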
  • by AHuxley ( 892839 ) on Wednesday December 25, 2013 @11:10AM (#45781891) Journal
    The people with the skills have day jobs and want to enjoy time off with other projects.
    The people with the skills have no jobs and want to write the code, but the hardware is too expensive.
  • Risk aversion.

    by Anonymous Coward on Wednesday December 25, 2013 @11:13AM (#45781919)
    Because a lot of us have personal experience with how "reliable" GPU calculations are.

    A few screen "artifacts" tend to be less painful than DB "artifacts". Maybe things have changed, but it's not been that long since NVIDIA had a huge batch of video cards that were dying in all sorts of ways.

    As for AMD/ATI, I suspect you'd normally have to use some of their crappy software to do that GPU processing.
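
The standard engineering answer to the reliability worry in the comment above is to spot-check GPU results against a CPU reference before trusting them. Here is a minimal CUDA sketch of that idea; the kernel, the sampling stride, and the tolerance are all invented for illustration and are not from any real database.

    // Sketch: re-run a sparse sample of the GPU's output on the CPU and
    // count mismatches. Illustrative only.
    #include <cmath>
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void square(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = in[i] * in[i];
    }

    int main()
    {
        const int n = 1 << 16;
        float *in, *out;
        cudaMallocManaged(&in,  n * sizeof(float));
        cudaMallocManaged(&out, n * sizeof(float));
        for (int i = 0; i < n; ++i) in[i] = (float)i;

        square<<<(n + 255) / 256, 256>>>(in, out, n);
        cudaDeviceSynchronize();

        // Verify every 997th row against a CPU reference; an engine could
        // fall back to its CPU code path if any mismatches show up.
        int mismatches = 0;
        for (int i = 0; i < n; i += 997) {
            float ref = in[i] * in[i];
            if (fabsf(out[i] - ref) > 1e-3f * fmaxf(1.0f, ref))
                ++mismatches;
        }
        printf("%d mismatches in sampled rows\n", mismatches);
        cudaFree(in); cudaFree(out);
        return 0;
    }
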
  • by ron_ivi ( 607351 ) on Wednesday December 25, 2013 @01:00PM (#45782339)


    The biggest opportunity for GPUs in databases isn't "performance". As others have pointed out, for raw performance it's easier to just throw money at the problem.

    GPU-powered databases do show promise for performance per watt:

    http://hgpu.org/?p=8219

    However, energy efficiency is not enough; energy proportionality is needed. The objective of this work is to create an entire platform that allows execution of GPU operators in an energy-proportional DBMS, WattBD, and also a GPU sort operator to prove that this new platform works. A different approach to integrating the GPU into the database has been used. Existing solutions to this problem aim to optimize specific areas of the DBMS, or provide extensions to the SQL language to specify GPU operation, and thus lack the flexibility to optimize all database operations or to make GPU execution transparent to the user. This framework differs from existing strategies by manipulating the creation and insertion of GPU operators directly into the query plan tree, allowing a more flexible and transparent way to integrate new GPU-enabled operators. Results show that it was possible to easily develop a GPU sort operator with this framework. We believe that this framework will allow a new approach to integrating GPUs into existing databases, and therefore achieve more energy-efficient DBMSs.

    Also note that you can write PostgreSQL stored procedures in OpenCL, which may be useful if you're doing something compute-intensive like storing images in a database and running OCR or facial recognition on them: http://wiki.postgresql.org/images/6/65/Pgopencl.pdf

    Introducing PgOpenCL - A New PostgreSQL Procedural Language Unlocking the Power of the GPU
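
The plan-tree integration the WattBD abstract above describes can be pictured as a sort operator that is just another node in the query plan, interchangeable with its CPU counterpart. The CUDA/Thrust sketch below is illustrative only: the PlanNode interface and every name in it are invented for this example, not WattBD's actual API.

    // Sketch: a GPU sort operator as an ordinary query-plan node.
    // Illustrative only; the node interface is invented.
    #include <cstdio>
    #include <vector>
    #include <thrust/copy.h>
    #include <thrust/device_vector.h>
    #include <thrust/sort.h>

    // Toy plan-node interface: every operator consumes and produces rows.
    struct PlanNode {
        virtual std::vector<int> execute(std::vector<int> rows) = 0;
        virtual ~PlanNode() {}
    };

    // GPU sort operator: copy to the device, sort there, copy back.
    struct GpuSortNode : PlanNode {
        std::vector<int> execute(std::vector<int> rows) override {
            thrust::device_vector<int> d(rows.begin(), rows.end());
            thrust::sort(d.begin(), d.end());
            thrust::copy(d.begin(), d.end(), rows.begin());
            return rows;
        }
    };

    int main()
    {
        GpuSortNode sortOp;   // a planner could pick this over a CPU sort
        std::vector<int> col = {42, 7, 19, 3, 88, 1};
        col = sortOp.execute(col);
        for (int v : col) printf("%d ", v);
        printf("\n");
        return 0;
    }

Because the host-device copies are hidden behind the same interface a CPU sort would use, the planner can choose the GPU node per query, which is what makes this style of integration transparent to the user.
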

  • by TheRaven64 ( 641858 ) on Wednesday December 25, 2013 @02:51PM (#45782849) Journal

    No, I'm not shouting at you, I'm shouting at the fucking bogus pseudo-academics who wanted to bullshit with micro-optimisation rather than making actual advancements in the field of databases.

    Any paper that does X on a GPU generally fits into this category. It's not science to run an existing algorithm on an existing Turing-complete processor; at most it's engineering. But it's a fairly easy way to churn out papers. Doing X 'in the cloud' or 'with big data' follows the same strategy. It's usually safe to ignore them.
