
Brian Aker On the Future of Databases

blackbearnh recommends an interview with MySQL Director of Technology Brian Aker that O'Reilly Media is running. Aker talks about the merger of MySQL with Sun, the challenges of designing databases for a SOA world, and what the next decade will bring as far as changes to traditional database architecture. Audio is also available. From the interview: "I think there's two things right now that are pushing the changes... The first thing that's going to push the basic old OLTP transactional database world, which... really hasn't [changed] in some time now -- is really a change in the number of cores and the move to solid state disks because a lot of the... concept around database is the idea that you don't have access to enough memory. Your disk is slow, can't do random reads very well, and you maybe have one, maybe eight processors but... you look at some of the upper-end hardware and the many-core stuff,... and you're almost looking at kind of an array of processing that you're doing; you've got access to so many processors. And well the whole story of trying to optimize... around the problem of random I/O being expensive, well that's not that big of a deal when you actually have solid state disks. So that's one whole area I think that will... cause a rethinking in... the standard Jim Gray relational database design."
This discussion has been archived. No new comments can be posted.

  • Re:Well (Score:3, Informative)

    by emurphy42 ( 631808 ) on Tuesday June 03, 2008 @07:25PM (#23645391)
  • by atomic777 ( 860023 ) on Tuesday June 03, 2008 @07:33PM (#23645481)
    I recently blogged on this, but essentially, as long as your average PHP developer thinks of MySQL as a glorified flat file system to place their serialized PHP objects, an always-available, pay-as-you-go distributed database is going to revolutionize application development in the coming years. For those that want to keep control of their data, HBase is coming along quite nicely.
  • by Johnno74 ( 252399 ) on Tuesday June 03, 2008 @07:58PM (#23645743)
    Umm, I'd say you have it wrong - "traditional" databases have many different lock granularities, such as table locks, page locks and row locks. SQL Server and Oracle certainly do this.

    MySQL only does table locks, which are much simpler and much faster for light workloads, but as I'm sure you can imagine, when you have many CPUs trying to update the table at once, in the end each thread has to wait its turn to grab the lock and perform its updates sequentially.

    In SQL Server, Oracle, or any other "enterprisey" db, multiple threads can update the same table at exactly the same time, as long as it's not the same row.

    Stuff like this is exactly why people who use MS-SQL and Oracle look down their noses at people who use MySQL and claim it is capable of playing with the big boys.

    Once again, despite what MySQL are saying, there is nothing innovative here. All this stuff has existed in the mainstream database engines for many, many years and they are still playing catch-up.
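
To make the lock-granularity point concrete, here is a minimal Python sketch. It is an illustration only, not how any real engine implements locking, and the names `TableLockedTable`, `RowLockedTable` and `hammer` are made up for this example. With one lock per table, every writer queues behind the same lock; with one lock per row, only writers to the same row contend.

```python
import threading
from collections import defaultdict


class TableLockedTable:
    """Toy table guarded by a single lock: all writers take turns."""

    def __init__(self):
        self._lock = threading.Lock()
        self.rows = defaultdict(int)

    def increment(self, row_id):
        with self._lock:  # every update, to any row, waits here
            self.rows[row_id] += 1


class RowLockedTable:
    """Toy table with one lock per row: only same-row writers contend."""

    def __init__(self, row_ids):
        self._locks = {r: threading.Lock() for r in row_ids}
        self.rows = {r: 0 for r in row_ids}

    def increment(self, row_id):
        with self._locks[row_id]:  # writers to other rows are not blocked
            self.rows[row_id] += 1


def hammer(table, row_ids, updates_per_thread=1000):
    """Run one writer thread per row and return the final row counts."""

    def worker(row_id):
        for _ in range(updates_per_thread):
            table.increment(row_id)

    threads = [threading.Thread(target=worker, args=(r,)) for r in row_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(table.rows)
```

Both variants produce the same final counts; the difference is only in who has to wait for whom. Because of CPython's global interpreter lock this sketch shows the locking structure rather than a real speed-up, so it should not be read as a benchmark.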
  • Re:Well (Score:3, Informative)

    by Bogtha ( 906264 ) on Tuesday June 03, 2008 @08:21PM (#23645965)

    Come on, he's talking about the future of databases. He was just trying to set the mood by doing his best Kirk impression.

  • Re:Too small (Score:2, Informative)

    by Anonymous Coward on Tuesday June 03, 2008 @08:24PM (#23645985)
    Except that Bill Gates never said that. Bluefoxlucid did.

    I'm sure he'll feel lots worse. While Gates gets hounded for something he never said, at least he has mountains and mountains of cash to console him.
  • by XanC ( 644172 ) on Tuesday June 03, 2008 @08:24PM (#23645987)
    What you say is true for MyISAM tables, but MySQL's InnoDB tables fully support row-level locking. And I believe their BDB tables support page-level locking.
  • by anarxia ( 651289 ) on Tuesday June 03, 2008 @08:34PM (#23646073)
    It is called MVCC (multi-version concurrency control). Other databases such as Oracle and PostgreSQL also use this approach. MVCC has its pros and cons. It allows for higher concurrency, but it might require extra copies of data, and that translates to more memory and disk space. On a "weak" server it might actually end up being less concurrent.
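
The trade-off described above can be sketched in a few lines of Python. This is a toy model under simplifying assumptions (single process, no transactions, no garbage collection of old versions), and `MVCCStore` is a hypothetical name, not any database's API. Writers append a new version instead of overwriting, readers pick the newest version visible at their snapshot, and the extra versions are the space cost.

```python
import itertools


class MVCCStore:
    """Toy multi-version store: writers append versions, readers use snapshots."""

    def __init__(self):
        self._clock = itertools.count(1)  # increasing commit timestamps
        self._versions = {}               # key -> [(commit_ts, value), ...], oldest first
        self._last_commit = 0

    def write(self, key, value):
        """Commit a new version instead of overwriting the old one."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def snapshot(self):
        """Return a timestamp; reads made with it ignore later commits."""
        return self._last_commit

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None

    def version_count(self):
        """Total stored versions, i.e. the extra space MVCC is paying for."""
        return sum(len(v) for v in self._versions.values())
```

A reader that took its snapshot before a later write keeps seeing the old value without blocking the writer, which is where the concurrency comes from; `version_count()` growing with every write is the memory and disk cost.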
  • Re:Admittedly.... (Score:3, Informative)

    by CastrTroy ( 595695 ) on Tuesday June 03, 2008 @08:54PM (#23646227)
    I'm going along with the other two guys. I can't see what application would need more than 1000 columns in a single table. What really gets me is the MS SQL Server 2000 maximum of 8 KB (SQL Server 7 was 2 KB) in a single row. Now there's a limitation that's badly designed. Oh, and you can define a table with 15 Varchar(8000) fields, just don't try filling every field. 1000 columns I could do just fine with (SQL Server supports 2048?) but the big killer is that you can't even use 2000 columns, because if you did, you would run out of space in the row, unless the average field size was under 4 bytes.
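
The arithmetic behind that last point is easy to check. The sketch below takes the 8 KB row figure from the comment at face value and ignores any per-row or per-column overhead, so the real budget would be somewhat tighter; the function names are made up for the example.

```python
ROW_LIMIT_BYTES = 8 * 1024  # the 8 KB per-row figure quoted in the comment


def avg_bytes_per_column(columns, row_limit=ROW_LIMIT_BYTES):
    """Average bytes available per column if the row is split evenly."""
    return row_limit / columns


def fits_in_row(column_sizes, row_limit=ROW_LIMIT_BYTES):
    """True if the filled column sizes sum to at most the row limit."""
    return sum(column_sizes) <= row_limit
```

`avg_bytes_per_column(2000)` comes out at roughly 4 bytes, and `fits_in_row([8000] * 15)` is false because fifteen full 8000-byte fields need 120,000 bytes.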
  • Re:Admittedly.... (Score:2, Informative)

    by allanw ( 842185 ) on Tuesday June 03, 2008 @10:07PM (#23646721)
    If you find you have to create thousands of columns response_0001, response_0002, ... response_4096, then you should probably realize that there's something wrong with your schema. It's just basic database normalization. (Though I suppose you might have a reason for doing it this way. But it sounds incredibly horrible.)
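
As a rough sketch of the normalized alternative, the columns response_0001 ... response_4096 become rows keyed by question number. SQLite (via Python's standard `sqlite3` module) is used here purely as a convenient stand-in, and the table and column names are assumptions made for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one row per (respondent, question)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE response (
            respondent_id INTEGER NOT NULL,
            question_no   INTEGER NOT NULL,
            answer        TEXT,
            PRIMARY KEY (respondent_id, question_no)
        )
        """
    )
    conn.executemany(
        "INSERT INTO response (respondent_id, question_no, answer) VALUES (?, ?, ?)",
        [(1, 1, "yes"), (1, 2, "no"), (2, 1, "yes"), (2, 4096, "maybe")],
    )
    return conn


def answers_per_question(conn):
    """Count how many respondents answered each question."""
    return conn.execute(
        "SELECT question_no, COUNT(*) FROM response "
        "GROUP BY question_no ORDER BY question_no"
    ).fetchall()
```

Adding question 4097 is then an INSERT rather than an ALTER TABLE, and unanswered questions take no space at all.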
  • Re:This IS news! (Score:3, Informative)

    by gfody ( 514448 ) on Tuesday June 03, 2008 @11:36PM (#23647271)
    I'm sure he meant the 'implementation of'.
    Relational algebra has nothing to do with random IO. However, building a relational database system has everything to do with random IO, because it is by and large the worst bottleneck in the system. The best performing RDBMSs are the ones completely designed around avoiding random IO. That's why TFA says a new RDBMS could be created from scratch and blow the existing players out of the water in the new SSD world.
  • Re:Dear Slashot (Score:3, Informative)

    by Bacon Bits ( 926911 ) on Wednesday June 04, 2008 @01:41AM (#23647947)
    Simply put, MyISAM isn't meant for data sets that large. It's meant to be fast with less regard for data integrity than the alternatives. That's by design. When you increase the max size of the table, you change the bit length of the addresses used for indexing the table and such. Increasing the bit length slows the system, particularly when the bit length exceeds the bit size of the processor. I'd argue more that the default engine should be InnoDB rather than MyISAM, and that internal tables should also be run as InnoDB now.

    Additionally, I'd argue that comparing a MyISAM table to SQL Server (or any other transactional, ACID-compliant RDBMS) is not a fair comparison. If all you care about is speed, then you can get even more if you go with an embedded database like Firebird or SQLite. Or try a flat file. Those are terrifically fast if you do them right. Why do you think file systems are so much more efficient than RDBMS's?

    Honestly, there are better ways to optimize most databases which don't involve sacrificing data integrity to do so. Examine your indices and views. Maybe your DB isn't normalized properly. IMO, sacrificing OLTP integrity to satisfy OLAP speed is like taking supports from the first floor to finish the roof.
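
The link between maximum table size and address bit length mentioned at the start of this comment is plain powers-of-two arithmetic. The sketch below is generic and does not model MyISAM's actual on-disk format; the function names are made up for illustration.

```python
def max_addressable_bytes(pointer_bytes):
    """Bytes addressable by a byte-offset pointer of the given width."""
    return 2 ** (8 * pointer_bytes)


def pointer_bytes_needed(table_bytes):
    """Smallest whole number of pointer bytes that can address table_bytes."""
    width = 1
    while max_addressable_bytes(width) < table_bytes:
        width += 1
    return width
```

A 4-byte pointer covers 2^32 bytes (4 GiB); a table even one byte larger needs 5-byte pointers, which no longer fit in a 32-bit register. That is the slowdown being described.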
  • Re:Admittedly.... (Score:2, Informative)

    by spatialguy ( 951355 ) on Wednesday June 04, 2008 @02:24AM (#23648123)
    A database table is not an Excel sheet with fewer limits! Have some local wizard help you in the design. And if you use PostgreSQL or any other full-featured database, you can use views to retrieve your data in the format you need for analysis.
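
A small sketch of the view idea: the data stays in a narrow, normalized table and a view presents it spreadsheet-style for analysis. SQLite is used here only because it ships with Python; the same `CREATE VIEW` approach applies to PostgreSQL. The table, view and column names are invented for the example.

```python
import sqlite3


def build_db():
    """Create a narrow measurement table plus a wide, per-site view over it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE measurement (
            site   TEXT NOT NULL,
            metric TEXT NOT NULL,
            value  REAL NOT NULL
        );
        INSERT INTO measurement VALUES
            ('north', 'temp', 11.5), ('north', 'rain', 3.0),
            ('south', 'temp', 17.0), ('south', 'rain', 0.5);

        -- The view pivots metrics into columns, one row per site.
        CREATE VIEW site_summary AS
        SELECT site,
               MAX(CASE WHEN metric = 'temp' THEN value END) AS temp,
               MAX(CASE WHEN metric = 'rain' THEN value END) AS rain
        FROM measurement
        GROUP BY site;
        """
    )
    return conn


def site_rows(conn):
    """Read the wide format back through the view."""
    return conn.execute(
        "SELECT site, temp, rain FROM site_summary ORDER BY site"
    ).fetchall()
```

The analysis tool queries `site_summary` as if it were a wide table, while the stored schema stays small and normalized.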
  • by tuomoks ( 246421 ) on Wednesday June 04, 2008 @02:51AM (#23648253)
    Had to comment, the reference to Jim Gray was a little weird? I was lucky to work with Jim and we often talked about technology changes and enhancements. Now, see what, for example, Tandem called a "massively parallel" database! The system was already built to allow several CPUs and several nodes to interconnect transparently; Jim saw how that could be used and how the database optimizer really could work. Of course making direct access to any disc faster will help, especially now when SSDs are getting bigger, but the theory is nothing new. Even SQLite can show you that. And think of systems where you have a 32, 128 or even 256 bit flat, memory-speed but storage-backed world - that will change the picture, or? But be careful: we have already gone through many iterations of making part of the system faster, such as fixed head disks and even indexing in solid state, and found that it may (will) create other problems, not always seen upfront (except by JG!)