Are Relational Databases Obsolete?

Posted by kdawson on Thu Sep 06, 2007 12:27 PM
from the long-in-the-tooth dept.
jpkunst sends us to Computerworld for a look at Michael Stonebraker's opinion that RDBMSs "should be considered legacy technology." Computerworld adds some background and analysis to Stonebraker's comments, which appear in a new blog, The Database Column. Stonebraker co-created the Ingres and Postgres technology while a researcher at UC Berkeley, starting in the early 1970s. He predicts that "column stores will take over the [data] warehouse market over time, completely displacing row stores."

This discussion has been archived. No new comments can be posted.
  • by KingSkippus (799657) * on Thursday September 06 2007, @12:28PM (#20495819) Homepage Journal

    Okay, at the risk of sounding stupid...

    Since when are a column store database and a relational database mutually exclusive concepts? I thought that both column store and row store (i.e. traditional) databases were just different means of storing data, and had nothing to do with whether a database was relational or not. I think the article misinterpreted what he said.

    Also, I don't think it's news that Michael Stonebraker (a great name, by the way), co-founder and CEO of a company that (surprise!) happens to develop column store database software, thinks that column store databases are going to be the Next Big Thing. Right or wrong, his opinion can't exactly be considered unbiased...

    • by XenoPhage (242134) on Thursday September 06 2007, @12:35PM (#20495911) Homepage
      Since when are a column store database and a relational database mutually exclusive concepts? I thought that both column store and row store (i.e. traditional) databases were just different means of storing data, and had nothing to do with whether a database was relational or not. I think the article misinterpreted what he said.

      Agreed. It definitely looks like a storage preference. Though column-based storage has definite benefits over row-based when it comes to store once, read many operations. Kinda like what you'd find in a data warehouse situation...

      Also, I don't think it's news that Michael Stonebraker (a great name, by the way), co-founder and CEO of a company that (surprise!) happens to develop column store database software, thinks that column store databases are going to be the Next Big Thing. Right or wrong, his opinion can't exactly be considered unbiased...

      Hrm.. You must be new here....
      • Column stores are great (better than a row store) if you're just reading tons of data, but they're much more costly than a row store if you're writing tons of data.

        Therefore, pick your method depending on your needs. Are you storing massive amounts of data? Column stores are probably not for you...Your application will run better on a row store, because writing to a row store is a simple matter of adding one more record to the file, whereas writing to a column store is often a matter of writing a record to many files...Obviously more costly.
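
        The write-cost argument above can be sketched as a toy model. This is only an illustration, assuming one file per table for a row store and one file per column for a column store; RowStore, ColumnStore and the file_writes counter are hypothetical names, not any real product's API.

```python
# Toy model: count how many separate "files" a single insert has to touch.
# RowStore and ColumnStore are hypothetical illustrations, not a real library.

class RowStore:
    def __init__(self, columns):
        self.columns = columns
        self.rows = []           # stands in for one file holding whole records
        self.file_writes = 0

    def insert(self, record):
        self.rows.append(tuple(record[c] for c in self.columns))
        self.file_writes += 1    # a single append to a single file


class ColumnStore:
    def __init__(self, columns):
        self.columns = columns
        self.data = {c: [] for c in columns}  # stands in for one file per column
        self.file_writes = 0

    def insert(self, record):
        for c in self.columns:
            self.data[c].append(record[c])
            self.file_writes += 1  # one append per column file


columns = ["id", "name", "city", "balance"]
record = {"id": 1, "name": "alice", "city": "Boston", "balance": 42.0}

row_store = RowStore(columns)
col_store = ColumnStore(columns)
row_store.insert(record)
col_store.insert(record)

print(row_store.file_writes)  # 1
print(col_store.file_writes)  # 4
```

        In this model the gap grows with the number of columns, which is the "writing a record to many files" cost described above.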

        On the other hand, are you dealing with a relatively static dataset, where you have far more reads than writes? Then a row store isn't the best bet, and you should try a column store. A query on a row store has to read entire rows, which means you'll often end up hitting fields you don't give a damn about while looking for the specific fields you want to return. With column stores, you can ignore any columns that aren't referenced in your query...Additionally, your data is homogeneous in a column store, so you avoid the overhead of dealing with different datatypes and can choose the best data compression by field rather than by data block.
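
        A minimal sketch of the read side, assuming an in-memory table of four made-up rows: the row layout touches every field to sum one column, the column layout touches only that column, and a homogeneous column compresses well with something as simple as run-length encoding. The function names and sample data are illustrative assumptions.

```python
# Same data in two layouts; names and sample values are made up for illustration.
COLUMNS = ("id", "name", "city", "balance")
rows = [
    (1, "alice", "Boston", 10.0),
    (2, "bob", "Boston", 20.0),
    (3, "carol", "Austin", 30.0),
    (4, "dave", "Austin", 40.0),
]

# Column layout: one list per field.
columns = {name: [r[i] for r in rows] for i, name in enumerate(COLUMNS)}


def sum_balance_row_layout(rows):
    """Sum one field, but count every field fetched along the way."""
    fields_read = 0
    total = 0.0
    for r in rows:
        fields_read += len(r)  # the whole row is fetched
        total += r[3]
    return total, fields_read


def sum_balance_column_layout(columns):
    """Sum one field by reading only that column."""
    values = columns["balance"]
    return sum(values), len(values)


def run_length_encode(values):
    """Collapse runs of equal neighbours into (value, count) pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return [tuple(pair) for pair in encoded]


row_total, row_fields = sum_balance_row_layout(rows)
col_total, col_fields = sum_balance_column_layout(columns)
city_rle = run_length_encode(columns["city"])

print(row_total, row_fields)  # 100.0 16
print(col_total, col_fields)  # 100.0 4
print(city_rle)               # [('Boston', 2), ('Austin', 2)]
```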

        Why do people insist that one size really does fit all?
        • by theGreater (596196) on Thursday September 06 2007, @12:47PM (#20496125) Homepage
          So it seems to me the -real- money is in integrating an RDBMS which, for usage purposes, is row-oriented; but which, for archival purposes, is column-oriented. This could either be a backup-type thing, or an aging-type thing. Quick, to the Pat(ent)mobile!

          -theGreater
          • by stoolpigeon (454276) * <bittercode@gmail> on Thursday September 06 2007, @12:51PM (#20496189) Homepage Journal
            Maybe, but I doubt it. The money is in the data warehouse market and the etl tools that move the data from the oltp environment to the warehouse environment. I think what the author points out is not that people are trying to use the same database to do both, but rather that they are trying to use the same product to do both. He says it would make more sense to use Oracle (for example) for oltp - and something else for the warehouse, rather than trying to get Oracle to do both well.
        • by KingSkippus (799657) * on Thursday September 06 2007, @01:00PM (#20496341) Homepage Journal

          Why do people insist that one size really does fit all?

          I went back and read the original article. To Michael Stonebraker's credit, the ComputerWorld article (and the submitter) grossly misrepresent what he said.

          He did not say that RDBMSes are "long in the tooth." He said that the technology underlying them hasn't changed since the 1970's, and that column stores are a better way to represent data in certain situations. In fact, the very name of his original column was "One Size Fits All - A Concept Whose Time Has Come and Gone."

    • by stoolpigeon (454276) * <bittercode@gmail> on Thursday September 06 2007, @12:37PM (#20495947) Homepage Journal
      You are exactly right and this is backed up by the home page for c-store [mit.edu]. It says: "C-Store is a read-optimized relational DBMS " - c-store is the open source project that apparently is the basis for Vertica - Stonebraker's commercial offering.
    • by Anonymous Coward on Thursday September 06 2007, @12:38PM (#20495971)
      Well I just turned my server on its side and now all my tables are storing in columns. I love new technology.
    • by -homb- (82455) on Thursday September 06 2007, @12:56PM (#20496265)
      I wish we could put this thing to rest once and for all. And I wish so-called "experts" in the field actually were.

      Rule of thumb:
      - you use row dbs for OLTP. They're great for writing.
      - you use column dbs for data mining. They're amazing for reading aggregates (average, max, complex queries...)

      The major problem with column dbs is the writing part. If you have to write one row at a time, you're screwed because it needs to take each column, read, insert into it and store. If you can write in batch, the whole process isn't much more expensive. So writing a single row could take 500ms, but writing 1000 rows will take 600ms.
      Once the data's in, column dbs are the way to go.
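
      The batching point can be sketched with a toy cost counter. It assumes, as described above, that every write call has to read, extend and store each column once, no matter how many rows it carries; BatchedColumnStore and column_rewrites are hypothetical names, and the counts are not meant to reproduce the millisecond figures quoted.

```python
# Toy cost model: each write call rewrites every column once,
# regardless of how many rows are in the batch.
# BatchedColumnStore is a hypothetical illustration, not a real library.

class BatchedColumnStore:
    def __init__(self, columns):
        self.data = {c: [] for c in columns}
        self.column_rewrites = 0

    def write(self, records):
        """Append a batch; each column is read, extended and stored once."""
        for name, values in self.data.items():
            values.extend(r[name] for r in records)
            self.column_rewrites += 1


records = [{"id": i, "amount": i * 1.5} for i in range(1000)]

# 1000 separate single-row writes.
one_at_a_time = BatchedColumnStore(["id", "amount"])
for r in records:
    one_at_a_time.write([r])

# One write carrying all 1000 rows.
batched = BatchedColumnStore(["id", "amount"])
batched.write(records)

print(one_at_a_time.column_rewrites)  # 2000
print(batched.column_rewrites)        # 2
```

      Both stores end up holding identical data; only the number of column rewrites differs.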
  • by winkydink (650484) * <sv.dude@gmail.com> on Thursday September 06 2007, @12:28PM (#20495821) Homepage Journal
    The name of his blog is The Database Column after all.
  • by DrinkDr.Pepper (620053) on Thursday September 06 2007, @12:33PM (#20495879)
    Relational databases aren't being obsoleted. Some schema design heuristics are.
  • dual-mode db? (Score:5, Interesting)

    by 192939495969798999 (58312) <info&devinmoore,com> on Thursday September 06 2007, @12:33PM (#20495889) Homepage Journal
    Is there a dual-mode db, that lets you create a row-based or column-based "table"? I imagine cross-mode queries would kill performance, but at least you could have a system front-loaded with row tables, where data comes in, and then archive this data over time into the column-based tables, so that reads were fast.
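
    A rough sketch of the idea in the question, not a description of any existing product: new records land in a row-oriented buffer, a periodic archive step moves them into column-oriented storage, and a query has to combine both sides. HybridTable and its methods are hypothetical names chosen for illustration.

```python
# Hypothetical dual-mode table: row-oriented for incoming writes,
# column-oriented for archived data.

class HybridTable:
    def __init__(self, columns):
        self.columns = list(columns)
        self.recent_rows = []                          # row side: cheap appends
        self.archive = {c: [] for c in self.columns}   # column side: cheap scans

    def insert(self, record):
        self.recent_rows.append(dict(record))

    def archive_recent(self):
        """Move all buffered rows into the column store in one batch."""
        for c in self.columns:
            self.archive[c].extend(r[c] for r in self.recent_rows)
        moved = len(self.recent_rows)
        self.recent_rows = []
        return moved

    def column_sum(self, name):
        """A cross-mode query: scan the archived column plus the row buffer."""
        return sum(self.archive[name]) + sum(r[name] for r in self.recent_rows)


table = HybridTable(["id", "amount"])
for i in range(5):
    table.insert({"id": i, "amount": 10})

moved = table.archive_recent()           # 5 rows move to the column side
table.insert({"id": 5, "amount": 7})     # a newer row stays on the row side
total = table.column_sum("amount")

print(moved)  # 5
print(total)  # 57
```

    The column_sum method shows where the cross-mode cost would come from: it has to read from both layouts to give a complete answer.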
  • well (Score:5, Informative)

    by stoolpigeon (454276) * <bittercode@gmail> on Thursday September 06 2007, @12:34PM (#20495891) Homepage Journal
    every article linked makes it clear that this is about warehousing as opposed to oltp. so is the technology dead? no - can it do everything? no
  • Rotate (Score:5, Funny)

    by Kozar_The_Malignant (738483) on Thursday September 06 2007, @12:35PM (#20495909)

    >"column stores will take over the [data] warehouse market over time, completely displacing row stores."

    Hmmmm. So if I rotate my Paradox or Excel table by 90 degrees, I have achieved database coolness? Who knew it was so easy.

      • Re:Rotate (Score:5, Insightful)

        by ben there... (946946) on Thursday September 06 2007, @01:12PM (#20496473) Journal

        Excel only handles 255 Columns.
        It should be noted that if you've designed a database (rather than an Excel abomination) with more than 255 columns, chances are, you're doing it wrong.
  • The guy... (Score:5, Interesting)

    by AKAImBatman (238306) <akaimbatman @ g m a i l . com> on Thursday September 06 2007, @12:38PM (#20495963) Homepage Journal
    ...is duping [slashdot.org] himself [slashdot.org] and thus Slashdot is duping the stories by extension.

    Stonebraker has been pushing the concept of column-oriented databases for quite some time now, trying to get someone, ANYONE, to listen that it's superior. While I think he has a point, I'm not sure if he really goes far enough. Our relational databases of today are heavily based on the ISAM files of yesteryear. Far too many products threw foreign keys on top of a collection of ISAMs and called it a day. Which is why we STILL have key integrity issues to this day.

    It would be nice if we could take a step back and re-engineer our databases with more modern technology in mind. e.g. Instead of passing around abstract id numbers, it would be nice if we had reference objects that abstracted programmers away from the temptation of manually managing identifiers. Data storage is another area that can be improved, with Object Databases (really just fancy relational databases with their own access methods) showing how it's possible to store something more complex than integers and varchars.

    The demands on our DBMSes are only going to grow. So there's something to be said for going back and reengineering things. If column-oriented databases are the answer, my opinion is that they're only PART of the answer. Take the redesign to its logical conclusion. Let's see databases that truly store any data, and enforce the integrity of their sets.
  • You've all heard of the IBM product called DB2, right? So what was DB1? Answer: IMS, which is a hierarchical database. They were a pain in the ass to use--PSBs and all--but they were/are faster than hell and I doubt any company is going to throw them out for any reason. Same goes for relational databases. They're going nowhere. Sure, we have room for more but nobody is going to displace the RDBMS anytime soon.
  • by littlefoo (704485) on Thursday September 06 2007, @12:39PM (#20495981)
    No. There, that was easy !

    It's like the packet of crisps that says "Is there a 20 pound note in here !!?" - the answer should always be 'No'.

    Except maybe for one person.

    sed -e 's/crisps/potato chips/' -e 's/pound/dollar/'
  • Obviously, he's biased. But more importantly, he just said that column-store databases are going to take over the WAREHOUSE market. That doesn't mean that row-store databases are going to become obsolete, because there will always be applications out there that do a substantial amount of writing as well as reading.

    In fact, the new wave of user-generated-content websites and webapps seems to me to indicate the exact opposite - row-store databases, with their usefulness in write-heavy applications, should if anything be becoming more and more necessary/useful on the web.

    So...chalk this one up to some grandstanding on the part of a guy who wants to put more money in his pockets...
  • Aha! (Score:5, Funny)

    by Stanistani (808333) on Thursday September 06 2007, @12:42PM (#20496029) Homepage Journal
    The next big thing in DBMS:
    turning your head sideways.
  • by dada21 (163177) <adam.dada@gmail.com> on Thursday September 06 2007, @12:42PM (#20496039) Homepage Journal
    In my IT business, a vast majority of our top tier clients (grossing over US$100 million annually) are still using antiquated software that is still using a relational database backend. While these companies are generally VERY efficient in terms of providing services or products to their market, their accounting, purchase orders and project management software is decades outdated. Many of the companies that maintain these packages have merely made the interface more current (but still 5+ years old), while the software underneath is still terribly outdated. I can't begin to tell you how often the words "FoxPro" and "MS SQL" come up and it ends up being a relational database "solution" or even worse.

    It is very frustrating because we do have programmers on staff that create third party plug-ins to these databases to try to make solutions that the OEM code doesn't. When you meet younger programmers, many of them are frustrated themselves to work on ancient solutions that have no hope of being upgraded, because these industries we work in are not in a rush to try anything new and shiny, but instead are happy with the status quo.

    I just bid a job a few months back that would cost $150,000 to upgrade their database infrastructure, and likely save the company $300,000+ annually in added efficiency, less downtime, and a more robust report system. Guess what they said? "We all think it is fine the way it is." That's money thrown out the window, employees who are frustrated (without knowing why), and forcing the company to lose efficiency by not being able to compete with newer companies that are utilizing newer technology to better their bottom line.

    Ugh.
  • by roman_mir (125474) on Thursday September 06 2007, @12:52PM (#20496199) Homepage
    Once someone shows that there is no longer a use for any relationship between data entries, then we'll be able to say that RDBMSs are obsolete. Actually both headlines (/. and the linked article) are mistaken about what Michael Stonebraker is saying. He is talking about read intensive applications mostly and he is talking about optimization of data for reading purposes. This does not mean that RDBMSs are obsolete for all uses, just that he sees a faster way to retrieve data for certain uses.
  • by sohp (22984) <[moc.oi] [ta] [notwens]> on Thursday September 06 2007, @01:06PM (#20496421) Homepage
    Along with Procedural Programming [slashdot.org], this could REVOLUTIONIZE the software industry!!