Is the Relational Database Doomed?

DB Guy writes "There's an article over on Read Write Web about what the future of relational databases looks like when faced with new challenges to its dominance from key/value stores, such as SimpleDB, CouchDB, Project Voldemort and BigTable. The conclusion suggests that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements."

  • Hey! (Score:4, Insightful)

    by MightyMartian ( 840721 ) on Friday February 13, 2009 @05:20PM (#26849453) Journal

    Hey, read my article! Just to make sure you do, I'll pull a Dvorak and put in some incredibly sensational headline about how RDBMs are dewmed!!!!!! BWAHAHA, feed my advertisers!!!!

    (Tune in next week, when I write about how C programming is going to become extinct in the light of fantastic new development tools like C# and Ruby on Rails!!!)

  • Re:Karma Whoring (Score:2, Insightful)

    by Anonymous Coward on Friday February 13, 2009 @05:26PM (#26849529)

    This isn't digg. Posting that doesn't guarantee you +5

  • Re:new record (Score:4, Insightful)

    by bFusion ( 1433853 ) on Friday February 13, 2009 @05:26PM (#26849547) Homepage
    Well the '?' means that there's a question. The summary gave the conclusion to that question.
  • Re:Hey! (Score:5, Insightful)

    by dkleinsc ( 563838 ) on Friday February 13, 2009 @05:27PM (#26849567) Homepage

    Especially when the claim is as ridiculous as this one.

    There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data. Which is what databases are supposed to do.

  • by thammoud ( 193905 ) on Friday February 13, 2009 @05:29PM (#26849587)

    Leave us RDBMS dinosaurs alone. String Name/Value pairs, that is a great innovation. In other news, Sun will be dropping all types from the Java object system and rely on the VOID type. Idiots.

  • by qbzzt ( 11136 ) on Friday February 13, 2009 @05:48PM (#26849859)

    In headlines, "?" implies that something is a serious question, whose answer is likely to be yes. One that makes it worth spending the time to read the article.

    Imagine the headline said "Does Obama Smoke Crack?" and the article had a bunch of stuff about the president, with a last paragraph saying: "There is absolutely no reason to think that President Obama has ever smoked crack."

  • Ridiculous (Score:4, Insightful)

    by Eravnrekaree ( 467752 ) on Friday February 13, 2009 @05:49PM (#26849877)

    Really, relational is the best way to take a data set and be able to access it in various ways. Many of the other concepts are indeed regressions and reintroduce problems a relational database solves. Relational allows you to display and view data in different ways and to apply the dataset in new ways that may not have been part of the original design of the application. Every so often we hear someone harp about some new database technology that reintroduces all of the problems of the past, but relational is still the best and most versatile way to store your data in a way that allows for query flexibility.

  • by Renegade Iconoclast ( 1415775 ) on Friday February 13, 2009 @05:51PM (#26849895)

    Turns out, there's something called a "skateboard." You can use it to travel as far as the Quickie Mart, with nothing but your feet to propel it.

    In conclusion, skateboards and automobiles aren't the same thing, so probably not.

  • Re:ah, stupid. (Score:4, Insightful)

    by poot_rootbeer ( 188613 ) on Friday February 13, 2009 @05:57PM (#26849993)

    If you really wanted to have a database just do key / store values, you could quite easily do that in any rdms.

    Sure, but it's not likely that a key/value store implemented within a general-purpose RDBMS can achieve the same raw performance as a system designed to do nothing but implement a key/value store -- nor the distributability, for that matter.

  • Re:Hey! (Score:5, Insightful)

    by Just Some Guy ( 3352 ) on Friday February 13, 2009 @06:04PM (#26850065) Homepage Journal

    There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data.

    Especially since so many databases really are inherently relational. The textbook example of 1-customer:n-invoices, 1-invoice:n-items plays out quite a bit in the workplace.
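
    The textbook shape mentioned above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, and sample rows are assumptions, not anything taken from the discussion.

```python
import sqlite3

# Hypothetical schema: 1 customer : n invoices, 1 invoice : n items.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE customer (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE invoice (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id)
    );
    CREATE TABLE item (
        id          INTEGER PRIMARY KEY,
        invoice_id  INTEGER NOT NULL REFERENCES invoice(id),
        description TEXT NOT NULL,
        price_cents INTEGER NOT NULL
    );
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO invoice (id, customer_id) VALUES (?, ?)",
    [(10, 1), (11, 1)],
)
conn.executemany(
    "INSERT INTO item (invoice_id, description, price_cents) VALUES (?, ?, ?)",
    [(10, "widget", 500), (10, "gadget", 250), (11, "widget", 500)],
)


def invoice_totals(conn, customer_id):
    """Return (invoice_id, total_cents) for each of a customer's invoices."""
    return conn.execute(
        """
        SELECT invoice.id, SUM(item.price_cents)
        FROM invoice
        JOIN item ON item.invoice_id = invoice.id
        WHERE invoice.customer_id = ?
        GROUP BY invoice.id
        ORDER BY invoice.id
        """,
        (customer_id,),
    ).fetchall()


totals = invoice_totals(conn, 1)
```

    The foreign keys are what turn "an invoice for a customer that does not exist" into an error raised by the database rather than a bug found later in a report.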

  • by jernejk ( 984031 ) on Friday February 13, 2009 @06:12PM (#26850187)
    From the article: "For example, a relatively simple SELECT statement could have hundreds of potential query execution paths, which the optimizer would evaluate at run time. All of this is hidden to us as users, but under the cover, RDBMS determines the "execution plan" that best answers our requests by using things like cost-based algorithms."

    So, you have no idea how optimizers work and how you can access tuning information, and you'd like to tell us RDBMSs are bad? Get off my lawn! (yay, I'm getting old)
  • by poot_rootbeer ( 188613 ) on Friday February 13, 2009 @06:19PM (#26850271)

    they think Nissan makes the Civic!

    This lack of data integrity could have been prevented if they had used a relational database...

  • by sl0ppy ( 454532 ) on Friday February 13, 2009 @06:21PM (#26850285)

    The relational database is not going anywhere and nothing in that article is based on any firm understanding of managing data.

    no, the relational database is not going anywhere, you are correct. but, that does not mean that there aren't instances where a non-relational database, with the addition of map/reduce, isn't extremely useful.

    non-relational databases have been around for decades, and are in use for quite a number of applications involving rapid development and storage of very large records. couple this with map/reduce, and you have the ability to scale quickly with very large datasets.

    scaling quickly is a very difficult problem to solve with an RDBMS - you either need to continue to throw more hardware at the problem, to the point of diminishing returns, or re-architect your data at the cost of possible significant downtime, while still attempting to serve up the data in a timely manner. i've been deep in the bowels of oracle RAC, fighting to get just 5% more speed out of a query over a billion rows and realizing that i have to start over with a new schema, just to squeeze more data out. compare that to simply adding another machine and letting the map functionality run across one more cpu before returning it for the reduce.
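
    As a rough sketch of the map/reduce shape described above: each shard is processed independently, and only the small partial results are combined. Threads stand in for separate machines here, and the shard contents and function names are made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Each inner list stands in for the records held on one machine (a "shard").
shards = [
    ["get", "put", "get"],
    ["put", "delete"],
    ["get"],
]


def map_shard(records):
    """Map step: count operations locally, touching only this shard's data."""
    return Counter(records)


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def map_reduce(shards):
    # One worker per shard; adding a shard adds a worker, not a new schema.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(map_shard, shards))
    return reduce(merge_counts, partials, Counter())


result = map_reduce(shards)
```

    This only works so neatly because the reduce step is associative; a query that needs a join across shards does not decompose this way.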

    Is the notion of a "join" obsolete? No, but it is typically impractical in a high volume system. You would probably use denormalization as a strategy.

    once again, correct, but having to denormalize to a snowflake or a star isn't always the best solution. you're taking the best parts of the relational database model, and throwing them out - normalization, referential integrity, just to squeeze more out of something that may not be the best tool for the job.

    do you hammer with a wrench? i have before, and i managed to hurt my thumb.

  • Re:WTF? (Score:3, Insightful)

    by Lord Ender ( 156273 ) on Friday February 13, 2009 @06:32PM (#26850401) Homepage

    If "key/value" databases do become more popular, they certainly might eat into relational database mindshare. 90% of web applications use RDBMSs merely as persistent data storage--the fact that they are "relational" doesn't matter at all; the fact that a separate SQL language is needed to get the data (rather than using language-native data structures as an interface) is even a negative for RDBMSs.

    As a web app developer, I'm excited that something other than SQL is getting attention. RDBMSs won't go away because they have properties data miners, for example, need. But they aren't ideal for the simple persistent data stores most apps call for.

  • by digitig ( 1056110 ) on Friday February 13, 2009 @06:41PM (#26850485)

    In headlines, "?" implies that something is a sensationalized question, whose answer is "almost certainly, no".

    Fixed that for ya.

  • by DragonWriter ( 970822 ) on Friday February 13, 2009 @07:02PM (#26850711)

    So we can establish that a SQL relational database can do *everything* a simpler system can do.

    In terms of expressive power, sure, but no one is arguing that distributed key/value stores are going to gain against RDBMS's because they have superior expressive power. What is being argued is that they will do so because they have superior scalability and distribution properties, and that in many real-world applications those are more important than having the full expressive power of relational algebra. Particularly as you get ones that can provide ACID guarantees, that becomes a compelling selling point in many applications where RDBMS's would otherwise be used simply because they are the only available tool, but where distributed key/value stores are a better tool.

  • by lgw ( 121541 ) on Friday February 13, 2009 @07:35PM (#26851031) Journal

    What do you mean by "information, not just data"? It seems like you have specific, personal definitions of those words that others might not share.

    If you make sorting the responsibility of the client, what do you do with large result sets? You can't sort chunked data client-side, as you have to sort before chunking. There should be *some* answer for result sets that don't fit in memory (client or server). I'd be happy with only being able to get results in a certain order if I've already built an index according to that ordering criteria, or something equally elaborate, but what's an index in your scheme?
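
    One way to picture "results in a certain order only if an index already exists for it" is index-ordered paging, sketched below with sqlite3. The table, index, and column names are assumptions, and the sketch assumes non-negative scores and ids so that -1 can serve as a starting cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, score INTEGER NOT NULL)")
# The index fixes the one ordering the server is prepared to stream cheaply.
conn.execute("CREATE INDEX result_by_score ON result (score, id)")
conn.executemany(
    "INSERT INTO result (id, score) VALUES (?, ?)",
    [(i, (i * 7) % 10) for i in range(1, 26)],
)


def iter_sorted(conn, chunk_size):
    """Yield (score, id) rows in index order, one bounded chunk at a time."""
    last_score, last_id = -1, -1  # assumes scores and ids are non-negative
    while True:
        chunk = conn.execute(
            """
            SELECT score, id FROM result
            WHERE score > ? OR (score = ? AND id > ?)
            ORDER BY score, id
            LIMIT ?
            """,
            (last_score, last_score, last_id, chunk_size),
        ).fetchall()
        if not chunk:
            return
        yield from chunk
        # Resume the next chunk strictly after the last row already seen.
        last_score, last_id = chunk[-1]


rows = list(iter_sorted(conn, chunk_size=4))
```

    Neither side ever holds more than one chunk, but the ordering is limited to what the index was built for, which is the trade-off the comment is asking about.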

  • Re:Ridiculous (Score:2, Insightful)

    by Grapedrink ( 1298113 ) on Friday February 13, 2009 @07:47PM (#26851201)

    I agree with you on a lot of points, particularly people coming up with stupid solutions and creating new problems, but how is the rest of this insightful? Sure, relational is a good general fit for databases, but it sounds like you are saying the fact that you can query and modify it using something like SQL in most implementations makes it great?

    Exactly how is that easier than some other ways, such as building an object database? Can't you just write a few lines of code that are far more expressive than any SQL ever could be in a language like Common Lisp, Smalltalk, Python, Ruby, etc? Isn't that more accommodating than a relational model which limits your options due to performance vs. flexibility vs. integrity vs. extensibility vs. scalability? How does SQL give you more ways to manipulate things than a map, collect, slice, reduce, anonymous function/lambda, etc?

    I use both relational and object databases (preference to object dbs in all honesty). For an object database, my process both in use and development is to write and modify like it sounds, objects. Instances of objects in those classes are automatically stored for me and even in most implementations, class level data as well. I simply write my code and trudge along and do not worry about some ridiculous ORM. If I need transactions, I have them at the object level which I would want anyway even with a relational DB.

    If I need a query, it is done in a well-known language that I used to write the application. I can of course see if there was no application, it might be annoying to do this and relational can make some of that easier, but that is rarely the case. Further, I don't hit as many bumps where I need to denormalize my data to do reporting or data warehousing. I simply once again write code as normal to get what I want.

    A great example is to try storing an organizational hierarchy in a database. Query it for basic info such as a list of a manager and all subordinates and superiors. Now try to ask it for the full path between employees. Keep asking it questions about the hierarchy. In just about every relational db it is a fail. Oracle for instance even realized things like this and added "Connect By." Storing the data itself is a nightmare and you end up needing something like nested sets, self joining queries, cursors (never), handing it off to an application (aka relational failure), or materialized path.
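
    For reference, here is what the "manager and all subordinates" question looks like as a self-joining recursive query, using the recursive common table expressions that SQLite supports. The employee table and its rows are invented for the sketch, and it says nothing about how well such queries perform on a large hierarchy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "manager_id INTEGER REFERENCES employee(id))"
)
conn.executemany(
    "INSERT INTO employee (id, name, manager_id) VALUES (?, ?, ?)",
    [
        (1, "ceo", None),
        (2, "vp", 1),
        (3, "lead", 2),
        (4, "dev", 3),
        (5, "cfo", 1),
    ],
)


def subordinates(conn, manager_id):
    """Return (name, path) for everyone below manager_id in the hierarchy."""
    return conn.execute(
        """
        WITH RECURSIVE chain(id, name, path) AS (
            SELECT id, name, name FROM employee WHERE id = ?
            UNION ALL
            SELECT e.id, e.name, chain.path || '/' || e.name
            FROM employee AS e
            JOIN chain ON e.manager_id = chain.id
        )
        SELECT name, path FROM chain WHERE id <> ? ORDER BY path
        """,
        (manager_id, manager_id),
    ).fetchall()


reports = subordinates(conn, 2)
```

    The path column is built up as the recursion descends, which is essentially the materialized-path idea computed on demand instead of stored.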

    You run into other similar problems where you see hackish solutions in the relational world like table inheritance. Why have it if a relational database is so good? It is there because relational completely fails here, just like object databases fail elsewhere. There is no ideal solution, and for general cases both work great in my experience, even giving an edge for web applications to object dbs.

    There are so many areas where either the relational model itself, or SQL fails. If you have not hit them, then you have not used relational databases as much more than a glorified spreadsheet. The amount of time I spend tuning my queries in a relational db is ridiculous, even for relatively simple data. Hints, denormalization, columns as rows, cursors, triggers, user defined functions, and other such devices are all crutches for relational dbs. Of course, some of those are also caused by bad devs, but it need not be that hard in the first place.

    Anyway, I am not trying to slam the relational model. Rather, I think you are wrong to say it's the best and most flexible. Like all things, it depends what you are doing, and in my own experience object databases have been far easier to work with and maintain. I must save months of work every time I use one, but general ignorance often forces me to use either object or relational. If people better understood the strengths of each and paid more attention to each specific task rather than marketing, we would all be happier. It's sad that complaints about tools for example are even valid points. If you market the hell out of something and it just becomes the standard for whatever reason, then of course it is going to win in areas like that. You would think with all the anti-Microsoft rhetoric around here, people would get it.

    For now, I'll continue to use both and enjoy them for different reasons.

  • by plopez ( 54068 ) on Friday February 13, 2009 @07:55PM (#26851293) Journal

    There really isn't a true implementation of the relational model as per Codd and Date.

    Also, SQL is a nightmare. A badly designed programming language which is not quite functional and not quite procedural and so needs a bunch of hacks to work properly. And then there is the issue of NULLS. And the fact that you can end up with ugly bag operations and path dependencies in SQL.
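
    The NULL and bag complaints are easy to reproduce. The sketch below uses sqlite3 with a throwaway table t (a made-up name) to show three-valued logic swallowing rows and a plain SELECT keeping duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (1,), (2,)])

# NULL = NULL is not true; it evaluates to NULL (None on the Python side).
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# For x = 1, "1 NOT IN (2, NULL)" is NULL rather than true, so the row is
# filtered out; for x = 2 the test is false. No rows come back at all.
not_in_rows = conn.execute(
    "SELECT x FROM t WHERE x NOT IN (2, NULL)"
).fetchall()

# Bag semantics: a plain SELECT keeps duplicates, unlike a relational set.
bag = conn.execute("SELECT x FROM t ORDER BY x").fetchall()
as_set = conn.execute("SELECT DISTINCT x FROM t ORDER BY x").fetchall()
```

    A reader expecting set semantics and two-valued logic would guess that the NOT IN query returns the two rows where x is 1.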

    And just to start yet another flame war (I know, I just know someone is going to mod me as a troll today) key/value is just another way of saying "network database".

    And another thing which I will probably get hammered for, if you normalize a DB properly you will get your objects almost for free. And vice versa. Where I see people having problems is that they are either:

    1) lazy about defining and understanding their data
    2) or likewise for their objects
    3) or both.

    If you do it properly you will get a nice set of multidimensional objects and fact/attribute tables which are orthogonal and lean. Easy to understand, search, join, build, compose, decompose, signal and track.

    As opposed to a snarled up hacked together, overloaded, over inherited nightmare with hidden dependencies which I have seen too many times.

    OK, you can slam me now.

  • by zenlunatics ( 516752 ) on Friday February 13, 2009 @08:11PM (#26851477) Homepage
    so your hate for Obama is strong enough to wish that the entire country has a bad 4 years? gee, thanks.
  • by DragonWriter ( 970822 ) on Friday February 13, 2009 @08:12PM (#26851489)

    If you are willing to get rid of ACID like the other solutions, there are no limitations.

    The other solutions (see below) do not, in all cases, "get rid of ACID".

    Please cite one example, just one, where a simple key/pair data system is the "better" solution for a high volume site, where a more powerful database like PostgreSQL wouldn't do a better job.

    Scalaris, a distributed transactional key/value store that does not get rid of ACID, is one of the "other solutions" (and one that has been demonstrated, by replicating Wikipedia on a distributed cluster, to scale better, at least, than Wikipedia's existing MySQL platform).

  • by DragonWriter ( 970822 ) on Friday February 13, 2009 @08:19PM (#26851539)

    Long ago, hardware made much more of a difference than it does today and was one reason relational databases "won" out.

    Hardware makes just as big of a difference today, which is why distributed key/value stores are gaining currency at the moment. The hardware-related difference that was a big win for relational databases was their efficient use of disk space when normalized; the hardware-related difference that is a big win for distributed key/value stores now is their efficient scalability by distribution across multiple nodes.

    I am going to tear my eyes out if I see "yet another tuple store or graph db." Welcome to the last century, please try again.

    The big thing isn't "tuple stores or graph dbs", it's distributed tuple stores, and, even better, distributed transactional tuple stores. Not a whole lot of them from the last century.

  • by Grapedrink ( 1298113 ) on Friday February 13, 2009 @08:29PM (#26851627)

    Microsoft has CLR code running on top of MS SQL but it sucks performance wise. Oracle has Java. That's about as close as we have gotten, but both are just crutches.

    Unfortunately, you are right that SQL is terrible and not going away. The status quo, industry, and marketing will make sure we suffer for years to come.

  • by emurphy42 ( 631808 ) on Friday February 13, 2009 @10:31PM (#26852561) Homepage

    Paradox's query-by-example

    *looks up* GUI query builder? Highly appropriate for simple things (e.g. Crystal Reports), but absolutely terrible for more complex things.

    by Just Some Guy ( 3352 ) on Friday February 13, 2009 @11:04PM (#26852775) Homepage Journal

    Database operations do not need to look like code or algorithms, the only reason they do is to provide jobs for database programmers.

    From Wikipedia:

    Relational database theory uses a different set of mathematical-based terms, which are equivalent, or roughly equivalent, to SQL database terminology.

    SQL looks like SQL because it's based on set theory. As an exercise, invent your own language that's as powerful (read: also based on a strong theoretical basis) but simpler. See you in a couple of decades!

  • by kilodelta ( 843627 ) on Friday February 13, 2009 @11:56PM (#26853071) Homepage
    Reading this I keep seeing OOP in there, and data as an object class.

    This is just the OOP crowd trying to not learn SQL and do things their way. It won't replace a full RDBMS. And an RDBMS can scale quite nicely if you know what the hell you're doing.
  • by horza ( 87255 ) on Saturday February 14, 2009 @01:34AM (#26853527) Homepage

    let me guess, you don't like mssql because it's microsoft? what a fucking sheep, mssql is a great database.
    oh and i've used all the others and for you to suggest mysql over mssql tells a lot...

    MSSQL? Isn't that the only database that isn't cross platform these days? Why would anybody want to use MSSQL outside of .Net developers? On a side note, why is it that only MSSQL appears to get crippled by worms and none of the others?


  • by Estanislao Martínez ( 203477 ) on Saturday February 14, 2009 @01:50AM (#26853599) Homepage

    Yes, these newer simple key/value databases like BigTable and CouchDB are effectively a subset of RDBMS functionality, so of course the same thing can be implemented relationally by just not using features.

    What worries me about these arguments, however, is that they're missing a point that's very similar to yours here: these high-performance key-value databases can be implemented as features in an RDBMS. Basically, if you have a technology that allows some limited type of database to be distributed across tons of nodes and to be queried really fast, well, that's a kind of limited-functionality materialized view with a special engine to access it. So put it in as a subsystem to the full RDBMS, and use your plain old full-featured relational engine as the system of record that solves the concurrent transactional update and data integrity problems, and have it also push out the deltas to the specialized store that supports the high-performance distributed querying.
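
    A toy version of that arrangement, assuming a single process: sqlite3 plays the system of record and a plain dict stands in for the distributed read store. The account table, the CHECK constraint, and the apply_update helper are all hypothetical names for the sketch.

```python
import sqlite3

# System of record: a relational table with an integrity rule attached.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, owner TEXT NOT NULL, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)

# Stand-in for the specialized key/value store that serves fast reads.
read_store = {}


def apply_update(conn, account_id, owner, balance):
    """Commit to the relational store first, then push the delta outward."""
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(
            "INSERT OR REPLACE INTO account (id, owner, balance) VALUES (?, ?, ?)",
            (account_id, owner, balance),
        )
    # Only reached after a successful commit, so the projection never
    # receives a row that the integrity checks rejected.
    read_store[account_id] = {"owner": owner, "balance": balance}


apply_update(conn, 1, "alice", 100)
try:
    apply_update(conn, 2, "bob", -5)  # violates the CHECK constraint
except sqlite3.IntegrityError:
    pass

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

    A real deployment would also have to cope with the push failing after the commit succeeds, which this sketch ignores.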

    Nobody is denying that there are many applications where you don't need all that the relational model provides, and that those applications can be made to perform faster by not providing certain features. What people repeatedly fail to understand is that this is not a refutation of the relational data model, because it is a logical and general data model that's capable of modeling the data in such applications, and does not dictate the implementation.

  • by Matt Perry ( 793115 ) on Saturday February 14, 2009 @02:26AM (#26853753)

    do you hammer with a wrench? i have before, and i managed to hurt my thumb.

    Not usually, but I have done so before. If it hurts your thumb, you're holding it wrong.

  • Re:Ridiculous (Score:3, Insightful)

    by xelah ( 176252 ) on Saturday February 14, 2009 @08:28AM (#26855083)

    Sure, relational is a good general fit for databases, but it sounds like you are saying the fact that you can query and modify it using something like SQL in most implementations makes it great?

    If you're a DBA, system administrator or tester - or if you simply have to do something ad-hoc and dodgy as a quick fix on a live system - then this makes it not so much great as absolutely fantastic. You can do things like:

    • Look at the most time consuming queries and analyze them, optimize them, add/remove indexes or move tables or indexes between different sets of disks. And when you do, the query plans will change because they haven't been frozen into the application code.
    • Make ad-hoc changes, or generate ad-hoc reports (or run a query from cron, say) without having to write a little program every time.
    • Examine the data following your software screwing up, and fix it.
    • Run the queries your software has generated and check the results. Correct the query and try again.
    • Fetch a list of currently held locks, or examine the queries which have resulted in deadlocks being reported to the log.
    • Add columns to support admin or reporting functions (or a second application) without worrying about the effect on the (still running) original application.
    • Write a reporting system which programmatically generates queries and has the DBMS do the difficult bit of working out query plans.

    These aren't specific to relational databases or SQL, of course. However, having a query language is amazingly useful.
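
    The first item in the list above, plans changing when an index is added without touching the application's query, can be seen with SQLite's EXPLAIN QUERY PLAN. The ticket table and index name below are assumptions, and the exact plan wording differs between SQLite versions, so the sketch only looks for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket ("
    "id INTEGER PRIMARY KEY, event_id INTEGER NOT NULL, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO ticket (event_id, status) VALUES (?, ?)",
    [(i % 50, "free") for i in range(1000)],
)

# The application's query text; it stays identical before and after tuning.
QUERY = "SELECT id FROM ticket WHERE event_id = ?"


def plan(conn, sql, params):
    """Return the optimizer's plan as one string (last column is the detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn, QUERY, (7,))  # a full scan of the table
conn.execute("CREATE INDEX ticket_by_event ON ticket (event_id)")
plan_after = plan(conn, QUERY, (7,))   # now a search using the new index
```

    Printing plan_before and plan_after shows the kind of tuning information a DBA can inspect and act on without a code change.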

    I'm surprised you're complaining about having to tune your queries. A lot of databases and SQL have shortcomings, but it's really not that hard if you know your database well (and haven't chosen, say, MySQL). You must still have a query plan with your object databases - it's just implied by your code. (I'm assuming you're not using some sort of alternative query language, because your comment suggests otherwise and you'd only have to tune that instead). It won't adapt to changing data or indexes, and you're going to have a lot of work to do if you want to duplicate some of the more sophisticated techniques a modern database will use. Worse still, you're going to have to change your application, add some sort of profiling and run it in-place or in a test harness to work out why it's taking as long as it does. And when you want to try a different plan you have to rewrite your code.

    It's the ORM layer that's the real pain in the arse (assuming you're using OOD, and assuming you actually want a direct mapping between your object model and relational model). Things like Hibernate and judicious use of code generation make it a lot easier, but you still need to know what's going on and you still need to (and can!) choose between navigating among objects (letting the ORM do the queries) and generating a hand-written query. To some extent an ORM (and the RDBMS vs OODBMS choice) is just a reflection of the different requirements of on-disk vs in-memory representations of objects. On-disk storage is all about efficient and flexible querying, retrieval, (distributed) concurrency, storage and management of huge data-sets, whereas in-memory storage is all about assigning behaviour and navigating relationships between smaller sets of objects whilst carrying out that behaviour.

    In any case, the original article is just silly. How does taking all the formal structure away make any difference to the fundamental scalability restrictions - your application's need for data consistency (across nodes) and concurrency control? I work in ticketing. It's not the relational model that causes scalability problems, it's the fundamental fact that 100k people are competing for access to 10k seat statuses, that when we check per-person ticket limits or assign seats we need 100% up-to-date data, that we regularly need to fetch the status of all the seats in a block for display, etc. I believe that concurrency and scalability concerns

  • Re:WTF? (Score:3, Insightful)

    by ultranova ( 717540 ) on Saturday February 14, 2009 @12:51PM (#26856491)

    If "key/value" databases do become more popular, they certainly might eat into relational database mindshare.

    A "key/value" database is simply a relational database with a single table and two columns. It doesn't make any sense to build a separate server program for what current database servers can already easily do.

    90% of web applications use RDBMSs merely as persistent data storage--the fact that they are "relational" doesn't matter at all; the fact that a separate SQL language is needed to get the data (rather than using language-native data structures as an interface) is even a negative for RDBMSs.

    I'm a bit uncertain what you're saying here. Surely the fact that the server can do more than what you need doesn't hinder your program? The same goes for SQL language; surely the fact that commands sent to the database are text strings isn't a negative? In any case, you can (and probably should) separate database access into a module of its own, offering whatever API you desire for the rest of the program.

    As a web app developer, I'm excited that something other than SQL is getting attention. RDBMSs won't go away because they have properties data miners, for example, need. But they aren't ideal for the simple persistent data stores most apps call for.

    However, they can handle such data stores in a very simple fashion. A pair of "setvalue(key, value) / getvalue(key)" is trivially easy to implement on top of SQL language. It just doesn't make sense to pour resources into developing a less capable database server.
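
    A minimal sketch of that setvalue/getvalue pair on top of SQL, using sqlite3 and a two-column table. The table name kv and the default argument are assumptions; the sketch shows only that the interface is small, not how it compares on performance or distribution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A key/value store as a single table with two columns.
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")


def setvalue(key, value):
    """Insert the pair, or overwrite the value if the key already exists."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (key, value),
        )


def getvalue(key, default=None):
    """Return the stored value, or default when the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else default


setvalue("color", "red")
setvalue("color", "blue")  # second write replaces the first
```

    Calling getvalue("color") afterwards returns the most recent write, and the rest of the program never sees a line of SQL.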

  • by jadavis ( 473492 ) on Saturday February 14, 2009 @01:19PM (#26856659)

    We need to stop referencing data by where it is and start referencing it by what it is.

    You say that without any explanation of your apparent position that the relational model requires you to reference data by "where it is".

    You seem to think that the semantics of your system are somehow richer -- providing "information" rather than "data".

    Do you even know what a relation is?

  • by EastCoastSurfer ( 310758 ) on Saturday February 14, 2009 @02:20PM (#26857181)

    First, all applications have bugs that open them up to security flaws. Picking on MSSQL in that area is a non-starter.

    What you're missing are all of the tools that come with an MSSQL license. SSIS and MSAS are two big ones that are hard to replace with open source tools (Pentaho is interesting). If all you're looking to replace is a pure data store then yeah, PostgreSQL is what I would move to. When you start replacing all functionality offered by MSSQL it gets a little more complicated.
