"Slacker DBs" vs. Old-Guard DBs

Posted by kdawson
from the close-enough-for-web-work dept.
snydeq writes "Non-relational upstarts — tools that tack the letters 'db' onto a 'pile of code that breaks with the traditional relational model' — have grabbed attention in large part because they willfully ignore many of the rules that codify the hard lessons learned by the old database masters. Doing away with JOINs and introducing phrases like 'eventual consistency,' these 'slacker DBs' offer greater simplicity and improved means of storing data for Web apps, yet remain toys in the eyes of old guard DB admins. 'This distinction between immediate and eventual consistency is deeply philosophical and depends on how important the data happens to be,' writes InfoWorld's Peter Wayner, who let down his old-guard leanings and tested slacker DBs — Amazon SimpleDB, Apache CouchDB, Google App Engine, and Persevere — to see how they are affecting the evolution of modern IT."
This discussion has been archived. No new comments can be posted.
  • Is it just me or did this article go out of its way to insult people who use "traditional" RDBMSs?

    I mean, I'm well versed in SQL and data consistency et al, but I'm still more than willing to consider new technologies. What the hell?

  • by qoncept (599709) on Tuesday March 24, 2009 @01:35PM (#27315613) Homepage

    Now that disk space is so cheap and many of the data models don't benefit as much from normalization, ...

    You don't want to store the same data in multiple places. Your query might run faster, but your data integrity is going to suck.

    And, uh, I have the pleasure of working now with a huge data warehouse that hasn't normalized status codes, so instead of quickly searching for an integer, the queries run slow as hell scanning char fields. It's not good.
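
The status-code complaint above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (`status`, `orders`, `status_id`) are invented here and are not taken from the warehouse the poster describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup table: each status string is stored exactly once.
    CREATE TABLE status (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- Orders refer to a status by its small integer id (hypothetical schema).
    CREATE TABLE orders (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES status(id)
    );
    CREATE INDEX orders_status ON orders(status_id);
""")
conn.executemany("INSERT INTO status (id, name) VALUES (?, ?)",
                 [(1, "open"), (2, "shipped"), (3, "cancelled")])
conn.executemany("INSERT INTO orders (status_id) VALUES (?)",
                 [(1,), (2,), (2,), (3,), (2,)])
conn.commit()


def count_by_status(name):
    """Count orders with the given status name, joining on the integer key."""
    row = conn.execute(
        "SELECT COUNT(*) FROM orders o "
        "JOIN status s ON s.id = o.status_id "
        "WHERE s.name = ?", (name,)).fetchone()
    return row[0]


def rename_status(old, new):
    """Rename a status in one place; no order rows need to be rewritten."""
    with conn:
        conn.execute("UPDATE status SET name = ? WHERE name = ?", (new, old))
```

Because the string lives in one row, renaming a status touches a single record, which is the data integrity argument in miniature.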

  • Laziness Rules (Score:5, Insightful)

    by ergo98 (9391) on Tuesday March 24, 2009 @01:37PM (#27315651) Homepage Journal

    Slacker DBs like CouchDB and SimpleDB have taken off for the simple reason that most developers have absolutely mediocre database knowledge or skills, and rather than learning, it's just as easy to just wave it all off as obsolete.

    It's no surprise that the creator of CouchDB, for instance, hadn't a clue about databases when he began his project. All of that built up knowledge just ignored while someone invented their own, and it's as rational as rolling your own encryption from scratch without the slightest clue about encryption algorithms or theories.

  • by Anonymous Coward on Tuesday March 24, 2009 @01:38PM (#27315667)

    Why the need to make it 'old guard' vs 'new guard'... seems like flamebait for fanboys.
    "tastes great" vs "less filling", or just explain the merits of both and leave it there.
    It's like forum kiddies arguing raid 5 vs raid 1+0.

    Yes, if the database is important, you want the most CAREFUL management available. Obviously.
    But if these -db apps work fine, and your data isn't corporate mission critical, who cares?

    Seems to me convenience and interoperability score higher for most small datasets, am I wrong?

  • a base of data (Score:5, Insightful)

    by poot_rootbeer (188613) on Tuesday March 24, 2009 @01:44PM (#27315743)

    "tools that tack the letters 'db' onto a 'pile of code that breaks with the traditional relational model'"

    If "database" were intended to mean only "relational database", we wouldn't have had any need for the latter term...

  • by alen (225700) on Tuesday March 24, 2009 @01:47PM (#27315781)

    the article is right that in some cases it doesn't matter if a transaction is lost. but in any case where money is involved it's a must. you can't just start a fund from your Oracle or SQL Server savings to pay for mistakes because it will kill your brand and you may lose a lot of future business. and any savings will be eaten up by the extra cost to hire people to solve all the data problems

    i've seen this. no constraints on the data that is originally put in, not enough referential integrity and you get customers opening up a lot of trouble tickets and you end up hiring people to clean up the data every time a mistake is found
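
A minimal sketch of the kind of constraints and referential integrity the comment above is talking about, using Python's sqlite3 module. The `customers` and `payments` tables and the `try_insert` helper are made up for the example; a real billing schema would look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE payments (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(id),
        amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
    );
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
conn.commit()


def try_insert(customer_id, amount_cents):
    """Return True if the payment was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO payments (customer_id, amount_cents) VALUES (?, ?)",
                (customer_id, amount_cents))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, a payment for a customer that does not exist, or with a non-positive amount, is refused at write time instead of turning into a trouble ticket later.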

  • Re:Laziness Rules (Score:3, Insightful)

    by phoenix321 (734987) * on Tuesday March 24, 2009 @02:02PM (#27315991)

    Problem is, you're re-inventing the wheel several times over in the process. Hint: "a flatfile and maybe a little more" could very well be all the storage technology invented today only a few years down the road.

    At first, all you need is to store key:value pairs. That works with a flat file or with Oracle. Then you need some consistency checks, which can be modelled fast in Oracle or reasonably fast in your software. Then you need some triggers, which could be written fast in Oracle and not-so-fast in your software. And so on until you have progressed through the whole platform effect with several squeaky wheels invented and thousands of hours wasted.

    Any project worth doing that involves storing key:value pairs is worth a real database. Take the tiniest, lowliest member of the crowd as long as it can somehow speak SQL and can be linked into and unlinked from the project. Everything else will require at least a medium rewrite at some point when you switch over to a real database. You could of course extend everything upon a glorified flatfile until your reinvented wheels strangle all your progress.
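
The progression described above (key:value pairs, then consistency checks, then triggers) can be sketched in a small embedded SQL database. This uses SQLite through Python's sqlite3 module as a stand-in for "the tiniest, lowliest member of the crowd"; the `kv` and `kv_history` tables and the `put` helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Step 1: plain key/value storage.
    CREATE TABLE kv (
        k TEXT PRIMARY KEY,
        -- Step 2: a consistency check declared once, enforced on every write.
        v TEXT NOT NULL CHECK (length(v) > 0)
    );
    CREATE TABLE kv_history (
        k     TEXT NOT NULL,
        old_v TEXT NOT NULL,
        new_v TEXT NOT NULL
    );
    -- Step 3: a trigger that records every change of value.
    CREATE TRIGGER kv_audit AFTER UPDATE OF v ON kv
    BEGIN
        INSERT INTO kv_history (k, old_v, new_v)
        VALUES (OLD.k, OLD.v, NEW.v);
    END;
""")


def put(key, value):
    """Insert the key, or update it if it already exists."""
    with conn:
        cur = conn.execute("UPDATE kv SET v = ? WHERE k = ?", (value, key))
        if cur.rowcount == 0:
            conn.execute("INSERT INTO kv (k, v) VALUES (?, ?)", (key, value))
```

None of the checking or auditing logic lives in the application code, which is the rewrite the poster warns about having to do later.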

  • Harsh? (Score:3, Insightful)

    by Bobb Sledd (307434) on Tuesday March 24, 2009 @02:09PM (#27316115) Homepage

    I'm a DB admin, and I use things that aren't toys; but what I've heard here is kinda harsh.

    Look, it's all about "right tool for the right job." Why do you need a nuclear-powered drill that can make a tunnel from here to China, when really all you needed was a shovel?

    For most daily projects that have small amounts of data, they may be using something like Crystal Reports or Excel or SPSS that just does all the number-crunching client-side anyway. You don't always need Oracle or [favorite DB flavor] for that.

  • Re:Laziness Rules (Score:5, Insightful)

    by Ambiguous Puzuma (1134017) on Tuesday March 24, 2009 @02:12PM (#27316161)

    If you want "a little more" than a simple flat file, perhaps SQLite [sqlite.org] is the answer? The people on the Firefox team seem to think so, for example.

    SQLite has been a pleasure to use for a small personal project involving a few Perl scripts. Granted my background is with SQL Server and Oracle, so perhaps I'm not the target audience, but I found it extremely easy to use and surprisingly efficient--and I didn't need to set up a server or anything. I didn't even need to explicitly create a database!
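
For readers who have not tried it, the "no server, no explicit database creation" experience looks roughly like this. The poster used Perl; this sketch uses Python's built-in sqlite3 module instead, and the file name `notes.db` and the `notes` table are invented for the example.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.db")
existed_before = os.path.exists(path)

# Connecting is all the setup there is: no server process, no CREATE DATABASE.
conn = sqlite3.connect(path)
with conn:
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
conn.close()

# Reopen the same file later and the data is still there.
conn = sqlite3.connect(path)
rows = conn.execute("SELECT body FROM notes").fetchall()
conn.close()
```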

  • by dacut (243842) on Tuesday March 24, 2009 @02:18PM (#27316257)

    MySQL strives to provide RDBMS and ACID semantics, though its quality of service (QoS) may fall short. By contrast, these "slacker" databases don't even try to support RDBMS or ACID; even if they operated perfectly, they wouldn't provide RDBMS/ACID.

    I work for one of the companies in question (no, I don't speak for them). We rely heavily on a combination of these "slacker" dbs, Berkeley dbs, memcached, Oracle, flat files, and tape backups. Each fills a niche. I wish these articles would quit trying to create a false dichotomy.

  • Re:Laziness Rules (Score:4, Insightful)

    by diamondsw (685967) on Tuesday March 24, 2009 @02:25PM (#27316365)

    >Damien Katz, CouchDB's creator ... worked on Lotus Notes prior to that...

    That's not exactly a ringing endorsement.

  • How does it work for searching though? If I just have my "freespace" file and my pointers to records, does a search for some piece of user requested data have to hit every record or is there a hash somewhere for the data contained in the record? You don't mention it in your description.

    It seems that the biggest advantage to a relational DB is that the syntax for accessing it is well known, SQL. It has a human-readable interface and while sometimes wonky to work with for complex operations, it provides the simplest cross-platform way to access data. I don't need to know which data blocks hold the data, I just ask the database for them "SELECT slashdotid, name FROM users where slashdotid 20000"... and I get rows of data.

    Could I just read it from a file? Yes. Would it be simpler? Maybe. But what if I have 200001 records, then I have to do some magic sorting in my program, and I have to manage memory for them, and disk space, etc. It is simpler to let the DB handle that mess and I just ask for the data I need.

    It breaks up the process of programming into data storage and data manipulation/presentation. DB's for storage, my bad python for manipulation and presentation.

    --Donald

  • by Anonymous Coward on Tuesday March 24, 2009 @02:41PM (#27316625)

    Well, fuck them very much!

    --AC

  • by Prototerm (762512) on Tuesday March 24, 2009 @02:45PM (#27316687)

    You may have seen in the news recently how in the last decade or so Wall Street ignored some of the hard-won regulations and guidelines developed in the wake of the Great Depression.

    We all know what happened as a result.

    The same is true when dealing with data. You don't ignore the rules completely, or follow them only when you feel like it, or when you have time. As the old joke goes, Quality is *not* Job 1.1.

    If the data isn't important enough to store correctly, then it's not important enough to be stored at all.

  • by Thaelon (250687) on Tuesday March 24, 2009 @02:45PM (#27316699)

    Databases at a very abstract level are just data structures. Choosing a relational database when you don't need that much functionality is just as wrong as choosing a flat file when you need a database.

    Knowing the ins & outs of your data structures is still a vital skill of programming.

  • by Anonymous Coward on Tuesday March 24, 2009 @02:49PM (#27316735)

    Maybe the fascination with relational databases is that you can easily work with the data in there.

    What you describe just sounds like a file system. A specialized one, but it doesn't really support more than a filesystem does. Everything works fine if you have the key to the data. You can read the data, do your stuff, and update the data. But what if your problem is to find the key? Like you want to know which orders are overdue? Doesn't sound like the freespace file will help me there. Sounds like I have to implement the whole searching by myself.

    When I am searching for a database solution, it is probably because I really need that searching and I want it to be fast, and I don't want to do it myself.

    What you suggest doesn't sound like a database. It sounds more like an allocation scheme any database could use under the hood. What you suggest may suffice if my requirement is a high performance filesystem. But I don't see how it supports even the most basic database operations. I don't say your solution is bad. It just doesn't solve the same problem as a database.
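
The overdue-orders question above is a search by attribute rather than by key, and it can be sketched like this with Python's sqlite3 module. The `orders` schema, the sample rows, and the `overdue` and `by_key` helpers are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        due_date TEXT NOT NULL,             -- ISO 8601 dates sort correctly as text
        paid     INTEGER NOT NULL DEFAULT 0
    );
    -- Secondary index so the search below does not have to visit every record.
    CREATE INDEX orders_due ON orders(paid, due_date);
""")
conn.executemany(
    "INSERT INTO orders (id, customer, due_date, paid) VALUES (?, ?, ?, ?)",
    [(1, "acme", "2009-03-01", 0),
     (2, "bolt", "2009-03-20", 1),
     (3, "cogs", "2009-03-10", 0),
     (4, "dyne", "2009-04-15", 0)])
conn.commit()


def by_key(order_id):
    """The easy case: the key is already known."""
    return conn.execute(
        "SELECT customer FROM orders WHERE id = ?", (order_id,)).fetchone()


def overdue(today):
    """The hard case: find the keys of unpaid orders due before `today`."""
    return conn.execute(
        "SELECT id, customer FROM orders "
        "WHERE paid = 0 AND due_date < ? ORDER BY due_date",
        (today,)).fetchall()
```

A store that only maps record numbers to blobs handles `by_key` well, but `overdue` is the part the application would have to build and maintain itself.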

  • by plopez (54068) on Tuesday March 24, 2009 @02:50PM (#27316755) Journal

    so you start a small project, "we just need a few hundred/thousand records, a few key value links and the occasional transaction". so you start with a slacker DB. A slacker DB far too often implies a slacker hack software d00d.

    Then it grows. Instead of educating themselves (Q: what's the difference between those who can't read and those who don't? A: nothing. ) and finding a better DB solution they thrash around trying to hack in DB functions into their code.

    So they lose consistency etc. Soon they have a polluted DB that breaks all the time. Often they are proud of the heroics of the wasted effort they put into it. A good programmer knows how to be the correct form of lazy: do not reinvent the wheel.

  • by LWATCDR (28044) on Tuesday March 24, 2009 @02:56PM (#27316827) Homepage Journal

    Okay how do you find the data without a record number? I can see the value of the system but it also seems very inflexible.
    I do agree that way too many programmers use MySQL for a file system, flat files, configs, and goodness knows what else.

  • Re:Laziness Rules (Score:3, Insightful)

    by sl0ppy (454532) on Tuesday March 24, 2009 @03:07PM (#27317043)

    Everything else will require at least a medium rewrite at some point when you switch over to a real database. You could of course extend everything upon a glorified flatfile until your reinvented wheels strangle all your progress.

    not really. i think that you (and, unfortunately, the FA) are missing the point that the map and reduce functionality, while powerful, have one major advantage: scalability. simply put, a query can be, by definition of the map function, broken up into several discrete operations and performed simultaneously on the data.

    while this can be done in Oracle, using RAC, to some extent, the cost and complication is a major barrier to entry. Cache-Fusion, while typically good, can also end up being a liability when the cost based optimizer attempts to split up the query into atomic tasks in order to correctly parallelize the query. for instance, on one application of RAC (multiple multi-core servers, fibrechannel disks, and oracle clustered filesystem), across 100,000,000+ rows, when heavy writes were occurring, it was cheaper computationally to force a full disk scan, using hints, than to rely on Cache-Fusion to figure out what data was stale and what data was fresh. this was discovered after several days spent neck deep in tkprof output.

    conversely, map, by design, already does this.
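
A rough sketch of why a map step parallelizes so naturally, in plain Python. This is not CouchDB's view API or any vendor's implementation: the partition data, `map_partition`, `merge`, and `total_by_sku` are invented for the example, and a thread pool stands in for what would be separate machines.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical data, already split into partitions (think: one per node).
partitions = [
    [{"sku": "a", "qty": 2}, {"sku": "b", "qty": 1}],
    [{"sku": "a", "qty": 5}],
    [{"sku": "c", "qty": 4}, {"sku": "b", "qty": 3}],
]


def map_partition(records):
    """Map step: summarize one partition without looking at any other."""
    totals = {}
    for rec in records:
        totals[rec["sku"]] = totals.get(rec["sku"], 0) + rec["qty"]
    return totals


def merge(left, right):
    """Reduce step: combine two partial results into one."""
    merged = dict(left)
    for sku, qty in right.items():
        merged[sku] = merged.get(sku, 0) + qty
    return merged


def total_by_sku(parts):
    """Run the map step on every partition concurrently, then reduce."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_partition, parts))
    return reduce(merge, partials, {})
```

Since `map_partition` never needs data from another partition, no optimizer has to work out how to split the query; the split is part of how the query is written.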

  • Re:Laziness Rules (Score:2, Insightful)

    by Trifthen (40989) on Tuesday March 24, 2009 @03:35PM (#27317629) Homepage

    That's what I don't quite understand about all this. It's been the case for a while now that:

    1. If you want a full RDBMS, use Oracle, or PostgreSQL, or a similar ACID + SQL92 compliant DB.
    2. If you don't really care, use MySQL.
    3. If you want ridiculous speed, and actively hate your data, use SQLite.
    4. If you have one file, or maybe two, use BerkeleyDB or similar.
    5. Flat files are fine for config.

    I'm not sure we need yet another category here. Then again, we're now seeing things surfacing like database sharding [codefutures.com] which currently limits all data interaction to whichever application managed the data distribution. It would be nice to see a DB capable of hiding such things behind the classic SQL engine so not every client app and API requires the chosen sharding method implemented in possibly mutually exclusive and buggy ways.
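
To make the sharding complaint concrete, here is a minimal sketch of application-level sharding in Python, with in-memory SQLite databases standing in for separate servers. The `ShardedStore` class, its hash-modulo routing rule, and the `users` table are assumptions made for the example; they are not how any particular product does it.

```python
import hashlib
import sqlite3


class ShardedStore:
    """Toy client-side sharding: the application decides where each row lives."""

    def __init__(self, shard_count):
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for db in self.shards:
            db.execute("CREATE TABLE users (name TEXT PRIMARY KEY, email TEXT)")

    def _shard_for(self, key):
        # Stable hash of the key, reduced modulo the number of shards.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, name, email):
        db = self.shards[self._shard_for(name)]
        with db:
            db.execute("INSERT OR REPLACE INTO users (name, email) VALUES (?, ?)",
                       (name, email))

    def get(self, name):
        db = self.shards[self._shard_for(name)]
        row = db.execute("SELECT email FROM users WHERE name = ?",
                         (name,)).fetchone()
        return row[0] if row else None

    def count(self):
        # Any query that spans shards has to be stitched together by hand.
        return sum(db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
                   for db in self.shards)
```

Every client that touches this data has to reproduce `_shard_for` exactly, which is the duplication the poster would like to see hidden behind an ordinary SQL engine.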

  • Re:Laziness Rules (Score:3, Insightful)

    by tepples (727027) <<tepples> <at> <gmail.com>> on Tuesday March 24, 2009 @03:52PM (#27318087) Homepage Journal

    If you want ridiculous speed, and actively hate your data, use SQLite.

    Care to explain why SQLite requires one to "actively hate [one's] data"?

  • by petermgreen (876956) <plugwash.p10link@net> on Tuesday March 24, 2009 @10:34PM (#27324507) Homepage

    the thing that always puzzled me about BerkeleyDB is its incessant format breakage requiring dumps and restores.

    On a database server at least data upgrading can be handled centrally but on a file based DB where datafiles can be scattered anywhere a lack of a stable data format seems like a fatal flaw.

  • and in the Orders table once per order.
    I'd disagree on this one, it seems to me like it would be a good idea to record the customer's name and address (probably both billing and shipping) at the time of an order even if they later change the details on their account.
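
The point above, that an order should keep the name and address as they were when it was placed, can be sketched as follows with Python's sqlite3 module. The schema and the `place_order` helper are hypothetical, and only a single shipping address is modelled to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        address TEXT NOT NULL
    );
    CREATE TABLE orders (
        id           INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(id),
        -- Deliberate copies: what the customer's details were at order time.
        ship_name    TEXT NOT NULL,
        ship_address TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO customers (id, name, address) VALUES (1, 'Pat', '1 Old Road')")
conn.commit()


def place_order(customer_id):
    """Create an order, snapshotting the customer's current name and address."""
    with conn:
        name, address = conn.execute(
            "SELECT name, address FROM customers WHERE id = ?",
            (customer_id,)).fetchone()
        cur = conn.execute(
            "INSERT INTO orders (customer_id, ship_name, ship_address) "
            "VALUES (?, ?, ?)", (customer_id, name, address))
        return cur.lastrowid
```

The copy is not a normalization failure here: the order's address and the account's current address are different facts that merely start out equal.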

  • Pointless (Score:2, Insightful)

    by EvilIntelligence (1339913) on Wednesday March 25, 2009 @09:01AM (#27329039)
    As a DB admin myself, I find these "Us vs Them" arguments to be ultimately pointless. A company will choose a database based on the application's needs. If "immediate consistency" is needed they will choose a standard relational database. If "eventual consistency" is acceptable, the company may opt for one of the other "not-so-relational" databases. The fact that there are other options is actually a good thing. The "old guard" needs to find the positives and embrace change, or run the risk of being left behind in an evolving world of technology.
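
For anyone unsure what the "immediate" versus "eventual" distinction means in practice, here is a toy model in Python. It is a deliberately simplified assumption, not a description of how SimpleDB, CouchDB, or any other product replicates: writes land on one replica at once and reach the others only when `sync` runs.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}
        self.pending = deque()  # writes received but not yet applied

    def apply_pending(self):
        while self.pending:
            key, value = self.pending.popleft()
            self.data[key] = value


class EventuallyConsistentStore:
    """Toy model: a write is acknowledged after reaching one replica."""

    def __init__(self, replica_count):
        self.replicas = [Replica() for _ in range(replica_count)]

    def write(self, key, value):
        # Replica 0 sees the write immediately; the others only queue it.
        self.replicas[0].data[key] = value
        for replica in self.replicas[1:]:
            replica.pending.append((key, value))

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        # Stand-in for background replication catching up.
        for replica in self.replicas:
            replica.apply_pending()
```

Between `write` and `sync`, two readers can get different answers for the same key. Whether that window is acceptable is exactly the per-application judgment the comment above describes.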
