Databases Programming

Why Some Devs Can't Wait For NoSQL To Die

Posted by Soulskill
from the must-be-the-insurance-policy dept.
theodp writes "Ted Dziuba can't wait for NoSQL to die. Developing your app for Google-sized scale, says Dziuba, is a waste of your time. Not to mention there is no way you will get it right. The sooner your company admits this, the sooner you can get down to some real work. If real businesses like Walmart can track all of their data in SQL databases that scale just fine, Dziuba argues, surely your company can, too."

  • by BadAnalogyGuy (945258) <BadAnalogyGuy@gmail.com> on Sunday March 28, 2010 @11:31AM (#31647550)

    People who don't like SQL should get their heads out of their asses and use MySQL, a robust and enterprise-ready database.

    Interesting thesis...

    • Re:Article summary (Score:5, Insightful)

      by digitalunity (19107) <digitalunityNO@SPAMyahoo.com> on Sunday March 28, 2010 @11:37AM (#31647606) Homepage

      My experience has made me believe PostgreSQL is better in every respect. It's more stable, has more features and is easier to use. The article wasn't specifically pro-MySQL.

      The article is largely correct. The movement to ditch SQL databases is really naive. SQL scales just fine, if you know how to use it right. Look at Oracle solutions. All their fancy eBusiness software is still Oracle SQL DB backed and some of the biggest companies in the world are using it.

      SQL isn't the problem, it's a tool. Bad programmers are the problem.

      • Re:Article summary (Score:5, Interesting)

        by RedMage (136286) on Sunday March 28, 2010 @11:50AM (#31647704) Homepage

        We're using both - about five days from our "go-live", and things look good. We just use what makes sense for each part of our application.
        For us, this means PostgreSQL for the parts that must be transactional ACID, and Amazon's S3 and SimpleDB for parts that don't. In practice, for the 1.0 release, this means things like notes, user accounting, and documents are in S3 and SDB. The rest is plain ole SQL.

        Not that there wasn't a learning curve with our developers - we're a bunch of old-time enterprise type developers, so "letting go" and moving out of the traditional SQL world took a little thought and proving time. We'll use the first few months to learn more about doing architecture this way.

        We've had the language wars - let's avoid the SQL/NoSQL wars please. I'm tired.

        • by MooUK (905450)

          Would it not have been less complex to use PostgreSQL for everything, or was there enough difference to be worth the complexity?

          • Re: (Score:3, Interesting)

            by RedMage (136286)

            Would it not have been less complex to use PostgreSQL for everything, or was there enough difference to be worth the complexity?

            Turns out, yes and no. We're distributed already, so it would have entailed setting up another DB anyway, and all the management infrastructure around that. AWS also seemed like a good fit for things that were essentially document-oriented and it seemed that it would be efficient for this kind of data model.

      • by amorsen (7485)

        SQL isn't the problem, it's a tool. Bad programmers are the problem.

        Relational databases are quite useful. It's too bad they're hampered by such a lousy syntax though. It's like if we all decided to stick with COBOL but added closures and templates and whatnot...

        • Re:Article summary (Score:5, Insightful)

          by deniable (76198) on Sunday March 28, 2010 @12:02PM (#31647804)
          Some of us are simply looking to not use the relational model for *every* bit of data in the system. Application global, put it in a table. Uploaded files, put them in a table. User data, get it from LDAP, nah, create our own table and get somebody to feed it manually. Given the number of apps I've seen that use SQL as a simple key/value store, it's no wonder that there are techniques to avoid the overhead completely.
        • by timeOday (582209)
          I agree, the syntax of PROLOG, for example, seems much simpler, more powerful, and makes more sense to me. There have been many attempts to fuse it with SQL, but nothing past the scale of a few researchers from different universities working together.

          But when all is said and done, you can get familiar with most of SQL in a couple weeks. No doubt mastering all the intricacies of Oracle takes years, but not, I think, due to the SQL syntax.

          • Re:Article summary (Score:5, Interesting)

            by jc42 (318812) on Sunday March 28, 2010 @03:43PM (#31649648) Homepage Journal

            ... the syntax of PROLOG, for example, seems much simpler, more powerful, and makes more sense to me.

            Yeah, wouldn't it be wonderful if instead of all the complex cruft usually needed to find the data you need in that morass, you could just write a prolog expression and let the interpreter resolve it? But when I mention this to Team Leaders, they inevitably look at me like I'm from Mars. They have no idea what prolog is or does. (And I'm actually from a planet much farther away than Mars. ;-)

            But when all is said and done, you can get familiar with most of SQL in a couple weeks.

            True, perhaps, and I did that years ago. But that doesn't deal with the major problem with SQL: In my experience, every relational database I've ever worked with was in the grips of a set of professional RDB priests, and you didn't do anything in SQL without their blessing. If they didn't approve of what you were trying to do (typically because they couldn't be bothered to listen to you), it wouldn't get done during your lifetime.

            So I've learned to cultivate them as an acolyte. I write my "prototype" to use flat files, typically small files full of name:value pairs, sometimes with the name part the file name and the value the contents, and a directory tree of multiply-linked files to classify stuff. I agree with their criticism of this, and say that I'd be happy to convert the code to use their DB when they have the time to help me get those subroutines working right. While they chew on that, I get the project working with the flat files, and get some users using it. When the priests finally face the fact that the project works without their help, they finally deign to help.

            But I've never seen them actually get the SQL working to the point that it can supplant the flat files. The parts that do work are always so slow that turning on the "useDB" switch makes it too sluggish to actually use. In some cases, I can get around this by writing "pre-pass" code to extract the common data sets from the DB and write it to flat files, which the interactive software can read through quickly.

            It has long seemed to me that SQL and RDBs in general are Good Ideas. But unless we can find a way to end the stranglehold of the DB priesthood in an organization, it's all sorta hopeless for a mere "developer" to even consider jumping into the mess. It's better to just develop stuff that works, and let the DB experts handle the task of porting it to the DB. That way, we developers can keep our hands clean of all the theology, and actually develop stuff that works.

            Of course, this is all heresy to the True Believers ...

            • Re:Article summary (Score:5, Insightful)

              by ppanon (16583) on Sunday March 28, 2010 @07:07PM (#31651338) Homepage Journal

              It's not heresy. However, I have seen a lot of crap data models produced by developers (even worse than what I come up with as GUI designs). I have also seen developers produce SQL that looked OK at first glance but performed abysmally under certain conditions (and have even saved the odd project by finding those and fixing them when the system started dying under load). If you access a SQL database like you would a set of flat files, it is never going to give you the performance that a flat file access will give you for raw throughput because you've got all the extra communications latency. However if you re-write your search and extract queries to pull your data in a single SQL statement instead of a statement for each of your N tables involved in the result, then SQL is going to kick ass as soon as you start getting enough data and users placing enough queries that all the indexes and caching can pay off.
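
The difference between "a statement for each of your N tables" and "a single SQL statement" can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module; the customers/orders schema, the index name, and both function names are hypothetical and do not come from any system mentioned in the thread.

```python
import sqlite3

# Hypothetical two-table schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    CREATE INDEX orders_by_customer ON orders(customer_id);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_flat_file_style(conn):
    """Treat each table like a file: one extra query per customer row."""
    result = {}
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        result[name] = row[0]
    return result


def totals_single_statement(conn):
    """Let the database do the join and aggregation in one statement."""
    rows = conn.execute("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """).fetchall()
    return dict(rows)


flat_style = totals_flat_file_style(conn)
joined = totals_single_statement(conn)
```

Both functions return the same totals. With an in-memory database the cost difference is invisible; against a networked server, the first version pays one round trip per customer, which is the latency the comment describes.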

              Flat files will work better for certain types of unstructured data, but most people who get crap performance out of SQL databases just don't understand how to use SQL databases properly. Which is why those True Believers tend to get upset about crap SQL implementations: because those tend to bog down a SQL server and slow down all the well-written apps too.

              No, the real problem with most SQL DBAs is that they haven't adapted to agile methodologies. They still want the data model to spring fully armored from Zeus' head according to classic waterfall planning. What they need to do is to get some data modelling tools that support round trip engineering so that they can make changes as the developer needs them and have upgrade scripts checked into source control along with the code on new builds. Right now there are only a few tools like ErWin and Data Architect that support that kind of development, and they tend to be ridiculously expensive. The one exception is DeZign for Databases Professional which is comparatively cheap. A lot of companies will lay out a lot of cash for developer tools but won't fork over the dough necessary for their data modelers/DBAs to properly support developer activity. So yeah, the DBAs tend to be a little reluctant to do all that work by hand. While there are some developers who still use notepad or gedit by choice, nobody seems to expect them to do it, or to have the same productivity as someone with a decent tool chain.
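
One low-budget way to get "upgrade scripts checked into source control" without the commercial tools named above might look like the sketch below. It is an assumption-laden illustration, not a description of any of those products: the MIGRATIONS list and the migrate function are hypothetical, and it relies on SQLite's `PRAGMA user_version` to remember which scripts have already run.

```python
import sqlite3

# Hypothetical ordered upgrade scripts. In a real project each entry would
# more likely be a numbered .sql file committed next to the application code.
MIGRATIONS = [
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)",
    "ALTER TABLE notes ADD COLUMN author TEXT",
]


def migrate(conn):
    """Apply every migration newer than the version stored in the database."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    pending = MIGRATIONS[current:]
    for version, statement in enumerate(pending, start=current + 1):
        conn.execute(statement)
        # PRAGMA values cannot be bound as parameters; version is an int
        # produced by enumerate, not user input.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies both scripts
second_run = migrate(conn)  # nothing pending, schema unchanged
columns = [row[1] for row in conn.execute("PRAGMA table_info(notes)")]
```

Running migrate on every new build keeps the schema in step with the code, and a change to the data model becomes one more entry appended to the list.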

            • Re: (Score:3, Interesting)

              by Hangtime (19526)

              From the immortal words of Joe Celko in response to a similar question you discuss and one of the most true statements ever written:

              My SQL program is trying to compete with a flat file system.

              If you want to get data to a single user, in a fixed format, you will
              lose. The reason we have databases is not speed. Databases are for sharing
              data (concurrency control and all that jazz), and keeping data integrity
              (normal forms, constraints and all that jazz).

              You can get to the ground floor a lot faster by jumping down an empty
              elevator shaft instead of waiting for the car to arrive. However, there
              are trade-offs ...
              --CELKO--
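
The "keeping data integrity" half of that quote can be sketched with a constraint and a transaction. This is a minimal illustration using Python's built-in sqlite3 module; the accounts table and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES (1, 100), (2, 0);
""")


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts").fetchall())


def transfer(conn, src, dst, amount):
    """Move money between accounts; the database rejects overdrafts."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit above was rolled back too.
        return False


overdraft_ok = transfer(conn, 1, 2, 500)
after_overdraft = balances(conn)
small_ok = transfer(conn, 1, 2, 30)
after_small = balances(conn)
```

A flat file would happily record the half-finished transfer. Here the rule lives with the data, so every application sharing the database gets the same guarantee.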

              If data has little to no value for you then you do not need a relational database. However, if data is of any importance to you then you have to think beyond a flat file. Flat files and hierarchical databases have been around since the dawn of computing. Relational databases were brought about to solve concurrency and integrity problems inherent in these models not to make your appl

        • Re:Article summary (Score:5, Interesting)

          by TheLink (130905) on Sunday March 28, 2010 @01:53PM (#31648732) Journal
          The syntax might be crap, but it's far easier to get everyone to standardize on SQL to talk to DBs.

          "NoSQL" stuff is fine if your company is simple in structure - very few products/services, and it has to write most of that stuff itself anyway.

          When you have many different departments with their own different apps (in house and 3rd party), and they all want to access the same bunch of databases, SQL just becomes the "standard API or language" you use to talk to them. In contrast say you have some custom "NoSQL" DB, it's going to be harder to find stuff that talks to it (you might have to write your own connectors).

          It's just like "English", the syntax might be crap, but it's far easier to get 3rd parties and other departments to use it. In contrast if you use Lojban, despite its supposed advantages you're probably going to have to get translators (or worse - train your own translators) whenever you need to deal with outsiders who don't speak it.
      • Re:Article summary (Score:5, Insightful)

        by slim (1652) <john@hartn u p .net> on Sunday March 28, 2010 @11:54AM (#31647754) Homepage

        Look at Oracle solutions. All their fancy eBusiness software is still Oracle SQL DB backed and some of the biggest companies in the world are using it.

        Yep, "nobody ever got fired for choosing Oracle".

        But to get performance and fault tolerance for Oracle, you need to throw a lot of money at it -- high end hardware, RAC licenses etc. Whereas some of the NoSQL DBs promise lots of scalability on clusters of cheap hardware -- situations where failing hardware is the norm.

        If your application suits it (i.e. your data fits the name/value system, and eventual consistency is adequate) why not use something fast and cheap?

      • by Nerdfest (867930)
        Sure you can scale SQL databases. The real point is that it takes a lot more work to do it than with a NoSQL database, and in some cases the advantages of SQL aren't worth the hassle. It depends on what problem you're trying to solve and what your other constraints are.
      • by c-reus (852386) on Sunday March 28, 2010 @12:03PM (#31647812) Homepage

        Oracle database license prices scale very well, too.

        • Re:Article summary (Score:5, Interesting)

          by BitZtream (692029) on Sunday March 28, 2010 @04:05PM (#31649838)

          Considering that by the time you 'need' Oracle, the price of Oracle is a drop in the bucket.

          The only people that ever complain about the price of Oracle are the people who will never have the need to use it because they'll never have the traffic to it to require it.

          Sorry you haven't got to play with the big boys, but in general if you spend your time worrying about how much 'software costs' your business sucks. Software costs, even for Oracle, are trivial compared to the other costs that go into it.

          An Oracle DB serving internet facing customers for instance is going to cost an order of magnitude more for bandwidth in the first year than the cost of an Oracle license to deal with it.

          But you go ahead, keep pretending you have some sort of clue and are witty by pointing out it's expensive. If you ever make it to that scale, the last thing on your mind will be the price of an Oracle license.

      • Re:Article summary (Score:5, Interesting)

        by squiggleslash (241428) on Sunday March 28, 2010 @12:20PM (#31647932) Homepage Journal

        There's a fairly obvious reason for NoSQL vs Pro-SQL, and it's this: SQL is absolutely the worst database query language ever invented... apart from all the others.

        Virtually no-one who's spent any time analyzing and working with large amounts of data has a good word to say about SQL. It was designed from the start as a language that would be integrated into others, and yet simple real world realities make that impossible, with 99% of implementations being of the "Build a large string, and pass that string to "the SQL connector" to be parsed and interpreted" form. Its handling of null and the empty string is incomprehensible and useless, in part because nobody involved ever had the cojones to do what needed to be done with both. There is no standardized set of data types in the real world. Simple issues with unstandardized case dependencies can make an application that works with Oracle and only uses standard "select" statements not work under, say, PostgreSQL. And these are the surface level technical issues: talk to any relational database guru and they'll come up with numerous philosophical issues too.

        To this you add another component that's always an issue: the entirely haphazard way in which relational databases are implemented on most operating systems, whereby the DBMS is another application, that manages its own files, and needs to be coached with kind words and a happy smile in order to get anything done. Does your app use a database for something back-endy, like, for example, MythTV does for its settings and lists of channels and TV programs? Well, either forget it, or be prepared to put your users through hell as they have to ensure that the entirely separate DBMS is installed and that usernames and passwords are set up for your application's use.

        And so, naturally, people hate them. With a passion. To the point that anyone sane is going to put it low on the list for any application, even when it's entirely appropriate. Of course your multiuser databases in your enterprise environment should be stored using an enterprise grade RDBMS, and as nobody's come up with anything better, you should be talking to it using SQL.

        ...and you should be talking to it carefully. Ideally, those writing the application core should be handing over the database access to someone who can abstract each query properly. Because SQL sucks. It just sucks less than anything else designed to do the same thing.

        • Re:Article summary (Score:5, Insightful)

          by seanadams.com (463190) * on Sunday March 28, 2010 @12:37PM (#31648076) Homepage

          Does your app use a database for something back-endy, like, for example, MythTV does for its settings and lists of channels and TV programs? Well, either forget it, or be prepared to put your users through hell as they have to ensure that the entirely separate DBMS is installed and that usernames and passwords are set up for your application's use.

          sqlite is underrated and would be ideal for many such applications.
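
For readers who have not used it, the appeal is that the whole database is a library call plus one file, with no separate server, usernames, or passwords. A minimal sketch of an application settings store using Python's built-in sqlite3 module follows; the table layout and function names are hypothetical and are not taken from MythTV.

```python
import sqlite3


def open_settings(path=":memory:"):
    """Open (or create) the settings database; path would normally be a file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)"
    )
    return conn


def set_setting(conn, key, value):
    with conn:  # commit the write, or roll back if it fails
        conn.execute(
            "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
            (key, value),
        )


def get_setting(conn, key, default=None):
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else default


settings = open_settings()
set_setting(settings, "default_channel", "7")
set_setting(settings, "default_channel", "12")  # overwrites the earlier value
channel = get_setting(settings, "default_channel")
missing = get_setting(settings, "no_such_key", default="unset")
```

Passing a file name instead of ":memory:" makes the settings persist between runs, and the user never has to install or configure anything.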

          • Re: (Score:2, Interesting)

            by Anonymous Coward

            ... were it not for the fact that SQLite is at least two orders of magnitude slower than any other database, including ones written by first year comp sci students.

            • Re: (Score:3, Informative)

              by Phroggy (441)

              ... were it not for the fact that SQLite is at least two orders of magnitude slower than any other database, including ones written by first year comp sci students.

              But if MythTV takes twice as many milliseconds to read a channel listing, it really doesn't matter. Nobody's suggesting that SQLite can replace a real database server in all cases, but performance and scalability are completely unimportant in some applications.

            • Re: (Score:3, Interesting)

              ... were it not for the fact that SQLite is at least two orders of magnitude slower than any other database, including ones written by first year comp sci students.

              One of the following two things is missing in your post:

              1) A reference to back such a bold claim.

              2) A qualifier along the lines of "... with many concurrent writers".

        • Re:Article summary (Score:5, Insightful)

          by mcrbids (148650) on Sunday March 28, 2010 @01:56PM (#31648764) Journal

          Virtually no-one who's spent any time analyzing and working with large amounts of data has a good word to say about SQL.

          I've spent 10 years developing intensively relational applications with SQL. I love it!

          It was designed from the start as a language that would be integrated into others, and yet simple real world realities make that impossible, with 99% of implementations being of the "Build a large string, and pass that string to "the SQL connector" to be parsed and interpreted" form.

          So... because people don't bother to learn about things like prepared statements, the tool is bad? It's like saying that cars suck because they don't have cruise control!
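
A rough sketch of the contrast, using parameter binding in Python's built-in sqlite3 module as a stand-in for prepared statements in general. The users table and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("o'brien",)])


def find_user_by_concatenation(conn, name):
    """The 'build a large string' style: breaks on quotes, invites injection."""
    sql = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_user_parameterized(conn, name):
    """The statement text is fixed; the value is bound separately."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


found = find_user_parameterized(conn, "o'brien")

try:
    find_user_by_concatenation(conn, "o'brien")
    concatenation_failed = False
except sqlite3.OperationalError:
    # The apostrophe ends the string literal early and the SQL no longer parses.
    concatenation_failed = True
```

The bound version never mixes data into the statement text, so a name containing an apostrophe is just another value.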

          Its handling of null and the empty string is incomprehensible and useless, in part because nobody involved ever had the cojones to do what needed to be done with both.

          OK, so enlighten us with your brilliance! Share with us the ultimate answer of what should be done to differentiate a null (logically, "I don't know") with a blank string (logically, "We know there's nothing there") and what should be done differently?

          IMHO, the concept of "null" is a very useful one which allows a developer to differentiate between a blank answer and a no answer.

          There is no standardized set of data types in the real world. Simple issues with unstandardized case dependencies can make an application that works with Oracle and only uses standard "select" statements not work under, say, PostgreSQL.

          Woah, hold on there boy! You mean to say that features specific to one database engine won't work with another? Well spank my uncle and grease my kittens - this is amazing! Unless, of course, you stick to ANSI 92 syntax, which is pretty much 100% compatible. Yes, there's some regression testing you'll have to do against the different databases. Just like you have to do with HTML, XML, or any other standards-based language.

          (yawn)

          And these are the surface level technical issues: talk to any relational database guru and they'll come up with numerous philosophical issues too.

          Strange how you didn't manage to name even one?

          But here's the part of this whole "NoSQL vs SQL" debate - SQL is an interface API to a DBMS, it's not the database itself! You can use any number of technologies "under the hood" including those types of technologies commonly referred to as "NoSQL" and put an SQL interface in front! The whole idea that SQL is somehow the problem is just.... idiotic and betrays an astonishing lack of understanding by the programmer(s) involved.

          It's like saying that you should have a stick-shift car because automatic transmissions don't go as fast. It's just moronic. Arguing about NoSQL is like arguing with a tea party dolt about the "socialist" health care plan that just passed! (that was first drafted by the "right wingers" 15 years ago)

          It's argument from stupidity.

          • Re: (Score:3, Informative)

            by einhverfr (238914)

            OK, so enlighten us with your brilliance! Share with us the ultimate answer of what should be done to differentiate a null (logically, "I don't know") with a blank string (logically, "We know there's nothing there") and what should be done differently?

            Well, the way PostgreSQL handles it is that a NULL is stored as a NULL and treated as one (i.e. NULL || ' more text' evaluates to NULL). '' is stored as an empty string and processed as one (i.e. '' || ' more text' evaluates to ' more text').
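
The two expressions above can be tried without a PostgreSQL server. The sketch below runs them through Python's built-in sqlite3 module instead, on the assumption that SQLite treats NULL and the empty string the same way for the `||` operator; it is an illustration, not a test of PostgreSQL itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Concatenating with NULL yields NULL; concatenating with '' yields the text.
null_concat, empty_concat = conn.execute(
    "SELECT NULL || ' more text', '' || ' more text'"
).fetchone()

# A NULL is not equal to anything, including another NULL, so the comparison
# itself comes back as NULL (None in Python) rather than true or false.
null_equals_null, empty_equals_empty = conn.execute(
    "SELECT NULL = NULL, '' = ''"
).fetchone()
```

In Python the NULL results arrive as None, while the empty-string results arrive as ordinary values, which is the "I don't know" versus "nothing there" distinction being argued about.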

            Really, that strik

        • Re:Article summary (Score:5, Insightful)

          by Kenneth Stephen (1950) on Sunday March 28, 2010 @03:10PM (#31649420) Journal

          Au contraire.

          While there are problems with SQL, 95% of its users are happy as a clam that it exists. The unhappy users are the ones who are pushing the boundaries of what SQL allows and those are the people who know SQL best. When you are writing SQL queries that span 200 lines of code, then, and only then do you begin to scratch at the limits of what SQL allows. Until then, you've only hit the limits of competence.

          I've been working with SQL for over 20 years now. I've worked with applications that didn't use RDBMS's. Some of them used flat files. Some of them used hierarchical databases. People who haven't had the same sort of experiences, haven't come to the realization of why SQL was invented - and that results in them making ill-founded statements like "SQL is absolutely the worst database query language ever invented". Utter tosh. SQL has its problems, but it's one of the best. That's why it has left its competitors in the dust of time.

          I look around at all the frameworks that have evolved to not do SQL (EJB-QL, Hibernate, etc) and I laugh. None of those languages come close to handling the same breadth and depth of problems that SQL can be used to solve. Whenever I see advocates of these frameworks all puff up with fervour, I feel like shaking them and saying "Your emperor has no clothes!". The list of problems these frameworks can't solve is so huge that one wonders why anyone works with them at all. But I suppose, there are plenty of people who work for small businesses who haven't encountered the kind of problems that big enterprises have.

          The parent poster that I'm responding to has apparently had problems porting SQL code. But guess what? Even on the unix platform, applications written in C have had trouble being ported from one Unix to the next. People have worked around it. Nobody goes around arguing that "C is absolutely the worst programming language ever invented".

          • Re: (Score:3, Informative)

            by K. S. Kyosuke (729550)

            SQL has its problems, but it's one of the best. That's why it has left its competitors in the dust of time.

            Oh, bullshit. SQL succeeded because it came from IBM, and what comes from IBM must be good by definition...or not? If we're talking about *relational* databases, then SQL is about as good a relational query language as COBOL is a general purpose language. C.J. Date wrote The Third Manifesto [wikipedia.org] for a reason.

      • Re:Article summary (Score:5, Interesting)

        by ducomputergeek (595742) on Sunday March 28, 2010 @12:41PM (#31648112)

        I don't have mod points, but I've found the same thing. It's the perfect development database if you think that your program is ever going to need to support Enterprise class stuff. On the small scale, I've found that it's fast enough. Is MySQL faster? Yes, but where I've tested it's not been enough to really matter compared to the other advantages of PostgreSQL. Primarily that it's ACID compliant. What we've found is that it works well until you start getting into databases that are GB in size. But then you can easily port the datatables to DB2 or Oracle and go. Especially if you designed the rest of the software to do this from the get go.

        In production, we moved all but one of our databases from MySQL to PostgreSQL. We were having problems with InnoDB getting corrupted once every couple of months. When it was announced that Oracle was bidding on Sun, we ported over to PostgreSQL, spent a couple weeks rewriting code, and we've not touched the Postgres database since. It's not corrupted and not even hiccuped once since we deployed. We run regular vacuuming and maintenance and that's it. It's been humming for well over a year and is now getting 400x the use we ever had with MySQL.

        The only thing that PostgreSQL was lacking has been HA support. There are a number of 3rd party tools that run well (PGCluster, Slony, GridSQL), but it looks like PostgreSQL is going to support native replication, clustering, and HA with hot-standby...

      • Re: (Score:3, Interesting)

        by JamesP (688957)

        SQL isn't the problem

        Yes, it is

        Overhead caused by structuring your data the way relational dbs needs.
        Lack of flexibility
        Scalability capabilities (horizontal scaling is easier)
        Speed (see overhead)

      • Re: (Score:3, Interesting)

        SQL isn't the problem, it's a tool. Bad programmers are the problem.

        You could say the same about assembly language. You could also say the same about threads, and dismiss things like functional programming and the actor model as fads.

        I'll give you a simple example: Given a big transactional SQL database, if you want it to scale to more than a few machines, you're going to want to shard it. That's going to be a ton of manual work, figuring out what you can shard, what keys to shard it on, adjusting it later on the fly to ensure that each DB server has exactly what it can han

    • by Daengbo (523424) <daengbo@nospaM.gmail.com> on Sunday March 28, 2010 @11:40AM (#31647630) Homepage Journal

      "MySQL or PostgreSQL," for what it's worth. PostgreSQL is a pretty powerful database, and you should have to make a pretty good argument for why a well understood technology that powers a lot (and some of the largest parts) of the WWWeb needs to be trashed for something newer and less tested.

      • There are times... (Score:3, Interesting)

        by lenski (96498)

        Our development organization is heavily invested in PostgreSQL, finding it to be perfectly matched to almost all of our needs. It is exceptionally reliable, and is very (but not perfectly) manageable. (We've had issues in the past with mis-timed auto-VACUUM for instance which are now resolved.) We even found a small but significant corner-case bug which upon being reported, received immediate attention from the developers, resulting in a resolution in under 72 hours. I believe our use of this particular too

  • by sopssa (1498795) * <sopssa@email.com> on Sunday March 28, 2010 @11:31AM (#31647552) Journal

    This is like saying "I can't wait for memcached to die" just because your site doesn't need it. Fact is, some do. It's your own fault if you choose to apply unnecessary techniques.

    Don't change to newer fancy techniques if you don't understand what they are for and why would you need them.

    • Re: (Score:3, Insightful)

      by David Gerard (12369)

      memcached is most useful when the underlying app is hideously inefficient, e.g. it's pretty much essential to a MediaWiki installation that gets any appreciable number of users.
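
The usual way a cache like memcached sits in front of an expensive page is the cache-aside pattern, sketched below. To keep the example self-contained, TinyCache is a hypothetical in-process stand-in with get and set methods, not a real memcached client, and render_page_slowly is a placeholder for whatever the inefficient application does.

```python
import time


class TinyCache:
    """In-process stand-in for a cache server (hypothetical, for illustration)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: behave as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


render_calls = {"count": 0}


def render_page_slowly(title):
    """Placeholder for the expensive work (parsing, templating, many queries)."""
    render_calls["count"] += 1
    return f"<h1>{title}</h1>"


def get_page(cache, title, ttl_seconds=60):
    """Cache-aside: try the cache, fall back to rendering, then store the result."""
    key = "page:" + title
    html = cache.get(key)
    if html is None:
        html = render_page_slowly(title)
        cache.set(key, html, ttl_seconds)
    return html


cache = TinyCache()
first = get_page(cache, "Main_Page")
second = get_page(cache, "Main_Page")  # served from the cache
```

The second request never reaches the slow path, which is why the less efficient the underlying app is, the more the cache appears to help.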

    • by outZider (165286)

      Well, no, not entirely. Not many sites out there run purely from memcached. Memcached is a component of a larger architecture. The fact remains that technologies like NoSQL are usually used/desired by people who have no understanding of system architecture, design an inefficient application, and then blame the database software for their poor decisions.

      • Re: (Score:3, Interesting)

        by Anonymous Coward

        Facebook.com, the highest-traffic site on the Internet, serves more than 95% of its data out of memcached. Twitter, Wikipedia, etc are major users too. And of course, Google serves its web index out of memory.

    • by RightSaidFred99 (874576) on Sunday March 28, 2010 @01:59PM (#31648786)
      No, he's saying he can't wait for the _hype_ over NoSQL to die.
  • by Anonymous Coward on Sunday March 28, 2010 @11:40AM (#31647634)

    XML text files all the way! /duck

  • Why? (Score:2, Insightful)

    Why should anything "die"? People choose solutions based on their individual merits. If something doesn't work, exchange it for something that does. I'm sure certain people find NoSQL-type databases perfect for their needs.

    In short, people should just shut up about other people's choices and get on with their own.
  • I liked the pictures. Is there a name for the muppet guy in the first one?
  • There's a place for SQL, but there are some cases where a BigTable-like store (i.e. HyperTable) works better. Our company manages data using SQL, but when we present data to the users it's through a HyperTable implementation. SQL is easier for data management but HyperTable uses our server resources better.
  • by Anonymous Coward on Sunday March 28, 2010 @11:47AM (#31647684)

    It's really that simple. A standard dual socket server with the latest CPU's from Intel or AMD can handle hundreds of requests per second; if one isn't enough, just add more hardware, one month of salary can buy you another node, a year can buy you a whole cluster of rackable systems or a chassis full of blades. If it takes a few months extra for a team to solve the problem the NoSQL way, that's a few months of extra salary costs and missed sales.

    Slashdot runs on SQL. I run a site of 1M pages daily (1/3-slashdot according to Alexa) with just a single system with 2x Xeon E5420, Django/PostgreSQL at 10% load. Unless you attract enough attention to require scaling past 10M pages a day, you're wasting your time reinventing the wheel with NoSQL, just stick with a standard ORM, launch your site and start convincing customers and generate sales. You can survive a slashdotting just fine without spending so much time on those exotic tools.
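
    A rough sketch of the arithmetic behind those figures, assuming traffic is spread evenly over the day (real traffic peaks well above the average, so treat these as lower bounds). The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_requests_per_second(pages_per_day):
    """Average page rate, assuming traffic is spread evenly over the day."""
    return pages_per_day / SECONDS_PER_DAY


# The two daily volumes mentioned above: 1M and 10M pages.
for pages in (1_000_000, 10_000_000):
    rate = average_requests_per_second(pages)
    print(f"{pages:>10,} pages/day is about {rate:.1f} pages/second on average")
```

    That works out to roughly 12 and 116 pages per second, which is in line with the "hundreds of requests per second" a single server is said to handle.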

    • by Vellmont (569020) on Sunday March 28, 2010 @12:45PM (#31648156)


      It's really that simple. A standard dual socket server with the latest CPU's from Intel or AMD can handle hundreds of requests per second;

      Hundreds of requests for WHAT per second?

      Your idea of "just throw hardware at the problem" isn't generalizable. Throw hardware at WHAT problem? For some problems, you're right. For others, you couldn't be more wrong. There's really no point in saying anything further.

    • by Lazy Jones (8403) on Sunday March 28, 2010 @12:46PM (#31648170) Homepage Journal

      Unless you attract enough attention to require scaling past 10M pages a day, you're wasting your time reinventing the wheel with NoSQL, just stick with a standard ORM, launch your site and start convincing customers and generate sales.

      Most of the buzz about these things comes from and is aimed at people who actually believe they'll build the next Facebook or Twitter. The fallacy is in their belief that it's the size/traffic of those sites that supposedly mandates NoSQL and not the simple data models. Some of the biggest, less spectacular projects out there run on PostgreSQL for example (Skype, Afilias = .info and .org).

    • I run a site of 1M pages daily (1/3-slashdot according to Alexa) with just a single system with 2x Xeon E5420, Django/PostgreSQL at 10% load.

      Good for you -- that says nothing about how much you're actually doing for each page.

      just stick with a standard ORM

      As a rule, I do. I use DataMapper in Ruby. It's just that DataMapper has pluggable backends, some for SQL databases, some for more exotic things.

    • Re: (Score:3, Insightful)

      by JamesP (688957)

      For the type of loads 'front-page' slashdot (and your site, most likely) gets, SQL fits fine. But even then, NoSQL may give you a run for the money.

      Now think of the loads incurred in the comment tree of slashdot.

      Also think how something like GMail or even Google Search would fit in an SQL scheme. It doesn't, at least not without table juggling that would be very inefficient.

  • by Vellmont (569020) on Sunday March 28, 2010 @11:48AM (#31647692)

    So you're in surgery for 3 hours doing a kidney transplant, having used your trusty medium vascular clamps that have served you for the past 20 years. You're finally done and the patient is in recovery, so you sit down to relax with the latest copy of JAMA. They've got a great article about the latest development of cardiac clamps, and you think to yourself "Why not use a heart clamp for kidney transplants!" Brilliant. So you order up some new clamps from MedicalClamps.com, and use them on your next patient. The surgery goes fine, but 3 months later the patient is back in your office with a failed kidney. You open 'em up, and it's obvious the clamp exerted too much pressure on the artery, damaging it in the process. Stupid cardiac clamps! You're not a heart surgeon!

    • Re: (Score:3, Informative)

      I think this would have been better if you'd used a car analogy ... maybe something with hose clamps?
    • by gazbo (517111) on Sunday March 28, 2010 @12:43PM (#31648142)
      You missed out the bit where the article about cardiac clamps talks about how much better they are than the old-fashioned medium vascular clamp. And how every subsequent edition of JAMA has several articles all trumpeting the glory of the cardiac clamps over the now-outdated vascular clamps (although all of these articles are written by first-year med students who have never actually performed an operation - but they did once have a nose-bleed and chose to use a cardiac clamp to stop it).

      Analogies are FUN!

  • by RedMage (136286) on Sunday March 28, 2010 @12:01PM (#31647798) Homepage

    FTA:
    "In the meantime, DBAs should not be worried, because any company that has the resources to hire a DBA is likely has decision makers who understand business reality."

    Bad English aside, I just don't agree. Money != Reality. I have worked both sides of this coin: startups with plenty of money that don't see the value in proper maintenance of the data store (one was almost put out of business by a disk failure), and very smart startups that are running lean but do understand the risks.

    That said, on the deeper level, why does business reality == SQL? Sure I can scale Oracle to support massive DB's (and have), but I could probably get more value from using Amazon's SimpleDB for things that don't require massive scaling. Use the right tool for the job - Hammers are for nails, etc. Do the design work up front, decide how it's gonna work, and the right tool should present itself.

    • by pavon (30274)

      Sure I can scale Oracle to support massive DB's (and have), but I could probably get more value from using Amazon's SimpleDB for things that don't require massive scaling. Use the right tool for the job ...

      Isn't the entire point of these NoSQL databases that they offer better scalability at the cost of traditional ACID data guarantees? Why would you give up the flexibility and reliability of SQL if you didn't need massive scaling?

    • Re: (Score:3, Informative)

      by BitZtream (692029)

      If you're worrying about the cost of an Oracle license, what DB you use is irrelevant, you simply aren't large enough to make a wrong choice.

      When you are large enough for this to matter, the cost of Oracle or the cost of a handful of DBAs is the least of your concern.

      It blows my mind how much value slashdot geeks put on the cost of software. You guys have absolutely no fucking clue how much a single employee costs a company excluding salary do you? You've been spending far too much time living in the base

  • by SQL Error (16383) on Sunday March 28, 2010 @12:09PM (#31647850)

    Real businesses track their data with SQL databases, true. However, real businesses have small numbers of transactions relative to their value. If Walmart had the same revenue but the average sale was a tenth of a cent, their fancy SQL database would be smouldering rubble.

    That's what Facebook and Twitter and other large social media sites are facing. Just try running Twitter's volume and Twitter's page hits and API hits off MySQL. It doesn't matter how many replicas you run, it's not going to work. Maybe you could run it on a cluster of IBM Z-series mainframes running DB2 - but where is the money going to come from?

    Cassandra and HBase and the other distributed NoSQL databases solve specific problems in specific ways. They won't work for Walmart, but they'll do the job just fine for Facebook and Twitter. If you have those specific scaling problems and can live with the restrictions (you lose ACID, indexes, and joins to varying degrees) then they'll work for you.
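
    To make "losing joins and indexes" concrete, here is a minimal sketch of what the application ends up doing itself on a plain key-value store. Two dicts stand in for two "tables"; the data and function names are invented for illustration and do not reflect the API of Cassandra, HBase, or any other product.

```python
# Two dicts stand in for two key-value "tables". All names and data here
# are made up for illustration.
users = {
    "u1": {"name": "alice"},
    "u2": {"name": "bob"},
}
posts = {
    "p1": {"author": "u1", "text": "hello"},
    "p2": {"author": "u2", "text": "hi"},
    "p3": {"author": "u1", "text": "bye"},
}


def posts_with_author_names():
    """Do by hand what `posts JOIN users ON posts.author = users.id` would do."""
    joined = []
    for post_id, post in sorted(posts.items()):
        author = users[post["author"]]  # one extra lookup per post
        joined.append((post_id, author["name"], post["text"]))
    return joined


def posts_by_author(author_id):
    """Without a secondary index on author, every post has to be scanned."""
    return sorted(pid for pid, post in posts.items() if post["author"] == author_id)


print(posts_with_author_names())
print(posts_by_author("u1"))
```

    The usual workaround is to denormalize, for example storing the author's name inside each post, which trades the extra lookups for harder updates.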

    If all you know is that your site is running slow, then implementing NoSQL is unlikely to improve things.

    • If you get to the size of Walmart doing anything, you have access to the capital to get a system from IBM or Oracle for OLTP and Teradata for data warehousing.

  • Why should I give two shits about what database system someone else uses?
  • by SmallFurryCreature (593017) on Sunday March 28, 2010 @12:14PM (#31647886) Journal

    I think some developers keep looking for the holy grail. Some magical solution that will turn development from punching in code, to Star Trek: "Computer do my job for me please".

    Template languages, 4GL, NoSQL, Ruby on Rails... it is all part of an attempt to take the nasty out of development and they all... well... they all just don't really happen.

    Because deep down, with all the frameworks and generators, if you want your code to do what you want it to do, you are still writing out if statements a lot.

    And yes, OO and such also belong to this. Not the concepts themselves, but the way most people talk about them. OO means code re-use right?

    If you said yes, then you are a manager, go put on your tie, you will never be any good at coding.

    You can re-use all code. And it has been done for a long time.

    What, did you think that people who wrote BASIC for the C64 went "Oh I wrote this bit of code for printing, now I need the same functionality, I am going to write it all over again!"

    OO does make code re-use a bit easier BUT that is NOT the claim that people often make. Trust me, I ask this in interviews and it is always the same answer. Apparently you can't re-use functions. No way, no how. NEXT!

    I see two kinds of developers. Those who hate their job and those who don't. The former want to be managers, get away from writing code as fast as possible. And they will leap on anything that seems to make their jobs easier. Meanwhile the rest of us go on with actually producing stuff.

    Just check, how many times do you get one of those manager wannabes introducing something they read in a magazine because it promises that you don't need to write another line of code ever!

    • "Computer do my job for me please"

      [HAL] Certainly, Small Furry Creature ... would you like fries with that? [/HAL]

    • by Vellmont (569020) on Sunday March 28, 2010 @12:36PM (#31648068)

      In some ways I agree with the general idea of your post. But stepping back a bit, code HAS gotten easier to write over the long term. I'd hope nobody would argue that writing a large application in a modern high-level language isn't easier than writing it in assembly using 1970s technology. Those advancements in language came through a lot of trial and error (a lot of error). How many failed languages exist that turned out to be dead ends (though they spurred further advancements and refinements)? How do you know the technologies you mentioned won't turn into the next (your favorite productive language here)?

      You're right that endlessly pursuing the latest trend is just foolhardy, as most "new latest greatest technology" turn out to be duds. The point being those duds sometimes DO pan out. Anyone that thinks that relational databases are the end-all-be-all of persistent data storage hasn't done enough relational database development to understand some of the limitations.

    • Re: (Score:2, Informative)

      by tukang (1209392)

      OO does make code re-use a bit easier BUT that is NOT the claim that people often make. Trust me, I ask this in interviews and it is always the same answer. Apparently you can't re-use functions. No way, no how. NEXT!

      You can reuse functions but you can't extend them and that's where OO's reuse shines. It's very powerful to be able to lay out your code as a tree and control the reuse 'flow' at the nodes.
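
      A minimal sketch of that distinction, with made-up class names: the subclass overrides one method (one "node" of the tree) and inherits everything that calls it, rather than copying or rewriting the caller.

```python
class Printer:
    """Base behaviour that callers can reuse unchanged."""

    def format(self, text):
        return text

    def render(self, text):
        # render() reuses format(); a subclass changes the result by
        # overriding only format().
        return f"[{self.format(text)}]"


class ShoutingPrinter(Printer):
    """Extends one node of the tree; render() is inherited as-is."""

    def format(self, text):
        # Build on the parent's behaviour instead of replacing it outright.
        return super().format(text).upper()


print(Printer().render("hello"))
print(ShoutingPrinter().render("hello"))
```

      Plain functions can get a similar effect by taking the varying step as a parameter, so this is about convenience rather than something only OO can express.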

  • The company I work at is currently having a huge headache moving from files into databases. We currently store everything in XML which gives us a great amount of freedom and adaptability. However most database solutions fix you to a single (or handful) of data definitions. While you can kind of re-create XML by defining all kinds of crazy relationships, it gets hugely convoluted (to say the least).

    I would LOVE to see a document/XML-like database. Just needs to do things that standard databases support (e.g

    • by Gorath99 (746654)

      Have you checked out something like XML DB [oracle.com]? I haven't used it much myself, but it sounds like it may meet your needs. It comes bundled with the XE database [oracle.com], which is free as in beer. (But XE has some limitations that the enterprise product doesn't have, of course.)

      Disclaimer: I work for Oracle.

    • by kuhneng (241514)

      Oracle and DB2 both support the SQL/XML standard and provide quite a bit of functionality for native handling of XML. Both can store structured / compressed representations in a native XML type (with or without a predefined schema) and use XPath-based indexes for efficient query execution.

      Wonderful stuff, and one of the few features I really miss back in the PostgreSQL world.

  • At first, I thought NoSQL like Cassandra should simply be used as a store for precomputed relationships. Then I thought NoSQL was just a structureless store that can scale in any given direction with no effort.

    Both sound interesting, but then the debate against NoSQL is just "well, SQL can already do all that, but you get data integrity with it. If it doesn't scale, then just build a manly man's server and it will".

    So, I dunno. The whole debate has gotten very religious very quickly and as a result, no one

  • More RDBMS dogma (Score:4, Insightful)

    by Angst Badger (8636) on Sunday March 28, 2010 @12:32PM (#31648048)

    Use the right tool for the job, except databases, eh?

    The simple fact of the matter is that not every app is aiming for Google's scale. (Not every app is web-based or even going to be web-based, though people seem to forget that.) And even some large-scale apps don't fit the relational model very well, medical records being one of the more outstanding examples.

    And yes, I have read Codd and Date and understand the relational model and its benefits very well, and it annoys me to no end when people break the relational model without realizing or understanding what it costs them. That said, sometimes those costs are acceptable, and sometimes an application requires features that the relational model does not (and in fact cannot) bring to the table.

    It may be, as with every other silver bullet fad, that what's at work here is the basic human tendency to become familiar with something, begin to see everything in terms of it, and then try to persuade anyone who'll listen that they are in possession of the all-singing, all-dancing solution to all problems. Today, it's Ruby, multi-touch interfaces, and functional programming. But not very long ago it was COBOL and CICS. And while one must acknowledge that progress has been made, it is equally obvious that progress will continue to be made and that "one size fits all" is always BS, even in clothing.

  • by cervo (626632) on Sunday March 28, 2010 @12:43PM (#31648134) Journal
    Many of the NoSQL sources scale better than a normal database and are available cheap. Oracle costs a fortune, and if you want to run Oracle on a cluster good luck. They also don't let you publish benchmarks without their permission. But most people I know who use Oracle claim it totally beats everything else (without further clarification). DB2 includes a cluster edition that is also quite good. It uses a shared nothing architecture. But none of these solutions are free. Teradata is also cited as a good parallel database. If you are a start-up and your choice is a NoSQL solution that is almost free or 100,000+ for some commercial parallel database, which do you go to?

    But no matter what, with a relational database you will consume resources on ensuring consistency (which many times is what you want, but not 100% of the time). Amazon's Dynamo works by not caring so much about consistency and trading consistency for availability of the overall service. For a shopping cart it is fine, but you wouldn't want to do your credit card processing using it. Google's GFS is optimized to do the file operations that Google does the most. However there was an article in the ACM not that long ago comparing Map Reduce (Hadoop's implementation) against two parallel databases, and it lost. Of course, the parallel databases were all not free....and Hadoop is....

    So overall I'd say the decision comes down to price mostly (as it does with most startups). If you can make do with one server then sure do PostgreSQL (or MySQL...although they always tried to force licensing for commercial products even though it is GPL...). If you need a cluster, both have clustering solutions, but as far as I can tell they are not as good as the commercial parallel databases. If you have lots of money then sure go with Oracle, it seems through word of mouth Oracle is the best for both parallel and stand alone in terms of performance. DB2 was good enough for a former job. They had terabytes in the mid 1990's using about 20 servers. Now that the hardware is much better I'm sure it scales even better.... But if money is a consideration, then go with an open source NoSQL solution. A lot of people now swear by Cassandra, I haven't had a chance to check it out yet.
  • by RAMMS+EIN (578166) on Sunday March 28, 2010 @12:55PM (#31648234) Homepage Journal

    I'm still fuzzy on what NoSQL is supposed to be and what it is supposed to bring to the table.

    From what I've understood, it's basically a common banner for various different databases that all share the common property of not being relational databases and not providing ACID guarantees.

    If so, it seems to me that the whole NoSQL vs. RDBMS [wikipedia.org] debate is about a false dichotomy. There are some applications where a relational database is the right tool for the job, and there are some where a relational database is not the right tool for the job. In some of those latter cases, one of the NoSQL databases may be the right thing.

    This is nothing new. Non-relational databases have been used on Unix for a long time, and are even a standard part of POSIX (see for example the manpage for dbm_open [opengroup.org]). It's also long been known that, for example, Berkeley DB [oracle.com] can be a lot faster than an RDBMS - as long as your application doesn't make use of all the features an RDBMS provides. Lots of programs even don't use one of these database systems, but invent their own, custom format. Git [git-scm.com] is a very successful example of this.
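
    As a small illustration of that older style of database, here is a sketch using Python's standard `dbm` module, which wraps whichever dbm-style library is available on the system. The file name and keys are made up; the point is only that storage and lookup are by key, with no schema and no query language.

```python
import dbm
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example")  # hypothetical file name

    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(path, "c") as db:
        db[b"user:1"] = b"alice"
        db[b"user:2"] = b"bob"

    # Reopen read-only: lookups are by key only, with no query language.
    with dbm.open(path, "r") as db:
        value = db[b"user:1"]
        missing = db.get(b"user:3")  # absent keys give None via get()

print(value, missing)
```

    Anything beyond "give me the value for this key", such as filtering or joining, is left to the application.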

    To me, it seems that what we are seeing here is loads of people who had learned to use relational databases for all their storage needs discovering that there are other ways to store data, and that one of those methods may work better than an RDBMS for a particular application. Well, yes. Does that surprise anyone? It sure doesn't surprise me. Does it mean that RDBMSes are now useless? Not at all. Does it mean you should use a non-relational storage system where this makes more sense? Of course! Now, can we please get back to work? I don't see the point of having a holy war over whether RDBMS or NoSQL is better, when common sense says that they both have their uses.

    • Re: (Score:3, Insightful)

      by ishobo (160209)

      Unfortunately the NoSQL people should have called their movement "nonrelational". You can have a relational database and not use SQL; the two are not dependent on each other as there are nonrelational databases that allow the use of SQL. Although the movement for the use of nonrelational databases may be new, the use of nonrelationals is not. My first exposure to a business class database was Pick in the 70s. There are plenty of these types of systems in use today. Nonrelationals have been going strong for

  • SQL performance (Score:3, Insightful)

    by garry_g (106621) on Sunday March 28, 2010 @01:06PM (#31648306)

    People complaining about SQL performance are most likely either using incorrectly scaled machines for the job, or believe they can throw a four-line SQL statement at the database and expect it to work out the optimization on its own ... query optimizers may be able to do a decent job on average, but once you get to large databases (multi-million dataset tables), planning the query structure will go a long way toward preserving performance.
    Yes one can write complicated queries to return exactly what you want in one query, but in many cases doing some logic around it and using smart grouping/loops will outperform the complex query ...

  • Bullshit.

    ActiveRecord? Definitely. Rails as a whole? You might consider replacing it with another Ruby framework, but the same ideas are going to apply. Remember how Rails and Merb are merging? Merb tends to be ORM-agnostic, but the recommended Merb stack suggested DataMapper, which does support a few NoSQL databases.

    Even if you needed a different ORM per NoSQL database, it wouldn't marginalize Rails as a whole, but that simply isn't the case. Just use DataMapper, then plug in the flavor of the day.

    As an example, Rails (and DataMapper) run on Google App Engine [googlecode.com].

  • by Qwavel (733416) on Sunday March 28, 2010 @01:40PM (#31648610)

    The article focuses on NoSQL's claim to scalability, but isn't that just one of the features of (some of the) NoSQL options?

    Google, Amazon, and Microsoft all provide NoSQL storage as a service that is easy to use and cheap, particularly for getting started. Those are two pretty important features and I would imagine that it is those features, rather than dreams of needing vast scalability, that attract the many web startups.

  • by pydev (1683904) on Sunday March 28, 2010 @05:18PM (#31650398)

    It's easy to hit intrinsic performance limits with SQL databases even on small apps. And for people who aren't database experts, it's even easier since they don't know the hoops to jump through to make their SQL databases perform well. For the average programmer, it's easier to get good performance out of no-SQL databases.

    Using SQL databases programmatically is a fairly silly notion to begin with: SQL was originally intended as an easy-to-use query language for non-experts because people were having trouble with navigating data structures. But programmers are excellent at navigating data structures and designing efficient data structures. SQL is solving a problem that most programmers don't have, and you're paying a big performance penalty for that.

    Sometimes an SQL database is the right thing to use, sometimes it isn't. People really need to use their head instead of blindly picking one or the other solution.
