Why Some Devs Can't Wait For NoSQL To Die
theodp writes "Ted Dziuba can't wait for NoSQL to die. Developing your app for Google-sized scale, says Dziuba, is a waste of your time. Not to mention there is no way you will get it right. The sooner your company admits this, the sooner you can get down to some real work. If real businesses like Walmart can track all of their data in SQL databases that scale just fine, Dziuba argues, surely your company can, too."
Article summary (Score:5, Funny)
People who don't like SQL should get their heads out of their asses and use MySQL, a robust and enterprise-ready database.
Interesting thesis...
Re:Article summary (Score:5, Insightful)
My experience has made me believe PostgreSQL is better in every respect. It's more stable, has more features and is easier to use. The article wasn't specifically pro-MySQL.
The article is largely correct. The movement to ditch SQL databases is really naive. SQL scales just fine, if you know how to use it right. Look at Oracle solutions. All their fancy eBusiness software is still Oracle SQL DB backed and some of the biggest companies in the world are using it.
SQL isn't the problem, it's a tool. Bad programmers are the problem.
Re:Article summary (Score:5, Interesting)
We're using both - about five days from our "go-live", and things look good. We just use what makes sense for each part of our application.
For us, this means PostgreSQL for the parts that must be transactional ACID, and Amazon's S3 and SimpleDB for parts that don't. In practice, for the 1.0 release, this means things like notes, user accounting, and documents are in S3 and SDB. The rest is plain ole SQL.
Not that there wasn't a learning curve with our developers - we're a bunch of old-time enterprise type developers, so "letting go" and moving out of the traditional SQL world took a little thought and proving time. We'll use the first few months to learn more about doing architecture this way.
We've had the language wars - let's avoid the SQL/NoSQL wars, please. I'm tired.
Re: (Score:2)
Would it not have been less complex to use PostgreSQL for everything, or was there enough difference to be worth the complexity?
Re: (Score:3, Interesting)
Would it not have been less complex to use PostgreSQL for everything, or was there enough difference to be worth the complexity?
Turns out, yes and no. We're distributed already, so it would have entailed setting up another DB anyway, and all the management infrastructure around that. AWS also seemed like a good fit for things that were essentially document-oriented and it seemed that it would be efficient for this kind of data model.
Re: (Score:2)
SQL isn't the problem, it's a tool. Bad programmers are the problem.
Relational databases are quite useful. It's too bad they're hampered by such a lousy syntax though. It's like if we all decided to stick with COBOL but added closures and templates and whatnot...
Re:Article summary (Score:5, Insightful)
Re:Article summary (Score:4, Insightful)
I suppose it's the same argument as when having to choose a development language. You get to pick from 4GL languages, VB, Pascal, C++/Java/C#, assembly, and machine language. The art of a great analyst is to know which to pick and when.
Re: (Score:2)
But when all is said and done, you can get familiar with most of SQL in a couple weeks. No doubt mastering all the intricacies of Oracle takes years, but not, I think, due to the SQL syntax.
Re:Article summary (Score:5, Interesting)
... the syntax of PROLOG, for example, seems much simpler, more powerful, and makes more sense to me.
Yeah, wouldn't it be wonderful if instead of all the complex cruft usually needed to find the data you need in that morass, you could just write a prolog expression and let the interpreter resolve it? But when I mention this to Team Leaders, they inevitably look at me like I'm from Mars. They have no idea what prolog is or does. (And I'm actually from a planet much farther away than Mars. ;-)
But when all is said and done, you can get familiar with most of SQL in a couple weeks.
True, perhaps, and I did that years ago. But that doesn't deal with the major problem with SQL: In my experience, every relational database I've ever worked with was in the grips of a set of professional RDB priests, and you didn't do anything in SQL without their blessing. If they didn't approve of what you were trying to do (typically because they couldn't be bothered to listen to you), it wouldn't get done during your lifetime.
So I've learned to cultivate them as an acolyte. I write my "prototype" to use flat files, typically small files full of name:value pairs, sometimes with the name part the file name and the value the contents, and a directory tree of multiply-linked files to classify stuff. I agree with their criticism of this, and say that I'd be happy to convert the code to use their DB when they have the time to help me get those subroutines working right. While they chew on that, I get the project working with the flat files, and get some users using it. When the priests finally face the fact that the project works without their help, they finally deign to help.
But I've never seen them actually get the SQL working to the point that it can supplant the flat files. The parts that do work are always so slow that turning on the "useDB" switch makes it too sluggish to actually use. In some cases, I can get around this by writing "pre-pass" code to extract the common data sets from the DB and write it to flat files, which the interactive software can read through quickly.
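For readers who haven't seen this kind of "pre-pass", here is a minimal sketch of the idea: dump a commonly read data set from the database into a name:value flat file once, then let the interactive code read only the file. The `settings` table, its contents, and the file name are all invented for illustration; SQLite stands in for whatever database the poster actually used.

```python
import os
import sqlite3
import tempfile


def export_pairs(conn, path):
    """Pre-pass: dump a commonly read data set to a name:value flat file."""
    rows = conn.execute("SELECT name, value FROM settings ORDER BY name")
    with open(path, "w", encoding="utf-8") as fh:
        for name, value in rows:
            fh.write(f"{name}:{value}\n")


def read_pairs(path):
    """Interactive side: read the flat file without touching the database."""
    pairs = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            # Split on the first colon only, so values may contain colons.
            name, _, value = line.rstrip("\n").partition(":")
            pairs[name] = value
    return pairs


# Hypothetical table and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO settings VALUES (?, ?)",
    [("theme", "dark"), ("homepage", "http://example.com/")],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "settings.txt")
    export_pairs(conn, path)
    snapshot = read_pairs(path)
```

The trade-off is the obvious one: the flat file is only as fresh as the last pre-pass, and nothing enforces integrity on it.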
It has long seemed to me that SQL and RDBs in general are Good Ideas. But unless we can find a way to end the stranglehold of the DB priesthood in an organization, it's all sorta hopeless for a mere "developer" to even consider jumping into the mess. It's better to just develop stuff that works, and let the DB experts handle the task of porting it to the DB. That way, we developers can keep our hands clean of all the theology, and actually develop stuff that works.
Of course, this is all heresy to the True Believers ...
Re:Article summary (Score:5, Insightful)
It's not heresy. However, I have seen a lot of crap data models produced by developers (even worse than what I come up with as GUI designs). I have also seen developers produce SQL that looked OK at first glance but performed abysmally under certain conditions (and have even saved the odd project by finding those and fixing them when the system started dying under load). If you access a SQL database like you would a set of flat files, it is never going to give you the performance that a flat file access will give you for raw throughput because you've got all the extra communications latency. However if you re-write your search and extract queries to pull your data in a single SQL statement instead of a statement for each of your N tables involved in the result, then SQL is going to kick ass as soon as you start getting enough data and users placing enough queries that all the indexes and caching can pay off.
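A minimal sketch of the difference being described, assuming a made-up two-table schema (`customers`, `orders`) and using SQLite purely as a stand-in. The first function treats the database like a pile of flat files and issues one query per customer; the second asks for the whole result in a single statement.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_flat_file_style(conn):
    """One query per customer: N+1 round trips to the database."""
    result = {}
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    for cust_id, name in customers:
        row = conn.execute(
            "SELECT SUM(total) FROM orders WHERE customer_id = ?", (cust_id,)
        ).fetchone()
        result[name] = row[0]
    return result


def totals_single_statement(conn):
    """One JOIN with GROUP BY: a single round trip."""
    rows = conn.execute("""
        SELECT c.name, SUM(o.total)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """).fetchall()
    return dict(rows)


per_row = totals_flat_file_style(conn)
joined = totals_single_statement(conn)
```

Both return the same totals here. With an in-memory database the latency difference is invisible; the poster's point is about what happens once each of those N queries crosses a network.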
Flat files will work better for certain types of unstructured data, but most people who get crap performance out of SQL databases just don't understand how to use SQL databases properly. Which is why those True Believers tend to get upset about crap SQL implementations: because those tend to bog down a SQL server and slow down all the well-written apps too.
No, the real problem with most SQL DBAs is that they haven't adapted to agile methodologies. They still want the data model to spring fully armored from Zeus' head according to classic waterfall planning. What they need to do is to get some data modelling tools that support round trip engineering so that they can make changes as the developer needs them and have upgrade scripts checked into source control along with the code on new builds. Right now there are only a few tools like ERwin and Data Architect that support that kind of development, and they tend to be ridiculously expensive. The one exception is DeZign for Databases Professional which is comparatively cheap. A lot of companies will lay out a lot of cash for developer tools but won't fork over the dough necessary for their data modelers/DBAs to properly support developer activity. So yeah, the DBAs tend to be a little reticent to do all that work by hand. While there are some developers who still use notepad or gedit by choice, nobody seems to expect them to do it, or to have the same productivity as someone with a decent tool chain.
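As a rough illustration of "upgrade scripts checked into source control", here is a minimal sketch of a hand-rolled migration runner. It is not how ERwin or any other modelling tool works; it assumes SQLite and uses its `user_version` pragma to remember how many scripts have been applied. The `notes` table and both statements are hypothetical.

```python
import sqlite3

# Ordered upgrade scripts. In a real project each would live in its own
# file under version control; the statements here are invented examples.
MIGRATIONS = [
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)",
    "ALTER TABLE notes ADD COLUMN author TEXT",
]


def migrate(conn):
    """Apply any scripts newer than the schema version stored in the DB."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    pending = MIGRATIONS[current:]
    for version, statement in enumerate(pending, start=current + 1):
        conn.execute(statement)
        # PRAGMA values cannot be bound as parameters; version is an int.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)    # applies both scripts
second_run = migrate(conn)   # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(notes)")]
```

Running it twice is harmless, which is the property that lets a schema change ride along with every new build.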
Re: (Score:3, Interesting)
In the immortal words of Joe Celko, responding to a question similar to the one you discuss, and one of the truest statements ever written:
My SQL program is trying to compete with a flat file system.
If you want to get data to a single user, in a fixed format, you will
lose. The reason we have databases is not speed. Databases are for sharing
data (concurrency control and all that jazz), and keeping data integrity
(normal forms, constraints and all that jazz).
You can get to the ground floor a lot faster by jumping down an empty ...
elevator shaft instead of waiting for the car to arrive. However, there
are trade-offs
--CELKO--
If data has little to no value for you then you do not need a relational database. However, if data is of any importance to you then you have to think beyond a flat file. Flat files and hierarchical databases have been around since the dawn of computing. Relational databases were brought about to solve concurrency and integrity problems inherent in these models, not to make your appl
Re:Article summary (Score:5, Interesting)
"NoSQL" stuff is fine if your company is simple in structure - very few products/services, and it has to write most of that stuff itself anyway.
When you have many different departments with their own different apps (in house and 3rd party), and they all want to access the same bunch of databases, SQL just becomes the "standard API or language" you use to talk to them. In contrast say you have some custom "NoSQL" DB, it's going to be harder to find stuff that talks to it (you might have to write your own connectors).
It's just like "English", the syntax might be crap, but it's far easier to get 3rd parties and other departments to use it. In contrast if you use Lojban, despite its supposed advantages you're probably going to have to get translators (or worse - train your own translators) whenever you need to deal with outsiders who don't speak it.
Re:Article summary (Score:5, Insightful)
Look at Oracle solutions. All their fancy eBusiness software is still Oracle SQL DB backed and some of the biggest companies in the world are using it.
Yep, "nobody ever got fired for choosing Oracle".
But to get performance and fault tolerance for Oracle, you need to throw a lot of money at it -- high end hardware, RAC licenses etc. Whereas some of the NoSQL DBs promise lots of scalability on clusters of cheap hardware -- situations where failing hardware is the norm.
If your application suits it (i.e. your data fits the name/value system, and eventual consistency is adequate) why not use something fast and cheap?
Re: (Score:3, Interesting)
I would also fire anyone who specifies MSSQL - with immediate effect, and no severance pay: On grounds of insubordination, incompetence and reckless endangerment.
So it's a no-go on MSSQL for that Microsoft contract your company just got? Of course, you didn't specify the type of work your company does so this attitude comes across as being rather narrow-minded. And good luck on that no severance pay thing. "I'd fire anyone in my organization who suggested we callously disregard labor laws like that." :)
Re:Article summary (Score:4, Insightful)
MSSQL's lock escalation isn't as efficient as Oracle's, but that doesn't make it a toy.
Re:Article summary (Score:4, Funny)
Re:Article summary (Score:5, Insightful)
What product has Oracle ever dropped support for? What is your objection to MSSQL? SQL 2005/2008 are damn fine products which perform extremely well. Sounds to me like you're the one that is ignorant with blanket policies against industry standard tools.
Of course I run Oracle, MySQL, and MS SQL in my datacenter all without problems and some under nice and heavy loads. About the only sensible stance you have is with Postgresql which is far and away better than MySQL which in my opinion sucks pretty bad.
Re: (Score:3, Interesting)
Re: (Score:3, Informative)
MSSQL's TIMESTAMP is non-standard, so if you're trying to port 'standard' SQL code from the mythical standard DBMS in the sky, then you've got some work cut out for you.
Re:Article summary (Score:4, Informative)
Timestamp in mssql is a misnomer, it's not a timestamp at all. It's more of a binary format concurrency key.
This doesn't excuse the use of the name by MS, but once you realize that it makes the column useful again.
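To make the "concurrency key" idea concrete, here is a minimal sketch of optimistic concurrency with a per-row version value. It is only an analogy: SQL Server maintains a rowversion column automatically, whereas this sketch bumps a plain integer by hand in SQLite. The `notes` table and `row_version` column are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes ("
    "id INTEGER PRIMARY KEY, body TEXT, row_version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO notes VALUES (1, 'draft', 1)")


def update_note(conn, note_id, new_body, seen_version):
    """Write only if the row still has the version the caller read.

    Returns True if the update was applied, False if someone else
    changed the row in the meantime.
    """
    cur = conn.execute(
        "UPDATE notes SET body = ?, row_version = row_version + 1 "
        "WHERE id = ? AND row_version = ?",
        (new_body, note_id, seen_version),
    )
    return cur.rowcount == 1


# Two writers both read the row at version 1.
first_writer_ok = update_note(conn, 1, "edit A", seen_version=1)
second_writer_ok = update_note(conn, 1, "edit B", seen_version=1)
final_body = conn.execute("SELECT body FROM notes WHERE id = 1").fetchone()[0]
```

The second writer loses without any lock being held between the read and the write, which is the use the poster is pointing at.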
Re: (Score:3, Insightful)
Whenever you port from one RDBMS to another you're going to have to put up with all sorts of stuff like this. So I don't see this as a showstopper.
So that's why I was curious on what the OP's problem was. If it's just "MSSQL names stuff differently and I didn't bother to do a bit of research to find that out" then I can ignore that particular complaint about
Re:Article summary (Score:4, Informative)
This might explain some of the problems with it http://www.sqlhacks.com/pmwiki.php/Dates/Timestamp [sqlhacks.com]
Basically, MSSQL's timestamp ain't a timestamp.
Re:Article summary (Score:4, Informative)
Saying MSSQL doesn't have a proper timestamp is like saying that Oracle doesn't have a proper VARCHAR because Oracle only has a VARCHAR2 data type.
Re:Article summary (Score:4, Interesting)
Timestamp equivalent
* Eventually, MS will convert the current timestamp of a unique row number, to an actual date and time.
* Use ROWVERSION instead of timestamp. Row version provides the same functionality and the same value as the current timestamp.
MSSQL 2008 and above is fine, and we use timestamps almost to an atomic precision in medical imaging... eventually came right after that post
my 2cents.
--chitlenz
Re:Article summary (Score:4, Informative)
I'm fairly certain that SQL Server inherited its TIMESTAMP keyword from Sybase, and that this usage of TIMESTAMP predates the SQL-89 and SQL-92 usages of that keyword.
In short, they can't fix it properly, because it would break a ton of existing (very critical) applications that use the existing Sybase and MSSQL semantics of TIMESTAMP. Microsoft deprecated its usage of TIMESTAMP long ago, but they can't just change it without pissing off a lot of people. Oracle is in the same boat with many of its features that "violate" the ANSI standards.
It's sort of like bitching about IE6 not supporting CSS2 features. IE6 predated the CSS2 standards ratification. It's actually the fault of those writing the standards: they ignored widely-used software and practices. In this case, they chose to use the TIMESTAMP keyword when something like DATEWITHTIME would have been clearer and would not have collided with anybody.
In my experience, MSSQL is actually the most ANSI-compliant of the major commercial DBs.
Re:Article summary (Score:4, Interesting)
Given that Oracle has a java client and java is supported on OS/2, how did Oracle drop OS/2? Even with 10 and 11g you can still connect from an OS/2 box, although I would say your application has some fundamental design flaws if workstations are directly connecting to a database.
Also, some of the biggest general ledger applications deployed are running on MS SQL; that includes Great Plains and Navision.
As for Oracle Power Objects you have the same situation, Oracle has another product that achieves the same functionality and more and it evolved into that. Much like Oracle Forms and Reports 10g has no 11g version, Oracle didn't drop support for Forms and Reports services though, they came out with a new product and have a clear and rather easy transition path provided you have a good amount of Oracle infrastructure.
MSSQL timestamp is a really weak argument as well, as there is nothing that forces you to use its timestamp, which we'll agree is different from what you get with Oracle, MySQL, and Postgresql. We get around that by converting to strings since we work with multiple platforms. Each of them has serious strengths and, of course, serious weaknesses. I personally believe that the only product worthy of such animosity is mysql, because the developers clearly knew nothing about databases in its design. Naturally they even admit that. They learned along the way and have created a flexible product, but it has all the problems that Oracle had 20 years ago and that MSSQL had 15 years ago. When you rely on your application for data integrity you will run into problems again and again and again.
Sounds to me like you weren't happy being forced off dying platforms, given how long Oracle extended support for both it seems you were quite stubborn. EOL for Power Objects was in 1995 and support actually ended in 2000. That is one seriously long transition period.
Re: (Score:3, Informative)
Re: (Score:2)
Re:Article summary (Score:5, Funny)
Oracle database license prices scale very well, too.
Re:Article summary (Score:5, Interesting)
By the time you 'need' Oracle, the price of Oracle is a drop in the bucket.
The only people that ever complain about the price of Oracle are the people who will never have the need to use it because they'll never have the traffic to it to require it.
Sorry you haven't got to play with the big boys, but in general if you spend your time worrying about how much 'software costs' your business sucks. Software costs, even for Oracle, are trivial compared to the other costs that go into it.
An Oracle DB serving internet facing customers for instance is going to cost an order of magnitude more for bandwidth in the first year than the cost of an Oracle license to deal with it.
But you go ahead, keep pretending you have some sort of clue and are witty by pointing out it's expensive. If you ever make it to that scale, the last thing on your mind will be the price of an Oracle license.
Re:Article summary (Score:4, Funny)
Cos you know, it's way cooler to just berate him for his obvious inferiority....
Comment removed (Score:5, Interesting)
Re:Article summary (Score:5, Insightful)
sqlite is underrated and would be ideal for many such applications.
Re: (Score:2, Interesting)
... were it not for the fact that SQLite is at least two orders of magnitude slower than any other database, including ones written by first year comp sci students.
Re: (Score:3, Informative)
... were it not for the fact that SQLite is at least two orders of magnitude slower than any other database, including ones written by first year comp sci students.
But if MythTV takes twice as many milliseconds to read a channel listing, it really doesn't matter. Nobody's suggesting that SQLite can replace a real database server in all cases, but performance and scalability are completely unimportant in some applications.
Re:Article summary (Score:5, Informative)
Two orders of magnitude is not 20x, it's 100x.
And for non-intensive applications, that's still fine.
And SQLite isn't actually that slow anyway [sqlite.org]. It's comparable.
Re: (Score:3, Funny)
But.. but... surely, if *one* order of magnitude is 10x, *two* orders of magnitude must be 20x ? Please ? *cough*
Re: (Score:3, Interesting)
... were it not for the fact that SQLite is at least two orders of magnitude slower than any other database, including ones written by first year comp sci students.
One of the following two things is missing from your post:
1) A reference to back such a bold claim.
2) A qualifier along the lines of "... with many concurrent writers".
Re:Article summary (Score:5, Insightful)
Virtually no-one who's spent any time analyzing and working with large amounts of data has a good word to say about SQL.
I've spent 10 years developing intensively relational applications with SQL. I love it!
It was designed from the start as a language that would be integrated into others, and yet simple real world realities make that impossible, with 99% of implementations being of the "Build a large string, and pass that string to "the SQL connector" to be parsed and interpreted" form.
So... because people don't bother to learn about things like prepared statements, the tool is bad? It's like saying that cars suck because they don't have cruise control!
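For anyone who hasn't met them, a minimal sketch of the difference between "build a large string" and a parameterized statement, using Python's `sqlite3` module. The `users` table and the hostile input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])


def find_user_concat(conn, name):
    """Anti-pattern: the input becomes part of the SQL text itself."""
    sql = "SELECT name FROM users WHERE name = '" + name + "'"
    return sorted(row[0] for row in conn.execute(sql))


def find_user_param(conn, name):
    """Parameterized: the input is bound as a value, never parsed as SQL."""
    rows = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return sorted(row[0] for row in rows)


hostile = "x' OR '1'='1"
leaked = find_user_concat(conn, hostile)   # the OR clause matches every row
safe = find_user_param(conn, hostile)      # no user has that literal name
```

The string-built query returns every user; the parameterized one returns nothing, because the whole input is compared as a single literal.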
Its handling of null and the empty string is incomprehensible and useless, in part because nobody involved ever had the cojones to do what needed to be done with both.
OK, so enlighten us with your brilliance! Share with us the ultimate answer of what should be done to differentiate a null (logically, "I don't know") from a blank string (logically, "We know there's nothing there") and what should be done differently?
IMHO, the concept of "null" is a very useful one which allows a developer to differentiate between a blank answer and a no answer.
There is no standardized set of data types in the real world. Simple issues with unstandardized case dependencies can make an application that works with Oracle and only uses standard "select" statements not work under, say, PostgreSQL.
Woah, hold on there boy! You mean to say that features specific to one database engine won't work with another? Well spank my uncle and grease my kittens - this is amazing! Unless, of course, you stick to ANSI 92 syntax, which is pretty much 100% compatible. Yes, there's some regression testing you'll have to do against the different databases. Just like you have to do with HTML, XML, or any other standards-based language.
(yawn)
And these are the surface level technical issues: talk to any relational database guru and they'll come up with numerous philosophical issues too.
Strange how you didn't manage to name even one?
But here's the part of this whole "NoSQL vs SQL" debate - SQL is an interface API to a DBMS, it's not the database itself! You can use any number of technologies "under the hood" including those types of technologies commonly referred to as "NoSQL" and put an SQL interface in front! The whole idea that SQL is somehow the problem is just.... idiotic and betrays an astonishing lack of understanding by the programmer(s) involved.
It's like saying that you should have a stick-shift car because automatic transmissions don't go as fast. It's just moronic. Arguing about NoSQL is like arguing with a tea party dolt about the "socialist" health care plan that just passed! (that was first drafted by the "right wingers" 15 years ago)
It's argument from stupidity.
Re: (Score:3, Informative)
Well, the way PostgreSQL handles it is that a NULL is stored as a NULL and treated as one (i.e. NULL || ' more text' evaluates to NULL). '' is stored as an empty string and processed as one (i.e. '' || ' more text' evaluates to ' more text').
Really, that strik
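The behaviour described above can be checked in a few lines. This sketch uses SQLite rather than PostgreSQL, on the assumption that both follow the same rule for these particular expressions; in Python's `sqlite3`, SQL NULL comes back as `None`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a one-value query and return that value."""
    return conn.execute(sql).fetchone()[0]


# NULL means "unknown", so concatenating onto it is still unknown.
null_concat = scalar("SELECT NULL || ' more text'")

# The empty string is a known value of length zero.
empty_concat = scalar("SELECT '' || ' more text'")

# Comparing two unknowns is itself unknown, not true.
null_equals_null = scalar("SELECT NULL = NULL")

# An empty string is not NULL (SQLite reports false as 0).
empty_is_null = scalar("SELECT '' IS NULL")
```

This is the distinction the earlier poster was defending: "no answer" and "a blank answer" stay separate all the way through an expression.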
Re: (Score:3, Interesting)
What I find interesting is that one of the biggest users of a BASE[0] non-relational database (a NoSQL database), namely Facebook, who uses Cassandra [1], has created an SQL style query interface named FBQL. The interface includes some rather advanced SQL features like embedded sub-queries in addition to the traditional selecting on joined tables.
Then again, that may be due in large part to the fact that they are using a database schema that is all but identical to a normalized schema used in relation ACID
Re:Article summary (Score:5, Insightful)
Au contraire.
While there are problems with SQL, 95% of its users are happy as a clam that it exists. The unhappy users are the ones who are pushing the boundaries of what SQL allows and those are the people who know SQL best. When you are writing SQL queries that span 200 lines of code, then, and only then do you begin to scratch at the limits of what SQL allows. Until then, you've only hit the limits of competence.
I've been working with SQL for over 20 years now. I've worked with applications that didn't use RDBMSs. Some of them used flat files. Some of them used hierarchical databases. People who haven't had the same sort of experiences haven't come to the realization of why SQL was invented - and that results in them making ill-founded statements like "SQL is absolutely the worst database query language ever invented". Utter tosh. SQL has its problems, but it's one of the best. That's why it has left its competitors in the dust of time.
I look around at all the frameworks that have evolved to not do SQL (EJB-QL, Hibernate, etc) and I laugh. None of those languages come close to handling the same breadth of problems that SQL can be used to solve. Whenever I see advocates of these frameworks all puffed up with fervour, I feel like shaking them and saying "Your emperor has no clothes!". The list of problems these frameworks can't solve is so huge that one wonders why anyone works with them at all. But I suppose there are plenty of people who work for small businesses who haven't encountered the kind of problems that big enterprises have.
The parent poster that I'm responding to has apparently had problems porting SQL code. But guess what? Even on the unix platform, applications written in C have had trouble being ported from one Unix to the next. People have worked around it. Nobody goes around arguing that "C is absolutely the worst programming language ever invented".
Re: (Score:3, Informative)
SQL has its problems, but it's one of the best. That's why it has left its competitors in the dust of time.
Oh, bullshit. SQL succeeded because it came from IBM, and what comes from IBM must be good by definition...or not? If we're talking about *relational* databases, then SQL is about as good a relational query language as COBOL is a general purpose language. C.J. Date wrote The Third Manifesto [wikipedia.org] for a reason.
Re: (Score:3, Insightful)
Comment removed (Score:4, Interesting)
Re:Article summary (Score:5, Interesting)
I don't have mod points, but I've found the same thing. It's the perfect development database if you think that your program is ever going to need to support Enterprise class stuff. On the small scale, I've found that it's fast enough. Is MySQL faster? Yes, but where I've tested it's not been enough to really matter compared to the other advantages of PostgreSQL. Primarily that it's ACID compliant. What we've found is that it works well until you start getting into databases that are GB in size. But then you can easily port the datatables to DB2 or Oracle and go. Especially if you designed the rest of the software to do this from the get go.
In production, we moved all but one of our databases from MySQL to PostgreSQL. We were having problems with InnoDB getting corrupted once every couple of months. When it was announced that Oracle was bidding on Sun, we ported over to PostgreSQL, spent a couple weeks rewriting code, and we've not touched the Postgres database since. It's not been corrupted and has not even hiccuped once since we deployed. We run regular vacuuming and maintenance and that's it. It's been humming for well over a year and is now getting 400x the use we ever had with MySQL.
The only thing that PostgreSQL was lacking has been HA support. There are a number of 3rd party tools that run well, PGCluster, Slony, GridSQL, but this looks like PostgreSQL is going to support native replication, clustering, and HA with hot-standby...
Re: (Score:3, Interesting)
SQL isn't the problem
Yes, it is
Overhead caused by structuring your data the way relational dbs need.
Lack of flexibility
Scalability capabilities (horizontal scaling is easier)
Speed (see overhead)
Re: (Score:3, Interesting)
SQL isn't the problem, it's a tool. Bad programmers are the problem.
You could say the same about assembly language. You could also say the same about threads, and dismiss things like functional programming and the actor model as fads.
I'll give you a simple example: Given a big transactional SQL database, if you want it to scale to more than a few machines, you're going to want to shard it. That's going to be a ton of manual work, figuring out what you can shard, what keys to shard it on, adjusting it later on the fly to ensure that each DB server has exactly what it can han
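A minimal sketch of the manual work being described, assuming the simplest possible scheme: hash the shard key and take it modulo the number of database servers. The shard names and the `user:N` keys are hypothetical. The second function shows why "adjusting it later" hurts with this naive approach.

```python
import hashlib

SHARDS = ["db0", "db1", "db2", "db3"]  # hypothetical server names


def shard_for(key, shards=SHARDS):
    """Pick a shard from a stable hash of the key.

    hashlib is used instead of the built-in hash() because the latter
    is randomized per process for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys that land on a different shard after a resize."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(1000)]
home = shard_for("user:42")
fraction = moved_fraction(keys, SHARDS, SHARDS + ["db4"])
```

Going from four shards to five with plain modulo routing relocates most keys (around four in five, in expectation), which is the kind of rebalancing chore the poster has in mind. Schemes such as consistent hashing exist to reduce that, but they are outside this sketch.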
The Actual Quote is (Score:5, Insightful)
"MySQL or PostgreSQL," for what it's worth. PostgreSQL is a pretty powerful database, and you should have to make a pretty good argument for why a well-understood technology that powers a lot (and some of the largest parts) of the WWWeb needs to be trashed for something newer and less tested.
There are times... (Score:3, Interesting)
Our development organization is heavily invested in PostgreSQL, finding it to be perfectly matched to almost all of our needs. It is exceptionally reliable, and is very (but not perfectly) manageable. (We've had issues in the past with mis-timed auto-VACUUM for instance which are now resolved.) We even found a small but significant corner-case bug which upon being reported, received immediate attention from the developers, resulting in a resolution in under 72 hours. I believe our use of this particular too
Can't wait for it to die? (Score:3, Insightful)
This is like saying "I can't wait for memcached to die" just because your site doesn't need it. Fact is, some do. It's your own fault if you choose to apply unnecessary techniques.
Don't change to newer fancy techniques if you don't understand what they are for and why you would need them.
Re: (Score:3, Insightful)
memcached is most useful when the underlying app is hideously inefficient, e.g. it's pretty much essential to a MediaWiki installation that gets any appreciable number of users.
Re: (Score:2)
Well, no, not entirely. Not many sites out there run purely from memcached. Memcached is a component of a larger architecture. The fact remains that technologies like NoSQL are usually used/desired by people who have no understanding of system architecture, design an inefficient application, and then blame the database software for their poor decisions.
Re: (Score:3, Interesting)
Facebook.com, the highest-traffic site on the Internet, serves more than 95% of its data out of memcached. Twitter, Wikipedia, etc are major users too. And of course, Google serves its web index out of memory.
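For context, the usual way a cache like memcached sits in front of a database is the cache-aside pattern. The sketch below is an assumption-laden toy: a plain Python dict stands in for memcached (no real memcached client API is used), and `slow_lookup` is a hypothetical stand-in for an expensive database query.

```python
class CacheAside:
    """Check the cache first; on a miss, load from the backing store."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}      # stand-in for a memcached cluster
        self.db_hits = 0      # how often the slow path was taken

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.db_hits += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Call after writing to the database so stale data is dropped."""
        self._cache.pop(key, None)


def slow_lookup(key):
    # Hypothetical expensive query; here it just derives a value.
    return key.upper()


pages = CacheAside(slow_lookup)
first = pages.get("front-page")    # miss: goes to the "database"
second = pages.get("front-page")   # hit: served from the cache
pages.invalidate("front-page")
third = pages.get("front-page")    # miss again after invalidation
```

Read-heavy sites get their high hit rates because almost every request follows the "hit" path; the database only sees misses and writes.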
Re:Can't wait it to die? (Score:5, Insightful)
I can't wait for databases to die (Score:5, Funny)
XML text files all the way! /duck
Re:I can't wait for databases to die (Score:5, Funny)
Remember, XML is like violence: if it doesn't solve your problem, use more!
Re:I can't wait for databases to die (Score:5, Funny)
Why? (Score:2, Insightful)
In short, people should just shut up about other people's choices and get on with their own.
Don't you dare tell me... (Score:5, Funny)
... that I can't tell others what to do!
...because "there can be only one!" (Score:5, Funny)
The whole of geek debating is based on the Highlander principle.
Re: (Score:3, Insightful)
No, just no.
That's about as information-free as one can get. I'd ask why, but then, I don't understand why I would have to, just saying "no" is void of context and explanation.
Pictures. (Score:2)
There's a place for SQL, but there are some... (Score:2, Informative)
Hardware is cheap. Developers aren't. (Score:5, Interesting)
It's really that simple. A standard dual socket server with the latest CPUs from Intel or AMD can handle hundreds of requests per second; if one isn't enough, just add more hardware. One month of salary can buy you another node; a year can buy you a whole cluster of rackable systems or a chassis full of blades. If it takes a few months extra for a team to solve the problem the NoSQL way, that's a few months of extra salary costs and missed sales.
Slashdot runs on SQL. I run a site of 1M pages daily (1/3-slashdot according to Alexa) with just a single system with 2x Xeon E5420, Django/PostgreSQL at 10% load. Unless you attract enough attention to require scaling past 10M pages a day, you're wasting your time reinventing the wheel with NoSQL, just stick with a standard ORM, launch your site and start convincing customers and generate sales. You can survive a slashdotting just fine without spending so much time on those exotic tools.
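The numbers above are easier to compare once they are in the same unit. This back-of-the-envelope sketch converts pages per day into average requests per second, assuming one request per page and perfectly even traffic; both assumptions are mine, not the poster's, and real traffic peaks well above the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(pages_per_day):
    """Average requests per second, assuming one request per page."""
    return pages_per_day / SECONDS_PER_DAY


site_today = average_rps(1_000_000)        # the poster's 1M pages/day
suggested_limit = average_rps(10_000_000)  # the 10M pages/day threshold
```

So 1M pages a day averages under 12 requests per second, and even the 10M threshold averages a little over a hundred, which is in the range the poster says a single dual-socket server can handle.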
Re:Hardware is cheap. Developers aren't. (Score:4, Insightful)
It's really that simple. A standard dual socket server with the latest CPU's from Intel or AMD can handle hundreds of requests per second;
Hundreds of requests for WHAT per second?
Your idea of "just throw hardware at the problem" isn't generalizable. Throw hardware at WHAT problem? For some problems, you're right. For others, you couldn't be more wrong. There's really no point in saying anything further.
Re:Hardware is cheap. Developers aren't. (Score:5, Insightful)
Unless you attract enough attention to require scaling past 10M pages a day, you're wasting your time reinventing the wheel with NoSQL, just stick with a standard ORM, launch your site and start convincing customers and generate sales.
Most of the buzz about these things comes from and is aimed at people who actually believe they'll build the next Facebook or Twitter. The fallacy is in their belief that it's the size/traffic of those sites that supposedly mandates NoSQL and not the simple data models. Some of the biggest, less spectacular projects out there run on PostgreSQL for example (Skype, Afilias = .info and .org).
Re: (Score:2)
I run a site of 1M pages daily (1/3-slashdot according to Alexa) with just a single system with 2x Xeon E5420, Django/PostgreSQL at 10% load.
Good for you -- that says nothing about how much you're actually doing for each page.
just stick with a standard ORM
As a rule, I do. I use DataMapper in Ruby. It's just that DataMapper has pluggable backends, some for SQL databases, some for more exotic things.
Re: (Score:3, Insightful)
For the type of loads 'front-page' slashdot (and your site, most likely) gets, SQL fits fine. But even then, NoSQL may give you a run for the money.
Now think of the loads incurred in the comment tree of slashdot.
Also think how something like GMail or even Google Search would fit in an SQL scheme. It doesn't, at least not without table juggling that would be very inefficient.
Re:Hardware is cheap. Developers aren't. (Score:5, Informative)
Pretty sure he meant 1M page views/day as he compares it to slashdot using alexa data.... Is reading comprehension really that hard? Context clues are your friend.
I run a site using django/postgres, we do about 100k page views/day on a 512MB, 10GB virtual machine. It's not doing anything crazy like google, but yeah, we aren't close to needing more power yet. When we do, first thing we'll do is bump up RAM for increased cache space...
Some docs can't wait for Cardiac Clamps to die. (Score:5, Informative)
So you're in surgery for 3 hours doing a kidney transplant, having used your trusty medium vascular clamps that have served you for the past 20 years. You're finally done and the patient is in recovery, so you sit down to relax with the latest copy of JAMA. They've got a great article about the latest development of cardiac clamps, and you think to yourself "Why not use a heart clamp for kidney transplants!" Brilliant. So you order up some new clamps from MedicalClamps.com, and use them on your next patient. The surgery goes fine, but 3 months later the patient is back in your office with a failed kidney. You open 'em up, and it's obvious the clamp exerted too much pressure on the artery, damaging it in the process. Stupid cardiac clamps! You're not a heart surgeon!
Re: (Score:3, Informative)
Re:Some docs can't wait for Cardiac Clamps to die. (Score:5, Funny)
Analogies are FUN!
Resources vs. Smarts (Score:3, Insightful)
FTA:
"In the meantime, DBAs should not be worried, because any company that has the resources to hire a DBA is likely has decision makers who understand business reality."
Bad English aside, I just don't agree. Money != Reality. I have worked both sides of this coin - startups with plenty of money that don't see the value in proper maintenance of the data store (one was almost put out of business by a disk failure), and very smart startups that are running lean but do understand the risks.
That said, on the deeper level, why does business reality == SQL? Sure I can scale Oracle to support massive DB's (and have), but I could probably get more value from using Amazon's SimpleDB for things that don't require massive scaling. Use the right tool for the job - Hammers are for nails, etc. Do the design work up front, decide how it's gonna work, and the right tool should present itself.
Re: (Score:2)
Sure I can scale Oracle to support massive DB's (and have), but I could probably get more value from using Amazon's SimpleDB for things that don't require massive scaling. Use the right tool for the job ...
Isn't the entire point of these NoSQL databases that they offer better scalability at the cost of traditional ACID data guarantees? Why would you give up the flexibility and reliability of SQL if you didn't need massive scaling?
Re: (Score:3, Informative)
If you're worrying about the cost of an Oracle license, what DB you use is irrelevant, you simply aren't large enough to make a wrong choice.
When you are large enough for this to matter, the cost of Oracle or the cost of a handful of DBAs is the least of your concern.
It blows my mind how much value slashdot geeks put on the cost of software. You guys have absolutely no fucking clue how much a single employee costs a company excluding salary do you? You've been spending far too much time living in the base
The Article Is Right... And Wrong (Score:4, Insightful)
Real businesses track their data with SQL databases, true. However, real businesses have small numbers of transactions relative to their value. If Walmart had the same revenue but the average sale was a tenth of a cent, their fancy SQL database would be smouldering rubble.
That's what Facebook and Twitter and other large social media sites are facing. Just try running Twitter's volume and Twitter's page hits and API hits off MySQL. It doesn't matter how many replicas you run, it's not going to work. Maybe you could run it on a cluster of IBM Z-series mainframes running DB2 - but where is the money going to come from?
Cassandra and HBase and the other distributed NoSQL databases solve specific problems in specific ways. They won't work for Walmart, but they'll do the job just fine for Facebook and Twitter. If you have those specific scaling problems and can live with the restrictions (you lose ACID, indexes, and joins to varying degrees) then they'll work for you.
If all you know is that your site is running slow, then implementing NoSQL is unlikely to improve things.
Re: (Score:2)
If you get to the size of Walmart doing anything, you have access to the capital to get a system from IBM or Oracle for OLTP and Teradata for data warehousing.
Re: (Score:3, Insightful)
The point isn't (generally, there might be some pathological corner case) that the various web2.0 kiddies couldn't implement their stuff in SQL; but that they couldn't afford to do so. If you want to be able to serve large numbers of users in order to generate enough adsense pennies to keep the lights on until somebody buys you, your options are pretty much A). Software with a more or less zero per-node cost, running on commodity x86s with no exotic in
Someone tell me again... (Score:2, Insightful)
Re: (Score:2)
Some people just want the holy grail (Score:3, Insightful)
I think some developers keep looking for the holy grail. Some magical solution that will turn development from punching in code, to Star Trek: "Computer do my job for me please".
Template languages, 4GL, NoSQL, Ruby on Rails... it is all part of an attempt to take the nasty out of development and they all... well... they all just don't really happen.
Because deep down, with all the frameworks and generators, if you want your code to do what you want it to do, you are still writing out if statements a lot.
And yes, OO and such also belong to this. Not the concepts themselves, but the way most people talk about them. OO means code re-use right?
If you said yes, then you are a manager, go put on your tie, you will never be any good at coding.
You can re-use all code. And it has been done for a long time.
What, did you think that people who wrote basic for the C64 went "Oh I wrote this bit of code for printing, now I need the same functionality, I am going to write it all over again!"
OO does make code re-use a bit easier BUT that is NOT the claim that people often make. Trust me, I ask this in interviews and it is always the same answer. Apparently you can't re-use functions. No way, no how. NEXT!
I see two kinds of developers. Those who hate their job and those who don't. The former want to be managers, get away from writing code as fast as possible. And they will leap on anything that seems to make their jobs easier. Meanwhile the rest of us go on with actually producing stuff.
Just check, how many times do you get one of those manager wannabes introducing something they read in a magazine because it promises that you don't need to write another line of code ever!
Re: (Score:2)
"Computer do my job for me please"
[HAL] Certainly, Small Furry Creature ... would you like fries with that? [/HAL]
Re:Some people just want the holy grail (Score:4, Insightful)
In some ways I agree with the general idea of your post. But stepping back a bit, code HAS gotten easier to write over the long term. I'd hope nobody would argue that writing a large application in a modern high level language isn't easier than writing it using 1970s technology in assembly. Those advancements in language came through a lot of trial and error (a lot of error). How many failed languages exist that turned out to be dead ends (though they spurred further advancements and refinements)? How do you know the technologies you mentioned won't turn into the next (your favorite productive language here)?
You're right that endlessly pursuing the latest trend is just foolhardy, as most "new latest greatest technology" turn out to be duds. The point being those duds sometimes DO pan out. Anyone that thinks that relational databases are the end-all-be-all of persistent data storage hasn't done enough relational database development to understand some of the limitations.
Re: (Score:2, Informative)
OO does make code re-use a bit easier BUT that is NOT the claim that people often make. Trust me, I ask this in interviews and it is always the same answer. Apparently you can't re-use functions. No way, no how. NEXT!
You can reuse functions but you can't extend them and that's where OO's reuse shines. It's very powerful to be able to lay out your code as a tree and control the reuse 'flow' at the nodes.
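A small sketch of what "controlling the reuse flow at the nodes" might look like. The class names are made up for illustration; the point is only that each level of the tree reuses the shared flow and overrides or extends one piece of it.

```python
class Report:
    """Root node: defines the overall flow once."""

    def render(self) -> str:
        # The shared flow, reused unchanged by every subclass.
        return f"{self.header()}\n{self.body()}"

    def header(self) -> str:
        return "REPORT"

    def body(self) -> str:
        return "(empty)"


class SalesReport(Report):
    """Child node: reuses render() and header(), replaces only body()."""

    def body(self) -> str:
        return "sales: 42"


class LoudSalesReport(SalesReport):
    """Grandchild node: extends the inherited header instead of replacing it."""

    def header(self) -> str:
        return super().header() + " (URGENT)"


if __name__ == "__main__":
    print(LoudSalesReport().render())
```

You can get a similar effect with plain functions by passing other functions in as arguments, so this shows one style of extension rather than something only OO can do.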
XML (of databases)? (Score:2)
The company I work at is currently having a huge headache moving from files into databases. We currently store everything in XML which gives us a great amount of freedom and adaptability. However most database solutions fix you to a single (or handful) of data definitions. While you can kind of re-create XML by defining all kinds of crazy relationships, it gets hugely convoluted (to say the least).
I would LOVE to see a document/XML-like database. Just needs to do things that standard databases support (e.g
Re: (Score:2)
Have you checked out something like XML DB [oracle.com]? I haven't used it much myself, but it sounds like it may meet your needs. It comes bundled with the XE database [oracle.com], which is free as in beer. (But XE has some limitations that the enterprise product doesn't have, of course.)
Disclaimer: I work for Oracle.
Re: (Score:2)
Oracle and DB2 both support the SQL/XML standard and provide quite a bit of functionality for native handling of XML. Both can store structured / compressed representations in a native XML type (with or without a predefined schema) and use XPath-based indexes for efficient query execution.
Wonderful stuff, and one of the few features I really miss back in the PostgreSQL world.
The NoSQL debate never gives any real information (Score:2)
At first, I thought NoSQL like Cassandra should simply be used as a store for precomputed relationships. Then I thought NoSQL was just a structureless store that can scale in any given direction with no effort.
Both sound interesting, but then the debate against NoSQL is just "well, SQL can already do all that, but you get data integrity with it. If it doesn't scale, then just build a manly man's server and it will".
So, I dunno. The whole debate has gotten very religious very quickly and as a result, no one
More RDBMS dogma (Score:4, Insightful)
Use the right tool for the job, except databases, eh?
The simple fact of the matter is that not every app is aiming for Google's scale. (Not every app is web-based or even going to be web-based, though people seem to forget that.) And even some large-scale apps don't fit the relational model very well, medical records being one of the more outstanding examples.
And yes, I have read Codd and Date and understand the relational model and its benefits very well, and it annoys me to no end when people break the relational model without realizing or understanding what it costs them. That said, sometimes those costs are acceptable, and sometimes an application requires features that the relational model does not (and in fact cannot) bring to the table.
It may be, as with every other silver bullet fad, that what's at work here is the basic human tendency to become familiar with something, begin to see everything in terms of it, and then try to persuade anyone who'll listen that they are in possession of the all-singing, all-dancing solution to all problems. Today, it's Ruby, multi-touch interfaces, and functional programming. But not very long ago it was COBOL and CICS. And while one must acknowledge that progress has been made, it is equally obvious that progress will continue to be made and that "one size fits all" is always BS, even in clothing.
Price may favor noSQL for some applications (Score:5, Informative)
But no matter what you will consume resources with a relational database on ensuring consistency (which many times is what you want but not 100% of the time). Amazon's Dynamo works by not caring so much about consistency and trading consistency for availability of the overall service. For a shopping cart it is fine, but you wouldn't want to do your credit card processing using it. Google's GFS is optimized to do the file operations that Google does the most. However there was an article in the ACM not that long ago comparing MapReduce (Hadoop's implementation) against two parallel databases, and it lost. Of course the parallel databases were all not free... and Hadoop is...
So overall I'd say the decision comes down to price mostly (as it does with most startups). If you can make do with one server then sure do PostgreSQL (or MySQL... although they always tried to force licensing for commercial products even though it is GPL...). If you need a cluster, both have clustering solutions, but as far as I can tell they are not as good as the commercial parallel databases. If you have lots of money then sure go with Oracle, it seems through word of mouth Oracle is the best for both parallel and stand alone in terms of performance. DB2 was good enough for a former job. They had terabytes in the mid 1990's using about 20 servers. Now that the hardware is much better I'm sure it scales even better... But if money is a consideration, then go with an open source NoSQL solution. A lot of people now swear by Cassandra, I haven't had a chance to check it out yet.
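To illustrate the consistency-for-availability trade mentioned above, here is a toy quorum model. It is not Amazon's implementation (no vector clocks, no hinted handoff); `Replica`, `write`, `read` and `cart_after_partition` are names invented for this sketch. With N replicas, requiring W write acknowledgements and R read answers such that R + W > N means a read always overlaps the latest write; smaller values keep the service answering at the risk of stale data.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when too few replicas are reachable to satisfy the quorum."""


@dataclass
class Replica:
    store: dict = field(default_factory=dict)  # key -> (version, value)
    up: bool = True


def write(replicas, key, value, version, w):
    """Store on every reachable replica; require at least w of them."""
    reachable = [rep for rep in replicas if rep.up]
    if len(reachable) < w:
        raise Unavailable(f"{len(reachable)} replicas up, need {w} to write")
    for rep in reachable:
        rep.store[key] = (version, value)


def read(replicas, key, r):
    """Require at least r reachable replicas; return the newest value seen."""
    reachable = [rep for rep in replicas if rep.up]
    if len(reachable) < r:
        raise Unavailable(f"{len(reachable)} replicas up, need {r} to read")
    answers = [rep.store[key] for rep in reachable if key in rep.store]
    if not answers:
        return None
    return max(answers, key=lambda answer: answer[0])[1]  # highest version wins


def cart_after_partition(w, r):
    """Replay one failure scenario with N=3 and the given quorum sizes."""
    replicas = [Replica(), Replica(), Replica()]
    write(replicas, "cart", ["book"], version=1, w=w)
    replicas[2].up = False                    # one node drops out
    write(replicas, "cart", ["book", "lamp"], version=2, w=w)
    replicas[0].up = replicas[1].up = False   # the up-to-date nodes fail
    replicas[2].up = True                     # the stale node comes back
    return read(replicas, "cart", r=r)
```

With `w=1, r=1` the scenario still answers but returns the old cart, which is tolerable for shopping. With `w=2, r=2` the same scenario raises `Unavailable` instead of returning stale data, which is closer to what you would want for payments.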
I'm Still Fuzzy on NoSQL (Score:5, Interesting)
I'm still fuzzy on what NoSQL is supposed to be and what it is supposed to bring to the table.
From what I've understood, it's basically a common banner for various different databases that all share the common property of not being relational databases and not providing ACID guarantees.
If so, it seems to me that the whole NoSQL vs. RDBMS [wikipedia.org] debate is about a false dichotomy. There are some applications where a relational database is the right tool for the job, and there are some where a relational database is not the right tool for the job. In some of those latter cases, one of the NoSQL databases may be the right thing.
This is nothing new. Non-relational databases have been used on Unix for a long time, and are even a standard part of POSIX (see for example the manpage for dbm_open [opengroup.org]). It's also long been known that, for example, Berkeley DB [oracle.com] can be a lot faster than an RDBMS - as long as your application doesn't make use of all the features an RDBMS provides. Lots of programs don't even use one of these database systems, but invent their own, custom format. Git [git-scm.com] is a very successful example of this.
To me, it seems that what we are seeing here is loads of people who had learned to use relational databases for all their storage needs discovering that there are other ways to store data, and that one of those methods may work better than an RDBMS for a particular application. Well, yes. Does that surprise anyone? It sure doesn't surprise me. Does it mean that RDBMSes are now useless? Not at all. Does it mean you should use a non-relational storage system where this makes more sense? Of course! Now, can we please get back to work? I don't see the point of having a holy war over whether RDBMS or NoSQL is better, when common sense says that they both have their uses.
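For anyone who has never touched a dbm-style store, here is a minimal sketch using Python's standard `dbm` module, which wraps whichever dbm implementation is available on the system. The `remember_visits` helper and the file name are made up for the example.

```python
import dbm
import os
import tempfile


def remember_visits(path: str, user: str) -> int:
    """Increment and return a per-user counter kept in a dbm file."""
    with dbm.open(path, "c") as db:          # "c": create the file if needed
        count = int(db.get(user, b"0")) + 1  # values come back as bytes
        db[user] = str(count)                # str is encoded to bytes on write
        return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "visits")
        remember_visits(path, "alice")
        print(remember_visits(path, "alice"))
```

There is no schema, no query language and no joins: it is a persistent dictionary of bytes to bytes, which is all some applications need.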
Re: (Score:3, Insightful)
Unfortunately the NoSQL people should have called their movement "nonrelational". You can have a relational database and not use SQL; the two are not dependent on each other as there are nonrelational databases that allow the use of SQL. Although the movement for the use of nonrelational databases may be new, the use of nonrelationals is not. My first exposure to a business class database was Pick in the 70s. There are plenty of these types of systems in use today. Nonrelationals have been going strong for
SQL performance (Score:3, Insightful)
People complaining about SQL performance are most likely either using incorrectly scaled machines for the job, or believe they can throw a four-line SQL statement at the database and expect it to work out the optimization on its own ... query optimizers may be able to do a decent job on average, but once you go large databases (multi-million dataset tables), planning the query structure will go a long way preserving performance. ...
Yes one can write complicated queries to return exactly what you want in one query, but in many cases doing some logic around it and using smart grouping/loops will outperform the complex query
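A sketch of the two approaches being contrasted, using Python's built-in `sqlite3` with a made-up `orders` table. It only shows that both produce the same answer; which one is faster depends on the engine, the indexes and the data, so measure before assuming either way.

```python
import sqlite3


def setup():
    """Build a small in-memory table (hypothetical schema for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("ann", 10), ("ann", 30), ("bob", 5), ("bob", 7), ("cy", 50)],
    )
    return conn


def largest_order_sql(conn):
    """One query: a correlated subquery picks each customer's largest order."""
    rows = conn.execute(
        "SELECT o.customer, o.amount FROM orders o "
        "WHERE o.amount = (SELECT MAX(i.amount) FROM orders i "
        "                  WHERE i.customer = o.customer)"
    ).fetchall()
    return dict(rows)


def largest_order_loop(conn):
    """Simple scan, with the grouping logic done in application code."""
    best = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        if customer not in best or amount > best[customer]:
            best[customer] = amount
    return best


if __name__ == "__main__":
    conn = setup()
    print(largest_order_sql(conn) == largest_order_loop(conn))
```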
Rails "marginalized" by NoSQL? (Score:3)
Bullshit.
ActiveRecord? Definitely. Rails as a whole? You might consider replacing it with another Ruby framework, but the same ideas are going to apply. Remember how Rails and Merb are merging? Merb tends to be ORM-agnostic, but the recommended Merb stack suggested DataMapper, which does support a few NoSQL databases.
Even if you needed a different ORM per NoSQL database, it wouldn't marginalize Rails as a whole, but that simply isn't the case. Just use DataMapper, then plug in the flavor of the day.
As an example, Rails (and DataMapper) run on Google App Engine [googlecode.com].
Storage as a Service (Score:3, Insightful)
The article focuses on NoSQL's claim to scalability, but isn't that just one of the features of (some of the) NoSQL options?
Google, Amazon, and Microsoft all provide NoSQL storage as a service that is easy to use and cheap, particularly for getting started. Those are two pretty important features and I would imagine that it is those features, rather than dreams of needing vast scalability, that attract the many web startups.
even if you are not Google... (Score:3, Insightful)
It's easy to hit intrinsic performance limits with SQL databases even on small apps. And for people who aren't database experts, it's even easier since they don't know the hoops to jump through to make their SQL databases perform well. For the average programmer, it's easier to get good performance out of no-SQL databases.
Using SQL databases programmatically is a fairly silly notion to begin with: SQL was originally intended as an easy-to-use query language for non-experts because people were having trouble with navigating data structures. But programmers are excellent at navigating data structures and designing efficient data structures. SQL is solving a problem that most programmers don't have, and you're paying a big performance penalty for that.
Sometimes an SQL database is the right thing to use, sometimes it isn't. People really need to use their head instead of blindly picking one or the other solution.
Re: (Score:3, Insightful)
And as far as it being 'a product of the braindead and buzzw
Yes, it does. (Score:3, Interesting)