Yale Researchers Prove That ACID Is Scalable

Yale Researchers Prove That ACID Is Scalable 272

Posted by CmdrTaco on Wednesday September 01, 2010 @12:49PM from the i-could-prove-lunch dept.

An anonymous reader writes "The has been a lot of buzz in the industry lately about NoSQL databases helping Twitter, Amazon, and Digg scale their transactional workloads. But there has been some recent pushback from database luminaries such as Michael Stonebraker. Now, a couple of researchers at Yale University claim that NoSQL is no longer necessary now that they have scaled traditional ACID compliant database systems."

Yale Researchers Prove That ACID Is Scalable

This discussion has been archived. No new comments can be posted.

Search 272 Comments Log In/Create an Account

Comments Filter:

Re:I hate SQL and Databases in General... (Score:5, Informative)

by poet ( 8021 ) writes: on Wednesday September 01, 2010 @01:36PM (#33438188) Homepage

Spoken with proud ignorance.
Anyone who has properly scaled an application knows the database isn't the problem. If it was, it wouldn't take 12 applications servers to bring the thing to its knees. That said, most of your gripes equate to:
I am not a DBA and therefore I do not understand DBA and therefore I must complain.
Further SQL has nothing to do with ACID. AT ALL!

Re:Pfah. (Score:5, Informative)

by Trieuvan ( 789695 ) writes: on Wednesday September 01, 2010 @01:41PM (#33438296) Homepage

It is if you use innodb .

Re:NoSQL is also about arbitrary schemas (Score:1, Informative)

by Anonymous Coward writes: on Wednesday September 01, 2010 @01:52PM (#33438440)

ANSI defines a query mechanism for walking heirarchies in Common Table Expressions. Granted, the garbage that is MySQL has no support for this, but the latest releases of Postgresql, Oracle, DB2 and Microsoft SQL Server all do using the same syntax. I will say that SQL is not really the best suited for such things, but it does work.
As for schema-less data, there are a couple of solutions which I believe are all DB-specific. Microsoft SQL Server allows storage of XML data as well as SQL extensions utilizing XQuery to query into that data. It also supports indexing that data and using XML schemas to constrict the nature of that data if necessary. Microsoft SQL Server 2008 also added sparse table support which is built on top of XML storage which allows a table to have 30,000 columns and optimized for the majority of those columns being NULL on any particular row. I know that ANSI does get into XML storage a little bit but I'm not sure which DBs actually implement the standard, if any, especially to a level where it would be a workable solution.

Just in case anybody else doesn't know... (Score:4, Informative)

by elwin_windleaf ( 643442 ) writes: on Wednesday September 01, 2010 @02:01PM (#33438582) Homepage

From the Wikipedia Article (http://en.wikipedia.org/wiki/ACID [wikipedia.org])
"In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction."

Re:I hate SQL and Databases in General... (Score:3, Informative)

by davidbrit2 ( 775091 ) writes: on Wednesday September 01, 2010 @02:07PM (#33438658) Homepage

...because on every application I have ever worked on, the Database has always been the performance bottleneck.
What alternative have you seen that handles the same workload more efficiently? Flat files? I've seen plenty of database-related performance issues, but it's almost never inherent in the database - it's the idiot that wrote the lousy table-scanning code that's reading a couple rows out of a table with millions that's the problem.
Testing of DB applications is always a problem, because the running of tests generally changes the database, rendering tests unrepeatable without reseting the database.
If only you could start something like a "transaction", which you could then "roll back" after finishing the test, leaving the database in its original state. And if you could somehow "back up" the database and "restore" it on a test server, or under a different name. That would be awesome.
And don't get me started on stored procedures and the difficulty of using source code management with stored procedures.
Checking your create/change scripts into source control is no more difficult than checking your C source in prior to compiling it.
SQL is fixed in a syntax and written with naming conventions and styles that can best be described as neo-Cobal.
While I don't totally disagree on this point, calling SQL "fixed" is a bit like saying C# and Java are the same. I promise you any meaty SQL Server code will not run on Oracle without very significant changes that will have to be done by someone that will cost you a lot of money (and likewise with Oracle to SQL Server). The capabilities vary wildly by platform, and the syntax is only identical for the simplest of CRUD statements.
Last gripe: A traditional Relational database imposes ACID overhead on every application, even if you don't really need it or use it. This is like a programming language that imposes a SORT overhead on all your data structures even if you rarely or never need to sort them.
I have to give this one a LOLWUT. If you're using a big RDBMS, it's likely a multi-user system. If you've got multiple users and connections, you want ACID. This isn't like imposing sorting overhead on data structures, it's like imposing the basic memory protection, process isolation, and filesystem durability you find in any competent operating system. If you want to see what it's like without those protections, go use Mac OS 9 for a week or so, or an Access database used by a few dozen people over a network.

Re:Pfah. (Score:1, Informative)

by Anonymous Coward writes: on Wednesday September 01, 2010 @03:43PM (#33440080)

But then it's not scalable.
If you want ACID and scalability, use Postgres.

Re:I hate SQL and Databases in General... (Score:4, Informative)

by Just Some Guy ( 3352 ) writes: <kirk+slashdot@strauser.com> on Wednesday September 01, 2010 @03:58PM (#33440296) Homepage Journal
And don't get me started on stored procedures and the difficulty of using source code management with stored procedures.
That's easily solvable:
1. Create a subdirectory called "storedprocs" inside your SCM directory.
2. Inside that subdirectory, make files with names like "checkinvoice.sql" that store the sequence of commands required to create a stored procedure - one per file. Start each one with a statement like CREATE OR REPLACE FUNCTION myschema.checkinvoice([...]).
3. Manage those files with your SCM system. Group them by database, or by project, or by phase of the moon, or by whatever else makes sense to you.
4. To update every stored procedure you've ever written, or to build out a new database: cd storedprocs; psql < *.sql
Stored procedures don't have to be any more difficult to manage than any other code.
Summary (Score:5, Informative)

by azmodean+1 ( 1328653 ) writes: on Wednesday September 01, 2010 @04:07PM (#33440488)

Short Summary:
We make some claims about scaling ACID databases, but then don't support them.
Longer summary:
We don't like NoSQL and enjoy making baseless cracks about it such as it being a "lazy" approach.
In our paper we demonstrate that our unconventional version of an ACID database scales better than a traditional ACID database in a specific environment, while merely throwing away some robustness guarantees and changing how transaction ordering works.
No direct comparison to any NoSQL implementation is made.
So yea, I'm not holding my breath for companies to start migrating away from NoSQL.

Re:Pfah. (Score:5, Informative)

by DragonWriter ( 970822 ) writes: on Wednesday September 01, 2010 @05:01PM (#33441258)

Doesn't work so well if you've got a graph structure or a tree. If in a family tree, you want to find all 5'th descendants or all descendants of some guy, SQL won't make you happy.

A decade plus ago, and that would be true.
Standard SQL from SQL-99 on will, in fact, do this quite easily with via recursive Common Table Expressions. Now, some SQL-based DBMSs don't support enough of the standard to use this, but, current versions of, I believe, DB2, Firebird, PostgreSQL, and SQL Server all implement standard CTEs well enough to do those examples in SQL fairly directly, and Oracle has its own proprietary syntax (CONNECT BY) that works for the examples that you pose, though its less general than SQL-99 recursive CTEs.

Re:Pfah. (Score:5, Informative)

by QuoteMstr ( 55051 ) writes: <dan.colascione@gmail.com> on Wednesday September 01, 2010 @05:28PM (#33441658)

An ACID compliant RDBMS can't even get read access to the user, car, friend, picture and pet_survey_answer table set as long as any of the million users of the system is making a change to his data, even if the application only locks one table at a time for write access, let alone the problem of a million users trying to gain write access to the same table at the same time.
You have no idea what you're talking about, probably because your brain has been irreversibly warped by MySQL. Concurrent writing is widely-supported.
Hint: MVCC [wikipedia.org].

Re:Pfah. (Score:5, Informative)

by LurkerXXX ( 667952 ) writes: on Wednesday September 01, 2010 @07:14PM (#33443234)

An ACID compliant RDBMS can't even get read access to the user, car, friend, picture and pet_survey_answer table set as long as any of the million users of the system is making a change to his data, even if the application only locks one table at a time for write access, let alone the problem of a million users trying to gain write access to the same table at the same time.
Wow. Just wow. Any serious ACID complient RDBMS can do that with no problem.

Book: SQL Antipatterns (Score:3, Informative)

by Futurepower(R) ( 558542 ) writes: on Thursday September 02, 2010 @10:02AM (#33449394) Homepage

SQL Antipatterns [amazon.com] may interest you. As one of the reviews says, "An excellent guide to database design tradeoffs".

Re:Pfah. (Score:3, Informative)

by GWBasic ( 900357 ) writes: <{moc.uaednorwerdna} {ta} {todhsals}> on Thursday September 02, 2010 @04:46PM (#33456962) Homepage
So, remember, NoSQL means that's anything but SQL. It's not a standard; rather, it's an honest effort to try to experiment with different database techniques where traditional SQL just isn't meeting an industry need. Key-value databases aren't going to satisfy the "give me how many widgets we sold in June to evil inventors in the tri-state area" need; but they do satisfy the scalability need for sites that have millions of concurrent users.
Regarding Mongo, the NoSQL database that I use, it can answer the "give me how many widgets we sold in June to evil inventors in the tri-state area." Basically, instead of having 100 tables with foreign key relationships, you'll have 10 collections of "documents," which are really just data structures. You can query deeply into data structures and return partial data structures.
Let's assume I have an "invoices" collection. Each invoice has an array of "line items", and each item has a count. I can do the following in Mongo:

Give me all of the documents from the invoices collection that are for companies in the tri-state area and have widgets X, Y, and Z: http://www.mongodb.org/display/DOCS/Dot+Notation+(Reaching+into+Objects) [mongodb.org]
(Use the above query,) but instead of returning full documents, just return the parts of the documents that represent the "line items" that match widgets X, Y, and Z: http://www.mongodb.org/display/DOCS/Retrieving+a+Subset+of+Fields [mongodb.org]
(Use the above query,) but just return the counts of widgets sold: http://www.mongodb.org/display/DOCS/Aggregation [mongodb.org]
Again, NoSQL isn't a standard. It's basically experimenting with different ways of having a database with the hopes of finding one that's easier to work with. Mongo is a lot closer to SQL then things like Key-Value databases.
Re:Pfah. (Score:3, Informative)

by DragonWriter ( 970822 ) writes: on Friday September 03, 2010 @11:58AM (#33465490)

I think the issue is that SQL is procedural, and query languages are better when declarative.

SQL, as such, is declarative. Many RDBMSs include, in addition to SQL, an SQL-derived procedural scripting language (Oracle's PL/SQL, and so on.)

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Yale Researchers Prove That ACID Is Scalable 272

Yale Researchers Prove That ACID Is Scalable More Login

Yale Researchers Prove That ACID Is Scalable

Re:I hate SQL and Databases in General... (Score:5, Informative)

Re:Pfah. (Score:5, Informative)

Re:NoSQL is also about arbitrary schemas (Score:1, Informative)

Just in case anybody else doesn't know... (Score:4, Informative)

Re:I hate SQL and Databases in General... (Score:3, Informative)

Re:Pfah. (Score:1, Informative)

Re:I hate SQL and Databases in General... (Score:4, Informative)

Summary (Score:5, Informative)

Re:Pfah. (Score:5, Informative)

Re:Pfah. (Score:5, Informative)

Re:Pfah. (Score:5, Informative)

Book: SQL Antipatterns (Score:3, Informative)

Re:Pfah. (Score:3, Informative)

Re:Pfah. (Score:3, Informative)

Related Links Top of the: day, week, month.

Slashdot Top Deals

Slashdot