Oracle Claims Dramatic MySQL Performance Improvements 168

Posted by timothy on Thursday February 16, 2012 @03:01PM from the breathing-in-those-sulfurous-fumes dept.

New submitter simula67 writes "Oracle wins back some karma from the open source community by releasing MySQL cluster 7.2 with ambitious claims of 70x performance gains. The new release is GPL and claims to have processed over 1 billion queries per minute. Readers may remember the story about Oracle adding commercial extensions to MySQL."

This discussion has been archived. No new comments can be posted.

Oracle Claims Dramatic MySQL Performance Improvements

Load All Comments

Search 168 Comments Log In/Create an Account

Comments Filter:

Considering sub queries in IN statements. (Score:5, Insightful)

by micheas ( 231635 ) writes: on Thursday February 16, 2012 @03:12PM (#39064007) Homepage Journal

If you create a query in mysql with an IN statement in the where clause and you put a sub query as the in statement current versions will run the query once for each row of the primary table you are querying. Caching result alone would probably get the 70x speed up. I am suspect that there are other performance stupidities in mysql that are worked around by people doing a simple query and then using php/perl/python/java/etc to parse the result and generate the second query.

Share
twitter facebook
- Re:Considering sub queries in IN statements. (Score:5, Informative)
  
  by Anonymous Coward writes: on Thursday February 16, 2012 @03:22PM (#39064183)
  
  If you create a query in mysql with an IN statement in the where clause and you put a sub query as the in statement current versions will run the query once for each row of the primary table you are querying. Caching result alone would probably get the 70x speed up. I am suspect that there are other performance stupidities in mysql that are worked around by people doing a simple query and then using php/perl/python/java/etc to parse the result and generate the second query.
  yes, but many database engines have problems with inner select statements. Refactor your query to use a join an you'll be fine.
  
  Parent Share
  twitter facebook
  - Re:Considering sub queries in IN statements. (Score:5, Insightful)
    
    by K. S. Kyosuke ( 729550 ) writes: on Thursday February 16, 2012 @03:30PM (#39064299)
    
    yes, but many database engines have problems with inner select statements. Refactor your query to use a join an you'll be fine.
    And what's wrong with using an RDBMS with a non-braindead optimizer?
    
    Parent Share
    twitter facebook
    - Re:Considering sub queries in IN statements. (Score:4, Insightful)
      
      by cheater512 ( 783349 ) writes: <nick@nickstallman.net> on Thursday February 16, 2012 @04:56PM (#39065575) Homepage
      
      The poor RDBMS has to put up with crap like that over and over again.
      It will probably need therapy after you are through with it.
      The programmer is supposed to know how to use the tool effectively to get the best performance.
      In that example there is a stupidly obvious solution - use a join.
      
      Parent Share
      twitter facebook
      - Re:Considering sub queries in IN statements. (Score:5, Insightful)
        
        by K. S. Kyosuke ( 729550 ) writes: on Thursday February 16, 2012 @07:22PM (#39067561)
        
        The poor RDBMS has to put up with crap like that over and over again. It will probably need therapy after you are through with it.
        The programmer is supposed to know how to use the tool effectively to get the best performance. In that example there is a stupidly obvious solution - use a join.
        To me that sounds like fixing a bug in the wrong place. RDBMS is *not* supposed to be dumb. If it were supposed to be dumb, we would not be using SQL in the first place and we'd be passing our own execution plans in the DB engine's internal intermediate language instead. (Well, you can sort of do it with Firebird (BLR) and SQLite (VDBE), but it's not how it's supposed to be used.)
        
        Parent Share
        twitter facebook
        
        Re: (Score:2)
        
        by fatp ( 1171151 ) writes:
        
        RDBMS is *not* supposed to be dumb.
        
        It is supposed to do what it was asked - if someone asks it to do something dumb, it has no choice.
      - Re: (Score:3)
        
        by theshowmecanuck ( 703852 ) writes:
        
        This is going to happen less and less as programmers think they don't need to understand databases because they ORM like hibernate. I know of many programmers who don't want to know about sql beyond what they need to make their hibernate mapping work.
        
        Re: (Score:2)
        
        by apotheon ( 577944 ) writes:
        
        It's tough to blame those programmers who don't want to use SQL. It's a miserable language. Even one of its principle designers hates it.
        
        Re: (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        There are actually two different issues here. First, the relational model. That one is quite fine, it has strong theoretical foundations (easy to do correctness proofs of programs, introduce execution time optimizations, parallel execution, etc., unlike with previous pointer-chasing models), can model a lot of things and is pretty useful overall. Now, whether SQL is a good front end to an RDBMS...that's about as clever a question as whether COBOL is a good programming language. ORMs take away a lot of pain
        
        Re: (Score:2)
        
        by theshowmecanuck ( 703852 ) writes:
        
        But it doesn't mean that if one of the main technologies in your stack is a database, that you shouldn't have to do it well. Or that most ORM programmers should get their heads out of their asses and learn the language and methods of the underlying technology so they don't fuck up the performance of the system by always allowing endless layers of abstraction to in fact form a layer of cold molasses in the system.
        
        Re: (Score:2)
        
        by theshowmecanuck ( 703852 ) writes:
        
        Funny, I thought you were disagreeing with me. You are right, SQL is a butt ugly construct. But I'm not sure if there is a more human readable AND understandable way to do it (you can't expect only math majors to query the database with nice neat set based mathematical constructs only understandable by math majors). And all I was trying to point out was that as ugly as the language is, my original points still stand: part of the problem with DB performance is programmers who think the closest they need to g
      - Re: (Score:2)
        
        by u38cg ( 607297 ) writes:
        
        Point one, there are queries that can only be written using a sub-query (unusual, but they exist); point two, computers are supposed to help, not hinder. What's the point of having a declaritive language if I can't declare a query in the form I think of it in?
- Re:Considering sub queries in IN statements. (Score:5, Informative)
  
  by wstrucke ( 876891 ) writes: on Thursday February 16, 2012 @04:01PM (#39064785)
  
  You're probably right, but that's not what this release is referring to. The NDBCLUSTER engine [mysql.com] separates "API" nodes from Data nodes. A server running MySQL with NDBCLUSTER enabled is considered an API node, but you can also have a C++ or Java or whatever API node that isn't MySQL. Data nodes are provisioned in one or more "node groups" with one or more data nodes in each group, though it would be dumb to have a single node group or a single node in a node group. Each node group splits the data somehow. You can force tables to exist on just one group, but by default if you have three node groups you would have approximately a third of your data in each group.
  
  Anyway -- prior to NDBCLUSTER 7.2 if you performed any join whatsoever the API node had to pull the complete tables from all data node groups prior to doing the join on itself and returning the result. This made join performance, simply put -- terrible. I've tested the same query on a standalone out of the box mysql server against a 7.1 cluster and had an 8 second query come back from the cluster in several minutes due to the join performance.
  
  NDBCLUSTER 7.2 adds what was called "push down joins" [blogspot.com] in development -- basically the data nodes now do the joins within their own sub-sets of the data for certain joins resulting in a dramatic improvement in performance, since now the API nodes just get the result from the network instead of the entire dataset.
  
  It really is an amazing improvement and is a result of the dynamic capabilities of MySQL. NDBCLUSTER was never designed for the type of data people are throwing at it, and with the recent improvements it might actually be viable for use on production web sites for more common usage scenarios.
  
  What I do not see addressed yet with 7.2 is the reload time -- if you have a cluster loaded with several GB of data it can take upwards of 20 minutes to reload the data and indices across the cluster. While the cluster is designed to be up 24x7 (even through upgrades), a single bug or memory error that takes it down can result in a prolonged outage. There are several of these [mysql.com] open in the bug tracker.
  
  Parent Share
  twitter facebook
- Re:Considering sub queries in IN statements. (Score:4, Informative)
  
  by Anonymous Coward writes: on Thursday February 16, 2012 @04:15PM (#39064979)
  
  If you create a query in mysql with an IN statement in the where clause and you put a sub query as the in statement current versions will run the query once for each row of the primary table you are querying. Caching result alone would probably get the 70x speed up
  We have exactly that in MariaDB 5.3 (a fork of MySQL) : http://kb.askmonty.org/en/subquery-cache. It won't always give the 70x speedup, sometimes you need to start execution from the subquery to get decent performance: http://kb.askmonty.org/en/semi-join-subquery-optimizations.
  I am sorry for shameless self-promotion, couldn't resist :-)
  
  Parent Share
  twitter facebook
- Re:Considering sub queries in IN statements. (Score:4, Informative)
  
  by jon3k ( 691256 ) writes: on Thursday February 16, 2012 @04:44PM (#39065359)
  
  Just fyi, It's called a correlated subquery.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by K. S. Kyosuke ( 729550 ) writes:
    
    Just fyi, It's called a correlated subquery.
    If the thing he's talking about had been correlated subquery, he wouldn't have been able to cache its results. Ergo, he must be talking about a lousy execution of a non-correlated subquery.
- Re: (Score:3)
  
  by StuartHankins ( 1020819 ) writes:
  
  If your living depends on you knowing how to do something well, you will learn to use the tools most effectively.
  
  Professional DBA's know this, and know how to code correctly to avoid the problem in the first place. Having it "work" vs having it work reliably, quickly, and in a manner which scales is what separates any computer user from a professional. Training, ambition, curiosity, and experience are what transforms you into that professional.
  
  So while it may be nice to have the machine try to outsmart yo
  - Re: (Score:2)
    
    by mosb1000 ( 710161 ) writes:
    
    While I agree it's a professional's job to know this stuff, I would also like to point out it's not a crime to build your RDBMS in such a way that queries run as fast as possible regardless of how the query was written. Indeed, anything less is an inferior product.
- Re: (Score:2)
  
  by Terrasque ( 796014 ) writes:
  
  Here's an actual example:
  http://code.google.com/p/django-tagging/issues/detail?id=160 [google.com]
  Basically, the subquery was run all the time, leading to an amplification of biblical proportions. The solution there was to put that subquery in another subquery (yo dawg), as shown in http://www.xaprb.com/blog/2006/04/30/how-to-optimize-subqueries-and-joins-in-mysql/ [xaprb.com]
  We also tried with a temp table, but this solution was cleaner :) The performance difference was around 1/100th of the time for a sample query.
  We also confirm
- - Re: (Score:3, Interesting)
    
    by Anonymous Coward writes:
    
    The latest (past year's worth) of PostgreSQL releases automatically rewrite subqueries as joins for you.
    However, MSSQL 2008 performance dropped significantly from prior versions due to the exact opposite --- subqueries are now looped through, where previously they were handled with hash tables internally.
    - Re: (Score:3)
      
      by rev0lt ( 1950662 ) writes:
      
      Following this thread, I decided to benchmark (with EXPLAIN ANALYZE) a simple IN vs JOIN statement on both PostgreSQL 8.2 and PostgreSQL 8.4. The statements are something as follows:
      
      SELECT * FROM STATEMENTS WHERE CLIENT IN (SELECT ID FROM CLIENTS WHERE STATUS=1);
      SELECT * FROM STATEMENTS JOIN CLIENT ON STATEMENTS.CLIENT=CLIENTS.ID WHERE CLIENTS.STATUS=1;
      
      The statements table has about 2.5 million rows and clients has about 30k rows. The match (status=1) results in roughly 10k rows.
      
      Now the interesting
      - Re: (Score:2)
        
        by swilly ( 24960 ) writes:
        
        You are doing a SELECT *, so when you use the JOIN you are getting more columns back then when you are using IN. Rewrite to specify which columns you want back, and I suspect they will perform equally.
        
        Re: (Score:2)
        
        by rev0lt ( 1950662 ) writes:
        
        I've rewritten the SELECT to return 3 columns, and in 8.4 using IN is a bit faster than using JOIN (but not by a relevant difference), so you are probably right.
  - Re: (Score:3, Informative)
    
    by micheas ( 231635 ) writes:
    
    SELECT id, title FROM page WHERE id IN (SELECT pageid FROM hotnews WHERE user = 8)
    The answer to this is to execute the sub query and pass the result back as an array. If you have 10,000 records in table page, you will see about a 2000x speed improvement if you are using php
    - Re: (Score:3)
      
      by K. S. Kyosuke ( 729550 ) writes:
      
      That's utter crap. Why involve the client? Why the extra roundtrips? If you need to do the thing you've just described, you must be doing something horribly wrong.
      - Re: (Score:2)
        
        by errandum ( 2014454 ) writes:
        
        Where do you see the client and the roundtrips? There would be only one (DB access once) and the the this method would be 100% transparent to the client.
        
        Re: (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        I don't know, the mention of PHP and "passing something back", perhaps? It's not my fault if someone can't express himself clearly. Anyway, the query as it is written already does what the GGP suggests should be done "manually".
        
        Re: (Score:2)
        
        by errandum ( 2014454 ) writes:
        
        Yes, he could have phrased it better, but php is processed server side, the client only gets the final result. So no more round trips except the ones between server and database (and in most cases, they are on the same machine or cluster, so it's negligible).
        His method does speed up things if the query is executed with an IN and not with a JOIN... Even if it's not a best practice (lots of concurrency issues if you make things like that frequently).
        
        Re: (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        "Yes, he could have phrased it better, but php is processed server side, the client only gets the final result."
        Uhm, no. For the DB server, the PHP process is the client. As it is a good custom to have the two machines separate (I really don't know where you got the idea of running these two things on a single machine, that's all but insane), you do get round trips over wire. These will definitely cost you more processing time than keeping it inside the DB server process memory space the whole time - you have to convert the inner query results to the wire protocol, send it through the TCP/IP stack, have it arrive int
        
        Re: (Score:2)
        
        by errandum ( 2014454 ) writes:
        
        And, as stated, it still runs faster than the IN statement (not the JOIN, as everyone already said). The server is the client... Never heard that one before, I guess the OP is not the only one that needs to express itself better. From where I come from never heard the server called client before.
        Second, for small businesses and the like, yes, the webserver and the DB server share resources. It's not only common, it's actually the default configuration option when you configure apache and php (so I'm guessin
        
        Re: (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        And, as stated, it still runs faster than the IN statement (not the JOIN, as everyone already said).
        If you have that as a practical experience, then there must be something horribly broken in your DB server software.
        the time (sometimes minutes) that the server takes to process the IN, so you're still wrong.
        Ahhh, that's the "horribly broken thing" I'm talking about. I'm not wrong, since there is no practical reason for that to take minutes, unless, of course, the result of the subquery (SELECT pageid FROM hotnews WHERE user = 8) is really huge. Do you really have authors who have written millions of stories?
    - Re: (Score:2, Informative)
      
      by Anonymous Coward writes:
      
      That's just a crappy query... no wonder you've got bad performance.
      
      SELECT id, title FROM page JOIN hotnews on page.id = hotnews.pageid AND user = 8
      
      There. Enjoy your performance boost.
      - Re: (Score:2)
        
        by K. S. Kyosuke ( 729550 ) writes:
        
        The inner query was *perfectly fine*. It wasn't correlated with the outer query in any way. Nothing prevents the RDBMS to optimize the sequence of these to queries by first fetching the results from the inner query, hashing it and using it to filter the outer query. Optionally, the optimizer is free to rewrite it to a join on its own if the expected result set from the inner query is large.
      - Those queries (Score:2)
        
        by warrax_666 ( 144623 ) writes:
        
        are exactly equivalent. If your RDBMS is so crappy that it can't see that, then that's not the user's problem. SQL was supposed to be an abstraction, right?
    - Re: (Score:3)
      
      by viperidaenz ( 2515578 ) writes:
      
      I swear some people don't even know there is a JOIN keyword in the SQL language.
      - Re: (Score:2, Insightful)
        
        by Anonymous Coward writes:
        
        "some" people are perfectly aware of JOIN and have suffered one too many hairball, incomprehensible SQL statements with 10+ tables mashed together under FROM using JOIN. Sub-queries provide opportunities to factor out some of these into discrete parts that can be individually examined and tested.
        Please, stop discouraging the use and application of sub-queries. If you have encountered people that insist on using them despite MySQL's tragically bad optimizer, it is likely they have been exposed to powerful
        
        Re: (Score:2)
        
        by sg_oneill ( 159032 ) writes:
        
        Which would be fine if it wasnt for the fact the version with the JOIN isn't just faster, but simpler to understand.
        Seriously there is no excuse for bad SQL. Subqueries have their places, but using them like that is just bad programming.
        
        Re: (Score:3)
        
        by rev0lt ( 1950662 ) writes:
        
        I've just benchmarked 2 simple queries on PostgreSQL 8.2 and 8.4, and in 8.4 the IN clause is being translated to a hash semi-join, so it is faster (at least otn the simple testcase) than using JOIN.(I've already posted the queries, so if interested, scrolll up) While performance is important. using joins on complex queries usually result on an ugly mess, and I usually prefer maintainability over performance. Also, the parent is right - for complex expressions, it is way easier to test the individual subse
    - Re: (Score:2)
      
      by StuartHankins ( 1020819 ) writes:
      
      Before using an "in" clause here I'd probably use this (depends on the number of anticipated rows in both tables):
      select id, title from page as a where exists ( select 1 from hotnews b where b.user = 8 and b.id = a.id ) I would rearrange the fields in the "where" clause to match the index, highest selectivity first. If there were no suitable index I'd consider creating one, possibly a covering index.
    - - Re: (Score:2)
        
        by micheas ( 231635 ) writes:
        
        The version of mysql in debian no longer silently truncates strings by default.
        I found out as I was abusing the "feature" to not worry about sanitizing the length of data input and my import script suddenly stopped working last year and I had to do things a little less wrong.
        .
  - Re: (Score:2)
    
    by Admiral Llama ( 2826 ) writes:
    
    MySQL choked an on "...and (condition or condition)" doing a seq scan instead of filtering even though there was an index on the table that those conditions were on. I changed it to an "where in (...)" and I got the same result. I had to move the or conditional into being a derived table joined in.
    For the same type of query Informix does a dynamic hash join with no SQL thuggery necessary.
Finally! (Score:3)

by K. S. Kyosuke ( 729550 ) writes: on Thursday February 16, 2012 @03:14PM (#39064053)

I knew they would implement the Oracle storage backend into MySQL one day...

Share
twitter facebook
- Re:Finally! (Score:4, Interesting)
  
  by Admiral Llama ( 2826 ) writes: on Thursday February 16, 2012 @03:46PM (#39064539)
  
  No, rather it's amazing what happens when you architect the DB such that the entire thing is forcibly held in memory.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by K. S. Kyosuke ( 729550 ) writes:
    
    That was supposed to be a joke. :)
MySQL Cluster (Score:2)

by CastrTroy ( 595695 ) writes:

MySQL Seems like it could be interesting, but I can't get over how it requires the whole thing to be hosted in memory. I'm much more interested in Percona Cluster [percona.com] which is also based off MySQL.
- Re: (Score:2)
  
  by ducomputergeek ( 595742 ) writes:
  
  Using memory for db cluster nodes isn't new. Teradata has been doing it for a long time now in their data warehousing. It has its advantages on large databases and you have the money to afford the nodes you need for reliability.
- - Re: (Score:2)
    
    by tqk ( 413719 ) writes:
    
    ... Unless you're running WAMP or something.
    I hate people like you. [urbandictionary.com]
    dict wamp From V.E.R.A. -- Virtual Entity of Relevant Acronyms (June 2006) [vera]:
    WAMP Windows, Apache, MySQL, PHP (Windows, Apache, PHP, DB, SQL)
    Damnit, define your !@#$ing terms, and what a stupid acronym that is! The OS should be irrelevant.
    I really, really hate this century. :-P
    [Okay, I feel better now. Apologies for the outburst, but please do feel free to suck a rock and die.]
MySQL Cluster != MySQL (Score:5, Informative)

by Anonymous Coward writes: on Thursday February 16, 2012 @03:15PM (#39064083)

Not the same thing.

Share
twitter facebook
- Re: (Score:2)
  
  by Animats ( 122034 ) writes:
  
  MySQL Cluster used to be a pay-only product. Is it free now?
  (The setup with one MySQL master and N read-only slaves has been free for years. That powers Wikipedia, for example. It's not MySQL Cluster. A MySQL Cluster has multiple read-write machines.)
But of course it reads from RAM (Score:4, Interesting)

by photonyx ( 2507666 ) writes: on Thursday February 16, 2012 @03:20PM (#39064137)

The case when the data set is bigger than RAM amount has not been investigated (link here [blogspot.com], see the comments). The hard drive I/O speed would slow it dramatically, unless it's an expensive array of SSDs.

Share
twitter facebook
- Re: (Score:3)
  
  by fuzzyfuzzyfungus ( 1223518 ) writes:
  
  Nothing can really save your performance numbers once you have to hit the platters, so the question would be not 'does it suck more when it can't work purely in RAM?' but 'How good are the mechanisms available for minimizing(ideally automatically, but at least without too much black magic on the application programmers' parts) the frequency with which your setup ends up needing something that isn't in RAM and being bottlenecked by having to fetch it from disk?'...
  - Re: (Score:2)
    
    by tqk ( 413719 ) writes:
    
    How good are the mechanisms available for minimizing(ideally automatically, but at least without too much black magic on the application programmers' parts) the frequency with which your setup ends up needing something that isn't in RAM and being bottlenecked by having to fetch it from disk?
    Are we talking about a web back end db, or just a generic db? I ask as the page I'm replying to shows (according to AdBlockPlus) 47 separate entities running in the background. With all that !@#$ going on, what does DB fetch performance even mean?
    I think network performance is far and away the most determinant factor in this stuff. Optimize your queries to hell and back, but you'll still have to wait for the network. Zzzzzz ...
    "The DB's too slow!"
    Not necessarily. Why don't people understand Performance
- Re: (Score:3)
  
  by Bill, Shooter of Bul ( 629286 ) writes:
  
  Who the hell uses actual disks when performance is an issue?
  Also, who the hell uses ssd's when performance is an issue?
  The fast kids use PCI Express cards loaded with flash.
  Texas Memory systems or Fusion IO. Expensive, yes, but this is Mysql CLUSTER we are talking about. You don't use that unless you need the speed.
- Re: (Score:2)
  
  by sitkill ( 893183 ) writes:
  
  That's weird....I believe that when the dataset is bigger than ram, you just break it into different servers (or just get more ram). This isn't anything new, this is actually old OLD technology, and you can find cases of this everywhere (google, microsoft uses it for hotmail, etc), called the Brick architecture or something like that.
  
  Actually, a ton of research was done on these types of architectures by Jim Gray (who tragically was lost at sea a few years ago) at Microsoft. Not sure how many of these li
- Re: (Score:2)
  
  by Trogre ( 513942 ) writes:
  
  That would make it a RAED array.
That's great... (Score:2)

by Daetrin ( 576516 ) writes:

Now can they please work on some dramatic usability improvements so i don't have to cringe every time an Oracle support question comes up at work?
- Re: (Score:2)
  
  by jedidiah ( 1196 ) writes:
  
  If you can't be bothered, you can ignore a lot of the finer details in Oracle just as much as you would be prone to with any other RDBMS.
  - Re: (Score:2)
    
    by Daetrin ( 576516 ) writes:
    
    Uh, if it was something i could ignore then i wouldn't be cringing, i would be just shrugging and ignoring the problem. As long as some of our (big) customers insist on using Oracle we have to work with Oracle to support them. And they're not going to accept "we didn't implement/debug that feature in Oracle because working with Oracle is a pain" very well.
- Re: (Score:2)
  
  by tqk ( 413719 ) writes:
  
  Now can they please work on some dramatic usability improvements so i don't have to cringe every time an Oracle support question comes up at work?
  You may wish to question whether you're in the wrong line of work. RDBs aren't really all that complex. Stuff it in, then drag it out and present it in a pretty display format. What else is there to know?
  People are way too hung up on complexity. Life doesn't have to be that hard. You're making it harder than it has to be. Simplify.
  - Re: (Score:2)
    
    by Daetrin ( 576516 ) writes:
    
    I think you're pretty massively missing the point? It's not RDBs that are the problem, it's Oracle specifically. Try developing complex software using a SQL back end that supports both MS SQL and Oracle. Getting everything working on Oracle takes longer. And when there are problems it takes longer to debug on Oracle. Just getting Oracle installed properly was a huge pain in the ass compared to MS SQL.
    
    _Oracle_ is making it harder than it has to be. If i had the choice i'd simplify just by dropping support
    - Re: (Score:2)
      
      by tqk ( 413719 ) writes:
      
      It's not RDBs that are the problem, it's Oracle specifically.
      Ah. On that, we agree. Larry's one sharp snakeoil salesman.
Yes, I RTFA (sue me) (Score:4, Informative)

by billcopc ( 196330 ) writes: <vrillco@yahoo.com> on Thursday February 16, 2012 @03:30PM (#39064301) Homepage

If I read the sales pitch correctly, they just integrated Memcached as a backend storage module, so that it plays nicely wrt ACID compliance. Yeah, memory is 70x faster than disk I/O... big whoop!
Anyone running a sizeable MySQL installation already has heaps of RAM allocated to the InnoDB buffers/caches anyway. It sounds like Oracle compared a stock, distro-default MySQL to their memory-hungry tweaks. Yeah, DUH. I can get a 70x speedup too if I increase MySQL's memory usage from the default 64mb to 48 gigabytes.

Share
twitter facebook
- Re: (Score:3)
  
  by medv4380 ( 1604309 ) writes:
  
  If you're going to read an article about a benchmark you might actually read the benchmark too.
  Previously each data node exploited about 6 CPUs with 8 threads, in this benchmark we used about 14 CPUs with 24 threads. We've also removed a number of bottlenecks and the configuration of the threads is done in such a manner as to handle many different loads well.
  The little thing you read about memcached was about adding in a Native API which probably helped things, but getting the multithreading upgraded probably had more to do with it. They also were comparing two benchmarks that both were using just memory, and not a case where 1 used disk and the other used ram.
- Re: (Score:2)
  
  by dreemernj ( 859414 ) writes:
  
  It sounds like the RAM isn't actually the real boost. This is MySQL Cluster. The last time I used it, it kept everything in RAM already. I think since then they let you store non-indexed tables onto disk, but that's it. The speed boost sounds like it is for NDB, not InnoDB. NDB doesn't support foreign keys and removing that constraint probably helps its performance too.
  
  In this case, the actual boost is from AQL, which allows for a query to be split up by JOINs and sent out to different servers to run
  - - Re: (Score:2)
      
      by dreemernj ( 859414 ) writes:
      
      Yep, that's exactly what I said. But thank you for reiterating.
- Re: (Score:2)
  
  by lewiscr ( 3314 ) writes:
  
  DUH. I can get a 70x speedup too if I increase MySQL's memory usage from the default 64mb to 48 gigabytes.
  Just make sure you actually test it. Some of the performance recommendations in the manual don't scale from 64 MB to 48 GB. MyISAM Key Buffer and Query Cache, I'm looking at you.
  - Re: (Score:2)
    
    by billcopc ( 196330 ) writes:
    
    I, er... I think I shorthand a lot of this stuff, because it seems trivial to me as a programmer. Anything that works like a hashtable is going to experience diminishing returns, because you have to scan through that every-growing table. O(n^2) and stuff.
    On the other hand, anything that spares you from hitting a disk, that's a guaranteed win. What I commonly do with my DB servers, since my data sets are small but complex, is to give them enough memory to cache the entire database into RAM. The disk effe
Comment removed (Score:3)

by account_deleted ( 4530225 ) writes: on Thursday February 16, 2012 @03:35PM (#39064375)

Comment removed based on user account deletion

Share
twitter facebook
MySQL in-memory speed (Score:5, Interesting)

by Amadodd ( 620353 ) writes: on Thursday February 16, 2012 @03:35PM (#39064377)

Slightly off-topic, but I recently had the oppurtunity to test the speed of a MySQL in-memory database. I have some frequently queried read-only data that simply would not handle the load in MS SQL and was looking for an in-memory solution. MySQL provided the simplest implementation - simply tell the table to use memory storage and configure the server to allow the amount of data you want to host (~250MB in this case). You also have to remember to reload the data from normal InnoDB tables every time you restart the server. I used the same table structures, keys indexes and stored procedures (almost the same) to query the data and linked it through MS SQL so that my applications never new the difference. On exactly the same hardware the speed increase was at least 50X over MS SQL.

Share
twitter facebook
- Re: (Score:2)
  
  by bbbaldie ( 935205 ) writes:
  
  I've programmed PHP to use SQL Server and MySQL, the MySQL statements typically run in milliseconds, it usually takes SQL Server two or three seconds to respond.
  - Re: (Score:2)
    
    by rgbrenner ( 317308 ) writes:
    
    it usually takes SQL Server two or three seconds to respond
    then you're incompetent. stackoverflow uses SQL Server 2008, and serves 12-14 million pages a day. [brentozar.com]
    - Re: (Score:2)
      
      by gbjbaanb ( 229885 ) writes:
      
      that's 162 *pages* per second, not SQL queries. Thing is, how many of those are cached and simply re-served up again? How big are the servers that run these? How many servers do they use? How much Lucene do they use instead of SqlServer??? (you didn't watch the video did you? - 23 minutes in)
      Anyway, your argument is like saying Facebook takes 2 seconds to return my wall of data whereas MyPersonalWebsite with a PHP script I wrote returns instantly, therefore Facebook are useless losers.
      There might well be a
    - - Re: (Score:2)
        
        by SuiteSisterMary ( 123932 ) writes:
        
        Well, what did you expect?
        
        Re: (Score:2)
        
        by SuiteSisterMary ( 123932 ) writes:
        
        Sorry, they're still busy running a query on a guy, involving a table lock and several joins.
  - Re: (Score:2)
    
    by xaxa ( 988988 ) writes:
    
    I've programmed PHP to use SQL Server and MySQL, the MySQL statements typically run in milliseconds, it usually takes SQL Server two or three seconds to respond.
    Can you give an example?
    We're evaluating Postgres vs MS SQL at work, and at the moment haven't found any problem big enough to rule either out.
    - Re: (Score:2)
      
      by codepunk ( 167897 ) writes:
      
      problems other than cost you mean.
      - Re: (Score:2)
        
        by xaxa ( 988988 ) writes:
        
        problems other than cost you mean.
        We already have an MS SQL database for the HR, finance (etc) systems, so using that might be as cheap (or cheaper) than a new Postgres database, given existing staff (DBA) knowledge and some odd way they fudge the numbers. (Recurring costs for MS software are apparently negotiated and paid for centrally, rather than charged to the department using the software. Daft.)
        Essentially, arguing on grounds of inital/ongoing cost is difficult. My best cost argument is the no-cost license for Postgres means we coul
  - Re: (Score:2)
    
    by rev0lt ( 1950662 ) writes:
    
    Did you test the queries on the actual database server, or did you just run your script? Were you using regular drivers, or PDO? Was the SQL server on the same machine, or on a separate one?
    Just because your script didn't run well, doesn't mean that SQL Server was slower. Probably the PHP database modules had a lot to do with it - I guess there is much more attention given to MySQL integration than with SQL Server, or some misconfiguration (always 2-3 seconds? Maybe is a DNS issue, were you using hostnames
The 70x is for (distributed) joins (Score:5, Informative)

by jorela ( 84188 ) writes: on Thursday February 16, 2012 @03:56PM (#39064693)

which previously were quite poorly handled.
See http://www.clusterdb.com/mysql-cluster/70x-faster-joins-with-aql-in-mysql-cluster-7-2/?utm_source=rss&utm_medium=rss&utm_campaign=70x-faster-joins-with-aql-in-mysql-cluster-7-2

Share
twitter facebook
70x (Score:5, Funny)

by slasho81 ( 455509 ) writes: on Thursday February 16, 2012 @04:06PM (#39064835)

Developers: We've got some really good ideas for increasing performance of complex queries by...
Marketing: How much in the best conceivable case?
Developers: Oh, I dunno, maybe 70x.
Marketing: 70x? Is that good?
Developers: Yeah, I suppose, but the cool stuff is...
Marketing: Wow! 70x! That's a really big number!
Developers: Actually, please don't quote me on that. They'll make fun of me on Slashdot if you do. Promise me.
Marketing: We promise.
Developers: Thanks. Now, let me show you where the good stuff is...
Marketing (on phone): Larry? It's me. How big can you print me up a poster that says "70x"?

Share
twitter facebook
- - Re: (Score:2)
    
    by slasho81 ( 455509 ) writes:
    
    Larry is just a random sleazeball from Marketing. I didn't think of Ellison.
    - - Re: (Score:2)
        
        by socceroos ( 1374367 ) writes:
        
        [quote]Larry is sleazeball[/quote]
        ftfy
- - Re: (Score:2)
    
    by slasho81 ( 455509 ) writes:
    
    Marketers being marketers. :)
TPC-H (Score:2)

by Skinkie ( 815924 ) writes:

So if Oracle is so happy with MySQL. Give us a TPC-H [tpc.org] benchmark. Scalefactor: 100.
- Re:PostgreSQL (Score:5, Funny)
  
  by Hognoxious ( 631665 ) writes: on Thursday February 16, 2012 @03:35PM (#39064373) Homepage Journal
  
  Is PostgreSQL webscale? MongoDB is.
  
  Parent Share
  twitter facebook
  - Re:PostgreSQL (Score:5, Funny)
    
    by roman_mir ( 125474 ) writes: on Thursday February 16, 2012 @03:46PM (#39064523) Homepage Journal
    
    Any questions? [youtube.com]
    
    Parent Share
    twitter facebook
    - Re: (Score:3)
      
      by Sarten-X ( 1102295 ) writes:
      
      The first 27 seconds made a good point about comparing various products and evaluating them on their merits. The remaining 5 minutes was a mix of strawmen and fallacies. Is there supposed to be a point to that?
  - Re: (Score:3)
    
    by errandum ( 2014454 ) writes:
    
    http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale [xtranormal.com]
  - Re: (Score:2)
    
    by dkleinsc ( 563838 ) writes:
    
    Yes: It's strengths are reliability and price point (free), but it's pretty fast, has clustering capabilities, and has been used for large-scale web applications.
    Seriously, it's hard to go wrong with PostgreSql when you need a relational database. MongoDB, of course, is a very different animal intended for very different tasks.
    - Re: (Score:3)
      
      by Admiral Llama ( 2826 ) writes:
      
      Versus all the other factors you can throw in there for anything involving heavy lifting for an enterprise app, raw price point of the DB engine is pretty close to the bottom of the list.
      - Re: (Score:2)
        
        by jedidiah ( 1196 ) writes:
        
        Are you kidding? In an Oracle discussion?
        We're talking very large numbers here, and potentially more money than you will make in your entire lifetime.
        Some one definitely cares. They may bite their tongue and still buy Oracle but they do care. The numbers are not trivial.
- Re: (Score:2)
  
  by Daniel Phillips ( 238627 ) writes:
  
  Kubuntu did a nasty on me with a surprise, non-optional upgrade to kmail 4.7.2, which is just an unfinished work, to put it mildly. As a confirm glutton for punishment, I decided to make it work no matter what the coast. Anyway, it came with a MySQL backend that was sort-of working, but one day it just fell down and couldn't get up. So I replaced it with the experimental, unsupported PostgresQL backend, and after a little bit of pain learning how to administrate the server, it came up and so far has worked
  - Re: (Score:2)
    
    by gmack ( 197796 ) writes:
    
    At least when PostgreSQL starts working it stays working but I agree the same can't be said for MySQL. A recent upgrade changed the layout of mysql.user causing none of the users to be able to login. The oracle approved fix? mysqdump and restore but that just restored the bad layout of mysql.user. The actual fix involved dumping everything one database at a time and recreating mysql.user from a script. My best guess is that whoever designed the MySQL login system ignored SQL best practices and did the
  - - Re: (Score:2)
      
      by Daniel Phillips ( 238627 ) writes:
      
      If you start to manage lots of data ( and I think some people enter into this category for their email ), it make sense to use a DB.
      It makes sense on paper. It has yet to be proved that it makes sense for email, and in fact, so far kmail 7.2+ is pretty good evidence for the contrary. I am afraid you will have a difficult time finding anyone who has stayed with Kmail through its recent convulsions and has in fact forgotten about the performance issue. Just do a quick search on "kmail 7.3 slow" to convince yourself.
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
- Re: (Score:2)
  
  by hawguy ( 1600213 ) writes:
  
  k. This is slightly laughable. 33,500 rows? in 87 seconds? that seems glacial. And 1.23 seconds being the new speed? that seems as expected.
  aside from comparing the speed of a Lada vs a common garden slug, how does this compare against other databases?
  It was actually only 2000 rows returned -- the source tables had 35K rows. But in fairness, this was for a join across 11 tables, and it's easy for a database to perform slowly at this task if it doesn't have a good query optimizer. A badly optimized query could end up doing thousands (or even millions) of full table scans of multi-thousand row tables.
  - - Re: (Score:2)
      
      by rev0lt ( 1950662 ) writes:
      
      I'm with you on this one. 1.23 seconds seems like a dreadful result, and I can't imagine why they thought it was a good example.
      The only heavy MySQL application I use is Magento (and it relies heavily on JOINs), and a small server can blow these results out of the water, on similar-sized tables, and with multiple concurrent queries (but only using 3 to 5 JOINs per query).
      - Re: (Score:2)
        
        by rev0lt ( 1950662 ) writes:
        
        The thing is that prior to this release, applications that were using SQL joins (involving more than 2 tables) where ruled OUT:
        Now a lot of more applications *can* use MySQL Cluster.
        Yeah, that's a less flattering remark than a 70x increase in speed, but probably more useful :D. I'm not familiarized with MySQL Cluster, and I'm surprised it is_that_ different from the server version.
- Re: (Score:2)
  
  by Larryish ( 1215510 ) writes:
  
  Is there a Virtualmin module for MariaDB yet?
  If so, I'll switch tonight.
- Re: (Score:2)
  
  by Lennie ( 16154 ) writes:
  
  Tried Drizzle, yet ?

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Considering sub queries in IN statements. (Score:5, Insightful)

Re:Considering sub queries in IN statements. (Score:5, Informative)

Re:Considering sub queries in IN statements. (Score:5, Insightful)

Re:Considering sub queries in IN statements. (Score:4, Insightful)

Re:Considering sub queries in IN statements. (Score:5, Insightful)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:Considering sub queries in IN statements. (Score:5, Informative)

Re:Considering sub queries in IN statements. (Score:4, Informative)

Re:Considering sub queries in IN statements. (Score:4, Informative)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3, Interesting)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3, Informative)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2, Informative)

Re: (Score:2)

Those queries (Score:2)

Re: (Score:3)

Re: (Score:2, Insightful)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Finally! (Score:3)

Re:Finally! (Score:4, Interesting)

Re: (Score:2)

MySQL Cluster (Score:2)

Re: (Score:2)

Re: (Score:2)

MySQL Cluster != MySQL (Score:5, Informative)

Re: (Score:2)

But of course it reads from RAM (Score:4, Interesting)

Re: (Score:3)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

That's great... (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Yes, I RTFA (sue me) (Score:4, Informative)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Comment removed (Score:3)

MySQL in-memory speed (Score:5, Interesting)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

The 70x is for (distributed) joins (Score:5, Informative)

70x (Score:5, Funny)