Oracle Claims Dramatic MySQL Performance Improvements
New submitter simula67 writes "Oracle wins back some karma from the open source community by releasing MySQL Cluster 7.2 with ambitious claims of 70x performance gains. The new release is GPL and claims to have processed over 1 billion queries per minute. Readers may remember the story about Oracle adding commercial extensions to MySQL."
Considering sub queries in IN statements. (Score:5, Insightful)
Re:Considering sub queries in IN statements. (Score:5, Informative)
If you create a query in MySQL with an IN clause in the WHERE condition and put a subquery inside the IN, current versions will run that subquery once for each row of the primary table you are querying. Caching the result alone would probably account for the 70x speedup. I suspect there are other performance stupidities in MySQL that people work around by doing a simple query and then using PHP/Perl/Python/Java/etc. to parse the result and generate a second query.
Yes, but many database engines have problems with inner SELECT statements. Refactor your query to use a join and you'll be fine.
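The join rewrite being suggested can be sketched with a toy example (sqlite3 stands in for MySQL here, and the table and column names are invented): the same filter expressed as an IN subquery and as an explicit join returns identical rows, which is why a decent optimizer is free to pick one plan for both.

```python
import sqlite3

# Toy schema: "orders" and "customers" are invented names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, active INTEGER);
    INSERT INTO customers VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

# The problematic shape: IN with a non-correlated subquery.
in_form = conn.execute(
    "SELECT id FROM orders"
    " WHERE customer_id IN (SELECT id FROM customers WHERE active = 1)"
    " ORDER BY id"
).fetchall()

# The suggested rewrite: the same filter as an explicit join.
join_form = conn.execute(
    "SELECT o.id FROM orders o"
    " JOIN customers c ON c.id = o.customer_id"
    " WHERE c.active = 1 ORDER BY o.id"
).fetchall()

print(in_form == join_form)  # True: identical rows either way
```

On an engine that re-runs the subquery per outer row, only the second form avoids the blowup; on a good optimizer they are the same plan.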
Re:Considering sub queries in IN statements. (Score:5, Insightful)
Yes, but many database engines have problems with inner SELECT statements. Refactor your query to use a join and you'll be fine.
And what's wrong with using an RDBMS with a non-braindead optimizer?
Re:Considering sub queries in IN statements. (Score:4, Insightful)
The poor RDBMS has to put up with crap like that over and over again.
It will probably need therapy after you are through with it.
The programmer is supposed to know how to use the tool effectively to get the best performance.
In that example there is a stupidly obvious solution - use a join.
Re:Considering sub queries in IN statements. (Score:5, Insightful)
The poor RDBMS has to put up with crap like that over and over again. It will probably need therapy after you are through with it.
The programmer is supposed to know how to use the tool effectively to get the best performance. In that example there is a stupidly obvious solution - use a join.
To me that sounds like fixing a bug in the wrong place. RDBMS is *not* supposed to be dumb. If it were, we would not be using SQL in the first place; we'd be passing our own execution plans to the DB engine in its internal intermediate language instead. (Well, you can sort of do that with Firebird (BLR) and SQLite (VDBE), but that's not how they're meant to be used.)
Re:Considering sub queries in IN statements. (Score:2)
RDBMS is *not* supposed to be dumb.
It is supposed to do what it was asked - if someone asks it to do something dumb, it has no choice.
Re:Considering sub queries in IN statements. (Score:3)
Re:Considering sub queries in IN statements. (Score:2)
Re:Considering sub queries in IN statements. (Score:2)
Re:Considering sub queries in IN statements. (Score:2)
Re:Considering sub queries in IN statements. (Score:2)
Re:Considering sub queries in IN statements. (Score:2)
Re:Considering sub queries in IN statements. (Score:5, Informative)
Anyway -- prior to NDBCLUSTER 7.2, if you performed any join whatsoever, the API node had to pull the complete tables from all data node groups before doing the join itself and returning the result. This made join performance, simply put -- terrible. I've tested the same query on a standalone out-of-the-box MySQL server and against a 7.1 cluster: an 8-second standalone query took several minutes to come back from the cluster due to the join performance.
NDBCLUSTER 7.2 adds what was called "push down joins" [blogspot.com] in development -- basically the data nodes now do the joins within their own sub-sets of the data for certain joins resulting in a dramatic improvement in performance, since now the API nodes just get the result from the network instead of the entire dataset.
It really is an amazing improvement and is a result of the dynamic capabilities of MySQL. NDBCLUSTER was never designed for the type of data people are throwing at it, and with the recent improvements it might actually be viable for use on production web sites for more common usage scenarios.
What I do not see addressed yet with 7.2 is the reload time -- if you have a cluster loaded with several GB of data it can take upwards of 20 minutes to reload the data and indices across the cluster. While the cluster is designed to be up 24x7 (even through upgrades), a single bug or memory error that takes it down can result in a prolonged outage. There are several of these [mysql.com] open in the bug tracker.
Re:Considering sub queries in IN statements. (Score:4, Informative)
If you create a query in MySQL with an IN clause in the WHERE condition and put a subquery inside the IN, current versions will run that subquery once for each row of the primary table you are querying. Caching the result alone would probably account for the 70x speedup
We have exactly that in MariaDB 5.3 (a fork of MySQL): http://kb.askmonty.org/en/subquery-cache. It won't always give the 70x speedup; sometimes you need to start execution from the subquery to get decent performance: http://kb.askmonty.org/en/semi-join-subquery-optimizations.
I am sorry for shameless self-promotion, couldn't resist :-)
Re:Considering sub queries in IN statements. (Score:4, Informative)
Re:Considering sub queries in IN statements. (Score:2)
Just FYI, it's called a correlated subquery.
If the thing he's talking about had been a correlated subquery, he wouldn't have been able to cache its results. Ergo, he must be talking about a lousy execution of a non-correlated subquery.
Re:Considering sub queries in IN statements. (Score:3)
Professional DBAs know this, and know how to code correctly to avoid the problem in the first place. Having it "work" vs. having it work reliably, quickly, and in a manner that scales is what separates any computer user from a professional. Training, ambition, curiosity, and experience are what transform you into that professional.
So while it may be nice to have the machine try to outsmart you while you work in a 4th generation language, learning how and why it works can be illuminating in other projects and other methods. You are able to do more because you know more.
Re:Considering sub queries in IN statements. (Score:2)
While I agree it's a professional's job to know this stuff, I would also like to point out it's not a crime to build your RDBMS in such a way that queries run as fast as possible regardless of how the query was written. Indeed, anything less is an inferior product.
Re:Considering sub queries in IN statements. (Score:2)
Here's an actual example:
http://code.google.com/p/django-tagging/issues/detail?id=160 [google.com]
Basically, the subquery was re-executed for every row, leading to an amplification of biblical proportions. The solution there was to put that subquery in another subquery (yo dawg), as shown in http://www.xaprb.com/blog/2006/04/30/how-to-optimize-subqueries-and-joins-in-mysql/ [xaprb.com]
We also tried with a temp table, but this solution was cleaner :) The performance difference was around 1/100th of the time for a sample query.
We also confirmed that the problem still existed in MySQL 5.5.
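The xaprb trick mentioned above -- wrapping the subquery in a derived table so it is materialized once -- can be sketched like this (sqlite3 used as a stand-in; table names are invented, and the speedup itself only shows on an engine that re-runs plain IN subqueries per row):

```python
import sqlite3

# Hypothetical tagging table, loosely modeled on django-tagging.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tagged_items (item_id INTEGER, tag_id INTEGER);
    INSERT INTO tagged_items VALUES (1, 7), (2, 7), (3, 8), (1, 8);
""")

# Plain subquery -- the shape old MySQL re-evaluated per outer row.
plain = conn.execute(
    "SELECT DISTINCT item_id FROM tagged_items"
    " WHERE item_id IN (SELECT item_id FROM tagged_items WHERE tag_id = 7)"
    " ORDER BY item_id"
).fetchall()

# Same subquery wrapped in a derived table, which forces the inner
# result to be materialized once before the outer query runs.
wrapped = conn.execute(
    "SELECT DISTINCT item_id FROM tagged_items"
    " WHERE item_id IN (SELECT item_id FROM"
    "   (SELECT item_id FROM tagged_items WHERE tag_id = 7) AS t)"
    " ORDER BY item_id"
).fetchall()

print(plain == wrapped)  # same rows; only the execution strategy differs
```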
Re:Considering sub queries in IN statements. (Score:3, Interesting)
The latest (past year's worth) of PostgreSQL releases automatically rewrite subqueries as joins for you.
However, MSSQL 2008 performance dropped significantly from prior versions due to the exact opposite --- subqueries are now looped through, where previously they were handled with hash tables internally.
Re:Considering sub queries in IN statements. (Score:3)
SELECT * FROM STATEMENTS WHERE CLIENT IN (SELECT ID FROM CLIENTS WHERE STATUS=1);
SELECT * FROM STATEMENTS JOIN CLIENTS ON STATEMENTS.CLIENT=CLIENTS.ID WHERE CLIENTS.STATUS=1;
The statements table has about 2.5 million rows and clients has about 30k rows. The match (status=1) results in roughly 10k rows.
Now the interesting part: in 8.2, using JOIN is quite a bit faster, but in 8.4, using IN is actually faster than using JOIN (at least for these queries), because it uses a hash table and a hash join instead of a full join.
Re:Considering sub queries in IN statements. (Score:2)
You are doing a SELECT *, so when you use the JOIN you are getting more columns back than when you are using IN. Rewrite it to specify which columns you want back, and I suspect they will perform equally.
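The column difference being pointed out is easy to see on toy data (sqlite3 stand-in, schemas reduced to two columns each): with SELECT *, the join form drags in the joined table's columns as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE statements (id INTEGER, client INTEGER);
    CREATE TABLE clients (id INTEGER, status INTEGER);
    INSERT INTO clients VALUES (1, 1), (2, 0);
    INSERT INTO statements VALUES (100, 1), (101, 2);
""")

# IN form: only the statements columns come back.
in_row = conn.execute(
    "SELECT * FROM statements"
    " WHERE client IN (SELECT id FROM clients WHERE status = 1)"
).fetchone()

# JOIN form with SELECT *: the clients columns ride along too.
join_row = conn.execute(
    "SELECT * FROM statements"
    " JOIN clients ON statements.client = clients.id"
    " WHERE clients.status = 1"
).fetchone()

print(len(in_row), len(join_row))  # the join row is wider
```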
Re:Considering sub queries in IN statements. (Score:2)
Re:Considering sub queries in IN statements. (Score:3, Informative)
SELECT id, title FROM page WHERE id IN (SELECT pageid FROM hotnews WHERE user = 8)
The answer to this is to execute the subquery separately and pass the result back as an array. If you have 10,000 records in the page table, you will see about a 2000x speed improvement if you are using PHP
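That two-query approach might look like the following sketch, in Python rather than PHP (table names taken from the parent's example; placeholder binding is added here so the id list stays properly escaped):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE hotnews (pageid INTEGER, user INTEGER);
    INSERT INTO page VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO hotnews VALUES (1, 8), (3, 8), (2, 5);
""")

# Step 1: run the inner query on its own and collect the ids.
ids = [r[0] for r in conn.execute("SELECT pageid FROM hotnews WHERE user = 8")]

# Step 2: feed the ids back as an IN list, using placeholders so the
# values are still escaped by the driver.
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT id, title FROM page WHERE id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

print(rows)
```

As later replies note, this only wins against an engine that mishandles the subquery; it also has concurrency caveats, since the data can change between the two queries.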
Re:Considering sub queries in IN statements. (Score:3)
Re:Considering sub queries in IN statements. (Score:2)
Where do you see the client and the round trips? There would be only one (the DB is accessed once), and this method would be 100% transparent to the client.
Re:Considering sub queries in IN statements. (Score:2)
Re:Considering sub queries in IN statements. (Score:2)
Yes, he could have phrased it better, but PHP is processed server-side; the client only gets the final result. So no more round trips except the ones between the web server and the database (and in most cases they are on the same machine or cluster, so it's negligible).
His method does speed things up if the query is executed with an IN and not with a JOIN... even if it's not a best practice (lots of concurrency issues if you do things like that frequently).
Re:Considering sub queries in IN statements. (Score:2)
"Yes, he could have phrased it better, but php is processed server side, the client only gets the final result."
Uhm, no. For the DB server, the PHP process is the client. As it is good practice to have the two machines separate (I really don't know where you got the idea of running these two things on one machine), you do get round trips over the wire. These will definitely cost you more processing time than keeping everything inside the DB server's memory space the whole time: you have to convert the inner query's results to the wire protocol, send them through the TCP/IP stack, receive them in PHP's DB driver, convert them into an array, reprocess them with your PHP script into another textual query, send that query to the DB server again, have it compiled and optimized (as a new query with different constant data, it can't even be precompiled and cached), and have it executed again, with the concomitant query execution overhead.

No, I really don't think this can actually speed up anything. Having the server do a hash join on the inner query automatically is far faster, and any good database can do that. BTW, it will be faster even if the two processes do run on a single machine (something for which any system designer should be hanged by his balls, unless he's the sole (home) user of such a system).
Re:Considering sub queries in IN statements. (Score:2)
And, as stated, it still runs faster than the IN statement (not the JOIN, as everyone already said). The server is the client... never heard that one before; I guess the OP is not the only one who needs to express himself better. Where I come from, I've never heard the server called the client before.
Second, for small businesses and the like, yes, the web server and the DB server share resources. It's not only common, it's actually the default configuration when you set up Apache and PHP (so I'm guessing it's not uncommon).
And there is no extra round trip for anything anyway. You do a query, then you filter the results server-side, and that's it. The time you described for passing everything through the process is negligible compared to the time (sometimes minutes) that the server takes to process the IN, so you're still wrong.
The only thing you got right is that the join method will be faster. But that's what everyone has been telling everywhere from the start.
Re:Considering sub queries in IN statements. (Score:2)
And, as stated, it still runs faster than the IN statement (not the JOIN, as everyone already said).
If you have that as a practical experience, then there must be something horribly broken in your DB server software.
the time (sometimes minutes) that the server takes to process the IN, so you're still wrong.
Ahhh, that's the "horribly broken thing" I'm talking about. I'm not wrong, since there is no practical reason for that to take minutes, unless, of course, the result of the subquery (SELECT pageid FROM hotnews WHERE user = 8) is really huge. Do you really have authors who have written millions of stories?
Re:Considering sub queries in IN statements. (Score:2, Informative)
SELECT id, title FROM page JOIN hotnews ON page.id = hotnews.pageid AND user = 8
There. Enjoy your performance boost.
Re:Considering sub queries in IN statements. (Score:2)
Those queries (Score:2)
are exactly equivalent. If your RDBMS is so crappy that it can't see that, then that's not the user's problem. SQL was supposed to be an abstraction, right?
Re:Considering sub queries in IN statements. (Score:3)
Re:Considering sub queries in IN statements. (Score:2, Insightful)
"some" people are perfectly aware of JOIN and have suffered one too many hairball, incomprehensible SQL statements with 10+ tables mashed together under FROM using JOIN. Sub-queries provide opportunities to factor out some of these into discrete parts that can be individually examined and tested.
Please, stop discouraging the use and application of sub-queries. If you have encountered people that insist on using them despite MySQL's tragically bad optimizer, it is likely they have been exposed to powerful databases that haven't been copping out on basic RDBMS functionality for over a decade. Instead, harangue the MySQL developers; they are the people at fault.
Re:Considering sub queries in IN statements. (Score:2)
Which would be fine if it wasn't for the fact that the version with the JOIN isn't just faster, but also simpler to understand.
Seriously, there is no excuse for bad SQL. Subqueries have their place, but using them like that is just bad programming.
Re:Considering sub queries in IN statements. (Score:3)
Re:Considering sub queries in IN statements. (Score:2)
select id, title
from page as a
where exists (
    select 1
    from hotnews b
    where b.user = 8
    and b.pageid = a.id
)
I would rearrange the fields in the WHERE clause to match the index, highest selectivity first. If there were no suitable index, I'd consider creating one, possibly a covering index.
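A minimal sketch of that EXISTS form with a supporting covering index (sqlite3 as a stand-in; the index definition and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE hotnews (pageid INTEGER, user INTEGER);
    -- Covering index: the EXISTS probe can be answered from the
    -- index alone, without touching the hotnews table rows.
    CREATE INDEX hotnews_user_page ON hotnews (user, pageid);
    INSERT INTO page VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO hotnews VALUES (1, 8), (3, 8), (2, 5);
""")

rows = conn.execute("""
    SELECT a.id, a.title FROM page AS a
    WHERE EXISTS (SELECT 1 FROM hotnews b
                  WHERE b.user = 8 AND b.pageid = a.id)
    ORDER BY a.id
""").fetchall()

print(rows)
```

Putting the equality column (user) first in the index matches the "highest selectivity first" advice above, since the probe always pins user to a constant.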
Re:Considering sub queries in IN statements. (Score:2)
The version of mysql in debian no longer silently truncates strings by default.
I found out because I was abusing the "feature" to avoid worrying about sanitizing the length of input data; my import script suddenly stopped working last year and I had to do things a little less wrong.
Re:Considering sub queries in IN statements. (Score:2)
MySQL choked on an "...AND (condition OR condition)", doing a seq scan instead of filtering even though there was an index on the table those conditions referenced. I changed it to a "WHERE IN (...)" and got the same result. I had to move the OR conditional into a derived table joined in.
For the same type of query, Informix does a dynamic hash join with no SQL thuggery necessary.
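The derived-table rewrite described above might look like this sketch (sqlite3 stand-in; table, column, and value names are invented): the OR over a column is turned into a join against a small derived table of the allowed values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER, kind TEXT, ref INTEGER);
    INSERT INTO events VALUES (1, 'a', 10), (2, 'b', 20), (3, 'c', 10);
""")

# Original shape: ... AND (condition OR condition).
direct = conn.execute(
    "SELECT id FROM events WHERE ref IN (10, 20)"
    " AND (kind = 'a' OR kind = 'b') ORDER BY id"
).fetchall()

# Rewrite: pull the OR into a derived table and join against it,
# giving the optimizer an index-friendly access path.
rewritten = conn.execute(
    "SELECT e.id FROM events e"
    " JOIN (SELECT 'a' AS kind UNION ALL SELECT 'b') k ON e.kind = k.kind"
    " WHERE e.ref IN (10, 20) ORDER BY e.id"
).fetchall()

print(direct == rewritten)  # same rows from both shapes
```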
Finally! (Score:3)
Re:Finally! (Score:4, Interesting)
No, rather it's amazing what happens when you architect the DB such that the entire thing is forcibly held in memory.
Re:Finally! (Score:2)
MySQL Cluster (Score:2)
Re:MySQL Cluster (Score:2)
Using memory for DB cluster nodes isn't new. Teradata has been doing it for a long time in their data warehousing. It has its advantages on large databases, if you have the money to afford the nodes you need for reliability.
Re:MySQL Cluster (Score:2)
... Unless you're running WAMP or something.
I hate people like you. [urbandictionary.com]
dict wamp
From V.E.R.A. -- Virtual Entity of Relevant Acronyms (June 2006) [vera]:
WAMP
Windows, Apache, MySQL, PHP (Windows, Apache, PHP, DB, SQL)
Damnit, define your !@#$ing terms, and what a stupid acronym that is! The OS should be irrelevant.
I really, really hate this century. :-P
[Okay, I feel better now. Apologies for the outburst, but please do feel free to suck a rock and die.]
MySQL Cluster != MySQL (Score:5, Informative)
Not the same thing.
Re:MySQL Cluster != MySQL (Score:2)
MySQL Cluster used to be a pay-only product. Is it free now?
(The setup with one MySQL master and N read-only slaves has been free for years. That powers Wikipedia, for example. It's not MySQL Cluster. A MySQL Cluster has multiple read-write machines.)
But of course it reads from RAM (Score:4, Interesting)
Re:But of course it reads from RAM (Score:3)
Re:But of course it reads from RAM (Score:2)
How good are the mechanisms available for minimizing (ideally automatically, but at least without too much black magic on the application programmers' part) the frequency with which your setup ends up needing something that isn't in RAM and being bottlenecked by having to fetch it from disk?
Are we talking about a web back end db, or just a generic db? I ask as the page I'm replying to shows (according to AdBlockPlus) 47 separate entities running in the background. With all that !@#$ going on, what does DB fetch performance even mean?
I think network performance is far and away the most determinant factor in this stuff. Optimize your queries to hell and back, but you'll still have to wait for the network. Zzzzzz ...
"The DB's too slow!"
Not necessarily. Why don't people understand Performance Analysis these days? Complex systems are complex. There's lots of variables involved.
Re:But of course it reads from RAM (Score:3)
Also, who the hell uses ssd's when performance is an issue?
The fast kids use PCI Express cards loaded with flash.
Texas Memory systems or Fusion IO. Expensive, yes, but this is Mysql CLUSTER we are talking about. You don't use that unless you need the speed.
Re:But of course it reads from RAM (Score:2)
Actually, a ton of research was done on these types of architectures by Jim Gray (who tragically was lost at sea a few years ago) at Microsoft. Not sure how many of these links are still active but: http://research.microsoft.com/en-us/um/people/gray/ [microsoft.com]
From my own experience, at a job a few years ago, we were dealing with a little over 1 petabyte of data, and the system was engineered to NEVER hit platter, and to always have data in ram.
Re:But of course it reads from RAM (Score:2)
That would make it a RAED array.
That's great... (Score:2)
Re:That's great... (Score:2)
If you can't be bothered, you can ignore a lot of the finer details in Oracle just as much as you would be prone to with any other RDBMS.
Re:That's great... (Score:2)
Re:That's great... (Score:2)
Now can they please work on some dramatic usability improvements so i don't have to cringe every time an Oracle support question comes up at work?
You may wish to question whether you're in the wrong line of work. RDBs aren't really all that complex. Stuff it in, then drag it out and present it in a pretty display format. What else is there to know?
People are way too hung up on complexity. Life doesn't have to be that hard. You're making it harder than it has to be. Simplify.
Re:That's great... (Score:2)
_Oracle_ is making it harder than it has to be. If i had the choice i'd simplify just by dropping support for Oracle and only work with MS SQL, but that's not my choice to make.
Re:That's great... (Score:2)
It's not RDBs that are the problem, it's Oracle specifically.
Ah. On that, we agree. Larry's one sharp snakeoil salesman.
Yes, I RTFA (sue me) (Score:4, Informative)
If I read the sales pitch correctly, they just integrated Memcached as a backend storage module, so that it plays nicely wrt ACID compliance. Yeah, memory is 70x faster than disk I/O... big whoop!
Anyone running a sizeable MySQL installation already has heaps of RAM allocated to the InnoDB buffers/caches anyway. It sounds like Oracle compared a stock, distro-default MySQL to their memory-hungry tweaks. Yeah, DUH. I can get a 70x speedup too if I increase MySQL's memory usage from the default 64 MB to 48 GB.
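The kind of tuning being described is essentially a one-line buffer pool change; this my.cnf fragment is purely illustrative (the values are assumptions and must be sized to the actual machine and working set):

```ini
# Illustrative my.cnf fragment -- size to your own RAM and workload
[mysqld]
innodb_buffer_pool_size = 48G   # keep the hot data set in memory
```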
Re:Yes, I RTFA (sue me) (Score:3)
Previously each data node exploited about 6 CPUs with 8 threads; in this benchmark we used about 14 CPUs with 24 threads. We've also removed a number of bottlenecks, and the threads are configured in such a manner as to handle many different loads well.
The little thing you read about memcached was about adding a native API, which probably helped, but the multithreading upgrade probably had more to do with it. They were also comparing two benchmarks that both used just memory, not a case where one used disk and the other used RAM.
Re:Yes, I RTFA (sue me) (Score:2)
In this case, the actual boost is from AQL, which allows for a query to be split up by JOINs and sent out to different servers to run against their local tables in parallel.
Re:Yes, I RTFA (sue me) (Score:2)
Re:Yes, I RTFA (sue me) (Score:2)
DUH. I can get a 70x speedup too if I increase MySQL's memory usage from the default 64 MB to 48 GB.
Just make sure you actually test it. Some of the performance recommendations in the manual don't scale from 64 MB to 48 GB. MyISAM Key Buffer and Query Cache, I'm looking at you.
Re:Yes, I RTFA (sue me) (Score:2)
I, er... I think I shorthand a lot of this stuff, because it seems trivial to me as a programmer. Anything that works like a hashtable is going to experience diminishing returns, because you have to scan through that ever-growing table. O(n^2) and stuff.
On the other hand, anything that spares you from hitting a disk, that's a guaranteed win. What I commonly do with my DB servers, since my data sets are small but complex, is to give them enough memory to cache the entire database into RAM. The disk effectively becomes a backup copy, in case I need to reboot the box. With RAM being so cheap these days, it's usually more cost effective than deploying more cluster nodes. Memcached is great if all you want is a key-value bucket, but for speeding up existing applications that (ab)use SQL, RAM is the undisputed king of performance.
Comment removed (Score:3)
MySQL in-memory speed (Score:5, Interesting)
Re:MySQL in-memory speed (Score:2)
Re:MySQL in-memory speed (Score:2)
it usually takes SQL Server two or three seconds to respond
Then you're incompetent. Stack Overflow uses SQL Server 2008 and serves 12-14 million pages a day. [brentozar.com]
Re:MySQL in-memory speed (Score:2)
That's 162 *pages* per second, not SQL queries. Thing is, how many of those are cached and simply re-served? How big are the servers that run these? How many servers do they use? How much Lucene do they use instead of SQL Server??? (You didn't watch the video, did you? 23 minutes in.)
Anyway, your argument is like saying Facebook takes 2 seconds to return my wall of data whereas MyPersonalWebsite with a PHP script I wrote returns instantly, therefore Facebook are useless losers.
There might well be a 2-second delay if the server is repeating many different SQL queries (I used to see an Oracle DB that spent most of its time waiting on the query cache) or if it's hitting the DB with queries that lock a lot of data pages (I've seen an app that happily took a lock on nearly all the data in various tables, escalating to a full table lock). So yes, they're incompetent, but sometimes queries that work well on one DB don't work well on others; in particular, queries that happily ran on Oracle killed SQL Server, simply because of the locking designs these two DBs implemented (OK, SQL Server now has row-level locking, but there could be other such differences still in there causing problems).
Re:MySQL in-memory speed (Score:2)
Re:MySQL in-memory speed (Score:2)
Re:MySQL in-memory speed (Score:2)
I've programmed PHP to use SQL Server and MySQL, the MySQL statements typically run in milliseconds, it usually takes SQL Server two or three seconds to respond.
Can you give an example?
We're evaluating Postgres vs MS SQL at work, and at the moment haven't found any problem big enough to rule either out.
Re:MySQL in-memory speed (Score:2)
problems other than cost you mean.
Re:MySQL in-memory speed (Score:2)
problems other than cost you mean.
We already have an MS SQL database for the HR, finance (etc) systems, so using that might be as cheap (or cheaper) than a new Postgres database, given existing staff (DBA) knowledge and some odd way they fudge the numbers. (Recurring costs for MS software are apparently negotiated and paid for centrally, rather than charged to the department using the software. Daft.)
Essentially, arguing on grounds of initial/ongoing cost is difficult. My best cost argument is that the no-cost license for Postgres means we could change our architecture, have ten times as many servers, have virtual servers, have lots of development servers -- anything -- without having to pay. I'm also concerned that although Microsoft sees us as "academic" at the moment, that might change -- we aren't a university, and the government is telling us we should be trying to make money with our skills/data where we can.
So, a good, technical reason is much better. Various limitations have ruled out MySQL (e.g. problems with multiple engines -- you may have to choose between full-text search and transactions, for example). Postgres has the edge over MS SQL at the moment -- partly flexibility, partly the GIS extensions, I'll elaborate if anyone's interested. But SQL Server at least claims to do these things too.
Re:MySQL in-memory speed (Score:2)
Just because your script didn't run well doesn't mean that SQL Server was slower. The PHP database modules probably had a lot to do with it -- I'd guess much more attention is given to MySQL integration than to SQL Server -- or some misconfiguration (always 2-3 seconds? Maybe it's a DNS issue; were you using hostnames or just IPs?).
The 70x is for (distributed) joins (Score:5, Informative)
which previously were quite poorly handled.
See http://www.clusterdb.com/mysql-cluster/70x-faster-joins-with-aql-in-mysql-cluster-7-2/?utm_source=rss&utm_medium=rss&utm_campaign=70x-faster-joins-with-aql-in-mysql-cluster-7-2
70x (Score:5, Funny)
Marketing: How much in the best conceivable case?
Developers: Oh, I dunno, maybe 70x.
Marketing: 70x? Is that good?
Developers: Yeah, I suppose, but the cool stuff is...
Marketing: Wow! 70x! That's a really big number!
Developers: Actually, please don't quote me on that. They'll make fun of me on Slashdot if you do. Promise me.
Marketing: We promise.
Developers: Thanks. Now, let me show you where the good stuff is...
Marketing (on phone): Larry? It's me. How big can you print me up a poster that says "70x"?
Re:70x (Score:2)
Re:70x (Score:2)
ftfy
Re:70x (Score:2)
TPC-H (Score:2)
Re:PostgreSQL (Score:5, Funny)
Is PostgreSQL webscale? MongoDB is.
Re:PostgreSQL (Score:5, Funny)
Any questions? [youtube.com]
Re:PostgreSQL (Score:3)
The first 27 seconds made a good point about comparing various products and evaluating them on their merits. The remaining 5 minutes was a mix of strawmen and fallacies. Is there supposed to be a point to that?
Re:PostgreSQL (Score:3)
http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale [xtranormal.com]
Re:PostgreSQL (Score:2)
Yes: its strengths are reliability and price point (free), but it's pretty fast, has clustering capabilities, and has been used for large-scale web applications.
Seriously, it's hard to go wrong with PostgreSQL when you need a relational database. MongoDB, of course, is a very different animal intended for very different tasks.
Re:PostgreSQL (Score:3)
Versus all the other factors you can throw in there for anything involving heavy lifting for an enterprise app, raw price point of the DB engine is pretty close to the bottom of the list.
Re:PostgreSQL (Score:2)
Are you kidding? In an Oracle discussion?
We're talking very large numbers here, and potentially more money than you will make in your entire lifetime.
Someone definitely cares. They may bite their tongue and still buy Oracle, but they do care. The numbers are not trivial.
Re:PostgreSQL (Score:2)
Kubuntu did a nasty on me with a surprise, non-optional upgrade to KMail 4.7.2, which is just an unfinished work, to put it mildly. As a confirmed glutton for punishment, I decided to make it work no matter the cost. Anyway, it came with a MySQL backend that was sort of working, but one day it just fell down and couldn't get up. So I replaced it with the experimental, unsupported PostgreSQL backend, and after a little pain learning how to administer the server, it came up and so far has worked considerably better than the MySQL backend.

I had to tell it to use less mmapped memory, which is something no user should ever be expected to deal with, and it is beyond me why PostgreSQL couldn't figure that one out itself. Well, I suppose if I care enough I will send a patch, but the point is: if that is the biggest issue I had with it, that is damned sweet. So color me impressed with PostgreSQL, and actually, I will stand up and thank the nutsoid KDE dev who pulled this stunt, because I finally have an excuse to get to know the free database I should have used a lot more all these years.

And by the way, I am kind of curious whether having a full-blown relational database sitting under my email client will actually result in smarter email handling some day. So far it's not visibly smarter, just slower; it fails to filter my spam and sometimes has races and needs to be told how to resolve them. I'll give it a while and see if some actual power shows up.
Re:PostgreSQL (Score:2)
At least when PostgreSQL starts working it stays working; I agree the same can't be said for MySQL. A recent upgrade changed the layout of mysql.user, causing none of the users to be able to log in. The Oracle-approved fix? mysqldump and restore, but that just restored the bad layout of mysql.user. The actual fix involved dumping everything one database at a time and recreating mysql.user from a script. My best guess is that whoever designed the MySQL login system ignored SQL best practices and did the C equivalent of a "SELECT * FROM mysql.user" rather than explicitly asking for the fields in the needed order.
Re:PostgreSQL (Score:2)
If you start to manage lots of data (and I think some people fall into this category for their email), it makes sense to use a DB.
It makes sense on paper. It has yet to be proved that it makes sense for email, and in fact, so far KMail 7.2+ is pretty good evidence to the contrary. I am afraid you will have a difficult time finding anyone who has stayed with KMail through its recent convulsions and has in fact forgotten about the performance issue. Just do a quick search on "kmail 7.3 slow" to convince yourself.
Comment removed (Score:2)
Re:33,500 rows? (Score:2)
OK. This is slightly laughable. 33,500 rows? In 87 seconds? That seems glacial. And 1.23 seconds being the new speed? That seems as expected.
Aside from comparing the speed of a Lada vs. a common garden slug, how does this compare against other databases?
It was actually only 2000 rows returned -- the source tables had 35K rows. But in fairness, this was for a join across 11 tables, and it's easy for a database to perform slowly at this task if it doesn't have a good query optimizer. A badly optimized query could end up doing thousands (or even millions) of full table scans of multi-thousand row tables.
Re:33,500 rows? (Score:2)
The only heavy MySQL application I use is Magento (and it relies heavily on JOINs), and a small server can blow these results out of the water, on similar-sized tables, and with multiple concurrent queries (but only using 3 to 5 JOINs per query).
Re:33,500 rows? 70x (Score:2)
The thing is that prior to this release, applications using SQL joins (involving more than 2 tables) were ruled OUT:
Now a lot of more applications *can* use MySQL Cluster.
Yeah, that's a less flattering remark than a 70x increase in speed, but probably more useful :D. I'm not familiar with MySQL Cluster, and I'm surprised it is _that_ different from the server version.
Re:MariaDB? (Score:2)
Is there a Virtualmin module for MariaDB yet?
If so, I'll switch tonight.
Re:MariaDB? (Score:2)
Tried Drizzle, yet ?