PostgreSQL vs. MySQL comparison
Posted by
Hemos
on Mon Dec 18, 2006 08:25 AM
from the get-the-morning-starter dept.
prostoalex writes "Ever find yourself wondering which open source database is the best tool for the job? Well, wonder no more, and let your tax dollars do the work in the form of Fermi National Accelerator Laboratory publishing this unbiased review of MySQL vs. PostgreSQL. After reading it, however, it seems that MySQL ranks the same or better on most of the accounts." My poor sleepy eyes misread the date of posting on here; caveat that this is more than 15 months old.
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
No Digg (Score:5, Informative)
2. This article is 2 years old. Everything in its comparisons is out of date.
Re:No Digg (Score:5, Informative)
We run Postgres for our main business application and the main limitations are of two forms:
1) Depth of community
The Postgres community is great - very responsive and knowledgeable, but its size is a limitation in a number of ways. The ODBC driver is a bit of a stepchild to the main project, and some key functions like dblink that address missing features like cross-database selects are relegated to
For the same reason a key subset of its documentation is very sparse. Documentation for the core system is thorough, clear and concise, but anything in contrib or any projects like the ODBC or
2) Postgres is very close to being a true enterprise contender (unlike MySQL, which is evolving that direction but distinctly further off), but lacks some key features like XML handling, a more comprehensible approach to result sets (anyone who's dealt with rowtypes and casting resultsets can attest to the steep learning curve), and a userbase that has put the product through the wringer. Now that some corporate heads are getting interested (e.g. Sun, Red Hat, EnterpriseDB) hopefully some of these shortcomings will be addressed in short order.
Don't let this outdated, apples to oranges comparison fool you: Postgres is a very solid and usable database.
Old news (Score:5, Informative)
"Last modified: February 15, 2005."
Re:Old news (Score:5, Funny)
Re:Old news (Score:4, Funny)
Old and wrong (Score:5, Informative)
Re: (Score:3, Funny)
You can't say it is "old" and "wrong" when it is wrong because it is old.
Not similar to my experience (Score:5, Informative)
One comment spammer can completely annihilate it.
One developer I talked to once did some testing. With a single connection, MySQL was way faster. By five or so simultaneous connections, they were close. At ten, PostgreSQL was definitely winning. At a hundred, he was simply unable to get a single MySQL server to complete the test successfully, let alone do it quickly.
The impression I get is that PostgreSQL uses more robust algorithms, with higher constant costs and lower quadratic costs. In any event, never had any problems.
As noted elsewhere, these comparisons are quite old...
But in any event, in my own experience, mysql is a lot easier to blow up by overloading than postgres is, at least if you have a lot of writes going on. For pure-lookup functions, it might do better -- but a lot of modern database apps are pretty compulsive about saving at least something every time someone touches them. (For instance, modern vBulletin saves last visits, threads seen, and so on; all of that adds up to a huge load on the database server.)
Re: (Score:3, Informative)
Obviously you need to tune your environment (there are a plethora of options including table types which can impact things a LOT) to
Re: (Score:3, Insightful)
It seems to me that if you step back from the details, there is a fundamental difference in style between the two systems that could be summarized thus:
Postgres: emphasizes completeness, correctness, and conformance.
MySQL: emphasizes immediate practicality.
One style is not intrinsically better than the other. Given time their results may begin to converge, which I think is starting to happen. However, I am not surprised that many peop
MySQL is ridiculously easy to configure (Score:3, Insightful)
Re:MySQL is ridiculously easy to configure (Score:5, Insightful)
Re: (Score:3, Insightful)
Re:MySQL is ridiculously easy to configure (Score:5, Funny)
Short answer: No.
Longer answer: None at all.
Re: (Score:3, Insightful)
The problem is that you apparently missed his point entirely.
There are several problems (Score:5, Informative)
1 -- This article is years old.
2 -- This article is posted solely to stir up (repetitive) discussion.
3 -- This article pretends that MySQL is a real database, even though in order to do so it has to make gigantic leaps like considering data integrity to be not really all that important in a database.
4 -- This article trolled me.
I'd rather (Score:3, Insightful)
Outdated and Silly (Score:3, Informative)
Doesn't mean a thing and this is why ...... (Score:3, Insightful)
Crap! (Score:4, Informative)
I call pure, unadulterated crap on this one.
One of the major new features in Postgresql 8 was native Windows support. It runs just fine as a service.
This comparison is either very old news, incompetence in action, or, um, strongly biased.
Why we moved from MySQL to PG (Score:5, Interesting)
1) Postgresql is more full featured than MySQL
2) MySQL is faster in a read-mostly environment
That's pretty much the same as the anecdotal arguments have been for years.
In my job, we moved from mysql to postgres several years ago (around PG 7.0). At the time, we needed to make the move for performance reasons. We are in a read-write system, and MySQL's locking was killing us (this was before InnoDB was well established). The features are better too, as our developers were used to having data integrity features, server side programming, and all of the SQL92 constructs available. We also learned a bit about PG performance, which I'll share.
1) Run EXPLAIN ANALYZE on everything. Postgresql is touchier about query performance than MySQL was. This just needs to be a habit if you're using PG. (You really should do performance analysis no matter your DB. It's just a good practice). The biggest gain will be making sure you're using index scans rather than sequential scans.
2) Use persistent connections. Everyone likes to point out PG's forking model vs. MySQL's threaded one. PG's connection handling is slow, there's no doubt about it. But there's an easy answer. Just limit how often you connect. If you can keep a connection pool, and just reuse those connections, you'll save this big hit.
3) Full vacuum and reindex regularly. We've found the docs to be a bit off on this. They indicate that you should run these occasionally. If you're in a read-write system, a full vacuum on a regular basis is very important. It really doesn't take that long if you do it regularly. Also, we've had trouble with indexes getting unbalanced (we see 50-90% tuple turnover daily). This has gotten better, but it doesn't hurt to let your maintenance scripts make things ideal for you. So, we run a full vacuum and reindex of our tables nightly through cron.
4) Get your shared memory right. PG's shared buffers is probably the most important config attribute. It controls how much of your DB is memory resident vs disk resident. Avoiding disk hits is a big deal for any DB, so get this right. If you can fit your whole DB in memory, then do it. If not, make sure your primary tables will fit. The more you use the shared memory, and the less you have to page data in/out, the better your overall performance will be.
Most DB systems seem to be read-mostly, so I can understand the performance comparisons focusing on that. In our read-write system though, the locking was the biggest issue and it tilted the performance comparison toward PG.
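The habit in point 1 (check the plan, add an index, check again) can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in because that runs without a server, and PostgreSQL's EXPLAIN ANALYZE output looks different. The orders table, the idx_orders_customer index and the plan() helper are made-up names, not anything from the poster's system.

```python
import sqlite3

# Stand-in for the workflow in point 1: inspect the plan, add an index,
# inspect again. SQLite is used only because it needs no server; the
# "orders" table and "idx_orders_customer" index are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 42"


def plan(connection, sql):
    """Return the textual plan steps SQLite reports for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable description.
    return [row[-1] for row in rows]


plan_before = plan(conn, QUERY)  # no usable index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, QUERY)   # the planner can now search via the index

print(plan_before)
print(plan_after)
```

On a real PostgreSQL system the equivalent check would be running EXPLAIN ANALYZE on the query and looking for a sequential scan where an index scan was expected.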
Re:Why we moved from MySQL to PG (Score:5, Informative)
> of our tables nightly through cron.
I've found that just enabling autovacuum seems to keep things in order. And you can tweak it for individual tables [postgresql.org] if you're so inclined.
Of course, MySQL is effectively two products... (Score:3, Insightful)
MySQL/MyISAM is the one with the massive legacy code base, the one that your open-source blogging software uses and probably the one that your web host supports. It beautifully hits the "sweet spot" for data-driven web sites with infrequent and simple updates, where trading integrity for "read only" performance is sensible. It does not even purport to compete with PostgreSQL on features - but it does offer fulltext searches, again
MySQL/InnoDB is the one that offers transactions, foreign keys etc. (ISTR it doesn't do fulltext indexes, though) - this is the "version" that bears comparison with PostgreSQL. I wonder how its user base compares?
(OK - you can mix InnoDB and MyISAM tables in a single database, but you can't use InnoDB if your web host hasn't installed it - heck, one provider I use is still on MySQL V3.23)
Flamewars have tended to pit PostgreSQL against a mythical database with the performance of MyISAM and the features of InnoDB...
As for the GUI software, the MySQL GUI Admin/query browser stuff is shinier than PgAdmin3 - but the MacOS version of the former is a complete crashfest! Neither of them steps up to the plate of providing a FOSS equivalent of (the good bits of) MS Access.
Ok, here is another outdated test (Score:4, Informative)
And this report is at least professional, which cannot be said about the one mentioned in the article.
http://dcdbappl1.cern.ch:8080/dcdb/archive/ttracz
more recent benchmarks (Score:5, Interesting)
They compare PostgreSQL 8.2 vs MySQL 4.1.20 and MySQL 5.1.20a.
Re:more recent benchmarks (Score:4, Insightful)
As the article shows, every time they double the number of cores, Postgres gains 75% in performance - like any good application should do. At 4 cores, it is already twice as fast as MySQL under reasonable concurrency; I'd like to see this test on an 8-core server - my guess is MySQL wouldn't be much faster than it is now and Postgres would perform at least 3 times better than MySQL.
Oh, and Postgres doesn't think 0000-00-00 is a valid date, which is nice too.
Postgres For Larger Datasets (Score:5, Interesting)
MySQL short on features (Score:5, Informative)
Does the Internet's favorite DBMS have an IP address datatype yet?
How about MAC address? CIDR block?
"An IP address is just a 32-bit unsigned int, duh. Any DBMS can store those."
Wrong. A datatype isn't just about storage, but also about operations. In PostgreSQL, when you do a SELECT across a table with IP addresses in it, you get them formatted and displayed as IP addresses, not as opaque ints. Likewise with CIDR blocks, like "192.168.42.0/23". There's also a comparison operator for asking whether an IP address is within a CIDR block.
If you're implementing a network registration system or an incident logging system, how much of your time do you want to waste staring at opaque ints like 3232246272 rather than IP addresses like 192.168.42.0 when you're trying to debug it?
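To make the "storage versus operations" point concrete, here is a small sketch using Python's standard ipaddress module. It is an analogy only, not PostgreSQL's own inet and cidr types: it shows the same opaque integer displayed as an address, plus a containment test against the /23 block mentioned above. The two probe addresses are arbitrary examples.

```python
import ipaddress

# The opaque integer from the comment, and the same value as an address.
raw = 3232246272
addr = ipaddress.ip_address(raw)
print(addr)  # 192.168.42.0

# A CIDR block is a type with its own operations, such as containment.
block = ipaddress.ip_network("192.168.42.0/23")
inside = ipaddress.ip_address("192.168.43.17") in block
outside = ipaddress.ip_address("192.168.44.1") in block
print(inside, outside)  # True False
```

With a plain unsigned int column, both the formatting and the containment test would have to be reimplemented in every application that touches the table.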
MySQL is a bimbo, a fratboy: it's easy, but so shallow! The amount of time you save in one-time setup, you will lose many times over in all the little annoyances and deficiencies of a DBMS that was originally designed by folks who didn't really believe in DBMSes. Over time they've slowly been shamed into including many of the features they used to despise: transactions, relational integrity checks, and so on. But there's still so much missing ... not just essential integrity features, but little fiddly bits like good datatype support, the kinds of things that make your life easier (as a programmer or as a DBA) in the long run.
I love this part... (Score:4, Funny)
MySQL has a much larger user base than PostgreSQL, therefore the code is more tested and has historically been more stable than PostgreSQL and more used in production environments.
"Claiming that your RDBMS is the best in the world because more people use it is like saying McDonalds makes the best food in the world."
Sorry, just an old joke that deserved retreading... ;)
Re:Foreign Keys (Score:5, Insightful)
Re:Foreign Keys (Score:5, Insightful)
Bingo!
It never ceases to amaze me when the MySQL crowd argues that "you don't really need that pesky integrity stuff, it just slows down the database."
Guess what guys; You're dead wrong!
Any DBA worth his salary will enforce data integrity on the lowest possible level, which means constraints (however implemented) on the object level.
Sure, you can let your coders in Bengaluru ensure that the primary key is unique instead of just applying a unique index and the same goes for referential constraints between tables. You can implement them in the application just fine until somebody overlooks some minor detail in the code and you're royally fucked!
Again! Foreign keys or triggers are not "niceties". They are essential in implementing an industrial-strength database; period!
Re: (Score:3, Insightful)
Re: (Score:3, Informative)
Re:Foreign Keys (Score:5, Insightful)
The database's function is to provide a RELIABLE storage for your data. Part of the whole reliability thing is making sure crap can't get in, because once it's there everything goes to heck.
For instance, let's take a shopping cart. Can an order be for a negative quantity? If your app doesn't work that way (it could, using a negative amount for returns for example), and you still allow it in the DB, then all your reporting goes to heck, as SELECT SUM... now returns the wrong thing.
A proper database is set up in such a way that every piece of data in it makes sense. This means for instance not having things like orders hanging around in the void without being linked to some client. This is something easily ensured by foreign keys. Otherwise you have an utter mess - the total of the orders in the database doesn't match the sum of the orders of all clients!
If you put your checks in the database, you have a guarantee that when somebody else codes another frontend to it (say, you had a website and now are making a special version for PDAs), if the application does the wrong thing, the database simply won't let it happen. This may cost a bit of speed, but I assure you that your peace of mind, your sanity and your ASS (if you have a boss and he's got any sense, he's not going to like it at ALL if it turns out that reports don't match reality, and that reality can't even be easily extracted) are far, far more valuable.
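The shopping cart example can be sketched as below, again with Python's built-in sqlite3 standing in for a real server. The clients and orders tables and the try_insert() helper are hypothetical names invented for the illustration. The point is only that a CHECK constraint and a foreign key reject bad rows no matter which frontend sends them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id        INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL REFERENCES clients(id),
        quantity  INTEGER NOT NULL CHECK (quantity > 0)
    );
    INSERT INTO clients (id, name) VALUES (1, 'Example Client');
""")


def try_insert(client_id, quantity):
    """Attempt to store an order; return True if the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (client_id, quantity) VALUES (?, ?)",
                (client_id, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert(1, 3)         # valid order
negative = try_insert(1, -2)  # rejected by the CHECK constraint
orphan = try_insert(99, 1)    # rejected by the foreign key (no client 99)

total = conn.execute("SELECT SUM(quantity) FROM orders").fetchone()[0]
print(ok, negative, orphan, total)  # True False False 3
```

Because the rules live in the schema, SELECT SUM(...) stays trustworthy even if a second frontend forgets to validate its input.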
Re:Foreign Keys (Score:4, Insightful)
None of this applies when somebody logs in with psql/enterprise manager/whatever and updates something in the database by hand. You can have all the OO and libraries you want, but it doesn't help if the new application doesn't use it. Yesterday we had code in VB6, today we have it in C#. Application is completely different. Guaranteeing that all the VB code will be exactly translated to C# is very, very hard.
On the other hand, the database remains being the same, and all the constraints it has don't care about which language, methodology or whatever is being used. VB, C#, Perl, PHP, are all automatically held to the same constraints.
And what's the problem with that? Use stored procedures and triggers then. Seriously, in a database of any size, forget about any attempts at compatibility with other databases. It only works on very, very trivial applications.
Just take postgres and mysql. PostgreSQL loves big transactions. The overhead for a transaction is high, but it's perfectly happy with large, long running transactions, and the more the better. PostgreSQL will be slow if you have a transaction per statement.
On the other hand, databases like MySQL want tiny transactions because the locks are really problematic. Leave a transaction uncommitted, and quickly things will grind to a halt. By contrast, on PostgreSQL the worst problem will be the lack of vacuum, which will gradually slow things down, but doesn't cause immediate problems.
If you make it for MySQL, without a redesign it'll suck on Postgres and vice versa. If you try to make it for both, it'll be suboptimal on both.
Foreign keys are an enterprise feature (Score:5, Interesting)
Where foreign keys and the other referential integrity features really shine is in true enterprise scenarios, when you may have hundreds or thousands of applications, written in multiple languages, working against the same shared database(s).
In that scenario, the only viable way to duplicate the functionality of foreign keys at the application level is to have a middle layer which all other applications are required to go through. Realistically, that middle layer has to be implemented as a server, serving requests for object/record creation, update and delete over the network. Implementing it as a library to be linked into applications doesn't work well, because there are multiple applications accessing the database, and integrity enforcement needs to be centrally coordinated.
Implementing a middleware data server for an application isn't all that difficult, but integrating it into applications can be. Most application development environments know how to talk to databases, but don't automatically know how to talk to your application-specific, language-independent, data server. So now you're writing a client library for each app dev platform used in the enterprise, and dealing with things like integrating your custom interface with data-bound controls in the user interface. BTW, this is where people start resorting to e.g. SOAP, and projects start going off the rails (no pun intended, Ruby fans).
Luckily, as it turns out, there are already standardized, widely-available, well-supported systems that implement a centralized data serving service which enforces referential integrity. They're called databases. And foreign keys are an essential part of the service they provide.
Re:Foreign Keys (Score:5, Insightful)
Actually it shouldn't (in this context). Typically, one database will have several client applications attached to it. If data consistency is not checked at DB level, then:
Re:Foreign Keys (Score:5, Insightful)
Additionally, databases generally can do this faster than the application code. I can say this because databases are written in C and optimized and debugged for years. Applications are rarely (relatively) written in C and have not been debugged for years when released.
This is something that actually really pisses me off about Ruby, Rails, and ActiveRecord. ActiveRecord is an insane violation of everything that a database has been built to do. It breaks consistency, violates keys, ignores so many rules... And it beats the crap out of a database to do what a database is designed to do and can handle much faster.
This is regardless of the flame wars of Postgres vs MySQL.
Re:Foreign Keys (Score:4, Insightful)
Re:Foreign Keys (Score:4, Insightful)
Correct. That extra layer of checks will probably actually slow things down a bit.
But foreign keys aren't about performance. They're about data integrity, which I would hope every database administrator or developer is more concerned with anyway. It doesn't matter how many requests/second your DBMS can handle if the data is fuxxored.
Your app should be checking itself anyway.
Yes, it should be catching "foreign key constraint violation" exceptions thrown by the DB interface and handling them appropriately. I hope that's what you meant.
Re:Foreign Keys (Score:5, Insightful)
When are you non-database types going to stop saying "Your app should be checking itself anyway."
This is an insanely inefficient method of execution. It's also highly presumptive.
Inefficient: If you are going to insert a record you have to first check to make sure it's not there. Then if it is there you have to change your INSERT to an UPDATE. This is dumb. Some databases offer an INSERT OR UPDATE, but if they don't, it's faster to do an INSERT, handle the failure, and then UPDATE. Alternatively -- UPDATE, and INSERT on ZERO ROWS CHANGED. This means you run fewer than 2 queries on average. The "your app should check" method guarantees two SQL statements are executed every single time.
Dumb. Say you check for a record to exist. You get a "NO" answer. While you are preparing and executing your next INSERT, some other process or thread inserts that same record into the database. Now you have an error and you still don't know what to do. In short, you're in a pretty bad way.
Presumptive. In all my years of living I've never seen any company happy with the only interface to the data being through the application interface. Especially with a database on the back end. The business types, Marketing in particular, love to screw with database information to try and identify trends, patterns, and correlations between customer behaviour, product representation, and sales metrics. It is presumptive to think that the application can safely contain all of the business logic and to assume that no one will ever come in the back end and change something -- thereby breaking all your business rules.
The other consideration is that the business logic contained in a database is going to run a heck of a lot faster on the database than anything you can dream up in your application, unless the application is written in C. Databases are generally written in C/C++. Applications are generally written in Java, Perl, Python, Ruby. None of these can compete with C. Add to that the fact that databases have been designed for years to do only one thing -- manage data. Do you seriously think you can outperform a decade of database optimization in a Ruby script?
If you are going to base an application on data it would be useful to know how to capitalize on the features of a database rather than trying to repeat it. At the very least, you are less likely to introduce bugs.
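The two patterns described above (INSERT and fall back to UPDATE on failure, or UPDATE and INSERT when zero rows changed) might look roughly like this. It is a sketch using Python's sqlite3 as a stand-in database; the last_visit table and both function names are invented for the example. Note that it leans on the primary key constraint to detect the duplicate, rather than on a separate "does it exist?" query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE last_visit (user_id INTEGER PRIMARY KEY, visits INTEGER NOT NULL)"
)


def record_visit(user_id):
    """INSERT first; if the key already exists, fall back to UPDATE."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO last_visit (user_id, visits) VALUES (?, 1)",
                (user_id,),
            )
        return "inserted"
    except sqlite3.IntegrityError:
        # The primary key told us the row exists; no prior SELECT was needed.
        with conn:
            conn.execute(
                "UPDATE last_visit SET visits = visits + 1 WHERE user_id = ?",
                (user_id,),
            )
        return "updated"


def record_visit_update_first(user_id):
    """UPDATE first; INSERT only when zero rows were changed."""
    with conn:
        cur = conn.execute(
            "UPDATE last_visit SET visits = visits + 1 WHERE user_id = ?",
            (user_id,),
        )
        if cur.rowcount == 0:
            # On a concurrent server this INSERT can still lose a race,
            # so real code would also handle the constraint error here.
            conn.execute(
                "INSERT INTO last_visit (user_id, visits) VALUES (?, 1)",
                (user_id,),
            )
            return "inserted"
    return "updated"


results = [
    record_visit(7),
    record_visit(7),
    record_visit_update_first(7),
    record_visit_update_first(8),
]
visits = dict(conn.execute("SELECT user_id, visits FROM last_visit"))
print(results, visits)
```

Either way the common case costs one statement and only the exceptional case costs two, which is the "fewer than 2 queries on average" argument made above.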
Re:Foreign Keys (Score:5, Informative)
The client library is GPL. That means you cannot create a commercial program that uses it without using the commercial licensed version. Which is $200 per client
You can't even create a library and not ship mysql - the mysql site is very clear that they consider distributing a program that *uses* mysql as being exactly the same as distributing mysql itself:
http://www.mysql.com/company/legal/licensing/comm
Typical examples of MySQL distribution include:
* Selling software that requires customers to install MySQL themselves on their own machines.
Specifically:
* If you develop and distribute a commercial application and as part of utilizing your application, the end-user must download a copy of MySQL; for each derivative work, you (or, in some cases, your end-user) need a commercial license for the MySQL server and/or MySQL client libraries.
This makes mysql unusable for anything except large products. Our entire product only cost $70 for the single user version. No way in hell we're upping the price by $200 a copy.
Re:Foreign Keys (Score:4, Insightful)
Emphasis mine. In other words, You don't have to pay the $200 if your project is itself compliant with the GPL or similar license scheme.
"Comply with the GPL or pay us $200 to legally use our code or libraries" is not the same as saying "You have to pay us $200 if you plan to sell software you made using our code or libraries."
=Smidge=
Re:Foreign Keys (Score:4, Interesting)
Re:Foreign Keys (Score:5, Informative)
Re:Foreign Keys (Score:5, Informative)
WTF is with putting up an "unbiased comparison" between Postgres 7.2 and MySQL 5.0 when Postgres is now up to 8.2 and has most of their concerns addressed in that release, whereas MySQL is still at 5.0?
MySQL is a great database, if you need clustering but not referential integrity or ACID compliance, that is.
Re:Foreign Keys (Score:4, Informative)
That'd be because the article was written in 2005. Unbiased? Maybe. Vague, unscientific and out of date? Definitely.
Re: (Score:3, Informative)
Is that the same referential integrity and ACID compliance afforded by using InnoDB as your table type in MySQL? ;o)
Re: (Score:3, Insightful)
It is the same thinking that probably made the retards at MySQL AB make a datatype that accepts 30th February as a date. (At least it did, a few years ago.) Why EVEN include a datetime datatype if it isn't capable of the SIMPLEST validations ever?
Yes, I'm fuming. Those MySQL retards have made a generation of programmers think they can do SQL when they manage to put crap into MySQL. Gahhh, I hope their puny webapps will come back to haunt them sometime.
(I was once
Re: (Score:3, Insightful)