
First "Real" Benchmark for PostgreSQL

Posted by ScuttleMonkey
from the waiting-for-the-opportune-moment dept.
anticlimate writes "A new benchmark published on SPEC shows PostgreSQL's performance approaching Oracle's and on par with or surpassing MySQL (however, the test hardware for the other DB systems is somewhat different). The test was put together by PostgreSQL core developers working at Sun. They are certainly not unbiased, but according to Josh Berkus's blog, this is the first 'real' benchmark for PostgreSQL. The main difference from earlier benchmarks (and anecdotes) seems to be the tuning of PostgreSQL."
  • by Control Group (105494) * on Monday July 09, 2007 @04:17PM (#19805251) Homepage
however, the test hardware for the other DB systems is somewhat different

    Which makes the results pretty much useless. But, being the intrepid slashdotter I am, I went ahead and R'ed the FA anyway, in case I could glean some useful information from it.

    Which revealed that the linked article doesn't actually contain any information whatsoever about Oracle* or MySQL, much less benchmarks on named hardware.

    So...what am I supposed to get out of this, again? Or is this just supposed to be some kind of PostgreSQL love-in, so I should take my wet blanket elsewhere?

    *Well, the second link contains someone claiming that Oracle is only 15% faster...but without providing any actual data.
  • Mod parent way up! (Score:3, Insightful)

    by khasim (1285) <brandioch.conner@gmail.com> on Monday July 09, 2007 @04:23PM (#19805321)
    You cannot compare benchmarks without SOMETHING standard between them.

    Okay, if they can't match the hardware (why not?) then focus on price points. I notice that they're looking at "$65,500 for the hardware". That's a LOT of hardware at today's prices.

    I'm sure MySQL would (and will) come back with a "benchmark" on hardware costing $10,000.

    There is nothing "real" about this "benchmark".
  • by Ngarrang (1023425) on Monday July 09, 2007 @04:26PM (#19805363) Journal
    To paraphrase an old saying:

    There are lies, damned lies and benchmarks.
  • Re:Bad firehose! (Score:5, Insightful)

    by MrNaz (730548) on Monday July 09, 2007 @04:33PM (#19805443) Homepage
I like yours better. The Slashdot editors need to have their balls cut off if they think the post that beat yours onto the front page is better. Feel free to mod me down any time for bitching about this, but seriously, this post is SO much better than the one that made it.
  • by Ungrounded Lightning (62228) on Monday July 09, 2007 @04:35PM (#19805473) Journal
however, the test hardware for the other DB systems is somewhat different

    Which makes the results pretty much useless.


    Not necessarily.

    It's essentially useless for separating out how much of the performance difference is the result of the software's design, implementation, and tuning versus how much is due to the platform differences.

    But such tests CAN be used to examine the performance of competing ENTIRE SYSTEMS, to inform choices between them.

They say: "Oracle on hardware X does THIS well; PostgreSQL on hardware Y can be tuned so it does THAT well on the same benchmark."

    This lets administrators (presuming they have access to the hardware info) get a bang-for-the-buck comparison.

    For the rest of us, the interesting point is that PostgreSQL, running on its team's idea of realistic hardware, can produce performance in the same ballpark as Oracle running on Oracle's choice of hardware.

(Whether the remaining necessary data (what are hardware X and Y? how was PostgreSQL tuned?) is published now, later, or never is a separate issue. B-) )
  • by Kalriath (849904) on Monday July 09, 2007 @04:36PM (#19805487)
    I dunno, I kinda like MSSQL. Hell, I use it alongside MySQL servers for my own projects (that, and having support for multiple platforms in your product is kinda a good idea). Sure, it's got horrific licensing (nowhere near as bad as Oracle's, though) but other than that, it's pretty good and reliable. I get the impression that the core of it wasn't written by MS way back when, though. And it sure wasn't built by the Windows team.
  • by Vellmont (569020) on Monday July 09, 2007 @04:53PM (#19805679)

    You cannot compare benchmarks without SOMETHING standard between them.

    The thing that's standard is the benchmarking software.

    If I were to buy a database server, do I really care which component of the solution is providing me with the great performance, or do I just want the performance? At the end of the day the only thing that really matters is the performance that comes out of the box.

It doesn't really matter whether "PostgreSQL" is faster than "MySQL" in the abstract, because they always run on a particular physical computer. What matters is "I need to accomplish X, Y and Z. I have A dollars to spend. Which solution accomplishes X, Y and Z best within my budget?" You can't separate the software from the hardware and get an answer that's very meaningful.

    This benchmark isn't the last word on anything. Even a benchmark run on the exact same hardware means very little if you have a 2 core machine instead of 8.
  • by Control Group (105494) * on Monday July 09, 2007 @04:53PM (#19805683) Homepage
Oh, I agree. A benchmark of whole systems can be just as useful as (or more useful than) a benchmark of individual pieces of software, depending on what your goals are.

But what's been presented here isn't even that. Link #1 takes us to a SPEC benchmark of PostgreSQL. It doesn't provide any information about anything else; there isn't anything to compare the benchmark to. Link #2 provides an unreferenced statement about Oracle's marginally superior performance on much more expensive equipment.

    So, perhaps, one can begin to draw conclusions about PostgreSQL vs Oracle in the contexts of full systems. But neither link #1 nor link #2 provide any information about MySQL (except the quote: "[t]his publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL").

    Really, my criticism isn't of the benchmark (the data are the data, after all) or of the blog (one expects a vested PostgreSQL interest to comment on such a benchmark), but of the blurb here that either a) draws totally unwarranted conclusions, or b) depends on information it doesn't bother sharing.
  • If you want to set up a dedicated database server, you want to know which software on which hardware will run fastest. So while the benchmarks may not be useful to people setting up a small multi-purpose server, they can still be useful for some people.
  • DUH! (Score:5, Insightful)

    by Slashdot Parent (995749) on Monday July 09, 2007 @05:00PM (#19805785)

    Why this emaciated post made it while mine didn't I'll never know.
    Yours wasn't posted because it didn't contain enough flamebait.
  • Re:Elephant (Score:4, Insightful)

    by pavera (320634) on Monday July 09, 2007 @05:01PM (#19805797) Homepage Journal
    "Dolphin" also conveys "fun play thing" to me...

    I'd prefer the elephant that never forgets.
  • by LurkerXXX (667952) on Monday July 09, 2007 @05:32PM (#19806157)
    Actually I think PostgreSQL might have eaten a lot more of the MySQL market if they'd simply been faster to market with better admin tools and Windows support.

    Lots of folks went with MySQL early because of those factors. They also then tended to write all their PHP, etc, applications to only talk to MySQL, thus making folks who might have preferred PostgreSQL use MySQL to run the app they needed to run. Once that happens you are kind of in a Catch-22 place. Folks won't write the apps for PostgreSQL until it's used by a larger chunk of the market, but it won't take that large chunk because all the 'cool' apps were written for MySQL only, so they have to run MySQL instead of PostgreSQL.

  • by kpharmer (452893) on Monday July 09, 2007 @05:35PM (#19806201)
    > MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast,
    > easy to use database for those who don't need a whole lot.
    > Oracle supplies an enterprise level database that MySQL doesn't aspire to.
    > PostgreSQL doesn't know where to fit in.

    This is an oversimplification. Each vendor sees itself in all markets:
        - oracle/db2/sql server have free versions for tiny apps and very expensive versions for massive apps
        - mysql says it doesn't want to do what oracle does, but also says that this is less than 1% of the market - and knows that plenty of smallish databases are on oracle
        - postgresql like the others sees itself doing anything from very small databases to very large ones (often via Enterprise DB or other vendor extensions)

    And using a single product for multiple sizes isn't illogical: if you have any very large databases (hundreds of gbytes or more) then you probably have a few dozen little ones as well. It's *far* easier to manage them all on oracle/db2/sql server - even with the small additional licensing costs - than to have a frankenstein collection of products to manage.

    "Best tool for the job" is a good consideration when evaluating products (along with vendor viability, cost, etc, etc) - but once you've got a single tool in house, continuing to add new products - each with its own licensing, support, patching, and backup/recovery procedures - is a nightmare. Let alone actually federating your data - and having to test out how to virtualize or replicate data from oracle 10.x.x with mysql 5.y.y

    > Performance is one aspect of the price tag, but it is certainly not the only factor.
    Very true - and for that reason Postgresql has more going for it than many alternatives, like:
        - best licensing options - you don't need to pay a lawyer to go over your contract or license like you should if you use oracle or mysql commercially. And there's no fear that the vendor will change its license terms once you're locked in and start charging an arm and a leg.
        - very good foundation - postgresql isn't built from duct tape and baling wire. The functionality within it is well tested and robust.
        - great support for standard database features - whether it's subselects, stored procedures, triggers, etc - it's very simple to move from oracle to postgresql.
        - great ansi sql support - again, very standard sql - no unnecessary proprietary language elements.

    So, yeah - just because Postgresql is performing well on some benchmarks that doesn't mean you should immediately throw out oracle in favor of postgresql. On the other hand, you also shouldn't discard it because it is a good general purpose database solution.
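    The "standard SQL" point above can be illustrated with a correlated subselect. A minimal sketch (the `emp` table and its data are invented, and SQLite stands in here only as a convenient stdlib engine; the same statement is valid ANSI SQL on PostgreSQL, Oracle, or DB2):

```python
import sqlite3

# Invented example schema: find the top earner in each department using
# a correlated subselect -- standard SQL, no vendor-specific syntax.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
INSERT INTO emp VALUES ('b', 'x', 200), ('a', 'x', 100), ('c', 'y', 50);
""")

rows = conn.execute("""
    SELECT name FROM emp e
    WHERE salary = (SELECT MAX(salary) FROM emp WHERE dept = e.dept)
    ORDER BY name
""").fetchall()
print(rows)  # [('b',), ('c',)]
```

    Because nothing here is engine-specific, the query ports between databases with no changes, which is the portability argument being made above.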
  • by RelliK (4466) on Monday July 09, 2007 @06:10PM (#19806571)
    1) Ease of use: MySQL has readline, and supports lazier SQL programming. It has loads of syntactic sugar that lets you write simple things in whatever way makes sense to you. Adding users, changing passwords, backing up... it is all _easy_. Of course, postgres can do everything MySQL can too, the question is which is easier.

    PostgreSQL's command-line shell is far better than mysql. In addition to what mysql does, psql supports tab-completion and other nice things. It's been a while since I looked at mysql, but when I did the difference between the two respective command-line interfaces was like the difference between bash and sh.

    2) Extensive developer community. We use python and the MySQL/python integration is great. We have a few UDFs that are home-grown but some of them were just downloaded off the net and installed. I'm sure you can find far more for MySQL than for Postgres.

    Uh-huh. You are "sure". Several times I asked for support on postgresql mailing lists and the response has always been excellent. Usually I got answers within hours.

    I think one of the reasons that mysql became ubiquitous is that it had proper windows support early on. So, just like windows, everyone uses mysql because everyone else does, and they are willing to jump through hoops to work around all the deficiencies the platform has, simply because they don't know any better.

  • by moderatorrater (1095745) on Monday July 09, 2007 @08:37PM (#19807933)
    Great analogy. Just remember that there are far, far fewer moving vans in this world for a reason and that they sit next to the curb more than the pickups.
  • by arivanov (12034) on Tuesday July 10, 2007 @05:04AM (#19810769) Homepage

    That is the problem - it does not in real life. The application works in the developer's hands, goes out into the field, and breaks (I've seen that one time too many). Millions if not billions of dollars have been spent making sure that RDBMS transactions are atomic and preserve data integrity. No application-level interface abstraction has ever afforded, or could ever afford, that expense. In every single instance where I have looked at application developers replacing SQL ACID with a "bake-their-own" system, I have found data integrity violations. In modern multithreaded (or web-server-based) apps the most common result is race conditions, which are probably the hardest problems to debug in software.

    The other common problem with application-level abstractions is performance. Once again - it works in the developer's hands, goes into the field, gets real data loaded into it, and all hell breaks loose. The reasons are similar to ACID, as the next biggest investment after data integrity in a database is its ability to fine-grain lock data objects. If a developer tries to replace RDBMS locking in the application layer, he usually ends up with a coarser lock that is more contended. In addition, to avoid race conditions, developers usually deliberately create a bottleneck by muxing all RDBMS access through a single thread and a single access point to simplify locking. In fact, probably one of the most beneficial uses of MySQL is its ability to support server-based fine-grained locks that are not tied to a specific data object. You can use these for global semaphores and global locking even in cases where POSIX locks do not work (e.g. across clusters).

    Overall, yeah, MySQL and your app "already works". For Proof of Concept - maybe (in fact I use it myself). For real stuff - no, not really, unless you put a lot of work in the application layer. I have done that on quite a few occasions and the performance gains can be staggering compared to ACIDising your brain with a proper RDBMS, but the effort is hardly worth it in most real life scenarios. It also makes it considerably less maintainable.
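    The lost-update failure described above can be sketched in a few lines. The account table and numbers are made up, and SQLite stands in for whichever RDBMS is in play; the point is only the contrast between a read-modify-write done in application code and an update done atomically inside the database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

# "Bake-your-own" version: two interleaved workers each read 100,
# subtract 10, and write back -- the second write clobbers the first.
b1 = conn.execute("SELECT balance FROM accounts WHERE id=1").fetchone()[0]
b2 = conn.execute("SELECT balance FROM accounts WHERE id=1").fetchone()[0]
conn.execute("UPDATE accounts SET balance=? WHERE id=1", (b1 - 10,))
conn.execute("UPDATE accounts SET balance=? WHERE id=1", (b2 - 10,))
conn.commit()
print(conn.execute("SELECT balance FROM accounts WHERE id=1").fetchone()[0])  # 90, not 80

# Atomic version: the database serializes the decrements itself.
conn.execute("UPDATE accounts SET balance = balance - 10 WHERE id=1")
conn.execute("UPDATE accounts SET balance = balance - 10 WHERE id=1")
conn.commit()
print(conn.execute("SELECT balance FROM accounts WHERE id=1").fetchone()[0])  # 70
```

    The first version silently loses one of the two withdrawals; the second cannot, which is exactly the integrity work the database has already paid for.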

  • by kpharmer (452893) on Tuesday July 10, 2007 @12:16PM (#19814419)
    > unless SQL data representation grows up to modern non-fortran-like OO semantics MySQL will proliferate

    You do realize that in activities like reporting, SQL and its set-based operations are far, far, far faster and easier to work with than OO implementations, right?

    You can set up a typical star-schema and have certain tools (like microstrategy) immediately recognize it and generate queries for you. These queries will typically perform just fine and allow very powerful and fast drill-downs, drill-across, etc.

    An OO approach involving the marshalling of millions of objects related to entries in the database would take forever to build to this kind of flexibility and would run slow as molasses. It might make the OO purists happy, but the customers would never use the product. At least against non-trivial amounts of data.
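    A minimal sketch of that contrast (table and column names invented, SQLite as a stand-in engine): the set-based version is one declarative statement the engine can optimize, while the OO-marshalling equivalent fetches every row and loops over it in application code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10), ("east", 20), ("west", 5)])

# Set-based: the whole aggregation is one statement.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 30), ('west', 5)]

# Object-marshalling equivalent: pull every row out and total it by hand.
totals = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    totals[region] = totals.get(region, 0) + amount
print(sorted(totals.items()))  # [('east', 30), ('west', 5)]
```

    On three rows the difference is invisible; on millions, the first form lets the engine use its indexes and aggregation machinery while the second pays the marshalling cost for every row.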
  • by Doctor Memory (6336) on Tuesday July 10, 2007 @02:00PM (#19815889)

    Most developers nowadays go for a really trivial schema and an abstraction layer. At that point the only thing that matters is row speed on simple table operations, and there MySQL or in-memory OO database frameworks with a simple backing store wipe the floor.
    Until, of course, they don't. All it takes is a couple of users who want to actually get information *out* of the database ("How many widgets do we typically sell in Poughkeepsie in March? And when I say 'Poughkeepsie', I mean the greater Poughkeepsie metroplex.") and you're stuck building indexes and making joins in your code. Eventually your code either becomes unmaintainable, or collapses under its own bulk. Agile/XP developers like the DRY axiom: Don't Repeat Yourself. Why write code to do what the database already does?
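    As a toy version of that "widgets in Poughkeepsie" question (the schema and numbers are entirely invented, with SQLite standing in for the real database), the join and filter the database already provides look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT, metro TEXT);
CREATE TABLE orders (city_id INTEGER, month TEXT, widgets INTEGER);
INSERT INTO cities VALUES (1, 'Poughkeepsie', 'Poughkeepsie'),
                          (2, 'Hyde Park',    'Poughkeepsie');
INSERT INTO orders VALUES (1, 'March', 40), (2, 'March', 12), (1, 'April', 7);
""")

# One query answers the "greater metroplex" question; no hand-built
# indexes or joins in application code required.
total = conn.execute("""
    SELECT SUM(o.widgets)
    FROM orders o JOIN cities c ON c.id = o.city_id
    WHERE c.metro = 'Poughkeepsie' AND o.month = 'March'
""").fetchone()[0]
print(total)  # 52
```

    When the users change the question ("and April, and only the city proper"), only the WHERE clause changes, which is the DRY point being made above.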

    DBDs, when you actually corner them to ask something meaningful, answer with SQL technobabble like in your post. To the average developer it sounds like fortran. And if it looks like fortran, walks like fortran and talks like fortran, it gotta be fortran.
    Um, are you sure you want to be bringing up FORTRAN as a counterexample when discussing performance? I'm not an HPC enthusiast, but I don't recall seeing Java or Python mentioned in the same sentence as Linpack or STREAMS. It's all FORTRAN (77, some 90) or "C/C++" (the C++ is silent). Just another case of "pick the right tool for the job".

    From the point of view of an average software engineer, SQL and especially stored procedures look like a blast from the past
    "Those who do not understand Unix are condemned to reinvent it, poorly" — Henry Spencer. Relational databases are based on set theory, and have proven their worth over the last thirty years. Neither the old CODASYL databases nor the new object databases could compete. Anyone who claims to be a "software engineer" and can't understand SQL is a poser. If they can't learn old tech, how are they going to learn new tech? There is an "SQL for Dummies", you know...

    unless SQL data representation grows up to modern non-fortran-like OO semantics MySQL will proliferate
    In that case, why even use MySQL? Fall back to MyISAM and do all the work yourself! Hell, you are anyway, so why even stick an SQL abstraction layer in there? Some fast serialization logic and you're good to go. For toy apps (like those presented in programming books), you can get away with stuff like that. When you get into real projects (something that requires more than one developer and more than a year), hand-rolled "abstraction layers" that can't guarantee consistency or even simple reliability (quick: what does your data layer do with uncommitted writes when the kernel panics?) just don't cut it.
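    A hedged sketch of that last point (SQLite again purely as a stand-in, crash simulated with an exception): an uncommitted write simply disappears on rollback, which is exactly the guarantee a hand-rolled serialization layer would have to reimplement itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
conn.commit()

try:
    conn.execute("INSERT INTO t VALUES (1)")  # part of an open transaction
    raise RuntimeError("simulated crash before commit")
except RuntimeError:
    conn.rollback()  # the uncommitted write vanishes cleanly

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 0
```

    A real kernel panic is messier than a Python exception, of course, but the write-ahead logging that makes this guarantee hold across actual crashes is part of the "millions of dollars" of database engineering discussed upthread.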
