
PostgreSQL 9.0 Released 344

Posted by Soulskill
from the milestone-achieved dept.
poet writes "Today the PostgreSQL Global Development Group released PostgreSQL 9.0. This release marks a major milestone in the PostgreSQL ecosystem, with added features such as streaming replication (including DDL), Hot Standby and other nifty items like DO. The release notes list all the new features, and the guide explains them in greater detail. You can download a copy now."
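Of the "nifty items" mentioned, DO is the new statement for running an inline anonymous code block without first creating a function. A minimal sketch (the role name is made up for illustration):

```sql
-- DO executes a one-off plpgsql block; nothing is saved in the catalog.
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'app_user') THEN
        CREATE ROLE app_user;
    END IF;
END
$$;
```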
  • Cool (Score:5, Interesting)

    by iONiUM (530420) on Monday September 20, 2010 @07:16PM (#33643238) Homepage Journal

    I read the notes, noticed the Column and WHEN triggers. Is this in other SQL databases? If it is, I haven't seen it before. In any case, it's pretty cool that you can setup triggers on a conditional statement. That would really help me out in a lot of scenarios, as I work in the BI space, so alerting is a big deal.
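For reference, here is what the 9.0 syntax looks like; the table, column, and function names are made up for illustration:

```sql
-- Fires only when the balance column is updated AND the new value is
-- negative; otherwise the trigger function is never invoked at all.
CREATE TRIGGER balance_alert
    AFTER UPDATE OF balance ON accounts
    FOR EACH ROW
    WHEN (NEW.balance < 0)
    EXECUTE PROCEDURE balance_alert_fn();
```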

  • Re:Cool (Score:3, Interesting)

    by rsborg (111459) on Monday September 20, 2010 @07:45PM (#33643502) Homepage

    I read the notes, noticed the Column and WHEN triggers. Is this in other SQL databases? If it is, I haven't seen it before. In any case, it's pretty cool that you can setup triggers on a conditional statement. That would really help me out in a lot of scenarios, as I work in the BI space, so alerting is a big deal.

    Isn't this just syntactic sugar? What's the difference between logic inside the trigger determining when to run the payload, and that logic living outside the trigger... especially if the trigger (re)uses a parametrized stored proc for its payload?

    Not that this isn't nice, but it may also lead to code scattering, increasing the maintenance burden.
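For comparison, the pre-9.0 equivalent pushes the condition inside the trigger function, so the function is invoked on every row change regardless (names here are made up for illustration):

```sql
-- Pre-9.0 pattern: the function runs for every updated row and decides
-- internally whether it has anything to do.
CREATE FUNCTION balance_alert_fn() RETURNS trigger AS $$
BEGIN
    IF NEW.balance < 0 THEN
        -- ... alerting work goes here ...
        RAISE NOTICE 'negative balance for account %', NEW.id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER balance_alert
    AFTER UPDATE ON accounts
    FOR EACH ROW EXECUTE PROCEDURE balance_alert_fn();
```

So it is not purely sugar: with the WHEN clause the engine can skip the call into the procedural language entirely, rather than paying for a function invocation that turns out to be a no-op.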

  • As always... (Score:5, Interesting)

    by jd (1658) <imipak@yaCOLAhoo.com minus caffeine> on Monday September 20, 2010 @07:47PM (#33643520) Homepage Journal

    The new features are much admired by all (and deservedly so), but a heavier footprint typically means poorer performance overall, even if specific areas are accelerated or the programming experience improves. I'd like to see a performance plot, version versus performance versus different types of system load, to see how well the new stuff is being added in. It might be merged in beautifully and the underlying architecture may be superb, but I would like to see actual data on this.

    Also, PostgreSQL and MySQL aren't the only open source SQL databases. Including variants and forks, you really need to also consider Ingres, Drizzle, MariaDB, SAP MaxDB, Firebird and SQLite. If you also want to compare against closed source DBs, then you'd obviously want to look at DB2, Oracle, Caché and Sybase. I'd love to see a full comparison between all of these, feature-for-feature, with no bias for or against any specific development model or database model, but rather an honest appraisal of how each database performs at specific tasks.

    I like PostgreSQL a lot. I rate it extremely highly. However, without an objective analysis, all I have is my subjective perception. And subjective perceptions are not something I could credibly use in a workplace to encourage a switch. For that matter, subjective perceptions are not something I would consider acceptable for even telling a friend what to use. Perceptions are simply not credible and have no value in the real world.

  • Re:As always... (Score:3, Interesting)

    by jwpye (1905258) on Monday September 20, 2010 @08:10PM (#33643754)
    I think there has been some effort to bring a PostgreSQL "performance farm" online to show the differences in performance across versions of PostgreSQL, and to quickly identify regressions during development. I don't think it's up yet, but a search should reveal some details on the project.
  • Re:Cool (Score:5, Interesting)

    by lanner (107308) on Monday September 20, 2010 @08:17PM (#33643814)

    I hate to say it, but good/useful features like that will be abused by stupid DB guys who can't program.

    Once upon a time I worked in the entertainment industry and was working on a big MMO game project.

    Company X could not scale up their game clusters past about 1000 players. Somewhere between 1000 and 2000 players, the game would just start bogging down and in-game events piled up and everything trainwrecked and was unplayable.

    So, it turns out that most of the game logic was built off of complicated SQL stored procedures, triggers, logic, etc. Basically, they were using their hard drive as a processor.

    The problem was the MS-SQL server's disk I/O wait. CPU was fine on all of the systems, but they just could not imagine that the disks in the database server (only one DB server per cluster) could be the source of the problems. Every time an item was dropped, crafted, or certain other special things happened, there was an atomic commit, and that basically required writing to disk on the spot. Get enough of that going and your whole 20-something-CPU cluster sits with idle CPUs while the DB server thrashes its hard drives.

    Company went chapter 11, all staff eventually let go, and later was sold off for nothing.

    I had pointed out this problem to them, but it was late in development, and when you tell the people responsible for designing the product that they are idiots, well, they behave like idiots and don't really listen. Not that they could have fixed it anyway, given the time and intellect constraints.

    Anyway, point of the story is that cool SQL features are cool. But don't use your hard drive as a processor.

  • Re:Cool (Score:1, Interesting)

    by MichaelDavKristopeit (1905312) on Monday September 20, 2010 @08:59PM (#33644230)
    i'm not saying conditional triggers shouldn't exist... and i'm agreeing that more cycles would be used if a trigger was used where a conditional trigger could have been used, but my point is that in such situations the same job could be done without using any triggers whatsoever.

    granularity is an odd label to denounce and replace scattering, as i can think of nothing that scatters more than grains.

    generally, conditional triggers could be used by a platform layer that would automate the creation and updating of these triggers, but in the name of portability, that layer could handle such scheduling and processing on its own.

    if you're a farmer, do you want to "harvest the field" or "pick up every grain of corn"... as a programmer, do you want to "update the codebase" or "find every piece of the codebase you need to update"...

    it's not FUD... it's a waste of time.

  • by turing_m (1030530) on Monday September 20, 2010 @10:12PM (#33644716)

    The last time I looked at ooo.org Base (at least a year ago, if not longer), I found it surprisingly capable, even workable. Give it a go, and have some patience. I only really looked at the forms, but I used to use subforms a lot and I could do what I wanted with them. I didn't really look at reports, though.

    As far as business logic, put that in PostgreSQL.

  • Re:As always... (Score:5, Interesting)

    by greg1104 (461138) <gsmith@gregsmith.com> on Monday September 20, 2010 @11:27PM (#33645254) Homepage

    You've got the performance part backwards for PostgreSQL; it goes up with every release, sometimes a little, sometimes in a big way. See PostgreSQL history [suckit.blog.hu] for a comparison covering versions 8.0 to 8.4. The mild regression in 8.4 shown there is actually reversible; it's mainly because a query-planning parameter, controlling how many statistics to collect and use for planning, was increased by default. That results in better plans for most real-world queries, but it detuned this trivial benchmark a little. You can get performance back to 8.3 levels just by turning the parameter back to the older versions' "optimized for trivial queries" default, if you care about that. Most people prefer the new default. In the real world, 8.4 is actually faster due to improved handling of background VACUUM tasks, which don't show up in simple benchmarks either.

    I'm the current lead architect on building a PostgreSQL Performance Farm [2ndquadrant.com] to prevent regressions from popping into future versions of the code too. There is a recently completed beta client [github.com] for that purpose. We're in the process of working out how to integrate into future development, starting with 9.1, so that potential regressions are spotted on a commit by commit basis. I haven't seen any performance regressions between 8.4 and 9.0, only moderate improvements overall and large ones in specific areas that were accelerated.

    Now, if you use some of the new replication features aggressively, that can add some overhead and slow down the master. But that's true of most solutions; the data coming off the master has to take some time to generate. The way PostgreSQL 9.0 does it is pretty low-overhead: it just ships the changed blocks around. Theoretically some statement-based solutions might have lower overhead, but they usually come with concerns about non-determinism on the slaves when replayed (random numbers, timestamps, and sequence numbers are common examples).
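For the curious, the 9.0 block-shipping setup described above boils down to a handful of settings; the hostname and role name below are placeholders:

```
# postgresql.conf on the master
wal_level = hot_standby        # emit enough WAL detail for standbys
max_wal_senders = 3            # concurrent standby connections allowed
wal_keep_segments = 32         # retain WAL segments for lagging standbys

# postgresql.conf on the standby
hot_standby = on               # allow read-only queries during recovery

# recovery.conf on the standby
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
```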

    Given the non-disclosure terms of most of the closed source databases, nobody can publish benchmarks that include them without going through something like the TPC or SPEC process. The last time that was done in 2007, PostgreSQL 8.2 was about 15% slower than Oracle [toolbox.com] running the same database-heavy workload. And note that it was PostgreSQL 8.3 that had one of the larger performance increases, so that was from just before a large leap forward in PostgreSQL performance.

    At this point, Oracle and most other commercial databases still have a large lead on some of the queries run in the heavier TPC-H benchmarks. Links to more details as to why are on the PostgreSQL wiki [postgresql.org]. It just hasn't been a priority for development to accelerate all of the types of queries required to do well in that benchmark, and nobody so far has been willing to fund that or the subsequent certification via the TPC yet. Sun was the only one throwing money in that direction, and obviously the parts of that left within Oracle will no longer do so.

  • Re:Meh (Score:1, Interesting)

    by Anonymous Coward on Monday September 20, 2010 @11:47PM (#33645370)

    No, I'm fairly sure you have that backwards.

    PostgreSQL 6 was the first version. PostgreSQL 7 (which added foreign key support, apparently) seems to have been released around 1999/2000, and foreign keys have been enabled by default ever since. Same goes for ACID: the PostgreSQL 7 line had proper transaction support, and I believe it may have had that from the first version (~1996).

    InnoDB didn't show up in MySQL until late 2001. The default table type in MySQL is still MyISAM, which doesn't support foreign key constraints, and isn't ACID. InnoDB is... barely. It still acts as a dumb data store though - you can't really enforce any useful constraints on the data.

  • Re:Cool (Score:2, Interesting)

    by theshowmecanuck (703852) on Tuesday September 21, 2010 @12:35AM (#33645602) Journal
    Agreed, with one exception: I have seen some applications where the volume of transactions is so high (telecom) that the cost for the database to check referential integrity matters; the key columns were indexed, but not with primary key or unique constraints, so the DB wouldn't have to eat up time checking for uniqueness. It was up to the application to ensure referential integrity. Since most applications will never see this volume of transactions, yes, referential checking is best left to the DB, but not always.

    As far as business logic goes, you will certainly make your applications harder to maintain if you splash business logic across various platforms, including triggers and stored procs on the database. The hardest thing to debug is not syntax but logic, and if your business logic is split up all over the place, those issues are more of a pain to chase down. The larger the project and/or the app, the more so (I'm talking about something enterprise-like, requiring dozens to hundreds or more programmers, analysts, etc.). Not to mention the performance hit it can cause. Better to let the app handle the business of business logic.

    IMHO, stored procs and triggers are useful but should be used sparingly, if at all. Too many people are mesmerized by the siren song of marketing, consultants, and college instructors who haven't worked on big systems and who will tell you how good triggers and stored procs are.
  • Re:Cool (Score:5, Interesting)

    by GooberToo (74388) on Tuesday September 21, 2010 @12:41AM (#33645626)

    There is a difference between the engine checking a constraint versus a call into an interpreted language. One is doing less work; the other is doing more. Which is ideal? Obviously less work is better. And that's before you even get into the PL/SQL code, which is essentially doing the same work, but slower. Furthermore, all too often triggers are called when there is no work to be done, but you don't know that until the PL code decides whether this is the type of row change it's interested in; otherwise it should really have been a NOP. With a column trigger, the call to the PL code simply never takes place, so we save not only the call but all of the time wasted inside a trigger that ultimately decides it has nothing to do.

    Also, when the trigger should be called, a row trigger frequently must evaluate which columns have changed before it can even determine whether it cares about this row. Should it decide it does care, you've likely already passed through a mass of CASE and/or IF/THEN/ELSE code, which in turn leads to yet more CASE and/or IF/THEN/ELSE to determine exactly what should be done now that the row qualifies. Or you can call a much smaller section of code, dramatically simplified because, one, it's only called when it's pre-qualified (saving the creation of much redundant code) and, two, since it's pre-qualified, it can immediately perform whatever logic the trigger should run when the column in question has changed.

    Those are worlds apart in performance, readability, maintainability. Not to mention the added granularity makes possible a reduction in the test matrix, regression tests, and even makes it more difficult (though far from impossible) to create a regression.

  • by theshowmecanuck (703852) on Tuesday September 21, 2010 @01:00AM (#33645708) Journal
    He is saying that the data integrity of Oracle and DB2 type RDBMS engines is far superior to MySQL's, and so is the integrity of Postgres's engine. I think you will find many IT people in billion-dollar companies, concerned about maintaining database integrity, who believe this. I think where Postgres falls down is 'high availability'. It now has its new hot standby feature, but that feature still doesn't support reasonable failover functionality. In fact, after watching an EnterpriseDB webinar about the new hot standby feature, my impression is that it is still rather kludgy. I think a few more iterations and improvements are necessary, and they might have to change the way the transaction logs are captured/stored to make it really useful. I could be wrong, it happens, but it doesn't pass my own sniff test with respect to 'ready for prime time'.
  • by GooberToo (74388) on Tuesday September 21, 2010 @01:45AM (#33645902)

    It's extremely ironic that you switched just as PostgreSQL became considerably faster than MySQL. PostgreSQL has always been far more scalable. To now hear you brag that you've never looked back at a superior and faster database, because of your steadfast and likely false belief that MySQL is faster, is rather amusing.

    One of the biggest problems with the MySQL user base is that they don't have any idea what "faster" means, nor do they typically understand how to benchmark. Worse, they constantly confuse speed with scalability. And worse still, most MySQL users take the MySQL benchmarks to heart when time and time again they are nothing but marketing lies. Most independent tests have historically had trouble even getting MySQL to stay running until the end. And when it actually does finish, it's normally somewhere between the middle of the pack and dead last - and that's with all the other databases forced down to the lowest common feature set, which prevents them from using their advanced, much, much faster features.

    The bottom line is, MySQL is popular because it has buzzword compliance for people who almost always don't know any better, but most of all because it was readily available on Windows at a time when everyone was looking for a free database to go to. PostgreSQL is popular because it has buzzword compliance too, is far more feature-rich, almost always outperforms MySQL, and underscores, not to mention truly understands, what ACID is all about - all while providing a very rich set of features that MySQL is unlikely to ever match. And that's ignoring that MySQL's optimizer absolutely stinks for anything but the simplest of queries.

    The best rule of thumb is: think of MySQL as a really fast Access database. If you wouldn't use Access (ignoring database performance in the comparison), you should think really hard about using MySQL. There are so many superior and still free RDBMSs compared to MySQL that it's easy to see why so many get so frustrated when others insist on injecting a dramatically inferior solution into the equation, just because it has buzzwords.

  • by asdf7890 (1518587) on Tuesday September 21, 2010 @05:04AM (#33646642)

    no foreign keys! no transactions! no ACID!

    One of the things that put me off MySQL some years ago was that people, both in the wider community and on the project team itself, seemed to claim that if you wanted such things you were doing things wrong. Not "we don't support that (yet)" but "you're being stupid", and if pressed, the best you could get was "here's a workaround that will achieve more-or-less the same thing with a chunk of extra work".

    I may be about to be told I'm being wrong headed (and perhaps petty) here as no doubt the entire development team has changed several times over the years, but I can't quite shake the thought that maybe those features were implemented just to shut people up rather than because the team actually understood their importance to those asking for them. If I need/want those properties, other things being equal, I'd rather use a database that has had those properties baked in for much longer.

  • Re:Cool (Score:3, Interesting)

    by TheRaven64 (641858) on Tuesday September 21, 2010 @07:27AM (#33647448) Journal
    A good optimiser can only do that when the underlying infrastructure supports it. This is only possible now that the engine supports firing triggers when a particular column is modified. You seem to be arguing that they should have implemented this support in the engine but not exposed it to the user. As the AC said, this means that you are recommending that they not implement the SQL spec, which is an interesting perspective. You're not a MySQL developer, by any chance?
  • by GoChickenFat (743372) on Tuesday September 21, 2010 @07:35AM (#33647508)

    The fact he makes such a statement means two things. One, he's never bothered to look. Two, only interested in FUD'ing.

    Of course, you are wrong. I've spent quite a bit of time in both SQL Server BOL and the PG 8.1 manual. The PG manual may be one of the best OSS manuals, but it is far from THE best manual; it's extremely weak in the area of administration. And if you're brave enough to make a suggestion on one of the PG boards, you are usually met with elitist resistance, as if accepting a suggestion were somehow an admission of failure. This is especially true if you reference an MS SQL example.

  • by TheRaven64 (641858) on Tuesday September 21, 2010 @07:42AM (#33647532) Journal

    I don't think SQLite belongs in that list. For most tasks I'd pick either SQLite or PostgreSQL, but I can't think of many applications where I would consider both. They are very different projects. The only time I have considered using both is when I wanted to have a large concurrently-usable data set stored in PostgreSQL and then a small single-user subset stored in SQLite on a handheld device. SQLite gives really great performance for single-user applications, but it lacks a lot of the more advanced parts of the SQL spec, doesn't really do concurrency well, and doesn't provide any kind of RDBMS, and is quite sloppy about data types. PostgreSQL provides a good implementation of the SQL spec and a lot of optimisations for massively multi-user setups.

    There are cases when PostgreSQL is too heavyweight to be the correct solution, and SQLite is good for these. There are cases when SQLite is too simple, and PostgreSQL is good for these. There are no situations where MySQL is the right tool for the job. There are probably situations where you need something beyond what PostgreSQL can provide, and Oracle or DB2 would fit there, but I've not encountered any so I've not had the excuse to play with either.

  • by georgeb (472989) on Tuesday September 21, 2010 @08:05AM (#33647724) Journal
    I will second that. I remember a discussion with a MySQL dev where I was trying to raise the point that the DB should not accept 30 Feb as a valid date. A quite senior dev, backed up by numerous voices on the mailing list, tried to convince me that it wasn't the DB's job to check for that, citing performance concerns. This was at least 10 years ago, but it's still not a good start for a DB to have core developers like that.

    I don't like MySQL primarily because it cares about standard SQL about as much as Microsoft does. I find the documentation abhorrent. DDL is cumbersome. Working in command-line mysql is pure torture compared to psql (on Linux! psql for Windows is rubbish, which is not surprising considering the mess that the Windows command-line terminal is).
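For what it's worth, the check being asked for costs essentially nothing to the user; PostgreSQL rejects the value at parse time:

```sql
SELECT DATE '2010-02-30';
-- ERROR:  date/time field value out of range: "2010-02-30"
```

MySQL of that era, under its permissive default sql_mode, would instead accept the statement and silently store the zero date '0000-00-00'.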
  • Re:Cool (Score:3, Interesting)

    by Splab (574204) on Tuesday September 21, 2010 @08:05AM (#33647726)

    We aren't using triggers for your readability, your maintenance or your speed, we are using triggers and constraints to ensure our dataset is correct.

    OP was bashing the programmers for always committing to the hard drive what was supposed to go there (and is going there in most MMOs); to me it sounded like the hardware setup was their downfall.

  • by MichaelSmith (789609) on Tuesday September 21, 2010 @08:34AM (#33648028) Homepage Journal

    I really just intended to remark on the attitude I see around database stories. It's worse than operating system prudes. You get the big-database people sneering down at the MySQL people and ignoring the fact that different applications require different tools.

  • by psyclone (187154) on Tuesday September 21, 2010 @01:55PM (#33653098)

    If you are using hand-rolled SQL, most MySQL queries will execute on Postgres without much modification. However, MSSQL will be vastly different.

    For example, look at these ugly MSSQL queries with explicit locking, which you will probably have to use as developers and DBAs can't seem to agree on a standard isolation mechanism:

    SELECT COUNT(UserID) FROM Users WITH (NOLOCK) WHERE Username LIKE 'foobar'

    and

    UPDATE Users WITH (ROWLOCK) SET Username = 'fred' WHERE Username = 'foobar'

    Also, there are no LIMIT / OFFSET keywords in MSSQL; you have to do crazy shit like:


    WITH results AS (
            SELECT
                    rowNo = ROW_NUMBER() OVER( ORDER BY columnName ASC )
                    , *
            FROM tableName
    )
    SELECT *
    FROM results
    WHERE rowNo between (@pageNumber-1)*@pageSize+1 and @pageNumber*@pageSize

    Source: http://stackoverflow.com/questions/187998/row-offset-in-ms-sql-server [stackoverflow.com]
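The Postgres (and MySQL) equivalent of that whole CTE, assuming a page size of 20 and page number 3:

```sql
SELECT *
FROM tableName
ORDER BY columnName ASC
LIMIT 20 OFFSET 40;   -- OFFSET = (pageNumber - 1) * pageSize
```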

    You will soon realize that the Express version is super-limited (4GB max size, 1 GB ram, 1 core, no replication, etc.)
    Source: http://www.microsoft.com/sqlserver/2005/en/us/compare-features.aspx [microsoft.com]

    Postgres is highly tunable, but the defaults (that ship with many OSes) are for small footprints. The document below is older but still relevant, with explanations and the annotated config guide (at the bottom of the page). Throw 8 cores and 16GB of RAM at Postgres, tweak the conf a tiny bit, and the feature set and performance will surprise you.

    Tune Postgres: http://www.varlena.com/GeneralBits/Tidbits/perf.html [varlena.com]

    There's no reason to use MSSQL unless all of your development and applications are on Windows, and your development team can't use anything other than their IDEs in a limited way. Once you start using Postgres, and realize the power behind it, you'll never want to use anything else.

    If, for some strange reason, your company wants to spend money and buy DB support, go for a commercial vendor of postgres. Enterprise DB has some nice management features: http://www.enterprisedb.com/products/index.do [enterprisedb.com]
