PostgreSQL 9.0 Released 344
poet writes "Today the PostgreSQL Global Development Group released PostgreSQL 9.0. This release marks a major milestone in the PostgreSQL ecosystem, with added features such as streaming replication (including DDL), Hot Standby and other nifty items like DO. The release notes list all the new features, and the guide explains them in greater detail. You can download a copy now."
Cool (Score:5, Interesting)
I read the notes, noticed the Column and WHEN triggers. Is this in other SQL databases? If it is, I haven't seen it before. In any case, it's pretty cool that you can setup triggers on a conditional statement. That would really help me out in a lot of scenarios, as I work in the BI space, so alerting is a big deal.
Re:Cool (Score:3, Interesting)
Isn't this just syntactic sugar? What's the difference between logic in the trigger determining when to issue the payload logic, and the logic outside the trigger... especially if the trigger (re)uses a parametrized stored proc for it's payload?
Not that this isn't nice, but it may also lead to code scattering, increasing maintainability issues.
As always... (Score:5, Interesting)
The new features are much admired by all (and deservedly so), but a heavier footprint typically means poorer performance overall even if there's accelerated performance in specific areas or improved programming. I'd like to see a performance plot, showing version versus performance versus different types of system load, in order to see how well new stuff is being added in. It might be merged in great and the underlying architecture may be superb, but I would like to see actual data on this.
Also, PostgreSQL and MySQL aren't the only Open Source SQL databases. Including variants and forks, you really need to also consider Ingres, Drizzle, MariaDB, SAP MaxDB, FireBird and SQLite. If you want to also compare against Closed Source DBs, then you'd obviously want to look at DB/2, Oracle, Cache and Sybase. I'd love to see a full comparison between all of these, feature-for-feature, with no bias for or against any specific development model or database model, but rather an honest appraisal of how each database performs at specific tasks.
I like PostgreSQL a lot. I rate it extremely highly. However, without an objective analysis, all I have is my subjective perception. And subjective perceptions are not something I could credibly use in a workplace to encourage a switch. For that matter, subjective perceptions are not something I would consider acceptable for even telling a friend what to use. Perceptions are simply not credible and have no value in the real world.
Re:As always... (Score:3, Interesting)
Re:Cool (Score:5, Interesting)
I hate to say it, but good/useful features like that will be abused by stupid DB guys who can't program.
Once upon a time I worked in the entertainment industry and was working on a big MMO game project.
Company X could not scale up their game clusters past about 1000 players. Somewhere between 1000 and 2000 players, the game would just start bogging down and in-game events piled up and everything trainwrecked and was unplayable.
So, it turns out that most of the game logic was built off of complicated SQL stored procedures, triggers, logic, etc. Basically, they were using their hard drive as a processor.
The problem was with the MS-SQL server disk IO Wait. CPU was okay on all of the systems, but they could just not imagine that the disks in the database server (only one DB server per cluster) could be the source of the problems. Every time there was an item dropped, crafted, or certain other special things happened, there was an atomic commit and that basically required writing to disk on the spot. Get enough of that going and you're whole 20-something CPU cluster sits with idle CPU while the DB server works it's hard drives.
Company went chapter 11, all staff eventually let go, and later was sold off for nothing.
I had pointed out this problem to them, but it was late in development and when you tell the people who are responsible for designing the product that they are idiots, well, they behave like idiots and don't really listen. Not that they could have fixed it anyway due to time and intellect restraints.
Anyway, point of the story is that cool SQL features are cool. But don't use your hard drive as a processor.
Re:Cool (Score:1, Interesting)
granularity is an odd label to denounce and replace scattering, as i can think of nothing that scatters more than grains.
generally, conditional triggers could be used by a platform layer that would automate the creation and updating of these triggers, but in the name of portability, that layer could handle such scheduling and processing on it's own.
if you're a farmer, do you want to "harvest the field" or "pick up every grain of corn"... as a programmer, do you want to "update the codebase" or "find every piece of the codebase you need to update"...
it's not FUD... it's a waste of time.
Re:Waiting for a capable PostgreSQL front-end (Score:3, Interesting)
The last time I looked at ooo.org Base (at least a year ago, if not longer), I found it surprisingly capable, even workable. Give it a go, and have some patience. I only really had a look at the forms though, but I used to use subforms a lot and I could do what I wanted to do with it. Did not really look at reports though.
As far as business logic, put that in PostgreSQL.
Re:As always... (Score:5, Interesting)
You've got the performance part backwards for PostgreSQL; it goes up with every release, sometimes a little, sometimes in a big way. See PostgreSQL history [suckit.blog.hu] for a comparison covering versions 8.0 to 8.4. The mild regression in 8.4 shown there is actually reversible; it's mainly because a query related parameter for how many statistics to collect and use for query planning was increased by default. That results in better plans for most real-world queries, but it detuned this trivial benchmark a little bit. You can get performance back to 8.3 levels just by turning the parameter back to the "optimized for trivial queries" default of the older versions if you care about that. Most people prefer the new default. In the real world, 8.4 is actually faster due to improved handling of background VACUUM tasks too, which don't show up in simple benchmarks either.
I'm the current lead architect on building a PostgreSQL Performance Farm [2ndquadrant.com] to prevent regressions from popping into future versions of the code too. There is a recently completed beta client [github.com] for that purpose. We're in the process of working out how to integrate into future development, starting with 9.1, so that potential regressions are spotted on a commit by commit basis. I haven't seen any performance regressions between 8.4 and 9.0, only moderate improvements overall and large ones in specific areas that were accelerated.
Now, if you use some of the new replication features aggressively, that can add some overhead to slow down the master. But that's true of most solution; the data coming off the master has to take up some time to generate. The way PostgreSQL 9.0 does it is is pretty low overhead, it just ships the changed blocks around. Theoretically some statement based solutions might have lower overhead, but they usually come with concerns about non-determinism on the slaves when replayed (random numbers, timestamps, and sequence numbers are common examples).
Given the non-disclosure terms of most of the closed source databases, nobody can publish benchmarks that include them without going through something like the TPC or SPEC process. The last time that was done in 2007, PostgreSQL 8.2 was about 15% slower than Oracle [toolbox.com] running the same database-heavy workload. And note that it was PostgreSQL 8.3 that had one of the larger performance increases, so that was from just before a large leap forward in PostgreSQL performance.
At this point, Oracle and most other commercial databases still have a large lead on some of the queries run in the heavier TPC-H benchmarks. Links to more details as to why are on the PostgreSQL wiki [postgresql.org]. It just hasn't been a priority for development to accelerate all of the types of queries required to do well in that benchmark, and nobody so far has been willing to fund that or the subsequent certification via the TPC yet. Sun was the only one throwing money in that direction, and obviously the parts of that left within Oracle will no longer do so.
Re:Meh (Score:1, Interesting)
No, I'm fairly sure you have that backwards.
PostgreSQL 6 was the first version. PostgreSQL 7 (which added foreign key support, apparently) seems to have been released around 1999 / 2000. It's been enabled by default ever since. Same goes for ACID - the PostgreSQL 7 line had proper transactions support, and I believe it may have had that from the first version (~1996).
InnoDB didn't show up in MySQL until late 2001. The default table type in MySQL is still MyISAM, which doesn't support foreign key constraints, and isn't ACID. InnoDB is... barely. It still acts as a dumb data store though - you can't really enforce any useful constraints on the data.
Re:Cool (Score:2, Interesting)
Re:Cool (Score:5, Interesting)
There is a difference between the engine checking a constraint versus a call into an interpreted language. One is doing less work. The other is doing more work. Which is ideal? Obviously less work is better. And that's before you even get into the PL/SQL code which is essentially doing the same work, but slower. Furthermore, all too often, triggers are called when there is no work to be done but you don't know that until the PL code decides this is the type of row change its interested, otherwise it should have really been a NOP. Whereas with the column trigger, the call to the PL code simply never takes place. So we not only save on the call but all of the wasted time inside of a trigger which ultimately decides its has nothing to do.
Also, when the trigger should be called, in a row trigger, triggers frequently must evaluate which columns have changed before it can even determine if it cares about this row. Should it then decide it does care about this row, likely you've already passed through a mass of CASE and/or IF/THEN/ELSE codes, which ultimately states yet more CASE and/or IF/THEN/ELSE to determine exactly what it should now be doing now that its decided it does need to process this row. Or, you can call a much smaller section of code which is dramatically simplified because one, its only called when its pre-qualified (saving the creation of much redundant code) and two, since its now pre-qualified, we can immediately get to performing whatever logic the trigger in question should do when the column in question has changed.
Those are worlds apart in performance, readability, maintainability. Not to mention the added granularity makes possible a reduction in the test matrix, regression tests, and even makes it more difficult (though far from impossible) to create a regression.
Re:Firebird is better (Score:3, Interesting)
Re:Firebird is better (Score:4, Interesting)
Its extremely ironic that you changed just as PostgreSQL become considerably faster than MySQL. PostgreSQL has always been far more scalable. To now hear you brag that you've never looked back at a superior and faster database because of your steadfast and likely false belief that MySQL is faster, is rather amusing.
One of the biggest problems with the MySQL user base is that they don't have any idea what "faster" means nor do they typically understand how to benchmark. Made worse, they constantly confuse speed with scalability. And made ever worse, most MySQL users take the MySQL benchmarks to heart when time and time again they are nothing but marketing lies. Most independent tests have historically had lots of problems even getting MySQL to stay running until the end. And when it actually does finish, its normally somewhere between the middle of the pack to dead last - and that's with all the other databases forced to use the lowest common feature which prevents them from using their advanced, much, much faster features.
The bottom line is, MySQL is popular because it has buzzword compliance for people who almost always don't know any better; but most of all, was readily available on Windows at a time when everyone was looking for a free database to go to. PostgreSQL is popular because it has both buzzword compliance, is far more feature rich, almost always out performs MySQL, and underscores, not to mention truly understands, what ACID is all about - while providing a very rich set of features which MySQL is unlikely to ever match. And that's ignoring that MySQL's optimizer absolutely stinks for anything but the most simple of queries.
The best rule of thumb is, think of MySQL as a really fast Access database. If you wouldn't use Access, ignoring database performance in the comparison, you should think really hard about using MySQL. There are so many superior and still free RDBMs compared to MySQL, its easy to see why so many get so frustrated when others insist on injecting an dramatically inferior solution into the equation, just because it has buzzwords.
Re:"Great leap forward" (Score:3, Interesting)
no foreign keys! no transaactions! no ACID!
One of the things that put me off mySQL some years ago was people both within the wider community and within the project team themselves seeming to claim that if you wanted such things you were doing things wrong. Not "we don't support that (yet)" but "you're being stupid" and if pressed the best you could raise them to was "here's a workaround that will achieve more-or-less the same thing with a chunk of extra work".
I may be about to be told I'm being wrong headed (and perhaps petty) here as no doubt the entire development team has changed several times over the years, but I can't quite shake the thought that maybe those features were implemented just to shut people up rather than because the team actually understood their importance to those asking for them. If I need/want those properties, other things being equal, I'd rather use a database that has had those properties baked in for much longer.
Re:Cool (Score:3, Interesting)
Re:Has the Documentation Been Improved? (Score:2, Interesting)
The fact he makes such a statement means two things. One, he's never bothered to look. Two, only interested in FUD'ing.
Of course, you are wrong. I've spent quite a bit of time in both SQL Server BOL and PG 8.1 manual. The PG manual may be one of the best OSS manuals but it is far from THE best manual. It's extremely weak in the areas of administration. And if you're brave enough to make a suggestion on one of the PG boards you are usually met with elitist resistance as if their acceptance of a suggestion is somehow a indication of failure. This is especially true if you reference a MS SQL example.
Re:Waiting for a capable PostgreSQL front-end (Score:3, Interesting)
I don't think SQLite belongs in that list. For most tasks I'd pick either SQLite or PostgreSQL, but I can't think of many applications where I would consider both. They are very different projects. The only time I have considered using both is when I wanted to have a large concurrently-usable data set stored in PostgreSQL and then a small single-user subset stored in SQLite on a handheld device. SQLite gives really great performance for single-user applications, but it lacks a lot of the more advanced parts of the SQL spec, doesn't really do concurrency well, and doesn't provide any kind of RDBMS, and is quite sloppy about data types. PostgreSQL provides a good implementation of the SQL spec and a lot of optimisations for massively multi-user setups.
There are cases when PostgreSQL is too heavyweight to be the correct solution, and SQLite is good for these. There are cases when SQLite is too simple, and PostgreSQL is good for these. There are no situations where MySQL is the right tool for the job. There are probably situations where you need something beyond what PostgreSQL can provide, and Oracle or DB2 would fit there, but I've not encountered any so I've not had the excuse to play with either.
Re:"Great leap forward" (Score:3, Interesting)
Re:Cool (Score:3, Interesting)
We aren't using triggers for your readability, your maintenance or your speed, we are using triggers and constraints to ensure our dataset is correct.
OP was bashing the programmers for always committing to hard drive, what was supposed to go there - and is going there in most MMOs - to me it sounded like the hardware setup was their downfall.
Re:Waiting for a capable PostgreSQL front-end (Score:3, Interesting)
I really just intended to remark on the attitude I see around database stories. Its worse than operating system prudes. You get the big database people sneering down at the mysql people and ignoring the fact that different applications require different tools.
Please don't use MSSQL (Score:3, Interesting)
If you are using hand-rolled SQL, most MySQL queries will execute on Postgres without much modification. However, MSSQL will be vastly different.
For example, look at these ugly MSSQL queries with explicit locking, which you will probably have to use as developers and DBAs can't seem to agree on a standard isolation mechanism:
SELECT COUNT(UserID) FROM Users WITH (NOLOCK) WHERE Username LIKE 'foobar'
and
UPDATE Users WITH (ROWLOCK) SET Username = 'fred' WHERE Username = 'foobar'
Also, there is no LIMIT / OFFSET keywords in MSSQL, you have to do crazy shit like:
WITH results AS (
SELECT
rowNo = ROW_NUMBER() OVER( ORDER BY columnName ASC )
, *
FROM tableName
)
SELECT *
FROM results
WHERE rowNo between (@pageNumber-1)*@pageSize+1 and @pageNumber*@pageSize
Source: http://stackoverflow.com/questions/187998/row-offset-in-ms-sql-server [stackoverflow.com]
You will soon realize that the Express version is super-limited (4GB max size, 1 GB ram, 1 core, no replication, etc.)
Source: http://www.microsoft.com/sqlserver/2005/en/us/compare-features.aspx [microsoft.com]
Postgres is highly tunable, but the defaults (that ship with many OSes) are for small footprints. This is an older document, but still relevant with explanations and the annotated config guide (bottom of page). Throw 8 cores and 16GB ram at Postgres, tweak the conf a tiny bit, and the feature set and performance will surprise you.
Tune Postgres: http://www.varlena.com/GeneralBits/Tidbits/perf.html [varlena.com]
There's no reason to use MSSQL unless all of your development and applications are on Windows, and your development team can't use anything other than their IDEs in a limited way. Once you start using Postgres, and realize the power behind it, you'll never want to use anything else.
If, for some strange reason, your company wants to spend money and buy DB support, go for a commercial vendor of postgres. Enterprise DB has some nice management features: http://www.enterprisedb.com/products/index.do [enterprisedb.com]