MySQL And PostgreSQL Compared

unicron writes: "PHPBuilder has an article, MySQL and PostgreSQL Compared." Everyone who has used these DBs knows the differences between them, and now that licensing isn't one of them, let's try to talk about where each excels and the other fails. I know people get almost as religious about their DBs as they do about OSes and programming languages, but let's try to get somewhere here and not just needlessly flame and rant, mmkay?
  • by Greyfox ( 87712 ) on Sunday July 09, 2000 @05:52AM (#947840) Homepage Journal
    DBase III wasn't a very good database program, but in its heyday millions of people used it and it got the job done for them. Even relatively inexperienced users could make use of it and write simple programs to manipulate their data. Even though it sucked, it was the right tool for a lot of jobs at the time.

    Compared to DBase III, both of these tools are excellent. I wish I'd had either one a decade ago when I started work doing Clipper programming for a dog track related publishing company.

    For the dog track application I would have preferred Postgres; the rollback support would be pretty compelling for an application like the one we were doing. For something where I was just kicking around a database (which I've also done a lot of), MySQL would be perfect. MySQL would be ideal in something like the RHS Orchid Registry, for instance.

    If application bigotry keeps you from choosing the right tool for a job, you will be a less valuable resource to those who employ you. Not too many people seem to "get" this. People are often surprised that I will, on occasion, suggest that Microsoft products are the best tool for what they're trying to do. Usually those people ask me expecting a "Windows sucks, use Linux" spiel, but if I think their situation warrants it (an inexperienced user who just wants to browse the web, word process, and send e-mail, or anyone who wants to play games at all), I'll tell them to use Windows.

  • It has always seemed to me that PostgreSQL leans in the direction of an Open Source professional database with a modern syntax, cutting-edge feature set, etc.... Basically, designed for the Oracle user looking for an Open Source solution.

    OTOH, MySQL seems to me to lean in the direction of the life-long linuxhacker/perlcoder who doesn't really care about the latest Corporate technologies, and just wants an effective and efficient method of storing some data.

    Just my view of the whole thing, of course. Overall, I like them both and I hope they both continue on side-by-side. We need both alternatives around.

  • I know this is a little late, but for anyone else cruising the back issues...

    Check out this PostgreSQL book [postgresql.org] (PDF). I read through it, and found myself going "Ah-HA!" quite often. It has sensible examples of SQL queries, including many of the advanced features.

    There aren't too many books that illuminate a subject for me as well as this one did. It's a free download, and (bonus) written using LyX.

    -- Dirt Road

  • What's so crazy about pictures in a database? BLOBs come in awfully handy when your on-line catalog has pictures of the product in it. It's a lot more efficient to pull a LONG RAW from your datafile than to do a system call that results in a linear scan through $CATALOG/images/product and then pipe the found file's contents out to the client. It's also a lot safer -- can you imagine the integrity nightmares that could be introduced when your image files can be affected from outside the RDBMS? I'd rather not add such tedious file management tasks on top of my present programming and DBA work.

    On a more mundane level, discussion boards, document storage/retrieval systems and their ilk often need to store more than 8k worth of characters per record. It's trivial to add a column called "part" and split data in/out of it, but a trivial thing done over and over and over again becomes a nuisance. Oh, you could chain images and other binary data together as well, but that doesn't sound awfully fun either, what with all them 0x00 characters showing up and confusing C's string handlers.
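    A sketch of that "part" column pattern, with hypothetical names and a chunk size chosen to stay under the 8k row limit:

        CREATE TABLE doc_part (
            doc_id  int,
            part    int,              -- chunk sequence number
            body    varchar(7000),    -- comfortably below the 8k ceiling
            PRIMARY KEY (doc_id, part)
        );

        -- The application splits on the way in and reassembles with:
        SELECT body FROM doc_part WHERE doc_id = 42 ORDER BY part;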

    --

  • Oracle 8's LOB management indeed allows you to put the LOBs themselves in a different tablespace than the rest of the row; the table's physical rows store pointers to the other tablespace. With another RDBMS you could do it by hand by creating a table containing nothing but a primary key and your LOB. Reference the PK from your "master" table and your full table scans are as skinny as ever.

    Putting the BLOB's data outside the RDBMS's management by putting it in external OS-managed datafiles makes referential integrity much more troublesome, and reverts transactions to a task handled explicitly rather than for free.
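    A sketch of that by-hand split (generic SQL; the LOB column type varies by RDBMS, and all names here are hypothetical):

        CREATE TABLE product_image (
            image_id  int PRIMARY KEY,
            img       blob                -- the LOB lives in its own skinny table
        );

        CREATE TABLE product (
            product_id  int PRIMARY KEY,
            name        varchar(80),
            image_id    int REFERENCES product_image (image_id)
        );

        -- Full scans of product never drag the image bytes along.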

    --

  • I know Slashdot has had test servers set up and asked people to troll away and try to crush the new server(s). Why not a Slashdot MySQL vs. PostgreSQL torture test? Invite everybody to hit two respective servers hard and see which chokes first. And also ask users to post comments on how fast the servers perform.
  • by Jerry ( 6400 ) on Sunday July 09, 2000 @12:31PM (#947846)
    When I was comparing the two (MySQL 3.21.33 and PostgreSQL 6.4) I noticed that the MySQL manual gave a list of SQL functionalities that were missing in it.
    • Sub-selects
    • Selects into a table
    • Transactions
    • Triggers
    • Foreign Keys
    • Views


    These aren't trivial weaknesses, and that says nothing about the syntactical differences between MySQL and ANSI SQL 92. I use these features every day at work. To implement them in code is possible, but it would be a royal pain. TRANSACTIONS, for example, can be worked around by creating duplicates of tables and batch processing into them. If all the updates on all the tables go OK, then copy the dups over the original tables. Not practical if the original tables are huge, but it can be done at night after backups are done.
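    A minimal sketch of that shadow-table workaround, with hypothetical names (MySQL-era syntax; indexes would need re-creating, so treat this as an outline, not production code):

        CREATE TABLE account_work SELECT * FROM account;
        UPDATE account_work SET balance = balance - 50 WHERE id = 1;
        UPDATE account_work SET balance = balance + 50 WHERE id = 2;
        -- if any step failed: DROP TABLE account_work and start over
        DROP TABLE account;
        ALTER TABLE account_work RENAME account;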

    It also listed functions which "exist" only for compatibility (i.e., are not functional):
    • GRANT
    • CREATE INDEX
    • DROP INDEX
    • ALTER TABLE

    After that list of don'ts I decided to check PostgreSQL, especially since TRANSACTIONS were missing and MySQL said supporting them would require a 'completely different table layout than *MySQL* uses today.' I understand that they have employed a wrapper technique to supply 'TRANSACTION'-like functionality. I've had experience with extending functionality with wrappers, and they are generally not a stable or acceptable solution.

    PostgreSQL has everything MySQL is missing. It has a very powerful set of operators (standard, numerical, time-interval, geometric) which can be invoked by name. It also has a similar set of functions which can be used to increase the functionality of the operators. Postgres allows attributes of an instance to be defined as fixed-length or variable-length multi-dimensional arrays, either of base types or user-defined types. (Not exactly like Pick's AMV, but close.) I seldom see folks mention a powerful feature of PostgreSQL that MySQL doesn't have: inheritance.
    From their documentation:


    CREATE TABLE cities (
        name        text,
        population  float,
        altitude    int     -- (in ft)
    );

    CREATE TABLE capitals (
        state       char(2)
    ) INHERITS (cities);
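    A usage sketch (hedging on the era's syntax: in the Postgres of this vintage, a trailing asterisk on the table name pulled in rows from child tables such as capitals):

        SELECT name, altitude FROM cities* WHERE altitude > 500;  -- includes capitals
        SELECT name, altitude FROM cities WHERE altitude > 500;   -- parent rows only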



    That's why I chose PostgreSQL over MySQL.

  • <<Actually, I don't think I made my point clearly. I agree that guestbooks and applications on that level - which may involve little more than a table or two - are well-suited to MySQL/Postgres, and I agree that there are a lot of those out there. >>

    Unfortunately the big commercial databases (read: Oracle and DB2) have some critical limitations of their own.

    I work at a company that has dozens of tables, some of which have tens of millions of rows. We're a web based company and we serve millions of dynamically generated page views per day. We use MySQL, and are debating our options for the future right now.

    In the past we looked at DB2 (ported our flagship product to it for testing), and we've spoken with people who've used Oracle -- the conclusion we've come to is that for certain classes of problems, performance is just unacceptably bad for Oracle and DB2.

    Specifically, write performance is a problem. Now, MySQL isn't uniformly better: table-level locking is a SERIOUS problem, but at least there is a window of load where write performance is acceptable, and some tuning of your application can usually alleviate related problems quite well.

    In our test of DB2 versus MySQL, using a C-based client in a fairly "real world" scenario (read: access patterns similar to our product), DB2 was only able to get 9 inserts per second into a crucial table. If we turned transaction logging off (which the DB2 guys said was a Very Bad Thing -- and it did in fact cause massive problems) we got that up to 31 inserts per second. We tweaked and tuned things with help from IBM's engineers.

    MySQL consistently got over 1,000 inserts/sec on the same hardware, with the same schema, and the same access patterns.

    Also, the time it takes to establish/tear down a connection to the database can be an issue. Connection pooling helps, but if you've got a situation where you have more clients than possible connections, and not all of the clients will always need a connection, then you can get into a situation where clients that *do* need a connection have to wait because clients that *don't* need one still have the connection in their pool. In such a case, if you are using MySQL, one can get better performance by turning connection pooling *off*. That's not a *common* occurrence, but certain classes of web applications can experience it.

    Also, some features of MySQL can obviate the need for some of the "advanced" features of the commercial databases. Consider "views". As I understand it, there are two basic reasons to use a view:
    1) To provide finer-grained security than is otherwise attainable with table-level security (essentially, per-column security). MySQL already supports per-column access privileges, so this use of views is unnecessary.

    2) To allow/simplify some complex joins. I'm told by an engineer here with Oracle experience that left joins with multiple unprotected tables and one protected table aren't *directly* possible with Oracle -- the solution is to use views. You want to join A, B, C, and D with only A being protected. Make a view called A1 of A and B, make a view called C1 of A1 and C, and then left join C1 and D. MySQL supports left joins with multiple unprotected tables very easily -- so no need for views here.
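    A rough sketch of that view-chaining trick in generic SQL (all names are hypothetical, and Oracle 8 would actually spell the outer joins with its (+) syntax rather than LEFT JOIN):

        CREATE VIEW a1 AS
            SELECT a.id, a.col_a, b.col_b
            FROM a LEFT JOIN b ON b.a_id = a.id;

        CREATE VIEW c1 AS
            SELECT a1.id, a1.col_a, a1.col_b, c.col_c
            FROM a1 LEFT JOIN c ON c.a_id = a1.id;

        SELECT c1.*, d.col_d
        FROM c1 LEFT JOIN d ON d.a_id = c1.id;

        -- MySQL lets you write the whole thing directly:
        SELECT a.col_a, b.col_b, c.col_c, d.col_d
        FROM a
        LEFT JOIN b ON b.a_id = a.id
        LEFT JOIN c ON c.a_id = a.id
        LEFT JOIN d ON d.a_id = a.id;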

    Obviously these features aren't applicable in 100% of cases, but in a great many situations, the "limitations" of MySQL really aren't as bad as they sound.

    And just for the record, as is briefly noted in the article, transaction support and page-level locking are supported in the 3.23.x development series using Berkeley DB as the table handler. You have a per-table choice of which handler you want: ISAM/MyISAM is the "normal" type (fast, disk-based, no transactions, table-level locking); Heap (very fast, memory-based, no transactions, table-level locking); and BDB (somewhat slower, full transactions, page-level locking).
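    A minimal sketch of that per-table choice in 3.23.x syntax (table and column names are hypothetical):

        CREATE TABLE orders (
            id     INT NOT NULL PRIMARY KEY,
            total  DECIMAL(10,2)
        ) TYPE=BDB;     -- transactional, page-level locking

        CREATE TABLE hit_counter (
            page  VARCHAR(64),
            hits  INT
        ) TYPE=HEAP;    -- memory-based; omit TYPE to get the default MyISAM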

    -JF
  • by Anonymous Coward
    I agree, and can add more:

    Large database rows greatly increase I/O during table scans. For instance, if your query references a column that isn't indexed, the DBMS must scan through the entire table to perform your query.

    If each row is huge, that scan is going to take significantly longer.

    So kids: keep the database record size small, and off-load large objects into BLOBs or build your own references to them. We'll all be happier.

  • Very interesting. I must admit, based on what I've heard and read, I wouldn't have expected MySQL to be capable of supporting "dozens of tables, some of which have tens of millions of rows [...serving] millions of dynamically generated page views per day." You guys should do a story on your setup and submit it to Slashdot - I'm sure there'd be a lot of interest.

    Regarding your write performance problem, I've never come across a performance problem that couldn't be solved, although in some cases that might involve custom solutions outside the database. Depending on the nature of the application, you could consider using something like a server program with either memory-resident data or fixed-format data files (perhaps even using MySQL for this limited application), to act as a buffer between the web server and the database. Of course, a lot depends on the details of your application, which I don't know.

    I've implemented something like this for a system that had to log information from multiple web sites, where the incoming data rate often exceeded what the database was capable of, and it worked very well. In that case, the data was being accessed in a limited number of predefined summary forms, so maintaining consistency between the external server and the database wasn't a problem. YMMV.

    BTW, views are useful for many purposes other than those you suggest. For example, as a large schema evolves, views can be used to support legacy code with the table structure they expect, when the underlying tables may have changed or been split into multiple tables. They can also be very useful during development as a way of reusing queries, i.e. a particular multi-way join that gets used repeatedly can be implemented as a view and then used in other queries. Finally, views can support standard ways of browsing and querying the data using client tools, by users or support personnel who would find the highly normalized underlying schema difficult to deal with.

    The feature I suspect I would find most difficult to live without, using MySQL, is enforcement of referential integrity. Also, what about stored procedures? I have the impression it doesn't support them? In more complex systems, more work is done in stored procedures that feed the results of one query into subsequent queries than is done by stand-alone queries, so doing without stored procedures would be a crippling problem. But I admit I'm now talking about far more complex applications than the target audience for MySQL.

  • You rock!

    --
    Max V.
  • People who recommend MySQL because in some unusual circumstances it is fast enough are as responsible as people who tell you that you'll go even faster on your motorbike if you take your helmet off, take your clothes off, and cover yourself in oil ("helps the aerodynamics, you see"). You might go faster, but you'll die when you crash. In the database world this is known as losing data or having a database corruption. Oracle has whole galaxies of code just to handle journaling (so you don't ultimately lose data); MySQL has nothing.
  • Well, there's a difference between "I can get by with what I know just fine, thanks" and "Most of the complicated queries you might do are better expressed in a proper programming language" ...

  • And don't forget to mention that you can optionally write Postgres stored procedures in Perl.
  • >The Linux zealots don't have much room to brag over this incident.

    No matter what OS you use, if you don't run things right you are in for a world-o-hurt...
    I mean, why do you think people call on you in the first place? It's because you get the job done right.

    The deal is, when you don't do backups you recreate EVERYTHING from memory. That sucks.
    There are many other stupid things you can do (like not setting up a proper redundancy cluster so you never go down to start with) that can really hurt your ability to go back on-line.

    It's unfair to gasp and point at a single case and say "See.. problem..." You can screw up anything.
    Remember eBay a while back? They use Sun UltraSparcs running Solaris. Top-of-the-line hardware. A tiny mistake on eBay's part and BOOM, hard crash.
    The fault? Was it the SQL server? No... Was it the hardware? No... it was (and eBay admits this) eBay's own mistake. They set it up wrong and they took themselves down.

    Honestly, you can make NT reliable with work. It's just no simple matter. Linux has more ways to solve it when something really horribly wrong is found in the operating system.

    With BSD the chances of finding something really horribly wrong are low. You can patch it quickly yourself. Chances are good an official bug fix will be out shortly, but not soon enough.

    With Solaris the chances of finding something really horribly wrong are slim. You won't be able to patch it unless you've got a source code license, and Sun's not doing well in the timely-bugfix department anymore.

    With Windows NT the chances of finding something really wrong are pretty high. You can't fix it. But Microsoft is getting better at timely bugfixes. In the meantime you can simply disable the defective feature. Not exactly the kind of thing you want to do.

    With Linux the chances of finding something really seriously defective are fair. It's more likely you'll hear someone ELSE found it first and fixed it.
    But people have this habit of not updating Linux software much, and that's really not smart. Linux boxes should be updated as often as possible.

    When you get into hardware problems, the first question to ask is "why no redundancy?"... any delays in getting back on-line can usually be attributed to poor management, not a poor OS.
  • <<You guys should do a story on your setup and submit it to Slashdot - I'm sure there'd be a lot of interest.>>

    I'll look into it, but I doubt I'll get the go ahead for such a thing...

    <<Depending on the nature of the application, you could consider using something like a server program with either memory-resident data or fixed-format data files (perhaps even using MySQL for this limited application), to act as a buffer between the web server and the database.>>

    Actually, our Oracle-experienced engineer says that this is what he had to do for (I think) AT&T because of Oracle's abysmal write performance.

    However, this depends on a couple things...

    First, that your system traffic is cyclical in nature, having a period of time where the traffic on the system is lower than the write performance of the database (the catch-up period) that is proportional to the period of time where write-performance is exceeded.

    Second, that you don't immediately need the data, or that you be willing to institute a failover lookup mechanism that allows you to check the buffer for pending data.

    For us, our performance cycle is narrowing as we start handling more and more international clients. And for our application, delaying commitment of data for a few hours is just unacceptable.

    The solution we're considering would best be described as "perpendicular" to this line of thought, although I can't really go into much more detail.

    <<For example, as a large schema evolves, views can be used to support legacy code with the table structure they expect, when the underlying tables may have changed or been split into multiple tables.>>

    Eek. In a perfect world one would not be re-normalizing schema periodically *after* you go to production... But it does happen. One would think however that to do so, it would be wise to review dependent code to eliminate "bit-rot" anyway.

    <<They can also be very useful during development as a way of reusing queries, i.e. a particular multi-way join that gets used repeatedly can be implemented as a view and then used in other queries.>>

    I'll concede this one... Although most of the situations I've encountered involved either fairly simple queries, or complex queries that were put together very dynamically. (picture a natural join that can have between 3-10 tables and *lots* of context-dependent conditions...)

    <<Finally, views can support standard ways of browsing and querying the data using client tools, by users or support personnel who would find the highly normalized underlying schema difficult to deal with. >>

    Only if they are using a generalized table-browser -- yech! A web-based front-end application is usually pretty trivial to put together, and generally far more intuitive to use. Although again, I can see instances where this makes sense...

    <<The feature I suspect I would find most difficult to live without, using MySQL, is enforcement of referential integrity.>>

    It's one of those sacrifice-features-for-performance type of deals. If you have transactions, it shouldn't be necessary. If you don't have transactions, then you'd better be darn careful. :) Yes, it can be annoying, but it's certainly not a critical failing...

    <<Also, what about stored procedures? I have the impression it doesn't support them? In more complex systems, more work is done in stored procedures that feed the results of one query into subsequent queries, than is done by stand-alone queries, so doing without stored procedures would be a crippling problem. But I admit I'm now talking about far more complex applications than the target audience for MySQL.>>

    Never used them myself... But yeah, having them would be nice...

    -JF

  • Considering the whole point of MySQL is to sacrifice the heavier features of SQL and "Enterprise" RDBMSs in the quest for speed, it'd be amazing if it weren't faster than Oracle.
  • Sorry, couldn't resist: In what direction is it going?
    --
    Change is inevitable.
  • Yeah, we have an Ingres2 DB the size of China, but this was just basic accounting (she kept a numbered list of purchase orders). She's got so many, it needed to be either (1) written into Ingres or (2) moved to something new. I use Access for the PC inventorying (there are only 100, so it's okay) and my purchase requests, so it was simple to just copy the PR one and modify it to use her numbering system. If I was a real programmer I'd use one of the DBs they're talking about in this forum, but I'm not, so the users get Access.

    My only issue with using it for something bigger is that people like me who aren't DBAs can forget that Access doesn't scale well. It gets tougher and tougher to justify using it as it grows in size and number of users. My using it for her is basically just a stopgap so the overworked DB progs can find a real solution.

    I said something semi-positive about MS. Who the hell modded it up?

    -jpowers
  • You could just store links to the pictures you know!

    Sure - but now when I do a DB backup I get the complete site content in one big file. Text, pictures and info all packed up in a neat package: I like this :). Plus, integrity is better because erasing or adding a picture is a single operation (vs. two operations with a link, since you have to erase/save the file after erasing/adding a row).
  • by Anonymous Coward
    It's nice to see PHPBuilder/LinuxStart back online after, what, nine days of downtime? I hope they don't blame *this* on Microsoft!

    From the staff and management... There is no worse feeling for an Internet professional than when a site goes down -- except maybe the feelings of the loyal customer who relies on that site. As many of you know, we suffered through a massive hardware failure that essentially wiped out LinuxStart and crippled our mail and Web capabilities at a time when we were transitioning to a new disaster-recovery plan. Of course, disasters always happen at the worst possible time. The LinuxStart team has been working around the clock to restore the site and to make sure that such a disaster won't occur again. Meanwhile, please accept our apologies. We're working hard to make LinuxStart your complete source of Linux information. We value your participation in the site.
    -- Kevin Reichard [mailto], managing editor, Linux/Open Source Channel, internet.com

    We are aware that some features of the site are not yet functional, but we wanted to get the basic search and directory features online as quickly as possible. Please do not report errors at this time, because we are already working on most of them. Thank you for your patience as we continue to rebuild after a massive hardware problem.
    -- The LinuxStart webmasters [mailto]
  • by wik ( 10258 ) on Sunday July 09, 2000 @06:04AM (#947861) Homepage Journal
    The article was listed as a column, not a full-disclosure benchmark report. I'll let the author slide on those points, but here are some things to think about when running a benchmark:

    • He gave a very vague description of the layout of his database and the nature of the queries. Without that, it is impossible to reproduce his results.
    • He did not describe his client/driver configuration; for all we know, the clients could have been programs running on the same machine as the DB server.
    • For database benchmarks such as TPC-C (you can see the stringent reporting specifications at www.tpc.org [tpc.org]) it is common to allow the driver program to warm up the database for a long time (45-60 minutes seems to be common now, depending on the size of the database), then allow the database to reach steady state over at least 20 minutes. He does not mention whether he gave the database time to stabilize. Transient numbers are not a good indication of actual system performance.
    • His reported statistics are lacking. Saying that PostgreSQL is 2-3 times faster than MySQL leaves a 50% uncertainty in his measurements. At times he reports 2-3, and at others he reports simply 3. In fact, I don't see any solid measured numbers for queries per second, transactions per second, or any other metric.
    Granted, most websites running either of these software packages are probably not going to see very heavy loads, but if you're deciding between two vendors, it's good to know exactly *what* you are comparing.
  • If you're putting pictures and stuff into a database, I think you're really making things harder and slower than they need to be... it's better to insert a path to a picture on your hard drive than to put the whole thing in (the longer the row, the slower the select).

    btw.. for anyone interested.. check out:

    http://www.naken.cc/mp2p/

    My project that converts PHP scripts written for MySQL into PHP scripts for Postgres :)
  • by wik ( 10258 ) on Sunday July 09, 2000 @06:11AM (#947863) Homepage Journal
    Try the TPC-C reports here [tpc.org]. These are not specifically comparisons between vendors, but they are all reports of systems running a common benchmark. The reported benchmarks all went through approved TPC audits before being released. (On an aside, Microsoft just withdrew their results on SQL Server 2000 because they failed the audit, it's on ZDNet somewhere, I don't have the link).
  • How do you do a SQL dump of BLOBs? I use a web hosting company with MySQL so I don't have root access. I would like to make a daily back up of my photos. Thanks.

    Neil
  • I'm not exactly sure what is done under MySQL, but elsewhere you would just make copies of your data files, assuming the files are set read-only or off-line or in a backup state.

    --
  • How about get_lock('mytable.field1=2345',5);

    Yes it's application level, but I never claimed it's database level locking.

    And you can do even better than row level locking, just use your imagination. There are so many ways you could use that arbitrary string.

    get_lock would also be very useful when you are trying to insert a unique value into a table, but you don't want to lock the whole table and you also don't want to raise an error. The latter case is important for PostgreSQL -- any error forces rollback of the entire transaction!

    Unfortunately PostgreSQL doesn't have a get_lock. I've asked the developers about that, but I don't know if they're interested enough. AFAIK the PostgreSQL dev learning curve is steep, so I don't think I can roll my own -- especially since it involves transactions, etc.
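    For reference, a sketch of the row-lock simulation (lock string, timeout, and table are arbitrary):

        SELECT GET_LOCK('mytable.field1=2345', 5);    -- returns 1 if acquired, 0 on timeout
        UPDATE mytable SET status = 'done' WHERE field1 = 2345;
        SELECT RELEASE_LOCK('mytable.field1=2345');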

    Cheerio,
    Link.
  • >now that licensing isn't one of them... Uh, last I saw, PostgreSQL was under a BSD-style license, while MySQL was under the GPL. CmdrTaco may not care about the difference, but I don't think it's fair to say that the licensing is the same.
  • SOLID is a nice lightweight database that supports transactions, and even some load-sharing/replication (though I've never used this particular feature). They also have a good, reliable JDBC driver (for you Java fans out there), and "traditional" C-based client libraries for about a dozen platforms (Win32, FreeBSD, Linux, etc.)

    Solid Information Technology [solidtech.com]

  • What exactly is being kept atomic, if not an ordered set of accesses?

    From the MySQL site:

    " It only means that you can be sure that while each specific update is running no other user can interfere with it and that there will never be an automatic rollback (which can happen on transaction based systems if you are not very careful). MySQL also guarantees that there will not be any dirty reads."
    You can also read more about how to cope without commit/rollback [mysql.com] on the MySQL site.
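    As I understand the approach described there, it boils down to explicit table locks around the multi-statement update; a minimal sketch with hypothetical names:

        LOCK TABLES account WRITE;
        UPDATE account SET balance = balance - 50 WHERE id = 1;
        UPDATE account SET balance = balance + 50 WHERE id = 2;
        UNLOCK TABLES;
        -- Unlike a real transaction, a failure halfway through leaves the
        -- first update in place; there is no rollback.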

    What it really comes down to is that MySQL is a much lower-level database than, say, Sybase. Sybase gives you a lot of high-level tools for doing in-database programming. MySQL sacrifices this for speed, and thus you actually have to understand what you're doing in order to program for it correctly. If you do know what you're doing, there really isn't any end result that you cannot achieve (including arbitrary levels of data integrity).

    The question is very similar to: do you want to code in Java or C? Java provides all sorts of safety mechanisms so that stupid programmers don't write elevator control programs that turn the brakes off and drop the cable. C is fast enough to write large systems in and still have them respond in faster-than-geologic time. Same goes for high-level databases and MySQL. MySQL gives you plenty of really nice rope with which nearly any knot can be tied, including a hangman's noose. C compilers also tend to be a hell of a lot easier to install and maintain than a Java VM/RTE, which makes the comparison even more accurate.

  • I certainly got my start on MySQL as did most others I guess. Truth be told, I got my real Unix DB start using mSQL but that's another story.
    MySQL for many people is a great start. The reason there are so many MySQL installs out there is because it's just so darn easy rto get started with. I should no, I wrote a tutori al [lycos.com] for it. The install is simple, accessing it is simple, and the SQL and data type are enough to get you off and running pretty easily.
    Postgres on the other hand is, for the novice, more difficult to install.
    While both DB's have an odd user auth system, postgres' is just that little bit more odd for the new user.

    Having said all that, postgres I think is the best introduction for the user wanting more than MySQL but not ready for the commercial DB's. As alreqady pointed out, it has (finally) foreign key support, sequences, transactions, sub-selects and so on. I've found using views has been amazingly handy coming from MySQL. I can take a very complex join and tidy the whole thing up into a view. One criticism is that outer joins aren't implemented and you have to have something together using a UNION.
    The other handy feature I like in postgres are the functions. Again they're not easy to get started in but you've got a choice of the language you want to write the thing in including C and tcl as well as pl/sql.
    Finally my other joy with postgres is large object support. There are innumerable postings to mailing lists about how to store images and other binary data in MySQL. The usual reponse is to not do it and to merely store the path to the file. The lo_import and lo_export functions in postgres are simple and easy to use and accessing them via PHP is a snap.
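    For anyone who hasn't seen them, a minimal round trip looks something like this (the paths are hypothetical, and lo_import/lo_export run server-side, so the files live on the database host):

        CREATE TABLE image (name text, img oid);
        INSERT INTO image VALUES ('logo', lo_import('/tmp/logo.jpg'));
        SELECT lo_export(img, '/tmp/logo_copy.jpg') FROM image WHERE name = 'logo';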

    Cheers,
    Graeme
  • Licensing IS one of the differences:
    GNU GPL and BSD for MySQL and PostgreSQL, respectively.
  • Believe me, I know what the advantages of commercial-grade RDBMS systems are; I've been around the beasts long enough. And I do appreciate that PostgreSQL/MySQL don't have what it takes in a lot of applications, even when commercial DBMSes have price tags that can make a grown man cry.

    The point of the original poster was pretty simple: there are a lot of applications out there that don't need an RDBMS with a gazillion features. I suppose the guestbook example was a wee bit simple, but that's definitely not everything. Look here [slashdot.org] for a great example of a bit more complex application ;-)
    It could be called "wrong" to just use PostgreSQL/MySQL because they're open sourced (although that could be marked as -1 Flamebait ;-) when the features of the DBMS don't fit your needs.

    All I'm saying is that you really don't need a complete state-of-the-art toolbox with a lot of toys when a simple hammer will do. And they'll do just fine in a lot of cases anyway.


    Okay... I'll do the stupid things first, then you shy people follow.

  • Excuse my apparent lack of knowledge on the subject, but are there that many differences in the databases? I mean, isn't SQL SQL?
  • What the heck is going on? He's making some very valid points; why is his post marked as a troll?

    Anyway, it really annoys me that they don't have, nor ever plan to support, foreign keys (last time I read their documentation). The "why NOT use foreign keys" section comes off as just an excuse not to implement them, not a valid technical argument. Especially when there are high-performance databases that have that feature and seem to work very well.
  • The most interesting thing about my test results was to see how much of a load Postgres could withstand before giving any errors. In fact, Postgres seemed to scale 3 times higher than MySQL before giving any errors at all. MySQL begins collapsing at about 40-50 concurrent connections, whereas Postgres handily scaled to 120 before balking. My guess is, that Postgres could have gone far past 120 connections with enough memory and CPU.

    Ouch... very ouch. 40-50 concurrent connections *ISN'T* very high. MySQL is pooping out on that many connections? How so? Slashdot is running on MySQL, and I'm sure Slashdot is getting more than 40-50 concurrent trolls^H^H^H^H^H^Husers at a clip at times. Postgres went far past 120 connections... This puts MySQL in the dust. If the connections are important, and you can deal with a data limit of 8k per row, then Postgres is your man, I'd say, after the article. (Postgres is fixing that limit in 7.1, accordingly...) However, I still believe, from a little experience, that MySQL runs nice and is more reliable in many ways than Postgres was...

    I don't know if it's just religion, but I'd go MySQL......

  • Just a thought. This guy is talking about connections to the database and the limits on them. For me this implies that for each request to the database, he starts up a new connection, and when the query is over, it kills the connection, and the database is open for a new connection from another user.

    Shouldn't you program the webserver to have a few connections open, and keep them open?

    Philip Greenspun discusses this in his book "Philip and Alex's Guide to Web Publishing".
    From the book...

    "For each URL requested, you save

    + the cost of starting up a process ("forking" on a Unix system) for the CGI script.
    + the cost of starting up a new database server process (though Oracle 7 and imitators pre-fork server processes and many new RDBMS systems are threaded)
    + the cost of establishing the connection to the database, including a TCP session and authentication (databases have their own accounts and passwords)
    + the cost of tearing all of this down when it is time to return data to the user"

    What is missing here? Shouldn't the webserver have connections pre-opened to the server and keep them open, so the number of connections stays static? That would make any discussion of how many connections a DB can handle irrelevant; in fact, discussing it shows a lack of understanding of how database connections should be handled.

    If you wanna discuss this with your fists, you'll have to come to Sweden.
    I'll meet you at Arlanda Airport, and bring your own thermal clothing.
    I'll teach you how to handle the polar bears, and the Swedish bikini team!
    Any questions??
  • While I appreciate the author's opinion, the discussion needs to move beyond how they compare to each other.

    First, I agree with several of the replies indicating this was not a benchmark, but a starting point in the discussion. Not to undermine the effort taking place, but a benchmark is normally specific, attainable, and measurable, and it must be reproducible. More detail would need to be provided, but that is beside the point right now.

    With commercial benchmarks available, we should be comparing MySQL and PostgreSQL to DB2 and Oracle. If we get open source to a level where we can play on that field, we are well on the way to success. Comparing them to each other is like comparing high school football players: it may be valuable in the next game, but it doesn't indicate how they will do in the pros.

  • Foreign keys, views and transactions are just cool? Statements never fail? I'm just wondering where this guy studied RDBMS 101. At marketing college?
  • I must admit that the lack of sub-selects was one of the biggest things to bug me when I first started using MySQL. I got around it by looping through queries in my code. Maybe a bit less efficient, but it works.

    I use MS SQL Server 7 in my job, but still dabble with MySQL. I find it a lot easier to use and administer.

    My point? Not sure... I dabbled with Postgres and found the learning curve a bit steeper compared to MySQL. It could be argued that it's a more advanced/complicated product, so that's to be expected. I don't want to start a flame war! For what I need to do, MySQL fits the bill perfectly. I'm an ardent fan!
  • My experience with MySQL is that anytime the box goes down unexpectedly (the coloc'd box I have gets monkeyed with by coloc personnel about once every 60 days), I get corrupted indices. I may be incorrect about this, but it seems part of the performance gains in MySQL are the result of not calling close/open to force the disk cache to flush to disk, something that could easily be done as part of an idle task.
  • Yes, SQL is SQL but there are 2 important points:
    1) Not every DBMS supports all of the standard SQL command set. Most support an extended subset.

    2) More importantly, each is implemented in a totally different way. Saying there are no differences just because they have similar command sets is like saying all cars are the same because they have a steering wheel and 3 foot pedals.

  • Here's an article that explains why MySQL is not a real RDBMS since its support for transactions is lacking.

    This is just wrong. Not all RDBMSs are ACID; thus MySQL is an RDBMS, but it is not ACID. Of course, if you have a look at the MySQL site [mysql.com], there's some good commentary on why transactions are not required in the general case, and specifically are not included in MySQL. What it comes down to is that the ACID definition makes an assumption: atomicity is required in RDBMSs (I don't agree, but am willing to concede that it's required for most RDBMS applications), and transactions are the correct path to atomicity.

    This latter assumption is argued against by MySQL. MySQL supports atomicity without transactions. This flies in the face of traditional RDBMS dogma, which certainly does piss a lot of people off. But, in the final analysis, what did you need transactions for? Sure, I can come up with the pathological case, and that's when you use Sybase or Oracle or PostgreSQL or any other transactional RDBMS. On the other hand, for 99.9% of the cases, atomicity is sufficient, and for about half of those, more than sufficient.

    Please, don't try to get people to believe that transactions are required in order to a) be an RDBMS or b) meet the needs of business. It's really very confusing to most people and totally unnecessary. What's more, I find that the following things are required in most circumstances, and very few products besides MySQL supply them:

    • Ease of installation (under Red Hat Linux, the MySQL install consists of downloading 3 files and typing "rpm -Uvh *.rpm"). Compare this to the ultra-frustrating pseudo-user-interface of the Oracle install. Even PostgreSQL was a major pain to install (though nothing comes close to Oracle).
    • Ease of administration. MySQL keeps each table as several on-disk files. This means that if you want to move the indexes for a particular table to another disk, you type:
      • mv tablename.MYI /another/disk/somewhere
        ln -s /another/disk/somewhere/tablename.MYI .
    • Good tools. MySQL comes with a simple command-line SQL program, and just to point out the value of MySQL by a single example, this tool has command-line history and editing (through the readline library) as well as tablename completion. Why does Oracle's sqlplus (or sqlminus, as I prefer to call it) still not have this?
    • Speed. This is really the killer. MySQL is the fastest database out there. Except in some very pathological cases, MySQL is generally 2-10 times faster than other databases for given operations.
    The real question is: why are we still having this debate? If you're running an open-source-friendly environment, you should be using PostgreSQL or MySQL. Look at the features of both and decide what makes sense for you. In most cases, I think that choice should be MySQL, but hey, the right tool for the job should always be the top concern.
  • by Anonymous Coward
    Not quite. The GPL is used by original authors who want other people's code changes released, if those people distribute a modified binary. In a sense, it is saying, "If you want to modify my code and distribute the binary version to someone, you also have to let them have the source code to it." The original author may choose to grant exceptions, letting there be a closed-source fork. And the original author may just change the license if she so desires.

    The GPL is really for the author's protection: he gives out the code in the expectation that people's changes to his creation will eventually propagate back to him. Sounds fair to me.
  • by &lt; ( 159412 ) on Sunday July 09, 2000 @06:13AM (#947884)
    MySQL doesn't have referential constraints. MySQL doesn't have subqueries. MySQL doesn't have triggers. MySQL doesn't have stored procedures. Their manual has a section that explains why "NOT to use foreign keys". Seems to me like they're ignoring the SQL standard and will continue to do so in the future. As for being "fast", that's only relative. And if you leave out a large number of features supported in a real SQL implementation, then I guess it would be a little faster.
  • Oh, and how could I forget views: MySQL doesn't support views either.
  • How can you compare the two databases? They offer two different types of capabilities. MySQL is, and always will be, the fastest simple-query database on the market, because it doesn't support ACID transactions. The server doesn't need to worry about affecting other transactions or data integrity; it's your job to implement that via your code. PostgreSQL, on the other hand, supports ACID transactions and is a great free tool if someone wanted to do ERP-style storage.
  • I notice that postgresql has an 8k rowsize limit. Back when I was using MS software, I had occasion to try MS SQL 6.0. I messed with it for a while until I found that it had a 2k rowsize limit. Utterly ridiculous, especially since there was overhead for a row (index stuff or something) so the effective limit was around 1800 bytes.
    --
  • *My* car only has two pedals ;)
  • by paulrussell ( 209182 ) on Sunday July 09, 2000 @06:28AM (#947889)

    > my sites all have whole articles, pictures, etc... into a MySQL database.

    (Warning: DBA in a previous life - expect pain ;)

    I really wouldn't suggest placing text articles, images or other large objects into a database tuple. Modern databases (including PostgreSQL; not sure about MySQL) have support for 'BLOBs' (Binary Large Objects), which allow you to associate very large files and objects with a particular tuple.

    The BLOBs are stored as part of the database, but not within the relation itself. This works a *lot* faster than putting the images or text inline with the tuple, and is, I suspect, partly why the folks at PostgreSQL haven't yet fixed the 8kb limit.

  • Perhaps something like Sybase's SQL Anywhere
    Ahh, SQL Anywhere. How many times has that thing been renamed?
    From Watcom SQL to SQL Anywhere, to Sybase SQL Anywhere, to Adaptive Server Anywhere, and now back to SQL Anywhere Studio.

    Very good product though.

    --

  • SQL (as a language, not a feature set) is designed for set processing. A particular SQL implementation is heavily optimised for the data store implementation used in its RDBMS.

    I'd think you'd be crazy not to write as much of your application in SQL as possible, even though it's ugly lookin' and it will never be more than a portion of your code base. Ignoring a powerful and efficient tool like that seems foolhardy. (Note that I haven't used PostgreSQL or MySQL, so perhaps their dialects are too limited to be as useful.)

    Trying to write the loops to manage set processing in a "proper" language will almost always be more inefficient and error-prone. Not to mention the insanity of running that code interpreted in Perl/PHP/ASP/whatever on your webserver.
  • I'll agree about the lack of support. I've been working on a PostgreSQL & PHP based site for myself, but basically there's nothing around for that combination. I guess this is more of a beef with PHP, since all the support seems to be aimed at MySQL. I'm trying to fix that with my page [50megs.com]. I've had a few people comment about it, so the interest is there; I'm trying to do something about the support in that area. I can't comment about the Postgres support team.
  • One reason why PostgreSQL is slower is that by default it does an fsync on every commit (assuming you updated/inserted data).

    MySQL doesn't do an fsync; it lets the operating system decide when to do it.

    When I turned off the automatic fsync on PostgreSQL, the hits/sec on my web app went from 8-13 hits/sec to about 50-70 hits/sec.

    My web app uses persistent database connections, and I benchmarked using ApacheBench (ab). The system was a Dell PowerEdge 1300, PIII 500MHz with 512MB RAM.

    I turned fsyncs back on, because I'm paranoid about losing data. AFAIK the way you do data recovery on PostgreSQL is to run pg_dump, dump all the data out, and then reload it back in.

    I have not ported that webapp to MySQL, but similar webapps using MySQL will do about the same hits/sec.

    Database level transactions make things more convenient when writing web applications, but their effectiveness in solving transaction issues is reduced drastically by the fact that it's difficult to get a database level transaction to persist beyond a single webpage load, and in many cases inadvisable for various reasons. So you usually have to resort to application level transactions for the real transaction stuff, or accumulate userdata in webpages (hidden fields, cookies etc) and only submit the data to the database in the final webpage (ick!).

    What I like about MySQL is the get_lock function. This allows custom application level locking, which is very useful. It allows you to lock on an arbitrary string. This means you can do arbitrary level locking - e.g. simulate row level locks, or super high level locks.
    e.g. get_lock("Initiating data migration",50)

    Cheerio,
    Link.
  • I am a Unix system administrator. As any sysadmin, I care about reliability and redundancy. I want to give my opinion as someone who is NOT a developer. The fact of the matter is, I don't even know SQL that much.

    When you work with a web site (as it's the example given in the benchmark), your motto is 24/7, 24/7 and more 24/7. I want the web site to be up all the time.

    To be able to do so, I use good and reliable hardware and software. But it's not enough. It has to be redundant.

    Redundant hardware is "easy": RAID, 2 power supplies, etc. Redundant software is more of a pain, and unfortunately it's as important as hardware. There are hardware tools out there (BIG-IP, Radware Webserver Director, etc.) that can load balance and manage failover for web servers. But what about databases?

    This is something that is extremely important. I want to use good stuff (Open Source) without leaving my network with a single point of failure. I can pretty much have everything redundant. Linux has High-Availability kernel modules, Apache has mod_backhand (a must!), DNS and NIS have master/slave concepts... but what about my databases?

    When Slashdot moved to the Matrix, Rob was talking about some project to make MySQL redundant (or give it failover capability, same thing). This is very important. Again, IT IS VERY IMPORTANT.

    Just like programmers want the best functionality from a database (stored procedures, etc.) so that they can code less, I want my DB to be redundancy-ready so that I don't have to do it myself. Unfortunately, this is not something that I can see happening soon in open source databases.

    So to the MySQL and PostgreSQL developers: please work on this. My roommate runs an NT shop, and the main reason he uses Oracle and SQL 7 is how easy it is to set up a database cluster (or failover solution, if you like) and how stable it runs once set up. Granted, full-text index search doesn't work when SQL 7 runs on a cluster, but that's Microsoft, hey :)

  • As long as I can use the dbmopen command in perl, I've no use for those fancy-schmancy (fillintheblank)SQL databases =)

    VIVA DBM!!

  • actually, 11.0 was unsupported when Sybase did the linux 'free for commercial use' release. But it still works great.

    I've used sybase 10, 11.0, 11.5 and 11.9 and IMHO, 11.0.3 and 11.9.2 are the best releases that Sybase has ever put out. 11.0.3 is still running at many sites. Ok, 4.x is still running at many sites, but the people who are running it call it MS-SQL.

    I've used Sybase's OMNI (CIS) stuff since OMNI was at 10.5, and I was sceptical at first, because it seems like 'magic'. But OMNI really works. And so does CIS.

    We are starting to check out Sybase 12, and while preliminary reports were pretty rough, 12.x is beginning to look pretty solid in the past two months.

    I was an early adopter of 11.5, and I really regret it--we were applying EBFs at least once a week, and the optimizer (still) sucks.

    I like sybase as well, and their products/service. But I agree with you about expecting them to advance their marketshare percentage. Oracle, IBM and M$ are much better at marketing.

    I honestly feel that Informix has the best product out there today...but being a Sybase DBA pays the bills....
    ---
    Interested in the Colorado Lottery?
  • So why do we have more than one browser?

    First, there are features that exist outside of the SQL definition. Things like user-authentication (for instance MS SQL will use NT auth), ease of backup, etc.

    Then there are performance issues. Some DBs, like MySQL, optimize for speed while others optimize for reliability.

    And finally there is adherence to the standard. For instance Access is pretty poor in this regard (at least up through Access 95 which is the last version I used).
    --
  • But Access is pretty useful for a few of my users who I caught using spreadsheets like databases ("Help! my machine crashed in Quattro Pro!" "Jeez, how many lines did you fill in this?" "Four thousand"). It isn't great, and I know better than to use Access for real production, but one table and a few forms/reports later, they're ready to go. And they seem to like that it doesn't reorganize their data when they hit sort...

    -jpowers
  • IMHO the best of both worlds. Sybase ASE version 11.9.2 is available for linux here [sybase.com]. I've been using Sybase on HP-9000, linux, and NT platforms for a couple of years. Sybase is stable, fast, well supported, supports transactions, sub selects, triggers etc, and its supported by PHP.

    SuperID

    "Happy Sybase User"

  • by Ami Ganguli ( 921 ) on Sunday July 09, 2000 @06:37AM (#947900) Homepage

    The reviewer mentioned Postgres triggers, but didn't make it clear that triggers use a more general feature called stored procedures.

    By coding lots of low-level functions as stored procedures that are stored _inside_ the database I can drastically reduce the number of database calls my application has to make.

    The benchmarks he used can't show this (because he said himself that he doesn't use these features) but proper use of stored procedures should improve performance drastically while simplifying application code.
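    A minimal sketch in the syntax of the day (the function, table, and columns are hypothetical; triggers are built on the same mechanism):

        CREATE FUNCTION total_balance(int4) RETURNS float8
            AS 'SELECT sum(balance) FROM account WHERE customer_id = $1;'
            LANGUAGE 'sql';

        SELECT total_balance(42);   -- the aggregation runs inside the database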

  • by Carnage4Life ( 106069 ) on Sunday July 09, 2000 @06:39AM (#947901) Homepage Journal
    The difference isn't just the SQL support but more the transaction-handling abilities of the DB, specifically the ACID properties of the database. The ACIDity of a database is what makes it an RDBMS. The ACID properties are:

    Atomicity:
    This means that transactions are either fully completed or never begun at all. Any updates that a transaction might make on a system are completed in their entirety. If any error occurs during the transaction, keeping it from completing all its steps, then the DB is rolled back to its previous state before the transaction began. E.g., let's say a transaction consists of money being removed from a checking account and stored in a savings account. If there is an error after the money is removed from the checking account but not stored in the savings account, then the transaction can be rolled back to when the money was still in the checking account.

    Consistency:
    This means that the DB is always in a valid state after a transaction. Thus if an error occurs during a transaction, the DB can be rolled back to the last valid state. E.g. From the above example the total amount of money in both accounts is constant and is equal to the sum of both amounts. If an error occurs while transferring money from the checking account to the savings account then the total of both accounts is not consistent with the amounts actually in the accounts. This inconsistency is handled by rolling back the DB to the last consistent state.

    Isolation:
    Each transaction appears to be the only one being carried out by the system at that time. If there are two transactions both performing the same function and both running at the same time, they will be invisible to each other. This ensures that the system is consistent, because if transactions do not run in isolation, then they may access data from the DB that is inconsistent due to the fact that some other transaction is in the middle of performing its tasks. E.g., if my paycheck is being deposited in my checking account at the same time as I am viewing an account summary, I will not be able to see the changes being made to my account until the deposit transaction is completed. This stops unsavory things like seeing the total amount in the account remaining constant while the recent deposits field already contains the deposit.

    Durability:
    Once changes have been made to the DB they are permanent. There are safeguards that will prevent loss of information even in the case of system failure. By logging the steps that each transaction takes, the state of the system can be recreated even if the hardware has failed.


    Here's an article [openacs.org] that explains why MySQL is not a real RDBMS, since its support for transactions is lacking. PostgreSQL, on the other hand, supports transactions. For instance, people who use a DB that supports transactions don't have to lock tables themselves when accessing the DB from their code; stuff like that is handled by the DB.

    PS: Also, not all of SQL is supported by MySQL (e.g., foreign keys for specifying integrity constraints).

  • No comments yet on interbase? - SQL-92 complete (basic level at least) and unique in having a multi-versioning architecture

    Not yet released as open-source but very close to being - we are assured - and the beta version is already free to download.

    Not that I've had any experience with it yet - would anyone who has used both it and either postie or mysql like to comment?

  • For those that have just come in to see what people's comments are, you should really go and read this article if you use either database.

    The article does a pretty piss poor job of actually benchmarking either database, and the comparisons between the two are pretty wimpy, too. He seems afraid to say that one database is better than the other, even for specific applications.

    What he does do is give a pretty careful examination of the strengths and, more importantly, the weaknesses of each database. Obviously, neither database is going to list their weaknesses (except for the obvious ones, like MySQL's transaction omission), so articles like this serve to show the way in which MySQL's inserting methods are pretty inefficient (although, honestly, I think his numbers .. 40-50 concurrent connections before crashing .. were a bit off and I think they fixed the problem in the new 3.23.xx series) and ways in which Postgres just isn't a great option for small database projects.

    If you read this article, you're not going to be any closer to determining whether or not MySQL or Postgres is better for you, but if you've already decided which DB you're going to use, this article will help you work around the inherent limitations better.
  • >> But, in the final analysis, what did you need transactions for?

    If you need speed and the integrity of your database is not critical, then MySQL is an ideal choice. A search engine would be one ideal application; I'd be surprised if Google didn't run on MySQL. A site like Slashdot is another. It's not the end of the world if the site crashes and a few posts get lost. The speed of the database is more important than the integrity of the data in both applications.

    However, any update that involves two or more SQL commands to run as a single unit must use a transaction to ensure data integrity. The simple banking transfer is the classic case. I want to transfer $50 from my savings to my checking account. Two SQL statements are needed - one to debit my savings account and one to credit my checking account. If the computer fails after the first command has run I end up short 50 bucks. So here is one example that needs ACID properties to guarantee that both commands will succeed or they both get rolled back.

    Any application that needs to guarantee data integrity or perform multiple SQL statements as a single unit must support the ACID properties and therefore transactions.
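    In SQL, the transfer described above looks like this (account ids made up):

        BEGIN;
        UPDATE accounts SET balance = balance - 50.00
            WHERE acct_id = 1;   -- debit savings
        UPDATE accounts SET balance = balance + 50.00
            WHERE acct_id = 2;   -- credit checking
        COMMIT;

    If the computer fails anywhere before the COMMIT, neither UPDATE survives recovery, so the money is never left half-transferred.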

    There are a lot of differences between MySQL and PostgreSQL, except that the vast majority of them are small and can be worked around easily (take subselects, for example: I was able to work around this limitation of MySQL at all times with almost no performance hit).

    Arguments about both databases' stability are irrelevant: both databases are rock-solid (the MySQL server I've been using has never had a problem, despite the user load). Whether they fail almost always depends on the environment--if the database is set up on an unstable system (e.g. Win32) or with poorly chosen system settings, then both databases are going to take a performance and possibly a stability hit.

    Let me tell you the stuff that matters. MySQL is fast. PostgreSQL is slow. MySQL shifts a lot of responsibility on the programmer, while PostgreSQL keeps this responsibility within itself. That's it. Comparison is over. Details are then just details.

    In other words, I suggest that you use MySQL if:
    1) you're writing simple databases (and therefore don't require things such as transactions, complex subselects, foreign keys)
    or
    2) you're writing complex databases AND you're willing to shift some functionality into your application (do some extra work to imitate subselects, lock all tables and keep an internal "undo table" to imitate transactions, make sure your program keeps the correct relations and otherwise does the right thing for foreign keys; a sketch of these workarounds follows at the end of this comment)

    This way you get the fastest performance with all the desired functionality.

    If you're not willing to move some functionality into your own programs, then use PostgreSQL, at the cost of decreased speed.

    IMHO. Note: I'm a biased MySQL user, but I tried to be objective.
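    Here is the sketch promised above, roughly what those workarounds look like (table names made up):

        -- a subselect MySQL can't do:
        --   SELECT name FROM users
        --    WHERE id IN (SELECT user_id FROM orders WHERE total > 100);
        -- becomes two queries, with the application gluing them together:
        SELECT DISTINCT user_id FROM orders WHERE total > 100;
        -- ...the program collects the ids, then issues:
        SELECT name FROM users WHERE id IN (1, 7, 42);

        -- and a poor man's transaction via table locks:
        LOCK TABLES users WRITE, orders WRITE;
        -- ...do the updates, saving the old rows to an "undo table";
        -- on error, write the saved rows back before unlocking
        UNLOCK TABLES;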
  • But despite that, stored procedures are still very useful!

    I'm sorry, I just couldn't resist.

    Matt
    The SP language is loadable! Right now three languages are supported: Perl, Tcl and PL/pgSQL (an Oracle-like language). I imagine a Python one is in the works; it just seems logical. The trick is to strip all the "dangerous" keywords.
    Postgres has all the features you mentioned and more. If you look into it you will discover that Postgres has some rarely used but very powerful features like operator overloading, spatial datatypes, object oriented storage, loadable languages etc. As an aside, Interbase also kicks serious ass; look into it.

    SQL Anywhere is nice but there really is no need to pay for it when comparable databases are free (same goes for SQL Server BTW).

    Oracle and DB/2 OTOH are still very far ahead in enterprise features like replication, clustering, live backups etc. If you want all of these and don't want to pay for Oracle or DB/2, check out frontbase [frontbase.com]: for about 2 grand you get all this and more. The docs are pretty lame but the engine is great.
    Newbie here, was wondering how connections are measured...is it 30-50 simultaneous connections at any given time? Meaning only a max of 50 people can query tables in a DB at one time? And what happens if more than 50 try to access the DB at one time? Then what? Errorz?

    thanks
  • Maximum bytes per row:
    MS-SQL 6.5: 1962
    MS-SQL 7.0: 8060

    Note that this limitation is pretty easy to work around with foreign key relationships. I have a feeling that MySQL users might perceive this problem to be greater than it actually is because they are used to designing larger 'flat file' databases.
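    For instance (hypothetical schema), instead of one wide row that blows past the per-row limit, the bulky column moves to a second table keyed one-to-one to the first:

        CREATE TABLE product (
            id    int PRIMARY KEY,
            name  varchar(80)
        );

        CREATE TABLE product_spec (
            product_id  int PRIMARY KEY,   -- same key: one row per product
            spec_text   varchar(8000)
        );

        SELECT p.name, s.spec_text
          FROM product p, product_spec s
         WHERE s.product_id = p.id AND p.id = 42;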
  • Now, this is completely off-topic, but I might as well stick the thought into this discussion. Have any of you had positive experience running Sybase 11.3 on Linux in a production environment? The DBMS is free (as in beer) for any type of use, as opposed to the free version of 11.9, which you can only use for testing and development purposes. Now, 11.3 is pretty old, but it's still running a lot of mission-critical applications in the real world, and my experience with it (on Solaris, NT is a different story) has been extremely positive. Rock stable, with decent performance and, of course, the power and flexibility of T-SQL to boot.

    So would Linux + Sybase 11.3 be a reasonable alternative to Linux + MySQL/PostgreSQL for an organization seeking a cheap and reliable solution?
    --
    Violence is necessary, it is as American as cherry pie.
    H. Rap Brown
  • by alienmole ( 15522 ) on Sunday July 09, 2000 @08:43AM (#947914)
    Actually, I don't think I made my point clearly. I agree that guestbooks and applications on that level - which may involve little more than a table or two - are well-suited to MySQL/Postgres, and I agree that there are a lot of those out there.

    But the article itself discussed some of the more advanced database features provided by Postgres, which to me implies that it was addressing a wider audience than just the guestbook-writer's guild.

    My real point is just that once you get beyond a database with just a few tables, the benefits provided by a commercial database are probably worth the money, even if you're not a mission-critical enterprise systems developer. However, I grant that you'd want something a bit cheaper than Oracle, which is why I mentioned SQL Anywhere.

    My other point is that I suspect a lot of people who pick MySQL or Postgres because they're the only free/open tools available, and try to use them for something more than just a guestbook, may not realize what they're missing and how much unnecessary work they have to do just to take care of basic database plumbing issues and error handling.

    In fact, the original post I replied to mentioned dBASE/Clipper, which is what really prompted me to respond, because I remember all too well having to deal with some of that basic plumbing with those products. I don't doubt that MySQL and Postgres are an improvement over xBase, but some of the limitations mentioned in the article are serious ones, and will mean that developers have to do more work than they otherwise would to get an error-free and stable system.

  • This is all correct, except that MySQL has added transaction support in the latter part of their 3.23.x series of releases. (Including all releases under the GPL).

    ./configure
    make comment
    make post
    Or, as is the case in point, in Kb/sec. Here is where MySQL rules. Using some sort of connection pool (thus preventing reauthentication to the DB) and a well-tempered database schema, MySQL can be made to perform as fast as its big brothers (there are unofficial benchmarks showing select times better than Oracle). The connection pool also limits the need for many open connections at any given time. The numbers [phpbuilder.com] in the article agree with this raw speed advantage MySQL commands.

    My take is that when some features like transaction support, sub-selects and some sort of stored procedure support (other than writing it in C and recompiling :)) reach production grade for MySQL, we will have a clear winner and a hell of an open-source competitor for the big guys.

  • (opinion)

    IMO, SQL isn't all that cool. Most of the complicated queries you might do are better expressed in a proper programming language. I wanted to include native database support in a scripting language I was developing for precisely this reason.

    I tried both the Postgresql development kit and the mySQL one last year, and the mySQL kit was much easier to use.

    If you're doing that financial/business stuff that I try hard to avoid, perhaps transactions are important. Maybe with ultra-huge databases or high volume, complicated queries and stored procedures are important. But if you're just interested in using the very basics, mySQL is easier and (so they say) faster!
    Informix does not know how to sell. I asked for a price quote three times, both on the phone and over the web; all three times I was blown off. I guess I am too small to be considered for a price quote.
    Haven't used MySQL, but have used both Interbase and PostgreSQL. I much prefer Interbase; here is why.
    1) faster.
    2) more stable
    3) runs on anything!
    4) IBO/IBX for direct connect with Delphi (and soon Kylix)
    5) loads of UDFs
    6) Supportive user base.
    7) Awesome documentation. Far better than anything else I have seen, including commercial databases.
    8) Good language for SP and triggers.
    9) SP that return recordsets!

    All in all a pleasure to work with.
  • by morzel ( 62033 ) on Sunday July 09, 2000 @07:20AM (#947926)
    There is a huge crowd out there that uses databases and that doesn't need _any_ feature of the big commercial databases to get the job done.

    If I want to code a guestbook for a small website, or create an on-line shop for the fanclub of our obscure little rockband, I don't need all the extras of a commercial RDBMS.
    Last time I worked with Oracle, the actual cost of getting the Oracle licenses in place was completely insane. But that cost was justified, because we needed our app to be enterprise-ready (but God knows it'll be DB2 next time such a project comes along :p ). But you can't expect me to be happy to pay big $$$ if I don't need that stuff.

    You missed the point, no more, no less :-) (nothing personal intended ;-)


    Okay... I'll do the stupid things first, then you shy people follow.

    Hey, this was a real world benchmark. Keep that in mind. Many of the points mentioned in your comment (waiting for the database to "warm up" for an hour or so) would never happen in the real world. You just lost your big client's only server. It's toast. You get it back up ... and now you're going to wait for an hour and twenty minutes or whatever? I doubt it. Nope, this was "real world." And, although I don't use databases on a regular basis, this is exactly the kind of information I'd make my choice on.

    Dave
  • by wik ( 10258 ) on Sunday July 09, 2000 @07:50AM (#947932) Homepage Journal
    The issue with allowing the database to warm up is that you're attempting to get the database into a steady-state condition, i.e. the disk cache, and any database cache that benefits the performance of the database, is no longer stale.

    A database that has been open and running for 2 minutes is not the same as a database that has been up for 30 days (the latter has been warmed up), and the latter is how the server will actually be operating for the majority of its use.

    The warm up is not, however, something that you would do when you start up the server processes. It's simply to get the system in a state like it would be in when you have it running for a long time. We're talking about two different things. :)

  • by jetson123 ( 13128 ) on Sunday July 09, 2000 @10:11AM (#947937)
    Whenever the question of MySQL vs. PostgreSQL comes up, someone is sure to dust off the old "bank account transfer" example and pound the table about how important transactions are to guarantee consistency and prevent data loss.

    Nonsense. Many applications (including banking applications) that theoretically would be well served by transactions still run largely non-transactioned and use proven real-life approaches instead (transaction limits, nightly reconciliation) because trying to handle everything with transactions would create a huge bottleneck. As for data loss, that is more of an issue with database logging and replication, which you can have without having transactions.

    Both MySQL and PostgreSQL have their strong points and their weak points. I think I have a slight preference for MySQL, not in spite of, but because of, its limited feature set; it keeps people from blindly using performance killing features that most applications just don't need; full SQL just makes it too easy to write something that will bring the whole database to a grinding halt.

    Ultimately, the argument between MySQL and PostgreSQL is missing the point. The current crop of SQL-based databases (this includes Oracle and DB2) are just awful for modern applications: their APIs and data models are a poor match to what we really need (SQL was originally designed to allow managers to generate reports easily). The widespread use of "stored procedures" and various "object" features is a clear indication of that. Some alternatives are on the horizon. For now, we'll have to make do with what we have, I guess.

    Because I didn't state that the MCI database of customers and internet lines was set up in DB/2. The dog track app was complex enough to need some of the special features of Postgres, but it had a low load and a fairly small number of tables. It'd probably break down fairly well into three. Postgres could have handled that (with requisite backups, of course) without ever having a problem. It'd be mission critical enough, and my employer wasn't going to shell out for DB/2 or Oracle or the hardware to run them. It'd have been a huge step up from Clipper nonetheless.

    In the MCI application, DB/2 was perfect. Too bad that at the time I was working for them, they had IP addresses linked to circuits and not customers, so if you moved your T1 line, we'd have to manually move your IP addresses over or they'd get lost and recycled. This also kept us on OS/2 well beyond the point when I wanted to get away from OS/2. Didn't stay there long.

  • Getting subselects right requires that (a) the query optimiser in the DB works well and that (b) the person formulating the queries knows how to produce good SQL.

    Usually, it's the latter, not the former, that falls down. Speaking as a SQL caveman [greenspun.com], I usually hand off complex queries involving sub-sub-queries to our DBAs to do. Why? Well, Oracle sure as heck ain't crap at doing them. In fact, Oracle can do a complex sub query in less time than it takes to fetch a query, process it, submit another query based on the first, etc. The difference is that a well-formed query will be orders of magnitude faster than a poorly formed one.

    Overall, performing multiple queries instead of subqueries is almost always slower, and imposes heavier loads on your front end and database boxes than simply writing a good piece of SQL. The problem is writing good SQL, because it's really easy to write something very slow indeed. And it's rarely ever the database engine's fault.
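    To illustrate (tables invented for the example), the same question asked two ways:

        -- poorly formed: a correlated subquery re-evaluated for every row
        SELECT c.name
          FROM customers c
         WHERE 0 < (SELECT COUNT(*) FROM orders o WHERE o.cust_id = c.id);

        -- well formed: a plain join the optimiser handles easily
        SELECT DISTINCT c.name
          FROM customers c, orders o
         WHERE o.cust_id = c.id;

    Both return the customers who have orders; on big tables the gap between them can be the orders of magnitude mentioned above.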

    Did anyone think this test was kind of...I don't know, stupid? I mean, what's the point of giving us numbers like "2-3 times faster" or "much better"? That's not a benchmark. The author went into detail about the setup and then proceeded to not give any real detail about what happened. He also didn't really say what a connection was: was it being banged at constantly every second? Were they lagged to account for reading the data that was pulled up?

    All in all it seemed pretty superficial.

    Rev.Marc.

  • Excuse my apparent lack of knowledge on the subject, but are there that many differences in the databases? I mean, isn't SQL SQL?

    SQL is SQL, but neither Postgres nor MySQL are. There are a few small differences between Postgres and ANSI 92 SQL, and they get smaller with every release. MySQL lacks many important features of ANSI 92 (transactions, rollback, sub-queries...) and doesn't intend to change. Furthermore, MySQL lacks many features below the API which real databases need, like good query optimisation; although MySQL is very fast for simple queries, it is very slow for complex ones. For applications which don't really need a database at all, MySQL is fine; for applications which do, Postgres is better.

    Well, besides an ideological perspective, what is the problem between the two? Either allows you to do pretty much whatever you want with the software, so what are you worried about? Just curious; this isn't a flame or anything.
    I remember hearing on a mailing list that PostgreSQL loses a respectable amount of time because it fsyncs its files all the time. PostgreSQL IS, like most other good RDBMSs, paranoid about data security.

    I'm sure everyone can guess how much of a slowdown constant fsyncs would be, though maybe with ext3 or reiserfs, those operations can dump to the log and the penalty of fsync won't be as large.

    If you don't mind giving up some integrity, you can disable them. I'd like to see that benchmark rerun where PostgreSQL wasn't fsyncing all the time.
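    If memory serves, you disable it by passing the -F ("no fsync") option through to the backends when starting the postmaster, something like:

        postmaster -o -F -D /usr/local/pgsql/data

    That's from memory, though, so check the postmaster man page before trusting real data to it.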
  • by Anonymous Coward on Sunday July 09, 2000 @11:13AM (#947962)
    It got modded up so high because the moderators know how databases work.

    You can't just add transactions as an afterthought. MySQL just wrapped sleepycat with a SQL front end and called it transactions. So technically it is, but you'd be insane to use it in any serious setting.

    sleepycat is not a good thing to build a client-server RDBMS on. (This isn't to say it doesn't have its uses. sleepycat is great for simple inproc data storage, but this is going to be a disaster.)

    mysql was not designed for this kind of application, and it shows. mysql needs a complete redesign before it will be remotely competitive with commercial databases.
  • I know that Slashdot does have rare problems (I've been visiting every day for a long time so I've seen them) but the author of this article makes it seem like every time you come here something breaks. That's not the case as far as I've ever seen. Am I just lucky in that this site runs like a champ every time I come here or do others see lots of database related problems popping up? Slashdot's use of mod_perl, apache, and mysql for the setup is one of the best examples I've seen of why the combo works so well.
  • by fence ( 70444 ) on Sunday July 09, 2000 @05:32AM (#947965) Homepage
    Now that MySQL [mysql.com] supports transactions, I would probably go with it over PostgreSQL [postgresql.org].

    Even though Sybase hasn't open-sourced their Sybase SQL Server and Adaptive Server Enterprise (and they probably never will), I prefer using Sybase over both PostgreSQL and MySQL. Sybase SQL Server 11.0.3 on Linux [sybase.com] is free for commercial use.

    and no, I'm not affiliated, etc...I just like 'em
    ---
    Interested in the Colorado Lottery?
  • by driehuis ( 138692 ) on Sunday July 09, 2000 @05:35AM (#947968)
    I've found myself stuck in the PostgreSQL/MySQL discussion more often than I want. I personally don't care that MySQL has a more limited SQL implementation than PostgreSQL -- I'll happily write my code around its limitations. However, back when I settled on MySQL (some two years ago), PostgreSQL just wasn't stable enough, and to this date, I have yet to see MySQL corrupt a table. The only index corruption I ever saw was when someone managed to rip out a SCSI cable in mid update.

    This review also mentions the admin tools in passing. MySQL's backup strategy is very easy to explain to non-database gurus. Try explaining how to restore an Oracle backup to a non-DBA... You can even get a poor man's roll-forward working on MySQL and be able to explain how it works.

    Operational ease is always my first concern. As long as you keep in mind MySQL is a poor man's solution, it is a very nice poor man's solution.

    Sigh... Let the flamefest begin!

  • by linuxonceleron ( 87032 ) on Sunday July 09, 2000 @05:37AM (#947969) Homepage
    I used PostgreSQL as a database backend for my squidlog2sql script. Running on a 486/25 with 16MB RAM, Postgres was slow, but it ran like a tank: it never went down and would always perform reasonably even under fairly high loads. It also had an ODBC driver for Win32, which allowed non-Linux users to access the database without me having to write a CGI interface for it. I haven't tried MySQL, but from what I've heard, it's quicker but not as 'professional' because it lacks some features. If MySQL was much faster and easy to install (Postgres is a pain to install), I might switch since I don't even need these professional features for what I'm doing.

  • FYI:

    The way I set it up in my scripting language is

    - plenty fast
    - MUCH less error prone than writing the queries in mySQL (because of static type-checking and even syntax checking! .. both lacking in mySQL until you actually run the query)
    - more powerful

    There is definitely a speed hit -- but that's what you get when you abstract. Since my volume is not hundreds of thousands of hits per day, I'm rejoicing in how easily I can deploy apps rather than being upset that I'm not hacking SQL for efficiency.
    Exactly, I want to see a benchmark of PostgreSQL's transactions and MySQL's supposed transactions. Oh, and I want to see it on a machine that makes extensive use of triggers and stored procedures to cut down on connections to the database.

    MySQL is a great product, but comparing it to PostgreSQL is like comparing an ultralight to an oil tanker. MySQL is hardly more than an SQL-based interface to the filesystem, while PostgreSQL is an RDBMS with ACID properties and a raft of cool extra features. If all you need is a format to put your data in for immediate retrieval without ACID capabilities then MySQL is perfect. However, if you need more than this then MySQL simply won't do.

    The most interesting part of the article, in my opinion, was that MySQL was not really that much faster than PostgreSQL even in MySQL's one clear specialty, selects. For the slight penalty that you pay in selects with PostgreSQL you gain a whole raft of features that make database development a lot easier. The writer of the article mentions that you can get around MySQL's problems "with a little elbow grease," but why should you when the PostgreSQL team has already solved these problems for you?

    Just because MySQL doesn't solve your database problems does not mean that you should discount all Open Source database solutions. I think that you will find that PostgreSQL has the features that you need (unless you absolutely need remote synchronization, in which case try again soon; they're working on it).

  • by laertes ( 4218 ) on Sunday July 09, 2000 @12:03PM (#947977) Homepage

    Don't you think that your comments are more than a little unsubstantiated? Have you ever used an ATM? If so, you are using transactions. Have you ever paid for something using a credit card? If so, you are using a transaction application. Have you ever made an insurance claim? If so, you again have used transactions.

    While preventing data loss is an important application for databases, it is not the only one. Instead, try creating a large, complicated system using no transactions. In order to execute most data manipulations, you would have to modify more than one table. Even something like a bank account transfer would access at least three tables. How would you do this using something with no transactions? You would lock the tables as you needed them. How would you do this with a transaction safe system? You would forget about locking the tables and commit or rollback, depending on the success of the action.

    The fact is, for simple joins, MySQL will always be faster, and simpler. For more complicated joins, its naive query optimization is incredibly limited; with joins involving more than seven tables, my experience has shown that MySQL just cannot optimize correctly, and PostgreSQL was faster on some of my most complicated joins.

    Simple insert/update actions likewise are much faster, but with the lack of foreign keys, you have to write select statements yourself to maintain referential integrity.

    MySQL is faster for simple actions that don't need foreign key checks. However, consider complex manipulations, when you have to make a modification to a large number of tables: in MySQL, you would have to lock each table, blocking out any other access to the same data, while something like PostgreSQL allows concurrent access to the data.

    You talk about the performance killing features of full SQL; they do indeed have a performance hit. These are overkill for glorified flat file databases, but for a normalised database, transactions give you an order of magnitude more speed.

    Consider the ACID test: durability is only one fourth of the equation; atomicity, consistency and isolation make up the rest. These features not only make the system more stable, they also make life for a developer much easier. They also allow a far more robust, scalable and flexible system to be developed in a shorter amount of time. Every big website that uses a DBMS uses something with transactions, not because they are a performance killer, but because they are a more scalable solution.

    At the low end, big databases are slower, but as the size of the system grows, MySQL gets slower faster, until at the enterprise level it becomes the slower one. Lest you think I don't know what I'm talking about, I work for a fairly large e-commerce site as the senior programmer.

    BTW: I do agree that SQL is not the ultimate solution; I wouldn't go so far as to say it's awful, though. Instead of bitching, why don't you design something better? I am designing something better, it's just a long way down the road.

  • Actually it is now possible to raise the 8K limit to 32K when you are compiling PostgreSQL. This is sometimes handy, but it isn't nearly as cool as what they are currently working on for 7.1.

    In 7.1 they will allow you to define columns as being "toastable", which means that PostgreSQL will automagically break the large objects into pieces and store them in several consecutive tuples. This will all be transparent to the user.

    Of course it is already possible to use the Large Object interface to store information in columns that might break 8K. It's just more unwieldy.
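    From SQL it looks something like this (table and paths invented; lo_import/lo_export run on the server side):

        CREATE TABLE images (name text, data oid);

        -- lo_import reads the file on the server and returns the OID
        -- of the new large object
        INSERT INTO images
            VALUES ('product-42', lo_import('/tmp/product42.jpg'));

        -- lo_export writes the object back out to the server filesystem
        SELECT lo_export(data, '/tmp/out.jpg')
          FROM images WHERE name = 'product-42';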

  • Don't blame your lack of support on PostgreSQL. When I decided to experiment with PHP and PostgreSQL I dutifully subscribed to the corresponding PostgreSQL mailing lists, and I have found them to be very helpful. Perhaps the most important benefit I have gained from the PostgreSQL mailing lists is an idea as to what the PostgreSQL backend is capable of.

    When I switched from MySQL I mostly was interested in using PostgreSQL exactly like it was MySQL. Hanging out on the lists I learned to use Postgres's more advanced features, and I found that this was a huge boon to my projects.

    Of course, I also found myself shifting away from PHP..., but that's probably just a coincidence.

    Granted, our test wasn't versus Postgres; however, it was a test for curiosity's sake.

    A coworker was developing an application that would use MySQL. He needed records to test with, so he connected over our LAN to the live Oracle database and imported records from it (about 2-3,000,000 records in total).

    While the import was taking place, we were performing multiple queries (some using fairly complex calls) just to see how well MySQL would hold up.

    The results were near instantaneous. MySQL didn't even blink at the load.

    I understand that it isn't a real world situation, as it was only one user querying, but with the amount of importing that was being done, we would have thought the system would bog. But you would never know the import was taking place.

    We even got a "holy shit!" from our Oracle DBA when we showed him.

    --mark
  • Even though MySQL supports transactions I would still go with PostgreSQL for all database needs. Why?

    * PostgreSQL transaction support probably is more stable
    * More importantly, PostgreSQL supports features necessary for a complex RDBMS-backed application, i.e.: foreign keys, subselects (nice! very nice, why do 2 queries when 1 will do?), stored procedures, triggers, etc.

    If you have ever read the "why no foreign keys" page on the MySQL website, it is funny as hell. I mean... who the hell puts

    "The speed impact is terrible for INSERT and UPDATE statements, and in this case almost all FOREIGN
    KEY checks are useless because you usually insert records in the right tables in the right order, anyway. "

    What kind of lame statement is that? It's like "well, it takes too long to put on seatbelts and I have to bend my arm in an uncomfortable way, but usually we don't have car accidents, so who cares?"

    I find MySQL's approach to DB consistency totally and completely irresponsible... It's like not installing any safety features in a car because they slow it down and they're uncomfortable and inconvenient... but that's OK, because we usually don't have an accident anyway!

    This is why I despise MySQL and Monty; I think he is just a poor excuse for an RDBMS developer... The "nice thing about FOREIGN KEY is that ODBC programs can draw relationship diagrams". Well, if that is all they are good for, then I'd really question his sanity ;-)

    Anyways, I find Monty's "reasons not to use foreign keys" similar to what car makers probably said when the government started to make seatbelts mandatory...

    PS: I have never used the ability to copy 3 files to move a mysql table, have you?
