UML, PostgreSQL Get Corporate Support
tcopeland writes "An article on NewsForge highlights some changes in the upcoming PostgreSQL release (v7.5) that are funded by Fujitsu. PostgreSQL core team member Josh Berkus says that "Tablespaces, Nested Transactions, and Java support" are being underwritten by Fujitsu; this has also been mentioned on the postgresql-hackers list. He also says that 7.5 will be "...the most significant new release of the software since version 7.0 almost four years ago". Good times for PostgreSQL users!" And ggoebel writes "Jeff Dike posted a notice to the UML [User-mode Linux] developers mailing list: 'The first bit of news is that as of last Monday, I am working for Intel. They
generously offered a full-time position, off-site, with my time mostly spent
on UML. This basically means that UML is no longer a part-time, after-hours
thing for me, so we should start seeing more work happening on it, especially
compared to the last month or two.'"
UML is pretty awesome (Score:3, Informative)
Re:UML is pretty awesome (Score:5, Interesting)
I respectfully disagree. While UML gives you excellent isolation, it is an extremely inefficient way to virtualize your server since it does not take advantage (by design) of all the optimizations that UN*X provides. UML is great for kernel developers and applications where isolation is far more important than performance.
In Linux virtual server hosting, the future will be Linux VServer Project [linux-vserver.org]
(ok, I'm somewhat biased, I admit)
Re:UML is pretty awesome (Score:3, Insightful)
Why are Wikis always touted as the solution to documentation, yet every time I try to find useful information in some project Wiki, it is always useless?
Ahh, there's a paper on VServers. Sounds kind of like jails with more separation. However, the filesystem separation of UML is a feature. VServers are good for completely managed hosting, I'm sure, but UML is the answer to people who want to get whole machin
Re:UML is pretty awesome (Score:3, Interesting)
Re:UML is pretty awesome (Score:2)
As far as I can tell the major overhead from running UMLs is memory. While that is certainly a limit, I think many people will be willing to spend a few megabytes per customer in order to get near-complete separation.
I'm not saying vservers have no place - again, I think they're highly useful for completely managed hosting. But, as an interim step between colocation and a
Re:UML is pretty awesome (Score:2)
www.linuxvirtualserver.org and www.linux-vserver.org are two totally different projects.
That's all fine and dandy, but... (Score:2, Interesting)
Will this mean that Intel might have a chance to influence its development? The true benefit of projects such as this is their independence from the big brother corp
Re:That's all fine and dandy, but... (Score:5, Interesting)
Re:That's all fine and dandy, but... (Score:2)
True, but funding UML seems kind of curious. After all, if you can virtualize, then theoretically you're going to be buying fewer of those Intel processors. So, what are those bunny-suited Intellers up to?
Re:That's all fine and dandy, but... (Score:5, Insightful)
You mean like Sun and HP funding the Apache group [apache.org]?
Or Novell and Ximian underwriting the Mono Project [mono-project.com]?
Or IBM contributing to F/OSS [ibm.com]?
Do you think these and other projects would be where they are today without the backing of serious money/resources?
Re:That's all fine and dandy, but... (Score:3, Insightful)
Re:That's all fine and dandy, but... (Score:3, Insightful)
You've seen free software projects with "Optimized for the Pentium 4" on them?
I think people may not realize the extent to which free software development is already corporate-funded.
--Bruce Fields
Re:That's all fine and dandy, but... (Score:2)
True enough, but isn't the major advantage of F/OSS that, even if company Foo wants to pour money into developing feature "X", if I want feature "Y", I can still develop that on my own? Granted, it might take more time, and it might even be more difficult, but I'm still free to build any extensions I want, as long as I have the time and resources. Company Foo doesn't "own" the project. They just get to encourage people to develop features they nee
Re:That's all fine and dandy, but... (Score:4, Insightful)
If IBM tells the apache group to put in dubious and buggy code, the apache group tells them to buzz off.
There is a difference, even if it isn't obvious at first glance.
Re:That's all fine and dandy, but... (Score:3, Interesting)
Sounds like a simple business investment to me - no need to search for conspiracies here.
Re:That's all fine and dandy, but... (Score:2)
Man, the anti-corporation mindset really gets me sometimes. There are benefits to large corporations. We would not have cheap, super-powerful PCs without them.
Unix came from the granddaddy of all megacorps, AT&T.
Let's see: Intel is going to PAY this guy and he will also probably get benefit
clarification please... (Score:3, Informative)
(1) Unified Modeling Language?
or (2) User Mode Linux?
Methinks (2), given that I work a lot with (1) and have never heard of Jeff Dike
Re:clarification please... (Score:4, Funny)
-read the article summary- please (Score:2)
That's straight from the slashdot blurb. Talk about lazy, JHC...
Re:clarification please... (Score:2)
Solid stuff, that PostgreSQL... (Score:5, Interesting)
Only a half million records and only about 75K queries a day, so it's not a huge DB... but it's definitely getting the job done.
Re:Solid stuff, that PostgreSQL... (Score:2)
UML is no longer a part-time, after-hours thing (Score:4, Funny)
You have my deepest sympathy.
UML (Score:5, Informative)
UML (Score:2, Informative)
Oh, you meant User-mode Linux? Well, why didn't you say so? Sometimes I think these writeups are intentionally confusing.
Re:UML (Score:2, Interesting)
You're right; I'd meant to parse the name and add in a link (as I now have done) to the project's web page.
timothy
Re:UML (Score:3, Funny)
Thanks timothy
Table spaces? (Score:5, Interesting)
Re:Table spaces? (Score:5, Informative)
You are referring to two completely different technologies:
(1) "Writing directly to disk cluster" - By that you seem to mean direct disk access, not through the filesystem. I don't even think this is part of the PostgreSQL TODO, because there is just not a very strong need. Are you experiencing performance problems in this regard?
(2) "fragment tables across spaces" - By that you mean "Table Partitioning". That allows you to break up a single table across multiple storage devices. That would be very valuable technology, but as far as I know, won't make 7.5.
If all these features really work out for 7.5, they should call the release 8.0, and maybe they will.
*: There are some tricks you can use if you need to move a single table to a different device prior to 7.5. I think symlinks work fine, but if it's important, I'd wait for 7.5 or ask on the -general list to make sure it's correct.
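The symlink trick mentioned above can be sketched roughly as follows. This is only an illustration on an ordinary file in a scratch directory, not a procedure for a live PostgreSQL data directory: the function name `relocate_with_symlink` and all paths are hypothetical, and it assumes the database server is stopped while files are moved.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(src_path: str, dest_dir: str) -> str:
    """Move src_path into dest_dir and leave a symlink at the old location.

    Hypothetical helper: it knows nothing about PostgreSQL itself and
    simply relocates one file, as an admin might do by hand.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    shutil.move(src_path, dest_path)   # physically move the file
    os.symlink(dest_path, src_path)    # old path now points at the new home
    return dest_path


# Demo in a throwaway directory (stand-ins for a data dir and a second array).
scratch = tempfile.mkdtemp()
data_dir = os.path.join(scratch, "data")
os.makedirs(data_dir)
table_file = os.path.join(data_dir, "busy_table")
with open(table_file, "wb") as fh:
    fh.write(b"rows")

new_home = relocate_with_symlink(table_file, os.path.join(scratch, "fast_array"))
```

Anything that opens the old path still finds the data, which is why the trick works at all; the fragility the commenters mention comes from the database not knowing the file has moved.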
Re:Table spaces? (Score:2)
Re:Table spaces? (Score:2)
Writing to a raw partition is no real winner for PostgreSQL, and it is not likely to be implemented until someone can SHOW that it would be faster for an actual test case and then do the work.
Re:Table spaces? (Score:5, Informative)
Strictly speaking, that's not true. You can move things around manually, and some have done so, but it's not pretty, not easy, and not easy to maintain. Implementation of tablespaces in PostgreSQL simply allows its users to easily do what was previously an arcane-voodoo art. So clearly, it's a big step up. But, you already knew that.
> "Writing directly to disk cluster" - By that you seem to mean direct disk access, not through the filesystem. I don't even think this is part of the PostgreSQL TODO, because there is just not a very strong need. Are you experiencing performance problems in this regard?
That's correct. AFAIK, there is no desire to implement raw partition support. The speed difference is minimal and the required code is large. Basically, you wind up writing a FS and associated buffer management into the database. The return generally is not very high. It used to be, many years ago. These days, filesystem technology and implementations are plenty fast. Those that want raw partition access, IMO, are simply living in the past.
> If all these features really work out for 7.5, they should call the release 8.0, and maybe they will.
You are correct. According to the list, the numbering constantly goes back and forth. From what I gather, they are waiting to see what features actually make it in. Depending on the scope of changes, they'll then determine the version number. As a rule of thumb, people are calling it 7.5, simply because nothing else has been blessed.
Please don't think I'm correcting what you've said. You've said nothing that I disagree with. I'm simply adding a followup remark.
Cheers!
Re:Table spaces? (Score:5, Interesting)
For the uninitiated and lazy, is there any compelling reason why that's better than putting the database files on a RAID and letting the OS split the table across devices?
Re:Table spaces? (Score:5, Informative)
> database files on a RAID and letting the OS split the table across devices?
Sure, you might want to distribute your data across multiple arrays. For example, keep your logs and tempspace on a fast & expensive RAID 0+1 array of fast (15k) drives. Then put small OLTP stuff on another RAID 0+1 array. Then put your huge graphic images, documents, etc. on a much more economical RAID 5 array.
I use multiple arrays all the time for performance and economics (in db2 & oracle) - it's cool to see postgres pick this up.
Re:Table spaces? (Score:2)
Re:Table spaces? (Score:2)
Re:Table spaces? (Score:2)
> first place. Keep em on the filesystem with the web tier stuff. Do you really put this stuff in the
> DB? Or were you just remarking in general about splitting up files by what they are used for?
Both actually. I was just generalizing for an example. However, as much as I'd prefer to avoid putting files in a database, it is sometimes necessary.
A few examples of where it's necessary include:
- some
Re:Table spaces? (Score:2)
Re:Table spaces? (Score:2)
Yes, almost anytime you have mug shots of people (e.g. HR, FBI terrorist suspects, etc.) they go in a db along with the rest of the individual's HR record. And it isn't that unusual to store documents in a db, either.
Re:Table spaces? (Score:4, Informative)
However, for larger or more complex systems there are some advantages to splitting tables over multiple disk systems. For example, tables with lots of little niggling disk writes (access tables, change logs, temp tables) can go on a fast (possibly striped) disk system. You don't have to waste high-priced, high performance RAID on archived data (if it crashes, restore from tape), or on large media files etc stored as blobs or clobs.
These are just examples, but on a large server with several different disk systems available, this technology lets the database designer match storage system performance characteristics much more accurately than a simple RAID.
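To make the idea of matching tables to storage tiers concrete, here is a small sketch that generates the DDL for such a layout. It assumes the `CREATE TABLESPACE ... LOCATION` and `ALTER TABLE ... SET TABLESPACE` syntax found in released PostgreSQL versions that include tablespaces; the thread only discusses the feature as upcoming. The tablespace names, directory paths, table names, and the helper `tablespace_ddl` are all made up for illustration.

```python
from typing import Dict, List


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text: str) -> str:
    """Single-quote an SQL string literal, doubling embedded quotes."""
    return "'" + text.replace("'", "''") + "'"


def tablespace_ddl(tablespaces: Dict[str, str],
                   placement: Dict[str, str]) -> List[str]:
    """Build DDL that creates each tablespace and assigns tables to them.

    tablespaces maps tablespace name -> directory on some disk system.
    placement maps table name -> tablespace name.
    """
    statements = []
    for name, location in tablespaces.items():
        statements.append(
            f"CREATE TABLESPACE {quote_ident(name)} "
            f"LOCATION {quote_literal(location)};"
        )
    for table, space in placement.items():
        if space not in tablespaces:
            raise ValueError(f"unknown tablespace for {table}: {space}")
        statements.append(
            f"ALTER TABLE {quote_ident(table)} "
            f"SET TABLESPACE {quote_ident(space)};"
        )
    return statements


# Hypothetical layout: write-heavy tables on a striped array, archives on RAID 5.
layout = tablespace_ddl(
    {"fast_stripe": "/mnt/raid10/pg", "cheap_raid5": "/mnt/raid5/pg"},
    {"change_log": "fast_stripe", "archived_orders": "cheap_raid5"},
)
for statement in layout:
    print(statement)
```

The point of the sketch is that placement becomes a declared property of each table rather than something arranged behind the database's back.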
Re:Table spaces? (Score:2, Informative)
They also do not have table partitioning. It has been discussed and it is a high priority feature but it
Re:Table spaces? (Score:5, Insightful)
It's a little more complicated I think. Using the filesystem has other advantages as well:
(1) PostgreSQL can work well with other applications running. Let's say you invent the best caching algorithm possible, then you still have two separate caches, one for PostgreSQL and one for everything else. That means you have to dedicate the machine to PostgreSQL and have a high PostgreSQL cache (but any other app will suffer), or give postgres a low amount of cache space and it will suffer.
(2) The postgres developers don't want to worry about the bugs involved in making their own filesystem. Also, who's to say they can make a filesystem as fast right off the bat? It might be a huge development effort, with relatively minor benefit for most people.
Re:Table spaces? (Score:2)
Demo from my linode [brlewis.com]
what's the point? (Score:2)
Re:what's the point? (Score:2)
Re:what's the point? (Score:5, Interesting)
Tablespaces allow you to do things like place a table that is 90 percent read and 10 percent write on one RAID array, place another table that is maybe 50 percent write and 50 percent read on a second array, and then put the Postgres WAL on a completely different array.
Table usage varies greatly across large databases. Some tables barely get touched, others get written to a lot, others get read from a lot.
I'm currently running a database where our peak loads are around 35 queries per second. I've actually symlinked table locations to put my most heavily accessed tables on a separate RAID array from the rest of my database. This gave me a threefold increase in speed. This is really noticed when we do things like VACUUM the db.
I'm a programmer (Score:4, Insightful)
I know it's too much to ask OSS projects not to pick confusing acronyms and names, but I'd like to think that story submitters or at least editors could be a little clearer.
Re:I'm a programmer (Score:2)
No real reason to cater to one audience over the other -- I think it's perfectly reasonable for folks to actually (say) check what the UML link points at and run from context.
Better acronym (Score:2)
Then CoLinux could have become "Linux On Windows", which also has a good acronym.
Google didn't exist when user-mode linux started (Score:3, Informative)
message from jeff [iu.edu]
Unified Modelling Language may have existed in early 1998; I first saw it in April 1999. But Unified Modelling Language was a lot smaller back then.
And Google did not exist in February 1998!
These days, when I need to name something, I stick the name in google and check for conflicts.
Re:Google didn't exist when user-mode linux starte (Score:3, Informative)
According to this [usc.edu], UML 0.9 was from 1996, UML 1.0 was 1997.
Good tools out there for PostgreSQL.... (Score:4, Informative)
PLUG: For example, there's this little SQL query analysis [postgresql.org] utility!
All Welcome and expected - expect more.. (Score:4, Insightful)
Also this is consistent with the Open Source Paradigm, where it is in the interests of companies to improve the software, and the advantages far outweigh the disadvantages of them not being exclusive. It is this philosophy, in my opinion, that will beat proprietary software models such as Microsoft, and it is these companies that are key in stopping those who want to halt the advancements of FOSS using idiotic patents and other invalid IP arguments.
Re:All Welcome and expected - expect more.. (Score:2)
This is just another example of a company not having to be forced to do the right thing, but rather just doing the right thing because they recognize the advantages. Kudos.
OLAP still missing... (Score:4, Interesting)
I made my company switch from SQL Server to PostgreSQL but now I have to export data every day from PostgreSQL to SQL Server just to get my OLAP reports!
As soon as OLAP is there I'll definitely get rid of SQL Server.
Re:OLAP still missing... (Score:3, Insightful)
Re:OLAP still missing... (Score:2)
Keep in mind that I am not an OLAP guy, so you may need to talk down to me.
Cheers!
Re:OLAP still missing... (Score:2)
> requirements. OLAP is a topic which comes up from time to time but real world use is often not
> offered.
Sure - grouping sets, rollup, and cube commands. These allow you to create a cube with subtotals of various dimensions in a single pass of the data. Hugely useful for cross-tab reports, olap/reporting tools, etc.
But, as someone else pointed out, data partitioning & parallelism are the key per
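As a rough picture of what a rollup does, the sketch below computes the detail rows, the per-region subtotals, and the grand total in a single pass over the data. It is plain Python rather than SQL, and the column names (region, product, amount) and the sample rows are invented for illustration; `None` plays the role that NULL plays in a rolled-up column.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Row = Tuple[str, str, int]  # (region, product, amount): hypothetical columns
Key = Tuple[Optional[str], Optional[str]]


def rollup(rows: Iterable[Row]) -> Dict[Key, int]:
    """One-pass equivalent of grouping by (region, product), (region), and ().

    Each input row is added to three running totals at once, so the data
    is read only a single time.
    """
    totals: Dict[Key, int] = defaultdict(int)
    for region, product, amount in rows:
        totals[(region, product)] += amount  # detail level
        totals[(region, None)] += amount     # subtotal per region
        totals[(None, None)] += amount       # grand total
    return dict(totals)


sales = [
    ("east", "widget", 10),
    ("east", "gadget", 5),
    ("west", "widget", 7),
]
summary = rollup(sales)
```

A cube would additionally keep the `(None, product)` totals, and grouping sets let the caller pick exactly which of these combinations to maintain.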
Re:OLAP still missing... (Score:2)
Re:OLAP still missing... (Score:3, Informative)
efeu [cybertec.at]
Re:OLAP still missing... (Score:2, Informative)
For the uninitiated, OLAP stands for online analytical processing. In layman's terms, this refers to the process of interactive analysis of data, typically via incremental queries that progressively slice, dice, and refine the data set in order to reveal non-obvious relationships between various parameters.
OLAP is typically
And that's a problem, because....? (Score:3, Insightful)
True OLAP often involves many layers of analysis, with many steps of processing. I had hoped that SQL Server OLAP would help manage all that, but it doesn't do enough. To be fair, there are some nice tools to graphically create what amount to some stored procedures, but
Re:OLAP still missing... (Score:2)
Postgres is kicking butt (Score:5, Interesting)
It's got several really cool features, such as the ability to create your own index types, the ability to create your own column types, the ability to create rules for updating views, and a lot of other things that make it an absolute joy to work with.
The only thing I don't like about it is that it needs the ability to read bytea's as if they were BLOBs. Then life would be perfect!
From Fujitsu's pile, tablespaces is the most interesting feature I see - and that's actually pretty cool. That's one of the things that really allows you to realize the logical/physical separation that relational databases promise.
Point-in-time recovery (Score:2)
I'd trade most of these new features for that one right there. If you have a 10GB database full of transactional data, you can't do full dumps continuously, but equally you can't afford to lose a day or even an hour of data since the last full dump.
I know it's being worked on. This is the one feature keeping me away from PostgreSQ
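The idea behind point-in-time recovery can be shown with a toy model: take an occasional full backup, keep an ordered log of every change since, and rebuild the state at any moment by replaying the log up to that moment. This is a conceptual sketch only and says nothing about how PostgreSQL implements it; the record layout and the name `recover_to` are assumptions made for the example.

```python
from typing import Dict, List, Tuple

LogRecord = Tuple[int, str, str]  # (timestamp, key, new_value): toy format


def recover_to(base_backup: Dict[str, str],
               log: List[LogRecord],
               target_time: int) -> Dict[str, str]:
    """Rebuild state as of target_time from a full backup plus a change log.

    Assumes the log is already in timestamp order, as an append-only
    log would be, and that it starts right after the backup was taken.
    """
    state = dict(base_backup)          # never mutate the backup itself
    for timestamp, key, value in log:
        if timestamp > target_time:
            break                      # stop replay at the requested moment
        state[key] = value
    return state


# Toy data: one account balance changed three times after the last dump.
nightly_dump = {"account_42": "100"}
change_log = [
    (1, "account_42", "90"),
    (2, "account_42", "50"),
    (3, "account_42", "0"),
]
restored = recover_to(nightly_dump, change_log, target_time=2)
```

With this scheme the full dump can stay infrequent, because only the small log has to be shipped off the machine continuously.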
Re:Point-in-time recovery (Score:2)
Re:Point-in-time recovery (Score:2)
Re:Postgres is kicking butt (Score:2)
Re:Postgres is kicking butt (Score:3, Informative)
This rules! (Score:5, Interesting)
Postgres flat blows away MySQL in every way I can think of except for the fact that one has to "manually" vacuum (cleanup + reindex) the db.
If you're out there playing with MySQL or MSSQL, you owe it to yourself to give Postgres a shot.
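One common way to take the "manual" out of vacuuming is to schedule it. The sketch below only builds the command line and a crontab entry; it does not run anything. It assumes the `vacuumdb` client program that ships with PostgreSQL and its `--all` and `--analyze` options; the schedule and the helper names are arbitrary choices for the example.

```python
import shlex
from typing import List


def vacuum_command(all_databases: bool = True, analyze: bool = True) -> List[str]:
    """Build an argument list for the vacuumdb client program."""
    command = ["vacuumdb"]
    if all_databases:
        command.append("--all")      # vacuum every database in the cluster
    if analyze:
        command.append("--analyze")  # also refresh planner statistics
    return command


def crontab_line(minute: int, hour: int, command: List[str]) -> str:
    """Format a daily crontab entry that runs the given command."""
    return f"{minute} {hour} * * * {shlex.join(command)}"


# Hypothetical schedule: every night at 03:30.
nightly = crontab_line(30, 3, vacuum_command())
print(nightly)
```

The line would go in the crontab of whichever account is allowed to connect to the databases, which is a deployment detail the sketch leaves open.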
Re:This rules! (Score:4, Interesting)
PostgreSQL is probably the most well-polished and useful open source project there is (gcc being the runner up, I skip linux since there really are plenty of decent OSS alternatives to it). Good going PostgreSQL team!
Re:This rules! (Score:2)
Make sure you're using the current release and use the daemon. I think you'll be thrilled.
Cheers!
Re:This rules! (Score:2)
Sit tight through the beta period and you'll have your wish.
Re:This rules! (Score:2)
I'm curious about what that extra effort is. I develop on postgres and manage about 6 instances with a few dozen databases, and I've never had to do more than "./configure && make install" and maybe cp the "large database" config example. Can't really see how it could get simpler.
Re:This rules! (Score:2)
Of course, you'd have to have known what you were talking about to have known that before you posted your idiotic reply, but I'm sure thinking isn't your strong suit.
Good news! (Score:5, Interesting)
More servers running PostgreSQL... (Score:3, Informative)
Props to Tim Perdue for picking a solid database on which to build GForge [gforge.org]!
User-Mode Linux Management (Score:5, Informative)
I had a few problems getting it started, but the developers were very helpful.
Why corporate self-interest can be good for OSS (Score:5, Interesting)
Postgres is getting really close to the functionality and capabilities of the Big Commercial Enterprise DBMS, close enough that anyone can see that bridging that gap is quite doable. Most of the arguable weaknesses in Postgres are in the more esoteric high-end feature space, as it is already strong and quite feature complete for most routine RDBMS work. And the upcoming new version addresses a great many of those weaknesses. As the article said, this is going to be a major release.
The self-interest part is that it is a HELL OF A LOT CHEAPER for a corporation to pay people to add those last few features and bits that they want to Postgres than to pay an unholy amount of money to buy the required Oracle licenses. The Postgres engine is clean and fundamentally pretty good in an engineering sense, and so enterprise feature tweaks are relatively cheap. It is all about dollars and sense at the end of the day. Purchasing Postgres plus feature development is almost always going to be vastly cheaper than buying Oracle. And unlike Oracle, it is pretty much a one-time fixed cost. It is worth repeating that the engineering strength and scalability of the underlying Postgres platform is the primary reason the market is evolving this way. The gap between MySQL and high-end RDBMS is comparatively much too great for a company to fund closing that gap because a lot of additional arguably unrelated work may be required because of the internals. This increases time to delivery of features, increases the cost of adding high-end features, and increases the risk of problems.
If Oracle suddenly dropped its enterprise licensing costs by a couple of orders of magnitude, then it would seriously threaten Postgres development. But since that is unlikely to happen, corporate money will continue to flow into making Postgres a formidable Oracle replacement, which it is already well on its way to being.
Re:Why corporate self-interest can be good for OSS (Score:2)
> Most of the arguable weaknesses in Postgres are in the more esoteric high-end feature space
Some of those esoteric features are things like clustering/failover, which in my view aren't really so esoteric. Yes, I do know that there is third party support for it, but it isn't free.
Raw partition support would also be a good checkbox in the 'enterprise ready' tab
Re:Why corporate self-interest can be good for OSS (Score:2)
What does raw partition support have to do with enterprise applications?
When I think about raw partition access, I think about a huge amount of code that allows some minor optimizations that help only on dedicated postgres boxes.
Re:Why corporate self-interest can be good for OSS (Score:3, Informative)
You're right about this being fo
Re:Why corporate self-interest can be good for OSS (Score:2, Informative)
I take some exception, however, to your view on raw partitions vs. filesystem-based storage. At least in the Oracle world, most studies and expert opinion I have viewed generally recommend against use of ra
windows port (Score:3, Informative)
Even though it is currently in beta it works very well. The port is now being downloaded over 2000 times a week and increasing all the time.
Also in PostgreSQL 7.5 - Native Windows Port (Score:4, Informative)
that's nice, but (Score:2)
Re:that's nice, but (Score:2)
UML... (Score:2)
UML (Score:2)
Do they have a diagram or something I can look at? I want to really understand what User Mode Linux "is".
Still trying to figure it out (Score:2)
Re:Good (Score:2)
Then you probably would not have a job there would you.
Re:Good (Score:2, Interesting)
Although recently one of our employees demo'd a "clone" (not of all the features, but enough to show it's real) of our system ported to PostgreSQL.
It's being considered for some new (possibly lower margin, so free is good) products in the product family.
The old "pgadmin II" tool had a useful migration tool, so other than stored procedures, the upgrade from MSsqlserver to PostgreSQL is supposedly quite smooth. That tool is still available [postgresql.org] but is hard to find because the newer pgadmin II
GUI Tools (Score:4, Interesting)
Frankly, I still like the old TCL based "pgaccess". It was buggy as all get out, and really bogged down on larger databases, but it had some really nice tools such as the visual query designer.
The article mentions a couple of other GUI tools for accessing and maintaining PostgreSQL databases. Has anyone else used these, or are there other tools that people like?
Re:Good-Postgres and SQL Server (Score:3, Interesting)
My sense is that it would be possible to extend Postgres to have a mode fully compatible with Oracle and/or Microsoft SQL Server. What this might mean is having SQL interpreters fully compatible with the quirks of Oracle and SQL Server-identical system tables available and identical libraries. I think Oracle will be the first target here because Oracle licensing fees are much higher than SQL Server--and parts of SQL Server are harder to re-engineer(
Re:Good (Score:3, Insightful)
I suspect there are more system administrators reading slashdot than programmers.
So if you are being paid to program and you want to work with PostgreSQL, your best bet is to talk your current employer into switching.
That is because almost no one is hiring programmers for PostgreSQL or MySQL.
Or you can keep using Oracle or MSSQL and put marketable skills on your resume.
Re:Good to Hear... (Score:5, Informative)
Au contraire, there are PHP interfaces for PostgreSQL, Oracle, Sybase, and MSSQL built right in to the source distribution. I seem to recall that back in the Bad Old Days before Mac OS X, when you had to compile things yourself, building PHP with all the necessary libraries was a huge pain, but now it's a trivial thing. Marc Liyanage maintains a PHP module package [entropy.ch] that snaps right into the built-in Apache web server on your Mac, and it already has most of the necessary bells and whistles [entropy.ch] built in.
Re:Good to Hear... (Score:2)
Still, the interface should be the same in your PHP code, so you don't really care what the back-end is unless you're doing something funky. I'm just glad to see more and more open source databases getting play.
Re:Good to Hear... (Score:4, Insightful)
> the primary DB System for so long has been MySQL.
Care to qualify that statement? Ever hear of Oracle? Or DB2 or SQL Server or Sybase or...?
advanced features can be ignored (Score:3, Interesting)
Re:Hehe... (Score:2)
PgSQL is my choice for a database.
Superior technology is only understandable by superior elite intellect, as you admit.
Yeah, this flamebait is better than yours.