Sneak Peek at IBM 'Viper' DB2 Release
Rob let us know that Computer Business Review magazine is reporting that IBM is about to add more fuel to the database fire. The company has offered up a sneak peek at their upcoming "Viper" release of their DB2 database. From the article: "DB2 Viper will be distinct from current DB2 database implementations in that it will be able to store XML formatted data inside the database natively--XML support will not be bolted onto the side. Viper will also support relational data stores, of course, and access to those database tables using the SQL programming language."
Re:XML database (Score:4, Insightful)
That's probably because an XML database is NOT a decent idea. XML is NOT meant to be used as a way to store data! Rather, it's a way to communicate data between entities.
Sadly, XML is one of those words that has the magic power to make marketing people happy. So they put it everywhere. If that doesn't work, they just put more.
Re:XML database (Score:3, Insightful)
Late to the party (Score:4, Insightful)
Xml support has been on the most wanted list for most companies for quite some time now. With Web Services being used everywhere, and most data formats going XML, representing all those in old-style tabular form and querying them is such a pain. Now, Sql Server 2005 and Oracle have excellent Xml support right now, not next year. Which means, IBM, you are late. The desperate switchers are already switching (I know many who did to MSSQL 2005). And many for whom it is desirable have been playing around with it for at least a year now. By the time Viper is done, they will already be running some database which supports Xml.
Which not only means that you would get very little of the Xml pie, but also that you will have to work real hard to make sure your existing customers don't move to Oracle or MS, because they want Xml support much earlier.
Re:So? (Score:3, Insightful)
Why not pay someone else to do that kind of work?
[And yes, you can donate to PostgreSQL development!]
Re:XML database (Score:2, Insightful)
There were huge debates about the "abstract model" of a relational database that didn't make sense in The REAL WORLD (TRW), because "real" problems were more complex than the relational model and performance would suffer.
I don't know that an XML database is "better", but then again, I don't know that it isn't. Maybe I'll learn something!
Re:Oracle (Score:4, Insightful)
Oracle has stored XML data in a tree structure and allowed querying via XQuery since version 9.
Stephen
Re:Technology Pot Pie (Score:3, Insightful)
Every so often a company other than IBM will try Lotus Notes. I've encountered a couple of them over the course of my career. Interest quickly devolves into the deep kind of loathing that I describe here. That usually results in a quick move (back) to MS Exchange server.
I keep hoping that IBM will give up on Lotus Notes and kill that atrocity, just like they killed Smartsuite and OS/2. Unfortunately they seem bound and determined to get some value for their 6 BILLION DOLLARS, even if it means saddling every employee in the company with the worst E-Mail client on the planet.
MSSQL (Score:2, Insightful)
For processor-intensive searches, you have the option to throw hardware at the problem, moving up into RISC and mainframe platforms if needed.
Re:So? (Score:2, Insightful)
Re:XML Database, Good or Bad (Score:3, Insightful)
> Lots of people would like to be able to define records in their RDBMS that have arbitrary fields that the
> designer of the schema did not know about when the database was built. SQL does not cope with this scenario at all.
Well, relational databases can handle this situation - you just have to avoid relational *modeling* within the database. And the challenge you get into at that point is that you lose some valuable features such as foreign key support, etc. But it is doable, just performs slowly and is labor-intensive to create. Still, it has its place.
> However in my view correct normalisation solves most of these issues and makes
> the need for native XML unnecessary.
Hmmm, not sure about that. Dynamic models weren't really an issue thirty years ago when Codd was coming up with these ideas: business changed much more slowly. Today we're changing business rules so quickly - and expect to modify major systems in the blink of an eye. As mentioned above, we can support this using relational systems, but we end up heading away from normalization, not towards it.
> The complexity of such an implementation would be high, particularly within the context of a
> database that still has good indexing, table management and performance. Foreign keys would
> be an intriguing challenge. There is nothing about the problem that is inherently unsolvable but performance would be a real challenge.
Until people determine what the 'best practices' are for such a database they can get into trouble with it: how do you convert the data when the application changes? Is it as easy as it is today with relational databases? Or do you have to write entire data conversion apps (like you did with hierarchical databases twenty years ago)? How do you design & tune for performance? How do you handle data quality? Well, it will probably be best to start with some very small projects.
Re:Excellent (Score:2, Insightful)
Actually the pricing [ibm.com] of DB2 is quite reasonable -- especially for the express version.
<flame suit on>The other issue is that many companies using products such as MySql have to re-implement features that are standard in other systems. Features such as robust replication, clustering, etc also are just coming on line for MySql and Postgres, but have been part of DB2 and friends for years.
<flame suit off>
Re:XML Database, Good or Bad (Score:3, Insightful)
> database ever being more agile than a properly normalized one. Either I'm missing what you mean by dynamic
> model, or you don't understand the benefits of normalization.
Right - I'm not talking about 'denormalization' - in the way that you would denormalize a model to simplify sql and improve performance on a reporting application. I'm talking about not applying that set of database modeling rules at all.
> You do know one of the main goals of the relational model was to allow agility right?
Yep, and it has done that well: relational databases are far more agile than the hierarchical ones that preceded them. But - they aren't agile enough for some problems.
For example, let's say that you have a bicycle-shop-management application that you sell to small shops. You sell it for, what? $5,000 plus 18% annual maintenance. It handles bicycle inventory, sales, some light marketing, etc. Well, one day one of your customers decides to sell books about bicycles. Well, perhaps you've got a generic inventory table that he can describe things in - but if you've got a 3-5NF model - it isn't that generic. There are no columns specific to books in it. And he really can't afford to spend $10-50k on an update to support that.
So, ideally you've got a model in which some attributes of items are kept in key-value pair tables. This isn't wonderful for a lot of reasons - but it does give the application owner the ability to define new kinds of attributes that were unforeseen by the dba. And, if done well, he can even define (in the database) rules for when some of these attributes are required, what their domain is, what their type is, what their default is, etc. These "dynamic attributes" would give the user the ability to create whatever new columns they want to describe the entity "book".
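To make the key-value idea concrete, here's a rough sketch in Python with the stdlib sqlite3 module. Every table, column and function name in it (item, attr_def, item_attr, define_attr, add_item, get_attrs) is made up for illustration, and the validation is only one of many ways you could enforce the rules. It just shows attribute definitions (type, required, default) living in the database rather than in the schema.

```python
import sqlite3

# Hypothetical schema: all table and column names are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    item_id INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL,
    name    TEXT NOT NULL
);
-- Attribute rules are rows, so the shop owner can add new ones without a DBA.
CREATE TABLE attr_def (
    kind          TEXT NOT NULL,
    attr          TEXT NOT NULL,
    type          TEXT NOT NULL CHECK (type IN ('text', 'int')),
    required      INTEGER NOT NULL DEFAULT 0,
    default_value TEXT,
    PRIMARY KEY (kind, attr)
);
-- The key-value pairs themselves; every value is stored as text.
CREATE TABLE item_attr (
    item_id INTEGER NOT NULL REFERENCES item(item_id),
    attr    TEXT NOT NULL,
    value   TEXT,
    PRIMARY KEY (item_id, attr)
);
""")

def define_attr(kind, attr, type_, required=False, default=None):
    """Register a new dynamic attribute for one kind of item."""
    conn.execute("INSERT INTO attr_def VALUES (?, ?, ?, ?, ?)",
                 (kind, attr, type_, int(required), default))

def add_item(kind, name, **attrs):
    """Insert an item after checking its attributes against attr_def."""
    defs = {row[0]: row[1:] for row in conn.execute(
        "SELECT attr, type, required, default_value FROM attr_def WHERE kind = ?",
        (kind,))}
    unknown = set(attrs) - set(defs)
    if unknown:
        raise ValueError(f"undefined attributes: {sorted(unknown)}")
    values = {}
    for attr, (type_, required, default) in defs.items():
        value = attrs.get(attr, default)
        if value is None:
            if required:
                raise ValueError(f"missing required attribute: {attr}")
            continue
        if type_ == "int":
            int(value)  # raises ValueError if the value is not an integer
        values[attr] = str(value)
    item_id = conn.execute("INSERT INTO item (kind, name) VALUES (?, ?)",
                           (kind, name)).lastrowid
    conn.executemany("INSERT INTO item_attr VALUES (?, ?, ?)",
                     [(item_id, a, v) for a, v in values.items()])
    return item_id

def get_attrs(item_id):
    """Return the dynamic attributes of one item as a dict of strings."""
    return dict(conn.execute(
        "SELECT attr, value FROM item_attr WHERE item_id = ?", (item_id,)))

# The shop owner decides to sell books: no schema change, just new rows.
define_attr("book", "author", "text", required=True)
define_attr("book", "pages", "int")
define_attr("book", "binding", "text", default="paperback")
book_id = add_item("book", "Bicycle Repair Basics", author="A. Author", pages=200)
```

The "not wonderful" part shows up right away: every value is a string, there are no foreign keys on the values, and the checks live in application code instead of in constraints.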
Additionally, you could design the model to support the concept of "dynamic entities": in which concepts such as book, bike, helmet, wrench, tire can be logical subtypes of inventory item. Not just identified through a single simple tag - these concepts can be related through many-to-many relationships to one another, to multiple stores, to customers, etc. The relationships between these entities can be dated, prioritized, weighted, and the entities can inherit from multiple parents in this case. Now when the store owner wants to add the concept of book they can *easily* also create overlapping sub-categories below it (mountain biking, road biking, family biking, competitive biking, history, etc) - and then relate these items to other inventory items that share that category. End result - you click on the bike shop's web site and look at a heading called "winter biking" - and see everything remotely related to this concept. And - it was easy to set up, and there's nothing specific to "winter biking" in the structure of the data.
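Again as a sketch only: the "dynamic entities" idea could look something like the following in Python and sqlite3. The concept and concept_link tables and the helper names are hypothetical, and dating and priorities on the links are left out. The point is that a concept can hang under several parents, and "everything remotely related" becomes one recursive query with nothing about winter biking in the structure.

```python
import sqlite3

# Hypothetical schema: concept and concept_link are invented names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
-- Many-to-many: a concept may have several parents (multiple inheritance).
CREATE TABLE concept_link (
    child_id  INTEGER NOT NULL REFERENCES concept(concept_id),
    parent_id INTEGER NOT NULL REFERENCES concept(concept_id),
    weight    REAL NOT NULL DEFAULT 1.0,  -- stored but not used in this sketch
    PRIMARY KEY (child_id, parent_id)
);
""")

def add_concept(name, parents=()):
    """Create a concept and link it under each named parent."""
    concept_id = conn.execute(
        "INSERT INTO concept (name) VALUES (?)", (name,)).lastrowid
    for parent in parents:
        conn.execute(
            "INSERT INTO concept_link (child_id, parent_id) "
            "SELECT ?, concept_id FROM concept WHERE name = ?",
            (concept_id, parent))
    return concept_id

def everything_under(name):
    """Return the names of all concepts below `name`, at any depth."""
    rows = conn.execute("""
        WITH RECURSIVE below(concept_id) AS (
            SELECT concept_id FROM concept WHERE name = ?
            UNION  -- UNION (not UNION ALL) drops duplicates reached via two parents
            SELECT l.child_id
            FROM concept_link l JOIN below b ON l.parent_id = b.concept_id
        )
        SELECT c.name
        FROM below b JOIN concept c USING (concept_id)
        WHERE c.name <> ?
        ORDER BY c.name
    """, (name, name))
    return [row[0] for row in rows]

# All of this is data, entered by the store owner, not schema.
add_concept("inventory item")
add_concept("winter biking")
add_concept("book", parents=["inventory item"])
add_concept("tire", parents=["inventory item"])
add_concept("studded tire", parents=["tire", "winter biking"])
add_concept("winter riding guide", parents=["book", "winter biking"])

winter = everything_under("winter biking")
inventory = everything_under("inventory item")
```

Here winter picks up both the tire and the book even though they sit in different branches of the inventory tree.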
Sort of similar to what the topic maps community is trying to do with XML:
http://www.topicmaps.org/ [topicmaps.org]
Though in my opinion they are only shooting for a subset of what we should be trying to do at this time, and what we can do via relational databases or whatever. Still, with strong db2 support for topic maps that may be the easiest way to go for now.
XML Query and XML in databases (Score:4, Insightful)
The IBM article does say that their Viper product will support XML Query (it's also known as XQuery).
So yes, looks like they will be supporting XML Query.
Is it a good thing? Some pretty smart people seem to think it's a good idea, so maybe it's worth at least taking the time to listen to them.
If the only XML you've dealt with is the result of marking up relational tables, you might not see much advantage.
If you have a lot of XML documents, though (say, five million) that all validate to an XML Schema, you know some things about them. You might know, for example, that all of the price elements contain numbers. You might know that the description elements may contain embedded partnumber elements intermixed with the text, and that those partnumber elements contain part numbers formatted a particular way.
A database can build an index based on this sort of information, and can do very efficient searches and "joins".
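As a toy illustration of that (not how any particular product does it), here is a Python sketch using only the standard library. The three sample documents, the element names inside them and the function names are all assumptions made up for the example. It builds a sorted index over price elements, which is only safe because we "know" they contain numbers, and an index from partnumber values to the documents that mention them.

```python
import bisect
import xml.etree.ElementTree as ET

# Made-up sample documents; a real collection would be validated against a schema.
DOCS = {
    "doc1": "<item><price>19.95</price><description>Fits "
            "<partnumber>AB-1001</partnumber> and "
            "<partnumber>AB-1002</partnumber>.</description></item>",
    "doc2": "<item><price>5.00</price><description>Replaces "
            "<partnumber>AB-1001</partnumber>.</description></item>",
    "doc3": "<item><price>42.50</price><description>No related parts."
            "</description></item>",
}

def build_indexes(docs):
    """Build a sorted (price, doc_id) list and a partnumber -> doc_ids map."""
    price_index = []
    part_index = {}
    for doc_id, text in docs.items():
        root = ET.fromstring(text)
        for price in root.iter("price"):
            # Relies on the schema guarantee that price holds a number.
            price_index.append((float(price.text), doc_id))
        for part in root.iter("partnumber"):
            # Finds partnumber elements even when mixed into description text.
            part_index.setdefault(part.text, set()).add(doc_id)
    price_index.sort()
    return price_index, part_index

def docs_priced_between(price_index, low, high):
    """Range lookup by binary search instead of scanning every document."""
    start = bisect.bisect_left(price_index, (low, ""))
    end = bisect.bisect_right(price_index, (high, "\uffff"))
    return [doc_id for _, doc_id in price_index[start:end]]

price_index, part_index = build_indexes(DOCS)
cheap = docs_priced_between(price_index, 5, 20)
# A crude "join": which documents mention the same part number?
shared = part_index["AB-1001"]
```

With five million documents instead of three you would want this inside the database, which is the whole argument for native support.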
You might also think about what you could do if you had all of the XHTML documents from some major Web site (perhaps an Intranet corporate site, or maybe your own personal site) stored in a database in such a way that you could easily make different views of the information.
I think the real niche for XQuery might be as middleware: the ability to run queries against multiple databases, whether XML or relational or flat file or whatever, without caring about how the data is stored, can be very interesting, not to say useful.
ISO SQL has also standardised on how to map between SQL and XML Query data types, and on how to evaluate XML Query expressions embedded in SQL expressions. The Java Community Process has been working on XQJ, a way to reach out to XQuery data stores from within Java.
The XML Query Home Page [w3.org] (disclaimer: I maintain this) lists some 45 implementations, both proprietary and open source. Not all of these are complete, but, as others have noted here, XML Query is a W3C Candidate Recommendation: we're asking for public feedback from implementors, and trying to make sure that the specification is clear and precise enough that implementations all work the same way.
I think XML Query support in SQL databases is likely to become pretty widespread. Until it is, you can also use some open source implementations that support JDBC, as well as one or other of the commercial implementations that support query optimisation over external SQL-based data stores.