Sneak Peek at IBM 'Viper' DB2 Release
Rob let us know that Computer Business Review magazine is reporting that IBM is about to add more fuel to the database fire. The company has offered up a sneak peek at their upcoming "Viper" release of their DB2 database. From the article: "DB2 Viper will be distinct from current DB2 database implementations in that it will be able to store XML formatted data inside the database natively--XML support will not be bolted onto the side. Viper will also support relational data stores, of course, and access to those database tables using the SQL programming language."
IBM answers MS (Score:4, Interesting)
I wonder what the price point for Viper is going to be in comparison. I already know what it is for the various versions of SQL Server 2005. Ouch! I'm waiting for my Enterprise and Developer versions to show up now so I can play more (I've been playing with the betas for a long time now as I do DBE work as well).
Usefulness? (Score:3, Interesting)
So it's not the storage that counts; it is the ability to extract useful information from the text field/CLOB without requiring a great amount of processing overhead. That is where I wonder how useful this is, except in situations where there is very little post-processing or querying to be done against the XML, for example if I am always just going to render the XML or pass it along unchanged. Even then, in terms of processor time, it just isn't that hard to write good code to pull the data from a regular SQL database and output it as XML, thus gaining all of the other advantages that a modern DBMS has over flat file storage without imposing the dreadful data overhead required for all of the XML tags.
Am I missing something?
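For what it's worth, here is a minimal sketch of the "pull it from a regular SQL database and output it as XML" approach, using Python's standard sqlite3 and xml.etree.ElementTree modules. The table name, column names and the rows_to_xml helper are all made up for illustration; nothing here is specific to DB2.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(conn, query, root_tag, row_tag):
    """Run a query and render each row as an XML element (hypothetical helper)."""
    cur = conn.execute(query)
    # Column names from the result set become the child element names.
    names = [d[0] for d in cur.description]
    root = ET.Element(root_tag)
    for row in cur:
        item = ET.SubElement(root, row_tag)
        for name, value in zip(names, row):
            ET.SubElement(item, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO parts VALUES (?, ?)", [(1, "bolt"), (2, "nut")])

xml_text = rows_to_xml(conn, "SELECT id, name FROM parts ORDER BY id", "parts", "part")
print(xml_text)
```

The tags exist only in the output; the stored rows carry no markup overhead.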
Re:"the SQL programming language" (Score:5, Interesting)
(Actually I think the very latest SQL standard may have some support for recursion to handle queries like this one. I don't know if it is Turing-complete though; I suspect not.)
Does this mean SQL is bad? No. Partly because it is less powerful than a full programming language, the database can often work out roughly what a given query will need to access and so make an efficient query plan for it. If what you want is expressible as SQL, it's very often a lot faster than coding the same thing in a general-purpose language, and easier to write and understand.
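The recursion support mentioned above does exist: SQL:1999 added recursive common table expressions (WITH RECURSIVE). The query the parent had in mind isn't shown, so the sketch below assumes a typical case, walking a reporting hierarchy, and runs it on SQLite through Python's sqlite3 module. The staff table and its contents are invented for the example.

```python
import sqlite3

def descendants(conn, root):
    """Return everyone who reports to `root`, directly or indirectly."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(name) AS (
            SELECT name FROM staff WHERE boss = ?      -- direct reports
            UNION
            SELECT s.name FROM staff s
            JOIN sub ON s.boss = sub.name              -- reports of reports
        )
        SELECT name FROM sub ORDER BY name
        """,
        (root,),
    ).fetchall()
    return [r[0] for r in rows]

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT PRIMARY KEY, boss TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("alice", None), ("bob", "alice"), ("carol", "bob"),
     ("dave", "carol"), ("erin", "alice")],
)

print(descendants(conn, "alice"))
```

The depth of the hierarchy is not fixed in the query, which is exactly what plain joins cannot express.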
XML Database, Good or Bad (Score:4, Interesting)
First, is there a difference between doing this in a relational database versus another kind (say, an object DB)? Perhaps so, but I wish to focus on the RDBMS, since it is the one that is on topic here and the one that seems so counterintuitive.
Marked up data (XML, HTML, perhaps even SGML) consists of field values _and_ the schema of the fields themselves (even if not always the base data type). Whilst it may be necessary to have the grammar to be certain about the full domain of the *ML, there is enough in the marked up data to construct a record from the input data. Think about it: this means that each record arriving at the database contains some information about the schema of the record as well as the data itself.
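To make that concrete, here is a small sketch in Python (standard xml.etree.ElementTree only) that rebuilds a record, field names included, from nothing but the marked up input. The record_from_xml helper and the sample documents are hypothetical, and it only handles flat records. Note that every value comes back as a string, which is the "not always the base data type" caveat.

```python
import xml.etree.ElementTree as ET

def record_from_xml(text):
    """Return (record_type, {field: value}) for a flat XML record."""
    root = ET.fromstring(text)
    # Each child tag is a field name carried by the data itself.
    fields = {child.tag: (child.text or "").strip() for child in root}
    return root.tag, fields

# Two records of the same kind that carry different sets of fields.
a = record_from_xml("<person><name>Ann</name><city>Oslo</city></person>")
b = record_from_xml(
    "<person><name>Bob</name><phone>555-0100</phone><city>Rome</city></person>"
)
print(a)
print(b)
```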
A database that took this *ML and integrated it natively would, in my world, allow the user to create tables with an indeterminate number of fields that could vary from record to record whilst still allowing normal RDBMS functionality.
The complexity of such an implementation would be high, particularly within the context of a database that still has good indexing, table management and performance. Foreign keys would be an intriguing challenge. There is nothing about the problem that is inherently unsolvable but performance would be a real challenge.
I don't think that this functionality is a category killer. But I can imagine why some people love the idea. Lots of people would like to be able to define records in their RDBMS that have arbitrary fields that the designer of the schema did not know about when the database was built. SQL does not cope with this scenario at all. However, in my view correct normalisation solves most of these issues and makes native XML unnecessary. Perhaps it would have been easier for IBM to ship DB2 with a copy of McGoveran and Date.
Re:"the SQL programming language" (Score:3, Interesting)
Re:Oracle (Score:3, Interesting)
Oracle basically chucks its XML into a LOB
How *else* do you store a value from a type in a database?
How does Oracle store integers? "Uhh, that's different" I hear you mumble. No, it's not. An XML document and the associated tree representation is a *value*, an instance of an *XML data type*, with associated operators (xpath, text search, update, etc). So it goes into an attribute (column).
Go back and review your relational theory (that advice applies to 99.99% of users and vendors unfortunately).
If Oracle's marketing has convinced you of something different, then that's their marketing department's fault. The exact implementation (how the XML tree is stored) and syntax (how you query it) are irrelevant. The relational model and classical type theory (which predates the RM) already tell you how to think logically and abstractly about any data storage and manipulation task, without regard to the peculiarity of any particular product.
For a more concrete example, not using the syntax of any particular product:
Of course, this needs a cup and a half of syntactic sugar to make it more pleasant when using XML-heavy applications (for instance, XPath could be embedded a little more gracefully), but surely you can see that all XML databases can reduce to the same model. Those that are created with ignorance of the relational model won't be as useful.
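A minimal sketch of "an XML document is just a value in a column, with its own operators", written against Python's standard sqlite3 and xml.etree.ElementTree modules. The orders table and the xpath_text function are names invented for this illustration, and ElementTree only understands a limited subset of XPath.

```python
import sqlite3
import xml.etree.ElementTree as ET

def xpath_text(doc, path):
    """Operator on the XML type: text of the first node matching `path`."""
    node = ET.fromstring(doc).find(path)
    return None if node is None else node.text

conn = sqlite3.connect(":memory:")
# Expose the operator to SQL under a hypothetical name.
conn.create_function("xpath_text", 2, xpath_text)

conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, doc TEXT)")
conn.executemany(
    "INSERT INTO orders (id, doc) VALUES (?, ?)",
    [
        (1, "<order><customer>Ann</customer><total>40</total></order>"),
        (2, "<order><customer>Bob</customer><total>15</total></order>"),
    ],
)

# Ordinary relational query; the XML operator appears in SELECT and WHERE.
rows = conn.execute(
    "SELECT id, xpath_text(doc, 'customer') FROM orders "
    "WHERE CAST(xpath_text(doc, 'total') AS INTEGER) > 20"
).fetchall()
print(rows)
```

How the document is physically stored (here, plain text) is invisible to the query, which only sees a column and a function.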
A well-designed relational database would already be an XML (hierarchic) database, and would already be an object (network) database, because those are both less general than the RM (entities related by arbitrary assertions).
One problem with today's SQL products (and they have MANY) is that you can't create your own types easily. You should be able to add XML support, object support, or whatever else, as easily as you can with a general-purpose programming language. You shouldn't have to wait for the vendor to "add" it.
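As a rough illustration of how far you can get today, Python's sqlite3 module lets you register an adapter and a converter so that a column declared with a made-up type name round-trips as a parsed XML element. The "xml" type name and the notes table are assumptions for the example. Note that this lives in the client driver rather than in the engine, which is arguably the very limitation being complained about.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Python object -> stored text when used as a query parameter.
sqlite3.register_adapter(
    ET.Element, lambda elem: ET.tostring(elem, encoding="unicode")
)
# Stored bytes -> Python object for columns declared as "xml".
sqlite3.register_converter(
    "xml", lambda raw: ET.fromstring(raw.decode("utf-8"))
)

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body xml)")

note = ET.fromstring("<note><to>Ann</to></note>")
conn.execute("INSERT INTO notes (body) VALUES (?)", (note,))

# The value comes back as an Element, not as a string.
stored = conn.execute("SELECT body FROM notes").fetchone()[0]
print(stored.find("to").text)
```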
Imagine if you had to *wait for a new release* to get XML support in Java or Perl. Yeesh. Yet database users seem perfectly content to suck down the crap from the vendors. They don't know what to ask for or how to evaluate what they get. Even though this was mostly figured out 30 years ago.
Re:XML Database, Good or Bad (Score:3, Interesting)
I had the distinct pleasure to discuss exactly this topic with Date yesterday. Yes, that Date. To say that he's not pleased with the idea of XML in databases would be a very British understatement. In his words: "it's a throwback to the hierarchical database model, that has already been proven defective once and for all".
On top of that I would like to add that it's extremely rare that one encounters an international celebrity within the academic environment that is such a nice person.