Do XML-based Databases Live Up to the Hype?
douthitb asks: "I have recently started work as a contractor with a company developing/improving an application for exchanging large amounts of data. The current solution exchanges data via XML, but the data itself is stored in a SQL Server database. There is a concern about the overhead involved with wrapping and unwrapping the XML to get the data in and out of a relational database. The proposed solution is to use Tamino, an XML-based database. Neither I nor any of the other developers have any experience with Tamino, but the desired result is to remove the bottleneck of converting the XML back and forth. Does anyone have experience using Tamino (or any other XML-based database)? What benefits and/or difficulties did you have in using an XML database, as opposed to its relational counterpart? How large of a learning curve should be expected with a product like this? Do XML databases really live up to the hype? A similar topic was discussed on Slashdot way back when, so I was hoping to get some more up-to-date feedback on the subject."
"Sales reps from Software AG, the makers of Tamino, were brought in to discuss the benefits of their product with us. They, of course, presented Tamino as the end all, cure all database system (it will even clear your acne and make you popular with the girls!). The management of the company I'm contracting with were basically eating out of the sales reps' hands, without asking any of the "tough" questions about what the product can do; I was less convinced. Doing some initial searching on the Internet, I have had trouble finding much information about Tamino outside of the Software AG website."
I've worked with the Tamino kit... (Score:5, Insightful)
If things are fixed, there are a lot of other options out there for faster manipulation. XMLBeans (now an Apache project, formerly BEA) is good stuff. Hibernate is lovely kit for mapping objects to a relational DB.
Re:I've worked with the Tamino kit... (Score:2, Interesting)
The thing the XML databases are nice for is if folks can't really lock down the schema
If you don't know the structure of your data, you're not dealing with data at all, but incoherent noise, which should be treated as an opaque object.
There are few shortcuts in life, and data storage is no exception. If you don't take the time to understand your data OR admit you don't understand it and treat it as an opaque object, you will likely get burned. Sometimes you won't, but don't let that fool you. You can
Re:I've worked with the Tamino kit... (Score:2)
XML may make sense when you're forced to temporarily 'shim' two things together, but it puts the 'k' in kluge.
Proverb (Score:5, Funny)
Re:Proverb (Score:2)
But an ex-colleague seemed to take months to solve a parser issue. The library used was barfing on some stuff, or something like that.
I suppose everything is fine if the other end actually sends you proper XML instead of broken XML... My guess is my ex-colleague had to deal with a broken case.
I'd have used Perl instead of Java (which my colleague was using).
Aside: It seems more common for the perl module writers to actually use the stuff they write, an
Obligatory Windows Server 2003 commercial quote (Score:1)
yeah, i support a tamino server at work. (Score:4, Informative)
Berkeley DB XML (Score:5, Informative)
Berkeley DB XML 2.0 [sleepycat.com]
Re:Berkeley DB XML (Score:3, Informative)
Tamino seems to claim recent support for "Enterprise High Availability" but I'm not sure what that means.
Before I'd decide on XML, SQL, flat files, OODBMS, RDBMS, etc., I'd want to know four things:
1. How will it be secured?
2. How will I back it up and recover it?
3. How will I replicate/mirror/cluster it locally and over distances in case of a failure/disaster?
4. Do upgrades require downt
Re:Berkeley DB XML (Score:2)
Have you looked at the Berkeley DB XML High Availability product? It definitely supports multi node clusters. Remote replication should be fairly trivial to achieve, too (although I haven't personally tried it).
Oracle and XSQL (Score:5, Interesting)
Re:Oracle and XSQL (Score:3, Interesting)
Yeah, but how much data? And how many calls/second?
A few years ago I worked on a day trading system that talked to a SQL Server database and we were going to use XML to wrap the data but found that it did add significant time to the commits, and in that business time was $$$ so we left it out. (and yes, we spooled commits out of a separate thread, etc. etc. but don't ask; it was a complicated architecture that I was saddl
Re:Oracle and XSQL (Score:3, Informative)
The devil's in the details (Score:5, Informative)
Relational databases with good XML support (my background is DB2 but most major databases should be able to do this) reach a good compromise by giving you access to normalised relational data as XML (which you can complement with XSLT if that's what needs to be done), while preserving it internally reduced to its bare essence as data (according to relational calculus' idea of what constitutes the bare essence of data, anyway).
On the other hand, for single-app applications, or data that is more file oriented than datum-oriented (databases of XML documents where the document rarely or never needs to be abstracted from the data it contains), XML databases offer simplicity and efficiency by removing the need to work out a relational data model. Why break up your structured documents into a DBA's hand-tuned data model when 99.9% of your queries will just build these data sets back into XML documents (even when DB2, Oracle, and I assume SQL Server can automate this last task)? An XML database can give you more flexibility in querying than an all-XSLT solution, while saving a lot of unnecessary work over an SQL-to-XML solution for what is really an XML-to-XML application.
As I see it, that's the big picture. The actual decision has to come down to your applications. An XML database will be less efficient for non-XML applications, plain and simple. Querying XML cannot be made as fast as querying relational tables, meaning extra overhead for non-XML apps. But *your* application incurs overhead in turning relational tables into XML (probably via the RDBMS's internal facility), and in transforming it if necessary. The question is therefore: who makes more queries on the database, this application or other non-XML ones? Who will make more queries in 5 years?
If you answer 'others' to either question, use a relational database--their XML support is decent now and will only get better, and they're far more popular in business which is an important CYA factor. If you answer 'your app' or 'other XML-based apps' for both questions, it's time to check out what XML databases have to offer right now. I expect other posts to comment on the current state of the art right now, but you can expect things to only get better as industry support for XQuery et al. improves--but don't expect them to *ever* pass up the relational databases in terms of raw performance, it's impossible. But as the evolution from Assembler to C to Java has shown in programming languages, the day may come when raw performance takes a back seat to other concerns.
I agree (Score:5, Interesting)
XML is kinda nice for some things, and really rotten for some things. Please do yourself a favor and sit down and try to decide what problem you are trying to solve. XML really stinks when it comes to sets: something that SQL-based databases excel at.
I think that with the XML fetish we have these days, we are reverting to the pre-SQL days of CODASYL or IMS (pre-1980s for those of you young'uns).
Re:I agree (Score:2, Interesting)
Stop bashing Charles Bachman's grand ideas. Dr. Codd used "math" to incorrectly justify bashing Bachman's beautiful techniques in their debates. But Bachman's ideas were more natural and organic. After all, natural selection didn't lead to relational structures in our brain. Do you have a relational brain? No? Why not? Because relational is too artificial
Re:I agree (Score:1)
Note to the person who modded the above as "troll". I think it was meant as a joke. It seems they were implying that THEY have a relational brain, like a robot or something.
Another "I agree" (Score:1, Interesting)
It sounds like you are in danger of changing the original data store (the relational database) in order to preserve a data transfer mechanism (XML). This is probably a bad idea.
Why is the data in the database to begin with? Is it the database for some other business application? Probably - be ca
Re:The devil's in the details (Score:2, Interesting)
But as the evolution from Assembler to C to Java has shown in programming languages, the day may come when raw performance takes a back seat to other concerns.
The point of a database is *data integrity*, not data storage and retrieval. Those are side issues. I can store data very quickly by dumping it to a raw disk device (/dev/hda1). But I will have a hell of a time guaranteeing data integrity (for instance, does each order item have a corresponding inventory item?).
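To make the order-item question concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`inventory_item`, `order_item`, `sku`) are made up for illustration, and SQLite merely stands in for whatever RDBMS is actually in use. The point is only that the engine itself, not the application, refuses an order item with no matching inventory item.

```python
import sqlite3


def demo_integrity():
    """Return (rejected, row_count) after trying to insert an orphan order item."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical schema: every order item must point at an inventory item.
    conn.execute(
        "CREATE TABLE inventory_item (sku TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE order_item ("
        " id INTEGER PRIMARY KEY,"
        " sku TEXT NOT NULL REFERENCES inventory_item(sku),"
        " quantity INTEGER NOT NULL CHECK (quantity > 0))"
    )
    conn.execute("INSERT INTO inventory_item VALUES ('A-100', 'widget')")
    conn.execute("INSERT INTO order_item (sku, quantity) VALUES ('A-100', 3)")
    try:
        # No inventory item has this SKU, so the database rejects the row.
        conn.execute("INSERT INTO order_item (sku, quantity) VALUES ('NO-SUCH', 1)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    count = conn.execute("SELECT COUNT(*) FROM order_item").fetchone()[0]
    conn.close()
    return rejected, count
```

Calling `demo_integrity()` reports that the orphan row was rejected and that only the valid order item was stored. Dumping bytes to a raw device gives you no equivalent check.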
Your evolution example of C to Jav
Relational-friendly text alternative (Score:3, Interesting)
Another potential problem is that existing RDBMS tend to be strong-typed. However, "dynamic relational" is not out of the question. Just because current RDBMS are strong-typed and have "static schemas" does not mean that is the only way to do it. There is a distinction between limits of implementations and limits of relational theory.
Adding MORE XML Won't Fix It (Score:3, Interesting)
So, can you explain how an XML database will fix this?
Your database still needs to translate the verbose, human readable XML into an internal storage representation. If you're transferring the data between two SQL databases now, then I can't see why it should matter if you're parsing XML and putting into a "traditional" row-column RDBMS or parsing XML and putting into a datastructure more suited for storing XML data. The parsing is going to take exactly the same amount of time.
The XML database would help if you've mapped your data representation to XML, and are having a difficult time persisting it to SQL. For some data representations, going from XML to parsed binary RDBMS representation back to XML may be difficult, and it may be easier to just go from XML to parsed binary representation of XML back to XML again. But either way, you're doing the parsing.
You're solving the wrong damned problem.
Re:Adding MORE XML Won't Fix It (Score:2, Insightful)
Re:Adding MORE XML Won't Fix It (Score:2)
That depends on the Schema of the XML and the RDBMS. SQL to XML to SQL requires no rewriting of XML to relations. None. Zero. Zilch. It's simply parsing, and no faster or slower than rewriting XML to an internal XML binary representation.
Again, if it's specifically XML-centric data, and the difficult part is getting into and out of a RDBMS (and the RDBMS doesn't add any value), then go XML all the way. It's a good way to go (assuming, of course, that y
"Overhead" is not important here... (Score:2)
Re:"Overhead" is not important here... (Score:1)
The poster can correct me if I'm wrong, but I don't think they're just storing chunks of XML into the database. I wager they have a complicated XML document which they are parsing to extract keys and values. Those keys and values are used to make SQL statements which don't include XML at all. The reverse process happens when extracting data -- normal "SELECT * FROM foo WHERE boo = baz" or what have you is used, then that data is used to build an XML tree.
It is that wrapping and unwrapping that I beli
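For anyone who hasn't seen this kind of round trip, here is a rough sketch of the "unwrap, store, re-wrap" cycle described above, using Python's standard sqlite3 and xml.etree.ElementTree modules. The `orders` document shape and table are invented for the example. The original poster's real schema is unknown, and SQLite stands in for SQL Server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document; the real exchange format is not shown in the thread.
SAMPLE = """<orders>
  <order id="1"><customer>Acme</customer><total>19.99</total></order>
  <order id="2"><customer>Initech</customer><total>5.00</total></order>
</orders>"""


def unwrap(xml_text, conn):
    """Parse the XML, pull out keys and values, and insert them with plain SQL."""
    root = ET.fromstring(xml_text)
    rows = [
        (int(order.get("id")), order.findtext("customer"), order.findtext("total"))
        for order in root.findall("order")
    ]
    conn.executemany(
        "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)", rows
    )
    return len(rows)


def wrap(conn):
    """Run an ordinary SELECT and rebuild an XML tree from the result rows."""
    root = ET.Element("orders")
    query = "SELECT id, customer, total FROM orders ORDER BY id"
    for order_id, customer, total in conn.execute(query):
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "total").text = total
    return ET.tostring(root, encoding="unicode")


def round_trip(xml_text):
    """XML in, relational rows in the middle, XML back out."""
    conn = sqlite3.connect(":memory:")
    # total is kept as TEXT so the amounts survive the trip character for character.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total TEXT)"
    )
    unwrap(xml_text, conn)
    out = wrap(conn)
    conn.close()
    return out
```

`round_trip(SAMPLE)` hands back an equivalent `<orders>` document. Nothing in the SQL statements mentions XML at all, which is the pattern being guessed at here. Whether this step is actually the bottleneck is something only measurement can answer.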
Re:"Overhead" is not important here... (Score:2)
Thumbs Down on XML Databases. (Score:5, Insightful)
If your current relational database schema is either 1) small flat files or 2) a few big tables with most/all of the data stored in "blob" columns (i.e. blobs, clobs, byte arrays, or big varchars), you might be a candidate for an XML database. I'd get two experienced DBA's to agree there was no realistic way to normalize the data, first, but that's me.
If you actually need a database (as opposed to a few files, XML or flat) and your data can be normalized (it almost always can), then a relational database will tend to provide important advantages in three areas: unforeseen query handling (OLAP, data mining, etc.), scalable performance, and availability of people with the skills to maintain it.
As for the tradeoff of converting to XML, a number of the commercial RDBMS's allow you to obtain query results as XML. Though I don't know for certain how they handle inserts and updates, I suspect that there are XML equivalents for those as well. However, even if you have to completely roll your own conversion from SQL to XML, that cost is minimal against the cost of accessing the disk to fulfill the query, which both RDBMS and XMLDBMS will have to do.
In general, after working with a commercial XML database and attempting to work with another XML database written in house, I'm categorically unimpressed. I think that a lot of engineers have discounted the relational programming model without first understanding it. In my opinion, people familiar with functional and object programming models would do well to learn about relational programming with an eye to determining the appropriate model for different kinds of problems.
Regards,
Ross
Objectivity or Caché? (Score:2)
If your current relational database schema is either 1) small flat files or 2) a few big tables with most/all of the data stored in "blob" columns (i.e. blobs, clobs, byte arrays, or big varchars), you might be a candidate for an XML database. I'd get two experienced DBA's to agree there was no realistic way to normalize the data, first, but that's me.
We're doing "scientific" computing, and we're finding that classical "SQL/RDBMSs" just don't cut the mustard:
Re:Objectivity or Caché? (Score:2)
Usually, you select an RDBMS over other means of data storage when one of the requirements is to allow future users to ask currently unknown questions quickly. With the data sizes you're talking about, I don't think that any common means of db optimization will allow for particularly fast queries and
file systems (Score:2)
Back to your problems: I've never looked at the two systems you mention, but I would suggest rethinking your data management approach at the same time you are thinking about vendors. If you want blobs over 4GB, what you want is to store metadata in the RDBMS and the data block on disk as a file. The reason most RDBMS's set the blob limit to 32kB (not 32MB) by default is to discourage mass storage of untyped data in the database.
Right - I don't have any problem with a classical RDBMS serving as not much m
XML DB? In my expert technical opinion.... (Score:2, Insightful)
Obvious (Score:5, Insightful)
Benefits: XML is new and trendy.
Difficulties: Ignorance of the decades of scientific research and engineering experience in the field of relational database management systems, relational algebra [wikipedia.org], set theory [wikipedia.org] and predicate calculus [wikipedia.org]; lack of real atomicity [wikipedia.org] of transactions, lack of guaranteed consistency [wikipedia.org] of data, lack of isolated [wikipedia.org] operations, lack of real durability [wikipedia.org] in the ACID [wikipedia.org] sense, and in short, the lack of relational model [wikipedia.org]; scalability, portability, SQL standard, access to your data after two years and after twenty years; to name just a few.
How large of a learning curve should be expected with a product like this?
Certainly smaller than a real, relational database.
Do XML databases really live up to the hype?
No.
I believe that you are confusing an RDBMS with an object store. You should read this excellent comment [slashdot.org] posted almost three years ago by Frater 219. I understand that you may be inexperienced but you should not be ignorant. Literally decades of scientific research has been put into relational database management systems. Of course you are perfectly free to forget about computer science, jump on the bandwagon and choose whatever buzzword is trendy these days (yesterday it was OOP, today it is XML, tomorrow it will be
Re:Obvious (Score:3, Interesting)
Excellent point... I've worked with some huge CORBA systems with semi-custom object databases and have seen firsthand the pain these systems can put you through.
One of the bigger vendors whose software we use claims to be porting their entire system to an Oracle or DB2 backed system instead.
Of course, they'll probably use some J2EE monstrosity to implement the new system, so performance will still suck.
Re:Obvious (Score:5, Insightful)
Excellent post, as is the Frater 219 post that you referenced.
I think that both of you stopped short of pushing your arguments to their conclusions, though, so I'd like to add a bit.
Frater 219 is exactly right that objects and tuples are fundamentally different, but he focused on both from a purely data-oriented point of view, which caused him to understate the issue a bit. A better understanding of the real goals of objects and tuples helps, IMO, to clarify why they're so different -- and the arguments can be extended to consider XML as well.
Consider the goals behind relational database normalization. It's obvious that the primary goal is one of flexibility, ensuring that the data can be sliced and diced in any way imaginable, easily (which is not always the same as efficiently). A good relational design provides total "transparency", so that no matter what future demands are made, if the information is in the database it can be retrieved, just by asking the right, simple, question.
Obviously, relational database technology was created because in the past there were systems that structured data in ways that limited the ways in which it could be retrieved and analyzed. RDBMSs solve that problem admirably well.
So, if data transparency is such a wonderful thing, why does another computing tool, Object-Oriented Software structure, place so much emphasis on data abstraction and even data "hiding"? The answer is: because OO is about behavior, not data.
The tenets of good OO design are all about partitioning the problem into compact components that interact in flexible ways. Objects have data, but only, really, to provide these fundamentally behavioral entities with the data elements they need in order to function "independently". This doesn't mean that object architectures can be defined without consideration of data, or that none of the ideas about data relationships which would be at home in a relational design have a place in object design, because they do, but the core ideas of object-oriented design are about entities that act in response to stimuli, allowing internal details (like what the supporting data consists of) to be hidden, and allowing substitution of other entities that accomplish the same abstract goals, but may do it in different ways, using different data.
This is the real fundamental "impedance mismatch" between OO design and relational design, IMO. Relational design focuses almost purely on data, with little attention paid to how the data will be used (well, in practice, that gets a lot of attention when it becomes clear that the nicely normalized model is simply too slow, but that's separate), and object design focuses mostly on behavior, paying attention to data only as needed to point out obviously bad factorings. This means that if you design a very nice object-oriented application and then try to simply persist those objects in relational tables, the result will be a very poor relational database. On the other hand, if you create a nice relational design and then try to create a class for each table, the result will be a painfully sub-optimal OO design.
So, as Frater 219 pointed out, if you want a database, use an RDBMS; if you want a persistent object store, use an OODBMS. If you want both (as is common), well, you have to deal with the impedance mismatch, and it'll never be pretty, or very efficient. IMO, the best approach is to do the OO and relational designs more or less separately, then work out a solution to translate between them.
So what about XML? Well, let's look at the goals behind XML.
One problem with doing that is that there are at least two uses of XML. The first is as markup, in the sense that the document content is really not intended to be understood or processed by machines so much as people. The tags are only used to make machines able to grab hold and manipulate bits of it, without any understanding of the rest of the stuff. HTML is like this. An HTML document is ulti
Re:Obvious (Score:2, Insightful)
No. Normalization eliminates duplicate information, and ensures that non-key attributes are dependent on (correctly grouped with and referenced by) key fields. Normalization is not primarily about flexibility, it's primarily about data integrity. Dat
Re:Obvious (Score:2)
Data can be "sliced and diced in any way imaginable" in both normalized and unnormalized databases, but data integrity can only be guaranteed in normalized databases.
It's easy to create counterexamples, situations in which it is very difficult to extract data in certain ways with non-normalized data. Also, data integrity can be guaranteed in non-normalized databases, just not by the database engine. Not without add-ons -- triggers and stored procedures, to be exact.
Normalization serves both purposes
Re:Obvious (Score:1)
Of course non-normalized data can be hard to query, but that doesn't say anything about data integrity, or whether a normalized database is easier to query.
Data integrity can be implemente
I've used Tamino and here's my story (Score:5, Interesting)
Well, we did finish the software on time, but it was a complete nightmare. Software AG hardly gave us any straight answers (even though they charged big $ for customer support).
Tamino itself was missing a lot of features and seemed designed as a system for storing documents, totally lacking traditional database qualities (uniqueness, reliability, scalability,
Needless to say, the software was thrown away and rebuilt with a reliable SQL database.
I would strongly discourage anyone from building an application on top of an XML database, especially Tamino. If you really want to build your application on top of an XML database, seriously ask yourself why and what difference it would make. Also, if you really need an XML interface, choose an ordinary SQL db that has an XML plugin.
don't waste your time with XML (Score:4, Insightful)
It is not a database, nor a data model, nor should it have anything to do with data storage and manipulation. You can store XML documents *in* a database (just like you can store dates, IP addresses, or JPG data). You can index and join on XPath components of an XML file. And you get XML documents *from* a database. But the database itself has little to do with XML. A well-designed XML database is just a well-designed relational database, and XML is just another data type.
People are now reverse-engineering a hierarchic data model from XML text files. But the hierarchic data model is less general than the relational model, and in fact was used and rejected *40 years ago* as not being general or powerful enough. Funny how history repeats itself.
Example: for simplicity, the relational model specifies that ALL data must be stored explicitly in the database. For instance if you have three rows of data, you can't assume any particular order unless the order can be calculated from the contents of each row. But XML nodes have implicit order, which means even the simplest XML document mixes data with metadata. Even a simple query requires dealing with both.
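A small sketch of what "stored explicitly" means in practice, again with Python's sqlite3 and ElementTree. The `procedure`/`step` document and the `step` table are hypothetical. The order that the XML carries implicitly in its node sequence has to become an ordinary column before a relational query can rely on it.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_steps(xml_text):
    """Load <step> elements into a table, turning document order into data."""
    conn = sqlite3.connect(":memory:")
    # position is a real column: the row order is stated, not assumed.
    conn.execute(
        "CREATE TABLE step (position INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    root = ET.fromstring(xml_text)
    for position, step in enumerate(root.findall("step"), start=1):
        conn.execute(
            "INSERT INTO step (position, body) VALUES (?, ?)",
            (position, step.text),
        )
    return conn


def steps_in_order(conn):
    """Without ORDER BY the result order is unspecified; with it, it is exact."""
    rows = conn.execute("SELECT body FROM step ORDER BY position")
    return [body for (body,) in rows]
```

Fed `<procedure><step>drain</step><step>remove</step><step>refit</step></procedure>`, `steps_in_order` returns the three steps in document order, but only because `position` was written down as data when the rows went in.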
I recommend anyone who has ever uttered the term "XML database" with a straight face to go back and learn some basic relational principles. I think you will agree that all data models are either 1) flawed and incomplete; or 2) reduce to the relational model.
In CS we don't have a lot of formal models to guide us, as in engineering or other science. Much of CS is entirely ad-hoc. However we do have a sound and complete model for data storage (relational model) and hardly anyone uses it. It boggles my mind. Do people not *want* their programs to work predictably?
Re:don't waste your time with XML (Score:1)
Thank GNU for this small slice of clear thinking. Every time I hear a programmer babble away and toss XML around, I want to smack him. The most retarded use of XML I've seen so far is on mobile devices running Java crap. Not only are you limited by speed, you also have a limit on the size of your application. One wee program ran much faster when the geniass stopped using XML and just used a plain olde text file. Heck, he even had room to make it fully functional.
"Use brain. Repeat." - me
It's not only a "text file format" (Score:1)
The Problem (Score:5, Funny)
For an XML database to really shine, it needs to be integrated with a TCP/IP filesystem. Once the physical data is stored using TCP/IP (as opposed to FAT or NTFS), the XML database really begins to take off because the data is already in a network format.
I swear to god there was a Dilbert on this...
Re:The Problem (Score:2)
Re:The Problem (Score:5, Funny)
I found that Dilbert, btw! It was an E-Mail based database! Now if you'll please excuse me, I'll be over here, ducking under a table.
Re:The Problem (Score:2)
Re:The Problem (Score:2)
I'm proposing that we start a brand new paradigm -
Re:The Problem (Score:2)
i'll stick to my good ol pencil'n'paper (tm) [slashdot.org].
talking about data integrity >20yrs.
or maybe in this case, >5yrs. if at all.
Re:The Problem (Score:1)
Re:The Problem (Score:2)
Re:The Problem (Score:2)
Re:The Problem (Score:2)
Re:The Problem (Score:2)
you must be new around here.
Re:The Problem (Score:2)
You don't have to; if you implement the TCP/IP filesystem, it will already be fscked.
Re:The Problem (Score:2)
All you have to do is serialize your motherboard through an HTML port.
Re:The Problem (Score:2)
Uh, SQL? (Score:2)
You're pre-optimizing... (Score:1, Insightful)
Premature optimization is the root of all evil.
You say "you're concerned". That means you don't know.
Why don't you find out?
If you have a schema and some of your major transactions speced out, then do some performance testing and see where your bottleneck may be. For G
I'd pick an alternative (Score:1)
Now I'm not an XML expert so this comes with a grain of salt, but I personally don't like the human-readable format because it's really not that hard to get, or code a program that'll read a nor
Re:I'd pick an alternative (Score:1)
OT -- sig (Score:1)
if ( 0 ) { printf("enough"); }
But then I realized I was on the wrong verse. Can't think of anything for "I set it up".
Where will Tamino be in 5 years? (Score:2)
XML,SQL,XML Query, Databases (Score:5, Informative)
If you mostly deal with the sort of data for which relational databases are generally optimised, you'll probably not be very interested in XML solutions, as they are solving problems you don't have.
If you routinely get questions like "how often is part 1976 mentioned in the same repair procedure as part 2001?" or "which of our 150,000 documents have chapters containing five or more subsections any of which does not yet have a summary?" then the XML approach becomes more interesting.
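As an illustration of the second kind of question, here is a sketch that walks a document tree directly with Python's xml.etree.ElementTree. This is not XML Query and not how any particular XML database does it. The `chapter`/`subsection`/`summary` element names and the sample document are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only "Brakes" has five or more subsections
# AND at least one subsection without a summary.
SAMPLE_DOC = """<doc>
  <chapter title="Brakes">
    <subsection><summary>pads</summary></subsection>
    <subsection><summary>discs</summary></subsection>
    <subsection><summary>lines</summary></subsection>
    <subsection><summary>fluid</summary></subsection>
    <subsection/>
  </chapter>
  <chapter title="Engine">
    <subsection><summary>block</summary></subsection>
    <subsection><summary>head</summary></subsection>
    <subsection><summary>valves</summary></subsection>
    <subsection><summary>timing</summary></subsection>
    <subsection><summary>cooling</summary></subsection>
  </chapter>
  <chapter title="Wiring">
    <subsection/>
    <subsection/>
  </chapter>
</doc>"""


def chapters_needing_summaries(doc_xml, min_subsections=5):
    """Titles of chapters with enough subsections, at least one lacking a summary."""
    root = ET.fromstring(doc_xml)
    hits = []
    for chapter in root.iter("chapter"):
        # Direct children only: the hierarchy itself is part of the question.
        subsections = chapter.findall("subsection")
        if len(subsections) >= min_subsections and any(
            sub.find("summary") is None for sub in subsections
        ):
            hits.append(chapter.get("title"))
    return hits
```

`chapters_needing_summaries(SAMPLE_DOC)` returns `["Brakes"]`. The query leans on nesting and on the absence of a child element, both of which come for free from the tree and would need explicit modelling in a relational schema.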
In my book on XML databases (1999 so I don't recommend going out and getting a copy today) I talked about using a hybrid system, with metadata picked out of XML whenever a changed version is stored (e.g. you might use a CVS commit script) and stored in a relational database.
With a relational database you have a lot of flexibility to change your queries but the data representation has to be static. Even changing the type of a column can be difficult in an RDBMS.
Queries may be a little harder with the XML system, but the data storage is more flexible and you have native knowledge of sequence and hierarchy that are traditionally absent using SQL.
More recent versions of SQL have added some XML support, understanding the different sorts of queries that people typically run against such very different sorts of data. There has been a lot of research over the past 30 or 40 years (hierarchical databases predate the relational model) on hierarchy, sequence and the sort of irregularity that RDBMS people call semistructured data and the rest of us call XML.
XML Query is a query language designed to run over both relational and XML-native data sources (and others, for that matter) and to be optimized very efficiently, so that people like IBM (makers of DB2), Oracle, BEA, Software AG and others can have efficient implementations. There's also standards work on how to embed XML Query expressions in SQL.
The public XML Query Web page is at www.w3.org/XML/Query [w3.org] and lists quite a large number of implementations. Software AG have participated in the XML Query development.
You might like to look at the XML Query use case document and see how close the examples map to your own situation.
Disclaimer: I work for the W3C, participate in the XML Query Working Group, and maintain the XML Query Web page. But it sounded like it's the sort of information you were looking for.
I can't comment on the quality of Tamino, as I have not used it, but I will also note that if you stick to openly-defined standard query languages wherever you can, there's a good chance you could move to a different implementation if you needed to with relatively little cost. This is similar to SQL, of course.
There was lots of hype around XML, but that doesn't mean it's all false, nor that it was all true. XML is a good way to interchange structured, hierarchical information, but it probably won't cure acne.
Liam
[slashdot::Ankh -- Liam Quin, W3C XML Activity Lead]
Re:XML,SQL,XML Query, Databases (Score:2, Interesting)
That sounds like it means something, but I don't think it does. The examples that follow, "how often is part 1976 mentioned in the same repair procedure as part 2001?" or "which of our 150,000 documents have chapters containing five or more subsections any of which does not yet have a summary?" see
Re:XML,SQL,XML Query, Databases (Score:1)
As for data that fits well into the relational model and data that doesn't, consider trying to do precise queries on mixed content data, in which text and markup are interleaved. The most common approaches in the past to this were either to store the entire mixed content (e.g. a paragraph) as a single blob or long text column or to split it up into separate items.
If you
Re:XML,SQL,XML Query, Databases (Score:1)
It's easy to demonstrate that any hierarchical or nested data (such as a document marked up with XML) can be stored in a relational schema: the relational model is a superset of the hierarchical and network models (with much better integrity). So the question isn't can an XML database do something an RDBMS can't do, but rather does it make sense to manipulate a
Re:XML,SQL,XML Query, Databases (Score:2)
The people working on XML Query are not ignoring the history of relational databases. Heck, the language is edited by the co-inventor of SQL itself, and the Working Group chairs have been in
Re:XML,SQL,XML Query, Databases (Score:1)
Any time the domain (type, range of allowable values) changes significantly, the database and the applications that use it may face complex changes. In a lot of cases -- changing a column from BYTE to INT, or INT to FLOAT, or even DATE to VARCHAR -- the RDBMS will convert the underlying data. In other cases the DB
Re:XML,SQL,XML Query, Databases (Score:2)
XML Query is not defined on the string/text representation of XML, but over instances of a data model, which can be (for example) created by projections of relational data.
Time will tell how much XQuery will catch on, but I think at this point we're not contributing
Re:XML,SQL,XML Query, Databases (Score:2)
select count(distinct A_repair_proc_parts.repair_proc_fk) from repair_proc_parts A_repair_proc_parts inner join repair_proc_parts B_repair_proc_parts on A_repair_proc_parts.part_number = '1976' and B_repair_proc_parts.par
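The query above is cut off, so here is a minimal runnable sketch of the same idea using Python's built-in `sqlite3`. The table and column names (`repair_proc_parts`, `repair_proc_fk`, `part_number`) come from the fragment; the join condition on `repair_proc_fk`, the filter on the second part number, and the sample rows are my assumptions about how such a self-join would normally be completed, not a reconstruction of the poster's exact query.

```python
import sqlite3

def count_shared_procedures(conn, part_a, part_b):
    """Count repair procedures that mention both part numbers."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT a.repair_proc_fk)
        FROM repair_proc_parts AS a
        INNER JOIN repair_proc_parts AS b
            ON a.repair_proc_fk = b.repair_proc_fk
        WHERE a.part_number = ? AND b.part_number = ?
        """,
        (part_a, part_b),
    ).fetchone()
    return row[0]

def build_demo_db():
    """Create an in-memory table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repair_proc_parts (repair_proc_fk INTEGER, part_number TEXT)"
    )
    conn.executemany(
        "INSERT INTO repair_proc_parts VALUES (?, ?)",
        [
            (1, "1976"), (1, "2001"),                 # both parts
            (2, "1976"),                              # only one part
            (3, "2001"), (3, "1976"), (3, "1976"),    # both, with a repeat
        ],
    )
    return conn

conn = build_demo_db()
print(count_shared_procedures(conn, "1976", "2001"))
```

The `COUNT(DISTINCT ...)` matters: procedure 3 lists part 1976 twice in the sample data, and without `DISTINCT` it would be counted twice.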
Re:XML,SQL,XML Query, Databases (Score:1)
Re:XML,SQL,XML Query, Databases (Score:2, Interesting)
However, we should consider the viability of storing what you and others describe as unstructured documents in blobs with server-side operations available to you. Just because you're going to have some XML values (that's what they are) in your database doesn't mean the whole thing needs to be XML, nor does it mean you should have to do all operations client-side because you're using a relational database. What it does mean is that if you're de
I remember Software AG's presentation (Score:2)
We ruled it out because of expense, IIRC. Looked like a really nice product, if you had big bucks to spend on a document management system (which is what we were after). I did not get the impression that it was any sort of replacement for a proper RDBMS--speed was achieved, from what I remember, by storing the data hierarchically and not abstracting any relational features on top of that.
There was another product out of Canada call
Re:I remember Software AG's presentation (Score:2)
Full text was split off so that the work didn't add any more delays to the main specifications. There are implementations of drafts, but they should be considered very early.
There are open source XML Query implementations too, of course.
Liam
Don't bother with an XML "database" (Score:4, Insightful)
The first is that you just heard XML is a great way to transport data, and decided to use it.
The second is that you're using the XML for more than just transporting data from one database to another; you're using it at some point with your application.
In either case, the bottom line is that XML is not good for you. If your data fits in a relational database, you should USE RELATIONAL MEANS TO ACCESS YOUR DATA. Don't use that nifty new XML reader to access your data. It's not nearly as fast or flexible as basic SQL; it's actually much more trouble than it's worth.
If you're just transporting data from one relational database server to another, use a flat file, or better yet raw SQL dumps. If you're accessing the data with an application, use SQL or the underlying API.
The only reason you *ever* need to use an XML database is when your data doesn't fit into a standard relational schema. In fact, if you try to fit standard data into an XML database, you're much more likely to end up with a ton of overhead, both in storage and speed.
Fortunately, non-relational data is extremely rare. So rare, in fact, that I've yet to see a non-contrived-proof-of-concept "real life" example.
Re:Don't bother with an XML "database" (Score:2)
You are narrowing down the aspect of transporting data too much. Most data transport is not between db servers (with the same db schemas) but between application servers, which might use relational databases as their backend to store data - but in totally different schemas.
So, you want to exchange data not between the database backends - whose db structure you mi
eXist XML Database (Score:1)
It supports XPath and XQuery, give it a try:
http://www.exist-db.org/ [exist-db.org]
Combination (Score:1)
Probably a stupid question, but... (Score:3, Insightful)
When an "XML database" is changed, is the data prior to the change left in its old XML format pointing to the original DTD, or does it require conversion of all existing data? How can the data be accessed while that conversion is going on?
How would the method of implementing a schema change be communicated to other places which have already archived copies of an old XML data entity? DTD only defines current state information - it doesn't communicate "If XYZ = 1 in DTD.v1 then set XYZ2 to "A" and set new field ABC to "foo" for DTD.v2". Each iteration of change would become increasingly more complex unless the data is converted.
This is not to say that the same issues don't exist with SQL or relational databases - but just abstracting the organization of the data doesn't mean that your problems are solved.
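To illustrate why the conversion rule has to live somewhere outside the DTD, here is a small sketch of the "if XYZ = 1 in v1 then set XYZ2 to A and ABC to foo" rule written as code with Python's standard `xml.etree.ElementTree`. The `record` root element and the `version` attribute are hypothetical, invented so the function can tell old documents from new ones; the field names come from the example above.

```python
import xml.etree.ElementTree as ET

def migrate_v1_to_v2(doc_text):
    """Apply the v1 -> v2 rule to one document and return the new text.

    Documents not marked version="1" are returned unchanged.
    """
    root = ET.fromstring(doc_text)
    if root.get("version") != "1":
        return doc_text
    xyz = root.find("XYZ")
    if xyz is not None and (xyz.text or "").strip() == "1":
        # The rule itself: a derived value plus a new field with a default
        ET.SubElement(root, "XYZ2").text = "A"
        ET.SubElement(root, "ABC").text = "foo"
    root.set("version", "2")
    return ET.tostring(root, encoding="unicode")

old_doc = '<record version="1"><XYZ>1</XYZ></record>'
print(migrate_v1_to_v2(old_doc))
```

Anyone holding an archived v1 copy would need this function (or an equivalent XSLT) as well as the new DTD, and each further schema version adds another step to the chain.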
Lately, I've been using mySQL - and the developers have some curious ideas about the "real world". Even the most trivial changes to the database schema require mySQL to copy and rebuild the entire table... like adding a new index or adding a new field at the end of the table. When tables start having millions of rows, that means this becomes a much less attractive product.
Two reasons were given for doing things this way - first, it was the easiest way to implement schema changes. Second, "People should never be changing data schemas in a production environment".
Oh, really? When did we regress to the idea that databases can go down overnight in order to back them up and to implement schema changes?
half of one / six a dozen of another (Score:3, Insightful)
XML is designed to package schema info with the data exchanged between DB instances. It's higher level, more verbose, and not optimized for data processing (except for the import/export). So you'd better be absolutely certain that your overall system performance is bottlenecked by your interchange processing performance, more than it will be bottlenecked by the "XML-native" DB processing XML data, which isn't optimized for performance.
Ananova was built on Tamino (Score:2, Informative)
Project 90% XML based (Score:3, Informative)
Actually, I'm using exist-db.org and it works fine but I have some performance problems when I want to sort data.
I have tested Ipedo, TextML, dbXML, and XHive; TextML was the fastest, but it doesn't support XQuery.
Ipedo was the fastest of those with XQuery support, though it supports only an old XQuery version. I think it's the best product because it provides an RDBMS bridge to query with SQL and XQuery and some other features like XViews.
My application uses XQuery/XSLT to read data and JAXB to check and execute business methods before storing.
I think the main problems with native XML databases are performance and the lack of transaction support beyond document locking.
But, the advantages are:
Like CRM, Groupware, Administrative application, fulltext and contextual searching,
You must try at least one native XML database in your life to compare it with RDBMS and object databases and form your own opinion.
Where's your bottleneck? (Score:3, Insightful)
Have you timed the job of wrapping/unwrapping XML? My guess is that on modern hardware, that task is trivial. Bandwidth is a more common bottleneck for XML data transfers, and that problem is usually mitigated by compressing the XML before transfer. But I never heard anyone complain about a CPU taking too much time to extract the data from XML.
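If you haven't timed it, something like the following is enough to get a first number. This is only a rough sketch with Python's standard library: the `rows`/`row`/`name`/`qty` document shape is invented, and a real measurement should use your actual documents and your actual parser.

```python
import time
import xml.etree.ElementTree as ET

def build_sample_xml(n_rows):
    """Wrap n_rows made-up records in XML and return the text."""
    root = ET.Element("rows")
    for i in range(n_rows):
        row = ET.SubElement(root, "row", id=str(i))
        ET.SubElement(row, "name").text = f"item-{i}"
        ET.SubElement(row, "qty").text = str(i % 100)
    return ET.tostring(root, encoding="unicode")

def time_unwrap(xml_text):
    """Parse the XML back into tuples; return (rows, elapsed seconds)."""
    start = time.perf_counter()
    root = ET.fromstring(xml_text)
    rows = [
        (r.get("id"), r.findtext("name"), int(r.findtext("qty")))
        for r in root.iter("row")
    ]
    elapsed = time.perf_counter() - start
    return rows, elapsed

xml_text = build_sample_xml(100_000)
rows, elapsed = time_unwrap(xml_text)
print(f"unwrapped {len(rows)} rows from {len(xml_text)} characters in {elapsed:.3f}s")
```

Compare that figure with the time spent on the network transfer and on the database inserts before deciding where the bottleneck is.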
If your application queries the data selectively, you will probably find that the difference in query-processing time, between a traditional SQL database and a native XML database, more than makes up for any difference in format-conversion time.
Let your database use its own, efficient, optimized internal data formats. XML is much more suitable for data transfer than for data manipulation.
Choose wisely (Score:1)
I think it is vital to distinguish between the actual storage and the way to access it, the 'interface'. With Oracle (since 9iR2) one can store the xml as a clob, as an XMLType and object types that are based on your xml complex
No (Score:2)
Store your data in a well-researched normalized, relational form and format your query results as XML if you like. How is the progress on binary XML to avoid killing your network and CPU on your thin client?
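For what "format your query results as XML" can look like in practice, here is a minimal sketch with Python's built-in `sqlite3` and `xml.etree.ElementTree`. The `parts` table, its columns, and the `results`/`row` element names are made up; it also assumes every column name is a valid XML element name, which is not true of arbitrary SQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_to_xml(cursor, root_tag="results", row_tag="row"):
    """Serialize the rows of an executed cursor as one XML string."""
    root = ET.Element(root_tag)
    # cursor.description holds one 7-tuple per column; item 0 is the name
    columns = [d[0] for d in cursor.description]
    for record in cursor:
        row = ET.SubElement(root, row_tag)
        for col, value in zip(columns, record):
            ET.SubElement(row, col).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_number TEXT, description TEXT)")
conn.execute("INSERT INTO parts VALUES ('1976', 'bracket')")
print(rows_to_xml(conn.execute("SELECT part_number, description FROM parts")))
```

The data stays normalized in the database, and XML exists only at the boundary where something else asks for it.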
Misinformed Story (Score:1)
1: you can always put the XML in a text field if you want to.
2: I would be far more concerned about using raw XML: it's not RIFF, so you can't go to a single point without parsing the whole file, and it's not indexed, so you have to search the whole file for the data you need