Effective XML
Effective XML: 50 Specific Ways to Improve Your XML
author | Elliotte Rusty Harold |
pages | 336 |
publisher | Addison-Wesley |
rating | 10/10 |
reviewer | milaf |
ISBN | 0321150406 |
summary | Very well written collection of topics on XML Best Practices |
In Effective XML: 50 Specific Ways to Improve Your XML, Elliotte Rusty Harold takes a different approach: know your elements and tags -- they are not the same thing! -- and weigh your choices in a context, because any technology applied for the wrong reasons may fail to deliver on its promises.
Following Scott Meyers' groundbreaking Effective C++, the author invites us to re-evaluate seemingly trivial issues to discover that life is not as simple as it seems in the world of XML. In each of the 50 items (chapters), he gets into the inner workings of the language, its usage and related standards, thus giving us specific advice on how to use XML correctly and efficiently. The 300-page book is divided into four parts: Syntax, Structure, Semantics, and Implementation. Yet in the introduction, the author sets the tone by discussing such fundamental issues as "Element versus Tag," "Children versus Child Elements versus Content," "Text versus Character Data versus Markup," etc. On these first pages the author started earning my trust and admiration for his knowledge and ability to get right to the point in clear and simple language.
The first part, Syntax, contains items covering issues related to the microstructure of the language, and best practices in writing legible, maintainable, and extensible XML documents. (In it, over 19 pages are dedicated to the implications of the XML declaration!) That seems like a lot for one XML statement that most people cut-and-paste at the top of their XML documents without giving it much thought, doesn't it? Actually not, if you follow the author's reasoning and examples.
The second part, Structure, discusses issues that arise when creating data representation in XML, i.e. mapping real-world information into trees, elements, and attributes of an XML document; it also talks about tools and techniques for designing and documenting namespaces and schemas.
The third part, Semantics, explains the best ways to convert structural information represented in XML documents into the data with its semantics. It teaches us how to choose the appropriate API and tools for different types of processing to achieve the best effect. This part has a lot of good advice for creating solutions that are simple, effective, and robust.
The final part, Implementation, advises the reader on design and integration issues related to the utilization of XML; these issues include data integrity, verification, compression, authentication, caching, etc.
This book will be useful to a professional with any level of experience. It may be used as a tutorial and read from cover to cover, or one can enjoy reading selected items, depending on one's experience and taste. The book's very detailed index makes it an excellent reference on the subject as well. In the preface to the book, the author writes, "Learning the fundamentals of XML might take a programmer a week. Learning how to use XML effectively might take a lifetime." I'm not sure about the "lifetime" -- that's an awfully long time for using one technology -- but for the most confident of us this still may not be enough :) . Your mileage may vary, but I suspect that you could shave a few months off that time by browsing through this book once in a while. Most importantly, it will make you a better professional and make you proud of the results of your work. Wouldn't this worth your while?
library (Score:5, Interesting)
Re:library (Score:3, Funny)
One thing is for sure... (Score:4, Funny)
It's got to be better than Ineffective XML
Re:One thing is for sure... (Score:2)
One that comes to mind would be Bitter Java which demonstrates wrong patterns used in applications and alternatives that tend to be more effective.
So don't be too sure that it is better than Ineffective XML
Government Health Warning (Score:4, Funny)
Reading this book shortens life expectancy. Still, it's your choice...
The main issue with XML is performance (Score:4, Informative)
On a related note, more details on Microsoft Indigo are finally available. According to this article on XML mania [xmlmania.com], Microsoft's future platform will use XML as much as possible. More details are available on Microsoft's site [microsoft.com]. The funniest part is they are claiming Indigo + Longhorn will be the best thing since sliced bread. Maybe they haven't learned the hard lesson that parsing XML kills performance.
Re:The main issue with XML is performance (Score:3, Interesting)
On that note though, I wonder if this author has some insight into better uses for XML than what I've typically seen (XML does everything!). I won't, however, be running out to buy it, as XML will always be just more bloat and a resource hog by nature.
What are you talking about? (Score:5, Insightful)
Oy!
Re:What are you talking about? (Score:5, Insightful)
Re:What are you talking about? (Score:2)
Re:What are you talking about? (Score:4, Insightful)
Having a huge amount of metadata surround every piece of data is not always a good thing. XML is slow, parser issues notwithstanding.
Re:What are you talking about? (Score:4, Insightful)
Re:What are you talking about? (Score:3, Interesting)
Also, a lot of stuff that goes around in packets is free-form text anyway
Re:What are you talking about? (Score:2)
Exactly. It is just a form of text tagging. XML is evolutionary and not revolutionary in terms of technology. First there was SGML (Standard Generalized Markup Language) and then there was HTML.
I remember working with SGML a dozen years ago. It was certainly not easier to use than the old system of formatting manuscripts. In fact it was much more time consuming. But the real benefit was the ability to make an archive of searchable articles with results that could be pulled up and be pro
Re:What are you talking about? (Score:4, Interesting)
You bring up some really good points. The reason that you hear a lot of "XML is slow" is because of the usage of XPath. To use XPath expressions, most implementations parse the entire XML document into memory.
I suppose you *could* write a custom parser. If your structure is well-defined, and not subject to a lot of changes, you could significantly increase performance that way. The other option is to parse the document once, get out what you need to get out into smaller chunks, dump the larger document, and only work off the smaller chunks.
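The "parse once, keep only the small chunks" idea can be sketched with a streaming parser. This is only an illustration using Python's standard `xml.etree.ElementTree.iterparse`; the document layout (an `<orders>` root with `<order id="...">` children and a `<total>` child) is invented for the example and does not come from the thread.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; element and attribute names are made up.
SAMPLE = b"""<orders>
  <order id="1"><total>10.50</total></order>
  <order id="2"><total>4.25</total></order>
  <order id="3"><total>7.00</total></order>
</orders>"""

def extract_totals(stream):
    """Stream through the document and keep only (id, total) pairs."""
    totals = {}
    # iterparse reports each element once its end tag has been read,
    # so a finished subtree can be used and then thrown away.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "order":
            totals[elem.get("id")] = float(elem.findtext("total"))
            elem.clear()  # discard the children we no longer need
    return totals

totals = extract_totals(io.BytesIO(SAMPLE))
print(totals)
```

After this pass the program works only with the small `totals` dictionary; whether that is actually faster than an in-memory tree plus XPath depends on the parser and the data.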
Looks like TMTOWTDI is not just for Perl
Some, not most (Score:2)
Use the context, Luke.
Re:What are you talking about? (Score:2)
I'm at a loss, and have been ever since it came out, as to why this is becoming a common way of doing things.
Re:What are you talking about? (Score:5, Funny)
<pixelrow>
<pixel>
<value channel="red" level="0.023"/>
<value channel="blue" level="0.22"/>
<value channel="green" level="0.5"/>
</pixel>
</pixelrow>
...
Re:What are you talking about? (Score:2)
Now, if you want to compare something to do with images and xml, try comparing Flash files to SVG files and see what conclusions you come up with...
Sure: (Score:5, Informative)
Re:What are you talking about? (Score:2, Informative)
Actually, yes.
It's called SVG; it is a very nice way to represent graphics.
Re:What are you talking about? (Score:2)
Could someone explain to me why conversion into, say a binary map or such, isn't an advantage? I can easily see its portability, and ease of use, I just don't see the speed and small size.
Well, double dumbass on us.... (Score:2)
XML may not be slow, but when it's used as a network protocol like SOAP and that craziness called XML-RPC it sure is. An XML parser is unjustifiable ove
Re:What are you talking about? (Score:3, Insightful)
there you go - 3 classic geek reasons to do something the hard way instead of the standard, ordinary, easy but OK for mortals way.
Incidentally, XML really is slow. Sure it looks nice, is easy to understand, easy to create with the simplest of text editors, interoperable, and an ind
Re:What are you talking about? (Score:3, Insightful)
No, your company did the exact right thing in choosing XML. When the nascent system is still being actively debugged, you made the process much easier because XML is human
Re:The main issue with XML is performance (Score:3, Insightful)
I see XML as a nice way to transport data but (at least right now) it's not mature and/or fast enough to serve as a fully functioning database.
Re:The main issue with XML is performance (Score:2)
XML is not a functioning database. XML is a way to transport data. So your misgivings are due to the fact that you have stumbled across reality.
Re:The main issue with XML is performance (Score:2, Insightful)
speed kills... (Score:3, Insightful)
While I'm not an XML zealot, I like the clarity it can bring to many domains of practice. Regarding the performance hit, get a faster computer! If you don't have a fast enough one yet, wait a year.
Lisp was shunned in the past primarily for speed reasons, too. Now the main reason many don't like Lisp is because they don't understand advanced software engineering concepts and write poor Lisp code.
Re:The main issue with XML is performance (Score:4, Interesting)
XML needs to be updated to allow binary encoding [cubewerx.com]. The open-source high-performance parser/generator library at the link demonstrates the performance gain [cubewerx.com].
Re:The main issue with XML is performance (Score:3, Informative)
Re:The main issue with XML is performance (Score:4, Informative)
Re:The main issue with XML is performance (Score:2)
piffle (Score:3, Interesting)
SQLXML and most other value-adds are bull. Your business objects should optimize the hell out of their DB access and return XML. XML is messaging and presentation tier glue. Read the book.
XML is very fast (Score:5, Interesting)
That's because everyone uses slow XML parsers. Some years ago at one of the then-top 5 web portals I was unhappy with the standard SAX/DOM parser in use; it was ridiculously slow (and buggy).
So I wrote a new one. Parsing XML became one hundred fold faster! I timed it quite carefully.
Other people in this thread are saying "of course XML is slower than binary formats, it's 3 times bigger." But a factor of 3 in performance is nothing, considering some of the advantages.
A slowdown of 100, on the other hand, is absurd.
I don't know why people don't rebel against this and make faster XML parsers the widely-used ones; for whatever reason, apparently everyone continues using slow parsers.
At any rate, no, XML is not slow. It's just a simple, easy to parse format, for which IBM and others have written very, very slow parsers.
And everyone just assumes that it has to be slow. Sheesh, why should an XML parser be slower than a C++ compiler??? Come on.
Re:The main issue with XML is performance (Score:2)
They make most of their consumer OS money through OEM sales anyway, so why not take advantage of all that under utilized power.
Nobody knows better than Microsoft that it's buzzwords that sell software. There's probably only 1 in a thousand users that actually even begin to take advantage of any features made available by the changes that they put into their back end software these days. Their design decisions are all purely marketing related. Fooling yourself i
Re:The main issue with XML is performance (Score:5, Insightful)
this single record
Doe, John 1234567 12/1/2001
took 31 bytes, while its XML companion (using short, simple tags) took 96 bytes.
Not all XML files wind up being 3 times the size of their flatfile counterparts, but they are inherently larger. There really isn't a way to make loading/parsing that data any faster, by the nature of working with ASCII/ANSI files. XML will always be slower.
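For anyone who wants to reproduce this kind of size comparison, here is a rough sketch. The tag names are guesses (the poster's actual XML is not shown), and the whitespace in the quoted record may not have survived, so the numbers will not match the 31 and 96 bytes quoted above.

```python
import xml.etree.ElementTree as ET

# The flat record as it appears in the post (original spacing may be lost).
flat = "Doe, John 1234567 12/1/2001"

# Hypothetical short tags standing in for the poster's XML version.
rec = ET.Element("rec")
ET.SubElement(rec, "last").text = "Doe"
ET.SubElement(rec, "first").text = "John"
ET.SubElement(rec, "id").text = "1234567"
ET.SubElement(rec, "date").text = "12/1/2001"
xml_bytes = ET.tostring(rec)

flat_size = len(flat.encode("ascii"))
xml_size = len(xml_bytes)
print(flat_size, xml_size, round(xml_size / flat_size, 1))
```

The ratio shrinks as field values get longer relative to the tag names, which is why the overhead varies so much from one data set to another.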
Re:The main issue with XML is performance (Score:4, Insightful)
Uh huh. Now let me ask you, is that record space-delimited? Comma-delimited? Fixed-width [shudder]? If it's fixed width, and the first name is fixed at four characters, is the person's name "John" or "John-Paul"?
31 bytes for your record, and 96 for equivalent XML... but how many extra bytes were spent on code to manage your particular flavor of data? How much time was spent in development of that code? How does that time (and associated cost) compare to the extra millisecond/record required to transmit and process the XML data?
XML is standard. It can fit almost any type of data (though binary data is not currently the most effective thing in the world, but it can be incorporated). Since MS is integrating XML into all of their products, we won't have to worry about many people who don't have a good XML library installed on their systems. So instead of 50 programs with their own (limited and likely buggy) data formatting subsystems, we'll have 50 programs that each call one library on disk, in a standard, robust system with enough exposure to squash the show-stopping bugs.
Depends on how you look at it. If the aforementioned widely-available XML parser gets enough of a beating, it will be optimized like you wouldn't believe. Yes, two data processors (one XML, one markupless) with equal amounts of work spent on them will perform in favor of the simpler format... but XML's simplicity and universality will make it so that the XML parsers will have more eyes.
The same philosophy is why the well known open-source programs (linux, apache, etc) are functional and stable as hell:
Wide use + Openness = Greatness.
Re:The main issue with XML is performance (Score:2)
Re:The main issue with XML is performance (Score:2)
I'm assuming since you can read that you also read my minor disclaimer that followed that text. If not, go back and revisit it.
XML is standard
So are flat files, and all you need to know is what the fields are. And if you don't know what they are (and no data dictionary is provided) then yo
Re:The main issue with XML is performance (Score:2)
Bear in mind there is a huge difference between a flatfile and a fixed file. Flat files can simply be one-line (or in some cases more than one line) per record and delimited by any character or combination of characters. So the idea of extra spaces taking up room is a bit moot if you're delimiting by a single character. Granted, that one character per "field" in each line of the flatfile, so y
Re:The main issue with XML is performance (Score:2)
Re:The main issue with XML is performance (Score:2)
My entire history of coding in the enterprise has been about saving bandwidth while focusing on performance of the app. I deal with very large amounts of data (as do many programmers that read
Re:The main issue with XML is performance (Score:2, Insightful)
"Doe, John 1234567 12/1/2001 "
If you think about it that is a useless piece of information without lots and lots of context surrounding it.
* What is Doe?
* What is " John"?
* What is 1234567
* 12/1/2001 looks like a date. Is it Dec 1 or Jan 12?
* How do I know if this record is complete?
* Is my field separator a " " or ","?
Problem: The year is 2023, we now use format "x" in our records, you need to convert all records to format "x" -- there are 233 different types of records. 7,220,134 records need to be tran
Re:The main issue with XML is performance (Score:2)
Maybe XML, then, is useful for development where coding fast (less documentation) is important. That's not a world I want to live in, but I'm sure others would think differently.
Re:The main issue with XML is performance (Score:2)
(I realize that this isn't the case with most systems, just a huge benefit of
XML... (Score:5, Insightful)
Re:XML... (Score:3, Interesting)
So, I took a look in the XML file that connected to the Word document to make it smart. I wasn't very impressed (but fairly amused) when I saw that the XML-file was li
userland XML (Score:2)
Frankly, ERH is a great writer and has good insights into the use and abuse of markup. This book is one of the things that was missing while the pro/anti-XML hype trains were picking up steam.
where are the open source XML repositories (Score:5, Interesting)
XML would work better if there were consistent DTDs for tagging information that everyone would use. There should be an open database of these DTDs.
I was looking for a simple one to tag photos with. Couldn't find it, made my own. Is there a repository of these DTDs out there?
Re:where are the open source XML repositories (Score:5, Informative)
Maybe here [xml.org]?
Re:where are the open source XML repositories (Score:3, Informative)
Let's see... A <digital> element contains zero or more <frame>s, each of which can contain an <image> with a URL.
W3C (Score:5, Informative)
Re:where are the open source XML repositories (Score:5, Informative)
XML is extensible by its very nature. By itself, an xml file is just that, an xml file, it means absolutely NOTHING without context and definition.
This is what DTD's do. They don't limit xml in any way, rather they describe a particular use of xml. For example: SVG, MathML and XHTML are all languages that use xml. Each one of these languages has a DTD that defines the format for a valid xml document FOR THAT LANGUAGE.
Just because a DTD for SVG exists doesn't mean that anything at all has changed with xml itself.
Next, XSLT is a technology with a very specific purpose, simply put: To take an xml file as input and create a new xml file for output based on the rules written into the transform.
So, with all of that said, there is absolutely NO reason why there shouldn't be a DTD repository, and again, there is no reason why there shouldn't be a PhotoAlbum DTD in that repository. What problems would this cause? None. What benefits could be observed? Instead of everyone needing an xml document to describe photo albums rolling their own format, people might just reuse a standard DTD to do so. And application writers just might too. And lo and behold, Application X on platform Y might be able, with no work involved, to open Album AA created by Application BB on platform CC.
Getting some of the big picture?
Re:where are the open source XML repositories (Score:2)
DTDs are but one way to do this. W3C schemas, RELAX NG, or simply a memo sent from me to you will also do the trick. DTDs are a good way to enforce constraints on an XML document, but a poor way to communicate among humans. None of these formats help convey much about semantics or appropriate use.
Anyway, you're confusing XML the syntax spec with specific markup language t
Re:where are the open source XML repositories (Score:2)
The post I was replying to was insinuating that DTD's are useless because they impose limitations on xml and make xml harder to use, which is just not the case. DTD's, as with Schemas or your proverbial memo all exist to make xml useable in a given context.
XML (Score:2, Funny)
milaf, if you could expand a bit... (Score:5, Insightful)
These are issues that need to be solved first, before one creates an effective XML structure. Does the book address them?
Re:milaf, if you could expand a bit... (Score:5, Insightful)
Incidentally, one of the main reasons to choose XML over either CSV or INI is that both of those formats are pretty much driven by rigid "column" type structures. In most INI files there's only room for pairs of names and single values. In CSV, each record is one row with a set number of fields.
XML lets you expand the children fully and represent more complex data. For instance, a classical CSV file with address information for customers would have columns for street address, city and then start to have problems when you start having columns for State (when you actually consider the world outside the US), postal codes, etc. If this is in XML, you can have your schema be more flexible and say that each <customer> contains a <shippingaddress> element which can contain either a <state> or a <province> or neither.
In other words, you can use trees to represent data instead of flat rows. I'm not saying that it's the be-all and end-all that the evangelists say it is. There are still lots of places that simpler text files and other data storage formats are better, but XML can be useful.
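The address case can be made concrete with a small sketch. Only `<customer>`, `<shippingaddress>`, `<state>` and `<province>` come from the comment above; the `name` attribute, the `<city>` element and the sample values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: one customer with a state, one with a province,
# and one with neither.
DOC = """<customers>
  <customer name="A"><shippingaddress>
    <city>Austin</city><state>TX</state>
  </shippingaddress></customer>
  <customer name="B"><shippingaddress>
    <city>Toronto</city><province>ON</province>
  </shippingaddress></customer>
  <customer name="C"><shippingaddress>
    <city>Singapore</city>
  </shippingaddress></customer>
</customers>"""

def region_of(customer):
    """Return the state or province text, or None if neither is present."""
    addr = customer.find("shippingaddress")
    for tag in ("state", "province"):
        value = addr.findtext(tag)
        if value is not None:
            return value
    return None

root = ET.fromstring(DOC)
regions = {c.get("name"): region_of(c) for c in root.findall("customer")}
print(regions)
```

A CSV layout would need either two mostly-empty columns or one overloaded column for the same information; here the absent element simply is not there.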
.INI files (Score:2)
Actually there's nothing forcing you to stop at trees; you could represent arbitrary directed graphs in
Re:.INI files (Score:2)
Now, what do the contents of this ini file mean and how shall I edit it to do what I want it to?
[fido.ini]
(Contents don't matter for the point to be made)
Re:milaf, if you could expand a bit... (Score:2)
Every book on XML should address this issue. I wonder if this book does.
Re:milaf, if you could expand a bit... (Score:2)
If xml is so great, why wasn't the review written in xml? Why wasn't the book written in xml? Why aren't its advantages obvious as opposed to the disadvantages (bloat, slow, etc)?
XML Limited in at least one regard. (Score:4, Interesting)
ID and IDREF, meet the previous poster (Score:2, Informative)
You can link between XML entities quite easily.
Also consider that RDF, which describes directed graphs, is quite easily expressed in XML; there's nothing to say that you can't describe a graph and reference actual elements with IDREFs. I don't think you've really thought about this.
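As a rough sketch of the ID/IDREF idea, the snippet below stores a directed graph (with a cycle) in XML and resolves the references. The `<node>`/`<edge>` vocabulary is made up, and since no DTD is supplied the parser sees `id` and `to` as ordinary attributes, so the lookup a validating parser would give you is done by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: <node id="..."> with <edge to="..."/> children.
DOC = """<graph>
  <node id="a"><edge to="b"/><edge to="c"/></node>
  <node id="b"><edge to="c"/></node>
  <node id="c"><edge to="a"/></node>
</graph>"""

root = ET.fromstring(DOC)

# Index every node by its id, the way ID attributes are meant to be used.
nodes = {n.get("id"): n for n in root.findall("node")}

# Follow each reference to build an adjacency list.
adjacency = {
    node_id: [e.get("to") for e in node.findall("edge")]
    for node_id, node in nodes.items()
}

# An IDREF must point at a declared ID; collect any that do not.
dangling = [t for targets in adjacency.values() for t in targets if t not in nodes]
print(adjacency, dangling)
```

The document itself is still a tree; the graph lives in the references between elements.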
Hmm.. (Score:3, Funny)
Wouldn't this what my while???
All your base are belong to us!
(huge eye roll)
Re:Hmm.. (Score:2)
Not a programming language? (Score:2)
Here's the list of 50 (Score:5, Informative)
Syntax:
Include an XML Declaration
Mark Up with ASCII if Possible
Stay with XML 1.0
Use Standard Entity References
Comment DTDs Liberally
Name Elements with Camel Case
Parameterize DTDs
Modularize DTDs
Distinguish Text from Markup
White Space Matters
Structure:
Make Structure Explicit through Markup
Store Metadata in Attributes
Remember Mixed Content
Allow All XML Syntax
Build on Top of Structures, Not Syntax
Prefer URLs to Unparsed Entities and Notations
Use Processing Instructions for Process-Specific Content
Include All Information in the Instance Document
Encode Binary Data Using Quoted Printable and/or Base64
Use Namespaces for Modularity and Extensibility
Rely on Namespace URIs, Not Prefixes
Don't Use Namespace Prefixes in Element Content and Attribute Values
Reuse XHTML for Generic Narrative Content
Choose the Right Schema Language for the Job
Pretend There's No Such Thing as the PSVI
Version Documents, Schemas, and Stylesheets
Mark Up According to Meaning
Semantics:
Use Only What You Need
Always Use a Parser
Layer Functionality
Program to Standard APIs
Choose SAX for Computer Efficiency
Choose DOM for Standards Support
Read the Complete DTD
Navigate with XPath
Serialize XML with XML
Validate Inside Your Program with Schemas
Implementation:
Write in Unicode
Parameterize XSLT Stylesheets
Avoid Vendor Lock-In
Hang On to Your Relational Database
Document Namespaces with RDDL
Preprocess XSLT on the Server Side
Serve XML+CSS to the Client
Pick the Correct MIME Media Type
Tidy Up Your HTML
Catalog Common Resources
Verify Documents with XML Digital Signatures
Hide Confidential Data with XML Encryption
Compress if Space Is a Problem
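To give a flavor of one item from the list, "Encode Binary Data Using Quoted Printable and/or Base64", here is a minimal sketch. It is not taken from the book; the `<blob>` element and its `encoding` attribute are invented names.

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary binary data, including bytes that are not legal in XML text.
payload = bytes(range(256))

# Wrap the Base64 text in a (hypothetical) element.
elem = ET.Element("blob", {"encoding": "base64"})
elem.text = base64.b64encode(payload).decode("ascii")
xml_text = ET.tostring(elem, encoding="unicode")

# Round trip: parse the XML again and decode the content.
decoded = base64.b64decode(ET.fromstring(xml_text).text)
print(len(payload), len(xml_text))
```

Base64 grows the data by about a third, which is where "Compress if Space Is a Problem" comes in.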
He missed a couple, IMHO (Score:3, Informative)
Here are two heuristics for good XML design that I dearly wish more people would take to heart:
1. If processing any text field requires parsing, Something Is Wrong, and you probably need to break it apart into more elements/subelements.
The only exceptions to this rule are fields that are numbers, or maybe date/time stamps that adhere to ISO standards.
2. If you're using attributes, You'll Wish You Hadn't In The Future.
Attributes are supposed to be the way X
My experience with XML (Score:5, Insightful)
Server load could be at the root of XML's problems (Score:3, Insightful)
5 years in the business... (Score:5, Insightful)
I hate XML with a passion. Let me present you with three examples
1) Programming languages based on XML.
Yes, it is true. Perverted minds, somewhere on this planet, actually seem to think that this is a neat idea! Since their initial conception the pivotal point of programming languages has been to raise the level of programming. To move from the computer's domain to the human domain - to make it more intuitive and natural for a human being to program a computer. With these new XML-based languages we are moving a step backwards, because truly the only benefit of XML in this context is that it is easier for computers to parse, while it is certainly harder for humans.
2) XSLT
Have you tried it? I rest my case.
3) SOAP
Okay, initially this actually seemed like a good idea to me, but having thought about it, I really think it sucks. Okay, so it is easier to implement SOAP for a particular platform or programming language, but a wire protocol is like a compiler or an OS kernel in a certain sense - it is okay that it is very hard to write, as long as it is stable and high performance, because it is such a central component.
i second.. (Score:4, Informative)
the fact is that XML is just marshalling and unmarshalling of all computational data to and from strings, thereby negating the fast numerical performance that a CPU inherently has. you want to add two numbers? create a string representation, pass it around thru a bunch of parsers/transformers as strings, then finally convert it back to the number it really is, then add, then convert it back to a string for passing it around all over again... what a waste.
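The round trip described above looks roughly like this in practice. The `<add>`/`<sum>` message format is hypothetical; the point is only to make the string-to-number-to-string steps visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical request message carrying two operands as text.
request = "<add><a>2</a><b>3</b></add>"

root = ET.fromstring(request)        # string -> tree
a = int(root.findtext("a"))          # text -> number
b = int(root.findtext("b"))

result = ET.Element("sum")
result.text = str(a + b)             # number -> text again
response = ET.tostring(result, encoding="unicode")
print(response)
```

The addition itself is one machine instruction; everything else is parsing and serialization, which is the overhead being complained about.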
Re:i second.. (Score:2)
Right tool for the job. I don't believe I've EVER heard somebody suggest that one should remove some heavy-duty number crunching from a c++ app and stuff it into XML...
On the other side however, ever tried stuffing a family tree into a relational database? Or doing large quantities of text processing in c++?
Re:5 years in the business... (Score:2)
1) Programming languages based on XML.
Yup.
2) XSLT
Have you tried it? I rest my case.
I'm coding some right now, and it's not easy. The thing is this: it is tremendously powerful, and good at doing the one thing it's good at: converting XML to XML. There aren't many cases when you should need to do this, and XSLT beats perl, IMHO.
3) SOAP
Okay, initially this actually seemed like a good idea to me, but having thought about it, I really
Re:5 years in the business... (Score:2)
Most of the time those "config files" are actually just little scripts that get sourced into whatever startup script needs the information. The variables you're setting are actually environment variables. Interestingly enough, if you changed the
Besides, i
Re:5 years in the business... (Score:2)
A bunch do, but many don't. For those that are really dressed up sh files, I have to agree with you. For the rest, a standard format (XML) would be nice.
bombadil% ls
Re:5 years in the business... (Score:2)
I've tried it, and I ended up loving it. Of course, I'm not using it as it was intended. I'm using it to convert DocBook into HTML and PDF statically. This is a heck of a lot better than using SGML/Jade.
Like XML, XSLT is being adopted in areas the authors never really intended, but ignored in those they did. I use XML in several areas, so when my employer offered to send me to an XML class for free, I accepted. It was horrible! The examples used by the professor al
when I was young. . . (Score:2)
Yeah, but code *generation* with XML is the cat's pyjamas.
2) XSLT
You clearly haven't tried it, or did not use it as intended. Do you have any experience with other functional languages? I work almost exclusively with XSLT at the moment and wouldn't have it any other way.
3) SOAP
is butt-stupid, I admit. But hey, ninety-odd percent of the beef this topic has generated can be fixed with a glance at the book being reviewed.
Nicest use of XML I've seen (Score:2)
The site [xmlrpc.org] has loads of implementations of both server and client code, some in *very* obscure languages
Simon.
Re:Nicest use of XML I've seen (Score:2)
How much easier it is to debug and produce test cases on all those platforms when it's an ascii file...
If you're concerned about size or security, then use gzip and ssl. No problem.
Not sure I'd do it if I wanted the absolute last percentage point of performance from the system, but overall, in any application I've coded, it's a major win.
Simon.
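For readers who have not seen XML-RPC on the wire, the sketch below uses Python's standard `xmlrpc.client` to marshal a call, read it back, and gzip it. The method name `sum` and the argument are made up, and no server is contacted.

```python
import gzip
import xmlrpc.client

# Marshal a call to a hypothetical remote method "sum" taking one list.
numbers = list(range(200))
wire = xmlrpc.client.dumps((numbers,), methodname="sum")

# The payload is plain text, so it can be read or logged while debugging.
print(wire[:120])

# Unmarshal it again, as a server would.
params, method = xmlrpc.client.loads(wire)

# Repetitive markup compresses well if size is a concern.
compressed = gzip.compress(wire.encode("utf-8"))
print(len(wire), len(compressed))
```

Being able to print `wire` and read it directly is the debugging advantage mentioned above; the gzip step addresses size only, while encryption would come from running the transport over SSL.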
X is for the Xtensions, M is for the Metadata... (Score:5, Interesting)
I have not read this book, but it sounds interesting already.
XML is an interesting technology that has the potential for changing the way we use technology in all kinds of weird and wonderful ways. (And in a few ways that may not be so wonderful.) But using XML correctly is tough. I've written and discarded more DTDs and schemata than I care to admit because they were seriously flawed. Getting it right is important and very, very hard.
XML looks simple, and in some ways it is. But in so many other ways it is not simple at all - in large part because it gives us a tool to approach some very hard problems. And hard problems, often even when expressed in the simplest way around, tend to stay hard. (Calculus makes saying some things simple, for example, but understanding those things still takes work and insight.)
I will be taking a good look at this book in the near future to see what it has to say. And I'd urge those who dislike XML to do the same. And finally, even those who like XML need to think hard about how to use it well, so perhaps this would be a good read for them too.
Sick of XML? Try YAML. (Score:4, Interesting)
Then I found YAML [yaml.org]. Long and short, YAML is very lightweight, eminently readable, easy to use (parsers exist in multiple languages) and a pleasure for all kinds of projects that require data serialization. Where XML branches off into other types of uses, like XSL programming, YAML doesn't really compete. I find this to be a strength, actually, because once you've used YAML and seen it in action, XSL seems like a big, fat add-on. But for those that rely on XSL and other things, YAML won't do the trick.
But if all you need is data serialization in a compact, easy-to-read, easy-to-use package -- and this, in my opinion, is by far what XML is most used for -- then YAML is great. Give it a shot.
As for XML. I used to hate it with a passion. Now I still hate it, but I'm less passionate. The creators of XML are ambitious people, and they tried to do something in that spirit. It works, basically, and XML doesn't deserve *all* the bad press it gets.
Re:Sick of XML? Try YAML. (Score:3, Interesting)
XML is great for "documents" - text documents, that is. XML does an admirable job separating "content" from "markup" which can be used to drive "presentation". It really is a big improvement over SGML. Things like DocBook, and CSS stylesheets, make XML the choice for writing documents.
YAML is great for "data" - data structures, that is. YAML directly maps to common application data struct
Read it on Safari (Score:2)
If you have a Safari account, you can read it Here [oreilly.com]
One way to improve it. Don't use it. (Score:3, Insightful)
Re:One way to improve it. Don't use it. (Score:2, Insightful)
And just assume that six months after releasing your program you realize it would be very useful with an "OCCUPATION" field too. What do you do now? Maintain a separate collection of databases for each generation of your software?
The XML-database
Re:One way to improve it. Don't use it. (Score:3, Interesting)
On the other hand if it looked more like this:
<Records>
<RECORD id =
<RECORD id =
<RECORD id =
<RECORD id =
</Records>
and if the tag was nested in something else, then xml is appropriate.
A
XML is just tagged s-lists. (Score:4, Insightful)
Glad to get that off my chest. I have a bitter history with XML. I was the first person at my former company to bring XML in as a uniform configuration file format for our product, but then found myself a couple of years later forced into adding XML specific features to the filesystem that was the core of our company's product. I spent a week thinking about the idea, and concluded that it was a bad one. Thus followed a long (and fruitless) battle with management to scratch the plan. The end result was a technically nifty but useless set of features. The work remains unreleased for lack of customer interest. At least I get a bit of "I told you so." pleasure.
more reviews of this book (Score:3, Informative)
Several chapters are online (Score:4, Informative)
Unix Tab-Separated ASCII Files vs. XML (Score:5, Interesting)
Re:Unix Tab-Separated ASCII Files vs. XML (Score:5, Interesting)
My current project for the last 8 months has been working on just that - parsing HIPAA EDI transactions. We do it by converting them to XML data structures. There is a decent white paper [xedi.org] about it too.
What I've found is that, for readability, XML is the way to go. For performance, EDI is definitely better. I have one EDI file that is 23k. When expanded to XML, it is close to 5000 lines long.
I agree with an earlier post. If you are using a hardware XML accelerator, or using small XML documents (config, etc), or needing readability over performance, then it is great. But I have a hard time believing that it will replace tab-separated files any time soon (not that the parent poster was implying this).
You're forgetting a key reason for XML (Score:2)
Well, hang on. There's the cost factor. When you take into account Value Added Network (VAN), storage and interconnection fees, plus the usual per-kilocharacter fees, XML suddenly performs much better - the bandwidth to send it is greater but if you have an FTP server then it's not even your bandwidth at issue. The cost per invoice/order is MUCH less even when development fees are taken into account and therefore performance is higher.
EDI is a pain in the ass to d
Re:Unix Tab-Separated ASCII Files vs. XML (Score:2)
I wholeheartedly disagree. XML adds a level of standardization that is unheard of (though not impossible technically to achieve) vs any type of tab/comma/verticalbar/whatever (I'll refer to any file like this as csv). Using csv, you either have to agree on a convention for labeling, or you're stuck using positions to access data. If your schema changes,
Re:Unix Tab-Separated ASCII Files vs. XML (Score:2)
Re:![CDATA[This is effective XML]] (Score:2)
the current buffer does not have a name.
It doesn't force quit, it forces the write then quits.
Re:comparing to Scott's famous c++ book!? (Score:2)
Well at least I know it's not just my imagination... Anyway, this isn't the place to discuss moderation issues I was told. I simply wanted to thank you for understanding my pov, so there, "Thanks VioletGreen" :)
Re:Why do I have the feeling... (Score:3, Insightful)
I'm not fully convinced that S-expressions are isomorphic to XML either. The proper handling of Unicode and non-English, non-ASCII text presented in multiple encodings is a big advantage of XML compa