XML Namespaces and How They Affect XPath and XSLT

Dare Obasanjo writes: "XML namespaces are an integral aspect of most of the W3C's XML recommendations and working drafts, including XPath, XML Schema, XSLT, XQuery, SOAP, RDF, DOM, and XHTML. Understanding how namespaces work and how they interact with a number of other W3C technologies that are dependent on them is important for anyone working with XML to any significant degree." Some heavy reading below, as Dare completes the thought.

This article explores the ins and outs of XML namespaces and their ramifications on a number of XML technologies that support namespaces. What follows is a shortened version of my first Extreme XML column.

Overview of XML Namespaces

As XML usage on the Internet became more widespread, the benefits of being able to create markup vocabularies that could be combined and reused similarly to how software modules are combined and reused became increasingly important. If a well defined markup vocabulary for describing coin collections, program configuration files, or fast food restaurant menus already existed, then reusing it made more sense than designing one from scratch. Combining multiple existing vocabularies to create new vocabularies whose whole was greater than the sum of its parts also became a feature that users of XML began to require.

However, the likelihood of identical markup, specifically XML elements and attributes, from different vocabularies with different semantics ending up in the same document became a problem. The very extensibility of XML and the fact that its usage had already become widespread across the Internet precluded simply specifying reserved elements or attribute names as the solution to this problem.

The goal of the W3C XML namespaces recommendation was to create a mechanism in which elements and attributes within an XML document that were from different markup vocabularies could be unambiguously identified and combined without processing problems ensuing. The XML namespaces recommendation provided a method for partitioning various items within an XML document based on processing requirements without placing undue restrictions on how these items should be named. For instance, elements named <template>, <output>, and <stylesheet> can occur in an XSLT stylesheet without there being ambiguity as to whether they are transformation directives or potential output of the transformation.

An XML namespace is a collection of names, identified by a Uniform Resource Identifier (URI) reference, which are used in XML documents as element and attribute names.

Namespace Declarations

A namespace declaration is typically used to map a namespace URI to a specific prefix. The scope of the prefix-namespace mapping is the element on which the namespace declaration occurs, together with all of its descendants. An attribute whose name begins with the prefix xmlns: is a namespace declaration. The value of such an attribute is a namespace URI, which is the namespace name.

Here is an example of an XML document where the root element contains a namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore, and its book child element contains an inventory element with a namespace declaration that maps the prefix inv to the namespace name urn:xmlns:25hoursaday-com:inventory-tracking.

<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book>
<bk:title>Lord of the Rings</bk:title>
<bk:author>J.R.R. Tolkien</bk:author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tracking" />
</bk:book>
</bk:bookstore>

In the above example, the scope of the namespace declaration for the urn:xmlns:25hoursaday-com:bookstore namespace name is the entire bk:bookstore element, while that of the urn:xmlns:25hoursaday-com:inventory-tracking namespace name is the inv:inventory element. Namespace-aware processors can process items from both namespaces independently of each other, which makes multi-layered processing of XML documents possible. For instance, RDDL documents are valid XHTML documents that can be rendered by a Web browser but also contain elements from the http://www.rddl.org/ namespace that can be used to locate machine-readable resources about the members of an XML namespace.
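As a rough illustration of processing each namespace independently, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module, which exposes every element name as "{namespace name}local name". The helper name elements_in_namespace is an assumption made for this example and is not part of any library.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_NS = "urn:xmlns:25hoursaday-com:bookstore"
INVENTORY_NS = "urn:xmlns:25hoursaday-com:inventory-tracking"

DOC = """\
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
  <bk:book>
    <bk:title>Lord of the Rings</bk:title>
    <bk:author>J.R.R. Tolkien</bk:author>
    <inv:inventory status="in-stock" isbn="0345340426"
        xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tracking" />
  </bk:book>
</bk:bookstore>
"""

def elements_in_namespace(root, namespace):
    """Return every element at or below root whose name is in the given namespace."""
    wanted = "{" + namespace + "}"
    return [el for el in root.iter() if el.tag.startswith(wanted)]

root = ET.fromstring(DOC)

# An inventory application can ignore the bookstore vocabulary entirely.
inventory_items = elements_in_namespace(root, INVENTORY_NS)
# A bookstore application can ignore the inventory vocabulary entirely.
bookstore_items = elements_in_namespace(root, BOOKSTORE_NS)

for el in inventory_items:
    print(el.tag, el.attrib)
```

Each layer sees only the elements from the vocabulary it understands, regardless of which prefixes the document author chose.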

It should be noted that by definition the prefix xml is bound to the XML namespace name, http://www.w3.org/XML/1998/namespace, and this binding is implicitly declared with document scope in every well-formed XML document.
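The implicit binding of the xml prefix can be observed with a short Python sketch (standard library only). The one-element document is made up for this example; note that it contains no xmlns:xml declaration.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# No namespace declaration for the xml prefix appears anywhere in this document.
root = ET.fromstring('<title xml:lang="en">Lord of the Rings</title>')

# The parser still resolves xml:lang to an attribute in the XML namespace.
lang = root.get("{" + XML_NS + "}lang")
print(lang)
```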

Default Namespaces

The previous section on namespace declarations is not entirely complete because it leaves out default namespaces. A default namespace declaration is an attribute declaration that has the name xmlns and its value is the namespace URI that is the namespace name.

A default namespace declaration specifies that every unprefixed element name in its scope be from the declaring namespace. Below is the bookstore example utilizing a default namespace instead of a prefix-namespace mapping.

<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book>
<title>Lord of the Rings</title>
<author>J.R.R. Tolkien</author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tracking" />
</book>
</bookstore>

All the elements in the above example except for the inv:inventory element belong to the urn:xmlns:25hoursaday-com:bookstore namespace. The primary purpose of default namespaces is to reduce the verbosity of XML documents that utilize namespaces. However, using default namespaces instead of utilizing explicitly mapped prefixes for element names can be confusing because it is not obvious that the elements in the document are namespace scoped.
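Because it is not obvious from the markup that unprefixed elements are namespace scoped, it can help to look at the names a parser actually reports. The following Python sketch (standard library xml.etree.ElementTree) parses a well-formed copy of the default namespace example and lists the name of every element.

```python
import xml.etree.ElementTree as ET

DOC = """\
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
  <book>
    <title>Lord of the Rings</title>
    <author>J.R.R. Tolkien</author>
    <inv:inventory status="in-stock" isbn="0345340426"
        xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tracking" />
  </book>
</bookstore>
"""

root = ET.fromstring(DOC)

# Element names in document order, each written as {namespace name}local-name.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
```

Every unprefixed element is reported with the bookstore namespace name, even though the prefix never appears in the document.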

Also, unlike regular namespace declarations, default namespace declarations can be undeclared by setting the value of the xmlns attribute to the empty string. Undeclaring default namespace declarations is a practice that should be avoided because it may lead to a document that has unprefixed names that belong to a namespace in one part of the document, but don't in another. For example, in the document below only the bookstore element is from the urn:xmlns:25hoursaday-com:bookstore namespace, while the other unprefixed elements have no namespace name.

<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book xmlns="">
<title>Lord of the Rings</title>
<author>J.R.R. Tolkien</author>
<inv:inventory status="in-stock" isbn="0345340426"
xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tracking" />
</book>
</bookstore>

Documents like this one are extremely confusing for readers. For more information on undeclaring namespace declarations, see the section on Namespaces Future.
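One way to spot this situation is to check for elements that have no namespace name inside a document whose root element does have one. The Python sketch below does this with the standard library; the helper name elements_without_namespace is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

DOC = """\
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
  <book xmlns="">
    <title>Lord of the Rings</title>
    <author>J.R.R. Tolkien</author>
    <inv:inventory status="in-stock" isbn="0345340426"
        xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tracking" />
  </book>
</bookstore>
"""

def elements_without_namespace(root):
    """Return the names of all elements that have no namespace name."""
    return [el.tag for el in root.iter() if not el.tag.startswith("{")]

root = ET.fromstring(DOC)
no_namespace = elements_without_namespace(root)

print(root.tag)       # the root element is still in the bookstore namespace
print(no_namespace)   # the unprefixed elements below it are not
```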

Qualified and Expanded Names

A qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character. The XML names used as the prefix and the local name must match the NCName production, which means that they must not contain a colon character. The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.

Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to namespace-aware XML processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML Schema validators.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType id="123" name="fooType"/>
</xs:schema>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType id="123" name="fooType"/>
</xsd:schema>

<schema xmlns="http://www.w3.org/2001/XMLSchema">
<complexType id="123" name="fooType"/>
</schema>

The W3C XML Path Language recommendation describes an expanded name as a pair consisting of a namespace name and a local name. A universal name is an alternate term coined by James Clark to describe the same concept. A universal name consists of a namespace name in curly braces and a local name. Namespaces tend to make more sense to people when viewed through the lens of universal names. Here are the three XML documents from the previous example with the QNames replaced by universal names. Note that the syntax below is not valid XML syntax.

<{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>

<{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>

<{http://www.w3.org/2001/XMLSchema}schema>
<{http://www.w3.org/2001/XMLSchema}complexType id="123" name="fooType"/>
</{http://www.w3.org/2001/XMLSchema}schema>

To many XML applications, the universal names of the elements and attributes in an XML document are what is important, not the prefixes used in specific QNames. The primary reason the Namespaces in XML recommendation does not take the expanded name approach to specifying namespaces is its verbosity. Instead, prefix mappings and default namespaces are provided to save us all from developing carpal tunnel syndrome from typing namespace URIs endlessly.
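Python's standard xml.etree.ElementTree module happens to report element names in exactly this curly-brace notation, which makes it a convenient way to check that the three schema documents shown earlier are equivalent. The sketch below is only an illustration; the helper name universal_names is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

XS_PREFIX = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType id="123" name="fooType"/>
</xs:schema>"""

XSD_PREFIX = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType id="123" name="fooType"/>
</xsd:schema>"""

DEFAULT_NS = """\
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType id="123" name="fooType"/>
</schema>"""

def universal_names(xml_text):
    """Return the universal name of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

names = [universal_names(doc) for doc in (XS_PREFIX, XSD_PREFIX, DEFAULT_NS)]

# The prefixes differ, but the universal names are the same in all three.
print(names[0])
print(names[0] == names[1] == names[2])
```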

Namespaces and Attributes

Namespace declarations do not apply to attributes unless the attribute's name is prefixed. In the XML document shown below the title attribute belongs to the bk:book element and has no namespace while the bk:title attribute has urn:xmlns:25hoursaday-com:bookstore as its namespace name. Note that even though both attributes have the same local name the document is well formed.

<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
<bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>

In the following example, the title attribute still has no namespace and belongs to the book element even though there is a default namespace specified. In other words, attributes cannot inherit the default namespace.

<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
<book title="Lord of the Rings, Book 3" />
</bookstore>
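Both attribute examples can be checked with a short Python sketch using the standard library, where attribute names follow the same "{namespace name}local name" convention as element names. The variable names are chosen for this example only.

```python
import xml.etree.ElementTree as ET

PREFIXED = """\
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
  <bk:book title="Lord of the Rings, Book 3" bk:title="Return of the King"/>
</bk:bookstore>"""

DEFAULTED = """\
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
  <book title="Lord of the Rings, Book 3" />
</bookstore>"""

# First (and only) child of each root element is the book element.
prefixed_book = ET.fromstring(PREFIXED)[0]
defaulted_book = ET.fromstring(DEFAULTED)[0]

# Two distinct attributes: one with no namespace, one in the bookstore namespace.
prefixed_attrs = sorted(prefixed_book.attrib)
# The default namespace applies to the element name but not to its attribute.
defaulted_attrs = sorted(defaulted_book.attrib)

print(prefixed_attrs)
print(defaulted_book.tag, defaulted_attrs)
```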

Namespace URIs

A namespace name is a Uniform Resource Identifier (URI) as specified in RFC 2396. A URI is either a Uniform Resource Locator (URL) or a Uniform Resource Name (URN). URLs are used to specify the location of resources on the Internet, while URNs are supposed to be persistent, location-independent identifiers for information resources. Namespace names are considered to be identical only if they are the same character for character (case-sensitive). The primary justification for using URIs as namespace names is that they already provide a mechanism for specifying globally unique identities.
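The character-for-character rule means that URIs a Web server might treat as equivalent are still different namespace names. A minimal Python sketch (standard library only) shows the effect; the example.org URIs and the helper name root_namespace are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

def root_namespace(xml_text):
    """Return the namespace name of the root element, or '' if it has none."""
    tag = ET.fromstring(xml_text).tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

# Three hypothetical namespace names that differ only in case or a trailing slash.
a = root_namespace('<catalog xmlns="http://example.org/Books"/>')
b = root_namespace('<catalog xmlns="http://example.org/books"/>')
c = root_namespace('<catalog xmlns="http://example.org/books/"/>')

# No normalization is applied, so all three are distinct namespaces.
print(a == b, b == c, a == c)
```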

The XML namespaces recommendation states that namespace names are only to act as unique identifiers and do not have to actually identify network retrievable resources. This has led to much confusion amongst authors and users of XML documents, especially since the usage of HTTP based URLs as namespace names has grown in popularity. Because many applications convert such URIs to hyperlinks, it is irritating to many users that these "links" do not lead to Web pages or other network retrievable resources. I remember one user who likened it to being given a fake phone number in a social situation.

One solution to avoid confusing users is to use a namespace-naming scheme that does not imply network retrievability of the resource. I personally use the urn:xmlns: scheme for this purpose and create namespace names similar to urn:xmlns:25hoursaday-com when authoring XML documents for personal use. The problem with homegrown namespace URIs is that they may run counter to the intent of the Namespaces in XML recommendation by not being globally unique. I get around the globally unique requirement by using my personal domain name http://www.25hoursaday.com as part of the namespace URI.

Another solution is to leave a network retrievable resource at the URI that is the namespace name, such as is done with the XSLT and RDDL namespaces. Typically, such URIs are actually HTTP URLs. A good way to name such URLs is by using the format favored by the W3C, which is as follows:

http://my.domain.example.org/product/[year/month][/area]

See the section on Namespaces and Versioning for more information on using similarly structured namespace names as a versioning mechanism.

DOM, XPath, and the XML Information Set on Namespaces

The W3C has defined a number of technologies that provide a data model for XML documents. These data models are generally in agreement, but sometimes differ in how they treat various edge cases due to historic reasons. Treatment of XML namespaces and namespace declarations is an example of an edge case that is treated differently in the three primary data models that exist as W3C recommendations. The three data models are the XPath data model, the Document Object Model (DOM), and the XML information set.

The XML information set (XML infoset) is an abstract description of the data in an XML document and can be considered to be the primary data model for an XML document. The XPath data model is a tree-based model that is traversed when querying an XML document and is similar to the XML information set. The DOM predates both of the other data models but is similar to them in a number of ways. Both the DOM and the XPath data model can be considered to be interpretations of the XML infoset.

Namespaces in the Document Object Model (DOM)

The XML namespace section of the DOM Level 3 specification considers namespace declarations to be regular attribute nodes that have http://www.w3.org/2000/xmlns/ as their namespace name and xmlns as their prefix or qualified name.

Elements and attributes in the DOM have a namespace name that cannot be altered after they have been created regardless of whether their location within the document changes or not.
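The DOM view of namespace declarations can be inspected with Python's xml.dom.minidom, a small DOM implementation in the standard library. This is only a sketch of how one particular DOM implementation behaves, not a statement about every DOM library.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore"/>'
)
root = doc.documentElement

# The namespace declaration is reachable as an ordinary attribute node.
decl = root.getAttributeNode("xmlns:bk")

print(decl.name)          # qualified name of the declaration attribute
print(decl.namespaceURI)  # namespace name assigned to declarations by the DOM
print(decl.value)         # the namespace name being declared
print(root.namespaceURI)  # namespace name of the element itself
```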

Namespaces in the XPath Data Model

The W3C XPath recommendation does not consider namespace declarations to be attribute nodes and does not provide access to them in that capacity. Instead, in XPath every element in an XML document has a number of namespace nodes that can be retrieved using the XPath namespace navigation axis.

Each element in the document has its own set of namespace nodes, one for each namespace declaration in scope for that particular element. Namespace nodes belong to the element they are attached to, so the namespace nodes of two different elements are not identical even when they represent the same namespace declaration.
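Python's standard library does not implement the XPath namespace axis, but the idea of a per-element set of in-scope namespaces can be approximated by tracking namespace declarations while parsing. The sketch below does this with xml.etree.ElementTree.iterparse; the function name in_scope_namespaces and the list-of-pairs result are assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET

DOC = """\
<bk:bookstore xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
  <bk:book>
    <inv:inventory status="in-stock"
        xmlns:inv="urn:xmlns:25hoursaday-com:inventory-tracking" />
  </bk:book>
</bk:bookstore>
"""

def in_scope_namespaces(xml_text):
    """Return (element name, {prefix: namespace name}) for every element."""
    # The xml prefix is always in scope, so the stack starts with it.
    scopes = [("xml", "http://www.w3.org/XML/1998/namespace")]
    result = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start", "start-ns", "end-ns")):
        if event == "start-ns":
            scopes.append(payload)      # payload is a (prefix, uri) pair
        elif event == "end-ns":
            scopes.pop()                # the declaration has gone out of scope
        else:
            # Each element gets its own copy of the mappings, much as each
            # element has its own namespace nodes in the XPath data model.
            result.append((payload.tag, dict(scopes)))
    return result

scopes_by_element = in_scope_namespaces(DOC)
for name, mappings in scopes_by_element:
    print(name, sorted(mappings))
```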

Namespaces in the XML Information Set

The XML infoset recommendation considers namespace declarations to be attribute information items.

In addition, similar to the XPath data model, each element information item in an XML document's information set has a namespace information item for each namespace that is in scope for the element.

XPath, XSLT and Namespaces

The W3C XML Path Language, also known as XPath, is used to address parts of an XML document and is used in a number of W3C XML technologies including XSLT, XPointer, XML Schema, and DOM Level 3. XPath uses a hierarchical addressing mechanism similar to that used in file systems and URLs to retrieve pieces of an XML document. XPath supports rudimentary manipulation of strings, numbers, and Booleans.

XPath and Namespaces

The XPath data model treats an XML document as a tree of nodes, such as element, attribute, and text nodes, where the name of each node is a combination of its local name and its namespace name (that is, its universal or expanded name).

For element and attribute nodes without namespaces, performing XPath queries is fairly straightforward. The following program, which can be used to query XML documents using the command line, shall be used to demonstrate the impact of namespaces on XPath queries.

using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

class XPathQuery {

    // Concatenates the messages of an exception and all of its inner exceptions.
    public static string PrintError(Exception e, string errStr) {
        if (e == null)
            return errStr;
        else
            return PrintError(e.InnerException, errStr + e.Message);
    }

    public static void Main(string[] args) {
        // Expect a source and a query, followed by zero or more prefix/namespace pairs.
        if ((args.Length == 0) || (args.Length % 2) != 0) {
            Console.WriteLine("Usage: xpathquery source query " +
                "<zero or more prefix and namespace pairs>");
            return;
        }

        try {
            // Load the file.
            XmlDocument doc = new XmlDocument();
            doc.Load(args[0]);

            // Create prefix<->namespace mappings (if any).
            XmlNamespaceManager nsMgr = new XmlNamespaceManager(doc.NameTable);
            for (int i = 2; i < args.Length; i += 2)
                nsMgr.AddNamespace(args[i], args[i + 1]);

            // Query the document.
            XmlNodeList nodes = doc.SelectNodes(args[1], nsMgr);

            // Print output.
            foreach (XmlNode node in nodes)
                Console.WriteLine(node.OuterXml + "\n\n");
        } catch (XmlException xmle) {
            Console.WriteLine("ERROR: XML Parse error occurred because " +
                PrintError(xmle, null));
        } catch (FileNotFoundException fnfe) {
            Console.WriteLine("ERROR: " + PrintError(fnfe, null));
        } catch (XPathException xpath) {
            Console.WriteLine("ERROR: The following error occurred while querying " +
                "the document: " + PrintError(xpath, null));
        } catch (Exception e) {
            Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, null));
        }
    }
}

Given the following XML document that does not declare any namespaces, queries are fairly straightforward as seen in the examples following the code.

<?xml version="1.0" encoding="utf-8" ?>
<bookstore>
<book genre="autobiography">
<title>The Autobiography of Benjamin Franklin</title>
<author>
<first-name>Benjamin</first-name>
<last-name>Franklin</last-name>
</author>
<price>8.99</price>
</book>
<book genre="novel">
<title>The Confidence Man</title>
<author>
<first-name>Herman</first-name>
<last-name>Melville</last-name>
</author>
<price>11.99</price>
</book>
</bookstore>

Example 1

  1. xpathquery.exe bookstore.xml /bookstore/book/title

    Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns:
    <title>The Autobiography of Benjamin Franklin</title>
    <title>The Confidence Man</title>

  2. xpathquery.exe bookstore.xml //@genre

    Selects all the genre attributes in the document and returns:
    genre="autobiography"
    genre="novel"

  3. xpathquery.exe bookstore.xml //title[(../author/first-name = 'Herman')]

    Selects all the titles where the author's first name is "Herman" and returns:
    <title>The Confidence Man</title>

    However, once namespaces are added to the mix, things are no longer as simple. The file below is identical to the original file except for the addition of namespaces and one attribute to one of the book elements.

    <bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
    <book genre="autobiography">
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
    <first-name>Benjamin</first-name>
    <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
    </book>
    <bk:book genre="novel" bk:genre="fiction"
    xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
    <bk:title>The Confidence Man</bk:title>
    <bk:author>
    <bk:first-name>Herman</bk:first-name>
    <bk:last-name>Melville</bk:last-name>
    </bk:author>
    <bk:price>11.99</bk:price>
    </bk:book>
    </bookstore>

    Note that the default namespace is in scope for the whole XML document, while the namespace declaration that maps the prefix bk to the namespace name urn:xmlns:25hoursaday-com:bookstore is in scope for the second book element only.

Example 2

  1. xpathquery.exe bookstore.xml /bookstore/book/title

    Selects all the title elements that are children of the book element whose parent is the bookstore element, which returns NO RESULTS.

  2. xpathquery.exe bookstore.xml //@genre

    Selects all the genre attributes in the document and returns:
    genre="autobiography"
    genre="novel"

  3. xpathquery.exe bookstore.xml //title[(../author/first-name = 'Herman')]

    Selects all the titles where the author's first name is "Herman," which returns NO RESULTS.

    The first query returns no results because unprefixed names in an XPath query apply to elements or attributes with no namespace. There are no bookstore, book, or title elements in the target document that have no namespace. The second query returns all attribute nodes that have no namespace. Although namespace declarations are in scope for both attribute nodes returned by the query, they have no namespace because namespace declarations do not apply to attributes with unprefixed names. The third query returns no results for the same reasons the first query returns no results.

    The way to perform namespace-aware XPath queries is to provide a prefix to namespace mapping to the XPath engine, then use those prefixes in the query. The prefixes provided do not need to be the same as the namespace to prefix mappings in the target document, and they must be non-empty prefixes.

Example 3

  1. xpathquery.exe bookstore.xml /b:bookstore/b:book/b:title b urn:xmlns:25hoursaday-com:bookstore

    Selects all the title elements that are children of the book element whose parent is the bookstore element and returns the following:
    <title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
    <bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">The Confidence Man</bk:title>

  2. xpathquery.exe bookstore.xml //@b:genre b urn:xmlns:25hoursaday-com:bookstore

    Selects all the genre attributes from the "urn:xmlns:25hoursaday-com:bookstore" namespace in the document and returns:
    bk:genre="fiction"

  3. xpathquery.exe bookstore.xml //bk:title[(../bk:author/bk:first-name = 'Herman')] bk urn:xmlns:25hoursaday-com:bookstore

    Selects all the titles where the author's first name is "Herman" and returns:
    <bk:title xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">The Confidence Man</bk:title>

    Note: These queries are the same as those in the previous examples, rewritten to be namespace aware.
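The same behavior is not specific to .NET. As a hedged illustration, the Python sketch below runs comparable queries with the standard library's xml.etree.ElementTree, which supports only a subset of XPath and evaluates paths relative to the root element, so the paths are not identical to the ones above. The document is a shortened copy of the namespaced bookstore file.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_NS = "urn:xmlns:25hoursaday-com:bookstore"

DOC = """\
<bookstore xmlns="urn:xmlns:25hoursaday-com:bookstore">
  <book genre="autobiography">
    <title>The Autobiography of Benjamin Franklin</title>
  </book>
  <bk:book genre="novel" bk:genre="fiction"
      xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">
    <bk:title>The Confidence Man</bk:title>
  </bk:book>
</bookstore>
"""

root = ET.fromstring(DOC)

# Unprefixed names only match elements that have no namespace: no results.
unprefixed = root.findall("book/title")

# The prefix b is mapped by the query, not by the document, so it does not
# have to match the prefixes (or default namespace) used in the document.
prefixed = root.findall("b:book/b:title", {"b": BOOKSTORE_NS})
titles = [el.text for el in prefixed]

# Unprefixed attributes have no namespace; bk:genre is in the bookstore namespace.
plain_genres = [el.get("genre") for el in root.iter() if el.get("genre")]
ns_key = "{" + BOOKSTORE_NS + "}genre"
ns_genres = [el.get(ns_key) for el in root.iter() if el.get(ns_key)]

print(unprefixed)
print(titles)
print(plain_genres, ns_genres)
```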

For more information on using XPath, read Aaron Skonnard's article Addressing Infosets with XPath and view the examples at the ZVON.org XPath tutorial.

XSLT and Namespaces

The W3C XSL Transformations (XSLT) recommendation describes an XML-based language for transforming XML documents into other XML documents. XSLT transformations, also known as XSLT stylesheets, use XPath patterns to match nodes in the source document. When nodes in the source document match, templates that specify the output of a successful match are instantiated and used to transform the document.

Support for namespaces is tightly integrated into XSLT, especially since XPath is used for matching nodes in the source document. Using namespaces in your XPath expressions inside XSLT is much easier than using the DOM.

The example that follows contains:

  • A program for use in executing transforms from the command line.
  • An XSLT stylesheet that prints all the title elements from the urn:xmlns:25hoursaday-com:bookstore namespace in the source XML document when run against the bookstore document from the urn:xmlns:25hoursaday-com:bookstore namespace.
  • The resulting output.

Program

Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Class Transformer

    'Concatenates the messages of an exception and all of its inner exceptions.
    Public Shared Function PrintError(e As Exception, errStr As String) As String
        If e Is Nothing Then
            Return errStr
        Else
            Return PrintError(e.InnerException, errStr + e.Message)
        End If
    End Function 'PrintError

    'Entry point. The args array holds only the command-line arguments;
    'Environment.GetCommandLineArgs() would also include the program name.
    Public Shared Sub Main(args() As String)
        Run(args)
    End Sub 'Main

    Public Shared Sub Run(args() As String)
        If args.Length <> 2 Then
            Console.WriteLine("Usage: xslt source stylesheet")
            Return
        End If

        Try
            'Create the XslTransform object.
            Dim xslt As New XslTransform()

            'Load the stylesheet.
            xslt.Load(args(1))

            'Transform the file.
            Dim doc As New XmlDocument()
            doc.Load(args(0))

            xslt.Transform(doc, Nothing, Console.Out)

        Catch xmle As XmlException
            Console.WriteLine("ERROR: XML Parse error occurred because " + _
                PrintError(xmle, Nothing))
        Catch fnfe As FileNotFoundException
            Console.WriteLine("ERROR: " + PrintError(fnfe, Nothing))
        Catch xslte As XsltException
            Console.WriteLine("ERROR: The following error occurred while " + _
                "transforming the document: " + PrintError(xslte, Nothing))
        Catch e As Exception
            Console.WriteLine("UNEXPECTED ERROR" + PrintError(e, Nothing))
        End Try
    End Sub 'Run
End Class 'Transformer

XSLT stylesheet

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:b="urn:xmlns:25hoursaday-com:bookstore">
<xsl:template match="b:bookstore">
<book-titles>
<xsl:apply-templates select="b:book/b:title"/>
</book-titles>
</xsl:template>
<xsl:template match="b:title">
<xsl:copy-of select="." />
</xsl:template>
</xsl:stylesheet>

Output

<?xml version="1.0" ?>
<book-titles xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:ext="urn:my_extensions" xmlns:b="urn:xmlns:25hoursaday-com:bookstore">
<title xmlns="urn:xmlns:25hoursaday-com:bookstore">The Autobiography of Benjamin Franklin</title>
<bk:title xmlns="urn:xmlns:25hoursaday-com:bookstore"
xmlns:bk="urn:xmlns:25hoursaday-com:bookstore">The Confidence Man</bk:title>
</book-titles>

Note that the namespace declarations from the stylesheet end up on the root element of the output XML document. Note also that the XSLT namespace is not included in the output XML document.

Generating XSLT stylesheets as the output of your XSLT transforms is slightly cumbersome because the processor has to be able to distinguish the output elements from the actual stylesheet directives. There are two ways I have found to deal with this issue, both of which I'll illustrate by showing stylesheets that generate the following XSLT stylesheet as output.

<xslt:stylesheet version="1.0"
xmlns:xslt="http://www.w3.org/1999/XSL/Transform">
<xslt:output method="text"/>
<xslt:template match="/"><xslt:text>HELLO WORLD</xslt:text></xslt:template>
</xslt:stylesheet>

The first method involves creating a variable containing the stylesheet to be created, and then using value-of in combination with the disable-output-escaping attribute to create the stylesheet.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="utf-8"/>
<xsl:variable name="stylesheet">
&lt;xslt:stylesheet version="1.0"
xmlns:xslt="http://www.w3.org/1999/XSL/Transform"&gt;
&lt;xslt:output method="text"/&gt;
&lt;xslt:template match="/"&gt;&lt;xslt:text&gt;HELLO WORLD&lt;/xslt:text&gt;&lt;/xslt:template&gt;
&lt;/xslt:stylesheet&gt;
</xsl:variable>
<xsl:template match="/">
<xsl:value-of select="$stylesheet" disable-output-escaping="yes" />
</xsl:template>
</xsl:stylesheet>

This first method works best if the stylesheet being created can be easily partitioned so that it can be placed in variables. While this technique is quick and easy, it also falls into the category of gross hacks, which typically become unmanageable when faced with any situation requiring flexibility. For instance, when creation of the new stylesheet involves lots of dynamic creation of text and is intertwined with the stylesheet directives, the following method is preferable to the aforementioned gross hack.

<xslt:stylesheet version="1.0" xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
xmlns:alias="http://www.w3.org/1999/XSL/Transform-alias">
<xslt:output method="xml" encoding="utf-8"/>
<xslt:namespace-alias stylesheet-prefix="alias" result-prefix="xslt"/>
<xslt:template match="/">
<alias:stylesheet version="1.0">
<alias:output method="text"/>
<alias:template match="/"><alias:text>HELLO WORLD</alias:text></alias:template>
</alias:stylesheet>
</xslt:template>
</xslt:stylesheet>

The above document uses the namespace-alias directive to substitute the alias prefix and namespace name it is bound to with the xslt prefix and the namespace name to which it is bound.

Namespaces are also used to specify mechanisms for the extension of XSLT. Namespace prefixed functions can be created that are executed in the same manner as XSLT functions. Similarly, elements from certain namespaces can be treated as extensions to XSLT and executed as if they were transformation directives like template, copy, value-of, and so on. Below is an example of a Hello World program that uses namespace-based extension functions to print the signature greeting.

<stylesheet version="1.0"
xmlns="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:newfunc="urn:my-newfunc">
<output method="text"/>
<template match="/">
<value-of select="newfunc:SayHello()" />
</template>
<msxsl:script language="JavaScript" implements-prefix="newfunc">
function SayHello() {
return "Hello World";
}
</msxsl:script>
</stylesheet>

XML Namespace Caveats

Namespaces in XML, like any useful tool, can be used improperly and have various subtleties that may cause problems if users are unaware of them. This section focuses on areas where users of XML namespaces typically have problems or face misconceptions.

Versioning and Namespaces

There are two primary mechanisms used in practice to create different versions of an XML instance document. One method is to use a version attribute on the root element as is done in XSLT, while the other method is to use the namespace name of the elements as the versioning mechanism. Versioning based on namespaces is currently very popular, especially with the W3C, who have used this mechanism for various XML technologies including SOAP, XHTML, XML Schema, and RDF. The namespace URI for documents that are versioned using the namespace is typically in the following format:

http://my.domain.example.org/product/[year/month][/area]

The primary problem with versioning XML documents by altering the namespace name in subsequent versions is that namespace-aware applications that process the documents will no longer work with them and will have to be upgraded. This approach is therefore primarily beneficial for document formats whose versions change infrequently but, upon changing, alter the semantics of elements and attributes, so that it is desirable for existing processors to refuse newer versions rather than risk misinterpreting them.

On the other hand, there are a number of scenarios where an XML document versioning mechanism based on a version attribute on the root element is sufficient. A version attribute is primarily beneficial when changes in the document's structure are backwards compatible. The following situations are all areas where using a version attribute is a wise choice:

  • Semantics of elements and attributes will not be altered.
  • Changes to the document involve the addition of elements and attributes, but rarely removal.
  • Interoperability between applications with various versions of the processing software is necessary.

The two versioning techniques are not mutually exclusive and can be used simultaneously. For instance, XSLT uses both a version attribute on the root element and a versioned namespace URI. The version attribute is used for incremental, backwards-compatible changes to the XML document's format, while altering the namespace name is done for significant changes in the semantics of the document.
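To make the trade-off concrete, here is a minimal Python sketch of a processor that combines both mechanisms. Everything in it is hypothetical: the namespace name follows the URI pattern shown above, and the function name classify, the result strings, and the policy of handling newer minor versions on a best-effort basis are assumptions made for illustration rather than rules taken from any specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name and format version understood by this processor.
KNOWN_NAMESPACE = "http://my.domain.example.org/product/2002/03"
KNOWN_VERSION = (1, 1)

def classify(xml_text):
    """Decide how a processor for KNOWN_NAMESPACE should treat a document."""
    root = ET.fromstring(xml_text)
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""

    # A different namespace name signals changed semantics: refuse to guess.
    if namespace != KNOWN_NAMESPACE:
        return "reject"

    # Same namespace: the version attribute only signals compatible additions.
    version = tuple(int(part) for part in root.get("version", "1.0").split("."))
    if version > KNOWN_VERSION:
        return "process-best-effort"
    return "process"

same = classify('<product xmlns="http://my.domain.example.org/product/2002/03" version="1.0"/>')
newer = classify('<product xmlns="http://my.domain.example.org/product/2002/03" version="1.2"/>')
other = classify('<product xmlns="http://my.domain.example.org/product/2003/01" version="1.0"/>')

print(same, newer, other)
```

The namespace check is the coarse gate for incompatible changes, while the version attribute lets older processors keep working with documents that only add to the format.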

Document Types

The term document type is misleading, as discussed in several philosophical debates on various XML-related mailing lists. In many cases, the namespace name of the root element can be used to determine how to process the document. However, this is hardly a general rule, and stating it as such violates the spirit of XML namespaces, which were designed precisely so that developers could mix and match XML vocabularies.

A succinct explanation of why it is a mistake to equate the namespace URI of the root element with a notion of document type is this post by Rick Jelliffe on XML-DEV. The essence of the post is that there are many different types that an XML document could have, including its document type as specified by its Document Type Definition (DTD), its MIME media type, its schema definition as specified by the xsi:schemaLocation attribute, its file extension, as well as the namespace name of its root element. Thus it is quite likely that in many cases a document will have many different types depending on what perspective one decides to take when examining the document.

Two examples of XML documents whose actual document types can be misconstrued by simply looking at the namespace URI of the root element are RDDL documents (sample, notice that its root element is from the XHTML namespace) and annotated mapping schemas, whose root element is from the W3C XML Schema namespace.

In a nutshell, the type of a document cannot conclusively be determined by looking at the namespace URI of its root element. Thinking otherwise is folly.
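To make the point concrete, here is a small Python sketch that compares the namespace of the root element with the namespaces of all elements in a document. The sample document is a heavily simplified, hypothetical RDDL-style page (real RDDL resources carry additional attributes), the RDDL namespace name is given as I recall it, and the helper names split_tag and element_namespaces are invented for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
RDDL_NS = "http://www.rddl.org/"  # assumed RDDL namespace name

def split_tag(tag):
    """Split ElementTree's "{uri}local" form into (uri, local)."""
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag

def element_namespaces(xml_text):
    """Return the root element's namespace and the set used by all elements."""
    root = ET.fromstring(xml_text)
    root_ns = split_tag(root.tag)[0]
    all_ns = {split_tag(el.tag)[0] for el in root.iter()}
    return root_ns, all_ns

# Simplified, hypothetical RDDL-style document: XHTML root, RDDL content.
doc = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rddl="http://www.rddl.org/">
  <body>
    <rddl:resource>
      <p>A hypothetical resource description.</p>
    </rddl:resource>
  </body>
</html>"""

root_ns, all_ns = element_namespaces(doc)
print(root_ns)
print(sorted(all_ns))
```

The root element reports only the XHTML namespace, yet the document also contains elements from a second vocabulary that a dispatcher keyed on the root element alone would never notice.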

Namespaces Future

There are a number of developments in the XML world focused on tackling some of the issues that have developed around XML namespaces. Firstly, the current version of the W3C XML namespaces recommendation does not provide a mechanism for undeclaring namespaces that have been mapped to a prefix. The W3C XML namespaces v1.1 working draft is intended to rectify this oversight by providing a mechanism for undeclaring prefix-to-namespace mappings in an instance document.
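The difference is easy to observe with a parser that follows the current recommendation. The Python sketch below assumes that the parser bundled with Python's standard library implements only version 1.0 of the namespaces recommendation, which is my understanding but is not stated anywhere in this article; the helper name parses and the urn:example:x namespace name are made up for the example.

```python
import xml.etree.ElementTree as ET

def parses(xml_text):
    """Hypothetical helper: True if the text is accepted by the parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Resetting the default namespace with xmlns="" is already allowed.
reset_default = '<a xmlns="urn:example:x"><b xmlns=""/></a>'

# Undeclaring a prefix with xmlns:p="" is what the 1.1 draft would add;
# a parser that follows the current recommendation rejects it.
undeclare_prefix = '<p:a xmlns:p="urn:example:x"><b xmlns:p=""/></p:a>'

print(parses(reset_default))
print(parses(undeclare_prefix))
```

Only the first document is accepted; the second is the construct that the v1.1 working draft proposes to make legal.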

The question of what should be returned on an attempt to dereference the contents of a namespace URI has led to contentious debate in the XML world and is currently the focus of deliberations by the W3C's Technical Architecture Group. The current version of the XML namespaces recommendation does not require the namespace URI to actually be resolvable, because a namespace URI is supposed to be merely a namespace name that is used as a unique identifier, and not the location of a resource on the Internet.
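A short Python sketch shows what treating the namespace URI purely as a name means in practice. The inventory vocabulary and its example.invalid URI are hypothetical and deliberately cannot be retrieved; the parser never attempts to fetch them, and matching is a plain string comparison.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary whose namespace name cannot be dereferenced.
doc = """<inv:inventory xmlns:inv="http://example.invalid/no/such/page">
  <inv:item>widget</inv:item>
</inv:inventory>"""

root = ET.fromstring(doc)  # parses without any network access

# The prefix used in the query ("i") need not match the one in the document.
same_name = {"i": "http://example.invalid/no/such/page"}
items = [el.text for el in root.findall("i:item", same_name)]

# A trailing slash makes a different string, and so a different namespace.
other_name = {"i": "http://example.invalid/no/such/page/"}
missed = root.findall("i:item", other_name)

print(items)
print(len(missed))
```

The query finds the item only when the namespace name matches exactly, regardless of whether anything exists at that address.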

Tim Bray (one of the original editors of both the XML Language and XML namespaces recommendations) has written an exhaustive treatise on the issues around namespace URIs and the namespace documents that may or may not be retrieved from them. This document contains much of the reasoning that was behind his creation of the Resource Directory Description Language (RDDL), which is designed to be used for creating namespace documents.

  • Helpful! (Score:5, Funny)

    by Jonboy X ( 319895 ) <jonathan,oexner&alum,wpi,edu> on Thursday May 30, 2002 @12:35PM (#3610520) Journal
    Finally, a clear, concise tutorial on XML namespaces. Just goes to show what a simple, easily understood, efficient and applicable technology XML is...
    • XML was intended to be a simple and actually useful subset of SGML. and it was in the beginning. then we got namespaces and the long slide into overdesigned bloated academic hell began. the same thing has happened to everything XML touches. look at web services (shudders). or maybe it's just the W3C. all nice in theory but it's turned into a three-ringed circus for people with way too much time on their hands and brains too big to grasp the pragmatic, useful, elegant or simple aspects of technology implementations.
      • Agreed. It's way complex for not much gain.
        All the namespace crap makes it impossible
        to read or write. And what good is it really?
        • And what good is it really?

          Can you imagine if, say, every email address in the world belonged to the same domain? Can you imagine if every Java class in the world belonged to the same package, or if every Perl method belonged to the same module? Now imagine if every XML element in the world had to come from the same (null) namespace. When you consider the millions of people that use XML, you *will* get conflicts.

          XML Namespaces is just a way to add a unique identifier to elements that happen to have the same name. <title> can mean a thousand different things, depending on whether it refers to a book, a web page, ownership of your car, or the title of "Vice President". Now if you ever want to describe both the fact that you are a Vice President and that you own a car in the same XML document, you damned well better have a way of differentiating between the two "title"s.

          Is it really that hard to use anyway? All you have to do is stick an extra little xmlns="..." at the top of your document -- you don't even have to use potentially-ugly prefixes -- and it's not any worse than typing <!DOCTYPE foo SYSTEM "..."> at the top of the page. And if you really really don't want to play in the global game and you know that none of your elements will ever cross boundaries with another document type, then you can just ignore namespaces altogether.

          It's complex, but I'll contend that it has the most "gain" of any XML recommendation to date.
      • Yes, XML is overcomplicated and underpowered, but don't blame academia.

        I'd argue that it's the unscientific nature of XML standards development, in particular the remarkable failure to learn from prior art such as LISP and database theory, that is responsible for this current mess.

        I've never worked in an academic institution, but I know that I'd be scouring the research papers for ideas if I was responsible for an XML-related activity. Academics are quite good at synthesising ideas because they have time to look around. Pragmatics like us will just grab the first thing that works because normally it doesn't matter that much, and only in the worst case will some model be stretched completely beyond its appropriate domain. Unfortunately, this is just what's happened with XML.
      • Au contraire. If the originators had bothered to talk to any half-decent academics, you wouldn't have this mess.

        For Heaven's sake, XML is pretty much just an extraordinarily verbose way of describing tree structures. There are issues of naming, cross referencing, quotation, and grammatical correctness - none of which are hard problems (although the prevalent hack-it-and-ship-it mentality consistently screws up; we should teach more basic theory and less Java/C++/object-babble at universities...)
  • Duh (Score:1, Insightful)

    by TheOldFart ( 578597 )
    Blah, blah, blah... nice and dandy. Until Microsoft comes along and decides to "fix it".

    Couldn't this be linked instead of transcribing the whole shebang here?

  • by tps12 ( 105590 )
    My proxy server filtered this entire story out due to pornographical content.
  • Wow! (Score:5, Insightful)

    by dbretton ( 242493 ) on Thursday May 30, 2002 @12:37PM (#3610530) Homepage
    If slashdot posted more stuff like this, I would almost be persuaded to subscribe

    .almost.

    • Re:Wow! (Score:5, Insightful)

      by moonbender ( 547943 ) <moonbender AT gmail DOT com> on Thursday May 30, 2002 @12:40PM (#3610563)
      Yep, we need more of this. Some small tutorials, and lots of additions and enhancements to them in their discussion. Maybe someone would be willing to rewrite the tutorial after a week with all the +4 replies in mind. Sounds pretty sweet, I think.
      • Re:Wow! (Score:2, Insightful)

        by Winged Cat ( 101773 )
        Why, yes! And we could call it, oh, say, Technical Documentation Already Available On The Web (TDAAOTW for short). We could shamelessly copy-and-paste from the W3C's articles on XML (nevermind that they'd update their information as they found holes, and might not tell us to update our copy), whatever tutorials we could find by googling on the right words...

        Slashdot has a function. Setting up the infrastructure to do tutorials et al properly would distract from its function, and likely not increase the budgeted resources it has to serve its existing function. (If it paid for itself, that'd be a point in its favor, but I don't think it would.)
        • As I highlighted, I think the infrastructure to do this properly is already there.
          • On the Web at large, yes. I would have to disagree that Slashdot is properly set up to do this, though - but the difference may be that, to me, the necessary "infrastructure" is more of a human issue (for instance, people who know how to evaluate tutorials, and who have a mindset to actually edit the tutorials instead of focus on a news-with-user-comments site) than a technology issue.
      • Sounds like you're proposing to turn /. into kuro5hin [kuro5hin.org].

      • I could get flamed for endorsing a commercial establishment - and, no, I don't work for them - but I've found some of the tutorial articles that IBM posts at their developerWorks website to be quite nice.

      • I suggested something like this to LDP a while back.

        A system like this would do wonders for
        the linux documentation.

        Knud
    • A secret (Score:2, Interesting)

      by b0z ( 191086 )
      The author of this story actually works for Microsoft. I believe he just had his first article published in whatever the MSDN publication is as well.

      I find it ironic that so many people here bash Microsoft when one of their employees has been one of the best contributors of original content to Slashdot. Of course, the moment Bill Gates gets a story posted here is the moment I delete /. from my bookmarks.

      • Unfortunately, the subject matter of the article is XML, which is mainly a W3C and MS-hyped reimplementation of some 20-year-old+ Lisp concepts.

        The article doesn't mention just how braindead XML actually is.

        It's like a religion - if you start with the premise that XML is good, then it all fits together and the article is valuable to you, but just because the article is a good report on XML doesn't mean that XML itself is any good. This is a common mistake people make, and this article furthers MS's goals of pushing the XML-religion. The author may himself be quite clever, but, just as many reasonably intelligent people fall for religions, he may be simply misguided.
    • Whoever modded this up let me know what you are smoking - I want some too.
  • Jees... one of the longest pieces of text i read on slashdot.. Someone forgot the web could be used to hyperlink to such information and let it be viewable for a larger public..

    Okay.. there is no larger public as the /. crowd.. i know...
  • Great in theory (Score:2, Informative)

    by Geeyzus ( 99967 )
    One point he makes towards the bottom is the controversy behind actually de-referencing the namespace URIs. Right now as far as I can tell, namespaces do you no good by themselves. You still have to embed any outside XML into your application or write code to dereference the URI anyhow.

    Once de-referencing URIs is built into parsers I think namespaces (and XML in general) will be much more useful.

    Mark
    • Re:Great in theory (Score:4, Informative)

      by ajm ( 9538 ) on Thursday May 30, 2002 @02:01PM (#3611203)
      Namespace URIs are just globally unique identifiers. Though they look like URLs they don't point to any "physical" file. There is no need to dereference a namespace URI. It's a common mistake to think that there is.
      • Exactly - this is a very common mistake. What people also don't realize is that namespaces work very well with XML Schema, and allow for easy ways to compose "modular" XML instances, where the different parts are described in different schemas. Such use also allows for customizations and overriding of the different "modules" similar to how OO programs can manipulate objects.

  • XML namespaces are touted as a wonderful new invention. Unfortunately they're just a non backwards compatible copy of what SGML could already do with the CONCUR declaration.


    SGML already had: <(p.anthology)page> long before XML was even dreamed of.


    *yawn*


    --Azaroth

  • Very good (Score:1, Interesting)

    by SkyLeach ( 188871 )
    Some good stuff in here. A good read for anyone using serious XML.

    I kindof see the point of it being a bit longish for the /. front page.

    How 'bout a link to a more permanent article source/website?
  • I hope people will look past the C# code in the write up and look at the useful information. Now if some one would post a good article detailing the ins and outs of XSL and XPath, that would be good. If I wasn't lazy, I'd write one. But I'm way too lazy and have real work to do.
  • XM Hell (Score:2, Interesting)

    Namespaces and all that are nice, but surly they should make XSL more functional first.
    If the initial idea of XML/XSL was to make data protable and transformable they should have been designed with more functionality to do this.

    1 good example is Binary operations, all kinds of data is store like this especially in legacy or the mainframe systems i've been working with ,but XSL provides no weasy way of formatting this data into somthing usefull.

    You can script a # to binary function but you have to use nasty itterative functions instead of loops

    e.g. in sudo stuff

    myfunction(somthingusefull,counter,limit){
    if(counter < limit){
    bytepos = bytepos *2;
    counter = counter +1;
    dosomthingwith(somthingusefull);
    myfunction(bytepos ,counter ,limit );
    }
    }

    Well i'm sure you get my drift so I'll leave it there for now.
    • Re:XM Hell (Score:3, Insightful)

      by JimDabell ( 42870 )

      1 good example is Binary operations, all kinds of data is store like this especially in legacy or the mainframe systems i've been working with ,but XSL provides no weasy way of formatting this data into somthing usefull.

      XSL is a tool to transform XML documents. Of course it doesn't touch binary formats, that's outside its scope.


    • XM Hell?

      surly they should make XSL more functional first?

      protable and transformable?

      no weasy way?

      itterative functions?

      somthingusefull?

      Well i'm sure you get my drift so I'll leave it there for now.


      I haven't got the faintest idea what your drift might be, sorry...
  • Woah!! (Score:4, Funny)

    by zsmooth ( 12005 ) on Thursday May 30, 2002 @01:04PM (#3610756)
    An intelligent, well-written technical article?? Here?? I thought I was at K5 for a minute!
    • Re:Woah!! (Score:1, Offtopic)

      by Dalroth ( 85450 )
      heh, this is exactly what SlashDot needs! Back in the day SlashDot used to have a ton of programmer oriented content like this, which is why I started reading SlashDot in the first place. Then, one day patents, Micro$oft abuse, privacy, and Jon Katz took over and SlashDot has not been the same since.

      SlashDot still has a lot of good content (especially if you block Katz), but face it, we're GNU/Linux geeks. We're programmers and sysadmins at heart! We need more real *TECH* content like this!

      Bryan
  • <wheelrant>
    XML is becoming a very powerful, indeed, magical language. Perhaps one day before long, it'll have for loops and a query language (oops, xpath, already exists). Why, before long, we might just have a reference counting garbage collector for those XML namespaces. Then maybe a cyclical garbage collector.
    </wheelrant>
  • by Leeji ( 521631 ) <slashdot@leeholme s . com> on Thursday May 30, 2002 @01:12PM (#3610817) Homepage

    This is the first time I've seen an article on /., as opposed to comments on one written somewhere else.

    It's against Slashdot's norm -- a news site -- but I think it makes for a great idea. It lets me read a single source for both tech news and a little bleeding-edge knowledge. Although I dislike the karma whore article posting phenomenon, I love reading those articles inline.

    Truth be told, this also helps flesh out a university education. Although I learned a lot in my specialist degree, I became a well-rounded and knowledgeable geek only through outside interests: clicking on Slashdot links, messing with Linux, etc. Until now, I didn't think about "XML Namespaces and how they affect XPath and XSLT," but now I can discuss it with a clue.

    Keep it up guys -- this improves the value of Slashdot immensely. At the very least, give this concept a section of its own (articles.slashdot.org?) with links from the main page.

    P.S: Why not post some articles on argument fallacies [midnightbeach.com] and how to answer lame questions yourself. [google.com]

  • Just as interesting and informative and with just a link [microsoft.com] to the information (sorry no copy-paste karma net). Here Paul Cornell shows you how to create COM add-ins for Microsoft Office using Visual Basic .NET.

    From the Article:"Before COM add-ins were invented, you could only create Office application-specific add-ins (except Microsoft Outlook® and Microsoft FrontPage®). These application-specific add-ins have file extensions such as .mda, .pwz, .wll, and .xla; some examples of Office application-specific add-ins are the Analysis ToolPak and the Solver Add-In for Microsoft Excel. Starting with Microsoft Office 2000, COM add-ins allow you to create add-ins that span multiple Office applications. This allows you to write code that is common across many applications, yet at the same time allows you to write code specific to each application that hosts the COM add-in."
  • I know the bison and yacc people have been struggling coming up with an L1 grammar for this for years, but maybe XSLT has the answer.

    Can it transform Visual Basic code into something taken seriously?
    • Can it transform Visual Basic code into something taken seriously?

      No, but if you wrote XML first and then used that to generate Visual Basic, then you could at least slap on a different code generator and spit out your business logic in Perl or Java the next time around :)

  • by Pinball Wizard ( 161942 ) on Thursday May 30, 2002 @01:48PM (#3611114) Homepage Journal
    I don't have much to add, but this article is yet another programming example using a bookstore.


    What is it about bookstores that make them ideal for explaining how a programming language, database or markup language work? I must have seen 100 different books and online articles that describe what they are trying to teach with a bookstore as the example.


    Not that I'm complaining (being that I program for a bookstore). It's just that there seems to be a disproportionate number of bookstore programming examples compared to other types of businesses. Personally, I'd like to see more manufacturing plants (hospitals, casinos, stock markets, etc).

    • Re:Good article (Score:5, Insightful)

      by ctrimble ( 525642 ) <ctrimble&thinkpig,org> on Thursday May 30, 2002 @03:34PM (#3612110)
      Bookstores are complicated enough to be interesting but easy enough not to cloud the issue with domain specific details. Everybody knows about books, not everybody knows about hospitals, casinos, stock markets, etc. (in a domain specific way).

      For example, a book has attributes like ISBN, Title, and pages that are 1:1. It has relationships with things like publishers that are N:1 (a book has one publisher, but a publisher has many books). And it has a relationship with authors that are M:N (a book can have many authors, and an author can write many books). Just this level of detail makes it an interesting example of how to create a book database that's in 3rd normal form.

      One table for book and its main attributes like ISBN, Title, and pages. This table also has a foreign key to the publisher table. The publisher table has a primary key (pub_id) as well as name and any other main attributes you want to include (address for returns?). The author table has an author_id and the author's name. Finally, there's a book_author table that links books to authors in the required M:N way.

      If you're writing an application to manage book inventory you're presented with problems like 1) how do you add books without insertion anomalies (for instance, "Catcher in the Rye" is added with author = "Salinger, J. D." and "Raise High the Roofbeam, Carpenter" is added with author = "Salinger, JD") 2) how do you prevent deletion anomalies (if you delete all books by author Foobar McTavish, do you delete the author record?) These problems raise real issues that developers need to be aware of but doesn't cloud the issue with unnecessary detail.

      On the other hand, imagine the domain of an insurance company. Well, there are policies, and policies have lines of business, and lines of business have coverages. Each coverage has a rate schedule, but, of course, the schedule varies by state. And rates also have to take schedule modifications into account, as well as the experience modifications and the loss history of the insured. But, that also depends on the dollar amount of the coverage. And the policy might actually be covered by a reinsurer, rather than the primary insurer, each of whom has underwriters who manage the coverage schedule. Ouch!

      As a former insurance company employee who was responsible for designing a data model for the business (and as a former bookstore employee), I wish I had stuck with the latter!

    • What is it about bookstores that make them ideal for explaining how a programming language, database or markup language work?

      Well, as someone who writes computer-science textbooks and professional developer books for a living (including several with multi-tier bookstores as examples) I can honestly say that there's a very simple reason: advertising. By inserting bookstore applications into every book, and by populating the application databases with the appropriate information, we can advertise all the other books we write.

  • by Jack William Bell ( 84469 ) on Thursday May 30, 2002 @02:13PM (#3611306) Homepage Journal
    I certainly agree with the author that people should avoid using default namespaces and then clearing the namespaces with an empty namespace declaration. But I would go further to say that you should avoid default namespaces altogether if you intend your XML to be human readable.

    And, in my humble opinion, keeping it human readable is the main reason to use XML in the first place. As a purely data-transfer medium it is far too bulky and requires too much CPU at each end. Even HTML query strings are faster to parse and convert to binary equivalents.

    For machine to machine transfer we really need a binary XML standard. My understanding is that W3C is working on such...

    Jack William Bell
    • Time for a stupid question: If it is such a bad idea to undefine the default (or any?) namespace, why does XML allow it?

      Perhaps the inability to undefine non-default namespaces (see the last section of the article) should be called a "wise decision" as opposed to an "oversight"?
      • Time for a stupid question: If it is such a bad idea to undefine the default (or any?) namespace, why does XML allow it? Perhaps the inability to undefine non-default namespaces (see the last section of the article) should be called a "wise decision" as opposed to an "oversight"?

        I wouldn't call that a stupid question. It is one I have asked myself. So far as I can tell the answer is "Because some people thought it was a good idea and you don't have to use it."

        True. You don't have to use GoTo either. Doesn't make GoTo less evil, even when you run into a situation where GoTo is the easiest option.

        Jack William Bell
    • XML is a format to share between different humans and different machines. People speaking the same language prefer a natural language like English. Otherwise they translate. Machines of the same platform can use the same binary format, like ELF. Otherwise they use scripts. XML is a similar kind of inter-lingua format.

      On the other hand, XML is not an inter-lingua by itself - it doesn't care about semantics. You need RDF, DAML, OIL, XML-ized Prolog and so on for that. XML is like ethernet in networks, if you will: it's on a low level of system design.

      One more point - a binary format still requires format tagging. What's wrong with XML tags? Are they really big? I think XML tags are about as big for inter-lingua tagging as TCP packet headers. Should we cut or compress TCP packet headers?

  • by verrol ( 43973 )
    i appreciate the time taken by the author to write an article to help others (even if the material is documented elsewhere). having different perspective on a given subject never hurt anyone. besides, he digested it and is now trying to make is simpler for some of us. hell, i know the basics of XML, XSL, and so on, but still find this namespace thing a little daunting.
  • And the point? (Score:1, Redundant)

    by SerpentMage ( 13390 )
    The point is????

    While the article is nice, it is referenced in the MSDN article! It is basically a rehash of that. So why not create a link or is Slashdot trying something new?

  • Reality check (Score:4, Insightful)

    by Jhan ( 542783 ) on Thursday May 30, 2002 @02:35PM (#3611514) Homepage

    Suppose I have two systems. From system 1 I need to notify system 2 of personal data in the db that has been modified, so they can update their info.

    First approach: select all persons that have been modified since last time. Write all their data to a file, one person per row, either using a fixed field width representation or a token seperated one. My C program that I wrote in 5 minutes creates the file, his Cobol program that he wrote in 1 hour reads the data, checks it and inserts it into his system.

    Moron approach: Select all persons that have been modified since last time. Invent an XML schema to represent the data. Use 5 hours to write the DTD and the program that creates the XML data. Because we are feeling super-trendy, we send this by HTTP POST to (2) ala Soap.

    To be kosher (2) now has to have a web server, CGI's that can handle Soap and store the parameters (in this case a file) on the file system, the ability to validate and parse the XML file, on a AS/400, no less.

    Perhaps he has some helpful tools, perhaps he has to code a general XML parser himself, perhaps (more likely) he writes my name on the big list he has on the wall saying "People who *will* die".

    He tries to get a XML system from the net in, say, Java, spends 5 days getting it to work, while coding the HTTP bits himself (which he does in only 40 hours. Yeah! What a coder!)

    Unfortunately, his big-wigs have just been force-fed with XML propaganda, and have decided on their own, incompatible XML representation of the data... And he must write the XSL code and so forth.

    (2) now moves my name, and those of his bosses, and some of the people at W3C from the "People who *will* die" list to the new "DIE DIE DIE DIEDIEDIE BLEED FUCKING PIGS!!!!!!!!!1!111!1!" list. And gets his AK-47 and goes out the door...

    Net time/life cost:

    Rational thinking: 1h5min
    XML: 100+h plus 15 deaths and 150 wounded.

    • In defense of XML, if (1) and (2) are separated organizationally or by a span of time, XML can come in handy. (2) parsing the fixed format or delimited file (1) generates with his 1-hour developed program can break for any number of reasons. If XML is used properly, the file generated by (1) will still be understandable by (2) even if (1)'s output or (2)'s expectations change, provided that there isn't a change to the semantics that would break everything. At least that's been my experience. It may seem overengineered after a week, but after a couple of years it may seem like a godsend.
    • If you have two systems, you can do whatever the heck you want with them. But some of us have to interact with more than that, and often, they aren't ours.

      Lets say you have one market, and n businesses in that market. Like, say, accounting, or schools, or manufacturing. All of these entities need to interact with each other, and exchange data.

      Now, they could take your approach, and create a custom exchange format for every different interaction. That scales, well, not at all.

      As an alternative, they could define a few formats using an identical grammar. Then all their systems could interact without having to write custom applications for each of the custom exchange formats that their accounting department escapee came up with.

      Then they could fire the short-sighted hacks that "saved" all that time by not using XML in the first place.

    • Re:Reality check (Score:2, Interesting)

      I disagree -- XML is appropriate. If you are company A, and you want to transmit data to company B, then sure, XML may be overkill.

      But if you're company A, and you want to send info to or exchange info with companies B1, B2, ..., B1000 in a standard application domain, each with a different computer infrastructure, and you write code to a standard set of XML specs, then everybody's job gets a lot easier. It's a lot harder to misunderstand an XML spec than it is to misunderstand a complicated structure's binary representation.

      Plus with XML-to-object-to-XML parsing/generation conventions, reading XML and accessing/manipulating the data is easy. Also, databases and surrounding IT tools have an easier time shuttling data back and forth.

      For huge XML documents though, you gotta do more work, 'cause full in-memory XML-to-object isn't practical.
  • MSFT languages? (Score:4, Insightful)

    by sohp ( 22984 ) <snewtonNO@SPAMio.com> on Thursday May 30, 2002 @03:18PM (#3611948) Homepage
    Wow, a slashdot article with CeeSharp and VB as the example languages? Someone was asleep when they read that submission. Oh wait, the editors don't read past the first page before deciding to post or reject submissions.

    How about an XML processing article using languages that slashdotters actually care about and write in -- Perl, PHP3, Java, C++, Python, etc.? We're not going to pop over to freshmeat and download the latest VB4Debian, you realize.
    • We're not going to pop over to freshmeat and download the latest VB4Debian

      Oh yes you are. GNOME Basic [freshmeat.net] is an environment compatible with many applications written in the Visual Basic programming language.

    • what makes you so sure that noone here uses vb? personally i have moved my entire computing experience at home to linux. but quite a few people here still use m$ at home and work. y alienate people?
    • If the article is well-written and insightful, it could use COBOL for all I care.

      What makes you think that slashdotters do not care to read a good programming article just because it uses examples in a language some of them do not use? What makes you think that when they do care, they categorically refuse to use M$ implemented languages?

      I have seen more jihads on Slashdot between Java-C than between C++-VisualC++, and remarkably few rants against VB (which was a broken language even by microsofties judgement).

      People don't put aside information because it comes packaged in a language used by a company they don't like that much. If you want Slashdot to reject a technical article because your ideological beliefs don't let you read a few lines of code... well sir, that would be idiotic. It's not like the code will virally infect you or anything.

      But if it makes you feel better: there are many in Slashdot that are interested in Mono, which would provide ample justification for you to understand C# without working for the "dark side". The same would go for VB.NET, I think, which by the way, seems like a remarkably non-broken language compared with its predecessors.
    • Who cares what language his examples are in? I'm sure that most everyone who reads these articles can translate them to the API in whichever language they're going to write their program in.
  • Using an XML data source to hold all my games, with all the ratings from my friends, as well as photos. I dynamically allow the user to sort the games using different XSLT documents. I used to have an option for changing the look but I never liked any of the other looks.

    Take a look at the application here [singleclick.com]

    It requires a 6.0+ browser: Netscape 6.x (or 7), IE 6.x, Mozilla 0.9.8+, or IE 5.5 if you install the XSLT update (better to just upgrade your whole browser instead).

    Joseph Elwell.
  • It really feels like a kludge: why don't attributes inherit the namespace of their element?

    I think it's because of "backward compatibility", which is too bad, because it's obvious that they would need something like namespaces to prevent clashes.

    *Sigh*
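
For anyone who wants to see the behavior being complained about, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The document, the `urn:example:books` namespace URI, and the element and attribute names are all made up for illustration. It shows that a default namespace declaration applies to unprefixed elements but not to unprefixed attributes, which stay in no namespace unless they carry an explicit prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the namespace URI and all names are invented.
# The same URI is bound both as the default namespace and to the prefix "b".
doc = """<book xmlns="urn:example:books" xmlns:b="urn:example:books"
      id="42" b:isbn="0-00-000000-0">
  <title>Example</title>
</book>"""

root = ET.fromstring(doc)

# Unprefixed elements pick up the default namespace.
print(root.tag)        # {urn:example:books}book
print(root[0].tag)     # {urn:example:books}title

# Unprefixed attributes do not: "id" has no namespace at all,
# while the prefixed "b:isbn" is in the books namespace.
for name in sorted(root.attrib):
    print(name)
```

ElementTree reports names in `{uri}local` form, so the plain `id` key next to the qualified `{urn:example:books}isbn` key makes the difference easy to spot.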
  • Since when is JavaScript part of XSLT? Leave it to Microsoft to jam a JavaScript interpreter INTO an XSLT processor! Oh, and a VBScript interpreter as well, since you can do that, too.

    (Project for today: write an XSLT processor that formats your hard drive.)

    This whole tutorial relies on using M$'s XSLT processor; otherwise, this example won't work. Feh.

    <msxsl:script language="JavaScript" implements-prefix="newfunc">
      function SayHello() {
        return "Hello World";
      }
    </msxsl:script>
  • But I heard a good argument against your article:

    SELECT * FROM books!
  • XML Schema and namespaces help to control syntax, and only syntax.

    Tired of hard-coding semantics? Try RDF Schema [w3.org] and ontologies [daml.org].

  • Saying that your data is stored in XML doesn't mean your data has any more intelligent structure than saying that your C program has its data stored with pointers.

    Adding a namespace only confuses the issue more.

    Intelligent data structures don't happen by accident, and anyone who claims that XML (and by extension, namespaces) makes everything easy is a moron.
