XML Namespaces and Data Islands
Posted On May 9, 2007 by Raja Kishore Reddy filed under Internet
I was quite surprised when a few of my developer friends pointed out that they were not really aware of some XML basics, despite having used Java or C# to handle XML data for some time. I am thrilled when they tell me that this basic workshop on XML concepts is actually helping them become better XML developers. My original plan was to cover the basics of XML in about four months, and then move quickly to more advanced material that would make more sense to programmers. However, I am now going to shift down a gear and touch upon a few other fundamental concepts from the XML world that all developers need to be aware of. Hence, this workshop series is likely to be longer than planned.
Last month, we took a look at the validation of XML documents, and at how you can use DTDs to ensure that the XML documents you create are valid. While DTDs are a way to ensure that XML documents follow certain rules, there are a few other concepts that deserve attention. One of them is XML Namespaces, which we will learn about in this workshop.
Element name conflict
Consider two or three XML documents that may have to be merged or searched. The documents have a common element, with different meanings in different contexts. For example, XML data can be stored in a page that uses the HTML tag ‘title’, as given below:
<title>
The Wonderful World of XML
</title>
At the same time, the XML data could be as in books.xml, taken, say, from my book collection.
books.xml
<title>
<name>XML Unleashed</name>
<author>David Arthur Boyd</author>
<price>120</price>
<pages>577</pages>
</title>
Suppose I have yet another XML document (address.xml) that has addresses of all the authors with whom I correspond.
address.xml
<address>
<title>Mr</title>
<name>David Arthur Boyd</name>
<Street>121 Times Square</Street>
<City>New York</City>
<State>NY</State>
<Country>United States</Country>
<zip>57762</zip>
</address>
Now, if I want to integrate the above three code snippets into a single XML document, I have a few issues to sort out. The element ‘title’ appears in all three snippets, and in each one it denotes a different object with a different definition and content. This creates an element name conflict. The parser itself will read the merged document without complaint, but any application working on top of it will be confused: an XSL transformation, or a report based on an SQL query, looks only for a physical tag called ‘title’ and cannot tell the three meanings apart.
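To see the conflict concretely, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. It is only an illustration: the <merged> wrapper element is my own assumption and is not part of any of the three documents, and the address record is shortened.

```python
import xml.etree.ElementTree as ET

# The three snippets from above, wrapped in a hypothetical <merged> root
# (the wrapper is an assumption made only for this illustration).
MERGED = """
<merged>
  <title>The Wonderful World of XML</title>
  <title>
    <name>XML Unleashed</name>
    <author>David Arthur Boyd</author>
    <price>120</price>
    <pages>577</pages>
  </title>
  <address>
    <title>Mr</title>
    <name>David Arthur Boyd</name>
  </address>
</merged>
"""

root = ET.fromstring(MERGED)

# iter("title") matches on the tag name alone, so the page title,
# the book record and the salutation all come back in one list.
titles = list(root.iter("title"))
title_count = len(titles)
title_texts = [(t.text or "").strip() for t in titles]

print(title_count)
print(title_texts)
```

The search finds three ‘title’ elements, and nothing in the result says which one is a page heading, which one is a book and which one is a salutation.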
There is an easy way of solving such a conflict and that is to use a prefix. For example, you can use a prefix such as ‘p’ for the first document, and ‘n’ for the second document. I have shown an example of ‘p’ being used for one of the XML documents in the snippet below.
<p:title>
<p:name>XML Unleashed</p:name>
<p:author>David Arthur Boyd</p:author>
<p:price>120</p:price>
<p:pages>577</p:pages>
</p:title>
This sorts out quite a few issues; however, prefixing each element by hand may not be the ideal solution. Imagine having to traverse hundreds or thousands of XML documents and merge them into a new one. It is easy to guess that prefixing each and every element will be a tedious task, and real business solutions that make use of XML will routinely force you to handle documents like these.
You may suggest that a DTD is a good solution. But remember that a DTD is nothing but a way to define the structure of documents for information interchange; it does not resolve element name conflicts.
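One caveat about the prefixed snippet shown earlier: a namespace-aware parser accepts a prefix only if that prefix is bound to a Namespace URI somewhere in the document. The following Python sketch (standard library only) suggests how this plays out. The URI http://www.writeiq.com/books and the helper name parses are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Prefix used without any declaration, as in the snippet above.
UNDECLARED = "<p:title><p:name>XML Unleashed</p:name></p:title>"

# Same content, with the prefix bound to a hypothetical Namespace URI.
DECLARED = (
    '<p:title xmlns:p="http://www.writeiq.com/books">'
    "<p:name>XML Unleashed</p:name>"
    "</p:title>"
)


def parses(xml_text):
    """Return True if ElementTree can parse xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


undeclared_ok = parses(UNDECLARED)
declared_ok = parses(DECLARED)

print(undeclared_ok, declared_ok)
```

In other words, a prefix is only shorthand for a Namespace, which is what the next section introduces.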
Namespaces to the rescue
Programmers writing code in C# or Java will be familiar with the idea of Namespaces. Namespaces in these languages are used to group code under a single virtual space. You can have code in different files coming under the same Namespace, and the compiler or interpreter will recognize that code as a single related entity. Code that comes under different Namespaces is kept apart, even when the names are identical. Similarly, you can use a Namespace to distinguish between element names in an XML document.
For example, you can define a root element using the following XML Namespace (xmlns):
<title xmlns="http://www.developeriq.com/xml/4/1.3">
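Here, ‘title’ is the root element of an XML document, and ‘xmlns’ is a reserved attribute that places the element and its descendants in a Namespace. The Namespace is identified by a Uniform Resource Identifier (URI), which often looks like a URL.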
The idea is very simple. To avoid probable element name conflicts, you define each XML document (which needs to be merged) under an XML Namespace.
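In case you are using a prefix, you will use the following syntax:
xmlns:ns-prefix="namespace"
...where ‘ns-prefix’ is the prefix used and "namespace" is the Namespace URI. Note that there is no space after the colon. Code-1 demonstrates a Namespace in action, using the default (unprefixed) form.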
Code 1
<?xml version="1.0" ?>
<address xmlns="http://www.writeiq.com/developeriq">
<title>Mr</title>
<name>David Arthur Boyd</name>
<Street>121 Times Square</Street>
<City>New York</City>
<State>NY</State>
<Country>United States</Country>
<zip>57762</zip>
</address>
URIs used in XML Namespaces need not point to anything real. In fact, the URI in Code-1 (http://www.writeiq.com/developeriq) does not exist. This does not matter, as the URI is nothing but a string identifier for the XML Namespace. URIs are used because they are easy to keep unique, and although XML parsers do not retrieve anything from the URI, it is a good idea to publish some information about the Namespace at that address.
The real value of XML Namespaces shows when you use an XSL Transformation, as in Code-2. This is adapted from the articles on XML and XSL in the February 2005 edition of the magazine. Note that the non-HTML tags have the prefix ‘xsl’, identified by the namespace "http://www.w3.org/1999/XSL/Transform".
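Here is a rough Python sketch of how a merged document might look once each source has its own Namespace, and of how a program can then pick the right ‘title’. The address URI is the one from Code-1; the book URI, the <merged> wrapper and the prefixes are assumptions of mine.

```python
import xml.etree.ElementTree as ET

ADDRESS_NS = "http://www.writeiq.com/developeriq"  # URI taken from Code-1
BOOK_NS = "http://www.writeiq.com/books"           # hypothetical URI

# Hypothetical merged document: each source gets its own prefix and URI.
MERGED = f"""
<merged xmlns:b="{BOOK_NS}" xmlns:a="{ADDRESS_NS}">
  <b:title>
    <b:name>XML Unleashed</b:name>
  </b:title>
  <a:address>
    <a:title>Mr</a:title>
    <a:name>David Arthur Boyd</a:name>
  </a:address>
</merged>
"""

# Parsing makes no network request; the URIs are used purely as strings.
root = ET.fromstring(MERGED)

# The prefixes in this map are local to the query and need not match
# the prefixes used in the document; only the URIs have to agree.
ns = {"book": BOOK_NS, "addr": ADDRESS_NS}
book_titles = root.findall(".//book:title", ns)
address_titles = root.findall(".//addr:title", ns)

book_title_tag = book_titles[0].tag
salutation = address_titles[0].text

print(book_title_tag)
print(salutation)
```

ElementTree reports each element name as the URI in braces followed by the local name, so the two ‘title’ elements are now different names as far as the program is concerned.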
Code 2
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My Book Collection</h2>
<table border="2">
<tr bgcolor="#122312">
<th>Title</th>
<th>Author</th>
<th>Price</th>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
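The way a processor tells XSL instructions from literal HTML in Code-2 can be sketched in a few lines of Python. This is not an XSLT engine, only an illustration of the Namespace test, and it uses a shortened copy of the stylesheet.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Shortened copy of Code-2 (the table is left out to keep the sketch small).
STYLESHEET = """<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My Book Collection</h2>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
"""

root = ET.fromstring(STYLESHEET)

xsl_names = []      # elements in the XSL Namespace (instructions)
literal_names = []  # elements in no Namespace (copied to the output as HTML)
for element in root.iter():
    if element.tag.startswith("{" + XSL_NS + "}"):
        xsl_names.append(element.tag.split("}", 1)[1])
    else:
        literal_names.append(element.tag)

print(xsl_names)
print(literal_names)
```

The decision depends on the Namespace URI, not on the letters ‘xsl’; any other prefix bound to the same URI would give the same result.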
Before I conclude the discussions on XML Namespaces, here are a few more points which you need to remember.
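1. XML Namespaces are declared in the XML document itself, using ‘xmlns’ attributes, and the document must still conform to whatever rules apply for XML well-formedness (refer to the March 2005 edition). DTDs are not Namespace-aware, so a DTD used for validation has to list the prefixed names and the ‘xmlns’ attributes literally.
2. Every prefix used in the document must be bound by an ‘xmlns:prefix’ declaration on the element itself or on one of its ancestors. A default Namespace applies to unprefixed elements, but never to unprefixed attributes.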
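How a namespace-aware parser treats prefixed and unprefixed names can be checked directly. The following Python sketch uses a small made-up document; both URIs and the ‘id’ and ‘lang’ attributes are assumptions for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a default Namespace for elements, plus a
# prefixed Namespace used on one attribute.
DOC = """
<book xmlns="http://www.writeiq.com/books"
      xmlns:m="http://www.writeiq.com/meta"
      id="b1" m:lang="en">
  <name>XML Unleashed</name>
</book>
"""

book = ET.fromstring(DOC)

element_tag = book.tag             # unprefixed element: default Namespace applies
child_tag = book[0].tag            # child inherits the default Namespace
attribute_names = sorted(book.attrib)  # xmlns declarations are not listed here

print(element_tag)
print(child_tag)
print(attribute_names)
```

The unprefixed elements pick up the default Namespace, the prefixed attribute is qualified by its own URI, and the plain ‘id’ attribute stays in no Namespace at all.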
XML Data Islands
In the rest of this article and the rest of this series, we will take a look at how XML can be embedded inside HTML. We will also learn about Cascading Style Sheets and about Extensible HTML (XHTML). But before that, we will try to understand the concept of the XML data island.
You can plug XML into an HTML document by using the <xml> tag. This tag is a Microsoft Internet Explorer extension rather than standard HTML, so the examples here work only in that browser. It is as simple as shown in Code-3.
Code 3
<html>
<body>
<xml id="title">
<title>
<name>XML Unleashed</name>
<author>David Arthur Boyd</author>
<price>120</price>
<pages>577</pages>
</title>
</xml>
</body>
</html>
Save the file as an HTML file. If you view it in your browser, you will get a blank page, because nothing in the page displays the data yet. To display the contents, you bind HTML elements to the data island. We will use the address.xml file as the sample XML file. Create an HTML file that loads address.xml through the ‘src’ attribute of the <xml> tag. Then point the table at the island with the ‘datasrc’ attribute, use the ‘datafld’ attribute to pick the field shown in each cell, and display the content of the XML file in a table format as given in Code-4. Your display in the browser will be more or less like Figure-1.
Code 4
<html>
<body>
<xml id="add" src="address.xml"></xml>
<table border="3" datasrc="#add">
<tr>
<td><span datafld="name"></span></td>
<td><span datafld="Street"></span></td>
<td><span datafld="City"></span></td>
<td><span datafld="State"></span></td>
<td><span datafld="zip"></span></td>
</tr>
</table>
</body>
</html>
| David Arthur Boyd | 121 Times Square | New York | NY | 57762 |
Figure-1
Through this example, we have displayed XML content as HTML without using any XSL transformation. In fact, this month, we will try learning all the other shortcuts for displaying XML content without depending on XSL or XSLT.
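For readers who want to see what the binding in Code-4 amounts to, here is a small Python sketch that builds the same table row from address.xml by hand. It is not how the browser implements data islands; it only mirrors the ‘datafld’ lookups, and the function name render_row is my own.

```python
import xml.etree.ElementTree as ET
from html import escape

# Contents of address.xml, as listed earlier in this article.
ADDRESS_XML = """<address>
<title>Mr</title>
<name>David Arthur Boyd</name>
<Street>121 Times Square</Street>
<City>New York</City>
<State>NY</State>
<Country>United States</Country>
<zip>57762</zip>
</address>"""

# The same fields, in the same order, as the datafld attributes in Code-4.
FIELDS = ["name", "Street", "City", "State", "zip"]


def render_row(xml_text, fields):
    """Build one HTML table row from the named child elements of the record."""
    record = ET.fromstring(xml_text)
    cells = "".join(
        f"<td>{escape(record.findtext(field, default=''))}</td>"
        for field in fields
    )
    return f"<tr>{cells}</tr>"


row_html = render_row(ADDRESS_XML, FIELDS)
print(row_html)
```

Each ‘datafld’ value is simply the name of a child element, and the bound cell receives that element's text.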
