Understanding DOM

The first thing you need to learn if you want to program for the XML platform is the XML Document Object Model (XML DOM). XML DOM essentially provides an interface that lets developers create, manipulate and adapt XML documents.

XML DOM was introduced in earlier articles by R.S. Ramaswamy (written specifically for the Java platform) and in other articles in the magazine. In this article we will use JavaScript, a bit of Python and some .NET programming to unravel the simple XML DOM.

DOM is not the exclusive property of XML developers. HTML also has the DOM concept, as do a number of other markup languages.

So what exactly is DOM?

The Document Object Model is a W3C standard: a language-neutral programming interface that allows programs and scripts to dynamically access and update the content, structure and style of a document. It is platform independent too. You can work with DOM from most programming languages. Apart from the regular libraries, a number of third-party tools that support DOM are available for most languages.

We will try to illustrate XML DOM using JavaScript, VBScript and Python. Some tips have also been provided for other languages!

XML DOM views XML documents as a tree structure of elements embedded within other elements. All elements, the data they carry and their attributes can be accessed through the DOM tree. Their contents can be modified or deleted and XML DOM can create new elements.
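
The tree view described above can be sketched in a few lines of Python. This is only an illustration: it assumes the standard library module xml.dom.minidom and an inline copy of the book.xml file used later in this article, and the helper name describe is hypothetical.

```python
from xml.dom import minidom

# Inline copy of the article's book.xml, so the sketch needs no file on disk.
BOOK_XML = (
    '<book type="Paper back">'
    '<name>Tom Sawyer</name>'
    '<author>Mark Twain</author>'
    '</book>'
)


def describe(node, depth=0, lines=None):
    """Collect one indented line per element, walking the tree depth-first."""
    if lines is None:
        lines = []
    if node.nodeType == node.ELEMENT_NODE:
        # Attributes hang off the element; they are not child nodes.
        attrs = dict(node.attributes.items())
        # Text held directly by this element (empty for container elements).
        text = "".join(
            child.data for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        ).strip()
        lines.append("  " * depth + f"{node.tagName} attrs={attrs} text={text!r}")
        for child in node.childNodes:
            describe(child, depth + 1, lines)
    return lines


doc = minidom.parseString(BOOK_XML)
tree_lines = describe(doc.documentElement)
print("\n".join(tree_lines))
```

Each level of indentation in the printed result corresponds to one level of nesting in the DOM tree.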

JavaScript is a ubiquitous language. If you have either Microsoft Internet Explorer 5.0 or above installed, or any of the latest Mozilla browsers, you can be sure that you have support for both JavaScript and XML DOM.

If you are working on the Windows platform, you most likely have Internet Explorer 6.0 on your system (if not, install it). Internet Explorer 6.0 ships with MSXML Parser 3.0, which supports XML DOM from multiple languages, as well as XSLT and XPath.

Understanding DOM

The Document Object Model can be described in several ways; this article covers two of them. The simplest is to view the tree as a hierarchy of Node objects, in which every XML object is a Node. This model is easier to program against, though it has limitations. The other is to treat the root of the DOM tree as a Document node and everything else as specialized objects, such as a DocumentType object or an Element object.

Some programming languages implement the DOM tree using the second model. Check Table 1 for a list of some Node objects.



Node objects                  What they do
Document                      Root of the document tree.
DocumentType                  DOCTYPE declaration.
Element                       Represents an element of the document.
Attr                          Represents an attribute of an element; it is not a child node of the Element node.
previousSibling               Property used to reach the node immediately preceding a node.
nextSibling/previousSibling   Always null on an Attr node, because attributes are not part of the tree.

Table 1: Some Node objects in DOM 2.0
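
As a rough illustration of the specialized objects in Table 1, the sketch below parses a one-line document and reports the class and numeric node type of a few nodes. It assumes Python's xml.dom.minidom; the class names shown are minidom's own and may differ in other DOM implementations.

```python
from xml.dom import Node, minidom

doc = minidom.parseString('<book type="Paper back"><name>Tom Sawyer</name></book>')
root = doc.documentElement            # the <book> element
attr = root.getAttributeNode("type")  # the type="Paper back" attribute
text = root.firstChild.firstChild     # the text inside <name>

# Map a label to (class name, numeric node type) for each kind of node.
kinds = {
    label: (type(node).__name__, node.nodeType)
    for label, node in [
        ("document", doc),
        ("element", root),
        ("attribute", attr),
        ("text", text),
    ]
}
for label, (class_name, node_type) in kinds.items():
    print(label, class_name, node_type)

# An attribute is not a child of its element, so it has no parent node;
# minidom exposes the owning element through ownerElement instead.
print(attr.parentNode, attr.ownerElement.tagName)
```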


Properties of the Node object

The Node object represents a node in the node tree. Its properties and methods are described in Table 2. These properties help the developer retrieve information about specific nodes in a document. Some properties and methods are specific to particular browsers; Table 2 lists only the most common properties recognized under DOM Level 2.

Property          Description
attributes        Returns a NamedNodeMap that contains all attributes of a node.
childNodes        Returns a node list that contains all children of a node.
firstChild        Returns the first child node of a node.
lastChild         Returns the last child node of a node.
nextSibling       Returns the node immediately following a node. Two nodes are siblings if they have the same parent node.
nodeName          Returns the name of the node (depending on the node type).
nodeType          Returns node type as a number.
nodeValue         Returns the value of the node.
ownerDocument     Returns the Document object of a node (returns the root node of the document).
parentNode        Returns the parent node of a node.
previousSibling   Returns the node immediately preceding a node. Two nodes are siblings if they have the same parent node.

Table 2: Common properties recognized under DOM Level 2
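
To make Table 2 concrete, here is a minimal sketch that reads several of these properties with Python's xml.dom.minidom. The document is written on one line on purpose: in minidom, line breaks between elements show up as extra Text nodes, so firstChild would otherwise be whitespace rather than the name element.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<book type="Paper back">'
    '<name>Tom Sawyer</name><author>Mark Twain</author>'
    '</book>'
)
book = doc.documentElement
name = book.firstChild
author = book.lastChild

props = {
    "nodeName": book.nodeName,                            # element name
    "nodeType": book.nodeType,                            # 1 means element
    "childNodes": len(book.childNodes),                   # number of children
    "attributes": book.attributes["type"].value,          # via NamedNodeMap
    "firstChild": name.nodeName,
    "lastChild": author.nodeName,
    "nextSibling": name.nextSibling.nodeName,
    "previousSibling": author.previousSibling.nodeName,
    "parentNode": name.parentNode.nodeName,
    "nodeValue": name.firstChild.nodeValue,               # value of the text node
    "ownerDocument": name.ownerDocument is doc,
}
for key, value in props.items():
    print(key, "=", value)
```

Note that nodeValue is read from the text node inside name; on an element node itself, nodeValue is empty (None in minidom).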

Let us consider a simple example that loads an XML file with JavaScript and MSXML Parser 3.0.

Consider the simple XML file called book.xml.

<?xml version="1.0" encoding="ISO-8859-1" ?>
<book type="Paper back">
<name>Tom Sawyer</name>
<author>Mark Twain</author>
</book>
book.xml


Let us load this file, access some of its content and display it using the ActiveX object Microsoft.XMLDOM (code 1). Make sure that you have Internet Explorer 6.0 on your PC.

Code 1

<html>
<body>

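<script type="text/javascript">
// Create an MSXML DOM document (works in Internet Explorer only)
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
// Load synchronously so the document is complete before it is read
xmlDoc.async = false;
xmlDoc.load("book.xml");

// childNodes(0) is <name> and childNodes(1) is <author>
document.write("The name of the book: " +
    xmlDoc.documentElement.childNodes(0).text);
document.write("<br />");
document.write("The author of the book: " +
    xmlDoc.documentElement.childNodes(1).text);
</script>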

</body>
</html>


msdom.html


Open the page in IE 6.0. You will get the output shown in figure 1. Let us dissect the code.

var xmlDoc=new ActiveXObject("Microsoft.XMLDOM")

This line of code is the standard way to create an XML DOM object in JavaScript under Internet Explorer. The object xmlDoc has many standard methods and properties; one of them is load, which loads an XML document. Once the document is loaded, you can traverse to any node in the XML DOM tree (the xmlDoc object in this case) and read its information very easily.

If you want to run the code in a Mozilla browser such as Firefox, you have to make a minor change: create the document object as follows.

var xmlDoc=document.implementation.createDocument("ns","root",null)

You can use XML DOM to traverse the nodes and find specific information that can be displayed in HTML web pages. In code 2, we create a new ActiveX object and load an XML file into it. We then assign to nodes the list of child nodes of the documentElement of book.xml. The rest of the program simply displays, in HTML, the information that is already available.

Code 2

<html>
<head>

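<script type="text/javascript">
function loadXML()
{
    // Create an MSXML DOM document (Internet Explorer only) and load it synchronously
    var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.async = false;
    xmlDoc.load("book.xml");

    // Child nodes of <book>: nodes(0) is <name>, nodes(1) is <author>
    var nodes = xmlDoc.documentElement.childNodes;
    document.getElementById("book").innerText = nodes(0).text;
    document.getElementById("author").innerText = nodes(1).text;
}
</script>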

</head>

<body onload="loadXML()" bgcolor="blue">

<h1>My favourite books</h1>


<b>Author:</b> <span id="author"></span>
<br />
<b>Book:</b> <span id="book"></span>
<br />
</body>
</html>

xmlnodes.html


Python also provides excellent support for XML parsing and processing, with several modules in the Python standard library. There are also a few third-party modules for XML processing in Python.
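
As a point of comparison with Code 1, the following is a minimal sketch of the same lookup using the standard library module xml.dom.minidom. The helper text_of and the use of a temporary directory are assumptions made so that the example is self-contained; in practice you would pass the path of your own book.xml to minidom.parse.

```python
import os
import tempfile
from xml.dom import minidom

# Same content as the article's book.xml.
BOOK_XML = """<?xml version="1.0" encoding="ISO-8859-1" ?>
<book type="Paper back">
<name>Tom Sawyer</name>
<author>Mark Twain</author>
</book>
"""


def text_of(parent, tag):
    """Return the text inside the first <tag> element found under parent."""
    element = parent.getElementsByTagName(tag)[0]
    return "".join(
        node.data for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


with tempfile.TemporaryDirectory() as workdir:
    # Write a throwaway copy of book.xml, then parse it from disk.
    path = os.path.join(workdir, "book.xml")
    with open(path, "w", encoding="iso-8859-1") as handle:
        handle.write(BOOK_XML)
    xml_doc = minidom.parse(path)

book = xml_doc.documentElement
book_name = text_of(book, "name")
book_author = text_of(book, "author")
print("The name of the book:", book_name)
print("The author of the book:", book_author)
```

Looking elements up by tag name, rather than by position as in Code 1, avoids depending on the whitespace Text nodes that minidom keeps between elements.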

We will learn more about some of the methods used in XML DOM with an example in Python. For the sake of brevity we have put only one book in our book.xml. XML is fundamentally a representation of data, and sooner or later you will need to add a few more elements or delete one or more of them. Editing by hand is easy when the document is very small, but it stops being practical as the document grows.

We will use Python to append a node called Pages and add the extra information to the file book.xml.

We will use some of the other methods in DOM. Python has a package called xml.dom that contains a module called minidom. The minidom module provides practically all the methods needed to parse, write and manipulate XML.

In example 1, we shall use a few simple methods such as createElement, which creates an element, and createTextNode, which creates a text node. You append the text node to the new element with appendChild, and then use appendChild again to place the element at the right position in the DOM.
You can then call toxml(), which returns the contents of the DOM document as an XML string.

Using simple file objects, you can write that string back into the file, which is elementary.

Example 1

>>> from xml.dom import minidom
>>> xmlDoc = minidom.parse("book.xml")
>>> newnode = xmlDoc.createElement("Pages")
>>> newContent = xmlDoc.createTextNode("345")
>>> newnode.appendChild(newContent)
<DOM Text node "345">
>>> xmlDoc.documentElement.appendChild(newnode)
<DOM Element: Pages at 0x15d8580>
>>> xmlDoc.toxml()
u'<?xml version="1.0" ?>\n<book type="Paper back">\n<name> Tom Sawyer </name>\n<author> Mark Twain </author>\n<Pages>345</Pages></book>'
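
The interactive session above stops at toxml(). A minimal sketch of the remaining step, writing the modified document back to disk, might look like this. The function name add_pages and the temporary directory are assumptions made for illustration; in practice you would point the function at your own book.xml.

```python
import os
import tempfile
from xml.dom import minidom

BOOK_XML = (
    '<book type="Paper back">'
    '<name>Tom Sawyer</name><author>Mark Twain</author>'
    '</book>'
)


def add_pages(path, pages):
    """Append a <Pages> element to the root of the file and save it back."""
    xml_doc = minidom.parse(path)
    new_node = xml_doc.createElement("Pages")
    new_node.appendChild(xml_doc.createTextNode(str(pages)))
    xml_doc.documentElement.appendChild(new_node)
    # toxml() returns the whole document as a string; write it over the file.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(xml_doc.toxml())


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "book.xml")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(BOOK_XML)

    add_pages(path, 345)

    # Parse the file again to confirm that the new element was saved.
    reloaded = minidom.parse(path)
    last_child = reloaded.documentElement.lastChild
    pages_text = last_child.firstChild.data

print(last_child.tagName, pages_text)
```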



In this article, we took a whirlwind tour of XML DOM. In the next article, we will look at more examples of DOM programming and explore some of the ideas in depth.



Added on June 29, 2007