The Document Object Model | XML and SOAP Programming for BizTalk(TM) Servers (DV-MPS Programming)

The Document Object Model, or DOM, exposes an XML document as a tree structure in memory and provides an easy-to-use environment for the programmer. The DOM provides an accessible object that you can interrogate and manipulate like any other object in modern-day programming languages.

The DOM defines a standard set of objects and interfaces that you can use to manipulate XML, providing access to documents, elements, and attributes. The DOM lets you express an XML document as an object, so you can work with it as you can any other object on your system—by using a well-documented application programming interface (API) with useful properties and methods.

As you learned in earlier chapters, the DOM is a World Wide Web Consortium (W3C) Recommendation. Because the DOM is a large project, the W3C DOM Working Group faced and still faces a daunting task. To better manage the project, the group broke up the work into multiple parts, adopting the first part in October 1998; I expect that the second part will be complete in the first half of 2000.

The W3C recommendation is useful as a blueprint for a common object model, but it does not go far enough in defining a specific implementation of the DOM. Each implementation of the DOM, therefore, will probably consist of a different view of the document object. For example, there are some key interfaces missing from the W3C version of the DOM that Microsoft felt were important to include. I use two methods, selectNodes and selectSingleNode, that are not in the W3C specification. Several parser providers offer implementations of the DOM in their products. Because the environment in which you implement each parser has different requirements, each of these implementations is different.

Now I will describe the Microsoft implementation of the DOM, since it has by far the best documentation and support. The Microsoft DOM is part of the Microsoft XML parser object. Microsoft includes the Microsoft XML DOM object in Microsoft Internet Explorer 5, Microsoft Office 2000, and Microsoft Windows 2000. The XML DOM object is also a redeployable object that you can include in your own applications. The DOM's filename is msxml2.dll (see the sidebar), and it is registered as a COM object with the name MSXML2.DOMDocument. Since the DOM is a COM object, you can invoke it wherever you would invoke a COM object in any COM-enabled application. You can access it as an ActiveX control in scripting using Microsoft.XMLDOM.

XMLDOM Versions

Because development of the XML DOM is ongoing, you might find a number of versions of the MSXML DLL on your machine. If you program to MSXML.DOMDocument, you are accessing version 2.5 of the DLL. Using MSXML2.DOMDocument lets you access version 2.6 of the DLL, and MSXML.DOMDocument30 gives you access to version 3 of the DLL. You can run the file xmlinst.exe after installing the latest version of MSXML to point all of your registry entries to the latest version of the file. In this chapter we'll program to MSXML2.DOMDocument.

The DOM in Action

Think of the DOM as a dynamic hierarchical object with a set of interfaces, properties, and methods. It is important to note that the computer sees the object we call an XML document as just a serial collection of bytes. Because this collection of bytes takes the form of plain text, it's easy to read and easy to move around our networks and over the Internet.

However, for the computer to interrogate and manipulate the information in an XML document, you must turn the document into an in-memory object that is better suited for treatment by high-level programming languages. You do this by instantiating a copy of the DOM, which invokes a parser to break up the XML document into pieces. Figure 5-1 shows the operation of the parser in creating the DOM object.

Figure 5-1. The DOM provides a standard set of interfaces that allows a programmer to access the hierarchical objects represented by an XML stream.

At the top of Figure 5-1 is a small XML document representing a joke. You can see elements for Joke, Setup, and Punchline, and attributes for author and firstTold. The XML parser reads this document one character at a time, determining which characters are markup and which are content. If the parser doesn't find a schema, it follows the rules of well-formed XML. If the parser does find a schema, it reads the schema and then ensures that the document adheres to the structure described by the schema.

Once the parser is satisfied that the document is properly defined, it creates a set of nodes that have certain properties. (I discussed XML nodes in Chapter 4.) Table 5-1 lists the 12 node types in the Microsoft DOM implementation.

Table 5-1. Node types defined by the Microsoft implementation of the W3C DOM.

Value	Name	Description
1	NODE_ELEMENT	The node represents an element. An element node can have the following child node types: Element, Text, Comment, ProcessingInstruction, CDATASection, and EntityReference. An element node can be the child of the Document, DocumentFragment, EntityReference, and Element nodes.
2	NODE_ATTRIBUTE	The node represents an attribute of an element. An attribute node can have the following child node types: Text and EntityReference. The attribute does not appear as the child node of any other node type; note that it is not considered a child node of an element.
3	NODE_TEXT	The node represents the text content of a tag. A text node cannot have any child nodes. The text node can appear as the child node of the Attribute, DocumentFragment, Element, and EntityReference nodes.
4	NODE_CDATA_SECTION	The node represents a CDATA section in the XML source. CDATA sections are used to escape blocks of text that would otherwise be recognized as markup. A CDATA section node cannot have any child nodes. The CDATA section node can appear as the child of the DocumentFragment, EntityReference, and Element nodes.
5	NODE_ENTITY_REFERENCE	The node represents a reference to an entity in the XML document. This node type applies to all entities, including character entity references. An entity reference node can have the following child node types: Element, ProcessingInstruction, Comment, Text, CDATASection, and EntityReference. The entity reference node can appear as the child of the Attribute, DocumentFragment, Element, and EntityReference nodes.
6	NODE_ENTITY	The node represents an expanded entity. An entity node can have child nodes that represent the expanded entity (for example, Text and EntityReference nodes). The entity node can appear as the child of the DocumentType node.
7	NODE_PROCESSING_INSTRUCTION	The node represents a processing instruction (PI) from the XML document. A PI node cannot have any child nodes. The PI node can appear as the child of the Document, DocumentFragment, Element, and EntityReference nodes.
8	NODE_COMMENT	The node represents a comment in the XML document. A comment node cannot have any child nodes. The comment node can appear as the child of the Document, DocumentFragment, Element, and EntityReference nodes.
9	NODE_DOCUMENT	The node represents a document object, which, as the root of the document tree, provides access to the entire XML document. It is created by using the ProgID "MSXML2.DOMDocument", or through a data island using <SCRIPT LANGUAGE=XML> or <XML>. The document node can have the following child node types: Element (maximum of one), ProcessingInstruction, Comment, and DocumentType. The document node cannot appear as the child of any node types.
10	NODE_DOCUMENT_TYPE	The node represents the document type declaration, indicated by the <!DOCTYPE> tag. The document type node can have the following child node types: Notation and Entity. The document type node can appear as the child of the Document node.
11	NODE_DOCUMENT_FRAGMENT	The node represents a document fragment. The document fragment node associates a node or subtree with a document without actually being contained within the document. The document fragment node can have the following child node types: Element, ProcessingInstruction, Comment, Text, CDATASection, and EntityReference. The DocumentFragment node cannot appear as the child of any node types.
12	NODE_NOTATION	The node represents a notation in the document type declaration. The notation node cannot have any child nodes. The notation node can appear as the child of the DocumentType node.
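
When you debug, it can help to translate the numeric nodeType values in Table 5-1 back into their names. The following is only a sketch: NODE_TYPE_NAMES and nodeTypeName are hypothetical names chosen for illustration and are not part of the DOM API. The helper assumes only that the object passed in has a nodeType property.

```javascript
// Node type values and names, copied from Table 5-1.
var NODE_TYPE_NAMES = {
  1: "NODE_ELEMENT",
  2: "NODE_ATTRIBUTE",
  3: "NODE_TEXT",
  4: "NODE_CDATA_SECTION",
  5: "NODE_ENTITY_REFERENCE",
  6: "NODE_ENTITY",
  7: "NODE_PROCESSING_INSTRUCTION",
  8: "NODE_COMMENT",
  9: "NODE_DOCUMENT",
  10: "NODE_DOCUMENT_TYPE",
  11: "NODE_DOCUMENT_FRAGMENT",
  12: "NODE_NOTATION"
};

// Hypothetical helper: returns the Table 5-1 name for a node's nodeType,
// or an "UNKNOWN" marker for a value that is not in the table.
function nodeTypeName(node) {
  var name = NODE_TYPE_NAMES[node.nodeType];
  return name ? name : "UNKNOWN (" + node.nodeType + ")";
}
```

For a document node, whose nodeType is 9, the helper returns NODE_DOCUMENT.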

The properties and methods available for each node depend on the type of node it is. For example, you can load a document node with a serialized (raw text) XML document, but you can't load an element or attribute node directly. To access an element node, you must first successfully read the document into a document node.

In Figure 5-1, Joke has four nodes. The first two are the element nodes Setup and Punchline. The other two are the attribute nodes author and firstTold. In the <Joke> start tag, a namespace points to a schema on an external site. This schema is specified using XML Data Reduced (XDR) syntax. Notice the nodeDataType property of each node. All are strings except for the firstTold attribute, which has been declared a dateTime data type. If you don't specify a namespace, all nodeDataType properties are strings. When you access the typed value of the firstTold node, the object returns the date as a date variant, so your application does not need to validate the data type or convert the value for processing.

You'll find the full Microsoft DOM API at http://msdn.microsoft.com. Like most APIs, the DOM API is rich, allowing you to do a number of things including loading and saving, creating elements and nodes, and of course, parsing XML documents. And as with most APIs, the DOM API has only a couple of methods and properties that you will use in your day-to-day work. Let's see how to use some of these more common properties and methods.

Creating a DOM Object

The first requirement when you work with the DOM is to instantiate a copy of the XML parser/DOM object in your application. In JavaScript, you use the ActiveXObject function to create the object, as shown in the following code:

 var objDocument = new ActiveXObject("MSXML2.DOMDocument");
 objDocument.async = false;

The first line instantiates the object and creates a variant called objDocument. This object will contain the document node after you've loaded the document. You can test this by accessing the value of the objDocument.nodeType property. In this case, the property contains the value 9, which maps to the NODE_DOCUMENT node type in Table 5-1.

The async property indicates whether the parser should load the entire document before making it available to the programmer. Setting the async property to false ensures that no actions will be taken against a document that is not fully loaded. This is the safe and easy way to program, but it might make your application work more slowly. When the property is set to true (the default setting), control returns to the caller before the download is finished. You can then use the readyState property to check the status of the download. You can also attach an onreadystatechange handler or connect to the onreadystatechange event to be notified when the ready state changes and the download is complete.

For loading large documents, you will probably want to set the async property to true so that you can continue to do other processing while the object loads. In effect, the load is spun off as a separate thread. If you do set the async property to true, you should check the value of the readyState property before you try to access the document. Table 5-2 describes the values of readyState.

Table 5-3 describes the two ways to load a stream of XML text into the object.

Table 5-2. Values returned by the readyState property.

Value	State	Description
1	LOADING	The object is bootstrapping, which means it is reading any persisted properties, not parsing data.
2	LOADED	The object is finished bootstrapping and is beginning to read and parse data.
3	INTERACTIVE	Some data has been read and parsed, and the object model is now available on the partially retrieved data set.
4	COMPLETED	The document has been loaded, successfully or unsuccessfully.
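
If you set async to true, one way to wait for the COMPLETED state in Table 5-2 is sketched below. This is an illustration under assumptions, not a definitive pattern: whenLoaded, onSuccess, and onFailure are hypothetical names, and the doc argument is assumed to expose readyState and parseError and to accept a function assigned to onreadystatechange, as described above.

```javascript
var READY_STATE_COMPLETED = 4; // value from Table 5-2

// Hypothetical helper: once doc.readyState reaches COMPLETED, calls
// onSuccess(doc) if the parse succeeded or onFailure(doc.parseError)
// if it did not. Earlier ready states are ignored.
function whenLoaded(doc, onSuccess, onFailure) {
  doc.onreadystatechange = function () {
    if (doc.readyState != READY_STATE_COMPLETED) {
      return; // still LOADING, LOADED, or INTERACTIVE
    }
    if (doc.parseError.errorCode != 0) {
      onFailure(doc.parseError);
    } else {
      onSuccess(doc);
    }
  };
}
```

Attach the handler before you start the load, so that no state change is missed. COMPLETED alone does not mean success, which is why the sketch also checks parseError.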

Table 5-3. Methods for loading an XML document into a DOM object.

Method	Description
load(url)	This method loads an XML document from the location specified by the URL. If the URL cannot be resolved or accessed or does not reference an XML document, the documentElement property is set to null and an error is returned.
loadXML(xmlString)	This method loads an XML document using the supplied string. The xmlString argument can be a well-formed or valid document. If the XML within xmlString cannot be loaded (because of parsing errors), the documentElement property is set to null and an error is returned.

The following code loads an XML string into a DOM object:

 objDocument.loadXML("<fact verified='2000-01-24'>Movies are " +
     "better than books because you can't " +
     "spill coffee on them.</fact>");

Once the document is loaded, check the parseError object. After a successful parse, its errorCode property is 0; any other value indicates an error, as in this example:

 if (objDocument.parseError.errorCode != 0) {
     alert("Error: " + objDocument.parseError.reason +
         " on line " + objDocument.parseError.line);
 }

Table 5-4 describes the properties of this read-only parseError object.

Table 5-4. Properties of the parseError object.

Property	Description
errorCode	The error code number in decimal format
url	The URL of the XML file containing the error
reason	The reason for the error in human-readable form
srcText	The full text of the line containing the error
line	The number of the line containing the error (Note that the line number is relative to the top of the document, so if you have a document type definition (DTD), the numbering will start counting at the point immediately following the DTD, not necessarily at the first line of the document content.)
linepos	The character position within the line where the error occurred
filepos	The absolute character position in the file where the error occurred
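
The properties in Table 5-4 can be combined into a single diagnostic message. The sketch below is one possible way to do it: describeParseError is a hypothetical helper name, and the function assumes only that its argument carries the errorCode, reason, line, linepos, and url properties listed in the table.

```javascript
// Hypothetical helper: builds a one-line message from a parseError-like
// object, using the property names listed in Table 5-4.
function describeParseError(err) {
  if (err.errorCode == 0) {
    return "No parse error";
  }
  return "Error " + err.errorCode + ": " + err.reason +
      " (line " + err.line + ", position " + err.linepos + ")" +
      (err.url ? " in " + err.url : "");
}
```

You would pass objDocument.parseError to this helper after calling load or loadXML. The url part is left out when the property is empty, as you might expect when the document was loaded from a string.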

Accessing the documentElement

Once we are satisfied that the document has been loaded properly, we can start to access the contents of the object. We have many ways to do this. The easiest approach is to access the properties of the document element. The following code adds the documentElement.nodeName and the documentElement.text properties to the result string:

 var result = "";
 result += "objDocument.parseError.errorCode: " +
     objDocument.parseError.errorCode + "\n";
 result += "objDocument.documentElement.nodeName: " +
     objDocument.documentElement.nodeName + "\n";
 result += "objDocument.documentElement.text: " +
     objDocument.documentElement.text + "\n";
 alert(result);

The nodeName and text properties work on any element node. The objDocument object is a document node. The documentElement property of this node gives us the element node. We can set a variant to this element to make our coding a little simpler:

 var rootElem = objDocument.documentElement;

Attributes are contained in the XMLDOMElement object as a collection of named items. This collection belongs to the element in which the attributes are specified. You can think of the attributes collection as an associative array—that is, a collection of like objects, each keyed by a string rather than an offset index. To access the value of the verified attribute, you need to use the getNamedItem method:

 result += "rootElem.attributes.getNamedItem('verified').nodeValue: " + rootElem.attributes.getNamedItem("verified").nodeValue + "\n";
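
The line above fails if the element has no attribute with that name, because getNamedItem then returns null. A small guard, sketched here with the hypothetical helper name attributeValue, avoids the error. It assumes only that the element exposes an attributes collection with a getNamedItem method.

```javascript
// Hypothetical helper: returns the value of the named attribute, or
// fallback when the element has no attribute with that name.
function attributeValue(elem, name, fallback) {
  var attr = elem.attributes.getNamedItem(name);
  return attr == null ? fallback : attr.nodeValue;
}
```

For example, attributeValue(rootElem, "verified", "") returns the verified date for the fact document and an empty string for an element that lacks the attribute.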

It's easy to access the properties of the document element, but what about other elements in the document? They are a bit harder to access, but in the next section I'll show you some methods that help you access elements directly.

Getting Items in the Document

Let's load a more complex document. How about our favorite duck-bar joke from Chapter 4?

 <?xml version="1.0"?>
 <joke type="story" keywords="duck bar grapes nails">
     <scene number="1">
         A duck walks into a bar, goes to the bartender, and says,
         "Do you have any grapes?" The bartender says, "No, this is
         a bar, of course we don't have any grapes."
     </scene>
     <scene number="2">
         The next day, the duck walks into the bar, goes up to the
         bartender, and says, "Do you have any grapes?" The bartender
         says, "I told you yesterday, 'no, we don't have any grapes.'
         If you come in here one more time asking for grapes, I'm
         going to nail your beak to that bar!"
     </scene>
     <scene number="3">
         The next day, the duck walks into the bar, goes up to the
         bartender, and asks, "Do you have any nails?" The bartender
         says, "No, this is a bar, of course we don't have any nails."
         Then the duck says, "Do you have any grapes?"
     </scene>
 </joke>

Assume this document is loaded into our object and the parser returns an errorCode of 0. We can easily access the text and attributes of the document element, as you learned in the previous section, but what about the scene elements? To get them, we can use the childNodes property. Then we can interrogate the scene elements to get the information we need:

 result += " rootElem.childNodes.item(1).text: " + rootElem.childNodes.item(1).text + "\n"

The item method returns the child node at the given index. The collection of nodes returned by the childNodes property is zero-based, so the example here returns the text of the second scene, in which the bartender threatens our little hero with physical violence.

You can access an array of child nodes through the XMLDOMNodeList object. Table 5-5 lists the properties and methods available.

Table 5-5. The XMLDOMNodeList interface exposes these properties and methods.

Property	Description
length	Returns the number of nodes in the node list. The length of the list will change dynamically as children or attributes are added and deleted from the parent element.
Method	Description
item(index)	Returns the node in the node list with the specified index. Index is zero-based.
nextNode	Returns the next node in the node list based on the current node.
reset	Returns the iterator to the uninstantiated state; that is, before the first node in the node list.

We can use the length property to iterate through the collection one member at a time:

 for (var i = 0; i < rootElem.childNodes.length; i++) {
     result += " rootElem.childNodes.item(" + i + ").text: " +
         rootElem.childNodes.item(i).text + "\n";
 }

To make the preceding code more readable, we can create a new variant that contains the collection of items:

 var colScenes = rootElem.childNodes;
 for (var i = 0; i < colScenes.length; i++) {
     result += "colScenes.item(" + i + ").text: " +
         colScenes.item(i).text + "\n";
 }

The colScenes variant is a DOM NodeList object containing all the direct child elements of the root element joke. Using this object makes accessing elements in the DOM very straightforward.

What if you want to access one of the scenes, but only if its attribute is a certain value? Here's one approach:

 var colScenes = rootElem.childNodes;
 for (var i = 0; i < colScenes.length; i++) {
     if (colScenes.item(i).attributes.getNamedItem("number").nodeValue == "1") {
         result += "colScenes.item(" + i + ").text: " +
             colScenes.item(i).text + "\n";
     }
 }
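
The loop above assumes that every child node is an element that carries a number attribute. If the document contained a comment between scenes, or a scene without that attribute, the expression would fail on a null value. The following sketch shows a more defensive version. The names NODE_ELEMENT and childElementsWithAttribute are hypothetical, and the function assumes only the childNodes, item, nodeType, and attributes members already described in this chapter.

```javascript
var NODE_ELEMENT = 1; // value from Table 5-1

// Hypothetical helper: returns an array of the child elements of parent
// whose named attribute equals value. Non-element children (comments,
// text, and so on) and elements without the attribute are skipped.
function childElementsWithAttribute(parent, name, value) {
  var matches = [];
  var children = parent.childNodes;
  for (var i = 0; i < children.length; i++) {
    var child = children.item(i);
    if (child.nodeType != NODE_ELEMENT) {
      continue;
    }
    var attr = child.attributes.getNamedItem(name);
    if (attr != null && attr.nodeValue == value) {
      matches[matches.length] = child;
    }
  }
  return matches;
}
```

With the joke document loaded, childElementsWithAttribute(rootElem, "number", "1") would be expected to return an array holding the first scene element.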

It works, but it's pretty clumsy. Let's take a look at an alternative approach for accessing nodes, and then I'll show you an easy way to get a particular node. The Microsoft DOM implements two handy methods for accessing exactly what you want: selectNodes(query) and selectSingleNode(query). These methods are described in Table 5-6.

Table 5-6. You can access an element or set of elements by using the selectNodes and selectSingleNode methods in the Microsoft DOM implementation. The query argument contains a pattern defined by the W3C XPath specification.

Method	Description
selectNodes(query)	Returns a node list containing the results of the query indicated by the query string by using the current node as the query context. If no nodes match the query, an empty node list is returned. If the query string has an error, DOM error reporting is used.
selectSingleNode(query)	Returns a single node that is the first node in the node list returned from the query, using the current node as the query context. If no nodes match the query string, null is returned. If the query string has an error, an error is returned.

The query strings passed to the methods in Table 5-6 are Extensible Stylesheet Language (XSL) patterns. I'll discuss XSL patterns in Chapter 6. For now, to access our document, let's take a look at the selectNodes method as an alternative to childNodes:

 var colScenes = rootElem.selectNodes("scene");
 for (var i = 0; i < colScenes.length; i++) {
     result += "colScenes.item(" + i + ").text: " +
         colScenes.item(i).text + "\n";
 }

Using the selectNodes method is a small improvement over using the childNodes property, and it is clearly more self-explanatory. The advantages of the selectNodes method become more obvious once you start delving deeper into a complex document. For example, you can easily access a collection of line items deep in an invoice document by using a complex XSL query pattern such as the following:

selectNodes("/invoice/body/items/item")

Accessing an item by using the childNodes property to drill down through the hierarchy would require quite a bit of code. We would need to iterate through our node list a number of times to select the nodes that we want. To eliminate most of that code, you can use the selectSingleNode method:

 result += rootElem.selectSingleNode("scene[@number='2']").text;

The XSL pattern here returns the first scene element that has an attribute (@) named number with a value of 2.
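
As Table 5-6 notes, selectSingleNode returns null when nothing matches, so reading the text property directly on the result fails for a query with no hits. A guarded version might look like the sketch below. The name textOfFirstMatch is hypothetical, and the function assumes only that the context node offers selectSingleNode and that matched nodes have a text property.

```javascript
// Hypothetical helper: returns the text of the first node that matches
// query, or fallback when selectSingleNode finds nothing and returns null.
function textOfFirstMatch(contextNode, query, fallback) {
  var node = contextNode.selectSingleNode(query);
  return node == null ? fallback : node.text;
}
```

For example, textOfFirstMatch(rootElem, "scene[@number='4']", "") would be expected to return an empty string for the joke document, which has only three scenes.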

Now you have enough practical knowledge of the DOM to create an application that produces actual results.