The W3C DOM specifies a way of treating a document as a tree of nodes. In this model, every discrete data item is a node, and child elements or enclosed text become subnodes. Treating a document as a tree of nodes is one good way of handling XML documents (although there are others, as we'll see when we start working with Java): it makes explicit which elements contain which other elements, because the contained elements become subnodes (called child nodes) of the container nodes. Everything in a document becomes a node in this model: elements, element attributes, text, and so on. Here are the possible node types in the W3C DOM: element, attribute, text, CDATA section, entity reference, entity, processing instruction, comment, document, document type, document fragment, and notation.
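To get a feel for how node types show up in practice, here is a minimal sketch that uses Python's standard xml.dom.minidom module as a stand-in DOM implementation. That choice is an assumption made only for illustration, not the parser this chapter works with, and the sample markup and the helper name node_types are hypothetical. It walks a tiny document and reports the type of each node it meets, including attributes, which belong to elements but are not child nodes.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical sample markup, chosen only because it yields several node types.
SAMPLE = '<?target data?><NOTE priority="high"><!--remark-->Hi</NOTE>'

# Readable names for the twelve numeric nodeType codes in the W3C DOM.
NODE_TYPE_NAMES = {
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
    Node.CDATA_SECTION_NODE: "CDATA section",
    Node.ENTITY_REFERENCE_NODE: "entity reference",
    Node.ENTITY_NODE: "entity",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.COMMENT_NODE: "comment",
    Node.DOCUMENT_NODE: "document",
    Node.DOCUMENT_TYPE_NODE: "document type",
    Node.DOCUMENT_FRAGMENT_NODE: "document fragment",
    Node.NOTATION_NODE: "notation",
}


def node_types(node):
    """Return (type name, nodeName) pairs for node and everything below it."""
    found = [(NODE_TYPE_NAMES[node.nodeType], node.nodeName)]
    # Attributes are nodes too, but they are reached through the element's
    # attributes map rather than through childNodes.
    if node.nodeType == Node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            found.append((NODE_TYPE_NAMES[attr.nodeType], attr.nodeName))
    for child in node.childNodes:
        found.extend(node_types(child))
    return found


for type_name, name in node_types(parseString(SAMPLE)):
    print(type_name, name)
```

The numeric codes themselves (1 for an element through 12 for a notation) come from the W3C DOM, so the same lookup table should carry over to other DOM implementations that expose nodeType.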
For example, take a look at this document:

    <?xml version="1.0" encoding="UTF-8"?>
    <DOCUMENT>
        <GREETING>
            Hello From XML
        </GREETING>
        <MESSAGE>
            Welcome to the wild and woolly world of XML.
        </MESSAGE>
    </DOCUMENT>

This document has a processing instruction node (for the XML declaration) and a root element node corresponding to the <DOCUMENT> element. The <DOCUMENT> node has two subnodes, the <GREETING> and <MESSAGE> nodes. These nodes are child nodes of the <DOCUMENT> node and are sibling nodes of each other. The <GREETING> and <MESSAGE> elements each have one subnode: a text node that holds character data. We'll get used to handling documents like this one as a tree of nodes in this chapter. Looked at as a tree, this is what this document looks like:

                         <DOCUMENT>
                             |
            -----------------------------------
            |                                 |
        <GREETING>                        <MESSAGE>
            |                                 |
      Hello From XML       Welcome to the wild and woolly world of XML.

Every discrete data item is itself treated as a node. Using the properties defined in the W3C DOM, you can navigate along the various branches of a document's tree with firstChild to move to a node's first child node, or nextSibling to move to the next sibling node of the current node. Working with a document this way takes a little practice, and that's what this chapter is all about.

There are a number of different levels of DOM: Level 0 (an informal name for the pre-standard browser object models), Level 1 (the core document model for HTML and XML), Level 2 (which adds namespace support, events, and style sheet handling), and Level 3 (which adds features such as document loading and saving).
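To make the tree navigation described above concrete, here is a minimal sketch that walks the <DOCUMENT> example with the W3C DOM properties firstChild, nextSibling, and parentNode. It assumes Python's standard xml.dom.minidom module as a stand-in for a browser DOM, and the helper names is_ignorable, first_child, and next_sibling are hypothetical. Two differences are worth knowing about: minidom does not expose the XML declaration as a node, and it keeps the whitespace between elements as text nodes, so the helpers skip whitespace-only text to reach the nodes shown in the tree.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

XML = """<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>"""


def is_ignorable(node):
    """True for text nodes that hold nothing but whitespace (indentation)."""
    return node.nodeType == Node.TEXT_NODE and not node.data.strip()


def first_child(node):
    """Like node.firstChild, but skipping whitespace-only text nodes."""
    child = node.firstChild
    while child is not None and is_ignorable(child):
        child = child.nextSibling
    return child


def next_sibling(node):
    """Like node.nextSibling, but skipping whitespace-only text nodes."""
    sibling = node.nextSibling
    while sibling is not None and is_ignorable(sibling):
        sibling = sibling.nextSibling
    return sibling


doc = parseString(XML)
root = doc.documentElement          # the <DOCUMENT> element
greeting = first_child(root)        # first child of <DOCUMENT>
message = next_sibling(greeting)    # the sibling that follows <GREETING>

# Each element's only child is a text node holding its character data.
greeting_text = greeting.firstChild.data.strip()
message_text = message.firstChild.data.strip()

print(root.nodeName, "->", greeting.nodeName, "and", message.nodeName)
print(greeting_text)
print(message_text)
```

Skipping whitespace in helper functions, rather than stripping it from the document first, leaves the parsed tree untouched, which seemed the less surprising choice for a sketch like this one.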
Practically speaking, the only nearly complete implementation of the XML DOM today is that in Internet Explorer version 5 or later; you can find the documentation for the Microsoft DOM at http://msdn.microsoft.com/library/psdk/xmlsdk/xmld20ab.htm as of this writing. However, the Microsoft sites are continually (and annoyingly) being reorganized, so it's quite possible that by the time you read this, that page will be long gone. In that case, your best bet is to go to http://msdn.microsoft.com and search for "xml dom." (The general rule is not to trust a URL at a Microsoft site for more than about two months.) Because Internet Explorer provides substantial support for the W3C DOM Level 1, I'm going to use it in this chapter. Let us hope that, as other browsers begin to support the W3C DOM, translating this chapter's code to them won't be terribly difficult.