Document Object Model (DOM)

The Document Object Model (DOM) is a standard interface to access and manipulate structured data.

As the name suggests, it does this by modeling, or representing, a document as a hierarchical tree of objects. The application layer works with the document entirely through the methods and properties that these objects expose.

The W3C's DOM specification defines a number of different objects to represent the different structures that appear within an XML document. For example, elements are represented by an Element object, whereas attributes are represented by Attr objects.

Each of these different object types exposes specific methods and properties. Element objects expose a tagName property containing the element name and getAttribute() and setAttribute() methods for attribute manipulation, whereas Attr objects expose a value property containing the value of the particular attribute. These methods and properties can be used by the application layer to navigate and process the DOM tree, exploit the relationships between the different branches of the tree, and extract information from it.
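The properties and methods named above can be tried out in a few lines. The following is a minimal sketch that assumes Python's standard-library xml.dom.minidom as a stand-in DOM implementation (PHP's own implementation is covered in the next section); the one-element document and the variable names are made up for illustration.

```python
from xml.dom.minidom import parseString

# A hypothetical one-element document, used only for this illustration.
doc = parseString("<vegetable color='green'>cabbages</vegetable>")

element = doc.documentElement             # the root Element object
name = element.tagName                    # the element name
color = element.getAttribute("color")     # attribute value, as a string

attr = element.getAttributeNode("color")  # the Attr object itself
attr_value = attr.value                   # the value held by the Attr object

element.setAttribute("color", "red")      # change the attribute in place
new_color = element.getAttribute("color")

print(name, color, attr_value, new_color)
```

Note that getAttribute() returns a plain string, whereas getAttributeNode() hands back the Attr object, whose value property holds the same string.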

The very first specification of the DOM (DOM Level 1) appeared on the W3C's web site in October 1998, and simply specified the "core" features of the DOM: the basic objects and the interfaces to them. The next major upgrade, DOM Level 2, appeared in November 2000; it examined the DOM from the perspective of core functions, event handling, and document traversal. DOM Level 3, which is currently under development, builds on past work, and incorporates additions and changes from other related technologies (XPath, abstract schemas, and so on).

As a standard interface to structured data, the DOM was designed from the get-go to be platform- and language-independent. It can be (and is) used to represent structured HTML and XML data, with DOM (or DOM-based) implementations currently available for Java, JavaScript, Python, C/C++, Visual Basic, Delphi, Perl, SMIL, SVG, and PHP. (The PHP implementation is discussed in detail in the next section.)

In order to better understand how the DOM works, consider Listing 3.1.

Listing 3.1 A Simple XML Document
<?xml version="1.0"?>
<sentence>What a wonderful profusion of colors and smells in the market
<vegetable color='green'>cabbages</vegetable>, <vegetable
color='red'>tomatoes</vegetable>, <fruit color='green'>apples</fruit>,
<vegetable color='purple'>aubergines</vegetable>, <fruit
color='yellow'>bananas</fruit></sentence>

Once a DOM parser has chewed on this document, it spits out the tree structure shown in Figure 3.1.

Figure 3.1. A DOM tree.

As you can see, the parser returns a tree containing multiple nodes linked to each other by parent-child relationships. Developers can then write code to move around the tree, access node properties, and manipulate node content.
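To make the tree concrete, the sketch below parses Listing 3.1 and prints one indented line per node. It again assumes Python's xml.dom.minidom rather than the PHP implementation, and the describe() helper is a hypothetical name introduced only for this example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# The document from Listing 3.1.
XML = """<?xml version="1.0"?>
<sentence>What a wonderful profusion of colors and smells in the market
<vegetable color='green'>cabbages</vegetable>, <vegetable
color='red'>tomatoes</vegetable>, <fruit color='green'>apples</fruit>,
<vegetable color='purple'>aubergines</vegetable>, <fruit
color='yellow'>bananas</fruit></sentence>"""


def describe(node, depth=0, lines=None):
    """Walk the tree depth-first, collecting one indented line per node."""
    if lines is None:
        lines = []
    indent = "  " * depth
    if node.nodeType == Node.ELEMENT_NODE:
        attrs = " ".join(f'{k}="{v}"' for k, v in node.attributes.items())
        suffix = f" [{attrs}]" if attrs else ""
        lines.append(f"{indent}Element: {node.tagName}{suffix}")
    elif node.nodeType == Node.TEXT_NODE:
        lines.append(f"{indent}Text: {node.data.strip()!r}")
    for child in node.childNodes:
        describe(child, depth + 1, lines)
    return lines


doc = parseString(XML)
root = doc.documentElement
tree_lines = describe(root)
print("\n".join(tree_lines))

# Parent-child links can be followed in either direction.
first_vegetable = doc.getElementsByTagName("vegetable")[0]
parent_name = first_vegetable.parentNode.tagName
```

The text between the tags (including the commas) shows up as Text nodes that are siblings of the vegetable and fruit elements, all of them children of the sentence element.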

This approach is in stark contrast to the event-driven approach you studied in Chapter 2, "PHP and the Simple API for XML (SAX)." A SAX parser progresses sequentially through a document, firing events based on the tags it encounters and leaving it to the application layer to decide how to process each event. A DOM parser, on the other hand, reads the entire document into memory and builds a tree representation of its structure; the application layer can then use standard DOM interfaces to find and manipulate individual nodes on this tree, in a non-sequential manner.
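The difference between the two models can be seen side by side. This is a rough sketch that assumes Python's standard xml.sax and xml.dom.minidom modules in place of the PHP parsers; the shortened document and the TagLogger class are hypothetical names used only here.

```python
import xml.sax
from xml.dom.minidom import parseString

# A shortened, hypothetical variant of Listing 3.1.
XML = (
    "<sentence><vegetable color='green'>cabbages</vegetable>, "
    "<fruit color='yellow'>bananas</fruit></sentence>"
)


class TagLogger(xml.sax.ContentHandler):
    """SAX: callbacks fire in document order as the parser advances."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# SAX: the application only sees a stream of events, one tag at a time.
handler = TagLogger()
xml.sax.parseString(XML.encode("utf-8"), handler)

# DOM: the whole tree is in memory, so any node can be reached directly,
# in any order, and its relatives can be inspected afterwards.
doc = parseString(XML)
last_fruit = doc.getElementsByTagName("fruit")[-1]
fruit_text = last_fruit.firstChild.data
parent_name = last_fruit.parentNode.tagName

print(handler.events)
print(fruit_text, parent_name)
```

With SAX, the application has to remember whatever context it needs as events stream past; with the DOM, that context is already held in the tree, at the cost of keeping the entire document in memory.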


XML and PHP
ISBN: 0735712271
Year: 2002
Pages: 84
