19.1 DOM Foundations

     

At its heart, the DOM is a set of abstract interfaces. Various DOM implementations use their own objects to support the interfaces defined in the DOM specification. The DOM interfaces themselves are specified in modules, making it possible for implementations to support parts of the DOM without having to support all of it. XML parsers, for instance, aren't required to provide support for the HTML-specific parts of the DOM, and modularization has provided a simple mechanism that allows software developers to identify which parts of the DOM are supported or not supported by a particular implementation.
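As a rough illustration of that mechanism, the sketch below asks an implementation which modules it claims to support, using the `hasFeature` method of the `DOMImplementation` interface. It assumes the default parser that JAXP supplies in a standard Java runtime; the class name `DomFeatureCheck` and the helper `supports` are made up for this example, and the answers will vary from one implementation to another.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

// Hypothetical helper class, not part of any DOM specification.
class DomFeatureCheck {

    // Ask the default JAXP DOM implementation whether it claims support
    // for the named module ("feature") at the given DOM version.
    static boolean supports(String feature, String version) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .getDOMImplementation();
            return impl.hasFeature(feature, version);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM parser could be configured", e);
        }
    }

    public static void main(String[] args) {
        // Feature names follow the DOM Level 2 module names.
        String[] modules = {"Core", "XML", "HTML", "Views", "Events", "Traversal", "Range"};
        for (String module : modules) {
            System.out.println(module + " 2.0: " + supports(module, "2.0"));
        }
    }
}
```

An XML-oriented parser would normally be expected to report the Core and XML modules and may decline the HTML module, which is the situation the paragraph above describes.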

Successive versions of the DOM are defined as levels. The Level 1 DOM was the W3C's first release, and it focused on working with HTML and XML in a browser context. Effectively, it supported dynamic HTML and provided a base for XML document processing. Because it expected documents to exist already in a browser context, Level 1 only described an object structure and how to manipulate it, not how to load a document into that structure or reserialize a document from that structure.

Subsequent levels have added functionality. DOM Level 2, which was published as a set of specifications, one per module, includes updates for the Core and HTML modules of Level 1, as well as new modules for Views, Events, Style, Traversal, and Range. DOM Level 3 added Abstract Schemas, Load, Save, XPath, and updates to the Core and Events modules.
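The Load and Save functionality mentioned above fills the gap left by Level 1: it gives a standard way to get a document into a DOM tree and back out again. The following is a minimal sketch, assuming a Java runtime whose bundled DOM implementation supports the "LS" feature (the standard JDK does). The class name `LoadSaveSketch` and its helper methods are invented for this example.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class illustrating DOM Level 3 Load and Save.
class LoadSaveSketch {

    // Look up an implementation that advertises the "LS" feature.
    static DOMImplementationLS loadSave() {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            return (DOMImplementationLS) registry.getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM implementation registry available", e);
        }
    }

    // Load: build a DOM tree from XML held in a string.
    static Document parse(String xml) {
        DOMImplementationLS ls = loadSave();
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    // Save: serialize a DOM tree back to a string.
    static String serialize(Document doc) {
        LSSerializer serializer = loadSave().createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        Document doc = parse("<greeting>Hello</greeting>");
        System.out.println(serialize(doc));
    }
}
```

The serializer normally prefixes an XML declaration, so the round-tripped text is not guaranteed to be byte-for-byte identical to the input.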

Other W3C specifications have defined extensions to the DOM particular to their own needs. Mathematical Markup Language (MathML), Scalable Vector Graphics (SVG), Synchronized Multimedia Integration Language (SMIL), and SMIL Animation have all defined DOMs that provide access to details of their own vocabularies.

For a complete picture of the requirements these modules are supposed to address, see http://www.w3.org/TR/DOM-Requirements. For a listing of all of the DOM specifications, including those still under development, see http://www.w3.org/DOM/DOMTR. The DOM has also been included by reference in a variety of other specifications, notably the Java API for XML Processing (JAXP).


Developers using the DOM for XML processing typically rely on the Core module as the foundation for their work.

19.1.1 DOM Notation

The Document Object Model is intended to be operating system- and language-neutral; therefore, all DOM interfaces are specified using the Interface Description Language (IDL) notation defined by the Object Management Group (OMG). To conform to the language of the specification, this chapter and Chapter 25 will use IDL terminology when discussing interface specifics. For example, the word "attribute" in IDL-speak refers to what would be a member variable in C++. This should not be confused with the XML term "attribute," which is a name-value pair that appears within an element's start-tag.

The language-independent IDL interface must then be translated (according to the rules set down by the OMG) into a specific language binding. Take the following interface, for example:

    interface NodeList {
      Node                item(in unsigned long index);
      readonly attribute  unsigned long length;
    };

This interface would be expressed as a Java interface like this:

    package org.w3c.dom;

    public interface NodeList {
        public Node item(int index);
        public int getLength();
    }

The same interface would be described for ECMAScript this way:

    Object NodeList
      The NodeList object has the following properties:
        length
          This read-only property is of type Number.
      The NodeList object has the following methods:
        item(index)
          This method returns a Node object.
          The index parameter is of type Number.
          Note: This object can also be dereferenced using square
          bracket notation (e.g. obj[1]). Dereferencing with an
          integer index is equivalent to invoking the item method
          with that index.

The tables in this chapter represent the information DOM presents as IDL, conveying both the available features and when they became available. DOM implementations vary in their interpretations of these features; be sure to check the documentation of the implementation you choose for details on how it maps the standard DOM interfaces to your particular language.
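To show how the Java binding of NodeList reads in practice, here is a short sketch that walks a list using getLength() and item(). The read-only IDL attribute length surfaces as a getter method, while item keeps its name. The class name `NodeListSketch`, the method `itemTexts`, and the `item` element name are assumptions made for this example only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class showing the Java binding of NodeList.
class NodeListSketch {

    // Collect the text content of every <item> element in the document.
    static List<String> itemTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagName("item");
            List<String> texts = new ArrayList<>();
            // IDL "readonly attribute unsigned long length" becomes getLength();
            // IDL "item(in unsigned long index)" becomes item(int).
            for (int i = 0; i < items.getLength(); i++) {
                texts.add(items.item(i).getTextContent());
            }
            return texts;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(itemTexts("<list><item>a</item><item>b</item></list>"));
    }
}
```

Note that the Java binding offers no square-bracket shorthand; that convenience belongs to the ECMAScript binding.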

19.1.2 DOM Strengths and Weaknesses

Like all programming tools, the DOM is better for addressing some classes of problems than others. Since the DOM object hierarchy stores references between the various nodes in a document, the entire document must be read and parsed before it is available to a DOM application. This step also demands that the entire document be stored in memory, often with a significant amount of overhead. Some early DOM implementations required many times the original document's size when stored in memory. This memory usage model makes DOM unsuitable for applications that deal with very large documents or have a need to perform some intermediate processing on a document before it has been completely parsed.

However, for applications that require random access to different portions of a document at different times, or applications that need to modify the structure of an XML document on the fly, DOM is one of the most mature and best-supported technologies available.
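The kind of random access and in-place modification described above can be sketched as follows. This is only an illustration, assuming the default JAXP parser and a made-up document of `chapter` elements with `id` attributes; the class `DomEditSketch` and the method `reorderAndExtend` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class showing structural edits on a parsed tree.
class DomEditSketch {

    // Move the last child of the root to the front, append a new chapter,
    // and return the resulting order of id attributes.
    static List<String> reorderAndExtend(String xml) {
        Document doc;
        try {
            // The whole document is parsed into memory before any access.
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the sample document", e);
        }
        Element root = doc.getDocumentElement();

        // Random access: jump directly to the last child of the root.
        Node last = root.getLastChild();
        // Inserting a node that is already in the tree moves it.
        root.insertBefore(last, root.getFirstChild());

        // Structural change: create and attach a brand-new element.
        Element added = doc.createElement("chapter");
        added.setAttribute("id", "4");
        root.appendChild(added);

        List<String> ids = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                ids.add(((Element) child).getAttribute("id"));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<book><chapter id='1'/><chapter id='2'/><chapter id='3'/></book>";
        System.out.println(reorderAndExtend(xml));
    }
}
```

Edits like these are awkward with a streaming parser, because the nodes being rearranged have to be available at the same time; the price is that the whole tree stays in memory.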



XML in a Nutshell, Third Edition
ISBN: 0596007647
Year: 2003
Pages: 232
