DOM and the MSXML Interfaces


The DOM models documents by using objects, and this model encompasses not only the structure of a document but also the behavior of the document and of the objects of which it is composed. The DOM identifies these objects and interfaces, their semantics, and the relationships and collaborations among them. You can access and manipulate parsed XML content by using the set of interfaces exposed by these objects.

The objects in the DOM tree are referred to as nodes. Nodes also implement other, more specialized interfaces. The DOM treats nodes as generic objects, which makes it possible for you to load a document and then traverse and manipulate all the nodes. The node types are listed in the XML DOM Enumerated Constants, which also define the valid parent and child nodes for each node type. The MSXML DOM handles this with four interfaces: DOMDocument, IXMLDOMNode, IXMLDOMNodeList, and IXMLDOMNamedNodeMap. The DOM specification refers to this manner of manipulation through the Node interface as the simplified or flattened view. Alternatively, the DOM also allows an object-oriented interface to a document, with a hierarchy of inheritance. This approach requires casts (in Java and other C-like languages) or QueryInterface calls (in COM environments), and these operations are expensive.
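The flattened view described above can be sketched as a traversal that uses only the generic node members (nodeType, nodeName, and childNodes). This is a minimal illustration, not code from the book; the function names walk and elementNames are hypothetical helpers introduced here.

```javascript
// Node type constant for elements, as defined by the DOM specification.
var NODE_ELEMENT = 1;

// Recursively visit every node through the generic node interface only:
// nodeType, nodeName, and childNodes (length + item) exist on all nodes.
function walk(node, visit, depth) {
    depth = depth || 0;
    visit(node, depth);
    var children = node.childNodes;
    for (var i = 0; i < children.length; i++) {
        walk(children.item(i), visit, depth + 1);
    }
}

// Hypothetical helper: collect the names of all element nodes in document order.
function elementNames(root) {
    var names = [];
    walk(root, function (node) {
        if (node.nodeType === NODE_ELEMENT) {
            names.push(node.nodeName);
        }
    });
    return names;
}
```

In IE, assuming objXML is a loaded DOMDocument, a call such as elementNames(objXML.documentElement) would walk the whole tree without any casts or QueryInterface calls, because every node is handled through the same generic members.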

The MSXML DOM Interfaces

The DOM specification defines the DOM Core API, which is a set of objects and interfaces used to access and manipulate document objects. These interfaces are divided into fundamental interfaces and extended interfaces. Fundamental interfaces must be fully implemented by all conforming implementations of the DOM, including all HTML DOM implementations. The extended interfaces do not need to be implemented by DOM implementations that deal only with HTML; they are required only of implementations that deal with XML. The MSXML DOM implements both fundamental and extended interfaces. The MSXML objects/interfaces also include Microsoft extensions to support namespaces, data types, XML schemas, Extensible Stylesheet Language (XSL), XSL Transformations (XSLT) operations, asynchronous loading, and saving documents. Providing the extensions in the same API enables developers to work with a single, consistent API for document processing and transformations.

Table 5.1 lists the fundamental interfaces.

Table 5.1. Fundamental Interfaces

| W3C Interface | MSXML Interface | Description |
|---|---|---|
| Node | IXMLDOMNode | Represents a single node in the document tree; the base interface for accessing data in the XML object model. Valid node types are defined in the XML DOM Enumerated Constants. IXMLDOMNode includes support for data types, namespaces, document type definitions (DTDs), and XML schemas. |
| Document | DOMDocument | Represents the top node of the XML DOM tree. |
| | IXMLDOMDocument2 | Extension of DOMDocument that supports schema caching, runtime validation, and a way to switch on XML Path Language (XPath) support. |
| DOMImplementation | IXMLDOMImplementation | Provides methods that are independent of any particular instance of the DOM. Useful for finding out whether a specific version of the MSXML parser implementation supports a specified feature. |
| DocumentFragment | IXMLDOMDocumentFragment | Represents a lightweight object that is useful for tree insert operations. |
| NodeList | IXMLDOMNodeList | Supports iteration and indexed access operations on the live collection of IXMLDOMNode. |
| Element | IXMLDOMElement | Represents the element object. |
| NamedNodeMap | IXMLDOMNamedNodeMap | Provides iteration and access by name to the collection of attributes. IXMLDOMNamedNodeMap includes support for namespaces. |
| Attr | IXMLDOMAttribute | Represents an attribute of the IXMLDOMElement. Valid and default values for the attribute are defined in a DTD or schema. |
| CharacterData | IXMLDOMCharacterData | Provides text manipulation methods used by several objects. |
| Text | IXMLDOMText | Represents the text content of an element or attribute. |
| Comment | IXMLDOMComment | Represents the content of an XML comment. |
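As a small illustration of the IXMLDOMImplementation entry in Table 5.1, the sketch below asks a document's implementation object which features it claims to support, using the DOM Level 1 hasFeature(feature, version) method. The helper name supportedFeatures is hypothetical, and the feature strings suggested in the usage note are assumptions to verify against the MSXML documentation for your parser version.

```javascript
// Report which of the given features a DOM implementation claims to support.
// `doc` is assumed to be a DOMDocument; `implementation` and
// `hasFeature(feature, version)` come from the DOM Level 1 Core API.
function supportedFeatures(doc, features, version) {
    var result = [];
    for (var i = 0; i < features.length; i++) {
        if (doc.implementation.hasFeature(features[i], version)) {
            result.push(features[i]);
        }
    }
    return result;
}
```

In IE, a call might look like supportedFeatures(objXML, ["XML", "DOM", "MS-DOM"], "1.0"), where objXML is a DOMDocument instance.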

The DOMException, defined in the DOM specification, is raised when a requested operation cannot be performed, either because data is lost or because the implementation has become unstable. For languages and object systems that do not support the concept of exceptions, error conditions can be indicated by using native error-reporting mechanisms. MSXML supports parse error reporting through the XMLDOMParseError object, which holds detailed information about the most recent parse error, including the error number, line number, character position, and a text description.
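A minimal sketch of reading the parse error information might look like the following. It assumes the document exposes the error through its parseError property with the members errorCode, line, linepos, and reason; the helper name describeParseError is hypothetical.

```javascript
// Build a one-line message from a document's parseError object.
// Returns null when errorCode is 0, which indicates that no parse error occurred.
function describeParseError(doc) {
    var err = doc.parseError;
    if (err.errorCode == 0) {
        return null;
    }
    return "Error " + err.errorCode + " at line " + err.line +
           ", position " + err.linepos + ": " + err.reason;
}
```

After a call to load or loadXML, a caller could check describeParseError(objXML) and display the message only when the result is not null.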

Table 5.2 contains the extended interfaces with their corresponding COM equivalents in MSXML 4.0.

Table 5.2. Extended Interfaces

| W3C Interface | MSXML Interface | Description |
|---|---|---|
| CDATASection | IXMLDOMCDATASection | Quotes or escapes blocks of text so that text is not interpreted as markup language. |
| DocumentType | IXMLDOMDocumentType | Contains information associated with the document type declaration. |
| Notation | IXMLDOMNotation | Contains a notation declared in the DTD or schema. |
| Entity | IXMLDOMEntity | Represents a parsed or unparsed entity in the XML document. |
| EntityReference | IXMLDOMEntityReference | Represents an entity reference node. |
| ProcessingInstruction | IXMLDOMProcessingInstruction | Represents a processing instruction that XML defines to keep processor-specific information in the text of the document. |

Microsoft-Specific Interfaces

The objects/interfaces listed in Table 5.3 include Microsoft extensions to support namespaces, data types, XML schemas, XSL, and XSL Transformations (XSLT) operations.

Table 5.3. Microsoft Extensions

| MSXML Interface | Description |
|---|---|
| XMLSchemaCache | Represents a set of namespace Uniform Resource Identifiers (URIs). Used by the schemas and namespaces properties on IXMLDOMDocument2. |
| IXMLDOMSchemaCollection | Represents a SchemaCache object. |
| IXSLProcessor | Used for transformations with compiled stylesheets. |
| IXSLTemplate | Represents a cached XSL stylesheet. |
| IXMLDOMSelection | Represents the list of nodes that match a given XSL Pattern or XML Path Language (XPath) expression. |
| IXTLRuntime | Implements methods that can be called from XSLT stylesheets. |

The two objects/interfaces in Table 5.4 are provided for communication with HTTP servers.

Table 5.4. The Objects/Interfaces for Establishing HTTP Connections to Web Servers

| MSXML Interface | Description |
|---|---|
| IXMLHTTPRequest | Provides client-side protocol support for communication with HTTP servers. |
| IServerXMLHTTPRequest/ServerXMLHTTP | Provides methods and properties that enable you to establish an HTTP connection between files or objects on different web servers. |
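To give a feel for how the HTTP interfaces in Table 5.4 are used, here is a minimal sketch of a synchronous GET request. It relies only on the open, send, status, statusText, and responseXML members of the request object; the helper name fetchXml and the URL in the usage note are hypothetical, and the request object is passed in rather than created so that the caller chooses the ProgID.

```javascript
// Fetch an XML document synchronously through an XMLHTTP-style request object.
// `http` is assumed to expose open(method, url, async), send(), status,
// statusText, and responseXML.
function fetchXml(http, url) {
    http.open("GET", url, false);   // false = wait for the response before returning
    http.send();
    if (http.status !== 200) {
        throw new Error("HTTP " + http.status + " " + http.statusText);
    }
    return http.responseXML;        // the response body parsed into a DOM document
}
```

On the client this might be called as fetchXml(new ActiveXObject("MSXML2.XMLHTTP"), "customers.xml"). A synchronous request blocks the caller until the server responds, so an asynchronous request is usually the better choice in a page that must stay responsive.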

The DOM interfaces model the entire XML document structure, which allows you to work with all the nodes in the document tree. Look at the sample XML document that's shown in Listing 5.1 and see how the information items in the XML document are modeled by the DOM interfaces.

Listing 5.1 A Sample XML Document
 <?xml version="1.0" encoding="utf-8" ?>
 <!DOCTYPE Customers SYSTEM "customers.dtd">
 <Customers>
      <Customer id="ALFKI">
           <CompanyName>Alfreds Futterkiste</CompanyName>
           <Contact>
                <FirstName>Maria</FirstName>
                <LastName>Anders</LastName>
                <Title>Sales Representative</Title>
           </Contact>
      </Customer>
      <!--insert the new customer here-->
      <Customer id="THEBI">
           <CompanyName>The Big Cheese</CompanyName>
           <Contact>
                <FirstName>Liz</FirstName>
                <LastName>Nixon</LastName>
                <Title>Marketing Manager</Title>
           </Contact>
      </Customer>
 </Customers>

After you load this document, the MSXML DOM builds an in-memory representation of it (see Figure 5.1). Each information item in the document is modeled by the interface shown in the box beside it. The interfaces at the top denote collections of objects: IXMLDOMNamedNodeMap represents the collection of attribute nodes, and IXMLDOMNodeList represents the collection of child nodes.

Figure 5.1. The document modeled as tree nodes and the corresponding MSXML interfaces.
graphics/05fig01.gif
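The mapping between the sample document and the interfaces can be sketched in code as follows. The helper customerSummary is hypothetical; it assumes a document shaped like Listing 5.1 and uses the text property, which is a Microsoft extension on IXMLDOMNode rather than part of the W3C DOM.

```javascript
// List "id: company name" for every Customer element in the document.
function customerSummary(doc) {
    var rows = [];
    var customers = doc.getElementsByTagName("Customer");    // IXMLDOMNodeList
    for (var i = 0; i < customers.length; i++) {
        var customer = customers.item(i);                     // IXMLDOMElement
        var id = customer.attributes.getNamedItem("id");      // IXMLDOMNamedNodeMap lookup returns an IXMLDOMAttribute
        var company = customer.getElementsByTagName("CompanyName").item(0);
        rows.push(id.value + ": " + company.text);            // text is an MSXML extension
    }
    return rows;
}
```

For the document in Listing 5.1, the function would be expected to return one entry for ALFKI and one for THEBI, each followed by the corresponding company name.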

Now that you have an overview of the MSXML DOM interfaces, it's time to find out how to use these interfaces to access and manipulate XML documents. You will see examples of this manipulation on the client side, using HTML and JavaScript in IE browsers (version 5.0 and later), and then on the server side, using ASP.NET.

Instantiating the Parser

The following code shows how you create an instance of the MSXML DOMDocument object in JavaScript with MSXML 3.0. All other objects are accessed or created from this object:

 var objXML = new ActiveXObject("MSXML2.DOMDocument") 

MSXML 3.0 is designed to run in side-by-side mode with the version of MSXML that is already on your system, so the old MSXML-related entries in the registry keep pointing to MSXML.dll even after you install MSXML3.dll. You therefore have to be explicit when you provide the ProgID.

MSXML2.DOMDocument is the version-independent ProgID for a rental-threaded component. If your application needs to cache the component in an application-scope or session-scope variable, you must create an instance of the free-threaded model with the ProgID MSXML2.FreeThreadedDOMDocument. Rental-threaded documents perform better because the parser doesn't need to manage concurrent access among threads. You might want to use the free-threaded model when you use cached documents, which is discussed in the section "Caching Compiled Stylesheets Using IXMLDOMXSLTemplate and IXSLProcessor."

MSXML 3.0 is backward compatible with earlier versions. If you want your old code to start using this version, so that it can take advantage of the new features and performance enhancements, use the xmlinst.exe utility to run MSXML 3.0 in replace mode. The utility also provides an option to roll back to side-by-side mode.

The following code shows you how to create an instance of the MSXML DOMDocument object using the version-dependent ProgID with MSXML 4.0:

 var objXML = new ActiveXObject("MSXML2.DOMDocument.4.0") 

Note

Because IE 6.0 ships with MSXML 3.0, we use MSXML 3.0 for all client-side examples in this chapter.
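When a page cannot be sure which parser version is installed, one common pattern is to try a list of ProgIDs in order of preference. The sketch below is an illustration under that assumption, not code from the book; createFirstAvailable is a hypothetical helper, and the object factory is passed in so that the selection logic does not depend on ActiveXObject directly.

```javascript
// Try each ProgID in order and return the first object that can be created.
// `create` is supplied by the caller; in IE it would typically be
// function (progId) { return new ActiveXObject(progId); }
function createFirstAvailable(progIds, create) {
    for (var i = 0; i < progIds.length; i++) {
        try {
            return create(progIds[i]);
        } catch (e) {
            // This ProgID could not be instantiated; fall through to the next one.
        }
    }
    throw new Error("None of the requested ProgIDs could be created");
}
```

In IE, a call might pass the list ["MSXML2.DOMDocument.4.0", "MSXML2.DOMDocument.3.0"] so that MSXML 4.0 is used where it is installed and MSXML 3.0 otherwise. Keep in mind that the two versions differ in features and defaults, so code that accepts either should be tested against both.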




XML and ASP.NET
ISBN: B000H2MXOM
EAN: N/A
Year: 2005
Pages: 184
