What Is DOM?


Working with XML documents involves some overhead, because extracting data from the tags in an XML document can be arduous. A parser takes care of checking a document's validity and extracting the data from the XML syntax. The XML Document Object Model (DOM) specification, standardized by the W3C, provides a layer of abstraction between the application and the XML document. This layer takes the form of interfaces whose methods and properties manipulate an XML document, so when you use the DOM you don't need to work with the XML syntax directly. For example, the getAttribute(…) and setAttribute(…) methods let you read and change the attributes on an element. Legacy systems can also implement these interfaces to expose legacy data as if it were natively stored in XML. In other words, implementing the DOM interfaces on top of a legacy database makes its data look like an XML document.

Why Client-Side XML Processing?

At first glance, it seems pretty silly to process XML data on the client side when powerful server technologies such as ASP.NET, Java, and Perl exist to handle processing on the back end. But if you have been around the world of Web development for any length of time, you will know that some circumstances call for handling things on the server side, and others suit processing on the client side.

Processing data on the client side can help relieve server load and give the visitor a better, more responsive experience on your site. For example, the use of server-side programming to perform a task as simple as sorting a column in a table, or formatting some data, is unnecessary; it also forces the users to wait longer than they should have to for such trivial operations. Client-side processing of XML data can be a big help in situations like this.

XML DOM Object Model

The Document Object Model (DOM) is a W3C standard that allows you to put together a document dynamically, and to navigate and manipulate its structure and content. To work with the DOM, you use an XML parser to load XML documents into memory. After the documents are loaded, you can manipulate the information in them through the DOM interfaces.

You can visualize the DOM's structure as a tree of nodes. The root of the tree is a Document node, which has one or more child nodes that branch off from this trunk. Each of these child nodes may in turn contain child nodes of its own, and so on. For example, consider the XML file shown in Listing 12-1.

Listing 12-1: A sample XML file

      <?xml version="1.0" encoding="utf-8"?>
      <Products>
        <Product Category="Helmets">
          <ProductID>707</ProductID>
          <Name>Sport-100 Helmet, Red</Name>
          <ProductNumber>HL-U509-R</ProductNumber>
        </Product>
        <Product Category="Socks">
          <ProductID>709</ProductID>
          <Name>Mountain Bike Socks, M</Name>
          <ProductNumber>SO-B909-M</ProductNumber>
        </Product>
        <Product Category="Socks">
          <ProductID>710</ProductID>
          <Name>Mountain Bike Socks, L</Name>
          <ProductNumber>SO-B909-L</ProductNumber>
        </Product>
        <Product Category="Caps">
          <ProductID>712</ProductID>
          <Name>AWC Logo Cap</Name>
          <ProductNumber>CA-1098</ProductNumber>
        </Product>
      </Products>

The root element of this XML document is <Products>, which contains an arbitrary number of <Product> elements. Each <Product> element, in turn, contains <ProductID>, <Name>, and <ProductNumber> elements. In addition, each <Product> element carries a Category attribute.

When you load this XML file into the DOM, the parser builds a tree-like structure in which the elements, attributes, and text are represented as nodes. Some of these node objects have child objects, or child nodes. Nodes with no child nodes are called leaf nodes. Figure 12-1 provides a visual representation of the Products.xml file.

Figure 12-1
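
The tree described above can be sketched without a parser. The following is a minimal illustration only: it models part of Products.xml as plain JavaScript objects (the element(), text(), outline(), and countLeaves() helpers are hypothetical names, not DOM APIs), and for simplicity it stores attributes in a map rather than as separate nodes.

```javascript
// Hypothetical plain-object model of part of Products.xml; not a real DOM API.
function element(name, attributes, children) {
  return { type: "element", name: name, attributes: attributes, children: children };
}
function text(value) {
  return { type: "text", value: value, children: [] };
}

var tree = element("Products", {}, [
  element("Product", { Category: "Helmets" }, [
    element("ProductID", {}, [text("707")]),
    element("Name", {}, [text("Sport-100 Helmet, Red")]),
    element("ProductNumber", {}, [text("HL-U509-R")])
  ])
]);

// Collect one indented line per node, visiting the tree depth first.
function outline(node, depth, lines) {
  var label = node.type === "text" ? '"' + node.value + '"' : "<" + node.name + ">";
  lines.push(new Array(depth + 1).join("  ") + label);
  node.children.forEach(function (child) { outline(child, depth + 1, lines); });
  return lines;
}

// A leaf node is a node with no children.
function countLeaves(node) {
  if (node.children.length === 0) return 1;
  return node.children.reduce(function (sum, child) {
    return sum + countLeaves(child);
  }, 0);
}

var outlineLines = outline(tree, 0, []);
console.log(outlineLines.join("\n"));
console.log("Leaf nodes: " + countLeaves(tree));
```

In this model the three text nodes are the leaves, and every element node sits above at least one child.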

According to the W3C recommendations, DOM Level 1 allows navigation within an HTML or XML document and manipulation of its content. DOM Level 2 extends Level 1 with features such as XML namespace support, filtered views, ranges, and events. DOM Level 3 builds on Level 2 and allows programs to dynamically access and update the content, structure, and style of documents. The following table describes the main interfaces that form the DOM Level 3 Core module.

| Interface | Description |
| --- | --- |
| Attr | The Attr interface represents an attribute in an Element object |
| CDATASection | CDATA sections escape blocks of text containing characters that would otherwise be regarded as markup |
| CharacterData | The CharacterData interface extends Node with a set of attributes and methods for accessing character data in the DOM |
| Comment | This interface inherits from CharacterData and represents the content of a comment (in other words, all the characters between the starting <!-- and ending -->) |
| Document | The Document interface represents the entire Hypertext Markup Language (HTML) or XML document |
| DocumentFragment | DocumentFragment is a lightweight or minimal Document object |
| DocumentType | Each Document has a doctype attribute whose value is either null or a DocumentType object |
| DOMImplementation | The DOMImplementation interface provides a number of methods for performing operations that are independent of any particular instance of the DOM |
| Element | The Element interface represents an element in an HTML or XML document |
| Entity | This interface represents an entity, either parsed or unparsed, in an XML document |
| EntityReference | EntityReference objects may be inserted into the structure model when an entity reference is in the source document or when the user wants to insert an entity reference |
| NamedNodeMap | Objects implementing the NamedNodeMap interface represent collections of nodes that can be accessed by name |
| Node | The Node interface is the primary data type for the entire DOM |
| NodeList | The NodeList interface provides the abstraction of an ordered collection of nodes, without defining or constraining how this collection is implemented |
| Notation | This interface represents a notation declared in the document type definition (DTD) |
| ProcessingInstruction | The ProcessingInstruction interface represents a processing instruction (PI), which is used in XML as a way to keep processor-specific information in the text of the document |
| Text | The Text interface inherits from CharacterData and represents the textual content (termed character data in XML) of an Element or Attr |

Note that every interface that represents a node in the DOM tree extends the Node interface. The next few sections explore some of the important interfaces and the steps involved in using their methods and properties.

Using the Document Interface

The Document interface is the uppermost object in the XML DOM hierarchy. It implements all the basic DOM methods required to work with an XML document. It also provides methods that help you navigate, query, and modify the content and the structure of an XML document. Some of the important methods of Microsoft's implementation of the Document object are described in the following table:

| Method | Description |
| --- | --- |
| createElement | Takes an element name as a parameter and creates an element node with that name. You cannot create namespace-qualified elements using the createElement() method. To create namespace-qualified elements in the Microsoft implementation, you need to use the createNode() method |
| createAttribute | Takes an attribute name as a parameter and creates an attribute node with that name |
| createTextNode | Takes a string as a parameter and creates a text node containing the specified string |
| createNode | Takes three parameters. The first parameter, the node type, is a variant that can be either a string or an integer. The second parameter is a string that represents the name of the node to be created. The third parameter is a string that represents the namespace URI |
| createComment | Takes a string as a parameter and creates a comment node containing this string |
| getElementsByTagName | Takes a string as a parameter. The string represents the name of the element to search for. This method returns an instance of the IXMLDOMNodeList object, which contains the collection of nodes with the specified element name. You can use the node list to navigate and manipulate the values stored in the named elements |
| load | Takes a string that represents the URL or the path of an XML document and loads the specified document into the DOMDocument object |
| loadXML | Takes a string that contains well-formed XML code or an entire XML document and loads it into the DOMDocument object |
| transformNode | Takes a style sheet object as a parameter, processes the node by applying the corresponding style sheet template to the XML document, and returns the result of the transformation |
| save | Takes an object as a parameter. This object can be either a DOMDocument or a filename. The save() method saves the DOMDocument object to the specified destination |

In addition to the preceding methods, the Microsoft implementation of the Document interface also exposes the following properties that can be used to manipulate the information contained in the Document object.

| Property | Description |
| --- | --- |
| async | Specifies whether an asynchronous download is permitted. If you set this property to true, the script executes while the XML document is still being loaded. If this property is set to false, the script waits until the XML document is loaded before it starts processing the content. |
| childNodes | Returns a list of child nodes that belong to a parent node. The value of this property is of the type IXMLDOMNodeList. |
| documentElement | Contains the root element of the XML document represented by the DOMDocument object. |
| firstChild | Returns the first child node of a parent element. This is a read-only property. |
| lastChild | Returns the last child of a parent node. |
| parseError | Returns an IXMLDOMParseError object that contains information about the most recently generated error. |
| readyState | Returns the state of the XML document. It indicates whether the document has been loaded completely. |
| xml | Returns an XML representation of a node and its child nodes. |
| validateOnParse | Specifies whether the parser should validate the XML document when parsing. |

Now that you have had a brief look at the properties and methods of the Document interface, take a look at an example that shows how to load an XML document through the Document interface.

Loading an XML Document

To traverse an XML document in Internet Explorer, you first have to instantiate the Microsoft XMLDOM parser. In Internet Explorer 5.0 and above, you can instantiate the parser using JavaScript:

      <script type="text/javascript">
        function loadDocument()
        {
          var doc = new ActiveXObject("Microsoft.XMLDOM");
          ...
        }
      </script>

Note that the previous XML parser is implemented as an ActiveX object and works only in Internet Explorer.

After the parser is instantiated, you can load a file into it using a series of commands. For example, to load the Products.xml file in the parser:

      <script type="text/javascript">
        function loadDocument()
        {
          var doc = new ActiveXObject("Microsoft.XMLDOM");
          doc.async = false;
          doc.load("Products.xml");
          ...
        }
      </script>

Note that you set the async property of the XMLDOM object to false to ensure that the parser will wait until the document is fully loaded before it does anything else. Next, you invoke the load() method to load the contents of the Products.xml file into the parser.

At times you might want to load the XML from a string variable and then feed it directly to the parser. To do this, you must use the loadXML() method instead of the load() method, as in the following example:

      <script type="text/javascript">
        function loadDocument()
        {
          var xmlContents = '<?xml version="1.0" encoding="iso-8859-1"?>';
          xmlContents += '<Products><Product>';
          xmlContents += '<ProductID>707</ProductID>';
          xmlContents += '<Name>Sport-100 Helmet, Red</Name>';
          xmlContents += '<ProductNumber>HL-U509-R</ProductNumber>';
          xmlContents += '</Product></Products>';
          var doc = new ActiveXObject("Microsoft.XMLDOM");
          doc.async = false;
          doc.loadXML(xmlContents);
          ...
        }
      </script>

The loadXML() method can be extremely useful in scenarios where you are retrieving XML data from the server side dynamically as a string variable. You can take that XML and load it onto an XML DOM object using the loadXML() method for subsequent processing.

Using the readyState Property

To check whether a document has been loaded completely, use the readyState property. This property stores a numeric value, which represents one of the following states:

  • LOADING (1): The loading process is in progress, and data is not yet parsed.

  • LOADED (2): The data has been read and parsed, but the object model is not ready.

  • INTERACTIVE (3): The object model is available with a partially retrieved data set and is in read-only mode.

  • COMPLETED (4): The loading process is complete.

To determine whether the XML document is completely loaded and display a message using JavaScript, use the code:

      if (doc.readyState == 4)
      {
        alert("Document is completely loaded");
      }
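
When debugging a load, it can help to report the state by name rather than by number. The sketch below is one possible way to do that, assuming the four state codes listed above; describeReadyState() and isCompletelyLoaded() are hypothetical helpers, and fakeDoc is a stand-in object so the code runs without the ActiveX parser.

```javascript
// State codes as listed in the text above.
var READY_STATE_NAMES = { 1: "LOADING", 2: "LOADED", 3: "INTERACTIVE", 4: "COMPLETED" };

// Hypothetical helper: map a numeric readyState to its name.
function describeReadyState(state) {
  return READY_STATE_NAMES[state] || "UNKNOWN (" + state + ")";
}

// Hypothetical helper: true only when loading has finished.
function isCompletelyLoaded(doc) {
  return doc.readyState === 4;
}

// Stand-in for a parser object; a real document would come from the parser.
var fakeDoc = { readyState: 4 };
console.log(describeReadyState(fakeDoc.readyState));
console.log(isCompletelyLoaded(fakeDoc));
```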

Using the Element Interface

The Element interface represents each element in the XML document. It supports the manipulation of elements and the attributes associated with the elements. If the element node contains text, this text is represented in a text node. The Element interface helps manage attributes because this is the only node type that has attributes. This interface has only one read-only property, tagName, which retrieves the tag name of the element as a string.

An element is also a Node object and inherits different properties of the Node object. The methods of the Element interface are shown in the following table:

| Method | Description |
| --- | --- |
| getAttribute | Returns the string containing the value of the specified attribute |
| getAttributeNode | Returns the specified attribute node as an Attr object |
| getElementsByTagName | Returns the NodeList of all descendant elements with a given tag name |
| removeAttribute | Removes the specified attribute from the element (if the attribute has a default value, it reverts to that default) |
| removeAttributeNode | Removes the specified attribute node |
| setAttribute | Creates a new attribute and sets its value. If the attribute is already present, changes its value |
| setAttributeNode | Adds the specified attribute node to the element, replacing any existing attribute with the same name |

As mentioned previously, the getElementsByTagName() method retrieves all elements of the specified name that occur under the node on which the method is called. For example, to print the value contained in the Name element of the first product, you could write the following code:

      document.write(doc.getElementsByTagName("Name").item(0).text); 

To display all the values of the Name elements, you could loop through the NodeList object returned by the getElementsByTagName() method:

      var names = doc.getElementsByTagName("Name");
      for (var i = 0; i < names.length; i++)
      {
        document.write(names.item(i).text + "  ");
      }
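
To make the "all elements under the node" behavior concrete, here is a rough model of such a search over a plain-object tree. It is only a sketch of the idea, not how any parser implements it; el(), leaf(), findByTagName(), and the textValue field are hypothetical names introduced for illustration.

```javascript
// Hypothetical plain-object tree; field names are illustrative only.
function el(name, children) {
  return { name: name, children: children || [], textValue: null };
}
function leaf(name, value) {
  return { name: name, children: [], textValue: value };
}

var products = el("Products", [
  el("Product", [leaf("ProductID", "707"), leaf("Name", "Sport-100 Helmet, Red")]),
  el("Product", [leaf("ProductID", "709"), leaf("Name", "Mountain Bike Socks, M")])
]);

// Visit descendants depth first (document order) and keep matching names.
// The starting node itself is never included, only nodes beneath it.
function findByTagName(node, tagName, matches) {
  matches = matches || [];
  node.children.forEach(function (child) {
    if (child.name === tagName) matches.push(child);
    findByTagName(child, tagName, matches);
  });
  return matches;
}

var foundNames = findByTagName(products, "Name");
for (var n = 0; n < foundNames.length; n++) {
  console.log(foundNames[n].textValue);
}
```

Because the walk is depth first, matches come back in the order in which they appear in the document.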

Creating a New Element

You can create a new element for an XML document using the createElement() method of the DOM object. The createElement() method takes one parameter, the name of the element that is to be created, as shown:

      var prodElement = doc.createElement("Product"); 

In the previous code, a variable named prodElement is declared and a new element node, Product, is created. The reference of the new node is stored in the prodElement variable.

Using the Node Interface

The Node interface represents a single node in the document tree structure. All the other node objects inherit their basic properties from the Node interface. In addition to the properties and methods specific to each node type, the Node interface provides basic information such as the name of the node, its text, and its content. The following table lists the different properties of the Node interface:

| Property | Description |
| --- | --- |
| attributes | Returns a NamedNodeMap for nodes that have attributes |
| baseName | A read-only property that returns the base name for a node |
| childNodes | A read-only property containing a node list of all children, for nodes that can have them |
| dataType | A read-only property that specifies the data type for the node |
| definition | Returns the definition of the node in the DTD |
| firstChild | A read-only property that returns the first child node of a node |
| lastChild | A read-only property that returns the last child node of a node |
| namespaceURI | A read-only property that returns the Uniform Resource Identifier (URI) of the namespace |
| nextSibling | Returns the next node in the parent's child list |
| nodeName | A read-only property that contains the name of the node, depending on node type |
| nodeType | A read-only property specifying the type of the node |
| nodeTypedValue | Contains the value of this node as expressed in its data type |
| nodeTypeString | A read-only property that returns the node type in string form |
| nodeValue | Contains the value of the node, depending on its type |
| ownerDocument | Returns the Document interface to which the node belongs |
| parentNode | A read-only property that returns the parent node of all nodes except Document, DocumentFragment, and Attr, which cannot have parent nodes |
| parsed | Returns a value of True if this node and all of its child nodes have been parsed. Otherwise, it returns False |
| prefix | A read-only property that returns the namespace prefix |
| previousSibling | Returns the previous node in the parent's child list |
| specified | Returns a value indicating whether this node is specified or derived from a default value in the DTD or schema |
| text | Returns the text content of this node and its subtrees |
| xml | Contains the XML representation of this node and its child nodes |

Note 

Note that the properties baseName, dataType, definition, nodeTypedValue, nodeTypeString, parsed, text, and xml are available only in the Microsoft implementation of DOM.

The following table lists the different methods of the Node interface:

| Method | Description |
| --- | --- |
| appendChild | Adds a new child node to the list of children for this node |
| cloneNode | Creates a clone node that is an exact duplicate of this node |
| hasChildNodes | Determines whether a node has child nodes |
| insertBefore | Inserts a new child node before an existing one. If no child node exists, the new child node becomes the first |
| removeChild | Removes the specified node from the list of child nodes |
| replaceChild | Replaces one child of a node with another and returns the old child |
| selectNodes | Creates a NodeList of all the matching child nodes returned after matching the specified pattern |
| selectSingleNode | Returns a Node interface for the first child node to match the specified pattern |
| transformNode | Processes this node and its child nodes using the specified XSL style sheet and returns the resulting transformation |
| transformNodeToObject | Processes this node and its descendants using the specified XSL style sheet and returns the resulting transformation in the specified object |

Note 

Note that the methods selectNodes, selectSingleNode, transformNode, and transformNodeToObject are available only in the Microsoft implementation of DOM.

Now that you have an understanding of the properties and methods of the Node object, look at an example.

When the parser loads an XML document, it gives you a reference to the document itself. From this, you can get a reference to the root element in the document (in this example, the Products element) with the property name documentElement. The children of that element are, in turn, accessible through the childNodes property.

      var nodes = doc.documentElement.childNodes; 

The childNodes property, and thus the nodes variable in this example, contains a node list that is represented by the NodeList interface. In accordance with the DOM standard, you can access the elements of a node list by passing a numerical index to the item() method, with 0 corresponding to the first node in the list. In this example, therefore, nodes.item(0) returns a reference to the first child element of the Products element, which is the first Product element.

      document.write(nodes.item(0).text); 

The result should look something like this:

      707 Sport-100 Helmet, Red HL-U509-R 

As you can see, the output shows the concatenated values of the ProductID, Name, and ProductNumber elements. If you just want to print the ProductID element value of the first Product element, you need to modify the code to look as follows:

      var nodes = doc.documentElement.childNodes.item(0).childNodes;
      document.write(nodes.item(0).text);

When you run the code now, the text 707 is displayed in the browser.

Note that Internet Explorer (and indeed many other DOM implementations) allows you to treat NodeList objects as arrays to simplify the code you need to work with them. For example, you could use array syntax to access nodes instead of the item method:

      var nodes = doc.documentElement.childNodes[0].childNodes;
      alert(nodes[0].text);

This method of accessing text values within an XML file by numerical index is useful, but it can get a little cumbersome and it can be sometimes error prone as well. Fortunately, there is another way to approach the problem.

Creating a New Node

You create a new node using the createNode() method. To create a root element using the createNode() method in JavaScript, use the following code:

      var doc = new ActiveXObject("Microsoft.XMLDOM");
      doc.async = false;
      doc.load("Products.xml");
      if (doc.childNodes.length == 0)
      {
        // 1 is the node type for an element; the namespace URI is left empty
        var rootNode = doc.createNode(1, "Products", "");
        doc.appendChild(rootNode);
        doc.save("Products.xml");
      }

In the previous code, the DOM object serves as the root node for the tree structure. The length property of the NodeList object is used to check the number of child nodes that the root node contains. If this number is equal to 0, a new node is created using the createNode() method. This new node is then added as the root document element using the appendChild() method.

Appending a New Child Node

You append a new child node to a DOM tree using the appendChild() method of the Node object, as shown:

      var rootElement = doc.documentElement;
      var prodElement = doc.createElement("Product");
      rootElement.appendChild(prodElement);

In the previous code, you first obtain a reference to the root element of the DOM object. You then create a new element using the createElement() method of the DOMDocument object in JavaScript. Finally, you append the created element as the last child of the root element using the appendChild() method of the Node object.

Inserting a Node Before an Existing Node

You insert a node before an existing node in a DOM tree using the insertBefore() method of the Node object, as shown:

      var newElement = doc.createElement("ProductIdentifier");
      var oldElement = doc.documentElement.childNodes.item(0).childNodes.item(0);
      doc.documentElement.childNodes.item(0).insertBefore(newElement, oldElement);

In the previous code, you first create a new element called ProductIdentifier. You then obtain a reference to the first child of the first Product element within the root element and store it in a variable, oldElement. Finally, you insert the newly created node before that first child node using the insertBefore() method of the Node object.

Removing a Child Node

You can remove a child node from a DOM tree using the removeChild() method of the Node object, as shown:

      var elementToBeRemoved = doc.documentElement.childNodes.item(0).firstChild;
      doc.documentElement.childNodes.item(0).removeChild(elementToBeRemoved);

In the previous code, you first obtain a reference to the first child node of the first Product element of the root element and store this reference in the variable elementToBeRemoved. You then use the removeChild() method of the Node object to remove the node contained in elementToBeRemoved.

Replacing a Node

You replace an existing node with a new node using the replaceChild() method of the Node object. The replaceChild() method takes two parameters: the first is the new element, and the second is the existing element that needs to be replaced. The method must be called on the parent of the existing element. In the following code, the first ProductID element in the document is replaced with the new element named ProductIdentifier.

      var newElement = doc.createElement("ProductIdentifier");
      var oldElement = doc.documentElement.childNodes.item(0).childNodes.item(0);
      doc.documentElement.childNodes.item(0).replaceChild(newElement, oldElement);
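
The insert, remove, and replace operations described in the last few sections all act on a parent's child list. The following sketch models that list as a plain array to show the expected effect of each call; it is an illustration under that assumption, not the parser's implementation, and the Color element name is invented for the example.

```javascript
// Hypothetical array-based stand-ins for operations on a node's child list.
function insertBefore(children, newChild, refChild) {
  if (refChild == null) {            // no reference node: append at the end
    children.push(newChild);
    return newChild;
  }
  var index = children.indexOf(refChild);
  if (index === -1) throw new Error("reference node is not a child");
  children.splice(index, 0, newChild);
  return newChild;
}

function removeChild(children, oldChild) {
  var index = children.indexOf(oldChild);
  if (index === -1) throw new Error("node is not a child");
  children.splice(index, 1);
  return oldChild;
}

function replaceChild(children, newChild, oldChild) {
  var index = children.indexOf(oldChild);
  if (index === -1) throw new Error("node is not a child");
  children.splice(index, 1, newChild);
  return oldChild;                   // the replaced node is handed back
}

// Child list of one Product element, represented by element names only.
var product = ["ProductID", "Name", "ProductNumber"];
var replaced = replaceChild(product, "ProductIdentifier", "ProductID");
removeChild(product, "ProductNumber");
insertBefore(product, "Color", "Name");
console.log(product.join(", "));
console.log("Replaced: " + replaced);
```

In this model, once a node has been replaced it is no longer in the list, so a second replaceChild() call that names it as the old child raises an error.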

Accessing Text Values of Elements

In the Microsoft implementation of DOM, the text property exposes the text content of a node, which can be the value of an attribute or the text within an element.

You can display the text within an element using the text property of the Node object, as shown:

      alert(productIDElement.text); 

You can also set the value of an element or an attribute using this property, as shown:

      productIDElement.text="100"; 

Using the NodeList Interface

The NodeList interface represents an ordered collection of Node objects, such as the child nodes of a node. It allows access to all the nodes in the collection. The length property of the NodeList interface is a very important property that returns the number of items in the NodeList collection. The following table describes the different methods of the NodeList interface.

| Method | Description |
| --- | --- |
| item | Returns the node at the specified index of the collection, or null if an invalid index is entered |
| nextNode | Returns the next node in the collection |
| reset | Resets the sequence of the collection |

The following code creates a NodeList interface of the Product elements using the XML document's getElementsByTagName() method. With the length property, you can determine the number of nodes in the list and display the node values by accessing each node through the index.

      var productNodes = doc.getElementsByTagName("Product");
      var length = productNodes.length;
      for (var i = 0; i < length; i++)
        document.write(productNodes.item(i).text + "<br>");

When you open the HTML file in the browser, the browser displays the output shown in Figure 12-2.

Figure 12-2

Using the NamedNodeMap Interface

The NamedNodeMap interface represents a collection of nodes that can be accessed by name. The following code shows how to obtain a NamedNodeMap interface of all the attribute nodes of the first Product element, and then iterate through the collection using the item() method to display each attribute name and its associated text.

      var firstChildElement = doc.documentElement.firstChild;
      var attributes = firstChildElement.attributes;
      for (var i = 0; i < attributes.length; i++)
        document.write(attributes.item(i).name + "="
          + attributes.item(i).text + "<br>");

When you open the HTML file in the browser, the browser displays the attribute name and associated text. If you use the Products.xml file as an example, you will get “Category=Helmets” as the output because the Product element has only one attribute.
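
The code above reaches each attribute by index, but the defining feature of a NamedNodeMap is lookup by name. The sketch below models such a collection in plain JavaScript to show both kinds of access. The method names item(), getNamedItem(), and setNamedItem() mirror those of the W3C NamedNodeMap interface, while createAttributeMap() and the object layout are hypothetical.

```javascript
// Hypothetical in-memory model of a name-indexed attribute collection.
function createAttributeMap() {
  var items = [];   // keeps insertion order so index access works
  function indexOfName(name) {
    for (var i = 0; i < items.length; i++) {
      if (items[i].name === name) return i;
    }
    return -1;
  }
  return {
    get length() { return items.length; },
    item: function (index) { return items[index] || null; },
    getNamedItem: function (name) {
      var i = indexOfName(name);
      return i === -1 ? null : items[i];
    },
    // Adds the attribute, replacing any existing one with the same name.
    setNamedItem: function (attr) {
      var i = indexOfName(attr.name);
      if (i === -1) items.push(attr);
      else items[i] = attr;
    }
  };
}

var attrs = createAttributeMap();
attrs.setNamedItem({ name: "Category", value: "Helmets" });
attrs.setNamedItem({ name: "CategoryID", value: "1" });
attrs.setNamedItem({ name: "Category", value: "Caps" });   // replaces the first entry

console.log(attrs.length);
console.log(attrs.getNamedItem("Category").value);
```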

Using the Attr Interface

The Attr interface represents an attribute of an Element object. The DOM treats an Attr as a property of its element rather than as a child node of it. The values that are allowed for an attribute are typically defined in a DTD. Attr also inherits from Node, so it has the properties and methods of the Node interface. The following table describes the important properties of the Attr interface.

| Property | Description |
| --- | --- |
| name | Returns the name of the attribute. It is the same as the nodeName property for this Node interface |
| specified | Indicates whether the value of the attribute is set in the document |
| value | Returns or sets the value of the attribute |

In addition to the previous properties, all the methods of the Node interface also apply to Attr because Attr is also a Node interface. The following code shows a simple example of using the Attr interface to retrieve the name and value of attributes in an XML document.

      var firstChildElement = doc.documentElement.firstChild;
      var attributes = firstChildElement.attributes;
      for (var i = 0; i < attributes.length; i++)
        document.write(attributes.item(i).name + "=" +
          attributes.item(i).value + "<br>");

When you open the HTML file in the browser, the browser displays the name and the value of the attribute of the first node. In the case of Products.xml file, it just displays Category=Helmets as the output.

Creating Attributes

Much of the functionality of the Element node concerns the management of attributes. This example shows how to add a new attribute to an existing Element node and how to view its contents. You can create an attribute with the Document method createAttribute(…) and then insert it into the tree with setAttributeNode(…). A simpler alternative is the setAttribute(…) method on the Element node, which lets you work with attribute names as strings instead of attribute nodes. Listing 12-2 shows an example of how to create an attribute and retrieve its value for display purposes.

Listing 12-2: Using XML DOM to manipulate attributes

      <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>Working with Attributes</title>
        <script type="text/javascript" language="javascript">
          var doc;
          function btnCreateAndDisplayAttribute_Click()
          {
            loadDocument();
            createAndDisplayAttribute();
          }
          function loadDocument()
          {
            doc = new ActiveXObject("Microsoft.XMLDOM");
            doc.async = false;
            doc.load("Products.xml");
          }
          function createAndDisplayAttribute()
          {
            var docElement = doc.documentElement;
            //Put the attribute CategoryID='1' on the root element
            docElement.setAttribute('CategoryID', '1');
            //Display the value of the added attribute
            document.getElementById('result').innerText =
              docElement.getAttribute('CategoryID');
          }
        </script>
      </head>
      <body>
        <input type="button"
          value="Create and display attribute"
          onclick="btnCreateAndDisplayAttribute_Click()" />
        <br/><br/><br/>
        <div id="result"></div>
      </body>
      </html>

When you click the button control, the page displays the value of the CategoryID attribute, which is 1 in this case.

Using the CharacterData Interface

The CharacterData interface extends the Node object with various properties and methods to manipulate text. It can handle very large amounts of text and is implemented by the CDATASection, Comment, and Text nodes. The CharacterData interface has the following properties:

| Property | Description |
| --- | --- |
| data | This property contains the data for this node, depending on node type |
| length | This property is read-only and contains the length of the data string in characters |

The following table lists the methods of the CharacterData interface.

Method           Description
appendData       Adds the specified string to existing string data
deleteData       Deletes the specified range of characters from string data
insertData       Inserts a string of data at the specified position in the string
replaceData      Replaces the characters from the specified position in the string with the supplied string data
substringData    Returns a substring consisting of the specified range of characters

Look at the following simple example to understand the use of one of the methods of the CharacterData interface.

      var prodElement = doc.documentElement.firstChild;
      var text = prodElement.firstChild.firstChild;
      document.write(text.data + "<br>");
      var lastTwoCharacters = text.substringData(1, 2);
      document.write(lastTwoCharacters + "<br>");

The previous code displays the character data of the first ProductID element using the data property. The substringData() method then returns the specified range of characters from the text (a zero-based offset of 1 and a count of 2), and the code displays that substring. The output produced by the page looks as follows in the browser:

      707
      07
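
To make the offset and count arguments of the remaining methods concrete, here is a minimal sketch that models them on a plain JavaScript string so that it runs without a parser. CharacterDataModel is a hypothetical stand-in, not a DOM object, and it skips the range checking that a real implementation performs.

```javascript
// Stand-in that models the CharacterData string operations.
// It is NOT a DOM node and does no range checking.
function CharacterDataModel(data) {
  this.data = data;
}
CharacterDataModel.prototype.substringData = function (offset, count) {
  return this.data.substring(offset, offset + count);
};
CharacterDataModel.prototype.appendData = function (text) {
  this.data += text;
};
CharacterDataModel.prototype.insertData = function (offset, text) {
  this.data = this.data.substring(0, offset) + text + this.data.substring(offset);
};
CharacterDataModel.prototype.deleteData = function (offset, count) {
  this.data = this.data.substring(0, offset) + this.data.substring(offset + count);
};
CharacterDataModel.prototype.replaceData = function (offset, count, text) {
  this.deleteData(offset, count);
  this.insertData(offset, text);
};

var text = new CharacterDataModel("707");
console.log(text.substringData(1, 2)); // "07", as in the example above
text.appendData("-A");                 // data is now "707-A"
text.insertData(0, "P");               // data is now "P707-A"
text.deleteData(0, 1);                 // data is now "707-A"
text.replaceData(3, 2, "/B");          // data is now "707/B"
console.log(text.data);
```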

Using the Comment Interface

The Comment interface represents the content that appears between ‘<!--’ and ‘-->’ in a document. The Comment object has no properties or methods of its own; it inherits those of the Node and CharacterData objects.

Using the Text Interface

The Text object represents the text content of an Element or an Attr object. After a document is loaded, there is only one Text node for each block of text. Because Text is also a Node, it inherits the properties and methods of the Node and CharacterData objects. The Text interface has one method of its own, splitText(number). This method splits the node in two at the specified character offset: the original node keeps the text before the offset, and the text from the offset to the end of the string is moved into a new Text node, which the method returns.
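
The effect of splitText(number) on the text itself can be modeled on a plain string, as in the following sketch. splitTextModel is a hypothetical function for illustration only; it does not create or modify any DOM nodes.

```javascript
// Models Text.splitText(offset) on a plain string; NOT a real DOM call.
// Returns what the original node keeps and what the new node receives.
function splitTextModel(data, offset) {
  return {
    original: data.substring(0, offset), // stays in the existing Text node
    created: data.substring(offset)      // moves to the new, returned Text node
  };
}

var parts = splitTextModel("Cotton Shirt", 6);
console.log(parts.original); // "Cotton"
console.log(parts.created);  // " Shirt"
```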

Using the CDATA Section Interface

The CDATA Section interface represents the content within the CDATA section delimiters <![CDATA[ and ]]>. A CDATA section holds characters that should not be parsed as markup by the XML parser. The content of a CDATA section is stored in the data of the CDATA Section node, which is a specialized Text node. The CDATA Section interface has no methods or properties of its own but inherits those of the Text, CharacterData, and Node objects.

If the text contains HTML tags, placing it in a CDATA section lets it pass through the XML parser without being interpreted as markup. The content of the CDATA section is returned without the <![CDATA[ and ]]> delimiters. You can use a CDATA section to keep HTML tags from being parsed, as shown here:

      <?xml version="1.0"?>
      <Products>
        <Product>
          <ProductID></ProductID>
          <Name><![CDATA[<span style="color:red"> Cotton Shirt </span>]]> </Name>
          ----
          ----
      </Products>

The code required to handle a CDATA Section is exactly the same as the code for processing any other node, because the CDATA Section is also a node.

Handling Errors in XML DOM

At times, XML parsing generates errors for reasons such as malformed XML or schema validation failures. To process these errors, the Document object exposes a property called parseError, through which you can get more details about the error. This object, which implements the IXMLDOMParseError interface, provides a set of properties to retrieve the error information. The following table describes the commonly used properties of the IXMLDOMParseError object:

Property     Description
reason       Stores a string explaining the reason for the error
line         Stores a long integer representing the line number for the error
errorCode    Contains a long integer error code. This property contains the value 0 if there are no errors in the XML document
linepos      Stores a long integer representing the line position for the error
srcText      Stores a string containing the line that caused the error

You use the IXMLDOMParseError object to display the information about the errors encountered while parsing an XML document, as shown here:

      var doc = new ActiveXObject("Microsoft.XMLDOM");
      doc.async = false;
      doc.load("Products.xml");
      if (doc.parseError.errorCode != 0)
      {
        alert("Error Code: " + doc.parseError.errorCode);
        alert("Error Reason: " + doc.parseError.reason);
        alert("Error Line: " + doc.parseError.line);
      }
      else
      {
        alert(doc.documentElement.xml);
      }

In the previous code, you first create a new DOM object and load the Products.xml file into it. You then use the if construct to check the errorCode property of the parseError object. If the error code is anything other than 0, you display the details of the error: the error code, the reason, and the line number where the error occurred. Otherwise, you display a message box showing the XML of the document.
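
Rather than raising one alert per property, you might collect the details into a single message. The following is a minimal sketch under stated assumptions: formatParseError is a hypothetical helper, the message layout is arbitrary, and the sample object holds made-up values rather than real parser output. Only the property names come from the table of IXMLDOMParseError properties.

```javascript
// Hypothetical helper: turns a parseError-like object into one message.
// Only the property names (errorCode, reason, line, linepos, srcText)
// come from IXMLDOMParseError; the message layout is an assumption.
function formatParseError(parseError) {
  if (parseError.errorCode === 0) {
    return null; // errorCode 0 means the document parsed cleanly
  }
  return "Error " + parseError.errorCode +
    " at line " + parseError.line +
    ", position " + parseError.linepos +
    ": " + String(parseError.reason).replace(/\s+$/, "") +
    "\nSource: " + parseError.srcText;
}

// Stand-in object with made-up sample values; in the browser you
// would pass doc.parseError instead.
var sample = {
  errorCode: -1,
  reason: "End tag does not match the start tag.\r\n",
  line: 12,
  linepos: 3,
  srcText: "</Products>"
};
console.log(formatParseError(sample));
```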

XML Transformation Using XSL

In this section, you see the steps involved in transforming the contents of an XML file into HTML using the built-in support provided by XML DOM. You can accomplish this on the client side by invoking the methods of XML DOM through JavaScript. First, create the XSL file that will be used to transform the Products.xml file, as shown in Listing 12-3.

Listing 12-3: Products.xsl file used for transforming the Products.xml file

      <?xml version="1.0" ?>
      <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="html" />
        <xsl:template match="/">
          <table border="1" cellSpacing="1" cellPadding="1">
            <center>
              <xsl:element name="tr">
                <xsl:element name="td">Product ID</xsl:element>
                <xsl:element name="td">
                  <xsl:attribute name="align">center</xsl:attribute>
                  Name
                </xsl:element>
                <xsl:element name="td">Product Number</xsl:element>
              </xsl:element>
              <xsl:for-each select="//Product">
                <!-- Each product on a separate row -->
                <xsl:element name="tr">
                  <xsl:element name="td">
                    <xsl:value-of select="ProductID" />
                  </xsl:element>
                  <xsl:element name="td">
                    <xsl:value-of select="Name" />
                  </xsl:element>
                  <xsl:element name="td">
                    <xsl:value-of select="ProductNumber" />
                  </xsl:element>
                </xsl:element>
              </xsl:for-each>
            </center>
          </table>
        </xsl:template>
      </xsl:stylesheet>

The XSL logic shown in Listing 12-3 simply loops through all the <Product> elements and for each element it retrieves the values of the ProductID, Name, and ProductNumber elements and displays them in the browser. Now that you have created the XSL file, look at the code of the Web page in Listing 12-4 to perform the transformation.

Listing 12-4: Transforming XML to HTML using XML DOM

      <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>Transforming XML to HTML</title>
        <script type="text/javascript" language="javascript">
          var xmlDoc;
          var xslDoc;
          function btnTransformXmlToHtml_Click()
          {
            loadDocuments();
            tranformXmlToHtml();
          }
          function loadDocuments()
          {
            //Load the XML Document
            xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
            xmlDoc.async = false;
            xmlDoc.load("Products.xml");
            //Load the XSL Document
            xslDoc = new ActiveXObject("Microsoft.XMLDOM");
            xslDoc.async = false;
            xslDoc.load("Products.xsl");
          }
          function tranformXmlToHtml()
          {
            var output = xmlDoc.transformNode(xslDoc);
            result.innerHTML = output;
          }
        </script>
      </head>
      <body>
        <input type="button" value="Transform XML"
          onclick="btnTransformXmlToHtml_Click()" />
        <br/><br/><br/>
        <div id="result"></div>
      </body>
      </html>

The preceding Web page consists mostly of JavaScript code that loads the XML and XSLT files into memory, processes them, and displays the results. First, you create an instance of the XML DOM and load the Products.xml file into memory. Next, you create another instance of XML DOM and load the Products.xsl file into memory. Because XSLT files are formatted as XML, you can load them just as you would any other XML file.

You then transform the XML document using the XSL style sheet, and assign the HTML output of the transformation to the innerHTML property of the div control.

      function tranformXmlToHtml()
      {
        var output = xmlDoc.transformNode(xslDoc);
        result.innerHTML = output;
      }

The transformNode() method takes the object that holds the XSL file as an argument. Figure 12-3 shows how the output looks when you click the Transform XML button in the browser.

Figure 12-3
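
The transformNode() method is specific to the Microsoft XML DOM. As a rough sketch only, the transformation step could be wrapped so that it falls back to the XSLTProcessor and XMLSerializer objects that other browsers provide. The helper name transformToHtml is hypothetical, and the fallback branch is an assumption about the target browser rather than something this chapter covers.

```javascript
// Hypothetical wrapper around the transformation step in Listing 12-4.
function transformToHtml(xmlDoc, xslDoc) {
  // Microsoft XML DOM: transformNode() returns the result as a string
  if (typeof xmlDoc.transformNode !== "undefined") {
    return xmlDoc.transformNode(xslDoc);
  }
  // Assumed fallback for browsers that expose XSLTProcessor
  if (typeof XSLTProcessor !== "undefined") {
    var processor = new XSLTProcessor();
    processor.importStylesheet(xslDoc);
    // The result is a fragment owned by the current HTML document
    var fragment = processor.transformToFragment(xmlDoc, document);
    return new XMLSerializer().serializeToString(fragment);
  }
  throw new Error("No XSLT support available");
}

// Usage, once both documents are loaded:
//   result.innerHTML = transformToHtml(xmlDoc, xslDoc);
```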




Professional XML
Professional XML (Programmer to Programmer)
ISBN: 0471777773
EAN: 2147483647
Year: 2004
Pages: 215
