Parsing XML Documents in Code


Now that we've seen how to access an individual element in an XML document, let's take the next step: parsing (that is, reading and interpreting) an entire XML document at once.

To do that, I'll create a new example that displays the whole structure of an XML document. I'll name the function that parses the document parse. You pass a node to this function, and it navigates through all of that node's contained nodes automatically. To parse the entire document, then, you just pass the document object (the XMLDocument property of the XML data island) to parse like this, where we're working on our XML document, 22-01.xml:

<HTML>
    <HEAD>
        <TITLE>
            Parsing XML Documents
        </TITLE>
        <XML ID="xml1" SRC="22-01.xml"></XML>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            function parseDocument()
            {
                documentXML = document.all("xml1").XMLDocument
                div1.innerHTML = parse(documentXML, "")
            }
            .
            .
            .
        <INPUT TYPE="BUTTON" VALUE="Parse XML document"
            ONCLICK="parseDocument()">
        <DIV ID="div1"></DIV>
    </BODY>
</HTML>

The parse function itself will create a string, giving all the details of each node in the XML document. I'll also indent the output to show which nodes are contained by other nodes.

Here's how it will work. I'll pass a node to parse so that it can work through all of that node's contained nodes, and I'll also pass an indentation string, lengthening that string each time we go deeper into a set of nested nodes to indent the output further. To get the current node's name, we can use the nodeName property; to get its value, we can use the nodeValue property (both of these properties are covered in Chapter 5). Also in this code, you'll see the HTML entity for a non-breaking space, &nbsp;, which makes sure that the browser preserves our indentation as we want it:

<HTML>
    <HEAD>
        <TITLE>
            Parsing XML Documents
        </TITLE>
        <XML ID="xml1" SRC="22-01.xml"></XML>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            function parseDocument()
            {
                documentXML = document.all("xml1").XMLDocument
                div1.innerHTML = parse(documentXML, "")
            }

            function parse(node1, indent)
            {
                var text
                if (node1.nodeValue != null) {
                    text = indent + node1.nodeName
                        + "&nbsp; = " + node1.nodeValue
                } else {
                    text = indent + node1.nodeName
                }
                text += "<BR>"
                if (node1.childNodes.length > 0) {
                    for (var loopIndex = 0; loopIndex <
                        node1.childNodes.length; loopIndex++) {
                        text += parse(node1.childNodes(loopIndex),
                            indent + "&nbsp;&nbsp;&nbsp;&nbsp;")
                    }
                }
                return text
            }
            //-->
        </SCRIPT>
    </HEAD>
    <BODY>
        .
        .
        .
    </BODY>
</HTML>

Note that at the end of the code, we check whether there are any child nodes (using the childNodes property; see "The childNodes Property" in Chapter 5) and, if so, call parse again on each of those child nodes. As a result, the parse function will parse not only each child node, but also each child node's children, if there are any. In other words, we're calling parse recursively (see "Handling Recursion" in Chapter 3, "The JavaScript Language: Loops, Functions, and Errors") to navigate through an entire XML document without having to know that document's structure beforehand.

So far, we've displayed each node's name and value, but we can do more. We also can indicate the type of each node we display by checking the nodeType property (see Chapter 5). Here are the possible values for this property in an XML document:
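To see the recursion in isolation, here is a minimal sketch that runs outside the browser. It assumes plain JavaScript objects standing in for XML nodes (the makeNode helper, the parseTree name, and the sample tree are all invented for illustration and are not the contents of 22-01.xml), it indents with ordinary spaces and newlines rather than &nbsp; and <BR>, and it indexes children with square brackets instead of the Internet Explorer call syntax used in the listing:

```javascript
// Hypothetical stand-in for an XML node: a plain object that exposes the
// same property names the text uses (nodeName, nodeValue, childNodes).
function makeNode(nodeName, nodeValue, childNodes) {
    return {
        nodeName: nodeName,
        nodeValue: nodeValue,
        childNodes: childNodes || []
    };
}

// Recursive walk: describe this node, then each child one level deeper.
function parseTree(node1, indent) {
    var text = indent + node1.nodeName;
    if (node1.nodeValue != null) {
        text += " = " + node1.nodeValue;
    }
    text += "\n";
    for (var loopIndex = 0; loopIndex < node1.childNodes.length; loopIndex++) {
        // The longer indentation string marks the deeper nesting level.
        text += parseTree(node1.childNodes[loopIndex], indent + "    ");
    }
    return text;
}

// Invented sample data: a document holding one element with a text child.
var sampleTree = makeNode("#document", null, [
    makeNode("EVENT", null, [
        makeNode("TITLE", null, [makeNode("#text", "Meeting")])
    ])
]);

var outline = parseTree(sampleTree, "");
```

Each call handles exactly one node and hands its children to further calls, so the function never needs to know how deep the tree goes.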

  • 1 Element

  • 2 Attribute

  • 3 Text

  • 4 CDATA section

  • 5 Entity reference

  • 6 Entity

  • 7 Processing instruction

  • 8 Comment

  • 9 Document

  • 10 Document type

  • 11 Document fragment

  • 12 Notation

Now we can determine the type of each node using a switch statement, and display that information like this:

<HTML>
    <HEAD>
        <TITLE>
            Parsing XML Documents
        </TITLE>
        <XML ID="xml1" SRC="22-01.xml"></XML>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            .
            .
            .
            function parse(node1, indent)
            {
                var type
                switch (node1.nodeType) {
                    case 1:
                        type = "element"
                        break
                    case 2:
                        type = "attribute"
                        break
                    case 3:
                        type = "text"
                        break
                    case 4:
                        type = "CDATA section"
                        break
                    case 5:
                        type = "entity reference"
                        break
                    case 6:
                        type = "entity"
                        break
                    case 7:
                        type = "processing instruction"
                        break
                    case 8:
                        type = "comment"
                        break
                    case 9:
                        type = "document"
                        break
                    case 10:
                        type = "document type"
                        break
                    case 11:
                        type = "document fragment"
                        break
                    case 12:
                        type = "notation"
                }
                var text
                if (node1.nodeValue != null) {
                    text = indent + node1.nodeName
                        + "&nbsp; = " + node1.nodeValue
                        + "&nbsp; (Node type: " + type
                        + ")"
                } else {
                    text = indent + node1.nodeName
                        + "&nbsp; (Node type: " + type
                        + ")"
                }
                text += "<BR>"
                .
                .
                .
                return text
            }
            //-->
        </SCRIPT>
    </HEAD>
    <BODY>
        .
        .
        .
    </BODY>
</HTML>
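The switch statement maps each nodeType number to a label. As an alternative sketch (not the approach this chapter's listing takes), the same mapping can be held in an array indexed by the nodeType value. The names NODE_TYPE_NAMES and describeNodeType are invented for illustration, and the "unknown" fallback is an added assumption; the switch version simply leaves type undefined for a value outside 1 through 12:

```javascript
// Index = nodeType value. Index 0 is unused because the values start at 1.
var NODE_TYPE_NAMES = [
    null,
    "element",
    "attribute",
    "text",
    "CDATA section",
    "entity reference",
    "entity",
    "processing instruction",
    "comment",
    "document",
    "document type",
    "document fragment",
    "notation"
];

// Return the label for a nodeType value, or "unknown" if it is out of range.
function describeNodeType(nodeType) {
    return NODE_TYPE_NAMES[nodeType] || "unknown";
}

var elementLabel = describeNodeType(1);
var documentLabel = describeNodeType(9);
```

A table like this keeps the labels in one place, which can be easier to scan than twelve case clauses, though the switch has the advantage of being explicit about each value.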

We can also display the attributes of each node, if there are any, and we've seen how to use the attributes collection for that:

<HTML>
    <HEAD>
        <TITLE>
            Parsing XML Documents
        </TITLE>
        <XML ID="xml1" SRC="22-01.xml"></XML>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            .
            .
            .
            function parse(node1, indent)
            {
                var type
                .
                .
                .
                if (node1.nodeValue != null) {
                    text = indent + node1.nodeName
                        + "&nbsp; = " + node1.nodeValue
                        + "&nbsp; (Node type: " + type
                        + ")"
                } else {
                    text = indent + node1.nodeName
                        + "&nbsp; (Node type: " + type
                        + ")"
                }
                if (node1.attributes != null) {
                    if (node1.attributes.length > 0) {
                        for (var loopIndex = 0; loopIndex <
                            node1.attributes.length; loopIndex++) {
                            text += " (Attribute: " +
                                node1.attributes(loopIndex).nodeName +
                                " = \"" +
                                node1.attributes(loopIndex).nodeValue
                                + "\")"
                        }
                    }
                }
                .
                .
                .
                return text
            }
            //-->
        </SCRIPT>
    </HEAD>
    <BODY>
        .
        .
        .
    </BODY>
</HTML>
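The attribute handling can also be tried on its own. The following is a small sketch that assumes plain objects in place of real XML nodes (describeAttributes, sampleElement, and sampleText are invented names and data) and uses square-bracket indexing rather than the Internet Explorer call syntax in the listing:

```javascript
// Build the " (Attribute: name = "value")" text for every attribute of a
// node. Nodes that have no attributes collection at all are expected to
// report null, which is why the null check comes first.
function describeAttributes(node1) {
    var text = "";
    if (node1.attributes != null) {
        for (var i = 0; i < node1.attributes.length; i++) {
            var attribute = node1.attributes[i];
            text += " (Attribute: " + attribute.nodeName +
                " = \"" + attribute.nodeValue + "\")";
        }
    }
    return text;
}

// Invented sample data: an element with one attribute, and a text node
// whose attributes property is null.
var sampleElement = {
    nodeName: "EVENT",
    attributes: [{ nodeName: "TYPE", nodeValue: "informal" }]
};
var sampleText = { nodeName: "#text", attributes: null };

var elementAttributes = describeAttributes(sampleElement);
var textAttributes = describeAttributes(sampleText);
```

This sketch drops the separate length > 0 test, because a for loop over an empty collection simply runs zero times; the null check is the one that matters.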

And that's it! You can see the results in Figure 22.4, where we're laying bare the entire structure of our XML document.

Figure 22.4. Parsing an XML document.


Here's the entire code:

(Listing 22-07.html on the web site)
<HTML>
    <HEAD>
        <TITLE>
            Parsing XML Documents
        </TITLE>
        <XML ID="xml1" SRC="22-01.xml"></XML>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            // Start the parse at the document object of the XML data island.
            function parseDocument()
            {
                documentXML = document.all("xml1").XMLDocument
                div1.innerHTML = parse(documentXML, "")
            }

            // Describe node1 and, recursively, all the nodes it contains.
            function parse(node1, indent)
            {
                // Translate the numeric node type into a readable label.
                var type
                switch (node1.nodeType) {
                    case 1:
                        type = "element"
                        break
                    case 2:
                        type = "attribute"
                        break
                    case 3:
                        type = "text"
                        break
                    case 4:
                        type = "CDATA section"
                        break
                    case 5:
                        type = "entity reference"
                        break
                    case 6:
                        type = "entity"
                        break
                    case 7:
                        type = "processing instruction"
                        break
                    case 8:
                        type = "comment"
                        break
                    case 9:
                        type = "document"
                        break
                    case 10:
                        type = "document type"
                        break
                    case 11:
                        type = "document fragment"
                        break
                    case 12:
                        type = "notation"
                }

                // Show the node's name, its value (if any), and its type.
                var text
                if (node1.nodeValue != null) {
                    text = indent + node1.nodeName
                        + "&nbsp; = " + node1.nodeValue
                        + "&nbsp; (Node type: " + type
                        + ")"
                } else {
                    text = indent + node1.nodeName
                        + "&nbsp; (Node type: " + type
                        + ")"
                }

                // List the node's attributes, if it has any.
                if (node1.attributes != null) {
                    if (node1.attributes.length > 0) {
                        for (var loopIndex = 0; loopIndex <
                            node1.attributes.length; loopIndex++) {
                            text += " (Attribute: " +
                                node1.attributes(loopIndex).nodeName +
                                " = \"" +
                                node1.attributes(loopIndex).nodeValue
                                + "\")"
                        }
                    }
                }
                text += "<BR>"

                // Recurse into each child node with a deeper indentation.
                if (node1.childNodes.length > 0) {
                    for (var loopIndex = 0; loopIndex <
                        node1.childNodes.length; loopIndex++) {
                        text += parse(node1.childNodes(loopIndex),
                            indent + "&nbsp;&nbsp;&nbsp;&nbsp;")
                    }
                }
                return text
            }
            //-->
        </SCRIPT>
    </HEAD>
    <BODY>
        <H1>
            Parsing XML Documents
        </H1>
        <INPUT TYPE="BUTTON" VALUE="Parse XML document"
            ONCLICK="parseDocument()">
        <DIV ID="div1"></DIV>
    </BODY>
</HTML>

At this point, then, we have a good handle on working with the contents of XML documents. When you know how to access those contents, you can read all the data you need from an XML document, and you can use JavaScript to work with that data. An XML document might tell you what HTML elements to create in a page and where to place them, for example, or it might supply the data used to fill controls such as select controls.
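As one illustration of filling a select control from XML data, here is a hedged sketch that turns the child elements of a list node into <OPTION> markup. Everything in it is an assumption made for the example: the node objects are plain JavaScript stand-ins for XML nodes, the ITEM element name and sample values are invented, and each item is assumed to hold a single text node. The escapeHTML and buildOptions helpers are hypothetical names, not functions from this book:

```javascript
// Escape characters that are special in HTML so that XML text can be
// placed safely inside markup and attribute values.
function escapeHTML(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Build one <OPTION> per child element (nodeType 1) of listNode, using the
// value of the element's first child (assumed to be a text node) as both
// the option's value and its visible label.
function buildOptions(listNode) {
    var html = "";
    for (var i = 0; i < listNode.childNodes.length; i++) {
        var item = listNode.childNodes[i];
        if (item.nodeType == 1 && item.childNodes.length > 0) {
            var label = escapeHTML(item.childNodes[0].nodeValue);
            html += "<OPTION VALUE=\"" + label + "\">" + label + "</OPTION>";
        }
    }
    return html;
}

// Invented sample data: two ITEM elements separated by a whitespace text
// node, which the nodeType check skips.
var sampleList = {
    nodeName: "ITEMS", nodeType: 1, childNodes: [
        { nodeName: "ITEM", nodeType: 1, childNodes: [
            { nodeName: "#text", nodeType: 3, nodeValue: "Tea & coffee", childNodes: [] }
        ] },
        { nodeName: "#text", nodeType: 3, nodeValue: "\n", childNodes: [] },
        { nodeName: "ITEM", nodeType: 1, childNodes: [
            { nodeName: "#text", nodeType: 3, nodeValue: "Juice", childNodes: [] }
        ] }
    ]
};

var optionsHTML = buildOptions(sampleList);
```

In a page, a string like this could be written between <SELECT> and </SELECT> tags when the control's markup is generated; how best to do that depends on the browser being targeted.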

Tip

You also can write XML documents, but not directly with Internet Explorer (which can't write data to the user's disk from code). Instead, you should send the data back to the server using HTML forms, and create or install software on the server to write the XML documents and send them back.




Inside JavaScript
ISBN: 0735712859
Year: 2005
Pages: 492
Authors: Steve Holzner
