Recipe 15.1. Reading and Accessing XML Data in Document Order

Problem

You need to read in all the elements of an XML document and obtain information about each element, such as its name and attributes.

Solution

Create an XmlReader and use its Read method to process the document as shown in Example 15-1.

Example 15-1. Reading an XML document

    using System;
    using System.IO;    // MemoryStream
    using System.Text;  // Encoding
    using System.Xml;

    // …

    // Writes one space per nesting level.
    public static void Indent(int level)
    {
        for (int i = 0; i < level; i++)
            Console.Write(" ");
    }

    public static void AccessXML()
    {
        string xmlFragment = "<?xml version='1.0'?>" +
            "<!-- My sample XML -->" +
            "<?pi myProcessingInstruction?>" +
            "<Root>" +
            "<Node1 nodeId='1'>First Node</Node1>" +
            "<Node2 nodeId='2'>Second Node</Node2>" +
            "<Node3 nodeId='3'>Third Node</Node3>" +
            "</Root>";

        byte[] bytes = Encoding.UTF8.GetBytes(xmlFragment);
        using (MemoryStream memStream = new MemoryStream(bytes))
        {
            XmlReaderSettings settings = new XmlReaderSettings();
            // Check for any illegal characters in the XML.
            settings.CheckCharacters = true;

            using (XmlReader reader = XmlReader.Create(memStream, settings))
            {
                int level = 0;
                while (reader.Read())
                {
                    switch (reader.NodeType)
                    {
                        case XmlNodeType.CDATA:
                            Indent(level);
                            Console.WriteLine("CDATA: {0}", reader.Value);
                            break;
                        case XmlNodeType.Comment:
                            Indent(level);
                            Console.WriteLine("COMMENT: {0}", reader.Value);
                            break;
                        case XmlNodeType.DocumentType:
                            Indent(level);
                            Console.WriteLine("DOCTYPE: {0}={1}",
                                reader.Name, reader.Value);
                            break;
                        case XmlNodeType.Element:
                            Indent(level);
                            Console.WriteLine("ELEMENT: {0}", reader.Name);
                            level++;
                            // Attributes are not returned by Read; visit them explicitly.
                            while (reader.MoveToNextAttribute())
                            {
                                Indent(level);
                                Console.WriteLine("ATTRIBUTE: {0}='{1}'",
                                    reader.Name, reader.Value);
                            }
                            break;
                        case XmlNodeType.EndElement:
                            level--;
                            break;
                        case XmlNodeType.EntityReference:
                            Indent(level);
                            Console.WriteLine("ENTITY: {0}", reader.Name);
                            break;
                        case XmlNodeType.ProcessingInstruction:
                            Indent(level);
                            Console.WriteLine("INSTRUCTION: {0}={1}",
                                reader.Name, reader.Value);
                            break;
                        case XmlNodeType.Text:
                            Indent(level);
                            Console.WriteLine("TEXT: {0}", reader.Value);
                            break;
                        case XmlNodeType.XmlDeclaration:
                            Indent(level);
                            Console.WriteLine("DECLARATION: {0}={1}",
                                reader.Name, reader.Value);
                            break;
                    }
                }
            }
        }
    }

This code dumps the XML document in a hierarchical format:

    DECLARATION: xml=version='1.0'
    COMMENT:  My sample XML 
    INSTRUCTION: pi=myProcessingInstruction
    ELEMENT: Root
     ELEMENT: Node1
      ATTRIBUTE: nodeId='1'
      TEXT: First Node
     ELEMENT: Node2
      ATTRIBUTE: nodeId='2'
      TEXT: Second Node
     ELEMENT: Node3
      ATTRIBUTE: nodeId='3'
      TEXT: Third Node

Discussion

Reading existing XML and identifying its node types is one of the fundamental tasks in working with XML. The code in the Solution creates an XmlReader over a MemoryStream built from a string (any other stream or a TextReader would work as well), then iterates over the nodes and writes an indented outline of the document to the console window.

The Solution shows creating a MemoryStream from an XML fragment in a string like this:

    string xmlFragment = "<?xml version='1.0'?>" +
        "<!-- My sample XML -->" +
        "<?pi myProcessingInstruction?>" +
        "<Root>" +
        "<Node1 nodeId='1'>First Node</Node1>" +
        "<Node2 nodeId='2'>Second Node</Node2>" +
        "<Node3 nodeId='3'>Third Node</Node3>" +
        "</Root>";
    byte[] bytes = Encoding.UTF8.GetBytes(xmlFragment);
    MemoryStream memStream = new MemoryStream(bytes);
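When the XML is already held in a string, a StringReader can be passed to XmlReader.Create instead of encoding the string to bytes first. The following is a minimal sketch of that alternative, not the recipe's own code; the class name StringReaderExample and the helper FirstElementName are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StringReaderExample
{
    // Hypothetical helper: returns the name of the first element found
    // in the XML text, or null if the text contains no element.
    public static string FirstElementName(string xml)
    {
        // XmlReader.Create accepts a TextReader, so no byte[] or
        // MemoryStream is needed when the XML is already a string.
        using (StringReader text = new StringReader(xml))
        using (XmlReader reader = XmlReader.Create(text))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    return reader.Name;
            }
        }
        return null;
    }
}
```

The MemoryStream approach remains the better fit when the XML arrives as bytes, for example from a file or a network response.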

Once the MemoryStream has been established, the settings for the XmlReader are configured on an XmlReaderSettings instance, which is later passed to XmlReader.Create along with the stream. Here the settings tell the XmlReader to check for any illegal characters in the XML fragment (true is also the default value of CheckCharacters):

    XmlReaderSettings settings = new XmlReaderSettings();
    // Check for any illegal characters in the XML.
    settings.CheckCharacters = true;
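XmlReaderSettings controls more than character checking. As a hedged illustration, the sketch below also sets IgnoreComments, IgnoreProcessingInstructions, and IgnoreWhitespace, so that the reader skips those nodes entirely. The class ReaderSettingsExample, the method NodeTypes, and the skipExtras parameter are hypothetical, and the sketch reads from a StringReader rather than the recipe's MemoryStream for brevity.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderSettingsExample
{
    // Hypothetical helper: returns the node types the reader reports,
    // optionally skipping comments, processing instructions, and
    // insignificant whitespace.
    public static List<XmlNodeType> NodeTypes(string xml, bool skipExtras)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.CheckCharacters = true;
        settings.IgnoreComments = skipExtras;
        settings.IgnoreProcessingInstructions = skipExtras;
        settings.IgnoreWhitespace = skipExtras;

        List<XmlNodeType> types = new List<XmlNodeType>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
                types.Add(reader.NodeType);
        }
        return types;
    }
}
```

With skipExtras set to true, the switch statement in the Solution would never see the Comment or ProcessingInstruction cases, which can simplify code that only cares about elements and text.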

The while loop reads the XML one node at a time and examines the reader's NodeType property to determine what kind of node the reader is currently positioned on:

    while (reader.Read())
    {
        switch (reader.NodeType)
        {
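The Solution tracks nesting with its own level counter, incrementing on Element and decrementing on EndElement. An empty element such as <Empty/> is reported as an Element node with no matching EndElement, so a manual counter can drift on documents that contain one. As a sketch of one alternative, the reader's own Depth property can supply the indentation. The class DepthOutlineExample and the method Outline are hypothetical names, and the lines are collected in a list rather than written to the console so that they are easy to inspect.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class DepthOutlineExample
{
    // Hypothetical helper: builds an indented outline of elements and
    // text, using XmlReader.Depth instead of a manual level counter.
    public static List<string> Outline(string xml)
    {
        List<string> lines = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // Depth is 0 for top-level nodes and grows by one per nesting level.
                string indent = new string(' ', reader.Depth);
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        lines.Add(indent + "ELEMENT: " + reader.Name);
                        break;
                    case XmlNodeType.Text:
                        lines.Add(indent + "TEXT: " + reader.Value);
                        break;
                }
            }
        }
        return lines;
    }
}
```

If the manual counter is kept instead, checking reader.IsEmptyElement before moving to the attributes, and skipping the increment when it is true, addresses the same problem.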

The NodeType property returns an XmlNodeType enumeration value that identifies the kind of node the reader is positioned on. The XmlNodeType enumeration values are shown in Table 15-1.

Table 15-1. The XmlNodeType enumeration values

Name                    Description
Attribute               An attribute node of an element.
CDATA                   A CDATA section, which escapes text that would otherwise be treated as markup.
Comment                 A comment in the XML: <!-- my comment -->.
Document                The root of the XML document tree.
DocumentFragment        Document fragment node.
DocumentType            The document type declaration.
Element                 An element tag: <myelement>.
EndElement              An end element tag: </myelement>.
EndEntity               Returned at the end of an entity after calling ResolveEntity.
Entity                  Entity declaration.
EntityReference         A reference to an entity.
None                    The node type returned if Read has not yet been called on the XmlReader.
Notation                A notation in the DTD (document type definition).
ProcessingInstruction   A processing instruction: <?pi myProcessingInstruction?>.
SignificantWhitespace   Whitespace when a mixed content model is used or when whitespace is being preserved.
Text                    Text content for a node.
Whitespace              The whitespace between markup entries.
XmlDeclaration          The XML declaration, which is the first node in the document and cannot have children: <?xml version='1.0'?>.
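To see several of these values in practice, the sketch below records the NodeType reported at each step for a small document. It is an illustration under assumptions rather than part of the recipe: the class NodeTypeExample and the method Describe are hypothetical names. Note that Read never stops on an attribute; the Attribute value appears only after the reader is moved onto an attribute explicitly.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeTypeExample
{
    // Hypothetical helper: lists the name of each XmlNodeType value the
    // reader reports, including attributes visited on each element.
    public static List<string> Describe(string xml)
    {
        List<string> seen = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                seen.Add(reader.NodeType.ToString());

                // Attributes are reached only by moving onto them explicitly.
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.MoveToFirstAttribute())
                {
                    do
                    {
                        seen.Add(reader.NodeType.ToString());
                    }
                    while (reader.MoveToNextAttribute());

                    // Return to the owning element before the next Read.
                    reader.MoveToElement();
                }
            }
        }
        return seen;
    }
}
```

For the input <r id='1'><!-- c --><![CDATA[<x>]]><e/></r>, this reports Element, Attribute, Comment, CDATA, Element, and EndElement, in that order; the empty element <e/> contributes an Element with no EndElement of its own.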


See Also

See the "XmlReader Class," "XmlNodeType Enumeration," and "MemoryStream Class" topics in the MSDN documentation.



C# Cookbook