Reading XML

As mentioned previously, XmlReader is an abstract base class for XML reader classes. This class provides fast, non-cached, forward-only cursors to read XML documents.

The XmlTextReader, XmlNodeReader, and XmlValidatingReader classes are derived from the XmlReader class. Figure 6-2 shows XmlReader and its derived classes.

Figure 6-2: XmlReader and its derived classes

You use the XmlTextReader, XmlNodeReader, and XmlValidatingReader classes to read XML documents. These classes define overloaded constructors that accept XML files, strings, streams, TextReader objects, an XmlNameTable, and combinations of these. After creating an instance, you simply call the Read method of the class to read the document. The Read method starts at the first node of the document and advances one node per call until it returns False, which indicates there's no node left to read in the document. Listing 6-1 reads an XML file and displays some information about the file. In this example, we use the books.xml file. You can use any XML file by replacing the file name.

Listing 6-1: Reading an XML File

start example
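 ' Requires Imports System.Xml
 Dim reader As XmlTextReader = New XmlTextReader("C:\books.xml")
 ' Name and LocalName are empty until the reader is positioned on a node,
 ' so move to the first content node (normally the root element) first
 reader.MoveToContent()
 Console.WriteLine("General Information")
 Console.WriteLine("===================")
 Console.WriteLine(reader.Name)
 Console.WriteLine(reader.BaseURI)
 Console.WriteLine(reader.LocalName)
 reader.Close()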
end example
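
The constructor overloads mentioned above also let you read XML that is already in memory. The following is a minimal sketch, assuming a made-up XML string (the xmlText variable and its content are illustrative only, not part of books.xml); it wraps the string in a StringReader and walks every node with Read.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module ReadFromStringExample
  Sub Main()
    ' Hypothetical sample data; any well-formed XML string works here
    Dim xmlText As String = _
      "<bookstore><book><title>Sample Title</title></book></bookstore>"

    ' XmlTextReader accepts a TextReader, so a StringReader lets it
    ' parse text held in memory instead of a file
    Dim reader As New XmlTextReader(New StringReader(xmlText))
    Try
      ' Read returns False once there are no more nodes
      While reader.Read()
        Console.WriteLine(reader.NodeType.ToString() & ": " & reader.Name)
      End While
    Finally
      reader.Close()
    End Try
  End Sub
End Module
```

Each call to Read advances the reader by exactly one node, so the loop prints one line per node, including end-element nodes.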

Getting Node Information

The Name property returns the name of the node with the namespace prefix, and the LocalName property returns the name of the node without the prefix.

The Item property works as an indexer and returns the value of the attribute at the specified index or with the specified name. The Value property returns the value of the current node. You can even get the nesting level of the node by using the Depth property, as shown in Listing 6-2.

Listing 6-2: Getting XML Node Information

start example
 Dim reader As XmlTextReader = New XmlTextReader("C:\books.xml")
 While reader.Read()
   ' Report only nodes that can carry a value (text, comments, and so on)
   If reader.HasValue Then
     Console.WriteLine("Name: " & reader.Name)
     Console.WriteLine("Node Depth: " & reader.Depth.ToString())
     Console.WriteLine("Value: " & reader.Value)
   End If
 End While
 reader.Close()
end example
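
Listing 6-2 doesn't exercise the Item indexer, so here is a small sketch of it under stated assumptions: the XML string, the genre and year attributes, and the module name are invented for illustration. It reads attribute values by index and by name while the reader is positioned on an element.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module ItemIndexerExample
  Sub Main()
    ' Hypothetical sample data with two attributes on the book element
    Dim xmlText As String = _
      "<book genre='novel' year='2001'><title>Sample Title</title></book>"
    Dim reader As New XmlTextReader(New StringReader(xmlText))
    Try
      While reader.Read()
        If reader.NodeType = XmlNodeType.Element AndAlso reader.HasAttributes Then
          Console.WriteLine("Element: " & reader.Name & _
                            " (depth " & reader.Depth.ToString() & ")")
          ' Item is the default property: look up by position or by name
          Console.WriteLine("  Item(0): " & reader.Item(0))
          Console.WriteLine("  Item(""genre""): " & reader.Item("genre"))
        ElseIf reader.HasValue Then
          ' Text nodes carry the element's character data in Value
          Console.WriteLine("Value: " & reader.Value)
        End If
      End While
    Finally
      reader.Close()
    End Try
  End Sub
End Module
```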

The NodeType property returns the type of the current node as a member of the XmlNodeType enumeration:

 Dim nodeType As XmlNodeType = reader.NodeType

The XmlNodeType enumeration members include Attribute, CDATA, Comment, Document, Element, Whitespace, and so on. These represent the node types of an XML document.

In Listing 6-3, you read a document's nodes one by one and count them by type. Once the reading and counting are done, you display on the console how many declarations, processing instructions, document type nodes, comments, elements, attributes, text nodes, and whitespace nodes the document has. To find out a node's type, you compare the value of the XmlReader.NodeType property with the XmlNodeType members.

Listing 6-3: Getting Node Information

start example
 Sub Main()
   Dim DecCounter As Integer = 0
   Dim PICounter As Integer = 0
   Dim DocCounter As Integer = 0
   Dim CommentCounter As Integer = 0
   Dim ElementCounter As Integer = 0
   Dim AttributeCounter As Integer = 0
   Dim TextCounter As Integer = 0
   Dim WhitespaceCounter As Integer = 0
   Dim reader As XmlTextReader = New XmlTextReader("C:\books.xml")
   While reader.Read()
     Dim nodetype As XmlNodeType = reader.NodeType
     ' Visual Basic never falls through between Case clauses,
     ' so no Exit Select is needed
     Select Case nodetype
       Case XmlNodeType.XmlDeclaration
         DecCounter += 1
       Case XmlNodeType.ProcessingInstruction
         PICounter += 1
       Case XmlNodeType.DocumentType
         DocCounter += 1
       Case XmlNodeType.Comment
         CommentCounter += 1
       Case XmlNodeType.Element
         ElementCounter += 1
         ' Attributes are not returned by Read, so count them per element
         If reader.HasAttributes Then
           AttributeCounter += reader.AttributeCount
         End If
       Case XmlNodeType.Text
         TextCounter += 1
       Case XmlNodeType.Whitespace
         WhitespaceCounter += 1
     End Select
   End While
   reader.Close()
   ' Print the info
   Console.WriteLine("White Spaces:" & WhitespaceCounter.ToString())
   Console.WriteLine("Process Instructions:" & PICounter.ToString())
   Console.WriteLine("Declaration:" & DecCounter.ToString())
   Console.WriteLine("Document Type:" & DocCounter.ToString())
   Console.WriteLine("Comments:" & CommentCounter.ToString())
   Console.WriteLine("Elements:" & ElementCounter.ToString())
   Console.WriteLine("Attributes:" & AttributeCounter.ToString())
   Console.WriteLine("Text:" & TextCounter.ToString())
 End Sub
end example

The Case clauses can test for values such as XmlNodeType.XmlDeclaration, XmlNodeType.ProcessingInstruction, XmlNodeType.DocumentType, XmlNodeType.Comment, XmlNodeType.Element, XmlNodeType.Text, XmlNodeType.Whitespace, and so on.

The XmlNodeType enumeration specifies the type of node. Table 6-1 describes its members.

Table 6-1: The XmlNodeType Enumeration's Members




Member                    Description
Attribute                 Attribute node
CDATA                     CDATA section
Comment                   Comment node
Document                  Document object
DocumentFragment          Document fragment
DocumentType              The DTD, indicated by the <!DOCTYPE> tag
Element                   Element node
EndElement                End of element
EndEntity                 End of an entity
Entity                    Entity declaration
EntityReference           Reference to an entity
None                      Returned if the Read method has not been called yet
Notation                  A notation in the document type
ProcessingInstruction     Represents a processing instruction (PI) node
SignificantWhitespace     Represents whitespace between markup in a mixed content model
Text                      Represents the text content of an element
Whitespace                Represents whitespace between markup
XmlDeclaration            Represents an XML declaration node

Moving to a Content Node

You can use the MoveToContent method to move from the current node to the next content node of an XML document. A content node is a node of one of the following types: non-whitespace Text, CDATA, Element, EndElement, EntityReference, or EndEntity. If the current node is already a content node, MoveToContent leaves the reader where it is; otherwise, it skips ahead until it finds a content node. For example, if the current node is a ProcessingInstruction, XmlDeclaration, DocumentType, Comment, or whitespace node, the method skips these nodes until it finds a content node. Listing 6-4 reads books.xml and moves through its nodes using the MoveToContent method.

Listing 6-4: Using the MoveToContent Method

start example
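 Dim reader As XmlTextReader = New XmlTextReader("C:\books.xml")
 While reader.Read()
   ' Name of the node that Read landed on
   Console.WriteLine(reader.Name)
   ' Skip ahead if that node is not a content node
   reader.MoveToContent()
   Console.WriteLine(reader.Name)
 End While
 reader.Close()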
end example
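
To see the skipping behavior in isolation, consider the following minimal sketch. It assumes a made-up in-memory document (the xmlText string is illustrative only) that begins with an XML declaration and a comment, and it calls MoveToContent once on a freshly created reader.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module MoveToContentExample
  Sub Main()
    ' Hypothetical document: a declaration and a comment precede the root
    Dim xmlText As String = _
      "<?xml version='1.0'?><!-- a comment --><bookstore><book/></bookstore>"
    Dim reader As New XmlTextReader(New StringReader(xmlText))
    Try
      ' MoveToContent reads past the declaration and the comment and
      ' returns the type of the first content node it finds
      Dim foundType As XmlNodeType = reader.MoveToContent()
      Console.WriteLine(foundType.ToString() & ": " & reader.Name)
    Finally
      reader.Close()
    End Try
  End Sub
End Module
```

With this input, the reader stops on the bookstore element, so the program prints Element: bookstore.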

Getting the Attributes of a Node

The GetAttribute method is an overloaded method. You can use this method to return the value of the attribute with the specified name, index, or local name and namespace Uniform Resource Identifier (URI). You use the HasAttributes property to check whether a node has attributes, and AttributeCount returns the number of attributes on the node. The local name is the name of the current node without its prefix. For example, if <bk:book> is the name of a node, bk is the namespace prefix, the colon separates the prefix from the local name, and the local name of the <bk:book> element is book. MoveToFirstAttribute moves to the first attribute. The MoveToElement method moves to the element that contains the current attribute node (see Listing 6-5).

Listing 6-5: Getting the Attributes of a Node

start example
 Imports System
 Imports System.Xml

 Module Module1
   Sub Main()
     Dim reader As XmlTextReader = _
       New XmlTextReader("C:\books.xml")
     ' Move to the root element and, if it has attributes, to the first one
     reader.MoveToContent()
     If reader.MoveToFirstAttribute() Then
       Console.WriteLine("First Attribute Value: " & reader.Value)
       Console.WriteLine("First Attribute Name: " & reader.Name)
     End If
     While reader.Read()
       If reader.HasAttributes Then
         Console.WriteLine(reader.Name & " Attribute")
         Dim i As Integer
         Dim counter As Integer = reader.AttributeCount - 1
         For i = 0 To counter
           reader.MoveToAttribute(i)
           Console.WriteLine("Name: " & reader.Name)
         Next i
         ' Return to the element that owns the attributes
         reader.MoveToElement()
       End If
     End While
     reader.Close()
   End Sub
 End Module
end example
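
Listing 6-5 moves the reader onto each attribute. When you only need the values, GetAttribute reads them without moving the reader off the element. The sketch below assumes an invented book element with genre and year attributes; none of these names come from books.xml.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module GetAttributeExample
  Sub Main()
    ' Hypothetical sample element with two attributes
    Dim xmlText As String = "<book genre='novel' year='2001'/>"
    Dim reader As New XmlTextReader(New StringReader(xmlText))
    Try
      reader.MoveToContent()
      If reader.HasAttributes Then
        Console.WriteLine("Attribute count: " & reader.AttributeCount.ToString())
        ' Look up an attribute value by name
        Console.WriteLine("genre = " & reader.GetAttribute("genre"))
        ' Look up an attribute value by zero-based index
        Console.WriteLine("index 1 = " & reader.GetAttribute(1))
        ' The reader is still positioned on the element itself
        Console.WriteLine("Current node: " & reader.Name)
      End If
    Finally
      reader.Close()
    End Try
  End Sub
End Module
```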

You can move to attributes by using MoveToAttribute, MoveToFirstAttribute, and MoveToNextAttribute. MoveToFirstAttribute and MoveToNextAttribute move to the first and next attributes, respectively. After calling MoveToAttribute, the Name, NamespaceURI, and Prefix properties reflect the properties of the specified attribute.
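
As an alternative to indexing with MoveToAttribute, you can walk the attributes with MoveToFirstAttribute and MoveToNextAttribute, both of which return False when there is nothing to move to. The following sketch assumes a made-up element and a hypothetical namespace URI (urn:example-books) purely for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module AttributeWalkExample
  Sub Main()
    ' Hypothetical element; urn:example-books is an invented namespace URI
    Dim xmlText As String = _
      "<book xmlns:bk='urn:example-books' bk:genre='novel' year='2001'/>"
    Dim reader As New XmlTextReader(New StringReader(xmlText))
    Try
      reader.MoveToContent()
      If reader.MoveToFirstAttribute() Then
        Do
          ' While on an attribute, these properties describe that attribute
          Console.WriteLine("Name: " & reader.Name & _
                            ", Prefix: " & reader.Prefix & _
                            ", LocalName: " & reader.LocalName & _
                            ", NamespaceURI: " & reader.NamespaceURI & _
                            ", Value: " & reader.Value)
        Loop While reader.MoveToNextAttribute()
        ' Go back to the element that owns the attributes
        reader.MoveToElement()
      End If
      Console.WriteLine("Back on element: " & reader.Name)
    Finally
      reader.Close()
    End Try
  End Sub
End Module
```

Note that the namespace declaration (xmlns:bk) is itself reported as an attribute, so the loop visits it along with the other two.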

Searching for a Node

The Skip method skips the current node and its children. It's useful when you're looking for a particular node and want to skip other nodes. In Listing 6-6, you read the books.xml document through XmlTextReader and compare each node's name to look for a node with the name bookstore. You then display the name, level, and value of that node using XmlReader's Name, Depth, and Value properties.

Listing 6-6: Using the Skip Method

start example
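 Dim reader As XmlTextReader = _
   New XmlTextReader("C:\books.xml")
 ' Position the reader on the first content node
 reader.MoveToContent()
 ' Skip already advances the reader, so test EOF instead of calling
 ' Read on every pass; otherwise nodes could be missed
 While Not reader.EOF
   ' Look for an element with the name bookstore
   If reader.NodeType = XmlNodeType.Element AndAlso reader.Name = "bookstore" Then
     Console.WriteLine("Name: " & reader.Name)
     Console.WriteLine("Level of the node: " & reader.Depth.ToString())
     Console.WriteLine("Value: " & reader.Value)
     ' Step into the element and keep looking
     reader.Read()
   Else
     reader.Skip()
   End If
 End While
 reader.Close()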
end example
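
The effect of Skip on an element with children is easiest to see on a tiny document. This is a minimal sketch, assuming an invented XML string with a book element followed by a magazine element; calling Skip while positioned on book jumps over its whole subtree.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module SkipExample
  Sub Main()
    ' Hypothetical document: book has a child element, magazine follows it
    Dim xmlText As String = _
      "<bookstore><book><title>Skipped</title></book>" & _
      "<magazine>Kept</magazine></bookstore>"
    Dim reader As New XmlTextReader(New StringReader(xmlText))
    Try
      reader.MoveToContent()   ' now on the bookstore element
      reader.Read()            ' now on the book element
      ' Skip passes over book, its title child, and its end tag
      reader.Skip()
      Console.WriteLine(reader.Name)
    Finally
      reader.Close()
    End Try
  End Sub
End Module
```

After Skip returns, the reader is positioned on the node that follows the skipped subtree, so the program prints magazine.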

Closing the Document

Finally, you use Close to close the opened XML document.
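
One common way to make sure Close always runs, even if reading throws an exception, is to place it in a Finally block. The following sketch is one possible pattern rather than the only one; the in-memory XML string is made up for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module CloseExample
  Sub Main()
    ' Hypothetical one-element document
    Dim reader As New XmlTextReader(New StringReader("<bookstore/>"))
    Try
      While reader.Read()
        ' Process nodes here
      End While
    Finally
      ' Runs whether or not an exception occurred
      reader.Close()
    End Try
    ' After Close, the reader reports the Closed state
    Console.WriteLine(reader.ReadState.ToString())
  End Sub
End Module
```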

Tables 6-2 and 6-3 describe the XmlReader class properties and methods. We already discussed some of them in the previous sections.

Table 6-2: The XmlReader Class Properties




Property          Description
AttributeCount    Returns the number of attributes on the current node
BaseURI           Returns the base URI of the current node
Depth             Returns the level of the current node
EOF               Indicates whether the reader is positioned at the end of the stream
HasAttributes     Indicates whether the current node has attributes
HasValue          Indicates whether the current node can have a value
IsDefault         Indicates whether the current node is an attribute generated from the default value defined in the DTD or schema
IsEmptyElement    Indicates whether the current node is an empty element
Item              Returns the value of the attribute
LocalName         Name of the current node without the namespace prefix
Name              Name of the current node with the namespace prefix
NamespaceURI      Namespace URI of the current node
NameTable         Returns the XmlNameTable associated with this implementation
NodeType          Returns the type of node
Prefix            Returns the namespace prefix associated with the current node
ReadState         Returns the read state of the reader
Value             Returns the text value of a node
XmlLang           Returns the current xml:lang scope
XmlSpace          Returns the current xml:space scope

Table 6-3: The XmlReader Class Methods




Method                                           Description
Close                                            Closes the stream and changes ReadState to Closed
GetAttribute                                     Returns the value of an attribute
IsStartElement                                   Checks whether the current content node is a start tag
LookupNamespace                                  Resolves a namespace prefix in the current element's scope
MoveToAttribute, MoveToContent, MoveToElement    Moves to the specified attribute, the next content node, or the element that owns the current attribute
MoveToFirstAttribute, MoveToNextAttribute        Moves to the first and next attributes
Read                                             Reads a node
ReadAttributeValue                               Parses the attribute value into one or more Text and/or EntityReference node types
ReadXXX (ReadChar, ReadBoolean, ReadDate, ReadInt32, and so on)    Reads the contents of an element into the specified type including char, integer, double, string, date, and so on
ReadInnerXml                                     Reads all the content, including markup, as a string
Skip                                             Skips the current element

Applied ADO.NET: Building Data-Driven Solutions
ISBN: 1590590732
Year: 2006
Pages: 214