Accessing an XML File

   


Access and Manipulate XML Data: Access an XML file by using the Document Object Model (DOM) and an XmlReader.

The most basic thing you can do with an XML file is open it and read it to find out what the file contains. The .NET Framework offers both unstructured and structured ways to access the data within an XML file. That is, you can either treat the XML file as a simple stream of information, or you can treat it as a hierarchical structure composed of different entities, such as elements and attributes.

In this section of the chapter, you'll learn how to extract information from an XML file. I'll start by showing you how you can use the XmlReader object to move through an XML file, extracting information as you go. Then you'll see how other objects, including the XmlNode and XmlDocument objects, provide a more structured view of an XML file.

I'll work with a very simple XML file named Books.xml that represents three books that a computer book store might stock. Here's the raw XML file:

    <?xml version="1.0" encoding="UTF-8"?>
    <Books>
        <Book Pages="1046">
            <Author>Delaney, Kalen</Author>
            <Title>Inside Microsoft SQL Server 2000</Title>
            <Publisher>Microsoft Press</Publisher>
        </Book>
        <Book Pages="1000">
            <Author>Gunderloy, Mike</Author>
            <Title>ADO and ADO.NET Programming</Title>
            <Publisher>Sybex</Publisher>
        </Book>
        <Book Pages="484">
            <Author>Cooper, James W.</Author>
            <Title>Visual Basic Design Patterns</Title>
            <Publisher>Addison Wesley</Publisher>
        </Book>
    </Books>

Understanding the DOM

The Document Object Model, or DOM, is an Internet standard for representing the information contained in an HTML or XML document as a tree of nodes. Like many other Internet standards, the DOM is an official standard of the World Wide Web Consortium, better known as the W3C.

Even though a DOM standard exists, not all vendors implement the DOM in exactly the same way. The major issue is that several different standards are actually grouped together under the general name of DOM. Also, vendors pick and choose which parts of these standards to implement. The .NET Framework includes support for the DOM Level 1 Core and DOM Level 2 Core specifications, but it also extends the DOM with additional objects, methods, and properties.

NOTE

DOM Background: You can find the official DOM specifications at www.w3.org/DOM. For details of Microsoft's implementation in the .NET Framework, see the "XML Document Object Model (DOM)" topic in the .NET Framework Developer's Guide.


Structurally, an XML document is a series of nested items, including elements and attributes. Any nested structure can be transformed to an equivalent tree structure by making the outermost nested item the root of the tree, the next-in items the children of the root, and so on.

The DOM provides the standard for constructing this tree, including a classification for individual nodes and rules for which nodes can have children. Figure 2.1 shows how the Books.xml file might be represented as a tree.

Figure 2.1. XML file represented as a tree of nodes.

EXAM TIP

Attributes in the DOM: In the DOM, attributes are not represented as child nodes within the tree. Rather, attributes are considered to be properties of their parent elements. You'll see later in the chapter that this is reflected in the classes provided by the .NET Framework for reading XML files.


In its simplest form, the DOM defines an XML document as consisting of a tree of nodes. The root element in the XML file becomes the root node of the tree, and other elements become child nodes.
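
The tree shape described here can be checked with a few lines of code. The following is a minimal sketch rather than part of the chapter's exercises: it loads a cut-down version of Books.xml from a string into an XmlDocument (a class covered later in this section) and shows that child elements appear in ChildNodes, while the Pages attribute is reached through the Attributes collection. The module name DomTreeSketch and the helpers SampleXml and LoadSample are illustrative names, not part of the .NET Framework.

```vb
Imports System
Imports System.Xml

Module DomTreeSketch
    ' A cut-down Books.xml held in a string (hypothetical helper),
    ' so the sketch does not depend on a file on disk.
    Function SampleXml() As String
        Return "<Books>" & _
         "<Book Pages=""1046""><Title>Inside Microsoft SQL Server 2000</Title></Book>" & _
         "<Book Pages=""484""><Title>Visual Basic Design Patterns</Title></Book>" & _
         "</Books>"
    End Function

    ' Parse the sample into a DOM tree (hypothetical helper)
    Function LoadSample() As XmlDocument
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml())
        Return xd
    End Function

    Sub Main()
        Dim xd As XmlDocument = LoadSample()
        ' The root element of the file is the root element of the tree
        Dim xnodRoot As XmlNode = xd.DocumentElement
        Console.WriteLine(xnodRoot.Name)
        ' Each Book element is a child node of the root
        Console.WriteLine(xnodRoot.ChildNodes.Count)
        ' Title is a child node of Book, but Pages is not:
        ' it is reached through the Attributes collection
        Dim xnodBook As XmlNode = xnodRoot.FirstChild
        Console.WriteLine(xnodBook.ChildNodes.Count)
        Console.WriteLine(xnodBook.Attributes("Pages").Value)
    End Sub
End Module
```

Because the sample string contains no whitespace between tags, the child counts reflect only the elements themselves.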

Using an XmlReader Object

The XmlReader class is designed to provide forward-only, read-only access to an XML file. This class treats an XML file much as a forward-only cursor treats a result set from a database. At any given time, there is one current node within the XML file, represented by a pointer that you can move forward through the file. The class implements a Read method that advances the pointer to the next XML node and returns False when there are no more nodes to read. There are also many other members in the XmlReader class; I've listed some of these in Table 2.1.

Table 2.1. Important Members of the XmlReader Class

Member                Type      Description
--------------------  --------  ------------------------------------------------
Depth                 Property  The depth of the current node in the XML document.
EOF                   Property  A Boolean property that is True when the current node pointer is at the end of the XML file.
GetAttribute          Method    Gets the value of an attribute.
HasAttributes         Property  True when the current node contains attributes.
HasValue              Property  True when the current node can have a Value property.
IsEmptyElement        Property  True when the current node represents an empty XML element.
IsStartElement        Method    Determines whether the current node is a start tag.
Item                  Property  Indexed collection of attributes for the current node (if any).
MoveToElement         Method    Moves to the element containing the current attribute.
MoveToFirstAttribute  Method    Moves to the first attribute of the current element.
MoveToNextAttribute   Method    Moves to the next attribute.
Name                  Property  Qualified name of the current node.
NodeType              Property  Type of the current node.
Read                  Method    Reads the next node from the XML file.
Skip                  Method    Skips the children of the current element.
Value                 Property  Value of the current node.

The XmlReader class is a purely abstract class. That is, this class is marked with the MustInherit modifier; you cannot create an instance of XmlReader in your own application. Generally, you'll use the XmlTextReader class instead. The XmlTextReader class implements XmlReader for use with text streams. Step By Step 2.1 shows you how to use the XmlTextReader class.
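
Before the full walkthrough, here is a small sketch of how a few of the members in Table 2.1 fit together. It assumes a cut-down Books.xml held in a string and read through a StringReader instead of a file. The module name ReaderMembersSketch and the functions SampleXml and ReadPages are illustrative names, not framework members. The variable is typed as XmlReader but must be assigned an instance of a concrete class such as XmlTextReader.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module ReaderMembersSketch
    ' A cut-down Books.xml held in a string (hypothetical helper)
    Function SampleXml() As String
        Return "<Books>" & _
         "<Book Pages=""1046""><Title>Inside Microsoft SQL Server 2000</Title></Book>" & _
         "<Book Pages=""484""><Title>Visual Basic Design Patterns</Title></Book>" & _
         "</Books>"
    End Function

    ' Returns the Pages value of every Book element, separated by commas
    Function ReadPages(ByVal strXml As String) As String
        ' XmlReader is MustInherit, so create the concrete XmlTextReader
        Dim xr As XmlReader = New XmlTextReader(New StringReader(strXml))
        Dim strResult As String = ""
        Try
            While Not xr.EOF
                If xr.IsStartElement("Book") Then
                    If strResult <> "" Then strResult &= ","
                    strResult &= xr.GetAttribute("Pages")
                    ' Skip jumps past the children and end tag of this
                    ' Book, leaving the reader on the node that follows
                    xr.Skip()
                Else
                    ' Any other node: advance one node
                    xr.Read()
                End If
            End While
        Finally
            xr.Close()
        End Try
        Return strResult
    End Function

    Sub Main()
        Console.WriteLine(ReadPages(SampleXml()))
    End Sub
End Module
```

Note that Skip already moves the reader forward, which is why the loop calls Read only when it has not called Skip; calling both would step over a node.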

STEP BY STEP

2.1 Using the XmlTextReader Class

  1. Create a new Visual Basic .NET Windows application. Name the application 310C02.

  2. Right-click on the project node in Solution Explorer and select Add, Add New Item.

  3. Select the Local Project Items node in the Categories treeview. Select the XML File template. Name the new file Books.xml and click OK.

  4. Modify the code for the Books.xml file as follows:

     <?xml version="1.0" encoding="UTF-8"?>
     <Books>
         <Book Pages="1046">
             <Author>Delaney, Kalen</Author>
             <Title>Inside Microsoft SQL Server 2000</Title>
             <Publisher>Microsoft Press</Publisher>
         </Book>
         <Book Pages="1000">
             <Author>Gunderloy, Mike</Author>
             <Title>ADO and ADO.NET Programming</Title>
             <Publisher>Sybex</Publisher>
         </Book>
         <Book Pages="484">
             <Author>Cooper, James W.</Author>
             <Title>Visual Basic Design Patterns</Title>
             <Publisher>Addison Wesley</Publisher>
         </Book>
     </Books>

  5. Add a new form to the project. Name the new form StepByStep2-1.vb.

  6. Add a Button control named btnReadXML and a ListBox control named lbNodes to the form.

  7. Double-click the Button control to open the form's module. Add this line of code at the top of the module:

     Imports System.Xml 
  8. Add code to handle the Button's Click event:

     Private Sub btnReadXML_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) Handles btnReadXML.Click
         Dim intI As Integer
         Dim strNode As String
         ' Create a new XmlTextReader on the file
         Dim xtr As XmlTextReader = _
          New XmlTextReader("..\Books.xml")
         ' Walk through the entire XML file
         Do While xtr.Read
             strNode = ""
             ' Indent one space per level of nesting
             For intI = 1 To xtr.Depth
                 strNode &= " "
             Next
             strNode = strNode & xtr.Name & " "
             strNode &= xtr.NodeType.ToString
             ' Only some node types carry a value
             If xtr.HasValue Then
                 strNode = strNode & ": " & xtr.Value
             End If
             lbNodes.Items.Add(strNode)
         Loop
         ' Clean up
         xtr.Close()
     End Sub

  9. Set the form as the startup form for the project.

  10. Run the project. Click the button. You'll see a schematic representation of the XML file, as shown in Figure 2.2.

    Figure 2.2. An XML file translated into schematic form by an XmlTextReader object.

As you can see in Step By Step 2.1, the DOM includes nodes for everything in the XML file, including the XML declaration and any whitespace (such as the line feeds and carriage returns that separate lines of the file). On the other hand, the node tree doesn't include XML attributes. But the DOM and the XmlTextReader are flexible enough that you can customize their work as you like. Step By Step 2.2 shows an example in which the code displays only elements, text, and attributes.
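
One simple customization is to tell the reader to ignore whitespace. The sketch below assumes a small, indented XML string rather than the Books.xml file, and counts the nodes returned by Read under two settings of the WhitespaceHandling property. The module name WhitespaceSketch and the functions SampleXml and CountNodes are illustrative names.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module WhitespaceSketch
    ' An indented fragment, so that whitespace separates the tags
    ' (hypothetical helper)
    Function SampleXml() As String
        Dim nl As String = Environment.NewLine
        Return "<Books>" & nl & _
         "  <Book Pages=""484"">" & nl & _
         "    <Title>Visual Basic Design Patterns</Title>" & nl & _
         "  </Book>" & nl & _
         "</Books>"
    End Function

    ' Counts every node that Read returns under the given setting
    Function CountNodes(ByVal strXml As String, _
     ByVal handling As WhitespaceHandling) As Integer
        Dim xtr As New XmlTextReader(New StringReader(strXml))
        xtr.WhitespaceHandling = handling
        Dim intCount As Integer = 0
        Try
            While xtr.Read()
                intCount += 1
            End While
        Finally
            xtr.Close()
        End Try
        Return intCount
    End Function

    Sub Main()
        ' The default setting reports the whitespace between tags as nodes
        Console.WriteLine(CountNodes(SampleXml(), WhitespaceHandling.All))
        ' None suppresses them, leaving elements, text, and end tags
        Console.WriteLine(CountNodes(SampleXml(), WhitespaceHandling.None))
    End Sub
End Module
```

The difference between the two counts is the number of whitespace nodes the reader would otherwise hand back.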

STEP BY STEP

2.2 Using the XmlTextReader Class to Read Selected XML Entities

  1. Add a new form to the project. Name the new form StepByStep2-2.vb.

  2. Add a Button control named btnReadXML and a ListBox control named lbNodes to the form.

  3. Double-click the Button control to open the form's module. Add this line of code at the top of the module:

     Imports System.Xml 
  4. Add code to handle the Button's Click event:

     Private Sub btnReadXML_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) Handles btnReadXML.Click
         Dim intI As Integer
         Dim strNode As String
         ' Create a new XmlTextReader on the file
         Dim xtr As XmlTextReader = _
          New XmlTextReader("..\Books.xml")
         ' Walk through the entire XML file
         Do While xtr.Read
             ' Only process Element and Text nodes
             If (xtr.NodeType = XmlNodeType.Element) Or _
              (xtr.NodeType = XmlNodeType.Text) Then
                 strNode = ""
                 For intI = 1 To xtr.Depth
                     strNode &= " "
                 Next
                 strNode = strNode & xtr.Name & " "
                 strNode &= xtr.NodeType.ToString
                 If xtr.HasValue Then
                     strNode = strNode & ": " & xtr.Value
                 End If
                 lbNodes.Items.Add(strNode)
                 ' Now add the attributes, if any
                 If xtr.HasAttributes Then
                     ' From an element, the first call moves
                     ' to the first attribute
                     While xtr.MoveToNextAttribute
                         strNode = ""
                         For intI = 1 To xtr.Depth
                             strNode &= " "
                         Next
                         strNode = strNode & xtr.Name & " "
                         strNode &= xtr.NodeType.ToString
                         If xtr.HasValue Then
                             strNode = strNode & ": " & _
                              xtr.Value
                         End If
                         lbNodes.Items.Add(strNode)
                     End While
                 End If
             End If
         Loop
         ' Clean up
         xtr.Close()
     End Sub

  5. Set the form as the startup form for the project.

  6. Run the project. Click the button. You'll see a schematic representation of the elements and attributes in the XML file, as shown in Figure 2.3.

    Figure 2.3. Selected entities from an XML file translated into schematic form by an XmlTextReader object.

Note that although the DOM does not consider attributes to be nodes, Microsoft has provided the MoveToNextAttribute method to treat them as nodes. Alternatively, you can retrieve attributes by using the Item property of the XmlTextReader. If the current node represents an element in the XML file, this code will retrieve the value of the first attribute of the element:

 xtr.Item(0)

This code will retrieve the value of an attribute named Pages:

 xtr.Item("Pages")
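
The sketch below puts the attribute lookups side by side. It assumes a one-book XML string in place of the Books.xml file, where the attribute is named Pages; the module name AttributeLookupSketch and the function LookupPages are illustrative names. It also shows what happens when the attribute name is misspelled: the lookup returns Nothing rather than raising an error, so a wrong name can go unnoticed.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module AttributeLookupSketch
    ' Returns the results of four lookups on the first Book element:
    ' by index, by name, through GetAttribute, and by a misspelled name
    Function LookupPages(ByVal strXml As String) As String()
        Dim xtr As New XmlTextReader(New StringReader(strXml))
        Dim strResults() As String = New String() {Nothing, Nothing, Nothing, Nothing}
        Try
            While xtr.Read()
                If xtr.NodeType = XmlNodeType.Element AndAlso _
                 xtr.Name = "Book" Then
                    strResults(0) = xtr.Item(0)
                    strResults(1) = xtr.Item("Pages")
                    strResults(2) = xtr.GetAttribute("Pages")
                    ' No attribute has this name, so this is Nothing
                    strResults(3) = xtr.Item("Page")
                    Exit While
                End If
            End While
        Finally
            xtr.Close()
        End Try
        Return strResults
    End Function

    Sub Main()
        Dim strXml As String = _
         "<Books><Book Pages=""484""><Title>Visual Basic Design Patterns</Title></Book></Books>"
        Dim strResults() As String = LookupPages(strXml)
        Console.WriteLine(strResults(0))
        Console.WriteLine(strResults(1))
        Console.WriteLine(strResults(2))
        Console.WriteLine(strResults(3) Is Nothing)
    End Sub
End Module
```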

The XmlNode Class

The individual items in the tree representation of an XML file are called nodes. As you've seen in Step By Steps 2.1 and 2.2, many different entities within the XML file can be represented by nodes: elements, attributes, whitespace, end tags, and so on. The DOM distinguishes these different types of nodes by assigning a node type to each one. In the .NET Framework, the possible node types are listed by the XmlNodeType enumeration. Table 2.2 lists the members of this enumeration.

Table 2.2. Members of the XmlNodeType Enumeration

Member                 Represents
---------------------  ------------------------------------------------
Attribute              An XML attribute
CDATA                  An XML CDATA section
Comment                An XML comment
Document               The document object itself (that is, the root of the tree representation of the XML)
DocumentFragment       The root of a subsection of an XML document
DocumentType           A Document Type Definition (DTD) reference
Element                An XML element
EndElement             The closing tag of an XML element
EndEntity              The end of an included entity
Entity                 An XML entity declaration
EntityReference        A reference to an entity
None                   An XmlReader that has not been initialized
Notation               An XML notation
ProcessingInstruction  An XML processing instruction
SignificantWhitespace  Whitespace that must be preserved to re-create the original XML document
Text                   The text content of an attribute, element, or other node
Whitespace             Space between actual XML markup items
XmlDeclaration         The XML declaration
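
To see several of these node types in sequence, the following sketch records the NodeType reported by an XmlTextReader before the first call to Read and after each subsequent call. It assumes a short XML string with a declaration and a comment, not the Books.xml file; the module name NodeTypeTraceSketch and the function TraceNodeTypes are illustrative names.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module NodeTypeTraceSketch
    ' Returns the node types seen by the reader, separated by spaces
    Function TraceNodeTypes(ByVal strXml As String) As String
        Dim xtr As New XmlTextReader(New StringReader(strXml))
        ' Before the first Read, the reader is not positioned on a node
        Dim strTrace As String = xtr.NodeType.ToString()
        Try
            While xtr.Read()
                strTrace = strTrace & " " & xtr.NodeType.ToString()
            End While
        Finally
            xtr.Close()
        End Try
        Return strTrace
    End Function

    Sub Main()
        Dim strXml As String = _
         "<?xml version=""1.0""?><Books><!-- stock --><Book>VB</Book></Books>"
        Console.WriteLine(TraceNodeTypes(strXml))
    End Sub
End Module
```

The trace begins with None, followed by one entry per node in document order, including an EndElement entry for each closing tag.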

The code you've seen so far in this chapter deals with nodes as part of a stream of information returned by the XmlTextReader object. But the .NET Framework also includes another class, XmlNode, that can be used to represent an individual node from the DOM representation of an XML document. If you retrieve an XmlNode object that represents a particular portion of an XML document, you can alter the properties of the object and then write the changes back to the original file. The DOM provides two-way access to the underlying XML in this case.

NOTE

Specialized Node Classes: In addition to XmlNode, the System.Xml namespace also contains a set of classes that represent particular types of nodes: XmlAttribute, XmlComment, XmlElement, and so on. These classes all inherit from the XmlNode class.


The XmlNode class has a rich interface of properties and methods. You can retrieve or set information about the entity represented by an XmlNode object, or you can use its methods to navigate the DOM. Table 2.3 shows the important members of the XmlNode class.

Table 2.3. Important Members of the XmlNode Class

Member            Type      Description
----------------  --------  ------------------------------------------------
AppendChild       Method    Adds a new child node to the end of this node's list of children.
Attributes        Property  Returns the attributes of the node as an XmlAttributeCollection.
ChildNodes        Property  Returns all child nodes of this node.
CloneNode         Method    Creates a duplicate of this node.
FirstChild        Property  Returns the first child node of this node.
HasChildNodes     Property  True if this node has any children.
InnerText         Property  The concatenated values of the node and all of its children.
InnerXml          Property  The markup representing only the children of this node.
InsertAfter       Method    Inserts a new child node immediately after a specified child of this node.
InsertBefore      Method    Inserts a new child node immediately before a specified child of this node.
LastChild         Property  Returns the last child node of this node.
Name              Property  The name of the node.
NextSibling       Property  Returns the node that immediately follows this node under the same parent.
NodeType          Property  The type of this node.
OuterXml          Property  The markup representing this node and its children.
OwnerDocument     Property  The XmlDocument object that contains this node.
ParentNode        Property  Returns the parent of this node.
PrependChild      Method    Adds a new child node to the beginning of this node's list of children.
PreviousSibling   Property  Returns the node that immediately precedes this node under the same parent.
RemoveAll         Method    Removes all children and attributes of this node.
RemoveChild       Method    Removes a specified child of this node.
ReplaceChild      Method    Replaces a child of this node with a new node.
SelectNodes       Method    Selects a group of nodes matching an XPath expression.
SelectSingleNode  Method    Selects the first node matching an XPath expression.
WriteContentTo    Method    Writes all children of this node to an XmlWriter object.
WriteTo           Method    Writes this node to an XmlWriter.
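
As a brief illustration of the two XPath members in this table, the sketch below selects nodes from a cut-down Books.xml held in a string. The XPath expressions, the module name XPathSketch, and the functions SampleXml, AllTitles, and TitleByPages are illustrative choices made for this sketch.

```vb
Imports System
Imports System.Xml

Module XPathSketch
    ' A cut-down Books.xml held in a string (hypothetical helper)
    Function SampleXml() As String
        Return "<Books>" & _
         "<Book Pages=""1046""><Title>Inside Microsoft SQL Server 2000</Title></Book>" & _
         "<Book Pages=""484""><Title>Visual Basic Design Patterns</Title></Book>" & _
         "</Books>"
    End Function

    ' Joins the text of every Title element with "; "
    Function AllTitles(ByVal xd As XmlDocument) As String
        Dim strResult As String = ""
        Dim xnod As XmlNode
        ' SelectNodes returns every node that matches the expression
        For Each xnod In xd.SelectNodes("/Books/Book/Title")
            If strResult <> "" Then strResult &= "; "
            strResult &= xnod.InnerText
        Next
        Return strResult
    End Function

    ' Returns the title of the first Book with the given Pages value,
    ' or Nothing when no Book matches
    Function TitleByPages(ByVal xd As XmlDocument, _
     ByVal strPages As String) As String
        Dim xnod As XmlNode = xd.SelectSingleNode( _
         "/Books/Book[@Pages='" & strPages & "']/Title")
        If xnod Is Nothing Then
            Return Nothing
        End If
        Return xnod.InnerText
    End Function

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml())
        Console.WriteLine(AllTitles(xd))
        Console.WriteLine(TitleByPages(xd, "484"))
    End Sub
End Module
```

SelectSingleNode returns Nothing when nothing matches, so the result should be checked before its members are used.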

The XmlDocument Class

You can't directly create an XmlNode object that represents an entity from a particular XML document. Instead, you can retrieve XmlNode objects from an XmlDocument object. The XmlDocument object represents an entire XML document. Step by Step 2.3 shows how you can use the XmlNode and XmlDocument objects to navigate through the DOM representation of an XML document.

STEP BY STEP

2.3 Using the XmlDocument and XmlNode Classes

  1. Add a new form to the project. Name the new form StepByStep2-3.vb.

  2. Add a Button control named btnReadXML and a ListBox control named lbNodes to the form.

  3. Double-click the Button control to open the form's module. Add this line of code at the top of the module:

     Imports System.Xml 
  4. Add code to handle the Button's Click event:

     Private Sub btnReadXML_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) Handles btnReadXML.Click
         ' Create a new XmlTextReader on the file
         Dim xtr As XmlTextReader = _
          New XmlTextReader("..\Books.xml")
         ' Load the XML file to an XmlDocument,
         ' ignoring whitespace nodes
         xtr.WhitespaceHandling = WhitespaceHandling.None
         Dim xd As XmlDocument = New XmlDocument()
         xd.Load(xtr)
         ' Get the document root
         Dim xnodRoot As XmlNode = xd.DocumentElement
         ' Walk the tree and display it
         Dim xnodWorking As XmlNode
         If xnodRoot.HasChildNodes Then
             xnodWorking = xnodRoot.FirstChild
             While Not IsNothing(xnodWorking)
                 AddChildren(xnodWorking, 0)
                 xnodWorking = xnodWorking.NextSibling
             End While
         End If
         ' Clean up
         xtr.Close()
     End Sub

     Private Sub AddChildren(ByVal xnod As XmlNode, _
      ByVal Depth As Integer)
         ' Add this node to the listbox
         Dim strNode As String
         Dim intI As Integer
         Dim intJ As Integer
         Dim atts As XmlAttributeCollection
         ' Only process Text and Element nodes
         If (xnod.NodeType = XmlNodeType.Element) Or _
          (xnod.NodeType = XmlNodeType.Text) Then
             strNode = ""
             For intI = 1 To Depth
                 strNode &= " "
             Next
             strNode = strNode & xnod.Name & " "
             strNode &= xnod.NodeType.ToString
             strNode = strNode & ": " & xnod.Value
             lbNodes.Items.Add(strNode)
             ' Now add the attributes, if any
             ' (Attributes is Nothing for non-element nodes)
             atts = xnod.Attributes
             If Not atts Is Nothing Then
                 For intJ = 0 To atts.Count - 1
                     strNode = ""
                     For intI = 1 To Depth + 1
                         strNode &= " "
                     Next
                     strNode = strNode & _
                      atts(intJ).Name & " "
                     strNode &= atts(intJ).NodeType.ToString
                     strNode = strNode & ": " & _
                      atts(intJ).Value
                     lbNodes.Items.Add(strNode)
                 Next
             End If
             ' And recursively walk
             ' the children of this node
             Dim xnodworking As XmlNode
             If xnod.HasChildNodes Then
                 xnodworking = xnod.FirstChild
                 While Not IsNothing(xnodworking)
                     AddChildren(xnodworking, Depth + 1)
                     xnodworking = xnodworking.NextSibling
                 End While
             End If
         End If
     End Sub

  5. Set the form as the startup form for the project.

  6. Run the project. Click the button. You'll see a schematic representation of the elements and attributes in the XML file.

Step By Step 2.3 uses recursion to visit all the nodes in the XML file. That is, it starts at the root node of the document (returned by the DocumentElement property of the XmlDocument object) and visits each child of that node in turn. For each child, it displays the desired information, and then visits each child of that node in turn, and so on.
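
The same recursive pattern can be reduced to its essentials. The sketch below is not the Step By Step code: it counts Element nodes instead of displaying them, and it reads a cut-down Books.xml from a string. The module name RecursionSketch and the functions SampleXml and CountElements are illustrative names.

```vb
Imports System
Imports System.Xml

Module RecursionSketch
    ' A cut-down Books.xml held in a string (hypothetical helper)
    Function SampleXml() As String
        Return "<Books>" & _
         "<Book Pages=""1046""><Title>Inside Microsoft SQL Server 2000</Title></Book>" & _
         "<Book Pages=""484""><Title>Visual Basic Design Patterns</Title></Book>" & _
         "</Books>"
    End Function

    ' Counts the Element nodes in the subtree rooted at xnod
    Function CountElements(ByVal xnod As XmlNode) As Integer
        Dim intTotal As Integer = 0
        ' Count this node itself if it is an element
        If xnod.NodeType = XmlNodeType.Element Then
            intTotal = 1
        End If
        ' Then recurse into each child in turn
        Dim xnodChild As XmlNode = xnod.FirstChild
        While Not xnodChild Is Nothing
            intTotal += CountElements(xnodChild)
            xnodChild = xnodChild.NextSibling
        End While
        Return intTotal
    End Function

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml())
        ' One Books element, two Book elements, two Title elements
        Console.WriteLine(CountElements(xd.DocumentElement))
    End Sub
End Module
```

The recursion ends naturally at leaf nodes, where FirstChild is Nothing and the loop body never runs.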

In addition to the properties used in this Step By Step, the XmlDocument class includes a number of other useful members. Table 2.4 lists the most important of these.

Table 2.4. Important Members of the XmlDocument Class

Member              Type      Description
------------------  --------  ------------------------------------------------
CreateAttribute     Method    Creates an attribute node.
CreateElement       Method    Creates an element node.
CreateNode          Method    Creates an XmlNode object.
DocumentElement     Property  Returns the root XmlNode for this document.
DocumentType        Property  Returns the node containing the DTD declaration for this document, if it has one.
ImportNode          Method    Imports a node from another XML document.
Load                Method    Loads an XML document into the XmlDocument.
LoadXml             Method    Loads the XmlDocument from a string of XML data.
NodeChanged         Event     Fires after the value of a node has been changed.
NodeChanging        Event     Fires when the value of a node is about to be changed.
NodeInserted        Event     Fires when a new node has been inserted.
NodeInserting       Event     Fires when a new node is about to be inserted.
NodeRemoved         Event     Fires when a node has been removed.
NodeRemoving        Event     Fires when a node is about to be removed.
PreserveWhitespace  Property  True if whitespace in the document should be preserved when loading or saving the XML.
Save                Method    Saves the XmlDocument as a file or stream.
WriteTo             Method    Saves the XmlDocument to an XmlWriter.
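
As a rough illustration of how the creation members and Save work together, the sketch below adds a Book element to a document and saves the result to a string. It assumes a one-book XML string in place of the Books.xml file and saves to a StringWriter so that nothing is written to disk. The module name DocumentEditSketch and the functions SampleXml and AddBookAndSave are illustrative names.

```vb
Imports System
Imports System.IO
Imports System.Xml

Module DocumentEditSketch
    ' A one-book document held in a string (hypothetical helper)
    Function SampleXml() As String
        Return "<Books>" & _
         "<Book Pages=""484""><Title>Visual Basic Design Patterns</Title></Book>" & _
         "</Books>"
    End Function

    ' Appends a new Book element and returns the saved XML text
    Function AddBookAndSave(ByVal strXml As String, _
     ByVal strTitle As String, ByVal strPages As String) As String
        Dim xd As New XmlDocument()
        xd.LoadXml(strXml)
        ' Nodes are created by the document that will own them
        Dim xeBook As XmlElement = xd.CreateElement("Book")
        Dim xaPages As XmlAttribute = xd.CreateAttribute("Pages")
        xaPages.Value = strPages
        xeBook.Attributes.Append(xaPages)
        Dim xeTitle As XmlElement = xd.CreateElement("Title")
        xeTitle.InnerText = strTitle
        xeBook.AppendChild(xeTitle)
        ' A created node is not part of the tree until it is attached
        xd.DocumentElement.AppendChild(xeBook)
        ' Save accepts a file name, a Stream, a TextWriter, or an XmlWriter
        Dim sw As New StringWriter()
        xd.Save(sw)
        Return sw.ToString()
    End Function

    Sub Main()
        Console.WriteLine(AddBookAndSave(SampleXml(), _
         "ADO and ADO.NET Programming", "1000"))
    End Sub
End Module
```

Passing a file name to Save instead of the StringWriter would write the modified document back to disk.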

REVIEW BREAK

  • The Document Object Model (DOM) is a W3C standard for representing the information contained in an HTML or XML document as a tree of nodes.

  • The XmlReader class defines an interface for reading XML documents. The XmlTextReader class inherits from the XmlReader class to read XML documents from streams.

  • The XmlNode object can be used to represent a single node in the DOM.

  • The XmlDocument object represents an entire XML document.


   
MCAD/MCSD Training Guide (70-310): Developing XML Web Services and Server Components with Visual Basic(R) .NET and the .NET Framework
ISBN: 0789728206
Year: 2002
Pages: 166
