Section 5.10. Use the New XPathDocument and XPathNavigator | Visual Basic 2005: A Developer's Notebook

5.10. Use the New XPathDocument and XPathNavigator

.NET provides a range of options for dealing with XML in the System.Xml namespaces. One common choice is XmlDocument, which lets you navigate in-memory XML as a collection of node objects. For more efficient performance, the XmlWriter and XmlReader classes offer a streamlined way to read and write a stream of XML. Unfortunately, neither solution is perfect. The XmlDocument consumes too much memory, and navigating its structure requires too much code. Furthermore, because the XmlDocument is based on a third-party standard (the XML DOM, or document object model), it's difficult to improve it without breaking compatibility. On the other hand, the XmlWriter and XmlReader are too restrictive, forcing you to access information linearly from start to finish. They also make it prohibitively difficult for a developer to provide an XML interface to non-XML data.

Note: Talk about an improvement! The revamped XPathDocument sets a new standard for XML parsing in .NET.

.NET 2.0 proposes a solution with the System.Xml.XPath.XPathDocument. The XPathDocument is a cursor-based XML reader that aims to become the only XML interface you need to use. It gives you the freedom to move to any position in a document, and it provides blistering speed when used with other XML standards such as XQuery, XPath, XSLT, and XML Schema validation.

5.10.1. How do I do that?

To use an XPathDocument, you begin by loading the document from a stream, XmlReader, or URI (which can include a file path or an Internet address). To load the content, you can use the Load( ) method or a constructor argument; they both work in the same way. (The Load( ) method dates from the prerelease builds; in the released .NET 2.0, the XPathDocument accepts its source only through the constructor.) In this example, the XPathDocument is filled with the content from a local file:

Dim Doc As New XPathDocument("c:\MyDocument.xml")
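The same constructor also accepts a stream or an XmlReader. The following is a minimal sketch of both options, assuming a small inline XML string (SampleXml, a made-up stand-in for a real file); the module and function names are hypothetical and chosen only for illustration:

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.XPath

Module LoadSketch

    ' Hypothetical inline XML, standing in for a file on disk.
    Private Const SampleXml As String = _
        "<Products><Product><ModelName>Edible Tape</ModelName></Product></Products>"

    ' Load from a stream (here, a MemoryStream over the UTF-8 bytes).
    Function LoadFromStream( ) As XPathDocument
        Using Source As New MemoryStream(Encoding.UTF8.GetBytes(SampleXml))
            ' The constructor reads the whole document, so the stream
            ' can be closed as soon as it returns.
            Return New XPathDocument(Source)
        End Using
    End Function

    ' Load from an XmlReader, which lets you apply XmlReaderSettings first.
    Function LoadFromXmlReader( ) As XPathDocument
        Dim Settings As New XmlReaderSettings( )
        Settings.IgnoreComments = True
        Using Reader As XmlReader = _
          XmlReader.Create(New StringReader(SampleXml), Settings)
            Return New XPathDocument(Reader)
        End Using
    End Function

    Sub Main( )
        Dim Navigator As XPathNavigator = LoadFromXmlReader( ).CreateNavigator( )
        Console.WriteLine( _
          Navigator.SelectSingleNode("/Products/Product/ModelName").Value)
    End Sub

End Module
```

Going through an XmlReader is the more flexible route, because any XmlReaderSettings options are applied before the XPathDocument ever sees the content.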

To actually move around an XPathDocument, you need to create an XPathNavigator by calling the CreateNavigator( ) method.

Dim Navigator As XPathNavigator = Doc.CreateNavigator( )

The XPathNavigator includes a generous group of methods for navigating the structure of the XML document. Some of the methods include:

MoveToRoot( ): Jumps to the root node, the top of the document that contains the document element and everything inside it.
MoveToId( ): Moves to the element that has a specific ID, as identified by an attribute that the document's DTD declares to be of type ID.
MoveToNext( ): Moves to the next node at the same level (technically called a sibling).
MoveToPrevious( ): Moves to the previous node at the same level (technically called a sibling).
MoveToFirstChild( ): Moves down a level to the first node contained by the current node.
MoveToParent( ): Moves up a level to the parent that contains the current node.

Once you're positioned on an element, you can read the element name from the Name property. You can retrieve the contained text content from the Value property.

Now that you've learned this much, it's worth trying a basic example. In it, we'll use an XML document that contains a product catalog based on Microsoft's ASP.NET Commerce Starter Kit. This XML file (which is available with the downloadable content for this chapter) has the structure shown in Example 5-10.

Example 5-10. Sample XML for a product catalog

<?xml version="1.0" standalone="yes"?>
<Products>
  <Product>
    <ProductID>356</ProductID>
    <ModelName>Edible Tape</ModelName>
    <ModelNumber>STKY1</ModelNumber>
    <UnitCost>3.99</UnitCost>
    <CategoryName>General</CategoryName>
  </Product>
  <Product>
    <ProductID>357</ProductID>
    <ModelName>Escape Vehicle (Air)</ModelName>
    <ModelNumber>P38</ModelNumber>
    <UnitCost>2.99</UnitCost>
    <CategoryName>Travel</CategoryName>
  </Product>
  ...
</Products>

Example 5-11 loads this document, creates an XPathNavigator, and moves through the nodes, looking for the <ModelName> element for each <Product>. When that element is found, its value is displayed.

Example 5-11. Navigating an XML document with XPathNavigator

Imports System.Xml.XPath
Imports System.Xml

Module XPathNavigatorTest

    Sub Main( )
        ' Load the document.
        Dim Doc As New XPathDocument( _
          My.Computer.FileSystem.CurrentDirectory & _
          "\ProductList.xml")

        ' Navigate the document with an XPathNavigator.
        Dim Navigator As XPathNavigator = Doc.CreateNavigator( )

        ' Move to the root <Products> element.
        Navigator.MoveToFirstChild( )

        ' Move to the first contained <Product> element.
        Navigator.MoveToFirstChild( )

        ' Loop through all the <Product> elements.
        Do
            ' Search for the <ModelName> element inside <Product>
            ' and display its value.
            Navigator.MoveToFirstChild( )
            Do
                If Navigator.Name = "ModelName" Then
                    Console.WriteLine(Navigator.Value)
                End If
            Loop While Navigator.MoveToNext( )

            ' Move back to the <Product> element.
            Navigator.MoveToParent( )
        Loop While Navigator.MoveToNext( )
    End Sub

End Module

When you run this code, you'll see a display with a list of model names for all the products.

Interestingly, the XPathNavigator also provides strong typing for data values. Instead of retrieving the current value as a string using the Value property, you can use one of the properties that automatically converts the value to another data type. Supported properties include:

ValueAsBoolean
ValueAsDateTime
ValueAsDouble
ValueAsInt
ValueAsLong

To try this out, you can rewrite the loop in Example 5-11 so that it converts the price to a double value and then displays a total with added sales tax:

Do
    If Navigator.Name = "ModelName" Then
        Console.WriteLine(Navigator.Value)
    ElseIf Navigator.Name = "UnitCost" Then
        Dim Price As Double = Navigator.ValueAsDouble * 1.15
        Console.WriteLine(vbTab & "Total with tax: " & Math.Round(Price, 2))
    End If
Loop While Navigator.MoveToNext( )

5.10.2. What about...

...other ways to search an XML document with the XPathNavigator? To simplify life, you can select a portion of the XML document to work with in an XPathNavigator. To select this portion, you use the Select( ) or SelectSingleNode( ) methods of the XPathNavigator class. Both of these methods require an XPath expression that identifies the nodes you want to retrieve. (For more information about the XPath standard, see the "Introducing XPath" sidebar.)

For example, the following code selects the <ModelName> element for every product that's in the Tools category:

' Use an XPath expression to get just the nodes that interest you
' (in this case, all product names in the Tools category).
Dim XPathIterator As XPathNodeIterator
XPathIterator = Navigator.Select( _
  "/Products/Product/ModelName[../CategoryName='Tools']")

Do While (XPathIterator.MoveNext( ))
    ' XPathIterator.Current is an XPathNavigator object pointed at the
    ' current node.
    Console.WriteLine(XPathIterator.Current.Value)
Loop
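When you expect at most one match, SelectSingleNode( ) saves you the iterator. Here is a rough sketch, assuming a two-product inline catalog shaped like Example 5-10; the FindModelName function and the SampleXml constant are hypothetical names used only for this illustration:

```vb
Imports System
Imports System.IO
Imports System.Xml.XPath

Module SelectSingleNodeSketch

    ' Hypothetical two-product catalog with the same shape as Example 5-10.
    Private Const SampleXml As String = _
        "<Products>" & _
        "<Product><ProductID>356</ProductID>" & _
        "<ModelName>Edible Tape</ModelName></Product>" & _
        "<Product><ProductID>357</ProductID>" & _
        "<ModelName>Escape Vehicle (Air)</ModelName></Product>" & _
        "</Products>"

    ' Returns the model name for a product ID, or Nothing if no product matches.
    Function FindModelName(ByVal ProductID As Integer) As String
        Dim Doc As New XPathDocument(New StringReader(SampleXml))
        Dim Navigator As XPathNavigator = Doc.CreateNavigator( )

        ' SelectSingleNode( ) returns a navigator positioned on the first
        ' matching node, or Nothing when the expression matches no node.
        Dim Match As XPathNavigator = Navigator.SelectSingleNode( _
          "/Products/Product[ProductID='" & ProductID.ToString( ) & "']/ModelName")

        If Match Is Nothing Then
            Return Nothing
        End If
        Return Match.Value
    End Function

    Sub Main( )
        Console.WriteLine(FindModelName(357))
        Console.WriteLine(FindModelName(999) Is Nothing)
    End Sub

End Module
```

Checking the result for Nothing before reading Value matters, because a failed search doesn't raise an exception on its own.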

Tip: The examples in this lab use an XML document with no namespace. However, namespaces are often used in programming scenarios to allow your program to uniquely identify the type of document it references. If your document uses namespaces, you need to use the XmlNamespaceManager class and rewrite your XPath expressions to use a namespace prefix. If you'd like an example of this technique, refer to the downloadable samples for this lab, which demonstrate an example with a product catalog that uses XML namespaces.
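As a rough sketch of the namespace technique (this is not the downloadable sample itself), the code below assumes a catalog that declares a made-up default namespace, http://example.com/catalog, and maps it to the arbitrary prefix c. The module, function, and constant names are hypothetical:

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml
Imports System.Xml.XPath

Module NamespaceSketch

    ' Hypothetical namespace URI and catalog, for illustration only.
    Private Const CatalogNamespace As String = "http://example.com/catalog"
    Private Const SampleXml As String = _
        "<Products xmlns=""http://example.com/catalog"">" & _
        "<Product><ModelName>Edible Tape</ModelName></Product>" & _
        "<Product><ModelName>Escape Vehicle (Air)</ModelName></Product>" & _
        "</Products>"

    Function GetModelNames( ) As List(Of String)
        Dim Doc As New XPathDocument(New StringReader(SampleXml))
        Dim Navigator As XPathNavigator = Doc.CreateNavigator( )

        ' Map a prefix to the namespace URI. The prefix only has to match
        ' the one used in the XPath expression, not anything in the document.
        Dim Manager As New XmlNamespaceManager(Navigator.NameTable)
        Manager.AddNamespace("c", CatalogNamespace)

        ' Every element name in the path needs the prefix, because an
        ' unprefixed name in XPath means "no namespace."
        Dim Iterator As XPathNodeIterator = _
          Navigator.Select("/c:Products/c:Product/c:ModelName", Manager)

        Dim Names As New List(Of String)
        Do While Iterator.MoveNext( )
            Names.Add(Iterator.Current.Value)
        Loop
        Return Names
    End Function

    Sub Main( )
        For Each Name As String In GetModelNames( )
            Console.WriteLine(Name)
        Next
    End Sub

End Module
```

Without the prefix, the expression /Products/Product/ModelName would match nothing in this document, even though the element names look identical.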

Introducing XPath

Basic XPath syntax uses a path-like notation to describe locations in a document. For example, the path /Products/Product/ModelName indicates a ModelName element that is nested inside a Product element, which, in turn, is nested in a root Products element. This is an absolute path (indicated by the fact that it starts with a single slash, representing the root of the document).

You can also search for nodes with a given name regardless of where they are by starting the path with two slashes. For example, //ModelName will find all ModelName elements no matter where they are in the document hierarchy. A path that doesn't begin with a slash is a relative path, which is evaluated starting from the current node. Other path characters that you can use include the period (.), which refers to the current node; the double period (..), which moves up one level; and the asterisk (*), which selects any element.

XPath gets really interesting when you start to add filter conditions. Filter conditions are added to a path in square brackets. For example, the XPath expression //Product[CategoryName='Tools'] finds all Product elements that contain a CategoryName element with the text "Tools". You can use the full range of comparison operators, such as less than and greater than (< and >) or not equal to (!=). For much more information about the wonderful world of XPath, refer to XML in a Nutshell (O'Reilly).
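To see how these expressions behave, you can run a few of them against a small document. The following is a minimal sketch, assuming an inline copy of the two products from Example 5-10; SelectValues is a hypothetical helper written for this illustration, not part of the .NET Framework:

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.XPath

Module XPathFilterSketch

    ' Hypothetical inline catalog holding the two products from Example 5-10.
    Private Const SampleXml As String = _
        "<Products>" & _
        "<Product><ModelName>Edible Tape</ModelName>" & _
        "<UnitCost>3.99</UnitCost><CategoryName>General</CategoryName></Product>" & _
        "<Product><ModelName>Escape Vehicle (Air)</ModelName>" & _
        "<UnitCost>2.99</UnitCost><CategoryName>Travel</CategoryName></Product>" & _
        "</Products>"

    ' Runs an XPath expression and returns the value of every matching node.
    Function SelectValues(ByVal Expression As String) As List(Of String)
        Dim Doc As New XPathDocument(New StringReader(SampleXml))
        Dim Navigator As XPathNavigator = Doc.CreateNavigator( )

        Dim Values As New List(Of String)
        Dim Iterator As XPathNodeIterator = Navigator.Select(Expression)
        Do While Iterator.MoveNext( )
            Values.Add(Iterator.Current.Value)
        Loop
        Return Values
    End Function

    Sub Main( )
        Dim Expressions( ) As String = { _
          "/Products/Product/ModelName", _
          "//Product[CategoryName='Travel']/ModelName", _
          "//Product[UnitCost < 3]/ModelName", _
          "//Product[CategoryName != 'General']/UnitCost"}

        For Each Expression As String In Expressions
            Console.WriteLine(Expression & " -> " & _
              String.Join(", ", SelectValues(Expression).ToArray( )))
        Next
    End Sub

End Module
```

With this data, the first expression returns both model names, while the UnitCost filter returns only Escape Vehicle (Air), because XPath converts the element text to a number before comparing it with 3.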

5.10.3. Where can I learn more?

The XPathDocument and XPathNavigator classes are too detailed to cover completely in this lab. For more information, refer to both classes in the MSDN Help. Additionally, you can learn about XML standards like XPath, XQuery, and XML Schema from the excellent online tutorials at http://www.w3schools.com.

In addition, you'll find one more lab that can help you extend your XPathDocument skills: "Edit an XML Document with XPathDocument," which explains the editing features of the XPathDocument.

Warning: The editable XPathNavigator has undergone extensive changes, and the features demonstrated in the next lab (Section 5.11) weren't working in the last build we tested. Although it's expected to return, features are sometimes cut even at this late stage. If the coding model changes, you'll find updated code in the downloadable examples for the book.