Understanding XPath

   


Access and Manipulate XML Data: Use XPath to query XML data.

To pass the exam, you should also have basic knowledge of XPath. XPath is another W3C standard, formally known as the XML Path Language. XPath is described by the W3C as "a language for addressing parts of an XML document." The .NET implementation of XPath supports the Version 1.0 Recommendation standard for XPath, which you can find at www.w3.org/TR/xpath.

You can think of XPath as being a query language, conceptually similar to SQL. Just as SQL allows you to select a set of information from a table or group of tables, XPath allows you to select a set of nodes from the DOM representation of an XML document. In this section, I'll introduce you to the basic syntax of XPath, and then show you how to use XPath in the .NET System.Xml namespace.

The XPath Language

XPath is not itself an XML standard. XPath expressions are not valid XML documents. Rather, XPath is a language for talking about XML. By writing an appropriate XPath expression, you can select particular elements or attributes within an XML document.

XPath starts with the notion of current context. The current context defines the set of nodes that will be inspected by an XPath query. In general, there are four choices to specify the current context for an XPath query:

  • ./ uses the current node as the current context.

  • / uses the root of the XML document as the current context.

  • .// uses the entire XML hierarchy starting with the current node as the current context.

  • // uses the entire XML document as the current context.
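
The difference between these four starting points is easiest to see by running the same element name against each of them. The following console module is only a sketch: the inline document is a made-up stand-in for Books.xml (the element and attribute names follow this section, the values are invented), and the module name ContextDemo is hypothetical.

```vbnet
Imports System
Imports System.Xml

Module ContextDemo
    ' Hypothetical stand-in for Books.xml: names follow the text,
    ' every value is invented for illustration.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book Pages='1200'><Author>Author A</Author>" & _
        "<Title>Advanced XML</Title>" & _
        "<Publisher>Addison Wesley</Publisher></Book>" & _
        "<Book Pages='350'><Author>Author B</Author>" & _
        "<Author>Author C</Author><Title>Basic XPath</Title>" & _
        "<Publisher>Microsoft Press</Publisher></Book>" & _
        "<Book Pages='1000'><Author>Author D</Author>" & _
        "<Title>Another Topic</Title>" & _
        "<Publisher>Example Press</Publisher></Book>" & _
        "</Books>"

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml)

        ' Make the second Book element the current node.
        Dim book As XmlNode = xd.SelectSingleNode("/Books/Book[2]")

        ' The same element name, reached from the four starting points.
        Dim expressions() As String = _
            {"./Author", "/Books/Book/Author", ".//Author", "//Author"}
        Dim expr As String
        For Each expr In expressions
            Console.WriteLine(expr & " selects " & _
                book.SelectNodes(expr).Count.ToString() & " node(s)")
        Next
    End Sub
End Module
```

With this sample, the two expressions that start at the current node find only the two authors of the second book, whereas the two that start at the document root find all four authors.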

To identify a set of elements using XPath, you use the path down the tree structure to those elements, separating tags by forward slashes. For example, this XPath expression selects all the Author elements in the Books.xml file:

 /Books/Book/Author 

You can also select all the Author elements without worrying about the full path to get to them by using this expression:

 //Author 

You can use * as a wildcard at any level of the tree. So, for example, this expression selects all the Author nodes that are grandchildren of the Books node:

 /Books/*/Author 

XPath expressions select a set of elements, not a single element. Of course, the set might only have a single member, or no members at all. In the context of the XmlDocument object, an XPath expression can be used to select a set of XmlNode objects to operate on later.
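
To try the three path forms above without building a form, a console sketch along these lines should do. It assumes a small invented document in place of Books.xml, loaded with LoadXml rather than from disk; the module name PathDemo is hypothetical.

```vbnet
Imports System
Imports System.Xml

Module PathDemo
    ' Hypothetical stand-in for Books.xml: names follow the text,
    ' every value is invented for illustration.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book Pages='1200'><Author>Author A</Author>" & _
        "<Title>Advanced XML</Title>" & _
        "<Publisher>Addison Wesley</Publisher></Book>" & _
        "<Book Pages='350'><Author>Author B</Author>" & _
        "<Author>Author C</Author><Title>Basic XPath</Title>" & _
        "<Publisher>Microsoft Press</Publisher></Book>" & _
        "</Books>"

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml)

        Dim expressions() As String = _
            {"/Books/Book/Author", "//Author", "/Books/*/Author"}
        Dim expr As String
        Dim xnod As XmlNode
        For Each expr In expressions
            ' SelectNodes always returns a list, possibly empty.
            Dim xnl As XmlNodeList = xd.SelectNodes(expr)
            Console.WriteLine(expr & " selects " & _
                xnl.Count.ToString() & " node(s)")
            For Each xnod In xnl
                Console.WriteLine("  " & xnod.InnerText)
            Next
        Next
    End Sub
End Module
```

In this sample every Author is a grandchild of Books, so all three expressions select the same three nodes.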

To identify a set of attributes, you trace the path down the tree to the attributes, just as you do with elements. The only difference is that attribute names must be prefixed with an @ character. For example, this XPath expression selects all the Pages attributes from Book elements in the Books.xml file:

 //Book/@Pages 

EXAM TIP

Be Explicit When Possible: In general, the expression with the explicit path (//Book/@Pages) can be evaluated more rapidly than the expression that searches the entire document (//@Pages). The former only has to search a limited number of nodes to return results, whereas the latter needs to look through the entire document.


Of course, in the Books.xml file, only Book elements have a Pages attribute. So in this particular context, this XPath expression is equivalent to the previous one:

 //@Pages 

You can select multiple attributes with the @* operator. To select all attributes of Book elements anywhere in the XML, use this expression:

 //Book/@* 
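
Attribute selection can be sketched the same way. The document below is invented, and the Format attribute on the first book is an assumption added only so that @* has more than one kind of attribute to return; the module name AttributeDemo is hypothetical.

```vbnet
Imports System
Imports System.Xml

Module AttributeDemo
    ' Hypothetical sample: the Format attribute is not part of the
    ' book's Books.xml; it is here only to make @* interesting.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book Pages='1200' Format='Paperback'>" & _
        "<Title>Advanced XML</Title></Book>" & _
        "<Book Pages='350'><Title>Basic XPath</Title></Book>" & _
        "<Book Pages='1000'><Title>Another Topic</Title></Book>" & _
        "</Books>"

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml)

        Dim expressions() As String = _
            {"//Book/@Pages", "//@Pages", "//Book/@*"}
        Dim expr As String
        Dim attr As XmlNode
        For Each expr In expressions
            Console.WriteLine(expr)
            ' Attribute nodes expose their text through Value.
            For Each attr In xd.SelectNodes(expr)
                Console.WriteLine("  " & attr.Name & " = " & attr.Value)
            Next
        Next
    End Sub
End Module
```

Here the first two expressions return the three Pages attributes, and the last one returns those three plus the single Format attribute.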

XPath also offers a predicate language to allow you to specify smaller groups of nodes or even individual nodes in the XML tree. You might think of this as a filtering capability similar to a SQL WHERE clause. One thing you can do is specify the exact value of the node that you'd like to work with. To find all Publisher nodes with the value "Addison Wesley," you could use the XPath expression

 /Books/Book/Publisher[.="Addison Wesley"] 

Here the dot operator stands for the current node. Alternatively, you can find all Books published by Addison Wesley:

 /Books/Book[./Publisher="Addison Wesley"] 

Note that there is no forward slash between an element and a filtering expression in XPath.

Of course, you can filter on attributes as well as elements. You can also use operators and Boolean expressions within filtering specifications. For example, you might want to find Books that have a thousand or more pages:

 /Books/Book[./@Pages>=1000] 

Because the current node is the default context, you can simplify this expression a little bit:

 /Books/Book[@Pages>=1000] 

XPath also supports a selection of filtering functions. For example, to find books whose titles start with A, you could use this XPath expression:

 /Books/Book[starts-with(Title,"A")] 
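
The three kinds of filter shown so far (a child element's value, a numeric comparison on an attribute, and a function call) can be compared side by side with a sketch like this one. The sample data is invented, the module name PredicateDemo is hypothetical, and the XPath string literals use single quotes so that they fit inside Visual Basic strings.

```vbnet
Imports System
Imports System.Xml

Module PredicateDemo
    ' Hypothetical stand-in for Books.xml: names follow the text,
    ' every value is invented for illustration.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book Pages='1200'><Title>Advanced XML</Title>" & _
        "<Publisher>Addison Wesley</Publisher></Book>" & _
        "<Book Pages='350'><Title>Basic XPath</Title>" & _
        "<Publisher>Microsoft Press</Publisher></Book>" & _
        "<Book Pages='1000'><Title>Another Topic</Title>" & _
        "<Publisher>Example Press</Publisher></Book>" & _
        "</Books>"

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml)

        ' Each expression filters Book elements, then steps to Title.
        Dim expressions() As String = { _
            "/Books/Book[Publisher='Addison Wesley']/Title", _
            "/Books/Book[@Pages>=1000]/Title", _
            "/Books/Book[starts-with(Title,'A')]/Title"}
        Dim expr As String
        Dim title As XmlNode
        For Each expr In expressions
            Console.WriteLine(expr)
            For Each title In xd.SelectNodes(expr)
                Console.WriteLine("  " & title.InnerText)
            Next
        Next
    End Sub
End Module
```

With this data the publisher filter matches one book, while the page-count filter and the starts-with filter each match the first and third books.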

Table 2.6 lists some additional XPath functions.

Table 2.6. Selected XPath Functions

Function          Description
concat            Concatenates strings.
contains          Determines whether one string contains another.
count             Counts the number of nodes in a node set.
last              Returns the position of the last node in the current context.
normalize-space   Strips leading and trailing whitespace from a string and collapses internal runs of whitespace to a single space.
not               Negates its Boolean argument.
number            Converts its argument to a number.
position          Returns the position of the current node within the set of nodes being evaluated.
starts-with       Determines whether one string starts with another.
string-length     Returns the number of characters in a string.
substring         Returns a substring from a string.

Square brackets are also used to indicate indexing. Collections are indexed starting at one. To return the first Book node, you'd use this expression:

 /Books/Book[1] 

To return the first title of the second book:

 /Books/Book[2]/Title[1] 

To return the first Author in the XML file, regardless of Book:

 (/Books/Book/Author)[1] 

The parentheses are necessary because the square brackets have a higher operator precedence than the path operators. Without the parentheses, the expression would return the first author of every book in the file. There's also a last() function that you can use to return the last element in a collection, without needing to know how many elements are in the collection:

 /Books/Book[last()] 
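
The effect of the parentheses is easy to miss on paper, so here is a small sketch that runs the indexed expressions against an invented document in which one book has two authors. The sample values and the module name IndexDemo are assumptions for illustration.

```vbnet
Imports System
Imports System.Xml

Module IndexDemo
    ' Hypothetical stand-in for Books.xml: names follow the text,
    ' every value is invented for illustration.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book><Author>Author A</Author>" & _
        "<Title>Advanced XML</Title></Book>" & _
        "<Book><Author>Author B</Author><Author>Author C</Author>" & _
        "<Title>Basic XPath</Title></Book>" & _
        "<Book><Author>Author D</Author>" & _
        "<Title>Another Topic</Title></Book>" & _
        "</Books>"

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml)

        Dim expressions() As String = { _
            "/Books/Book[2]/Title[1]", _
            "/Books/Book/Author[1]", _
            "(/Books/Book/Author)[1]", _
            "/Books/Book[last()]/Title"}
        Dim expr As String
        Dim xnod As XmlNode
        For Each expr In expressions
            Console.WriteLine(expr)
            For Each xnod In xd.SelectNodes(expr)
                Console.WriteLine("  " & xnod.InnerText)
            Next
        Next
    End Sub
End Module
```

In this sample, /Books/Book/Author[1] returns three nodes (the first author of each book), whereas (/Books/Book/Author)[1] returns only Author A.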

Another useful operator is the vertical bar, which is used to form the union of two sets of nodes. This expression returns all the authors for books published by Addison Wesley or Microsoft Press:

 /Books/Book[./Publisher="Addison Wesley"]/Author | /Books/Book[./Publisher="Microsoft Press"]/Author 
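
A union of two filtered paths can be tried with a sketch like the following. The document is an invented stand-in for Books.xml, and the module name UnionDemo is hypothetical.

```vbnet
Imports System
Imports System.Xml

Module UnionDemo
    ' Hypothetical stand-in for Books.xml: names follow the text,
    ' every value is invented for illustration.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book><Author>Author A</Author>" & _
        "<Publisher>Addison Wesley</Publisher></Book>" & _
        "<Book><Author>Author B</Author><Author>Author C</Author>" & _
        "<Publisher>Microsoft Press</Publisher></Book>" & _
        "<Book><Author>Author D</Author>" & _
        "<Publisher>Example Press</Publisher></Book>" & _
        "</Books>"

    Sub Main()
        Dim xd As New XmlDocument()
        xd.LoadXml(SampleXml)

        ' The vertical bar joins the two node sets into one.
        Dim expr As String = _
            "/Books/Book[Publisher='Addison Wesley']/Author" & _
            " | /Books/Book[Publisher='Microsoft Press']/Author"

        Dim author As XmlNode
        For Each author In xd.SelectNodes(expr)
            Console.WriteLine(author.InnerText)
        Next
    End Sub
End Module
```

With this data the union contains Author A, Author B, and Author C; Author D is left out because that book's publisher matches neither filter.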

One way to see XPath in action is to use the SelectNodes method of the XmlDocument object, as shown in Step By Step 2.7.

STEP BY STEP

2.7 Selecting Nodes with XPath

  1. Add a new form to the project. Name the new form StepByStep2-7.vb.

  2. Add a Label control, a TextBox control named txtXPath, a Button control named btnEvaluate, and a ListBox control named lbNodes to the form.

  3. Double-click the Button control to open the form's module. Add this line of code at the top of the module:

     Imports System.Xml 
  4. Add code to handle the Button's Click event:

     Private Sub btnEvaluate_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) _
      Handles btnEvaluate.Click
         ' Load the Books.xml file
         Dim xtr As XmlTextReader = _
          New XmlTextReader("..\Books.xml")
         xtr.WhitespaceHandling = WhitespaceHandling.None
         Dim xd As XmlDocument = _
          New XmlDocument()
         xd.Load(xtr)
         ' Retrieve nodes to match the expression
         Dim xnl As XmlNodeList = _
          xd.DocumentElement.SelectNodes(txtXPath.Text)
         ' And dump the results
         lbNodes.Items.Clear()
         Dim xnod As XmlNode
         For Each xnod In xnl
             ' For elements that have a child node, display
             ' the value of the first child (the Text entity).
             ' The FirstChild check avoids an error on
             ' empty elements.
             If xnod.NodeType = XmlNodeType.Element _
              AndAlso Not xnod.FirstChild Is Nothing Then
                 lbNodes.Items.Add(xnod.NodeType.ToString _
                  & ": " & xnod.Name & " = " & _
                  xnod.FirstChild.Value)
             Else
                 lbNodes.Items.Add(xnod.NodeType.ToString _
                  & ": " & xnod.Name & " = " & xnod.Value)
             End If
         Next
         ' Clean up
         xtr.Close()
     End Sub
  5. Set the form as the startup form for the project.

  6. Run the project. Enter an XPath expression such as //Books/Book/Title in the TextBox control. Click the button to see the nodes that the expression selects from the Books.xml file, as shown in Figure 2.7.

    Figure 2.7. Displaying the XmlDataDocument derived from a DataSet.

The SelectNodes method of the XmlDocument takes an XPath expression and evaluates that expression over the document. The resulting nodes are returned in an XmlNodeList object, which is just a collection of XML nodes.

Using the XPathNavigator Class

You've seen how you can use the XmlReader class to move through an XML document. But the XmlReader allows only forward-only, read-only access to the document. There is another set of navigation classes in the System.Xml.XPath namespace. In particular, the XPathNavigator class provides you with read-only, random access to XML documents.

You can perform two distinct tasks with an XPathNavigator object:

  • Selecting a set of nodes with an XPath expression

  • Navigating the DOM representation of the XML document

In the remainder of this section, I'll show you how to use the XPathNavigator class for these tasks.

Selecting Nodes with XPath

To use the XPathNavigator class, you should start with an XmlDocument, XmlDataDocument, or XPathDocument object. In particular, if you're mainly interested in XPath operations, you should use the XPathDocument class. The XPathDocument class provides a representation of the structure of an XML document that is optimized for query operations. You can construct an XPathDocument object from a URI (including a local filename), a stream, or a reader containing XML.

The XPathDocument object has a single method of interest, CreateNavigator (you'll also find this method on the XmlDocument and XmlDataDocument objects). As you've probably guessed, the CreateNavigator method returns an XPathNavigator object that can perform operations with the XML document represented by the XPathDocument object. Table 2.7 lists the important members of the XPathNavigator object.

Table 2.7. Important Members of the XPathNavigator Class

Member                 Type       Description
Clone                  Method     Creates a duplicate of this object with the current state.
ComparePosition        Method     Compares the position of this navigator's current node with that of another XPathNavigator object.
Compile                Method     Compiles an XPath expression for faster execution.
Evaluate               Method     Evaluates an XPath expression.
HasAttributes          Property   Indicates whether the current node has any attributes.
HasChildren            Property   Indicates whether the current node has any children.
IsEmptyElement         Property   Indicates whether the current node is an empty element.
Matches                Method     Determines whether the current node matches an XSLT pattern.
MoveToFirst            Method     Moves to the first sibling of the current node.
MoveToFirstAttribute   Method     Moves to the first attribute of the current node.
MoveToFirstChild       Method     Moves to the first child of the current node.
MoveToNext             Method     Moves to the next sibling of the current node.
MoveToNextAttribute    Method     Moves to the next attribute of the current node.
MoveToParent           Method     Moves to the parent of the current node.
MoveToPrevious         Method     Moves to the previous sibling of the current node.
MoveToRoot             Method     Moves to the root node of the DOM.
Name                   Property   Qualified name of the current node.
Select                 Method     Selects a set of nodes using an XPath expression.
Value                  Property   Value of the current node.

EXAM TIP

XPathNavigator Can Move Backward: Note that unlike the XmlReader class, the XPathNavigator class implements methods such as MoveToPrevious and MoveToParent that can move backward in the DOM. The XPathNavigator class provides random access to the entire XML document.


Like the XmlReader class, the XPathNavigator class maintains a pointer to a current node in the DOM at all times. But the XPathNavigator brings additional capabilities to working with the DOM. For example, you can use this class to execute an XPath query, as shown in Step By Step 2.8.

STEP BY STEP

2.8 Selecting Nodes with an XPathNavigator Object

  1. Add a new form to the project. Name the new form StepByStep2-8.vb.

  2. Add a Label control, a TextBox control named txtXPath, a Button control named btnEvaluate, and a ListBox control named lbNodes to the form.

  3. Double-click the Button control to open the form's module. Add this line of code at the top of the module:

     Imports System.Xml.XPath 
  4. Add code to handle the Button's Click event:

     Private Sub btnEvaluate_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) _
      Handles btnEvaluate.Click
         ' Load the Books.xml file
         Dim xpd As XPathDocument = _
          New XPathDocument("..\Books.xml")
         ' Get the associated navigator
         Dim xpn As XPathNavigator = _
          xpd.CreateNavigator()
         ' Retrieve nodes to match the expression
         Dim xpni As XPathNodeIterator = _
          xpn.Select(txtXPath.Text)
         ' And dump the results
         lbNodes.Items.Clear()
         While xpni.MoveNext
             lbNodes.Items.Add( _
              xpni.Current.NodeType.ToString _
              & ": " & xpni.Current.Name & " = " & _
              xpni.Current.Value)
         End While
     End Sub
  5. Set the form as the startup form for the project.

  6. Run the project. Enter an XPath expression in the TextBox control. Click the button to see the nodes that the expression selects from the Books.xml file.

The Select method of the XPathNavigator class returns an XPathNodeIterator object, which lets you visit each member of the selected set of nodes in turn. It has Count and Current properties, as well as (as you saw in the code for Step By Step 2.8) a MoveNext method that advances it through the set of nodes.
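
The Compile and Evaluate members listed in Table 2.7 work alongside Select. The console sketch below shows one way they might be combined. It assumes an invented document read from a StringReader instead of the Books.xml file, and the module name NavigatorSelectDemo is hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.XPath

Module NavigatorSelectDemo
    ' Hypothetical stand-in for Books.xml: names follow the text,
    ' every value is invented for illustration.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book Pages='1200'><Title>Advanced XML</Title></Book>" & _
        "<Book Pages='350'><Title>Basic XPath</Title></Book>" & _
        "<Book Pages='1000'><Title>Another Topic</Title></Book>" & _
        "</Books>"

    Sub Main()
        ' An XPathDocument can be built from a reader as well as a file.
        Dim xpd As New XPathDocument(New StringReader(SampleXml))
        Dim xpn As XPathNavigator = xpd.CreateNavigator()

        ' Compile the expression once so that it can be reused.
        Dim expr As XPathExpression = _
            xpn.Compile("/Books/Book[@Pages>=1000]/Title")

        ' Select returns an iterator; MoveNext advances through it.
        Dim xpni As XPathNodeIterator = xpn.Select(expr)
        Console.WriteLine("Matches: " & xpni.Count.ToString())
        While xpni.MoveNext()
            Console.WriteLine(xpni.Current.Name & " = " & _
                xpni.Current.Value)
        End While

        ' Evaluate handles expressions that return a value rather
        ' than a node set; count() comes back as a Double.
        Dim total As Double = CDbl(xpn.Evaluate("count(/Books/Book)"))
        Console.WriteLine("Books in document: " & total.ToString())
    End Sub
End Module
```

Compiling is mainly worthwhile when the same expression runs many times; for a one-off query, passing the string straight to Select is simpler.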

Navigating Nodes with XPath

You can also use the XPathNavigator object to move around in the DOM. Step By Step 2.9 demonstrates the Move methods of this class.

STEP BY STEP

2.9 Navigating with an XPathNavigator Object

  1. Add a new form to the project. Name the new form StepByStep2-9.vb.

  2. Add four Button controls (btnParent, btnPrevious, btnNext, and btnChild) and a ListBox control named lbNodes to the form.

  3. Double-click one of the Button controls to open the form's module. Add this line of code at the top of the module:

     Imports System.Xml.XPath 
  4. Add code to load an XML document when you load the form:

     Dim xpd As XPathDocument
     Dim xpn As XPathNavigator

     Private Sub StepByStep2_9_Load( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) Handles MyBase.Load
         ' Load the Books.xml file
         xpd = New XPathDocument("..\Books.xml")
         ' Get the associated navigator
         xpn = xpd.CreateNavigator()
         xpn.MoveToRoot()
         ListNode()
     End Sub

     Private Sub ListNode()
         ' Dump the current node to the listbox
         lbNodes.Items.Add( _
          xpn.NodeType.ToString _
          & ": " & xpn.Name & " = " & _
          xpn.Value)
     End Sub
  5. Add code to handle events from the Button controls:

     Private Sub btnParent_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) Handles btnParent.Click
         ' Move to the parent of the current node
         If xpn.MoveToParent() Then
             ListNode()
         Else
             lbNodes.Items.Add("No parent node")
         End If
     End Sub

     Private Sub btnPrevious_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) _
      Handles btnPrevious.Click
         ' Move to the previous sibling of the current node
         If xpn.MoveToPrevious() Then
             ListNode()
         Else
             lbNodes.Items.Add("No previous node")
         End If
     End Sub

     Private Sub btnNext_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) Handles btnNext.Click
         ' Move to the next sibling of the current node
         If xpn.MoveToNext() Then
             ListNode()
         Else
             lbNodes.Items.Add("No next node")
         End If
     End Sub

     Private Sub btnChild_Click( _
      ByVal sender As System.Object, _
      ByVal e As System.EventArgs) Handles btnChild.Click
         ' Move to the first child of the current node
         If xpn.MoveToFirstChild() Then
             ListNode()
         Else
             lbNodes.Items.Add("No child node")
         End If
     End Sub
  6. Set the form as the startup form for the project.

  7. Run the project. Experiment with the buttons. You'll find that you can move around in the DOM, as shown in Figure 2.8.

    Figure 2.8. Exploring an XML Document with the XPathNavigator.

Step By Step 2.9 demonstrates two important things about the XPathNavigator class. First, the value of a node is the concatenated text of all the nodes beneath that node. Second, the MoveTo methods of the XPathNavigator will never throw an error, whether or not there is an appropriate node to move to. Instead, they simply return False when the requested navigation cannot be performed.
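
Both observations can be reproduced without a form. The sketch below assumes a two-book invented document in place of Books.xml and a hypothetical module name, NavigationDemo. It walks from the root to the first Book element and prints the Boolean result of each move.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.XPath

Module NavigationDemo
    ' Hypothetical stand-in for Books.xml: names follow the text,
    ' every value is invented for illustration.
    Private Const SampleXml As String = _
        "<Books>" & _
        "<Book><Author>Author A</Author>" & _
        "<Title>Advanced XML</Title></Book>" & _
        "<Book><Author>Author B</Author>" & _
        "<Title>Basic XPath</Title></Book>" & _
        "</Books>"

    Sub Main()
        Dim xpd As New XPathDocument(New StringReader(SampleXml))
        Dim xpn As XPathNavigator = xpd.CreateNavigator()

        ' A new navigator starts at the root, which has no parent,
        ' so this move reports False instead of raising an error.
        Console.WriteLine("MoveToParent from root: " & _
            xpn.MoveToParent().ToString())

        ' Step down to the Books element, then to the first Book.
        xpn.MoveToFirstChild()
        xpn.MoveToFirstChild()

        ' Value is the concatenated text of everything below Book.
        Console.WriteLine(xpn.Name & " = " & xpn.Value)

        ' The first Book has no previous sibling but does have a next.
        Console.WriteLine("MoveToPrevious: " & _
            xpn.MoveToPrevious().ToString())
        Console.WriteLine("MoveToNext: " & _
            xpn.MoveToNext().ToString())
        Console.WriteLine(xpn.Name & " = " & xpn.Value)
    End Sub
End Module
```

Because a failed move leaves the navigator where it was, checking the return value (as the button handlers in Step By Step 2.9 do) is enough to keep the current position valid.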

REVIEW BREAK

  • XPath is a query language for specifying or selecting parts of an XML document.

  • An XPath expression returns a set of zero or more nodes from the DOM representation of an XML document.

  • The SelectNodes method of the XmlDocument object returns a set of nodes selected by an XPath expression.

  • The XPathDocument and XPathNavigator objects are optimized for fast execution of XPath queries.

  • The XPathNavigator object allows random-access navigation of the structure of an XML document.


   
MCAD/MCSD Training Guide (70-310): Developing XML Web Services and Server Components with Visual Basic(R) .NET and the .NET Framework
ISBN: 0789728206
Year: 2002
Pages: 166
