Understanding XPath

Developing XML Web Services and Server Components with Visual C#™ .NET and the .NET Framework, Exam Cram™ 2 (Exam 70-320)
By Amit Kalani, Priti Kalani

Chapter 3.  Accessing and Manipulating XML Data


To pass the exam, you should also have basic knowledge of XPath. You can think of XPath as being a query language that is conceptually similar to SQL. Just as SQL enables you to select a set of information from a database table or a group of tables, XPath enables you to select a set of nodes from the DOM representation of an XML document. By writing an appropriate XPath expression, you can select particular elements or attributes within an XML document.

The XPath Language

XPath starts with the notion of current context. The current context defines the set of nodes that an XPath query will inspect. In general, there are four choices for specifying the current context for an XPath query:

  • ./ uses the current node as the current context.

  • / uses the root of the XML document as the current context.

  • .// uses the entire XML hierarchy from the current node down as the current context.

  • // uses the entire XML document as the current context.
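As a rough illustration of these four starting points, the following sketch counts the nodes that each form selects when it is evaluated from different context nodes. The inline XML is a hypothetical stand-in for Books.xml (the real file may differ), and the ContextDemo class and its Load and Count helpers are names invented for this example.

```csharp
using System;
using System.Xml;

public static class ContextDemo
{
    // Hypothetical stand-in for Books.xml; the real file may differ.
    public const string Xml =
        "<Books>" +
        "<Book Pages='1100'><Author>Author A</Author><Title>Essential Topics</Title></Book>" +
        "<Book Pages='450'><Author>Author B</Author><Author>Author C</Author>" +
        "<Title>Another Guide</Title></Book>" +
        "</Books>";

    public static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc;
    }

    // Counts the nodes that xpath selects when evaluated from the given context node.
    public static int Count(XmlNode context, string xpath)
    {
        return context.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        XmlDocument doc = Load();
        XmlNode books = doc.DocumentElement;       // the <Books> element
        XmlNode secondBook = books.ChildNodes[1];  // the second <Book> element

        Console.WriteLine(Count(books, "./Author"));                // 0: no Author directly under Books
        Console.WriteLine(Count(books, ".//Author"));               // 3: every Author below Books
        Console.WriteLine(Count(secondBook, "./Author"));           // 2: children of the second Book
        Console.WriteLine(Count(secondBook, "/Books/Book/Author")); // 3: path starts at the document root
        Console.WriteLine(Count(secondBook, "//Author"));           // 3: searches the whole document
    }
}
```

Note that the expressions beginning with / or // give the same answer no matter which node they are evaluated from, whereas the ./ and .// forms depend on the context node.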

To use XPath to identify a set of elements, you use the path down the tree structure to those elements, separating elements by forward slashes. For example, this XPath expression selects all the Author elements in the Books.xml file:

 /Books/Book/Author 

You can also select all the Author elements without worrying about the full path to get to them by using this expression:

 //Author 

You can use * as a wildcard at any level of the tree. For example, this expression selects all the Author nodes that are grandchildren of the Books node:

 /Books/*/Author 

XPath expressions select a set of elements, not a single element. Of course, the set might have only a single member, or it might have no members. In the context of the XmlDocument object, an XPath expression can be used to select a set of XmlNode objects to operate on later.

To identify a set of attributes, you trace the path down the tree to the attributes, just as you do with elements. The only difference is that attribute names must be prefixed with an @ character. For example, this XPath expression selects all the Pages attributes from Book elements in the Books.xml file:

 /Books/Book/@Pages 

Of course, in the Books.xml file, only Book elements have a Pages attribute. So, in this particular context, this XPath expression is equivalent to the previous one:

 //@Pages 

You can select multiple attributes with the @* operator. To select all attributes of Book elements anywhere in the XML, use this expression:

 //Book/@* 
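The attribute expressions above can be tried out with a small sketch like the following. The inline XML is a hypothetical stand-in for Books.xml, the Year attribute is added only so that @* has more than one attribute to select, and AttributeDemo, Describe, and Count are names made up for this illustration.

```csharp
using System;
using System.Xml;

public static class AttributeDemo
{
    // Hypothetical data; the Year attribute is not part of the original Books.xml.
    public const string Xml =
        "<Books>" +
        "<Book Pages='1100' Year='2001'><Title>Essential Topics</Title></Book>" +
        "<Book Pages='450'><Title>Another Guide</Title></Book>" +
        "</Books>";

    private static XmlDocument Load()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc;
    }

    // Returns "name=value" for every node the expression selects, separated by ", ".
    public static string Describe(string xpath)
    {
        string result = "";
        foreach (XmlNode node in Load().SelectNodes(xpath))
        {
            if (result.Length > 0) result += ", ";
            result += node.Name + "=" + node.Value;
        }
        return result;
    }

    // Returns how many nodes the expression selects.
    public static int Count(string xpath)
    {
        return Load().SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("/Books/Book/@Pages")); // Pages=1100, Pages=450
        Console.WriteLine(Describe("//@Pages"));           // Pages=1100, Pages=450
        Console.WriteLine(Count("//Book/@*"));             // 3: two attributes on the first Book, one on the second
    }
}
```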

XPath also offers a predicate language to enable you to specify smaller groups of nodes or even individual nodes in the XML tree. You might think of this as a filtering capability similar to a SQL WHERE clause. One thing you can do is specify the exact value of the node that you'd like to work with. To find all Publisher nodes with the value Addison-Wesley you could use this XPath expression:

 /Books/Book/Publisher[.="Addison-Wesley"] 

Here the [] operator specifies a filter pattern, and the dot operator stands for the current node. Filters are always evaluated with respect to the current context. Alternatively, you can find all Book elements published by Addison-Wesley:

 /Books/Book[./Publisher="Addison-Wesley"] 

Note that there is no forward slash between an element and a filtering expression in XPath.

Of course, you can filter on attributes as well as elements. You can also use operators and Boolean expressions within filtering specifications. For example, you might want to find Books that have a thousand or more pages:

 /Books/Book[./@Pages>=1000] 

Because the current node is the default context, you can simplify this expression a little bit:

 /Books/Book[@Pages>=1000] 

XPath also supports a selection of filtering functions. For example, to find books whose title starts with E, you could use this XPath expression:

 /Books/Book[starts-with(Title,"E")] 
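To see these filters at work, here is a minimal sketch that runs several of them against a small, hypothetical stand-in for Books.xml. PredicateDemo and Titles are invented names, and the last expression is an extra example of a Boolean operator inside a predicate.

```csharp
using System;
using System.Xml;

public static class PredicateDemo
{
    // Hypothetical stand-in for Books.xml; the real file may differ.
    public const string Xml =
        "<Books>" +
        "<Book Pages='1100'><Title>Essential Topics</Title><Publisher>Addison-Wesley</Publisher></Book>" +
        "<Book Pages='450'><Title>Another Guide</Title><Publisher>Que Certifications</Publisher></Book>" +
        "<Book Pages='900'><Title>Example Handbook</Title><Publisher>Sample Press</Publisher></Book>" +
        "</Books>";

    // Returns the titles of the Book elements selected by xpath, separated by "; ".
    public static string Titles(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        string result = "";
        foreach (XmlNode book in doc.SelectNodes(xpath))
        {
            if (result.Length > 0) result += "; ";
            result += book.SelectSingleNode("Title").InnerText;
        }
        return result;
    }

    public static void Main()
    {
        // Single quotes inside the XPath avoid escaping double quotes in the C# string.
        Console.WriteLine(Titles("/Books/Book[Publisher='Addison-Wesley']")); // Essential Topics
        Console.WriteLine(Titles("/Books/Book[@Pages>=1000]"));               // Essential Topics
        Console.WriteLine(Titles("/Books/Book[starts-with(Title,'E')]"));     // Essential Topics; Example Handbook
        Console.WriteLine(Titles("/Books/Book[@Pages>=1000 or Publisher='Sample Press']"));
        // Essential Topics; Example Handbook
    }
}
```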

Table 3.5 lists some additional XPath functions.

Table 3.5. Selected XPath Functions

Function            Description
concat()            Concatenates strings
contains()          Determines whether one string contains another
count()             Returns the number of nodes in a node set
last()              Returns the position of the last node in the set being evaluated
normalize-space()   Strips leading and trailing whitespace from a string and collapses internal runs of whitespace to a single space
not()               Negates its argument
number()            Converts its argument to a number
position()          Returns the position of the current node within the set being evaluated
starts-with()       Determines whether one string starts with another
string-length()     Returns the number of characters in a string
substring()         Returns a substring from a string
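A few of the functions in Table 3.5 can be exercised inside predicates, as in this sketch. It is only an illustration: the XML is a hypothetical stand-in for Books.xml, and FunctionDemo and Titles are invented names.

```csharp
using System;
using System.Xml;

public static class FunctionDemo
{
    // Hypothetical stand-in for Books.xml; the real file may differ.
    public const string Xml =
        "<Books>" +
        "<Book Pages='1100'><Author>Author A</Author><Title>Essential Topics</Title></Book>" +
        "<Book Pages='450'><Author>Author B</Author><Author>Author C</Author>" +
        "<Title>Another Guide</Title></Book>" +
        "<Book Pages='900'><Author>Author D</Author><Title>Example Handbook</Title></Book>" +
        "</Books>";

    // Returns the titles of the Book elements selected by xpath, separated by "; ".
    public static string Titles(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        string result = "";
        foreach (XmlNode book in doc.SelectNodes(xpath))
        {
            if (result.Length > 0) result += "; ";
            result += book.SelectSingleNode("Title").InnerText;
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Titles("/Books/Book[count(Author)>1]"));         // Another Guide
        Console.WriteLine(Titles("/Books/Book[contains(Title,'Guide')]")); // Another Guide
        Console.WriteLine(Titles("/Books/Book[not(@Pages>=1000)]"));       // Another Guide; Example Handbook
        Console.WriteLine(Titles("/Books/Book[string-length(Title)=13]")); // Another Guide
        Console.WriteLine(Titles("/Books/Book[position()=last()]"));       // Example Handbook
    }
}
```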

Square brackets are also used to indicate indexing. Collections are indexed starting at 1. To return the first Book node, you would use this expression:

 /Books/Book[1] 

To return the first title of the second book, you would use the following:

 /Books/Book[2]/Title[1] 

To return the first author in the XML file, regardless of the book, you would use the following:

 (/Books/Book/Author)[1] 

The parentheses are necessary because the square brackets have a higher operator precedence than the path operators. Without the parentheses, the expression would return the first author of every book in the file. There's also a last() function that you can use to return the last element in a collection, no matter how many elements are in the collection:

 /Books/Book[last()] 
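The difference that the parentheses make is easiest to see side by side. The following sketch assumes a small, hypothetical stand-in for Books.xml; IndexDemo and Texts are names chosen for this example only.

```csharp
using System;
using System.Xml;

public static class IndexDemo
{
    // Hypothetical stand-in for Books.xml; the real file may differ.
    public const string Xml =
        "<Books>" +
        "<Book><Author>Author A</Author><Title>Essential Topics</Title></Book>" +
        "<Book><Author>Author B</Author><Author>Author C</Author><Title>Another Guide</Title></Book>" +
        "<Book><Author>Author D</Author><Title>Example Handbook</Title></Book>" +
        "</Books>";

    // Returns the text of every node selected by xpath, separated by "; ".
    public static string Texts(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        string result = "";
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            if (result.Length > 0) result += "; ";
            result += node.InnerText;
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Texts("/Books/Book[2]/Title[1]"));   // Another Guide
        Console.WriteLine(Texts("(/Books/Book/Author)[1]"));   // Author A
        Console.WriteLine(Texts("/Books/Book/Author[1]"));     // Author A; Author B; Author D
        Console.WriteLine(Texts("/Books/Book[last()]/Title")); // Example Handbook
    }
}
```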

Another useful operator is the vertical bar, which is used to form the union of two sets of nodes. This expression returns all the authors for books published by Addison-Wesley or Que Certifications:

 /Books/Book[./Publisher="Addison-Wesley"]/Author | /Books/Book[./Publisher="Que Certifications"]/Author
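As a quick check of the union operator, this sketch evaluates a similar expression against a hypothetical stand-in for Books.xml. UnionDemo and Authors are invented names, and the Sample Press entry exists only to show a book that the union leaves out.

```csharp
using System;
using System.Xml;

public static class UnionDemo
{
    // Hypothetical stand-in for Books.xml; the real file may differ.
    public const string Xml =
        "<Books>" +
        "<Book><Author>Author A</Author><Publisher>Addison-Wesley</Publisher></Book>" +
        "<Book><Author>Author B</Author><Author>Author C</Author>" +
        "<Publisher>Que Certifications</Publisher></Book>" +
        "<Book><Author>Author D</Author><Publisher>Sample Press</Publisher></Book>" +
        "</Books>";

    // Returns the text of every Author selected by xpath, separated by "; ".
    public static string Authors(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        string result = "";
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            if (result.Length > 0) result += "; ";
            result += node.InnerText;
        }
        return result;
    }

    public static void Main()
    {
        string xpath =
            "/Books/Book[Publisher='Addison-Wesley']/Author | " +
            "/Books/Book[Publisher='Que Certifications']/Author";
        Console.WriteLine(Authors(xpath)); // Author A; Author B; Author C
    }
}
```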

One way to see XPath in action is to use the SelectNodes() method of the XmlNode object. Take the following steps to learn how to select nodes with XPath:

  1. Add a new Windows application project (Example3_3) to the solution. Add the Books.xml file to the project.

  2. Add a TextBox control (txtXPath), a Button control (btnEvaluate), and a ListBox control (lbNodes) to the form. Switch to Code view and add the following using directive:

     using System.Xml; 
  3. Double-click the Button control and add the following code to handle the button's Click event:

     private void btnEvaluate_Click(object sender, System.EventArgs e)
     {
         // Load the Books.xml file
         XmlTextReader xtr = new XmlTextReader(@"..\..\Books.xml");
         xtr.WhitespaceHandling = WhitespaceHandling.None;
         XmlDocument xd = new XmlDocument();
         xd.Load(xtr);
         // Retrieve nodes to match the expression
         XmlNodeList xnl = xd.DocumentElement.SelectNodes(txtXPath.Text);
         // And dump the results
         lbNodes.Items.Clear();
         foreach (XmlNode xnod in xnl)
             // For elements, display the corresponding text entity
             if (xnod.NodeType == XmlNodeType.Element)
                 lbNodes.Items.Add(xnod.NodeType.ToString() + ": " +
                     xnod.Name + " = " + xnod.FirstChild.Value);
             else
                 lbNodes.Items.Add(xnod.NodeType.ToString() + ": " +
                     xnod.Name + " = " + xnod.Value);
         xtr.Close();
     }

  4. Build and run the project. Enter an XPath expression such as //Books/Book/Title in the TextBox control. Click the button to see the nodes that the expression selects from the Books.xml file.

The SelectNodes() method of the XmlNode object takes an XPath expression and evaluates that expression over the document. The resulting nodes are returned in an XmlNodeList object, which is just a collection of XML nodes.

Using the XPathNavigator Class

You've seen how you can use the XmlReader class to move through an XML document. But the XmlReader allows only forward-only, read-only access to the document. There is another set of navigation classes in the System.Xml.XPath namespace. In particular, the XPathNavigator class provides you with read-only, random access to XML documents.

You can perform two distinct tasks with an XPathNavigator object:

  • Select a set of nodes with an XPath expression

  • Navigate the DOM representation of the XML document

In the remainder of this section, I'll show you how to use the XPathNavigator class for these tasks.

Selecting Nodes with XPath

To use the XPathNavigator class, you should start with an XmlDocument, XmlDataDocument, or XPathDocument object. In particular, if you're mainly interested in XPath operations, you should use the XPathDocument class. The XPathDocument class provides a representation of the structure of an XML document that is optimized for query operations. You can construct an XPathDocument object from a URI (including a local filename), a stream, or a reader containing XML.

The XPathDocument object has a single method of interest, CreateNavigator(). (You'll also find this method on the XmlDocument and XmlDataDocument objects.) As you've probably guessed, the CreateNavigator() method returns an XPathNavigator object that can perform operations with the XML document represented by the XPathDocument object. Table 3.6 lists the important members of the XPathNavigator object.
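A minimal sketch of this setup follows. It builds the XPathDocument from a TextReader over an in-memory string rather than from a file, so that it is self-contained; the XML content, the NavigatorSetup class, and the Create helper are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class NavigatorSetup
{
    // Hypothetical data standing in for Books.xml.
    public const string Xml =
        "<Books><Book Pages='450'><Title>Another Guide</Title></Book></Books>";

    public static XPathNavigator Create()
    {
        // XPathDocument also has constructors that take a URI string, a Stream, or an XmlReader.
        XPathDocument doc = new XPathDocument(new StringReader(Xml));
        return doc.CreateNavigator();
    }

    public static void Main()
    {
        XPathNavigator nav = Create();          // a new navigator starts on the root node
        Console.WriteLine(nav.NodeType);        // Root
        Console.WriteLine(nav.HasChildren);     // True
        nav.MoveToFirstChild();                 // now on <Books>
        Console.WriteLine(nav.Name);            // Books
        Console.WriteLine(nav.HasAttributes);   // False
        nav.MoveToFirstChild();                 // now on <Book>
        Console.WriteLine(nav.HasAttributes);   // True
    }
}
```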

Table 3.6. Important Members of the XPathNavigator Class

Member                  Type      Description
Clone()                 Method    Creates a duplicate of this object with the current state
ComparePosition()       Method    Compares the position of the current node with that of another XPathNavigator object
Compile()               Method    Compiles an XPath expression for faster execution
Evaluate()              Method    Evaluates an XPath expression
HasAttributes           Property  Indicates whether the current node has any attributes
HasChildren             Property  Indicates whether the current node has any children
IsEmptyElement          Property  Indicates whether the current node is an empty element
Matches()               Method    Determines whether the current node matches an XSLT pattern
MoveToFirst()           Method    Moves to the first sibling of the current node
MoveToFirstAttribute()  Method    Moves to the first attribute of the current node
MoveToFirstChild()      Method    Moves to the first child of the current node
MoveToNext()            Method    Moves to the next sibling of the current node
MoveToNextAttribute()   Method    Moves to the next attribute of the current node
MoveToParent()          Method    Moves to the parent of the current node
MoveToPrevious()        Method    Moves to the previous sibling of the current node
MoveToRoot()            Method    Moves to the root node of the DOM
Name                    Property  Specifies the qualified name of the current node
Select()                Method    Uses an XPath expression to select a set of nodes
Value                   Property  Specifies the value of the current node


Unlike the XmlReader class, the XPathNavigator class implements methods such as MoveToPrevious() and MoveToParent() that can move backward in the DOM. The XPathNavigator class provides random access to the entire XML document.


Like the XmlReader class, the XPathNavigator class maintains a pointer to a current node in the DOM at all times. But XPathNavigator brings additional capabilities to working with the DOM. For example, you can use this class to execute an XPath query, as shown in the following code segment:

     // Load the Books.xml file
     XPathDocument xpd = new XPathDocument(@"..\..\Books.xml");
     // Get the associated navigator
     XPathNavigator xpn = xpd.CreateNavigator();
     // Retrieve nodes to match the expression
     XPathNodeIterator xpni = xpn.Select(txtXPath.Text);
     // And dump the results
     lbNodes.Items.Clear();
     while (xpni.MoveNext())
         lbNodes.Items.Add(xpni.Current.NodeType.ToString() + ": "
             + xpni.Current.Name + " = " + xpni.Current.Value);

The Select() method of the XPathNavigator class returns an XPathNodeIterator object, which lets you visit each member of the selected set of nodes in turn.
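Table 3.6 also lists Compile() and Evaluate(). As a hedged sketch of how they might be combined with Select(), the following example compiles expressions before running them. The XML is a hypothetical stand-in for Books.xml, and CompileDemo, EvaluateNumber, and CountSelected are names invented here.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class CompileDemo
{
    // Hypothetical stand-in for Books.xml; the real file may differ.
    public const string Xml =
        "<Books>" +
        "<Book Pages='1100'><Author>Author A</Author></Book>" +
        "<Book Pages='450'><Author>Author B</Author><Author>Author C</Author></Book>" +
        "</Books>";

    private static XPathNavigator Create()
    {
        return new XPathDocument(new StringReader(Xml)).CreateNavigator();
    }

    // Evaluates an XPath expression that yields a number, such as count(...) or sum(...).
    public static double EvaluateNumber(string xpath)
    {
        XPathNavigator nav = Create();
        XPathExpression expr = nav.Compile(xpath); // compile once; the result can be reused
        return (double)nav.Evaluate(expr);         // numeric results come back as a boxed double
    }

    // Counts the nodes selected by a compiled expression by walking the iterator.
    public static int CountSelected(string xpath)
    {
        XPathNavigator nav = Create();
        XPathNodeIterator it = nav.Select(nav.Compile(xpath));
        int n = 0;
        while (it.MoveNext()) n++;
        return n;
    }

    public static void Main()
    {
        Console.WriteLine(EvaluateNumber("count(/Books/Book/Author)")); // 3
        Console.WriteLine(EvaluateNumber("sum(/Books/Book/@Pages)"));   // 1550
        Console.WriteLine(CountSelected("//Book[@Pages>=1000]"));       // 1
    }
}
```

Select() is the right call when the expression yields a set of nodes, whereas Evaluate() also handles expressions that yield a number, string, or Boolean value.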

Navigating Nodes with XPath

You can also use the XPathNavigator object to move around in the DOM. Take the following steps to learn how:

  1. Add a new Windows application project (Example3_4) to the solution. Add the Books.xml file to the project.

  2. Add four Button controls (btnParent, btnPrevious, btnNext, and btnChild) and a ListBox control (lbNodes) to the form. Switch to Code view and add the following using directive:

     using System.Xml.XPath; 
  3. Double-click the form and add the following code to the class definition:

     XPathDocument xpd;
     XPathNavigator xpn;

     private void ListNode()
     {
         // Dump the current node to the listbox
         lbNodes.Items.Add(xpn.NodeType.ToString() +
             ": " + xpn.Name + " = " + xpn.Value);
     }

  4. Add the following event handler to the Load event of the form.

     private void Form1_Load(object sender, System.EventArgs e)
     {
         xpd = new XPathDocument(@"..\..\Books.xml");
         xpn = xpd.CreateNavigator();
         xpn.MoveToRoot();
         ListNode();
     }

  5. Double-click the Button controls and add the following code to their Click event handlers:

     private void btnParent_Click(object sender, System.EventArgs e)
     {
         // Move to the parent of the current node
         if (xpn.MoveToParent())
             ListNode();
         else
             lbNodes.Items.Add("No parent node");
     }

     private void btnChild_Click(object sender, System.EventArgs e)
     {
         // Move to the first child of the current node
         if (xpn.MoveToFirstChild())
             ListNode();
         else
             lbNodes.Items.Add("No child node");
     }

     private void btnPrevious_Click(object sender, System.EventArgs e)
     {
         // Move to the previous sibling of the current node
         if (xpn.MoveToPrevious())
             ListNode();
         else
             lbNodes.Items.Add("No previous node");
     }

     private void btnNext_Click(object sender, System.EventArgs e)
     {
         // Move to the next sibling of the current node
         if (xpn.MoveToNext())
             ListNode();
         else
             lbNodes.Items.Add("No next node");
     }

  6. Build and run the project. Experiment with moving around in the DOM by clicking the buttons.

The previous example demonstrates two important things about the XPathNavigator class. First, the value of a node is the concatenated text of all the nodes beneath that node. Second, the MoveTo methods of the XPathNavigator will never throw an error, whether there is an appropriate node to move to or not. Instead, they simply return false when the requested navigation cannot be performed.
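Both observations can be reproduced without a form. The following console sketch is only an illustration: it uses a tiny hypothetical document instead of Books.xml, and NavigationDemo and Create are invented names.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class NavigationDemo
{
    // Tiny hypothetical document, small enough that the concatenated values are easy to read.
    public const string Xml =
        "<Books><Book><Author>A</Author><Title>T</Title></Book></Books>";

    public static XPathNavigator Create()
    {
        return new XPathDocument(new StringReader(Xml)).CreateNavigator();
    }

    public static void Main()
    {
        XPathNavigator nav = Create();                   // starts on the root node
        Console.WriteLine(nav.MoveToParent());           // False: the root has no parent
        Console.WriteLine(nav.MoveToFirstChild());       // True: now on <Books>
        Console.WriteLine(nav.Name + " = " + nav.Value); // Books = AT (all descendant text, concatenated)
        Console.WriteLine(nav.MoveToNext());             // False: <Books> has no sibling
        nav.MoveToFirstChild();                          // now on <Book>
        nav.MoveToFirstChild();                          // now on <Author>
        Console.WriteLine(nav.Name + " = " + nav.Value); // Author = A
        Console.WriteLine(nav.MoveToPrevious());         // False: <Author> is the first child
        Console.WriteLine(nav.MoveToNext());             // True: now on <Title>
        Console.WriteLine(nav.Name + " = " + nav.Value); // Title = T
    }
}
```

When a move fails, the navigator stays on the node it was on, so the code can simply test the return value and carry on.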




    MCAD Developing XML Web Services and Server Components with Visual C# .NET and the .NET Framework Exam Cram 2 (Exam Cram 70-320)
    ISBN: 789728974
    Year: 2002
    Pages: 179
