Navigating XML Documents with XPath

We’ve seen how easy it is to represent business data by using an XML document. The next step is to understand how an application processes that data. Of course, to process the data in an XML document, an application needs a way to navigate the document to retrieve the values from the document’s elements and attributes. This is what XPath is designed for. In this section I’ll cover some of the fundamentals of XPath. For a complete reference, view the specification at http://www.w3c.org/TR/xpath.

XPath provides a syntax for addressing the data in an XML document by treating the document as a tree of nodes. Each element, attribute, or value in the document is represented as a node in the tree, and XPath expressions are used to identify the node or nodes you want to process. To understand how this works, let’s take a simple XML document as an example:

 <?xml version="1.0"?>
 <Order OrderNo="1234">
     <OrderDate>2001-01-01</OrderDate>
     <Customer>Graeme Malcolm</Customer>
     <Item>
         <Product ProductID="1" UnitPrice="18">Chai</Product>
         <Quantity>2</Quantity>
     </Item>
     <Item>
         <Product ProductID="2" UnitPrice="19">Chang</Product>
         <Quantity>1</Quantity>
     </Item>
 </Order>

Figure A1.1 shows a node tree that could represent this document.

Figure A1.1 - XML node tree
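To make the node tree concrete, here is a minimal Python sketch that parses the order document and lists its nodes with indentation. It uses the standard-library xml.etree.ElementTree module, which is not an XPath engine and exposes attributes and text as properties of an element rather than as separate nodes. The describe helper is a hypothetical name, and the ProductID values 1 and 2 are assumed.

```python
import xml.etree.ElementTree as ET

# The order document; ProductID values are assumed for illustration.
ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
    <Item>
        <Product ProductID="1" UnitPrice="18">Chai</Product>
        <Quantity>2</Quantity>
    </Item>
    <Item>
        <Product ProductID="2" UnitPrice="19">Chang</Product>
        <Quantity>1</Quantity>
    </Item>
</Order>"""


def describe(node, depth=0):
    """Return one line per element, attribute, and non-empty text value."""
    indent = "  " * depth
    lines = [f"{indent}element {node.tag}"]
    for name, value in node.attrib.items():
        lines.append(f"{indent}  attribute {name} = {value}")
    text = (node.text or "").strip()
    if text:
        lines.append(f"{indent}  text {text!r}")
    for child in node:  # iterating an element yields its child elements
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(ORDER_XML)  # root is the Order element
tree_lines = describe(root)
print("\n".join(tree_lines))
```

The printed outline starts at the Order element and nests OrderDate, Customer, and the two Item elements beneath it, which mirrors the shape of the node tree.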

You can use XPath expressions to define location paths to nodes in the tree or to return a node or set of nodes that meet specified criteria. The expressions can be absolute paths or they can be relative to the currently selected node (known as the context node).

Specifying a Location Path

You can express XPath location paths by using either unabbreviated or abbreviated syntax. Both syntaxes define the root of the document by using a forward slash (/) and allow forward and backward navigation through the nodes in the tree.

Absolute Location Paths

Let’s examine some absolute location paths in the node tree produced by the order document described earlier. To select the Order element node by using unabbreviated syntax, we would use the following XPath expression:

 /child::Order 

Translated into abbreviated syntax, this expression becomes:

 /Order 

To drill further down into the document, we could retrieve the Customer node by using the following unabbreviated XPath expression:

 /child::Order/child::Customer 

Here’s the abbreviated equivalent:

 /Order/Customer 
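As a rough illustration, the following Python sketch retrieves the same nodes with the standard-library xml.etree.ElementTree module. That module supports only a subset of abbreviated XPath and evaluates paths relative to an element, so /Order corresponds to the parsed root element and /Order/Customer is written as a path from that root. The ProductID values in the sample document are assumed.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
    <Item>
        <Product ProductID="1" UnitPrice="18">Chai</Product>
        <Quantity>2</Quantity>
    </Item>
    <Item>
        <Product ProductID="2" UnitPrice="19">Chang</Product>
        <Quantity>1</Quantity>
    </Item>
</Order>"""

# /Order: fromstring returns the document element, which is Order.
order = ET.fromstring(ORDER_XML)

# /Order/Customer: a child step evaluated from the Order element.
customer = order.find("Customer")

print(order.tag)       # Order
print(customer.text)   # Graeme Malcolm
```

A full XPath 1.0 processor would accept the absolute expressions exactly as written in the text; the translation to relative paths is a limitation of this particular library.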

If you want to retrieve an attribute node, you must indicate this by using the attribute keyword in unabbreviated syntax or the @ character in abbreviated syntax. To retrieve the OrderNo attribute of the Order element, use the following unabbreviated syntax:

 /child::Order/attribute::OrderNo 

The abbreviated syntax for the OrderNo attribute is

 /Order/@OrderNo 
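Python's standard-library xml.etree.ElementTree module does not accept an attribute step such as @OrderNo at the end of a path, so a comparable sketch selects the element first and then reads the attribute from it. This approximates /Order/@OrderNo rather than evaluating it directly; the document here is trimmed to the parts the example needs.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
</Order>"""

order = ET.fromstring(ORDER_XML)   # the Order element

# Equivalent in effect to /Order/@OrderNo: read the attribute value.
order_no = order.get("OrderNo")

# get() returns None when the attribute is absent, much as an XPath
# expression returns an empty node set for a missing attribute.
missing = order.get("ShipDate")

print(order_no)   # 1234
print(missing)    # None
```

Note that the attribute value comes back as a string, so any numeric use requires an explicit conversion.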

To retrieve descendant nodes—that is, nodes anywhere farther down the hierarchy—you can use the descendant keyword in unabbreviated syntax or a double slash (//) in abbreviated syntax. For example, to retrieve all the Product nodes in the order document, you could specify the following unabbreviated location path:

 /child::Order/descendant::Product 

The abbreviated equivalent is

 /Order//Product 
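The double slash is one of the abbreviations that Python's standard-library xml.etree.ElementTree module supports, so the descendant search can be sketched as follows. The path is written relative to the Order element (.//Product), and the ProductID values in the sample document are assumed.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
    <Item>
        <Product ProductID="1" UnitPrice="18">Chai</Product>
        <Quantity>2</Quantity>
    </Item>
    <Item>
        <Product ProductID="2" UnitPrice="19">Chang</Product>
        <Quantity>1</Quantity>
    </Item>
</Order>"""

order = ET.fromstring(ORDER_XML)

# /Order//Product: every Product element at any depth below Order.
product_names = [p.text for p in order.findall(".//Product")]

# A plain child step finds nothing, because Product is a grandchild
# of Order rather than a direct child.
direct_children = order.findall("Product")

print(product_names)     # ['Chai', 'Chang']
print(direct_children)   # []
```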

You can use wildcards to indicate nodes whose names aren’t relevant. For example, the asterisk (*) wildcard indicates that any node name can be used. The following unabbreviated location path selects all the child elements of Order:

 /child::Order/child::* 

The equivalent abbreviated syntax is

 /Order/* 
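The asterisk wildcard behaves the same way in Python's standard-library xml.etree.ElementTree module. This short sketch lists the names of the child elements of Order, with the path written relative to the Order element; the ProductID values in the sample document are assumed.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
    <Item>
        <Product ProductID="1" UnitPrice="18">Chai</Product>
        <Quantity>2</Quantity>
    </Item>
    <Item>
        <Product ProductID="2" UnitPrice="19">Chang</Product>
        <Quantity>1</Quantity>
    </Item>
</Order>"""

order = ET.fromstring(ORDER_XML)

# /Order/*: all child elements of Order, whatever their names.
child_names = [child.tag for child in order.findall("*")]

print(child_names)   # ['OrderDate', 'Customer', 'Item', 'Item']
```

Only direct children are returned, in document order; the Product and Quantity elements nested inside each Item are not included.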

Relative Location Paths

XPath location paths are often relative to a context node, in which case the path describes how to retrieve a node or set of nodes relative to the current one. For example, if the first Item element in the order document is the context node, the relative location path to retrieve the Quantity child element is

 child::Quantity 

In abbreviated syntax, the relative location path is

 Quantity 

Similarly, to retrieve the ProductID attribute of the Product child element, the location path is

 child::Product/attribute::ProductID 

This path translates to

 Product/@ProductID 
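Relative paths map naturally onto Python's standard-library xml.etree.ElementTree module, where every search starts from the element it is called on. In this sketch the first Item element plays the role of the context node. The attribute is read with get() because that module does not support a trailing @ProductID step, and the ProductID values are assumed.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
    <Item>
        <Product ProductID="1" UnitPrice="18">Chai</Product>
        <Quantity>2</Quantity>
    </Item>
    <Item>
        <Product ProductID="2" UnitPrice="19">Chang</Product>
        <Quantity>1</Quantity>
    </Item>
</Order>"""

order = ET.fromstring(ORDER_XML)

# The context node: find() returns the first matching Item element.
context = order.find("Item")

# Relative path "Quantity", evaluated from the context node.
quantity = context.find("Quantity").text

# Relative path "Product/@ProductID": select Product, then read the attribute.
product_id = context.find("Product").get("ProductID")

print(quantity)     # 2
print(product_id)   # 1
```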

To navigate back up the tree, use the parent keyword. The abbreviated equivalent for this keyword is a double period (..). For example, if the context node is the OrderDate element, the OrderNo can be retrieved from the Order element using the following location path:

 parent::Order/attribute::OrderNo 

Note that this syntax will return a value only if the parent node is called Order. To retrieve the OrderNo attribute from the parent regardless of its name, you have to use the following unabbreviated syntax:

 parent::*/attribute::OrderNo 

The abbreviated version is simpler because you don’t need to provide a specific identifier for the parent. The parent of the context node is simply referred to by using the double period, as shown here:

 ../@OrderNo 
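Navigating upward can be sketched in Python with the standard-library xml.etree.ElementTree module, with one caveat: its elements hold no reference to their parent, so the double period only works inside a path that starts from an ancestor. The example below shows that form, plus a hand-built parent map (parent_of is a hypothetical name) as an alternative when you already hold the OrderDate element.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
</Order>"""

order = ET.fromstring(ORDER_XML)

# Step down to OrderDate, then back up with "..", all in one path
# evaluated from the Order element.
parent = order.find("OrderDate/..")
order_no = parent.get("OrderNo")

# Alternative: build a child-to-parent lookup once, then use it from
# any element you already hold.
parent_of = {child: p for p in order.iter() for child in p}
order_date = order.find("OrderDate")
order_no_via_map = parent_of[order_date].get("OrderNo")

print(order_no)           # 1234
print(order_no_via_map)   # 1234
```

Both approaches ignore the parent's name, so they correspond to parent::* and the double period rather than to parent::Order.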

In addition, you can reference the context node itself by using either the self keyword or a single period. This can be useful in a number of circumstances, especially when you must determine the currently selected node.

Using Criteria in Location Paths

You can limit the nodes returned using an XPath expression by including search criteria in the location path. The criteria for the node or nodes to be returned are appended to the location path in square brackets.

For example, to retrieve all the Product elements with a UnitPrice attribute greater than 18, you can use the following XPath expression:

 /child::Order/child::Item/child::Product[attribute::UnitPrice>18] 

In abbreviated syntax, you can use the following expression:

 /Order/Item/Product[@UnitPrice>18] 
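Numeric comparisons inside square brackets need a full XPath processor. Python's standard-library xml.etree.ElementTree module supports only equality tests in its predicates, so this sketch selects the Product elements with a path and applies the greater-than test in Python instead. It approximates the expression above rather than evaluating it; the ProductID values are assumed.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
    <Item>
        <Product ProductID="1" UnitPrice="18">Chai</Product>
        <Quantity>2</Quantity>
    </Item>
    <Item>
        <Product ProductID="2" UnitPrice="19">Chang</Product>
        <Quantity>1</Quantity>
    </Item>
</Order>"""

order = ET.fromstring(ORDER_XML)

# Location path /Order/Item/Product, relative to the Order element.
products = order.findall("Item/Product")

# The criterion [@UnitPrice>18]: attribute values are strings, so
# convert to a number before comparing.
expensive = [p.text for p in products if float(p.get("UnitPrice")) > 18]

print(expensive)   # ['Chang']
```

Only Chang qualifies, because the comparison is strictly greater than 18 and Chai is priced at exactly 18.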

The criteria can include a location path relative to the node being tested, so you can use nodes from elsewhere in the hierarchy in your criteria. The following example retrieves the Item nodes where the Product child element has a ProductID attribute of 1:

 /child::Order/child::Item[child::Product/attribute::ProductID=1] 

The abbreviated syntax for this expression is shown here:

 /Order/Item[Product/@ProductID=1] 
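Python's standard-library xml.etree.ElementTree module cannot evaluate a nested path inside a predicate, but a similar result can be sketched by matching the Product element with an attribute test and then stepping back up to its parent Item. This is a workaround under that library's XPath subset, not a direct translation, and the ProductID values in the sample document are assumed.

```python
import xml.etree.ElementTree as ET

ORDER_XML = """<?xml version="1.0"?>
<Order OrderNo="1234">
    <OrderDate>2001-01-01</OrderDate>
    <Customer>Graeme Malcolm</Customer>
    <Item>
        <Product ProductID="1" UnitPrice="18">Chai</Product>
        <Quantity>2</Quantity>
    </Item>
    <Item>
        <Product ProductID="2" UnitPrice="19">Chang</Product>
        <Quantity>1</Quantity>
    </Item>
</Order>"""

order = ET.fromstring(ORDER_XML)

# Find Product elements whose ProductID equals '1', then use ".." to
# return the Item elements that contain them.
items = order.findall("Item/Product[@ProductID='1']/..")

# The same filter written out in Python, for comparison.
items_filtered = [
    item for item in order.findall("Item")
    if item.find("Product").get("ProductID") == "1"
]

quantities = [item.find("Quantity").text for item in items]
print(quantities)   # ['2']
```

The attribute test compares strings here, whereas the XPath expression in the text compares the attribute with the number 1; for this document both select the same Item.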

Now that you understand how to use XPath expressions to locate data in an XML document, you’re ready to examine one of the most commonly used XML-related technologies, Extensible Stylesheet Language (XSL).



Programming Microsoft SQL Server(TM) 2000 with XML (Pro-Developer)
ISBN: 0735613699
EAN: 2147483647
Year: 2005
Pages: 89
