XPath 1.0

XPath 1.0 was published as a W3C Recommendation on the same day as XSLT 1.0: 16 November 1999. The two specifications were necessarily closely related because of the intimate way in which XPath expressions are embedded in XSLT stylesheets. However, XPath was deliberately published as a free-standing document, with the expectation that it could be used in many contexts other than XSLT. In fact, the original decision to make XPath separate from XSLT was motivated by the fact that XSLT and XPointer (the hyperlink format used by the XLink specification for document linking) were developing different languages that had a high degree of functional overlap, and everyone agreed that it would be better if W3C defined a single basic language for addressing into XML documents.

The decision to make XPath separate has been justified by subsequent events. Many implementers have provided XPath implementations that are either free-standing or coupled with an implementation of either the Document Object Model (DOM) [1] or one of the other tree-based XML models, such as JDOM. [2] Subsets of XPath have been adopted by other specifications in the XML family, such as XML Schema. And, of course, XPath now forms a core subset of XQuery.

[1] DOM: the Document Object Model, see http://www.w3.org/TR/DOM.

[2] JDOM: a variant of the DOM, designed more specifically for Java. See http://www.jdom.org/.

The central construct of XPath, which gave the language its name, is the path expression, which uses a sequence of steps, separated by / characters, to address nodes within the tree representation of an XML document. The syntax is derived from the syntax of UNIX filenames or URIs, but this is deceptive, because the detailed semantics are much more powerful. Semantically, each step in a path expression actually has three parts:

  • An axis, which describes the relationship to be traversed. For example, it may select the children of the context node, the parent of the context node, or its ancestors, descendants, or siblings. Because the child axis is the one used most frequently, it is the default when no other axis is named.

  • A node test, which places constraints on the names or kinds of nodes to be selected. For example, it might select all elements, or only attributes called code.

  • Optionally, one or more predicates, which place further restrictions on the sequence of nodes to be selected. These restrictions may depend on the content of the nodes, or on their position in the sequence of nodes. They may also contain further path expressions, so that the condition for selecting nodes depends on a further complex traversal of the tree.

Thus a path expression such as

 /book/*[1]/@id 

consists of three steps: the first step implicitly uses the child axis to select elements named book; the second selects the first child element regardless of its name; and the third uses the attribute axis (denoted by @) to select attribute nodes named id.
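To make the three steps concrete, the following minimal sketch evaluates /book/*[1]/@id by hand, one step at a time, using Python's standard xml.etree.ElementTree module. It is an illustration only, not an XPath engine: the sample document and the function name first_child_id are assumptions made for this example, and the code handles only this one expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the names "book" and "id"
# come from the path expression discussed in the text.
SAMPLE = """
<book>
  <title id="t1">XPath Basics</title>
  <chapter id="c1"/>
  <chapter id="c2"/>
</book>
"""

def first_child_id(xml_text):
    """Hand-evaluate /book/*[1]/@id and return the selected attribute values."""
    root = ET.fromstring(xml_text)  # the document element
    # Step 1: child axis, name test "book", starting from the document root.
    books = [root] if root.tag == "book" else []
    # Step 2: child axis, any element, predicate [1] (first child element).
    firsts = [list(book)[0] for book in books if len(book) > 0]
    # Step 3: attribute axis, name test "id".
    return [e.get("id") for e in firsts if e.get("id") is not None]

print(first_child_id(SAMPLE))
```

Each step takes the list produced by the previous step as its input, which mirrors the way a path expression is evaluated from left to right.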

Like filenames and URIs, path expressions may be absolute or relative. Relative path expressions select nodes starting at a point (the context node) that is, in effect, an implicit parameter to the path expression. Absolute path expressions select from the document root node (though it is rather misleading to call them "absolute," since there may be several documents around, and the selection of a particular document is again an implicit parameter).

The biggest difference between XPath path expressions and the filenames or URIs that they resemble is that each step selects a set of nodes, not a single node. Each step is applied to all the nodes selected by the previous step. XPath therefore shares with SQL the characteristic that it is always processing sets (of nodes in the case of XPath, of tuples in the case of SQL), never individual nodes one at a time.
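The set-at-a-time behavior can be sketched in a few lines of Python. The example below is an assumption-laden illustration rather than a real XPath implementation: the helper child_step and the sample library document are hypothetical, and only the child axis with a simple name test is modeled. It evaluates the relative path book/chapter with the library element as the context node.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data used only for this illustration.
LIBRARY = """
<library>
  <book><chapter n="1"/><chapter n="2"/></book>
  <book><chapter n="3"/></book>
</library>
"""

def child_step(nodes, name):
    """Apply one child-axis step with a name test to every node in `nodes`."""
    selected = []
    for node in nodes:
        # Iterating over an Element yields its child elements in document order.
        selected.extend(child for child in node if child.tag == name)
    return selected

root = ET.fromstring(LIBRARY)
books = child_step([root], "book")        # first step: two book elements
chapters = child_step(books, "chapter")   # second step: applied to both books
print([c.get("n") for c in chapters])
```

The second step is applied to both book elements, so the result combines the chapters of every book rather than those of a single one. A true XPath node-set also contains no duplicates; with the child axis applied to distinct parents, none can arise, so the sketch does not need to remove them.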

As well as path expressions that select nodes, XPath 1.0 also has a range of operators and functions for computing values. For example, count(/book/chapter) returns the number of nodes selected by the path expression /book/chapter, while substring(@desc, 1, 1) returns the first character of the desc attribute of the context node. These operators and functions use just three datatypes in addition to the node-sets that are manipulated by path expressions: strings, Booleans, and numbers. Numeric arithmetic is all based on double-precision floating point. When operations are applied to values of the wrong type, implicit conversions take place; for example, using a string as input to an addition causes no problem, so long as the string actually contains a number. This aspect of the language will be familiar to JavaScript programmers, who are accustomed to using functions and operators with very little regard to datatypes.
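The implicit string-to-number conversion and the 1-based substring function can be imitated in Python as follows. This is a simplified sketch: the function names xpath_number, xpath_add, and xpath_substring are hypothetical, xpath_add accepts strings only, and xpath_substring assumes integer arguments (the real function also rounds fractional arguments and handles NaN and infinities).

```python
import math
import re

# XPath 1.0 numeric strings: optional minus sign, digits with an optional
# fraction, no exponent and no leading plus sign.
_NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")

def xpath_number(text):
    """Convert a string the way XPath 1.0 number() does; NaN if not numeric."""
    trimmed = text.strip(" \t\r\n")
    if _NUMBER.fullmatch(trimmed):
        return float(trimmed)
    return math.nan

def xpath_add(left, right):
    """Model the + operator on two strings: convert both, then add as doubles."""
    return xpath_number(left) + xpath_number(right)

def xpath_substring(text, start, length):
    """Model substring() for integer arguments; positions are 1-based."""
    first = max(start, 1)
    end = start + length  # 1-based position just past the last character
    return text[first - 1:max(end - 1, first - 1)]

print(xpath_add("3", " 4 "))           # both strings contain numbers
print(xpath_add("3", "four"))          # a non-numeric string yields NaN
print(xpath_substring("hello", 1, 1))  # first character
```

A string that does not contain a number is not rejected with an error; it converts to NaN, and the NaN then propagates through the arithmetic.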



XQuery from the Experts: A Guide to the W3C XML Query Language