Understanding XPath

You already know, for example, that you can assign . to a select attribute to refer to the current node. However, . is not a valid match pattern; it's an XPath abbreviation for self::node(). Match patterns are restricted to only two axes, child and attribute, but XPath has thirteen axes, including self. You'll see all those axes in this chapter, as well as an example of each at work.
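As a rough illustration of the abbreviations just mentioned, here is a minimal sketch using Python's standard xml.etree.ElementTree module. ElementTree implements only a subset of XPath and accepts the abbreviated forms (., .., @) rather than spelled-out axes such as self::node(), so treat it as an approximation of a full XPath 1.0 processor. The sample document and its element and attribute names (PLANETS, PLANET, NAME, COLOR) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the names are illustrative only.
doc = ET.fromstring(
    "<PLANETS>"
    "<PLANET COLOR='red'><NAME>Mars</NAME></PLANET>"
    "<PLANET COLOR='blue'><NAME>Earth</NAME></PLANET>"
    "</PLANETS>"
)

# "." addresses the context node itself (the abbreviation of self::node()).
context = doc.find(".")

# A bare name is a step along the child axis: PLANET children of the context node.
planets = doc.findall("PLANET")

# "@" abbreviates the attribute axis; here it appears inside a predicate.
blue = doc.findall("PLANET[@COLOR='blue']")

# ".." steps to the parent, which the child and attribute axes cannot do.
parents = doc.findall("PLANET/NAME/..")

print(context.tag, len(planets), blue[0].find("NAME").text, len(parents))
```

The first two paths use only what match patterns also allow (child steps and attributes); the last one needs the parent axis, which is available in XPath expressions but not in match patterns.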

Formally speaking, XPath enables you to refer to specific sections of XML documents; it's a language for addressing the various parts of such documents. XPath is what you use to indicate what part of a document you want to work with. W3C says of XPath:

The primary purpose of XPath is to address parts of an XML document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and Booleans. XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.

This quotation comes from the XPath 1.0 specification. Note that although the primary purpose of XPath is to address parts of XML documents, it also supports syntax to work with strings, numbers, and Boolean true/false values; that support is also very useful by itself, as you'll see.

Currently, XPath version 1.0 is the standard, but the requirements for XPath 2.0 have been released. There are no drafts of XPath 2.0 yet, just a list of what W3C plans to put into it. An overview at the end of this chapter looks at that list. You can find the primary XPath resources in two places:

  • The XPath 1.0 specification. You use XPath to locate and point to specific sections and elements in XML documents so that you can work with them. www.w3.org/TR/xpath

  • The XPath 2.0 requirements. XPath is being updated to offer more support for XSLT 2.0, primarily support for XML schemas. www.w3.org/TR/xpath20req

For more on XPath, see Inside XML. You might also want to take a look at these XPath tutorials:

  • www.zvon.org/xxl/XPathTutorial/General/examples.html

  • www.pro-solutions.com/tutorials/xpath/

The match patterns you've seen so far have returned node sets that you can loop over or match, but XPath is more general than that. In addition to node sets, XPath expressions can also return numbers, Boolean (true/false) values, and strings. Understanding XPath means understanding XPath expressions, and only one kind of XPath expression (although a very important kind) returns node sets that locate sections of a document. Other XPath expressions return other kinds of data, as you'll see.

The full syntax of XPath expressions is given in the XPath specification, and I include it here for reference. As it does for match patterns, W3C uses Extended Backus-Naur Form (EBNF) notation to give the formal definition of XPath expressions. (You can find an explanation of this grammar in www.w3.org/TR/REC-xml, section 6.) The following list includes the EBNF notations you need:

  • ::= means "is defined as"

  • + means "one or more"

  • * means "zero or more"

  • | means "or"

  • - means "not"

  • ? means "optional"

Also, note that when an item is quoted with single quotation marks, as in 'ancestor' or '::', that item is meant to appear in an expression literally (like ancestor::PLANET), as are items named literals. Here's the formal definition of an XPath expression (named Expr in this definition) in full:

    Expr ::= OrExpr
    OrExpr ::= AndExpr | OrExpr 'or' AndExpr
    AndExpr ::= EqualityExpr | AndExpr 'and' EqualityExpr
    EqualityExpr ::= RelationalExpr | EqualityExpr '=' RelationalExpr
        | EqualityExpr '!=' RelationalExpr
    RelationalExpr ::= AdditiveExpr | RelationalExpr '<' AdditiveExpr
        | RelationalExpr '>' AdditiveExpr | RelationalExpr '<=' AdditiveExpr
        | RelationalExpr '>=' AdditiveExpr
    AdditiveExpr ::= MultiplicativeExpr | AdditiveExpr '+' MultiplicativeExpr
        | AdditiveExpr '-' MultiplicativeExpr
    MultiplicativeExpr ::= UnaryExpr | MultiplicativeExpr MultiplyOperator UnaryExpr
        | MultiplicativeExpr 'div' UnaryExpr | MultiplicativeExpr 'mod' UnaryExpr
    UnaryExpr ::= UnionExpr | '-' UnaryExpr
    MultiplyOperator ::= '*'
    UnionExpr ::= PathExpr | UnionExpr '|' PathExpr
    PathExpr ::= LocationPath | FilterExpr
        | FilterExpr '/' RelativeLocationPath
        | FilterExpr '//' RelativeLocationPath
    LocationPath ::= RelativeLocationPath | AbsoluteLocationPath
    AbsoluteLocationPath ::= '/' RelativeLocationPath?
        | AbbreviatedAbsoluteLocationPath
    RelativeLocationPath ::= Step | RelativeLocationPath '/' Step
        | AbbreviatedRelativeLocationPath
    AbbreviatedAbsoluteLocationPath ::= '//' RelativeLocationPath
    AbbreviatedRelativeLocationPath ::= RelativeLocationPath '//' Step
    Step ::= AxisSpecifier NodeTest Predicate* | AbbreviatedStep
    AxisSpecifier ::= AxisName '::' | AbbreviatedAxisSpecifier
    AxisName ::= 'ancestor' | 'ancestor-or-self' | 'attribute' | 'child'
        | 'descendant' | 'descendant-or-self' | 'following'
        | 'following-sibling' | 'namespace' | 'parent'
        | 'preceding' | 'preceding-sibling' | 'self'
    AbbreviatedAxisSpecifier ::= '@'?
    NodeTest ::= NameTest | NodeType '(' ')'
        | 'processing-instruction' '(' Literal ')'
    NameTest ::= '*' | NCName ':' '*' | QName
    NodeType ::= 'comment' | 'text' | 'processing-instruction' | 'node'
    Predicate ::= '[' PredicateExpr ']'
    PredicateExpr ::= Expr
    FilterExpr ::= PrimaryExpr | FilterExpr Predicate
    PrimaryExpr ::= VariableReference | '(' Expr ')' | Literal
        | Number | FunctionCall
    VariableReference ::= '$' QName
    Number ::= Digits ('.' Digits?)? | '.' Digits
    Digits ::= [0-9]+
    FunctionCall ::= FunctionName '(' ( Argument ( ',' Argument )* )? ')'
    FunctionName ::= QName - NodeType
    Argument ::= Expr
    AbbreviatedStep ::= '.' | '..'
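To see how the EBNF notation reads in practice, it can help to translate one small production by hand. The sketch below takes the two rules Number ::= Digits ('.' Digits?)? | '.' Digits and Digits ::= [0-9]+ and rewrites them as a Python regular expression. The names DIGITS, NUMBER, and is_xpath_number are illustrative and not part of any XPath library, and the regular expression covers only these two productions, not the grammar as a whole.

```python
import re

# Digits ::= [0-9]+
DIGITS = r"[0-9]+"

# Number ::= Digits ('.' Digits?)? | '.' Digits
#   ( ... )?  an optional group     -> (?: ... )?
#   |         "or"                  -> |
#   '.'       a literal character   -> \.
NUMBER = re.compile(rf"(?:{DIGITS}(?:\.(?:{DIGITS})?)?|\.{DIGITS})")


def is_xpath_number(text: str) -> bool:
    """Return True if text is exactly one Number token under the rules above."""
    return NUMBER.fullmatch(text) is not None


for candidate in ["12", "12.", "12.5", ".5", ".", "-1", "1e3"]:
    print(repr(candidate), is_xpath_number(candidate))
```

Note that "-1" is rejected: in the grammar the minus sign belongs to UnaryExpr ('-' UnaryExpr), not to Number, and there is no production for exponent notation such as 1e3.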

As you can see, there's a lot to this specification, including calls to XPath functions (which you'll see in the next chapter). The best way to understand XPath expressions is to organize them by the data types they can return.



Inside XSLT
ISBN: B0031W8M4K
EAN: N/A
Year: 2005
Pages: 196