The Formal Definition | NetBeans™ IDE Field Guide: Developing Desktop, Web, Enterprise, and Mobile Applications (2nd Edition)

The XSLT specification defines the way patterns are evaluated in terms of the XPath expression that is equivalent to the pattern. We've already seen that every pattern is a valid XPath expression. In fact, the rules are written so that the only XPath expressions that can be used as patterns are those that return a sequence of nodes. The idea is that you should be able to decide whether a node matches a pattern by seeing whether the node is in the sequence returned by the corresponding expression.

This then raises the question of context. The result of the XPath expression «title» is all the <title> children of the context node. Does that include the particular <title> element we are trying to match, or not? It obviously depends on the context. Since we want the pattern «title» to match every <title> element, we could express the rule by saying that the node we are testing (let's call it N) matches the pattern «title» if we can find a node (A, say) anywhere in the document, which has the property that when we take A as the context node and evaluate the expression «title», the node N will be selected as part of the result. In this example, we don't have to look very far to find node A: in fact, only as far as the parent node of N.

So the reason that a <title> element matches the pattern «title» is that it has a parent node, which when used as the context node for the expression «./child::title», returns a sequence that includes that <title> element. The pattern might be intuitive but, as you can see, the formal explanation is starting to get quite complex.

In an early draft of the XSLT 1.0 specification, the rules allowed almost any path expression to be used as a pattern. For example, you could define a pattern «ancestor::*[3]», which would match any node that was the great-grandparent of some other node in the document. It turned out that this level of generality was neither needed nor possible to implement efficiently, and so a further restriction was imposed, that the only axes you could use in a pattern were the child and attribute axes (the various axes are explained in Chapter 7 of XPath 2.0 Programmer's Reference). A consequence of this is that the only place where the XSLT processor has to look for node A (the one to use as a context node for evaluating the expression) is among the ancestors of the node being matched (N), including N itself.

This brings us to the formal definition of the meaning of a pattern. For the moment let's ignore the complications caused by parentless nodes; I return to these later (see page 500).

Important

The node $N matches a pattern PAT if $N is a member of the sequence selected by the expression «root($N)//(PAT)».

The way this rule is expressed has changed since XSLT 1.0, but the effect is the same. It has become possible to simplify the rule as a result of the generalization of path expressions that has happened in XPath 2.0. In XPath 1.0, XPath expressions such as «//(a|b)» or «//(/a)» were not allowed, so this rule would have made many patterns illegal.

Let's see what this rule means. We start with a node $N that we want to test against the pattern. First we find the root node of the tree containing $N. Then we look for all the descendant-or-self nodes of this root node, which means all the nodes in the tree except for attributes and namespaces. For each one of these nodes, we evaluate the pattern as if it were an XPath expression, using that node as the context node. If the result includes the original node $N, we have a match.
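The rule can be sketched in Python as a rough illustration. This is not how an XSLT processor works: it assumes the standard-library `xml.etree.ElementTree` module, which supports only a small subset of XPath 1.0 paths and has no document node. The helper names `wrap_as_document` and `matches`, the synthetic `document-node` wrapper element, and the sample document are all hypothetical and introduced only for this example.

```python
import xml.etree.ElementTree as ET

def wrap_as_document(root):
    """Hypothetical helper: ElementTree has no document node, so a synthetic
    wrapper element stands in for it as the parent of the root element."""
    doc = ET.Element("document-node")
    doc.append(root)
    return doc

def matches(node, pattern, doc):
    """Return True if `node` is selected when `pattern` is evaluated from
    some context node in the tree (an element-only analogue of the rule)."""
    # doc.iter() yields doc itself and every descendant element,
    # which plays the role of descendant-or-self::node() here.
    for context in doc.iter():
        # Evaluate the pattern as a relative path from this context node.
        if any(selected is node for selected in context.findall(pattern)):
            return True
    return False

root = ET.fromstring(
    "<book><title>T</title><chapter><title>C</title><para/></chapter></book>"
)
doc = wrap_as_document(root)
book_title = root.find("title")
chapter = root.find("chapter")
chapter_title = chapter.find("title")

print(matches(book_title, "title", doc))             # True
print(matches(chapter_title, "chapter/title", doc))  # True
print(matches(book_title, "chapter/title", doc))     # False
print(matches(root, "book", doc))                    # True
```

The wrapper element is what lets the outermost `<book>` element match the pattern «book»: without some parent to act as the context node, no relative path could ever select it.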

Let's see how this rule works by testing it against some common cases:

If the pattern is «title», then a node matches the pattern if the node is included in the result of the expression «root(.)//(title)», which is the same as «//title». This expression selects all <title> elements in the document, so a node matches the pattern if and only if it is a <title> element.
If the pattern is «chapter|appendix», then a node matches the pattern if it is selected by the expression «root(.)//(chapter|appendix)». This expression is equivalent to «//chapter|//appendix», and matches all <chapter> and <appendix> elements in the document.
If the pattern is «/», then a node matches if it is selected by the expression «root(.)//(/)». This rather strange XPath expression selects the root node of every descendant of the root node, and then eliminates duplicates, so it is actually equivalent to the expression «/», which selects the root node only. (There are complications if the root node is not a document node, for example if it is a parentless element. I will cover these complications later in the chapter.)
If the pattern is «chapter/title», then a node matches if it is selected by the expression «root(.)//(chapter/title)», which selects all <title> elements that are children of <chapter> elements.
If the pattern is «para[1]», then a node matches if it is selected by the expression «root(.)//(para[1])», which selects any <para> element that is the first <para> child of its parent.
If the pattern is «id('S123')», then a node matches if it is selected by the expression «root(.)//(id('S123'))», which is equivalent to the expression «id('S123')», and selects the element with an ID value of «S123».
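Some of these cases can be tried out with a short Python sketch. It assumes the standard-library `xml.etree.ElementTree` module, whose path language is only a subset of XPath 1.0: it has no union operator, no `id()` function, and no absolute paths on elements, so only a few of the cases are shown and the union is emulated by concatenating two queries. The sample document, the `document-node` wrapper element, and the helper `selected_text` are hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<book>"
    "<title>Book</title>"
    "<chapter><title>One</title><para>a</para><para>b</para></chapter>"
    "<appendix><title>App</title><para>c</para></appendix>"
    "</book>"
)
# Synthetic stand-in for the document node, which ElementTree lacks.
doc = ET.Element("document-node")
doc.append(root)

def selected_text(pattern):
    """Text of the elements selected by './/' + pattern from the wrapper,
    an approximation of root(.)//(pattern), in document order."""
    return [e.text for e in doc.findall(".//" + pattern)]

print(selected_text("title"))          # ['Book', 'One', 'App']
print(selected_text("chapter/title"))  # ['One']
print(selected_text("para[1]"))        # ['a', 'c']

# Emulated union of the <chapter> and <appendix> elements.
union = doc.findall(".//chapter") + doc.findall(".//appendix")
print([e.tag for e in union])          # ['chapter', 'appendix']
```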

This means there is a theoretical algorithm for testing whether a given node N matches a pattern P, as follows: for each node, starting from N and working through its ancestors up to the root node, evaluate P as an XPath expression with that node as the context node. If the result is a sequence of nodes containing N, the pattern matches; otherwise keep trying until you get to the root.
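A minimal sketch of this ancestor walk, again assuming Python's standard-library `xml.etree.ElementTree` (a subset of XPath 1.0 only). ElementTree elements do not record their parents, so the sketch builds a parent map first. The function name `matches_by_ancestor_walk`, the `document-node` wrapper standing in for the document node, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def matches_by_ancestor_walk(node, pattern, doc):
    """Evaluate `pattern` with node, then each ancestor in turn, as the
    context node; succeed as soon as the result contains `node`."""
    # ElementTree has no parent pointers, so derive them from the tree.
    parent_of = {child: parent for parent in doc.iter() for child in parent}
    context = node
    while context is not None:
        if any(hit is node for hit in context.findall(pattern)):
            return True
        # Move up one level; the wrapper has no parent, which ends the loop.
        context = parent_of.get(context)
    return False

root = ET.fromstring(
    "<book><chapter><title>C</title><para>x</para><para>y</para></chapter></book>"
)
doc = ET.Element("document-node")  # synthetic stand-in for the document node
doc.append(root)

title = root.find("chapter/title")
first_para, second_para = root.findall("chapter/para")

print(matches_by_ancestor_walk(title, "title", doc))               # True
print(matches_by_ancestor_walk(title, "book/chapter/title", doc))  # True
print(matches_by_ancestor_walk(title, "para", doc))                # False
print(matches_by_ancestor_walk(first_para, "para[1]", doc))        # True
print(matches_by_ancestor_walk(second_para, "para[1]", doc))       # False
```

For the pattern «book/chapter/title» the walk only succeeds at the last step, when the wrapper that represents the document node becomes the context node.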

XSLT processors don't usually use this algorithm; it's there only as a way of stating the formal rules. The processor will usually be able to find a faster way of doing the test, which is just as well, since pattern matching would otherwise be prohibitively expensive.

Although the formal rules usually give the answer you would expect intuitively, there can be surprises. For example, you might expect the pattern «node()» to match any node, but it doesn't. The equivalent expression, «//(node())», is short for «root(.)/descendant-or-self::node()/child::node()», and the only nodes that this can select are nodes that are children of something. Since document nodes, attribute nodes, and namespace nodes are never children of another node (see the description of the tree model on page 48 in Chapter 2), they will never be matched by the pattern «node()».

Patterns Containing Predicates

The formal equivalence of patterns and expressions becomes critical when considering the meaning of predicates (conditions in square brackets), especially predicates that explicitly or implicitly use the position() and last() functions.

For example, the pattern «para[1]» corresponds to the expression «root(.)//(para[position()=1])». For each context node, this expression takes all the <para> children of that node, and then filters this sequence to remove all but the first (in document order). So the pattern «para[1]» matches any <para> element that is the first <para> child of its parent. Similarly the pattern «*[1][self::para]» matches any element that is the first child of its parent and that is also a <para> element, while «para[last()!=1]» matches any <para> element that is a child of an element with two or more <para> children.
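The difference between these three patterns can be made concrete with a small Python sketch. Python's standard-library `xml.etree.ElementTree` cannot evaluate «*[1][self::para]» or «para[last()!=1]», so each pattern is restated here as a plain predicate on a node and its parent, following the descriptions above. The function names, the `id` attributes, and the sample document are hypothetical; the root element is never tested because it has no parent in this simplified model.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<doc>"
    "<sec id='s1'><head/><para id='p1'/><para id='p2'/></sec>"
    "<sec id='s2'><para id='p3'/></sec>"
    "</doc>"
)
# ElementTree has no parent pointers, so derive them from the tree.
parent_of = {child: parent for parent in root.iter() for child in parent}

def para_siblings(node):
    """All <para> children of node's parent, in document order."""
    return [c for c in parent_of[node] if c.tag == "para"]

def matches_para_1(node):
    """para[1]: a <para> that is the first <para> child of its parent."""
    return (node in parent_of and node.tag == "para"
            and para_siblings(node)[0] is node)

def matches_first_child_para(node):
    """*[1][self::para]: the first child element, which is also a <para>."""
    return (node in parent_of and parent_of[node][0] is node
            and node.tag == "para")

def matches_para_last_ne_1(node):
    """para[last()!=1]: a <para> whose parent has two or more <para> children."""
    return (node in parent_of and node.tag == "para"
            and len(para_siblings(node)) >= 2)

def ids_matching(predicate):
    return [n.get("id") for n in root.iter() if predicate(n)]

print(ids_matching(matches_para_1))            # ['p1', 'p3']
print(ids_matching(matches_first_child_para))  # ['p3']
print(ids_matching(matches_para_last_ne_1))    # ['p1', 'p2']

# Cross-check para[1] against ElementTree's own positional predicate.
print([e.get("id") for e in root.findall(".//para[1]")])  # ['p1', 'p3']
```

Here `p1` matches «para[1]» but not «*[1][self::para]», because a `<head>` element precedes it among its parent's children.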