7.3 XPointer | Secure XML: The New Syntax for Signatures and Encryption

XPointer is the syntax you use for the most general addressing of parts of an XML object [XPointer]. When an HTTP URI references XML, any fragment specifier to select a portion of the XML is written in XPointer syntax. XPointer can also be called explicitly to extract a subset of data (see Chapter 19). Note that XPointer does not include any way to point into the DTD or XML declaration for a document.

XPointer extends XPath so that you can use it in the following ways:

As a fragment identifier in a URI reference
To locate information by string matching
To address points and ranges within XML

In XML Security URIs, you should rarely encounter anything other than very simple XPointers. The implementation might delegate URI retrieval and fragment specifier processing to separate code with somewhat unpredictable results. For this reason, the XML Security standards discourage use of XPointer fragment specifiers in favor of using the explicit "Transforms" mechanism provided. (While you can invoke the full power of XPointer in such transforms, full support for it is optional. Even full support of XPath, on which XPointer builds, is merely recommended, not mandatory.)

XPointer Encoding

When an XPointer contains characters that have special significance to XPointer but should be treated as data, each such character is prefixed with a circumflex ("^"). One example is an unbalanced parenthesis, either "(" or ")", appearing in a literal string. Because of this use, the circumflex itself is a special character, and every literal circumflex must be doubled ("^^"). For example,

 xpointer( string-range ( /., "f)^" ) )

would be encoded as follows:

 xpointer( string-range ( /., "f^)^^") )

The preceding rule applies only to general XPointer encoding. An XPointer must also be URI encoded (see Section 7.1.4) if it is used in a URI and XML encoded if it appears in XML. So, for example,

 xpointer(string-range(//example,"André :-)"))

would always be XPointer encoded for the unbalanced parenthesis in the "smiley" as follows:

 xpointer(string-range(//example,"André :-^)"))

If it appeared as such in an XML attribute value delimited by double quotes and encoded in US-ASCII, you would escape the double quotes and the accented "é":

 xpointer(string-range(//example,&quot;Andr&#xE9; :-^)&quot;))

If it appeared generally as a URI reference fragment specifier, you would encode the space, double quotes, circumflex, and accented "é":

 xpointer(string-range(//example,%22Andr%C3%A9%20:-%5E)%22))
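The encoding layers described above can be sketched in Python. This is a minimal illustration, not part of any XPointer library. The helper names escape_xpointer_literal and to_uri_fragment are hypothetical. The first helper escapes every parenthesis in the literal, which assumes that a processor accepts escaped balanced parentheses as well as unbalanced ones. The set of characters left unescaped in the URI step is an assumption chosen to mirror the example.

```python
from urllib.parse import quote


def escape_xpointer_literal(text: str) -> str:
    """Prefix each circumflex and parenthesis in a literal with '^'."""
    out = []
    for ch in text:
        if ch in "^()":
            out.append("^")
        out.append(ch)
    return "".join(out)


def to_uri_fragment(xpointer: str) -> str:
    """Percent-encode an XPointer for use as a URI fragment (UTF-8 based).

    The 'safe' set is an assumption: it leaves the structural characters
    of the expression readable and encodes everything else.
    """
    return quote(xpointer, safe="()/,:")


literal = escape_xpointer_literal("André :-)")
pointer = f'xpointer(string-range(//example,"{literal}"))'
fragment = to_uri_fragment(pointer)
```

XML escaping for an attribute value would be a third, independent step applied to the XPointer-encoded string rather than to the URI-encoded one.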

7.3.1 Forms of XPointer

XPointer has three possible forms:

Full XPointers
Bare names
Child sequences

Full XPointers

A full XPointer can be complicated, but you will rarely encounter the full form. It consists of a sequence of one or more parts that can optionally be separated by white space. Each part has the following format:

 scheme(string-with-balanced-parentheses)

The "xpointer" and "xmlns" schemes are defined below. The "string-with-balanced-parentheses" means any string in which all parentheses, both "(" and ")", are properly nested, except for those escaped with the circumflex ("^"), as described earlier. Each full XPointer part ends with the close parenthesis that matches the open parenthesis after the scheme name.

If multiple parts are present, they are evaluated from left to right until one succeeds. If all parts fail, then the full XPointer fails. As yet undefined schemes are permitted for future expansion. Encountering a scheme you do not understand is equivalent to a failure of that part. (This scheme-based system allows other data types than general XML to define their own schemes.)
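The part syntax can be illustrated with a small splitter. The following is a sketch under stated assumptions, not a conforming parser: the function name split_parts is hypothetical, scheme names are not validated, and a circumflex is simply taken to escape whatever character follows it.

```python
def split_parts(pointer: str) -> list[tuple[str, str]]:
    """Split a full XPointer into (scheme, data) pairs.

    Parentheses inside the data must balance unless escaped with '^'.
    """
    parts = []
    i, n = 0, len(pointer)
    while i < n:
        if pointer[i].isspace():          # optional white space between parts
            i += 1
            continue
        open_at = pointer.find("(", i)
        if open_at == -1:
            raise ValueError("expected scheme(...)")
        scheme = pointer[i:open_at].strip()
        depth, j = 1, open_at + 1
        while j < n and depth > 0:
            ch = pointer[j]
            if ch == "^":                 # skip the escaped character
                j += 2
                continue
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            j += 1
        if depth != 0:
            raise ValueError("unbalanced parentheses")
        parts.append((scheme, pointer[open_at + 1 : j - 1]))
        i = j
    return parts
```

The data strings are returned still XPointer-encoded. Removing the circumflex escapes would be a separate step before evaluation.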

You use the "xmlns" scheme to set up the namespace context for XPointer expression evaluations occurring farther to the right. The parenthesized string immediately after this scheme must consist of a namespace prefix, an equal sign, and a namespace URI. Do not use quotation marks around the URI. For example:

 xmlns(foo=http://bar.example/x)

The "xmlns" scheme adds its namespace declaration to the namespace context and then "fails" so that evaluation proceeds to the next part of the full XPointer to its right. If two "xmlns" parts try to bind the same prefix, the one evaluated later (the one farther to the right) overrides the first. Those "xmlns" parts that attempt to bind the "xml" prefix are ignored. Instead, the prefix is always bound to "http://www.w3.org/XML/1998/namespace"; this binding is always part of the namespace context for evaluating XPointer expressions.

The "xpointer" scheme does the obvious thing: it interprets the parenthesized string immediately after it as an XPointer expression. If the expression is evaluated without error and yields a nonempty location-set, then that result is the value of the entire full XPointer.

You can use the ability to provide a sequence of XPointer parts for various purposes. The following example shows general fail-over from one XPointer expression to another. It finds all "foo" elements that don't have a foo descendant or, if no such foo elements are present, all "bar" elements that do not have any children.

 xpointer(//foo[not(.//foo)])xpointer(//bar[not(./*)])
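The left-to-right evaluation with fail-over, including the behavior of the "xmlns" scheme, might be sketched as follows. This is an illustration only. The function name resolve is hypothetical, the parts are assumed to be already split into (scheme, data) pairs, and the evaluate callback stands in for a real XPointer expression evaluator, which this sketch does not provide. The callback is assumed to return an empty list when the expression fails or selects nothing.

```python
XML_NS = "http://www.w3.org/XML/1998/namespace"


def resolve(parts, evaluate):
    """Return the result of the first part that succeeds, or None.

    parts    -- list of (scheme, data) pairs, in left-to-right order
    evaluate -- hypothetical callback: evaluate(expression, namespaces) -> list
    """
    namespaces = {"xml": XML_NS}           # this binding is always present
    for scheme, data in parts:
        if scheme == "xmlns":
            prefix, _, uri = data.partition("=")
            prefix = prefix.strip()
            if prefix != "xml":            # attempts to rebind "xml" are ignored
                namespaces[prefix] = uri.strip()   # later bindings override
            continue                       # xmlns always "fails"; move right
        if scheme == "xpointer":
            result = evaluate(data, dict(namespaces))
            if result:                     # nonempty location-set: done
                return result
        # Any other scheme is not understood and counts as a failed part.
    return None
```

Passing a copy of the namespace dictionary to the callback keeps a later "xmlns" part from altering the context seen by an earlier expression.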

Bare Names

In comparison with a full XPointer, it is difficult to get much simpler than a bare name. A bare name is pretty much what it sounds like: just a token. It refers to the element with that token as an ID. In other words, the bare name fragment specifier

 #foo

is the same as the full XPointer fragment specifier

 #xpointer(id("foo"))

Child Sequences

A child sequence consists of a series of one or more decimal numbers preceded by a slash ("/") and separated by slashes. No white space is permitted. The sequence may be optionally prefixed with a bare name. Two example fragment specifiers are shown here:

 #/2/7/18/2/8
 #pi/3/14

Such sequences can only locate elements. They do so by using each number to index into the children of the element found by the previous step. The starting point is the root element, if the child sequence starts with a slash, or the element that the name as an ID specifies (if it starts with a name). The preceding examples are therefore equivalent to the following:

 #xpointer(/*[2]/*[7]/*[18]/*[2]/*[8])
 #xpointer(id("pi")/*[3]/*[14])
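The equivalence between the shorthand forms and full XPointers can be sketched as a simple rewriting function. The name shorthand_to_full is hypothetical, and the check on the bare name is a simplification that does not enforce the full XML name rules.

```python
import re

# Optional bare name, followed by zero or more "/number" steps.
_SHORTHAND = re.compile(r"^(?P<name>[^/()^\s]+)?(?P<steps>(?:/[1-9][0-9]*)*)$")


def shorthand_to_full(fragment: str) -> str:
    """Rewrite a bare name or child sequence as a full XPointer."""
    match = _SHORTHAND.match(fragment.lstrip("#"))
    if match is None or not (match["name"] or match["steps"]):
        raise ValueError(f"not a bare name or child sequence: {fragment!r}")
    start = f'id("{match["name"]}")' if match["name"] else ""
    # "/3/14" splits into ["", "3", "14"]; drop the empty first item.
    steps = "".join(f"/*[{n}]" for n in match["steps"].split("/")[1:])
    return f"xpointer({start}{steps})"
```

For instance, shorthand_to_full("#pi/3/14") yields the string xpointer(id("pi")/*[3]/*[14]), while a sequence with a trailing slash is rejected.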

7.3.2 The XPath Extensions

XPath deals only with nodes. XPointer extends XPath, however, so that it can handle more general locations. The locations permitted include a pointer into the middle of text as well as more general ranges, such as might result from a user clicking and dragging on a screen display of XML to include parts of two elements with different parents. In summary, the extensions to XPath have the following effects:

The concepts of a node and a node-set are extended to include a location and a location-set. In effect, the two location types of "point" and "range" have the same status as node types and appropriate extensions are made to node tests and the definitions of axes. While evaluating an expression, the context location is extended so that it can consist of a point or range. Also, you can use the XPath "[number]" predicate to select values from a set of points and ranges.
The extensions provide extended rules for establishing the XPath evaluation context.
Numerous additional functions are added, as listed in Section 7.3.3. A special-case extension applies to the expression syntax for the "range-to," as explained with that function's definition.
The root node may have multiple child elements. This principle allows XPointers into general external parsed entities, which might consist of multiple top-level elements.

Location Extension: Point

XPointer adds the "point" type to XPath, defined as follows:

A point is defined as an index and a container node. It always points before the first item in the container, between two items in the container, or after the last item in the container.
If the container is an element or root node, the items are its children.
If there are N children, an index of zero points just before the first child; an index of N points just after the last child; and an index of X, where 0 < X < N, points between child X and X + 1. Such a point into children is called a "node-point."
If a container does not have children but does have a string value, then the index points between characters in that string value. If the string value length is N, an index of zero points just before the first character; an index of N points just after the last character; and an index of X, where 0 < X < N, points between character X and X + 1. Such a point into text is called a "character-point."

You need to be careful about thinking of a "point" as just a location in the external representation of XML. For example, consider "<a>xyz</a>". It is an element node with a child text node. The point using this element as container and index 1 is the point just after the text node. The point using the text node as a container and index 3 is the point just after the last character in the text. Although the two are different points, a poorly designed user interface might display them indistinguishably on a computer screen.
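The distinction between the two points in this example can be made concrete with Python's xml.dom.minidom. This is a sketch only: minidom's node model is used as an approximation of the XPath data model, and the helper name make_point and the tuple representation of a point are assumptions made for illustration.

```python
from xml.dom import minidom, Node


def make_point(container, index):
    """Return (container, index, kind) after checking the index range."""
    if container.nodeType in (Node.ELEMENT_NODE, Node.DOCUMENT_NODE):
        # Items are the children of the container.
        limit, kind = len(container.childNodes), "node-point"
    else:
        # Items are the characters of the container's string value.
        limit, kind = len(container.nodeValue or ""), "character-point"
    if not 0 <= index <= limit:
        raise ValueError(f"index {index} outside 0..{limit}")
    return (container, index, kind)


doc = minidom.parseString("<a>xyz</a>")
a = doc.documentElement
text = a.firstChild

after_text_node = make_point(a, 1)     # just after the text node
after_last_char = make_point(text, 3)  # just after the "z"
```

The two results have different containers and different kinds, even though a display of the serialized XML would place both just before the end tag.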

A point location does not have an expanded name. Its string value is empty.

The XPath set of node tests is extended to include "point( )" so that points can be selected from a location-set. The axes of a point are location-sets defined as follows:

The "self::" axis contains the point itself.
The "parent::" axis contains the point's container node.
The "ancestor::" axis contains the point container node and its ancestors.
The "child::", "descendant::", "preceding-sibling::", "following-sibling::", "attribute::", and "namespace::" axes are empty.
Although they are not defined in the XPointer document, one would presume that the "descendant-or-self::" axis contains just the point itself, that the "ancestor-or-self::" axis is the union of the "self::" and "ancestor::" axes, and that the "following::" and "preceding::" axes are empty.

Location Extension: Range

XPointer adds to XPath the "range" type. A range is simply defined as two points: the start point and the end point of the range. The start point must not follow the end point, and both must appear in the same XML document. The range represents the XML content and structure between its points.

If the container node of one point of a range is an element, text, or root, then the container node of the other point must also be one of these three types. If the container node of one such point is any other type, then both the start and end point must reside within the same node.

For example, you can have a range that appears within the string value of a processing instruction, where both points have the processing instruction as their container node. Alternatively, for a range from a processing instruction to (and including) an immediately following element, the points of the range might have as their container nodes the parents of the processing instruction and element. You could not, however, have a range from inside the text content of a processing instruction to inside the text content of a following element.

A range with the same start and end point is called a collapsed range. A range location does not have an expanded name.

The string value of a range depends on the nature of its points. If both are character-points in the same container node, the string value is just as you would expect: the characters between the start and end points. Otherwise, the string value consists of the characters in text nodes for which the character is found after the start point and before the end point. For example, in

 <a>1#23<b attribute='value'>foo</b>xy#z</a>

the string value of a range from just after the first octothorpe ("#") to just before the second would simply be

 23fooxy

In the same example, the string value of the range from just before element "a" to just after element "a" is

 1#23fooxy#z
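The first of these string values can be reproduced with a short sketch built on xml.dom.minidom. It handles only ranges whose two points are character-points in text nodes, represented here as (text node, offset) pairs. The function names are hypothetical. The whole-element range is approximated by character-points at the very start and end of its text.

```python
from xml.dom import minidom


def text_nodes(node):
    """Yield text nodes below 'node' in document order (attributes excluded)."""
    if node.nodeType == node.TEXT_NODE:
        yield node
    for child in node.childNodes:
        yield from text_nodes(child)


def range_string_value(root, start, end):
    """Concatenate the characters between two character-points."""
    (start_node, start_off), (end_node, end_off) = start, end
    pieces, inside = [], False
    for t in text_nodes(root):
        lo, hi = 0, len(t.data)
        if t is start_node:
            inside, lo = True, start_off
        if not inside:
            continue
        if t is end_node:
            hi = end_off
        pieces.append(t.data[lo:hi])
        if t is end_node:
            break
    return "".join(pieces)


doc = minidom.parseString("<a>1#23<b attribute='value'>foo</b>xy#z</a>")
a = doc.documentElement
first, last = a.firstChild, a.lastChild

inner = range_string_value(a, (first, 2), (last, 2))   # between the two "#"
whole = range_string_value(a, (first, 0), (last, 4))   # all text in "a"
```

The attribute value never contributes, because attributes are not text node children of an element.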

The XPath set of node tests is extended to include "range( )" so that ranges can be selected from a location-set.

The axes of a range are the same as the axes of the start point of that range.

Covering Ranges

XPointer defines the concept of a covering range. A covering range that encompasses any type of location can be found as follows:

The covering range of a range is that range.
The covering range for a point is the collapsed range starting and ending with that point.
For the root node, the start and end points of the covering range have the root node as their container. The index of the start point is zero, and the index of the end point is the number of children of the root.
For an attribute or namespace node, the start and end points of the covering range have the attribute or namespace node as their container. The index of the start point is zero, and the index of the end point is the length of the string value of the attribute or namespace node.
For all other kinds of nodes, the start and end points of the covering range have the parent of that node as their container. The index of the start point is the number of preceding sibling nodes, and the index of the end point is one greater than the start point. Thus the covering range of an element runs from the node-point just before that element to the node-point just after it.
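The node cases of these rules can be sketched with xml.dom.minidom. The function name covering_range and the representation of a range as a pair of (container, index) tuples are assumptions for illustration. Point and range inputs are not modeled, and namespace nodes are omitted because minidom does not expose them as separate nodes.

```python
from xml.dom import minidom, Node


def covering_range(node):
    """Return ((container, start_index), (container, end_index)) for a node."""
    if node.nodeType == Node.DOCUMENT_NODE:          # the root node
        return (node, 0), (node, len(node.childNodes))
    if node.nodeType == Node.ATTRIBUTE_NODE:
        return (node, 0), (node, len(node.value))
    # All other nodes: a pair of points in the parent, around the node.
    parent = node.parentNode
    index = 0
    sibling = node.previousSibling
    while sibling is not None:                        # count preceding siblings
        index += 1
        sibling = sibling.previousSibling
    return (parent, index), (parent, index + 1)


doc = minidom.parseString("<a>1#23<b attribute='value'>foo</b>xy#z</a>")
a = doc.documentElement
b = a.childNodes[1]
attr = b.getAttributeNode("attribute")
```

Here covering_range(b) gives the points with indices 1 and 2 in "a", because one text node precedes "b" among the children of "a".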

Document Order

XPointer extends the XPath concept of "document order" to include points and ranges.

First, a "preceding node" is defined for all points as follows:

For a node-point with a nonzero index X, the preceding node is the Xth child of the container node.
For a node-point with a zero index, the preceding node is the container node, unless it has attributes or namespaces. In that case, it is the last attribute or namespace declaration.
For a character-point, the preceding node is its container node.

Using these definitions, you can find document orderings that XPath does not specify:

A node is located before a point if it is before or the same as the preceding node of that point. Otherwise, it is found after the point.
The document order of a node and a range matches the document order of that node and the start point of the range.
The document order of two points matches the document order of their preceding nodes. If they are identical, the point with the smaller index comes first. (If both the preceding node and the indices of the points are equal, they are the same.)
The document order of a point and a range matches the document order of that point and the start point of the range.
The document order of two ranges matches the document order of their start points, unless they have the same start point. In that case, it is the document order of their end points.
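The point-to-point and range-to-range rules amount to lexicographic comparisons, which Python dataclasses can express directly. In this sketch the preceding node of each point is assumed to be already known and is represented by its integer rank in document order. That precomputed rank, and the class names, are assumptions of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Point:
    preceding: int   # document-order rank of the point's preceding node
    index: int       # breaks ties when the preceding nodes are identical


@dataclass(frozen=True, order=True)
class Range:
    start: Point     # compared first
    end: Point       # compared only when the start points are the same


points = sorted([Point(4, 0), Point(3, 2), Point(3, 0)])
ranges = sorted([
    Range(Point(1, 1), Point(1, 2)),
    Range(Point(1, 0), Point(5, 0)),
    Range(Point(1, 0), Point(2, 0)),
])
```

With order=True, each class compares its fields in declaration order, which mirrors the rule of comparing preceding nodes before indices and start points before end points.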

Initialization of Evaluation Context

The evaluation of XPointer expressions occurs in the same way as the evaluation of XPath expressions, albeit with a few changes:

The context location is initialized to a root node. For a URI reference fragment specifier XPointer expression, it is the root node of the document that the rest of the URI specifies. For other XPointer uses (for example, the XPointer transform described in Chapter 19), the initial root node is specified by the application.
The context position and size are initialized to 1.
An empty set of variable bindings is used.
The library of functions consists of the XPath functions defined in Chapter 6 plus the XPointer functions described in Section 7.3.3.
A set of namespace declarations is provided, as described earlier in this chapter.

7.3.3 XPointer Functions

The following functions have been added to the core XPath function library for the evaluation of XPointer expressions. In this section, the function name appears in boldface, preceded by the data type of the result in italics. Parameters are represented by their data type in italics. Parameters are followed by a question mark when they are optional.

A location-set is a superset of a node-set. Nodes are also locations. Thus, wherever a parameter is shown as being of type location-set, you may supply a node-set without a data type mismatch occurring.

location-set end-point (location-set)

The result is a point for each location in the input as specified by the following rules:

For an input point, the output element is the same point.
For an input range, the output element is the end point of the range.
For a root or element node input, the output is the point just after the last child of the input. That is, the output point has a container node of the input node and an index of the number of children of the input.
For a text, comment, or processing instruction node, the output is the point just after the end of the text content. That is, the output point has a container of the input node and an index of the length of the string value of the input.
For an attribute or namespace node, the XPointer in which this function appears fails.

location-set here( )

This function fails if the XPointer where it appears is not in XML. If it is in XML, then the function returns a location-set with a single member. If the XPointer expression being evaluated occurred in a text node, then the function returns the parent element. Otherwise, it returns the node containing the XPointer, presumably an attribute or processing instruction node. (When an XPointer occurs as element content, it isn't actually in that element but rather appears in a text child of that element.)

location-set origin( )

This function provides addressing relative to the origin of the link traversed to reach the document containing the XPointer. It returns a location-set with a single member: the element from which the traversal was initiated. An error occurs if you invoke this function where no such traversal has occurred or the document from which traversal occurred is not XML. You cannot use this function in a URI reference fragment identifier where a URI is also provided, unless that URI identifies the same resource from which the traversal was initiated. See [Xlink] for more information on traversal.

location-set range (location-set)

This function returns the ranges covering all items in the input. A covering range is added to the output for each member of the input.

location-set range-inside (location-set)

This function returns the ranges covering the contents of all items in the input. For every input item that is a range or point, that range (or the collapsed range of the point) is added. For all other types of input item, a range is added with that item as the container node and a start point index of zero. The end point index is the number of children of that item or, if the input item is of a type that cannot have children, the length of the string value of the item.

location-set range-to (location-set)

Range-to is a special function in terms of the way in which it makes use of the context. For each location in the context, it returns a range from the start point of the context location to the end point found by evaluating its parameter with that context location. A special-purpose extension to the XPath syntax permits the use of a range-to in place of an axis specifier and node test in a location path step. For example, to obtain a range from the element with the ID "label1" to the element with the ID "label2" you can write the following code:

 xpointer(id("label1")/range-to(id("label2")))

As another example, if portions of a document have been marked by EdStart and EdEnd elements, ranges covering all such pairs could be found with the following code:

 xpointer(//EdStart/range-to(following::EdEnd[1]))

location-set start-point (location-set)

This function returns a point for each location in the input as specified by the following rules:

For an input point, the output element is the same point.
For an input range, the output element is the start point of the range.
For a root or element node input, the output is the point just before the first child of the input. That is, the output point has a container node of the input node and an index of zero.
For a text, comment, or processing instruction node, the output is the point just before the first character of the text content. That is, the output point has a container of the input node and an index of zero.
For an attribute or namespace node, the XPointer in which the function appears fails.
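The node cases of start-point and end-point can be sketched together, since they differ only in the index they return. The sketch below uses xml.dom.minidom as a stand-in for the XPath data model and represents a point as a (container, index) tuple. The function names and the use of ValueError to signal failure are assumptions. Point and range inputs are not modeled.

```python
from xml.dom import minidom, Node

_PARENTS = (Node.DOCUMENT_NODE, Node.ELEMENT_NODE)
_TEXTUAL = (Node.TEXT_NODE, Node.COMMENT_NODE,
            Node.PROCESSING_INSTRUCTION_NODE)


def start_point(node):
    """Point just before the first child or first character of 'node'."""
    if node.nodeType in _PARENTS or node.nodeType in _TEXTUAL:
        return (node, 0)
    raise ValueError("start-point fails for this node type")


def end_point(node):
    """Point just after the last child or last character of 'node'."""
    if node.nodeType in _PARENTS:
        return (node, len(node.childNodes))
    if node.nodeType in _TEXTUAL:
        return (node, len(node.data))
    raise ValueError("end-point fails for this node type")


doc = minidom.parseString("<a id='x'>xyz</a>")
a = doc.documentElement
text = a.firstChild
```

For the element "a", end_point returns index 1 (one child), whereas for its text node it returns index 3 (three characters).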

location-set string-range (location-set, string, number?, number?)

For each item in the input location-set, the function searches the string value of that item for the second parameter. For each nonoverlapping occurrence found, it adds a range to the output location-set. This range consists of two character-points encompassing the occurrence of the string if the optional numeric third and fourth parameters are absent.

If one numeric parameter is present, it specifies the position of the first character of the resulting range relative to the beginning of the matched string. A value of 1 indicates no adjustment.

If a second numeric parameter is present, it specifies the length of the resulting range in characters. The default, in the absence of a second numeric parameter, is that the resulting range extends to include the last matched character. If the numeric parameters are such that the resulting range would extend beyond either end of the string value, the XPointer part in which the function appears fails.
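The arithmetic of the two numeric parameters can be sketched on a plain Python string standing in for the string value of one input location. The function name string_range is reused here only for illustration. The results are (start, end) pairs of character-point indices, the search string is assumed to be nonempty, and raising ValueError models failure of the XPointer part. Rejecting a start that falls after the end is an additional assumption of this sketch.

```python
def string_range(value, needle, position=1, length=None):
    """Return (start, end) index pairs for nonoverlapping matches."""
    if not needle:
        raise ValueError("empty search string is not handled in this sketch")
    results = []
    found = value.find(needle)
    while found != -1:
        start = found + position - 1          # position 1 means no adjustment
        if length is None:
            end = found + len(needle)         # through the last matched char
        else:
            end = start + length
        if not 0 <= start <= end <= len(value):
            raise ValueError("range extends beyond the string value")
        results.append((start, end))
        # Resume after this match so that occurrences never overlap.
        found = value.find(needle, found + len(needle))
    return results
```

For example, string_range("abcabc", "bc", 2) selects only the "c" of each match, and string_range("abcabc", "b", 1, 2) selects each "b" together with the character after it.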