9.4 The XML Canonicalization Data Model | Secure XML: The New Syntax for Signatures and Encryption

XML canonicalization uses an extension of the XPath data model. For example, all adjacent text content in the input is combined into a single text node, and CDATA section boundaries are discarded.

The formal definitions of the standard XML canonicalizations state that they use an extension of the XPath data model and take two or three inputs. The first must be either an XPath node-set or a sequence of octets. The second is a Boolean flag that indicates whether to preserve comments. The third, which applies only to Exclusive XML Canonicalization, is a list of namespace prefixes to be treated inclusively. If the first parameter is an octet sequence, it is parsed as XML into an XPath node-set and so must be well-formed XML. See Figure 9-7.

Figure 9-7. Canonical XML flows
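The octet-sequence input and the comment flag can be tried out with a short Python sketch. It assumes Python 3.8 or later, whose standard library function xml.etree.ElementTree.canonicalize implements the later C14N 2.0 specification rather than the canonicalizations discussed in this chapter; for a small document like this one, the visible effects (parsing of well-formed input, merging of adjacent text, loss of CDATA boundaries, optional comments) are the same. The sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# String input is parsed as XML first, so it must be well formed.
doc = '<doc>one <![CDATA[<two>]]> three<!-- note --><empty/></doc>'

# Comment flag off (the default): comments are dropped from the output.
without_comments = ET.canonicalize(doc)

# Comment flag on: comments are preserved.
with_comments = ET.canonicalize(doc, with_comments=True)

print(without_comments)
print(with_comments)
```

In both results the CDATA section is gone: its content is merged with the neighboring text and escaped like any other character data, and the empty element is written as a start-tag and end-tag pair.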


XPath provides a "name( )" function for element and attribute nodes that returns the name of the node with the namespace prefix, if one is present. When multiple namespace prefixes are associated with the same namespace URI at a node, XPath makes it an implementation option whether the original parsed prefix, if any, is guaranteed to be returned by name( ). Note that XML canonicalization requires returning the original prefix. One reason for this mandate is that XPath, XPointer, and XSLT, which can be used in XMLDSIG transforms and elsewhere in XML applications, can search on and match exact prefix names. For this reason, XML canonicalization output, as described in Section 9.5, must always produce the same prefix for the same XML. Otherwise, digital signatures using canonicalization would break.

The specification does not require a full XPath implementation to support XML canonicalization, but one is recommended.

XPath was chosen as the data model for canonicalization for two reasons:

It retains namespace prefixes, an ability that appears to be essential to retaining data of significance to XPointer, XSLT, and similar processing.
XPath processing is a powerful tool in application filtering of data to be signed, provides a language for formally expressing the mandatory Enveloped Signature Transform, and serves as the foundation of XPointer and XSLT.

DOMHASH [RFC 2803] provides a different approach to canonicalization and hashing, although it is primarily of historical interest. DOMHASH is based on the DOM data model and was designed before XPath existed because IOTP v1.0 [RFC 2801, 2802] needed XML digital signatures. At that time, no standard for XML digital signatures existed. Future versions of IOTP are committed to using XMLDSIG.

9.4.1 Node-Set

Most XPath applications consider an XPath node to represent the entire document tree below that node (sometimes ignoring certain descendant node types, such as comments and processing instructions). In XML canonicalization, however, an XPath node indicates only that item and none of its children and, if it is an element, none of its attributes or namespace declarations. For example, the XPath node-set produced by the XPath expression

 id("foo")

would represent just the element with that ID type attribute. For canonical XML purposes, it would not include that element's contents, child elements, attributes, or namespace declarations, if any.

In the XPath Recommendation, the execution of XPath expressions produces XPath node-sets. It therefore makes sense to consider XML canonicalization to work that way when its input consists of a byte stream being parsed as XML. In that case, whether comments will be included in the output can be rolled into the XPath expression that calculates the node-set to be output:

Without comments:

 (//. | //@* | //namespace::*)[not(self::comment())]

With comments:

 (//. | //@* | //namespace::*)

The XMLDSIG security recommendation requires support for the inclusive Canonical XML without comments and recommends support for Canonical XML with comments (although later work found that Exclusive XML Canonicalization is more appropriate for most signatures). With XML Encryption, all the standard canonicalizations are considered optional. In these cases, if you implement XML canonicalization so that it takes an XPath node-set as input and uses even a simplified XPath processor, then you can easily control comments by using various XPath expressions. Such an implementation also makes it easy to achieve many other effects or document subsets. For example, you could produce a canonical version of an XML document with comments left in but processing instructions and a particular element (e.g., Elem1) removed with the following expression:

 (//. | //@* | //namespace::*)[not(self::processing-instruction() or self::Elem1)]

Note that applying an XPath expression to an XML node-set results in a selected XPath node-set that, in some sense, remains linked to all of the XPath nodes in the original set. Although the expression will have selected a subset of nodes, the resulting node-set can, until it leaves the XPath or equivalent environment, remain subject to XPath parent, child, sibling, or other operations that can recover all the original nodes. Application of an XPath expression to an XML object first converts the object to an XPath node-set, and the above observations then apply to this node-set.

You can use various techniques to select complete document subsets. For example, consider the following:

 id("foo")

To actually select the document subset whose apex is the element with ID "foo," you could use the following:

 (//. | //@* | //namespace::*)[count(id("foo") | ancestor-or-self::node()) = count(ancestor-or-self::node())]

This code selects all nodes and then applies the predicate inside the square brackets. The predicate allows only nodes for which the node with ID "foo" appears in the ancestor-or-self path (i.e., for which the node with ID "foo" is the node in question or an ancestor thereof). Thus it selects the entire subdocument of which the "foo" node is the apex.
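The counting trick in this predicate can be mirrored in a few lines of Python. The following is only a sketch under stated assumptions: it uses the standard library's ElementTree, handles element nodes only (no attribute, namespace, text, or comment nodes), and looks up the apex by an attribute literally named id instead of a DTD-declared ID type. The helper names ancestor_or_self and select_subtree are hypothetical.

```python
import xml.etree.ElementTree as ET

def ancestor_or_self(node, parent_of):
    """Return the set of elements on the path from node up to the root."""
    chain = set()
    while node is not None:
        chain.add(node)
        node = parent_of.get(node)
    return chain

def select_subtree(root, apex):
    """Keep the elements whose ancestor-or-self set already contains apex."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    selected = []
    for node in root.iter():  # iter() visits elements in document order
        chain = ancestor_or_self(node, parent_of)
        # Same test as the predicate: adding apex to the set must not
        # change its size, which holds only if apex is already in it.
        if len(chain | {apex}) == len(chain):
            selected.append(node)
    return selected

root = ET.fromstring('<r><a id="foo"><b/><c><d/></c></a><e/></r>')
apex = root.find(".//*[@id='foo']")  # stand-in for id("foo"); no DTD is consulted
selected_tags = [el.tag for el in select_subtree(root, apex)]
print(selected_tags)
```

For this sample document, the selection contains the a element and its descendants b, c, and d, while r and e are left out.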

9.4.2 Document Order

The XPath Recommendation describes XPath node-sets as unordered. Nevertheless, except for attribute and namespace nodes, XPath associates a document order with each of them. XML canonicalization uses the XPath definition of document order, which states that one syntactic item precedes another in document order if its first character appears before the first character of the other item. Note that XML canonicalization extends this document order by giving namespace and attribute nodes document order positions greater than that of their parent element, but less than the positions of any of its children. It also gives the attribute nodes of an element document order positions greater than those of the namespace nodes of the same element.
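The extended document order can be pictured as a simple tree walk: the element itself, then its namespace nodes, then its attribute nodes, then its children. The Python sketch below uses a hypothetical Element class invented for this illustration, not a real XPath implementation. It ignores namespace inheritance by child elements, and it orders the nodes within each namespace or attribute group by plain name as a simplification of the rules in Section 9.4.3.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    """Hypothetical minimal element model for illustration only."""
    name: str
    namespaces: dict = field(default_factory=dict)  # prefix -> namespace URI
    attributes: dict = field(default_factory=dict)  # name -> value
    children: list = field(default_factory=list)    # Element or str (text)

def in_document_order(node):
    """Yield (kind, name) pairs in the extended document order."""
    if isinstance(node, str):
        yield ("text", node)
        return
    yield ("element", node.name)          # the element comes first
    for prefix in sorted(node.namespaces):
        yield ("namespace", prefix)       # then its namespace nodes
    for name in sorted(node.attributes):
        yield ("attribute", name)         # then its attribute nodes
    for child in node.children:
        yield from in_document_order(child)  # children come last

tree = Element("a", {"p": "http://example.org/p"}, {"x": "1"},
               ["hi", Element("b", attributes={"y": "2"})])
order = list(in_document_order(tree))
for item in order:
    print(item)
```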

9.4.3 Alphabetic Order for Namespaces and Attributes

Alphabetic comparisons are based on each character's Unicode value. Names are considered left-justified, so a shorter name sorts before a longer name that starts with the same characters. For example, "foo" sorts before "foobar." This approach results in exactly the same ordering as left-justified sorting of the names in the UTF-8 encoding if you consider each octet to be an unsigned integer; that is, shorter UTF-8 byte sequences sort before longer ones that start with the same bytes. For example, the following three Unicode character strings ("W3", "©", and "©!") are in alphabetic order as Canonical XML defines it:

 U+0057 U+0033,  U+00A9,  U+00A9 U+0021

The ordering is the same for their UTF-8 encoding:

 0x57 0x33,  0xC2 0xA9,  0xC2 0xA9 0x21
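The equivalence of the two orderings is easy to check in Python, where strings compare by code point and bytes objects compare octet by octet as unsigned integers. The list of names is made up for this sketch and includes the examples from the text.

```python
names = ["©!", "foobar", "W3", "©", "foo"]

# Python compares str values code point by code point.
by_code_point = sorted(names)

# bytes values compare octet by octet, each octet as an unsigned integer.
by_utf8_octets = sorted(names, key=lambda s: s.encode("utf-8"))

print(by_code_point)
print(by_code_point == by_utf8_octets)
```

Both sorts place "W3" before "foo", "foo" before "foobar", and "©" before "©!".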

An element's namespace nodes are ordered alphabetically by the associated prefix name. Its attributes are ordered by namespace URI first, and then by attribute name. Unqualified attributes are ordered before those qualified with a namespace. Although it might seem to be more natural to sort by prefix and local name, sorting attributes by URI first was chosen to group together attributes with the same URI, even if they use different prefixes. This system more closely adheres to the spirit of XML namespaces.
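These two ordering rules can be expressed as sort keys. The Python sketch below assumes attribute names written in the "{uri}local" form that ElementTree uses for qualified names; the helper name attribute_sort_key and the sample names are hypothetical. An unqualified attribute is given an empty URI, which sorts before any real namespace URI, and the default namespace is represented by an empty prefix, which likewise sorts first.

```python
def attribute_sort_key(name):
    """Split a '{uri}local' attribute name into a (uri, local) sort key."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return (uri, local)
    return ("", name)  # unqualified attribute: the empty URI sorts first

attrs = ["{http://b.example/}x", "z", "{http://a.example/}y",
         "a", "{http://a.example/}b"]
ordered_attrs = sorted(attrs, key=attribute_sort_key)

# Namespace nodes sort by prefix; "" stands for the default namespace.
ordered_prefixes = sorted(["xsi", "", "ds"])

print(ordered_attrs)
print(ordered_prefixes)
```

Because the key starts with the namespace URI, the two attributes in http://a.example/ stay together no matter which prefixes the original document used for them.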