3.2 Going Deep: The XPath Data Model

In the XPath view of things, elements, attributes, text, comments, processing instructions, and even namespaces are represented internally as nodes connected in a tree shape. Some nodes, such as elements, may have child nodes, while others, such as attributes, have no children, as restricted by XML rules. A special node, called the root node, serves as the ultimate ancestor node.

XML Information Set Mapping

XPath 1.0 was completed in November 1999 with a section outlining the XPath Data Model and an appendix defining the mapping to the then-unfinished W3C specification called XML Information Set, or "infoset" for short. This formalized description of the XML Data Model, available at http://www.w3.org/TR/xml-infoset/, was completed in October 2001. Subsequently, an erratum to XPath, at http://www.w3.org/1999/11/REC-xpath-19991116-errata/, finalized the infoset mapping.

Example 3-2 shows a short XML document, and Figure 3-1 shows how that document would be represented by a tree of nodes.

In this example, note that neither the XML declaration nor the DOCTYPE declaration produces any nodes. Thus, these XML data structures are effectively invisible to XPath and, by extension, XForms. On the other hand, notice how each element node has two namespace nodes attached: one for the default namespace declaration (xmlns) on the root element, which applies throughout, and one for the built-in declaration of the xml prefix, as seen in the attribute xml:lang. Even a short document like this generates a surprisingly large number of nodes!

Example 3-2. A basic XML document, represented as text
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<?xml-stylesheet href="screen.css" type="text/css" media="screen"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://vlib.example.org/">vlib.example.org</a>.</p>
  </body>
</html>

Figure 3-1. A basic XML document, represented as nodes in the XPath data model

Each node has a set of properties that are exposed to XPath in various ways.

  • A name. Some nodes, such as the root node, comments, and text nodes, always have the empty string ("") as a name. Elements and attributes have an expanded name, a combination of a local name plus a namespace URI. Processing instructions use their target, which is not subject to namespace processing, as a name. Namespace nodes have the namespace prefix as a name.

  • A string value. Text nodes contain the text characters from the source XML, with line endings normalized to #xA as required by the XML specification. Text nodes always contain as much text as possible, so no two consecutive children will ever both be text nodes. For the same reason, the location of CDATA sections is not preserved. Comment nodes contain the full text of the comment (minus the <!-- and --> delimiters). Attributes contain the attribute value, and processing instructions contain the text after the initial target and whitespace, up to but not including the terminating ?>. Namespace nodes contain the namespace URI (or the empty string) as a string value. The root node and element nodes compute their string value by concatenating the string values of all descendant text nodes in document order.

  • Children. In theory, you can ask about children of any node, but only the root node and element nodes will actually have children.

  • A position relative to all other nodes. The overall ordering is called the document order.
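
The name and string-value rules above can be illustrated with a minimal Python sketch. It uses the standard library's xml.etree.ElementTree, which is not an implementation of the XPath data model but happens to mirror two of these rules: element names appear as expanded names, written {namespace-URI}local-name, and adjacent text and CDATA content is merged into one run. The sample document and the string_value helper are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Sample document (an assumption for illustration), with a CDATA section
# in the middle of the paragraph text.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">'
    '<body><p>Moved <![CDATA[to ]]><a href="http://vlib.example.org/">'
    "vlib.example.org</a>.</p></body></html>"
)


def string_value(element):
    """Hypothetical helper: approximate the XPath string value of an element
    by concatenating all descendant text in document order."""
    return "".join(element.itertext())


root = ET.fromstring(DOC)
p = root.find(f"{{{XHTML}}}body/{{{XHTML}}}p")

# ElementTree spells an expanded name as "{namespace-URI}local-name".
print(p.tag)
# The CDATA boundary is gone: "Moved " and "to " arrive as one run of text.
print(repr(p.text))
# Text inside the nested <a> element contributes to the string value of <p>.
print(string_value(p))
```

Note that ElementTree stores text as properties of elements rather than as separate text nodes, so this sketch only approximates the node structure described here.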

Additionally, some nodes have the following properties:

  • A parent. Every node but the special root node has exactly one parent.

  • Attributes. Elements may also contain attributes, which are treated specially and not considered children.

  • Namespaces. Namespace nodes are also treated specially, and are not considered to have a child relationship with the element node to which they attach.
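
As a rough illustration of how attributes and parents sit apart from the child relationship, here is a sketch using Python's xml.etree.ElementTree. This library is an assumption chosen for convenience and differs from the XPath data model in important ways: it has no separate root node, no namespace nodes, and no parent pointers, so the parent_of map below is built by hand.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# Shortened variant of the sample document (an assumption for illustration).
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">'
    "<head><title>Virtual Library</title></head><body/></html>"
)

root = ET.fromstring(DOC)

# Iterating over an element yields only child elements, never attributes.
child_names = [child.tag.split("}")[1] for child in root]

# Attributes live in a separate mapping keyed by expanded name.
lang = root.attrib[f"{{{XML_NS}}}lang"]

# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

title = root[0][0]
print(child_names)
print(lang)
print(parent_of[title] is root[0])
# The document element has no entry here; in the XPath data model its
# parent would be the root node, which ElementTree does not represent.
print(root in parent_of)
```

The namespace declaration on the root element does not appear in the attribute mapping either, which loosely parallels the way XPath keeps namespace nodes separate from both attributes and children.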

A collection of nodes without duplicates is referred to as a node-set.
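
The idea of a node-set can be sketched in Python as follows. The node_set helper and the sample document are hypothetical. The helper removes duplicates by node identity, not by value, and returns the result sorted in document order. The sorting is a convenience of this sketch and not part of the definition.

```python
import xml.etree.ElementTree as ET

# Sample document (an assumption for illustration).
DOC = "<list><item id='a'/><item id='b'/><item id='c'/></list>"


def node_set(root, *selections):
    """Hypothetical helper: merge selections into a duplicate-free list of
    nodes, ordered by position in the document."""
    # Position of every element in document order.
    order = {id(node): index for index, node in enumerate(root.iter())}
    # Keying by identity drops a node that was selected more than once.
    unique = {id(node): node for selection in selections for node in selection}
    return sorted(unique.values(), key=lambda node: order[id(node)])


root = ET.fromstring(DOC)
items = root.findall("item")
last_two = items[1:]
first_two = items[:2]

# Item "b" appears in both selections but only once in the result.
combined = node_set(root, last_two, first_two)
ids = [node.get("id") for node in combined]
print(ids)
```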



XForms Essentials
ISBN: 0596003692
Year: 2005
Pages: 117
Authors: Micah Dubinko
