Chapter 8. XSL Transformations (XSLT) | XML in a Nutshell, 2nd Edition

CONTENTS

8.1 An Example Input Document
8.2 xsl:stylesheet and xsl:transform
8.3 Stylesheet Processors
8.4 Templates and Template Rules
8.5 Calculating the Value of an Element with xsl:value-of
8.6 Applying Templates with xsl:apply-templates
8.7 The Built-in Template Rules
8.8 Modes
8.9 Attribute Value Templates
8.10 XSLT and Namespaces
8.11 Other XSLT Elements

The Extensible Stylesheet Language (XSL) is divided into two parts: XSL Transformations (XSLT) and XSL Formatting Objects (XSL-FO). This chapter describes XSLT. Chapter 13 covers XSL-FO.

XSLT is an XML application for specifying rules by which one XML document is transformed into another XML document. An XSLT document (that is, an XSLT stylesheet) contains template rules. Each template rule has a pattern and a template. An XSLT processor compares the elements and other nodes in an input XML document to the template-rule patterns in a stylesheet. When one matches, it writes the template from that rule into the output tree. When it's done, it may further serialize the output tree into an XML document or some other format, such as plain text or HTML.

This chapter describes the template rules and a few other elements that appear in an XSLT stylesheet. XSLT uses the XPath syntax to identify matching nodes. We'll introduce a few pieces of XPath here, but most of it will be covered in Chapter 9.

8.1 An Example Input Document

To demonstrate XSL transformations, we first need a document to transform. Example 8-1 shows the document used in this chapter. The root element is people, which contains two person elements. The person elements have roughly the same structure (a name followed by professions and hobbies) with some differences. For instance, Alan Turing has three professions, but Richard Feynman has only one. Feynman has a middle_initial and a hobby, but Turing doesn't. Still, these are clearly variations on the same basic structure. A DTD that permitted both of these would be easy to write.

Example 8-1. An XML document describing two people

<?xml version="1.0"?>
<people>
  <person born="1912" died="1954">
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
  <person born="1918" died="1988">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>P</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>

Example 8-1 is an XML document. For purposes of this example, it will be stored in a file called people.xml. It doesn't have a DTD; however, this is tangential. XSLT works equally well with valid and invalid (but well-formed) documents. This document doesn't use namespaces either, though it could. XSLT works just fine with namespaces. Unlike DTDs, XSLT does pay attention to the namespace URIs instead of the prefixes. Thus, it's possible to use one prefix for an element in the input document and different prefixes for the same namespace in the stylesheet and output documents.

8.2 xsl:stylesheet and xsl:transform

An XSLT stylesheet is an XML document. It can and generally should have an XML declaration. It can have a document type declaration, though most stylesheets do not. The root element of this document is either stylesheet or transform. These are synonyms for each other. You can use either stylesheet or transform as you prefer. There is absolutely no difference between them, aside from the name. They both have the same possible children and attributes. They both mean the same thing to an XSLT processor.

The stylesheet and transform elements, like all other XSLT elements, are in the http://www.w3.org/1999/XSL/Transform namespace. This namespace is customarily mapped to the xsl prefix so that you write xsl:transform or xsl:stylesheet rather than simply transform or stylesheet.

As well as the xmlns:xsl attribute declaring this prefix mapping, the root element must have a version attribute with the value 1.0. Thus, a minimal XSLT stylesheet, with only the root element and nothing else, is as shown in Example 8-2.

This namespace URI must be exactly correct. If even so much as a single character is wrong, the stylesheet processor will output the stylesheet itself instead of either the input document or the transformed input document. There's a reason for this (see Section 2.3 of the XSLT 1.0 specification, Literal Result Element as Stylesheet, if you really want to know), but the bottom line is that this weird behavior looks very much like a bug in the XSLT processor if you're not expecting it. If you ever do see your stylesheet processor spitting your stylesheet back out at you, the problem is almost certainly an incorrect namespace URI.

Internet Explorer 5.0 and 5.5 partially support a very old and out-of-date working draft of XSLT, as well as various Microsoft extensions to this old working draft. They do not support XSLT 1.0, and indeed no XSLT stylesheets in this book work in IE5. Stylesheets that are meant for Microsoft XSLT can be identified by their use of the http://www.w3.org/TR/WD-xsl namespace. IE6 supports both http://www.w3.org/1999/XSL/Transform and http://www.w3.org/TR/WD-xsl. Good XSLT developers don't use http://www.w3.org/TR/WD-xsl and don't associate with developers who do.

Example 8-2. A minimal XSLT stylesheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>

Perhaps a little surprisingly, this is a complete XSLT stylesheet; an XSLT processor can apply it to an XML document to produce an output document. Example 8-3 shows the effect of applying this stylesheet to Example 8-1.

Example 8-3. people.xml transformed by the minimal XSLT stylesheet

<?xml version="1.0" encoding="utf-8"?>
      Alan
      Turing
    computer scientist
    mathematician
    cryptographer
      Richard
      P
      Feynman
    physicist
    Playing the bongoes

You can see that the output consists of a text declaration plus the text of the input document. In this case, the output is a well-formed external parsed entity, but it is not itself a complete XML document.

Markup from the input document has been stripped. The net effect of applying an empty stylesheet, like Example 8-2, to any input XML document is to reproduce the content but not the markup of the input document. To change that, we'll need to add template rules to the stylesheet telling the XSLT processor how to handle the specific elements in the input document. In the absence of explicit template rules, an XSLT processor falls back on built-in rules that have the effect shown here.
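Because xsl:transform is a synonym for xsl:stylesheet, the minimal stylesheet of Example 8-2 could equally be written as in the following sketch. The t prefix is an arbitrary choice made here for illustration; it is meant to show that only the namespace URI matters, not the prefix.

```xml
<?xml version="1.0"?>
<!-- Same effect as Example 8-2, using the transform synonym.
     The prefix "t" is an arbitrary, illustrative choice; the
     processor looks only at the namespace URI it is mapped to. -->
<t:transform version="1.0"
             xmlns:t="http://www.w3.org/1999/XSL/Transform">
</t:transform>
```

Applied to Example 8-1, a conforming processor should treat this exactly as it treats Example 8-2.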

8.3 Stylesheet Processors

An XSLT processor is a piece of software that reads an XSLT stylesheet, reads an input XML document, and builds an output document by applying the instructions in the stylesheet to the information in the input document. An XSLT processor can be built into a web browser, just as MSXML is in Internet Explorer 6. It can be built into a web or application server, as in the Apache XML Project's Cocoon (http://xml.apache.org/cocoon). Or it can be a standalone program run from the command line, like Michael Kay's SAXON (http://saxon.sourceforge.net) or the Apache XML Project's Xalan (http://xml.apache.org/xalan-j/).

8.3.1 Command-Line Processors

The exact details of how to install, configure, and run the XSLT processor naturally vary from processor to processor. Generally, you have to install the processor in your path, or add its jar file to your class path if it's written in Java. Then you pass in the names of the input file, stylesheet file, and output file on the command line. For example, using Xalan, Example 8-3 is created in this fashion:

% java org.apache.xalan.xslt.Process -IN people.xml -XSL minimal.xsl -OUT 8-3.txt
========= Parsing file:D:/books/xian/examples/08/minimal.xsl ==========
Parse of file:D:/books/xian/examples/08/minimal.xsl took 771 milliseconds
========= Parsing people.xml ==========
Parse of people.xml took 90 milliseconds
=============================
Transforming...
transform took 20 milliseconds
XSLProcessor: done

For exact details, you'll need to consult the documentation that comes with your XSLT processor.

8.3.2 The xml-stylesheet Processing Instruction

XML documents that will be served directly to web browsers can have an xml-stylesheet processing instruction in their prolog telling the browser where to find the associated stylesheet for the document, as discussed in the last chapter. If this stylesheet is an XSLT stylesheet, then the type pseudoattribute should have the value application/xml. For example, this xml-stylesheet processing instruction says that browsers should apply the stylesheet found at the absolute URL http://www.oreilly.com/styles/people.xsl. Relative URLs can also be used.

<?xml version="1.0"?>
<?xml-stylesheet type="application/xml"
                 href="http://www.oreilly.com/styles/people.xsl"?>
<people>
  ...

Microsoft Internet Explorer uses type="text/xsl" for XSLT stylesheets. However, the text/xsl MIME media type has not been and will not be registered with the IANA. It is a figment of Microsoft's imagination. In the future, application/xslt+xml will probably be registered to identify XSLT stylesheets specifically.
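As a sketch of the relative-URL form, the following document points at a stylesheet stored alongside it. The filename people.xsl is an assumption made for illustration. The example uses the text/xsl type that Internet Explorer expects; for other browsers you would substitute the type discussed above.

```xml
<?xml version="1.0"?>
<!-- href is resolved relative to the location of this document;
     people.xsl is a hypothetical filename -->
<?xml-stylesheet type="text/xsl" href="people.xsl"?>
<people>
  <person born="1912" died="1954">
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
  </person>
</people>
```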

8.4 Templates and Template Rules

To control what output is created from what input, you add template rules to the XSLT stylesheet. Each template rule is represented by an xsl:template element. This element has a match attribute that contains an XPath pattern identifying the input it matches; it also contains a template that is instantiated and output when the pattern is matched. The terminology is a little tricky here: the xsl:template element is a template rule that contains a template. An xsl:template element is not itself the template.

The simplest match pattern is an element name. Thus, this template rule says that every time a person element is seen, the stylesheet processor should emit the text "A Person":

<xsl:template match="person">A Person</xsl:template>

Example 8-4 is a complete stylesheet that uses this template rule.

Example 8-4. A very simple XSLT stylesheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">A Person</xsl:template>
</xsl:stylesheet>

Applying this stylesheet to the document in Example 8-1 produces this output:

<?xml version="1.0" encoding="utf-8"?>
  A Person
  A Person

There were two person elements in the input document. Each time the processor saw one, it emitted the text "A Person." The whitespace outside the person elements was preserved, but everything inside the person elements was replaced by the contents of the template rule, which is called the template.

The text "A Person" is called literal data characters, which is a fancy way of saying plain text that is copied from the stylesheet into the output document. A template may also contain literal result elements, i.e., markup that is copied from the stylesheet to the output document. For instance, Example 8-5 wraps the text "A Person" in between <p> and </p> tags:

Example 8-5. A simple XSLT stylesheet with literal result elements

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <p>A Person</p>
  </xsl:template>
</xsl:stylesheet>

The output from this stylesheet is:

<?xml version="1.0" encoding="utf-8"?>
  <p>A Person</p>
  <p>A Person</p>

The <p> and </p> tags were copied from the stylesheet to the output. The only major restriction on the markup you may output is that it must be well-formed XML because the stylesheet must be well-formed XML. For instance, you cannot write a template rule like this:

<xsl:template match="person">
  A Person<p>
</xsl:template>

Here the <p> start-tag has no matching end-tag, and, therefore, the stylesheet is malformed. Any other markup you include in your XSLT stylesheet must be similarly well-formed. Empty-element tags must end with />; attribute values must be quoted; less-than signs must be escaped as &lt;; all entity references must be declared in a DTD except for the five predefined ones, and so forth. XSLT has no exceptions to the rules of well-formedness.
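A well-formed counterpart to the broken rule above might look like the following sketch. The br element and the line containing a less-than sign are invented purely to illustrate the rules just listed; they are not part of the chapter's running example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <!-- every start-tag has a matching end-tag -->
    <p>A Person</p>
    <!-- an empty element uses empty-element syntax in the stylesheet -->
    <br/>
    <!-- a less-than sign in text is written as an entity reference -->
    <p>1912 &lt; 1918</p>
  </xsl:template>
</xsl:stylesheet>
```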

8.5 Calculating the Value of an Element with xsl:value-of

Most of the time, the text that is output is more closely related to the text that is input than it was in the last couple of examples. Other XSLT elements can select particular content from the input document and insert it into the output document.

One of the most generally useful elements of this kind is xsl:value-of. This element calculates the string value of an XPath expression and inserts it into the output. The value of an element is the text content of the element after all the tags have been removed and entity and character references have been resolved. The element whose value is taken is identified by a select attribute containing an XPath expression.

For example, suppose you just want to extract the names of all the people in the input document. Then you might use a stylesheet like Example 8-6. Here the person template outputs only the value of the name child element of the matched person in between <p> and </p> tags.

Example 8-6. A simple XSLT stylesheet that uses xsl:value-of

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <p>
      <xsl:value-of select="name"/>
    </p>
  </xsl:template>
</xsl:stylesheet>

When an XSLT processor applies this stylesheet to Example 8-1, it outputs this text:

<?xml version="1.0" encoding="utf-8"?>
  <p>
      Alan
      Turing
    </p>
  <p>
      Richard
      P
      Feynman
    </p>
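One detail is worth a small illustration: when the select expression identifies more than one node, xsl:value-of in XSLT 1.0 takes the value of only the first of them in document order. The following variation on Example 8-6 is a sketch of that behavior, not a stylesheet from the original text.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <p>
      <!-- each person has one name child; its value is the text of
           all its descendants, whitespace included -->
      <xsl:value-of select="name"/>:
      <!-- Turing has three profession children, but in XSLT 1.0
           only the first, "computer scientist", is output here -->
      <xsl:value-of select="profession"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

To output every profession, you would apply templates to the profession children rather than take their value, as the next section describes.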

8.6 Applying Templates with xsl:apply-templates

By default, an XSLT processor reads the input XML document from top to bottom, starting at the root of the document and working its way down using preorder traversal. Template rules are activated in the order in which they match elements encountered during this traversal. This means a template rule for a parent will be activated before template rules matching the parent's children.

However, one of the things a template can do is change the order of traversal. That is, it can specify which element or elements should be processed next. It can specify that one or more elements should be processed in the middle of processing another element. It can even prevent particular elements from being processed. In fact, Examples 8-4 through 8-6 all implicitly prevent the child elements of each person element from being processed. Instead, they provide their own instructions about what the XSLT processor is and is not to do with those children.

The xsl:apply-templates element lets you make explicit your choice of processing order. Its select attribute contains an XPath expression telling the XSLT processor which nodes to process at that point in the output tree.

For example, suppose you want to list the names of the people in the input document; however, you want to put the last names first, regardless of the order in which they occur in the input document, and you don't want to output the professions or hobbies. First you need a name template that looks like this:

<xsl:template match="name">
  <xsl:value-of select="last_name"/>,
  <xsl:value-of select="first_name"/>
</xsl:template>

However, this alone isn't enough; if this were all there was in the stylesheet, not only would the output include the names, it would also include the professions and hobbies. You also need a person template rule that says to apply templates to name children only, but not to any other child elements like profession or hobby. This template rule does that:

<xsl:template match="person">
  <xsl:apply-templates select="name"/>
</xsl:template>

Example 8-7 shows the complete stylesheet.

Example 8-7. A simple XSLT stylesheet that uses xsl:apply-templates

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="name">
    <xsl:value-of select="last_name"/>,
    <xsl:value-of select="first_name"/>
  </xsl:template>
  <xsl:template match="person">
    <xsl:apply-templates select="name"/>
  </xsl:template>
</xsl:stylesheet>

When an XSLT processor applies this stylesheet to Example 8-1, this is output:

<?xml version="1.0" encoding="utf-8"?>
  Turing,
    Alan
  Feynman,
    Richard

The order of the template rules in the stylesheet doesn't matter. It's only the order of the elements in the input document that matters.
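The order of the xsl:apply-templates instructions inside a single template does matter, however. As a hypothetical variation on Example 8-7, the sketch below processes each person's professions before the name, even though name comes first in the input, and never processes hobby at all.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="name">
    <xsl:value-of select="last_name"/>,
    <xsl:value-of select="first_name"/>
  </xsl:template>
  <xsl:template match="person">
    <!-- professions first; no rule here matches profession, so
         the built-in rules copy their text to the output -->
    <xsl:apply-templates select="profession"/>
    <!-- then the name, handled by the rule above -->
    <xsl:apply-templates select="name"/>
    <!-- hobby is never selected, so it produces no output -->
  </xsl:template>
</xsl:stylesheet>
```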

Applying templates is also important when the child elements have templates of their own, even if you don't need to reorder the elements. For example, let's suppose you want a template rule for the root people element that wraps the entire document in an HTML header and body. Its template will need to use xsl:apply-templates to indicate where it wants the children of the root element to be placed. That template rule might look like this:

<xsl:template match="people">
  <html>
    <head><title>Famous Scientists</title></head>
    <body>
      <xsl:apply-templates select="person"/>
    </body>
  </html>
</xsl:template>

This template tells the XSLT processor to replace every people element in the input document (of which there is only one in Example 8-1) with an html element. This html element contains some literal character data and several literal result elements of which one is a body element. The body element contains an xsl:apply-templates element telling the XSLT processor to process all the person children of the current people element and insert the output of any matched templates into the body element of the output document.

If you'd rather apply templates to all types of children of the people element, rather than just person children, you can omit the select attribute as demonstrated in Example 8-8. You can also use the more complicated XPath expressions discussed in the next chapter to be more precise about which elements you want to apply templates to.

Example 8-8. An XSLT stylesheet that generates a complete HTML document

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="name">
    <p><xsl:value-of select="last_name"/>,
    <xsl:value-of select="first_name"/></p>
  </xsl:template>
  <xsl:template match="person">
    <xsl:apply-templates select="name"/>
  </xsl:template>
</xsl:stylesheet>

When an XSLT processor applies this stylesheet to Example 8-1, it outputs the well-formed HTML document shown in Example 8-9. Look closely at this example, and you may spot an important change that was not explicitly caused by the instructions in the stylesheet.

Example 8-9. The HTML document produced by applying Example 8-8 to Example 8-1

<html>
<head>
<title>Famous Scientists</title>
</head>
<body>
  <p>Turing,
    Alan</p>
  <p>Feynman,
    Richard</p>
</body>
</html>

The difference between Example 8-9 and all the previous output examples is that the text declaration has disappeared! Although there is an XSLT element you can use to specify whether you want a text declaration preceding your output (xsl:output), we haven't used that here. Instead, the XSLT processor noted that the root output element was html, and it adjusted itself accordingly. Since HTML output is such a common case, XSLT has special rules just to handle it. As well as omitting the text declaration, the processor will use HTML empty-element syntax like <br> instead of XML empty-element syntax like <br/> in the output document. (The input document and stylesheet must still be well-formed XML.) There are about half a dozen other changes the XSLT processor will make when it knows it's outputting HTML, all designed to make the output more acceptable to existing web browsers than is well-formed XML.
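If you would rather not rely on the processor noticing the html root element, the xsl:output element mentioned above lets you request an output method explicitly. The following is a minimal sketch using the method and indent attributes defined by XSLT 1.0; the simplified person rule is an assumption made to keep the example short.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- ask for HTML serialization explicitly instead of relying
       on the root output element being html -->
  <xsl:output method="html" indent="yes"/>
  <xsl:template match="people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <xsl:apply-templates select="person"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="person">
    <p><xsl:value-of select="name"/></p>
  </xsl:template>
</xsl:stylesheet>
```

Going the other way, an xsl:output element with method="xml" and omit-xml-declaration="yes" asks for XML output without the leading declaration.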

8.7 The Built-in Template Rules

There are seven kinds of nodes in an XML document: the root node, element nodes, attribute nodes, text nodes, comment nodes, processing instruction nodes, and namespace nodes. XSLT provides a default built-in template rule for each of these seven kinds of nodes that says what to do with that node if the stylesheet author has not provided more specific instructions. These rules use special wildcard XPath expressions to match all nodes of a given type. Together these template rules have major effects on which nodes are activated when.

8.7.1 The Default Template Rule for Text and Attribute Nodes

The most basic built-in template rule copies the value of text and attribute nodes into the output document. It looks like this:

<xsl:template match="text( )|@*">
  <xsl:value-of select="."/>
</xsl:template>

The text( ) node test is an XPath pattern matching all text nodes, just as first_name is an XPath pattern matching all first_name element nodes. @* is an XPath pattern matching all attribute nodes. The vertical bar combines these two patterns so that the template rule matches both text and attribute nodes. The rule's template says that whenever a text or attribute node is matched, the processor should output the value of that node. For a text node, this value is simply the text in the node. For an attribute, this value is the attribute value but not the name.

Example 8-10 is an XSLT stylesheet that pulls the birth and death dates out of the born and died attributes in Example 8-1. The default template rule for attributes takes the value of the attributes, but an explicit rule selects those values. The @ sign in @born and @died indicates that these are attributes of the matched element rather than child elements.

Example 8-10. An XSLT stylesheet that reads attribute values

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <dl>
          <xsl:apply-templates/>
        </dl>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="person">
    <dt><xsl:apply-templates select="name"/></dt>
    <dd><ul>
      <li>Born: <xsl:apply-templates select="@born"/></li>
      <li>Died: <xsl:apply-templates select="@died"/></li>
    </ul></dd>
  </xsl:template>
</xsl:stylesheet>

When an XSLT processor applies this stylesheet to Example 8-1, it outputs the HTML document shown in Example 8-11.

Example 8-11. The HTML document produced by applying Example 8-10 to Example 8-1

<html>
   <head>
      <title>Famous Scientists</title>
   </head>
   <body>
      <dl>
         <dt>
            Alan
            Turing
         </dt>
         <dd>
            <ul>
               <li>Born: 1912</li>
               <li>Died: 1954</li>
            </ul>
         </dd>
         <dt>
            Richard
            P
            Feynman
         </dt>
         <dd>
            <ul>
               <li>Born: 1918</li>
               <li>Died: 1988</li>
            </ul>
         </dd>
      </dl>
   </body>
</html>

It's important to note that although this template rule says what should happen when an attribute node is reached, by default the XSLT processor never reaches attribute nodes and, therefore, never outputs the value of an attribute. Attribute values are output according to this template only if a specific rule applies templates to them, and none of the default rules do this because attributes are not considered to be children of their parents. In other words, if element E has an attribute A, then E is the parent of A, but A is not the child of E. (The biological metaphor breaks down here.) Applying templates to the children of an element with <xsl:apply-templates/> does not apply templates to attributes of the element. To do that, the xsl:apply-templates element must contain an XPath expression specifically selecting attributes.
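To process every attribute of an element without naming each one, the select attribute can use the same @* wildcard that appears in the built-in rule. The following sketch is an illustration rather than one of the chapter's numbered examples.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <p>
      <!-- @* selects all attributes of the matched person; the
           built-in rule for attributes then outputs each value -->
      <xsl:apply-templates select="@*"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

The values are written one after another with nothing between them, and the order in which a processor visits attributes is not guaranteed, so in practice you would usually select specific attributes as Example 8-10 does.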

8.7.2 The Default Template Rule for Element and Root Nodes

The most important template rule is the one that guarantees that children are processed. This is that rule:

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

The asterisk * is an XPath wild-card pattern that matches all element nodes, regardless of what name they have or what namespace they're in. The forward slash / is an XPath pattern that matches the root node. This is the first node the processor selects for processing, and therefore this is the first template rule the processor executes (unless a nondefault template rule also matches the root node). Again, the vertical bar combines these two expressions so that it matches both the root node and element nodes. In isolation, this rule means that the XSLT processor eventually finds and applies templates to all nodes except attribute and namespace nodes because every nonattribute, non-namespace node is either the root node, a child of the root node, or a child of an element. Only attribute and namespace nodes are not children of their parents. (You can think of them as disinherited nodes.)

Of course, templates may override the default behavior. For example, when you include a template rule matching person elements in your stylesheet, then children of the matched person elements are not necessarily processed, unless your own template says to process them.
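One common way to take advantage of this is an empty template rule, which matches elements and outputs nothing for them. The sketch below, which is an illustration rather than an example from the original text, leaves everything to the built-in rules except profession and hobby elements.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- an empty template: matched elements produce no output, and
       their children are never processed -->
  <xsl:template match="profession|hobby"/>
</xsl:stylesheet>
```

Applied to Example 8-1, the built-in rules still copy the text of the name elements, while the text of the professions and hobbies is dropped.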

8.7.3 The Default Template Rule for Comment and Processing Instruction Nodes

This is the default template rule for comments and processing instructions:

<xsl:template match="processing-instruction()|comment( )"/>

It matches all comments and processing instructions. However, it does not output anything into the result tree. That is, unless you provide specific rules matching comments or processing instructions, no part of these items will be copied from the input document to the output document.
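If you do want comments carried over, you can override this rule. The following sketch assumes an input document that contains comments (Example 8-1 has none) and uses the xsl:comment element to re-create each one in the output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- overrides the built-in rule for comments: each input comment
       is re-created in the output with the same text -->
  <xsl:template match="comment()">
    <xsl:comment><xsl:value-of select="."/></xsl:comment>
  </xsl:template>
</xsl:stylesheet>
```

This works because xsl:apply-templates without a select attribute, as used in the built-in rule for elements, selects all children of an element, comments included.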

8.7.4 The Default Template Rule for Namespace Nodes

A similar template rule matches namespace nodes and instructs the processor not to copy any part of the namespace node to the output. This is truly a built-in rule that must be implemented in the XSLT processor's source code; it can't even be written down in an XSLT stylesheet because there's no such thing as an XPath pattern matching a namespace node. That is, there's no namespace( ) node test in XPath. The XSLT processor handles the insertion of any necessary namespace declarations in the output document automatically, without any special assistance from namespace templates.

8.8 Modes

Sometimes the same input content needs to appear multiple times in the output document, formatted according to a different template each time. For instance, the titles of the chapters in a book would be formatted one way in the chapters themselves and a different way in the table of contents. Both xsl:apply-templates and xsl:template elements can have optional mode attributes that connect different template rules to different uses. A mode attribute on an xsl:template element identifies in which mode that template rule should be activated. An xsl:apply-templates element with a mode attribute only activates template rules with matching mode attributes. Example 8-12 demonstrates with a stylesheet that begins the output document with a list of people's names. This is accomplished in the toc mode. Then a separate template rule, as well as a separate xsl:apply-templates element in the default mode (really no mode at all), outputs the complete contents of all person elements.

Example 8-12. A stylesheet that uses modes

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <ul><xsl:apply-templates select="person" mode="toc"/></ul>
        <xsl:apply-templates select="person"/>
      </body>
    </html>
  </xsl:template>
  <!-- Table of Contents Mode Templates -->
  <xsl:template match="person" mode="toc">
    <xsl:apply-templates select="name" mode="toc"/>
  </xsl:template>
  <xsl:template match="name" mode="toc">
    <li><xsl:value-of select="last_name"/>,
    <xsl:value-of select="first_name"/></li>
  </xsl:template>
  <!-- Normal Mode Templates -->
  <xsl:template match="person">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>

Example 8-13 shows the output when this stylesheet is applied to people.xml. The people template in Example 8-12 applies templates to its person children twice. The first time it does so in the toc mode. This selects the first person template rule in the stylesheet that outputs each person in the form <li>Turing, Alan</li>. The second time, it doesn't specify any mode. This selects the second person template rule in the stylesheet, which outputs all the character data of the person wrapped in a p element.

Example 8-13. Output from a stylesheet that uses modes to process each person twice with different templates

<html>
<head>
<title>Famous Scientists</title>
</head>
<body>
<ul>
<li>Turing,
    Alan</li>
<li>Feynman,
    Richard</li>
</ul>
<p>
      Alan
      Turing
    computer scientist
    mathematician
    cryptographer
  </p>
<p>
      Richard
      P
      Feynman
    physicist
    Playing the bongoes
  </p>
</body>
</html>

For every mode you use in the stylesheet, the XSLT processor adds one default template rule to its set of built-in rules. This applies to all element and root nodes in the specified mode and applies templates to their children in the same mode (since the usual built-in template rule for element and root nodes doesn't have a mode). For instance, the extra default rule for Example 8-12 looks like this:

<xsl:template match="*|/" mode="toc">
  <xsl:apply-templates mode="toc"/>
</xsl:template>

8.9 Attribute Value Templates

It's easy to include known attribute values in the output document as the literal content of a literal result element. For example, this template rule wraps each input person element in an HTML span element that has a class attribute with the value person:

<xsl:template match="person">
  <span class="person"><xsl:apply-templates/></span>
</xsl:template>

However, it's trickier if the value of the attribute is not known when the stylesheet is written, but instead must be read from the input document. The solution is to use an attribute value template. An attribute value template is an XPath expression enclosed in curly braces that's placed in the attribute value in the stylesheet. When the processor outputs that attribute, it replaces the attribute value template with its value. For example, suppose you wanted to write a name template that changed the input name elements to empty elements with first, initial, and last attributes like this:

<name first="Richard" initial="P" last="Feynman"/>

This template accomplishes that task:

<xsl:template match="name">
  <name first="{first_name}"
        initial="{middle_initial}"
        last="{last_name}" />
</xsl:template>

The value of the first attribute in the stylesheet is replaced by the value of the first_name element from the input document. The value of the initial attribute is replaced by the value of the middle_initial element from the input document; the value of the last attribute is replaced by the value of the last_name element from the input document.
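An attribute value may also mix literal text with more than one attribute value template. The sketch below builds a single attribute from the born and died attributes of each person; the output element name person and the attribute name years are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="person">
    <!-- each {...} is evaluated separately; the hyphen between
         them is literal text. For Turing this gives
         years="1912-1954" -->
    <person years="{@born}-{@died}">
      <xsl:apply-templates select="name"/>
    </person>
  </xsl:template>
</xsl:stylesheet>
```

If you need a literal curly brace in such an attribute value, XSLT 1.0 has you double it, writing {{ or }}.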

8.10 XSLT and Namespaces

XPath patterns, as well as expressions that match and select elements, identify these elements based on their local part and namespace URI. They do not consider the namespace prefix. Most commonly, the same namespace prefix is mapped to the same URI in both the input XML document and the stylesheet. However, this is not required. For instance, consider Example 8-14. This is exactly the same as Example 8-1, except that now all the elements have been placed in the namespace http://www.cafeconleche.org/namespaces/people.

Example 8-14. An XML document describing two people that uses a default namespace

<?xml version="1.0"?>
<people xmlns="http://www.cafeconleche.org/namespaces/people">
  <person born="1912" died="1954">
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>
  <person born="1918" died="1988">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>M</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>

Except for the built-in template rules, none of the rules in this chapter so far will work on this document! For instance, consider this template rule from Example 8-8:

<xsl:template match="name">
  <p><xsl:value-of select="last_name"/>,
  <xsl:value-of select="first_name"/></p>
</xsl:template>

It's trying to match a name element in no namespace, but the name elements in Example 8-14 do have a namespace: http://www.cafeconleche.org/namespaces/people. This template rule no longer applies. To make it fit, we map the prefix pe to the namespace URI http://www.cafeconleche.org/namespaces/people in the stylesheet. Then instead of matching name, we match pe:name. That the input document doesn't use the prefix pe is irrelevant as long as the namespace URIs match up. Example 8-15 demonstrates by rewriting Example 8-8 to work with Example 8-14 instead.

Example 8-15. An XSLT stylesheet for input documents using the http://www.cafeconleche.org/namespaces/people namespace

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:pe="http://www.cafeconleche.org/namespaces/people">
  <xsl:template match="pe:people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="pe:name">
    <p><xsl:value-of select="pe:last_name"/>,
    <xsl:value-of select="pe:first_name"/></p>
  </xsl:template>
  <xsl:template match="pe:person">
    <xsl:apply-templates select="pe:name"/>
  </xsl:template>
</xsl:stylesheet>

The output is essentially the same output you get by applying Example 8-8 to Example 8-1, except that the root html element of the output will carry an extra xmlns:pe namespace declaration.
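If that extra namespace declaration is unwanted, XSLT 1.0 lets a stylesheet list prefixes in an exclude-result-prefixes attribute on the xsl:stylesheet element. The following is a minimal sketch, assuming the same rules as Example 8-15 with only that one attribute added. It is one way to suppress the declaration, not a required part of working with namespaces.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:pe="http://www.cafeconleche.org/namespaces/people"
                exclude-result-prefixes="pe">

  <!-- The pe prefix is still needed to match the input elements,
       but its declaration is no longer copied to the html element -->
  <xsl:template match="pe:people">
    <html>
      <head><title>Famous Scientists</title></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="pe:name">
    <p><xsl:value-of select="pe:last_name"/>,
    <xsl:value-of select="pe:first_name"/></p>
  </xsl:template>

  <xsl:template match="pe:person">
    <xsl:apply-templates select="pe:name"/>
  </xsl:template>

</xsl:stylesheet>
```

The prefix itself remains arbitrary. Any other prefix bound to the same URI in the stylesheet would match the same input elements.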

8.11 Other XSLT Elements

This is hardly everything there is to say about XSLT. Indeed, XSLT does a lot more than the little we've covered in this introductory chapter. Other features yet to be discussed include:

Named templates
Numbering and sorting output elements
Conditional processing
Iteration
Extension elements and functions
Importing other stylesheets

These and more will all be discussed in Chapter 23. Since XSLT is itself Turing complete and since it can invoke extension functions written in other languages like Java, chances are very good you can use XSLT to make whatever transformations you need to make.
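As a small preview of three of the features listed above, here is a hedged sketch, assuming an input document like Example 8-1. It uses xsl:for-each for iteration, xsl:sort for sorting, and xsl:if for conditional processing. The HTML list it builds is only an illustrative choice of output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="people">
    <ul>
      <!-- Iteration: visit each person element in turn -->
      <xsl:for-each select="person">
        <!-- Sorting: order the people by last name;
             xsl:sort must come first inside xsl:for-each -->
        <xsl:sort select="name/last_name"/>
        <li>
          <xsl:value-of select="name/last_name"/>
          <!-- Conditional processing: add the hobby only
               when the person has a hobby child element -->
          <xsl:if test="hobby">
            <xsl:text> (</xsl:text>
            <xsl:value-of select="hobby"/>
            <xsl:text>)</xsl:text>
          </xsl:if>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```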

Furthermore, besides these additional elements, you can do a lot more simply by expanding the XPath expressions and patterns used in the select and match attributes of the elements with which you're already familiar. These techniques will be explored in Chapter 9.

However, the techniques outlined in this chapter lay the foundation for all subsequent, more advanced work with XSLT. The key to transforming XML documents with XSLT is to match templates to elements in the input document. Those templates contain both literal result data and XSLT elements that instruct the processor where to go to get more data. Everything you do with XSLT is based on this one simple idea.
