Section 18.5. Creating a stylesheet



XSLT's processing model revolves around the idea of patterns. Patterns are XPath expressions designed to test nodes; they let the XSLT processor decide which template rules apply to which nodes. XSLT's pattern language is basically XPath with a few extensions and restrictions. Patterns are used in the match attribute of template rules to specify which nodes the rule applies to.

18.5.1 Document-level template rule

Consider a document whose root element-type is book and that can contain title, section and appendix element types. section and appendix elements can contain title, para and list subelements. Titles contain #PCDATA and no subelements. Paragraphs and list items contain emph and #PCDATA. Example 18-3 is a DTD that represents these constraints and Example 18-4 is an example document.

Example 18-3. DTD for book example

    <!ELEMENT book (title, (section|appendix)+)>
    <!ELEMENT section (title, (para|list)+)>
    <!ELEMENT appendix (title, (para|list)+)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT para (#PCDATA|emph)*>
    <!ELEMENT emph (#PCDATA)>
    <!ELEMENT list (item)+>
    <!ELEMENT item (#PCDATA|emph)*>

Example 18-4. Book document instance

    <book>
        <title>Chicken Soup for the Chicken's Soul</title>
        <section>
            <title>Introduction</title>
            <para>I've always wanted to write
                  this book.</para>
        </section>
    </book>

First the XSLT processor would examine the root element of the document. The XSLT processor would look for a rule that applied to books (a rule with a match pattern that matched a book). This sort of match pattern is very simple. Example 18-5 demonstrates.

Example 18-5. Simple match pattern

    <xsl:template match="book">
      <!-- describe how books should be transformed -->
    </xsl:template>

We can choose any basic structure for the generated book. Example 18-6 shows a reasonable one.

Example 18-6. Generated book structure

    <xsl:template match="book">
      <body>
        <h1><!-- handle title --></h1>
        <!-- handle sections -->
        <hr/> <!-- HTML horizontal rule -->
        <h2>Appendices</h2>
        <!-- handle appendices -->
        <hr/>
        <p>Copyright 2004, the establishment.</p>
      </body>
    </xsl:template>

The template in this template rule generates a body to hold the content of the document. The tags for the body element are usually omitted in HTML but we've generated them here so we can add some attributes to the element later. The body is called a literal result element.

18.5.2 Literal result elements

The XSLT processor knows to treat body as a literal result element that is copied into the output because it is not an XSLT instruction (formally, it is not in the XSLT namespace). Elements in templates that are not part of the XSLT namespace are treated literally and copied into the output. You can see why these are called templates! They describe the form of the result document both by ordering content and by generating literal result elements. If the XSLT processor supports legacy HTML output, and the HTML output method is being used to serialize the result, then it will know to use legacy HTML conventions.

The h1, h2 and hr elements are also literal result elements that will create HTML headings and horizontal rules. As the stylesheet is represented in XML, the specification for the horizontal rule can use XML empty-element syntax. Finally the document has a literal result element and literal text representing the copyright. XSLT stylesheets can introduce this sort of boilerplate text.
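As a minimal sketch of how literal result elements and the HTML output method interact, the following cut-down stylesheet can be run against the document in Example 18-4. The heading text is a placeholder invented for this illustration; the rest reuses elements from Example 18-6.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Ask the processor to serialize the result using HTML conventions -->
  <xsl:output method="html"/>

  <xsl:template match="book">
    <!-- body, h1, hr and p are not in the XSLT namespace,
         so they are copied to the result as literal result elements -->
    <body>
      <h1>Placeholder heading</h1>
      <hr/>
      <p>Copyright 2004, the establishment.</p>
    </body>
  </xsl:template>

</xsl:stylesheet>
```

With the HTML output method, a conforming processor should write the empty hr element as `<hr>` rather than `<hr/>`, even though the stylesheet itself has to use XML empty-element syntax.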

18.5.3 Extracting data

The template also has comments describing things we still have to handle: the document's title, its sections and the appendices.

We can get the data content from the title element with the xsl:value-of instruction. It has a select attribute whose value is an XPath expression, written much like a pattern. If this expression names a simple element type then it will select a subelement of that type within the current element.

In this case the current element is the book element as that is the element matched by the template rule. Example 18-7 shows what the data extraction would look like.

Example 18-7. Extracting data from a subelement

 <h1><xsl:value-of select="title"/></h1>

18.5.4 The `apply-templates` instruction

The next step is to handle sections and appendices. We could do it in one of two ways. We could either create a new template rule for handling sections or we could handle sections directly in the book template rule.

The benefit of creating a new rule is that it can be used over and over again. Before we create the new rule we should ensure it will get invoked at the right point. We will use a new instruction, xsl:apply-templates. Example 18-8 shows this instruction.

Example 18-8. The `xsl:apply-templates` instruction

 <xsl:apply-templates select="section"/>

The xsl:apply-templates instruction does two important things.

It finds all nodes that match the select attribute pattern.
It processes each of these in turn. It does so by finding and applying the template rule that matches each node.

This important principle is at the heart of XSLT's processing model.

In this case, the select pattern in the xsl:apply-templates element selects all of the book's subelements of type section. The xsl:apply-templates instruction always searches out the rule that is appropriate for each of the selected nodes. In this case it will search out a rule that applies to sections. The expanded book template rule is in Example 18-9.

Example 18-9. Handling section elements

    <xsl:template match="book">
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="section"/>
        <hr/>
        <h2>Appendices</h2>
        <xsl:apply-templates select="appendix"/>
        <p>Copyright 2004, the establishment</p>
      </body>
    </xsl:template>

18.5.5 Handling optional elements

Our sample document does not have appendices but the stylesheet should support anything that the DTD or schema allows. Documents of this type created in the future may have appendices.

Our stylesheet generates the title element followed by section elements (in the order that they occurred in the document) followed by appendix elements (also in document order).

If our DTD allowed more than one title subelement in a book element then this stylesheet would generate them all. There is no way for a stylesheet to require that the document have a single title. These sorts of constraints are specified in a DTD or schema.

Our DTD does permit documents to have no appendices. Our "Appendices" title and horizontal rule separating the appendices from the sections would look fairly silly in that case. XSLT provides an instruction called xsl:if that handles this situation. We can wrap it around the relevant parts as shown in Example 18-10.

Example 18-10. Using `xsl:if`

    <xsl:if test="appendix">
      <hr/>
      <h2>Appendices</h2>
      <xsl:apply-templates select="appendix"/>
    </xsl:if>

The xsl:if instruction goes within a template. We could drop it into our book template as a replacement for our current appendix handling.

The instruction also contains another template within it. The contained template is only instantiated (generated) if the expression in the test attribute evaluates to true. Here the test selects appendix subelements, so it is true when the book has at least one appendix.

As with the select attribute, the context is the current node. If the test expression selects no node then the entire contained template will be skipped.

There is another instruction called xsl:choose that allows for multiple alternatives, including a default template for when none of the other alternatives match.
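As a rough sketch of how xsl:choose might look in our book rule, the stylesheet below offers one alternative for books with appendices and a default for books without. The fallback paragraph and its wording are hypothetical additions for illustration only; they are not part of the stylesheet developed in this section.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="html"/>

  <xsl:template match="book">
    <body>
      <h1><xsl:value-of select="title"/></h1>
      <xsl:apply-templates select="section"/>
      <xsl:choose>
        <!-- First alternative: the book has at least one appendix -->
        <xsl:when test="appendix">
          <hr/>
          <h2>Appendices</h2>
          <xsl:apply-templates select="appendix"/>
        </xsl:when>
        <!-- Default alternative; this text is hypothetical -->
        <xsl:otherwise>
          <p>This book has no appendices.</p>
        </xsl:otherwise>
      </xsl:choose>
    </body>
  </xsl:template>

</xsl:stylesheet>
```

Each xsl:when is tried in order and only the first one whose test is true is instantiated; xsl:otherwise is used when none of them is.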

18.5.6 Reordering the output

If the DTD had allowed titles, sections and appendices to be mixed together our stylesheet would reorder them so that the title preceded the sections and the sections preceded the appendices.

This ability to reorder is very important. It allows us to use one structure in our abstract representation and another in our rendition. The abstract structure is optimized for editing, validating and processing convenience. The rendered structure is optimized for viewing and navigation.

Reordering is easy when you know exactly the order in which you want elements of various types to be processed. In the case of the body, for example: titles before sections before appendices. But within a section or appendix, reordering is somewhat trickier because we don't know the complete output order.

That is, we need to process titles before any of the paragraphs or lists, but we cannot disturb the relative order of the paragraphs and lists themselves. Those have to be generated in the document order.

We can solve this fairly easily. In XPath pattern syntax the vertical bar (|) character means "or". So we can make a rule like the one in Example 18-11.

Example 18-11. The section rule

    <xsl:template match="section">
        <h2><xsl:value-of select="title"/></h2>
        <xsl:apply-templates select="para|list"/>
    </xsl:template>

This rule forces titles (in our DTD there can be only one) to be handled first and paragraphs and lists to be processed in the order that they are found. The rules that are defined for paragraphs and lists will automatically be selected when those types of element appear. We'll create those rules next.

18.5.7 Data content

Next we can handle paragraphs. We want them each to generate a single HTML element. We also want them to generate their content to populate that element in the order that the content occurs, not in some order pre-defined by the template.

We need to process all of the paragraph's subnodes. That means that we cannot just handle emph subelements. We must also handle ordinary character data. Example 18-12 demonstrates this.

Example 18-12. Paragraph rule

    <xsl:template match="para">
        <p><xsl:apply-templates select="node()"/>
        </p>
    </xsl:template>

As you can see, the rule for paragraphs is very simple. The xsl:apply-templates instruction handles most of the work for us automatically. The select attribute selects all of the paragraph's child nodes: element nodes, text nodes, etc. If the processor encounters a text node it copies it to the result; that is a default rule built into XSLT. If it encounters a subelement, it processes it using the appropriate rule.
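The default behavior comes from the built-in template rules that the XSLT 1.0 specification defines for nodes no explicit rule matches. The stylesheet below writes out equivalent rules explicitly, as a sketch of what the processor does on our behalf; you would not normally need to include them.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Elements and the root node: keep going into the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy their string value to the result -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Processing instructions and comments: produce nothing -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

This is why the character data inside a para reaches the output even though we never wrote a rule for text nodes.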

XSLT handles much of the complexity for us but we should still be clear: transformations will not always be this easy. These rules are so simple because our DTD is very much like HTML. The more alike the source and result DTDs the simpler the transformation will be. It is especially helpful to have a very loose or flexible result DTD. HTML is perfect in this regard.

18.5.8 Handling inline elements

The rule for emph follows the same basic organization as the paragraph rule. Mixed content (i.e. character-containing) elements often use this organization. The HTML element-type name is em (Example 18-13). Note that in this case we will use an abbreviated syntax for the xsl:apply-templates element: Because the select attribute defaults to node(), we can leave it out.

Example 18-13. Handling emphasis

    <xsl:template match="emph">
        <em><xsl:apply-templates/></em>
    </xsl:template>

List items also have mixed content, so we should look at the rules for lists and list items next. They are in Example 18-14.

Example 18-14. List and item rules

    <xsl:template match="list">
        <ol>
           <xsl:apply-templates/>
        </ol>
    </xsl:template>

    <xsl:template match="item">
        <li><xsl:apply-templates/></li>
    </xsl:template>

The rules in Example 18-13 and Example 18-14 work together. When a list is detected the literal result element is processed and an ol element is generated. It will contain a single li element for each item. Each li will in turn contain text nodes (handled by the default rule) and emph (handled by the emph rule).
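To try these rules out, a document with a list is needed, which Example 18-4 does not have. The instance below is a hypothetical one that conforms to the DTD in Example 18-3; the section title and list items are invented purely for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical instance; the list content is invented for illustration -->
<book>
  <title>Chicken Soup for the Chicken's Soul</title>
  <section>
    <title>Ingredients</title>
    <list>
      <item>One <emph>fresh</emph> carrot</item>
      <item>Plenty of water</item>
    </list>
  </section>
</book>
```

With the list, item and emph rules in place, the list should come out as an ol element holding two li elements, the first of which contains the text "One ", an em element with "fresh", and the text " carrot". Whitespace and indentation in the result will vary by processor.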

18.5.9 Sharing a template rule

We still need a template rule for appendices. If we wrote out the rule for appendices we would find it to be identical to sections. We could just copy the sections rule but XSLT has a more elegant way. We can amend our rule for sections to say that the rule applies equally to sections or appendices. Example 18-15 demonstrates.

Example 18-15. The rule in Example 18-11 revised to handle appendices as well as sections

    <xsl:template match="section|appendix">
        <h2><xsl:value-of select="title"/></h2>
        <xsl:apply-templates select="para|list"/>
    </xsl:template>

18.5.10 Final touches

We now have a complete stylesheet but it is rather basic. We might as well add a background color to beautify it a bit. HTML allows this through the bgcolor attribute of the body element. We will not go into the details of the HTML color scheme but suffice to say that Example 18-16 gives our document a nice light purple background.

Example 18-16. Adding a background color

    <xsl:template match="book">
      <body bgcolor="#FFDDFF">
        <!-- Handling of body content is unchanged -->
        ...
      </body>
    </xsl:template>

There is also one more detail we must take care of. We said earlier that the more flexible a document type is the easier it is to transform to. Even though HTML is pretty flexible it does have one unbreakable rule. Every document must have a title element, but "title" means something different in the HTML vocabulary from what it does in our book DTD.

We've handled the title element from the source as a heading, but in HTML the title shows up in the window's title bar, in the bookmark list and in search engine result lists. We need the document's title to appear as both the HTML title and as an HTML heading element. Luckily XSLT allows us to duplicate data.
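A minimal sketch of that duplication follows. It assumes the HTML title is placed in a head element inside an html wrapper, neither of which appears in the earlier examples; the same select expression is simply used twice.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="html"/>

  <xsl:template match="book">
    <!-- html and head are assumptions added for this sketch -->
    <html>
      <head>
        <!-- First use: the HTML document title -->
        <title><xsl:value-of select="title"/></title>
      </head>
      <body bgcolor="#FFDDFF">
        <!-- Second use: the same source title as a visible heading -->
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="section"/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Because xsl:value-of only reads from the source tree, selecting the same node twice has no side effects; the source title is simply emitted in both places.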

With these additions our stylesheet is complete! It is shown in Example 18-17.

Example 18-17. Complete stylesheet

    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    version="1.0">

    <xsl:output method="html"/>

    <xsl:template match="book">
      <html>
        <head>
          <!-- The HTML title belongs in the head, not the body -->
          <title><xsl:value-of select="title"/></title>
        </head>
        <body bgcolor="#FFDDFF">
          <h1><xsl:value-of select="title"/></h1>
          <xsl:apply-templates select="section"/>
          <!-- Rule and heading appear only if there are appendices -->
          <xsl:if test="appendix">
            <hr/>
            <h2>Appendices</h2>
            <xsl:apply-templates select="appendix"/>
          </xsl:if>
          <p>Copyright 2004, the establishment</p>
        </body>
      </html>
    </xsl:template>

    <xsl:template match="para">
        <p><xsl:apply-templates/></p>
    </xsl:template>

    <xsl:template match="emph">
        <em><xsl:apply-templates/></em>
    </xsl:template>

    <xsl:template match="list">
        <ol>
           <xsl:apply-templates/>
        </ol>
    </xsl:template>

    <xsl:template match="item">
        <li><xsl:apply-templates/></li>
    </xsl:template>

    <!-- Shared rule from Example 18-15 -->
    <xsl:template match="section|appendix">
        <h2><xsl:value-of select="title"/></h2>
        <xsl:apply-templates select="para|list"/>
    </xsl:template>

    </xsl:stylesheet>

As you can see, simple transformations can be quite simple to specify in XSLT – evidence of the language's good design. The important thing to keep in mind is that the basic XSLT processing model is based on template rules, patterns and templates. Flow of control between rules is handled by special instructions.

