8.4 XSLT Primer


XSLT is a flexible and powerful language for transforming XML documents. An XSLT stylesheet contains detailed instructions on how to process the different types of elements in an XML document. The instructions in the stylesheet are then interpreted by a special program, called an XSLT processor, which applies them to the input document to produce another document as output.

XSLT is part of the XSL family of standards developed by the W3C for expressing style information in XML documents. In addition to XSLT, XSL consists of two other standards: XPath and XSL Formatting Objects (XSL-FO). XPath is a special syntax for referring to specific parts of an XML document. XPath syntax is used extensively in XSLT stylesheets for specifying what parts of an input document should be selected.

XSL-FO is an XML vocabulary for specifying the detailed layout and formatting of a document. XSLT was originally developed as a means for applying styles to arbitrary XML documents by translating them into XSL-FO documents. However, XSLT has turned out to be much more broadly useful and is now used as a general-purpose transformation language for XML documents.

The primary purpose of XSLT is to convert one type of XML document into another. For example, there are XSLT stylesheets for converting DocBook documents into XHTML or for converting MathML expressions into Scalable Vector Graphics (SVG). However, the output is not restricted to XML. By using an appropriate stylesheet, you can convert XML documents into other formats, such as HTML or plain text. Formats such as PDF can be produced indirectly, by transforming the document into XSL-FO and passing the result to a formatting program.

XSLT is, in fact, widely used for converting XML documents into HTML so they can be displayed by Web browsers. A Web browser cannot properly interpret an arbitrary XML document since the document can contain tag names that are not defined in HTML. The solution is to write an XSLT stylesheet that replaces the document-specific element names with standard HTML elements. The resulting HTML document can even include CSS rules that specify how different elements should be formatted. XSLT and CSS are therefore complementary and can be used effectively in combination with each other.

XSLT is powerful because it allows you to select specific parts of the XML document tree in any desired order and perform arbitrary transformations on them. An XSLT processor can select and manipulate an element based on various criteria, such as its position in the document tree, the text it contains, or the names and values of its attributes. It is possible to write an XSLT stylesheet, for example, to extract all elements with a certain name, and then rename them, combine them with additional text, and insert them into the output document in a different order.

XSLT stylesheets have a variety of uses in MathML documents, from specifying styles for particular types of notation to creating macros for complicated markup. In the rest of this chapter, we give a brief overview of the key concepts of XSLT and then provide some simple examples of how it can be used in the context of MathML.

XSLT Processors

To view the results of applying XSLT transformations to an XML document, you must have access to an XSLT processor. Two widely used XSLT processors are Apache Xalan and Saxon. (See Appendix B for URLs from which you can download these processors.) If you have one of these processors installed, you can generate an output document by applying a stylesheet to an input document. For example, in Saxon, the command for doing such a transformation is as shown below:

    java -jar saxon.jar -o output.xml input.xml test.xsl 

This generates an output document called output.xml from an input document called input.xml by applying a stylesheet called test.xsl.

The alternative to using a standalone XSLT processor is to use the built-in XSLT capabilities of a Web browser. Current versions of most browsers, including IE 6.0 and Netscape 7.0, have an XSLT processor already built in and can automatically transform XML documents before displaying them. This allows you to display arbitrary XML documents on the Web, with their content automatically formatted in any desired style. All that is needed is to write an appropriate stylesheet that specifies how to transform the XML document into an HTML one.

Suppose you want to apply a stylesheet called test.xsl to an XML document to convert it into an HTML document. You can perform the XSLT transformation and view the results using any XSLT-capable browser, such as IE 6.0 or Netscape 7.0. To do this, you first need to edit the input XML document by adding the following processing instruction at the start, after the XML declaration if there is one:

    <?xml-stylesheet href="test.xsl" type="text/xsl"?> 

In this instruction, the href attribute specifies the name and location of the stylesheet and the type attribute specifies its format. If you then open the XML document in a suitable browser, the specified XSL stylesheet is automatically applied to the document and the HTML document resulting from the transformation is displayed in the browser.
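As a minimal sketch of where the instruction goes, the fragment below shows a small document with the processing instruction in place. The note and to elements are hypothetical and stand in for any XML content; only the processing instruction itself comes from the text above. It has to appear before the root element of the document.

```xml
<?xml version="1.0"?>
<!-- The stylesheet reference sits in the prolog, before the root element -->
<?xml-stylesheet href="test.xsl" type="text/xsl"?>
<note>
  <to>Reader</to>
</note>
```

Because href is a relative reference here, the browser would look for test.xsl in the same location as the XML document.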

Applying Templates

An XSLT stylesheet must be a well-formed XML document. All XSLT element names in the document must belong to the XSLT namespace, which is usually indicated by the prefix xsl. The root element of the document is called xsl:stylesheet. A typical XSLT stylesheet therefore has the following structure:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      ...
    </xsl:stylesheet>

The most important part of any XSLT stylesheet is its templates. Each template specifies a set of instructions about how to process a particular type of element in the input document. A template has the following structure:

    <xsl:template match="expression">
      ...
    </xsl:template>

The value of the match attribute in the xsl:template element is an XPath pattern. This is a restricted form of XPath expression that specifies the nodes in the input document to which the template can be applied. For example, a template with match="title" contains instructions to be applied to elements called title. The template is used whenever the XSLT processor reaches a title element while processing the document. Let's look at a simple example of an XML document (Example 8.1).

Example 8.1: A simple XML document.

    <book>
      <title>The MathML Handbook</title>
    </book>

Example 8.2 shows a simple XSLT stylesheet that converts the above document into HTML.

Example 8.2: An XSLT stylesheet that contains a single template.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="book">
        <html><body>
          <h1><xsl:value-of select="title"/></h1>
        </body></html>
      </xsl:template>
    </xsl:stylesheet>

Here is the result of applying the above stylesheet to the document in Example 8.1:

    <html>
      <body>
        <h1>The MathML Handbook</h1>
      </body>
    </html>

The stylesheet in Example 8.2 consists of a single template that picks out the book element in the input document and then does the following things:

  • It first copies into the output the literal text that corresponds to the opening tags for the html, body, and h1 elements.

  • Next, the <xsl:value-of select="title"/> element is applied. The effect of this element is to copy into the output document the text of the first title element that is a child of the current element.

  • Finally, the template writes out the closing tags for the html, body, and h1 elements.
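One detail worth knowing about xsl:value-of in XSLT 1.0 is that when the select expression matches several nodes, only the first one in document order is written out. The sketch below assumes a hypothetical variant of Example 8.1 in which the book element has more than one title child. It uses xsl:for-each, an instruction not otherwise covered in this section, to write out each of them in turn.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="book">
    <html><body>
      <!-- Writes the text of the first title child only -->
      <h1><xsl:value-of select="title"/></h1>
      <!-- Visits every title child in document order -->
      <xsl:for-each select="title">
        <p><xsl:value-of select="."/></p>
      </xsl:for-each>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
```

Inside xsl:for-each, the current node is the title being visited, which is why select="." picks up its text.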

Example 8.3 shows a slightly more complicated XML document than the one in Example 8.1.

Example 8.3: An XML document with several nested elements.

    <book>
      <title>The MathML Handbook</title>
      <author>Pavi Sandhu</author>
      <publisher>
        <name>Charles River Media</name>
        <address>20 Downer Avenue, Hingham, MA 02043</address>
        <phone>781-740-0400</phone>
      </publisher>
    </book>

Example 8.4 shows a stylesheet written for the document in Example 8.3. It is possible to write a simpler stylesheet to do the same transformation, but the one shown here is useful for the purposes of illustration.

Example 8.4: An XSLT stylesheet that uses xsl:apply-templates.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="book">
        <html><body><xsl:apply-templates/></body></html>
      </xsl:template>
      <xsl:template match="title">
        <h1><xsl:value-of select="."/></h1>
      </xsl:template>
      <xsl:template match="author">
        <h2><xsl:value-of select="."/></h2>
      </xsl:template>
      <xsl:template match="name">
        <h3><xsl:value-of select="."/></h3>
      </xsl:template>
      <xsl:template match="address">
        <p>Address: <xsl:value-of select="."/></p>
      </xsl:template>
      <xsl:template match="phone">
        <p>Phone: <xsl:value-of select="."/></p>
      </xsl:template>
    </xsl:stylesheet>

Here is the output that results from applying the above stylesheet to the document in Example 8.3:

    <html>
      <body>
        <h1>The MathML Handbook</h1>
        <h2>Pavi Sandhu</h2>
        <h3>Charles River Media</h3>
        <p>Address: 20 Downer Avenue, Hingham, MA 02043</p>
        <p>Phone: 781-740-0400</p>
      </body>
    </html>

The stylesheet in Example 8.4 contains templates for many of the elements in the original XML document. Each template contains some literal text to be copied to the output document as well as an XSLT element that specifies a particular instruction to be executed. The xsl:apply-templates element in the template for the book element specifies that the child nodes of the current element should be processed in turn, applying whatever template matches each one. All the other templates contain an <xsl:value-of select="."/> element. The . in the attribute value is an XPath expression that refers to the current node, so the instruction writes out the text content of the element being processed.
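For comparison, the simpler stylesheet mentioned earlier could look something like the sketch below. This is an assumption about what such a stylesheet might contain, not a listing from the original text. A single template for book pulls each value out directly with a path expression such as publisher/name, so no xsl:apply-templates instruction is needed.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="book">
    <html><body>
      <!-- Direct children of book -->
      <h1><xsl:value-of select="title"/></h1>
      <h2><xsl:value-of select="author"/></h2>
      <!-- Grandchildren are reached with a two-step path -->
      <h3><xsl:value-of select="publisher/name"/></h3>
      <p>Address: <xsl:value-of select="publisher/address"/></p>
      <p>Phone: <xsl:value-of select="publisher/phone"/></p>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that this version fixes the order of the output in one place, whereas the multi-template version follows the order of the elements in the input document.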

To understand the order in which templates are applied, it is useful to know how the input document is processed. The XSLT processor represents the input document as a tree consisting of nodes. Different types of nodes represent elements, attributes, text, comments, and so on. Processing starts at the root node of the tree and works downward. For each node it processes, the processor checks to see if a matching template is defined in the stylesheet. If a template matching that node is found, the instructions contained in that template are executed. If more than one template matching that node is found, the more specific template is used. If no template matches, a default template is applied instead. An xsl:apply-templates instruction inside a template tells the processor to go on to process the children of the current node in document order, so the traversal continues only as far as the templates direct it.
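As a sketch of the "more specific template" rule, the two templates below both match a title element that is a child of book. The choice of h1 and h2 is an arbitrary illustration. In XSLT 1.0, a pattern that is just an element name has a default priority of 0, while a pattern with more than one step, such as book/title, has a default priority of 0.5, so the second template is the one applied to the title in Example 8.3.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- General rule: any title element (default priority 0) -->
  <xsl:template match="title">
    <h2><xsl:value-of select="."/></h2>
  </xsl:template>
  <!-- More specific rule: a title whose parent is book (default priority 0.5) -->
  <xsl:template match="book/title">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>
</xsl:stylesheet>
```

A priority attribute on xsl:template can be used to override these default values when the built-in ranking does not give the desired result.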

One consequence of this processing model is that the order in which the templates are written in the stylesheet is not important. A specific template is applied when it matches a node in the document tree, regardless of where that template occurs in the stylesheet. Hence, if you edit a stylesheet to change the order of some of its templates, this does not affect the output document that is produced, as long as no two templates match the same node with equal priority.

Note that more than one template can be active at any given time because when a template for an element is applied, the template remains active until templates for all the child elements have been applied. Hence, one template can call another template, which can call another template, and so on. The processing of templates can therefore occur in a recursive manner, reflecting the nested structure of the document tree.
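The recursive behavior can be sketched with a single template that ends up invoking itself. The example assumes a hypothetical input document made of section elements, each with a heading child and possibly further section elements nested inside it. It also uses the select attribute of xsl:apply-templates to restrict processing to the nested sections.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Each section becomes a div; nested sections are handled by the same template -->
  <xsl:template match="section">
    <div>
      <h2><xsl:value-of select="heading"/></h2>
      <!-- Re-enters this template for every section directly inside this one -->
      <xsl:apply-templates select="section"/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

The nesting of div elements in the output mirrors the nesting of section elements in the input, because each outer template instance stays open until its inner sections have been processed.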

Default Templates

The stylesheet shown in Example 8.4 did not contain a template for the publisher element. Yet, the output from this stylesheet includes the text contained in the child elements of the publisher element, namely name, address, and phone. The reason for this is that XSLT defines a default template that is automatically applied to any elements in the document for which no explicit template is given. The default template looks like the following:

    <xsl:template match="*|/">
      <xsl:apply-templates/>
    </xsl:template>

The value of the match attribute in the template element is an XPath pattern that identifies the nodes to which the template applies. In XPath syntax, * matches any element, while / matches the root node of the document, that is, the node that contains the top-level element. The vertical bar separates two alternatives. Hence, the default template shown above means that, for the root node and for any element, the processor should go on to apply templates to all of that node's children.

If this default template is applied to any element in the document, its effect is to write out into the output document any text contained in that element. This is because XSLT also defines the following default template for the text or attributes of elements:

    <xsl:template match="text()|@*">
      <xsl:value-of select="."/>
    </xsl:template>

The match attribute here is set to the XPath pattern text()|@*. In XPath syntax, text() matches any text node, and @* matches any attribute.
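One point that is easy to miss is that, although this default template matches attributes, the default template for elements only applies templates to child nodes, and attributes are not children of an element. Attribute values therefore reach the output only if a template selects them explicitly. A minimal sketch, assuming a hypothetical isbn attribute on the book element:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="book">
    <html><body>
      <!-- Explicitly select the attribute; the default template for @* copies its value -->
      <p>ISBN: <xsl:apply-templates select="@isbn"/></p>
      <!-- Child nodes (elements and text) are processed as usual -->
      <xsl:apply-templates/>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
```

Without the first xsl:apply-templates instruction, the value of the isbn attribute would not appear in the output at all.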

Example 8.5 shows a simple XSLT stylesheet that illustrates the effect of the default templates:

Example 8.5: An XSLT stylesheet that illustrates the concept of default templates.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="book">
        <html><body><p>
          <xsl:apply-templates/>
        </p></body></html>
      </xsl:template>
    </xsl:stylesheet>

When the stylesheet in Example 8.5 is applied to the document in Example 8.3, the following output document is obtained:

    <html>
      <body>
        <p>
          The MathML Handbook
          Pavi Sandhu
          Charles River Media
          20 Downer Avenue, Hingham, MA 02043
          781-740-0400
        </p>
      </body>
    </html>

Notice that the text of the title, author, publisher, name, address, and phone elements appears in the output document, even though the stylesheet does not contain explicit templates for these elements. This behavior is a result of the XSLT processor applying the default templates for these elements.
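If some of that text is not wanted, a common approach is to add an empty template for the element concerned. The empty template takes precedence over the default template and produces no output for that element or anything inside it. The sketch below is a hypothetical variation on Example 8.5 that drops the publisher details.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="book">
    <html><body><p>
      <xsl:apply-templates/>
    </p></body></html>
  </xsl:template>
  <!-- Empty template: publisher and everything inside it is skipped -->
  <xsl:template match="publisher"/>
</xsl:stylesheet>
```

Applied to the document in Example 8.3, this version would leave only the text of the title and author elements in the output.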

Although the examples discussed so far are very simple, they illustrate some of the key features common to any XSLT stylesheet. A more elaborate XSLT stylesheet has the same basic structure. It would differ only in the number of templates and the complexity of the processing rules defined in each template.






The MathML Handbook
The MathML Handbook (Charles River Media Internet & Web Design)
ISBN: 1584502495
EAN: 2147483647
Year: 2003
Pages: 127
Authors: Pavi Sandhu
