Chapter 13. Xalan, Saxon, and XT

CONTENTS

13.1 Xalan
13.2 Saxon
13.3 XT
13.4 Generating Multiple Output Files Using Saxon, Xalan, or XT

Xalan-C++ and Xalan-J
Saxon
XT
Multiple output documents

In this chapter, we will discuss three freeware XSLT processors: Xalan, Saxon, and XT. Each section contains an overview of the product, an installation guide, and details about any extensions that are implemented by the processor.

13.1 Xalan

Xalan, an XSLT processor originally developed by Scott Boag at Lotus (now part of IBM), was donated to the Apache XML Project as part of their open source endeavor. A team of engineers at Lotus drives the development of both Xalan-C++ (using C++) and Xalan-J (using Java). Both versions implement XSLT 1.0 and XPath 1.0.

Like every XSLT processor, both Xalan-C++ and Xalan-J require an XML parser to read (and, optionally, validate) the input XML document instance; both use the Apache parser, Xerces. They can work with other parsers, but you would have to write an interface to do so.

The Apache Web site includes both the C++ and Java versions, as well as a complete set of test files, the API, and documentation, which can be found at:

http://xml.apache.org/xalan-c/

http://xml.apache.org/xalan-j/

13.1.1 Xalan-C++

If you are working with very large documents, or need to work in a multi-threaded application environment, Xalan-C++ provides functionality and speed that surpass those of Xalan-J. The installation includes a command-line processor as well as a simplified C++ and C API for performing standard transformations. The documentation includes many samples to get you started, along with full source code and a complete description of each C++ class, so you can build Xalan-C++ into your own applications with relative ease.

The latest version of Xalan-C++ can be found on the Apache Web site at http://xml.apache.org/xalan-c. Zipped files are available for Windows 32, Red Hat Linux 6.1, AIX 4.3, HP-UX 11, and Solaris 2.6 (UNIX versions are tarred with gnu tar). Each download comes with documentation, sample applications, and the complete Xalan-C++ source tree.

In addition to the command-line version of Xalan-C++ for transformations, which is available immediately upon installation, the download provides the C++ classes required to implement and build user-defined applications.

When you download Xalan-C++, the latest version of Xerces-C++ is also included. Xerces-C++ is an XML parser that validates XML according to version 1.0 of the XML specification.^[1] The shared library provides the capability to generate, manipulate, and parse XML documents, and provides high performance, modularity, and scalability.

13.1.1.1 Installing Xalan-C++

Unzip the file that you downloaded from the Apache Web site into the directory of your choice. This will create two directories, one for xml-xalan and one for xml-xerces.

You must modify your system and library paths to point to the executable directory as follows:

Windows:

PATH=xml-xalan\c\Build\Win32\VC6\Release

Red Hat Linux:

PATH=xml-xalan/c/bin

LD_LIBRARY_PATH=xml-xalan/c/lib

or copy libxalan-c1_1.so to /usr/lib

AIX:

PATH=xml-xalan/c/bin

LIBPATH=xml-xalan/c/lib

or copy libxalan-c1_1.a to /usr/lib

HP-UX 11:

PATH=xml-xalan/c/bin

SHLIB_PATH=xml-xalan/c/lib

or copy libxalan-c1_1.a to /usr/lib

Solaris:

PATH=xml-xalan/c/bin

LD_LIBRARY_PATH=xml-xalan/c/lib

or copy libxalan-c1_1.so to /usr/lib
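As a minimal sketch of the Red Hat Linux settings above, assuming a Bourne-style shell (sh or bash) and a hypothetical XALAN_HOME variable that points at the unzipped xml-xalan directory (neither is prescribed by the Xalan-C++ documentation), the two paths could be set like this:

```shell
# Hypothetical install location; adjust to where you unzipped the download.
XALAN_HOME="${XALAN_HOME:-$HOME/xml-xalan}"

# Put the Xalan-C++ executables at the front of the command search path.
PATH="$XALAN_HOME/c/bin:$PATH"

# Let the dynamic loader find the Xalan-C++ shared library. Any existing
# value is kept after the new entry.
LD_LIBRARY_PATH="$XALAN_HOME/c/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

export PATH LD_LIBRARY_PATH
```

On AIX or HP-UX, the same pattern applies with LIBPATH or SHLIB_PATH in place of LD_LIBRARY_PATH.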

13.1.1.2 Using Xalan-C++ Command-line

Once you've installed the executable file and modified your PATH, the Xalan-C++ command-line version is ready to go.

In a DOS command window or a UNIX terminal window, type testXSLT. This displays a message listing the arguments available for Xalan-C++ on the command line. The command-line version requires at least two arguments, with their corresponding flags: -in for the input filename, and -xsl for the XSLT stylesheet. All the other arguments, including the output filename, are optional:

testXSLT -in XMLFileName -xsl XSLFileName [-out OutFileName] [args]

Where:

testXSLT is the command to run Xalan-C++

XMLFileName is the input XML file name

XSLFileName is the name of your XSLT stylesheet

[OutFileName] is the optional output file name

[args] are additional optional arguments

The installation directory structure includes a samples directory with several basic sample applications, under install-directory\xml-xalan\c\samples on Windows or install-directory/xml-xalan/c/samples on UNIX. The samples are precompiled, each with their own executable with the same name as the directory. For example, using the command-line in the install-directory/xml-xalan/c/samples/SimpleTransform directory, type SimpleTransform to run the test. This will take the foo.xml file and process it according to the foo.xsl stylesheet, producing a foo.out output file. You can also use the command-line executable directly by typing:

testXSLT -in foo.xml -xsl foo.xsl -out foo.out
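If the command is not found, the usual cause is a PATH that does not yet include the Xalan-C++ executables. The following sketch, which assumes a Bourne-style shell and uses a hypothetical helper function named run_transform (it is not part of Xalan-C++), checks for the executable before running the same transformation:

```shell
# run_transform TOOL IN XSL OUT
# Hypothetical helper (not part of Xalan-C++): runs TOOL with the -in, -xsl,
# and -out flags, but first checks that TOOL can be found on PATH.
run_transform() {
    tool=$1; in_file=$2; xsl_file=$3; out_file=$4
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found; check your PATH setting" >&2
        return 127
    fi
    "$tool" -in "$in_file" -xsl "$xsl_file" -out "$out_file"
}

# Transform the SimpleTransform sample; report a failure instead of aborting.
run_transform testXSLT foo.xml foo.xsl foo.out || echo "transformation did not run" >&2
```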

Xalan-C++ includes other command-line arguments (not case-sensitive), listed in Table 13-1 below:

Table 13-1. Command-line options for Xalan-C++
Argument	Action/Effect
`-IN` inputXMLFileName	Input XML URL.
`-XSL` stylesheetFileName	XSL transformation URL.
`-OUT` outputFileName	Output filename.
`-ESCAPE` chars	Which characters to escape; default is <>&"'\r\n.
`-EER`	Expand entity references; default is not to expand.
`-V`	Version info.
`-QC`	Quiet Pattern Conflicts Warnings.
`-Q`	Quiet Mode.
`-INDENT` number	Number of spaces to indent each level in output tree; default is 0.
`-VALIDATE`	Validate the XSL and XML input; default is not to validate. If a DTD is specified in the input XML document, this will validate the XML using that DTD.
`-TT`	Trace the templates as they are being called.
`-TEXT`	Use simple Text formatter.
`-TG`	Trace each result tree generation event.
`-TS`	Trace each selection event.
`-TTC`	Trace the template children as they are being processed.
`-XML`	Use XML formatter and add XML header.
`-NH`	Don't write XML header. Requires that the -XML flag be set.
`-HTML`	Use HTML formatter to generate HTML 4.0.
`-NOINDENT`	Turns off HTML indenting. Requires that the -HTML flag be set.
`-STRIPCDATA`	Strip CDATA sections of their brackets, but do not escape. Requires that either the -XML or -HTML flag be set.
`-ESCAPECDATA`	Strip CDATA sections of their brackets, and escape. Requires that either the -XML or -HTML flag be set.
`-PARAM` name expression	Set a stylesheet parameter. String value expressions should be enclosed in single quotes (').

If you don't want to use the optional arguments, there is a compiled executable that is set up to run a basic transformation with just the input filename, XSLT stylesheet name, and the optional output filename. All the defaults are used, and the flags (-in, etc.) are not required:

XalanTransform XMLFileName XSLFileName [OutFileName]

Where:

XalanTransform is the command to run Xalan-C++

XMLFileName is the input XML file name

XSLFileName is the name of your XSLT stylesheet

OutFileName is the optional output file name

13.1.1.3 Extending Xalan-C++

Xalan-C++ provides the ability to create your own extension functions. See the Xalan-C++ documentation, provided with the download or on the Apache Web site at http://xml.apache.org/xalan-c/extensions.html for information on creating extension functions.

Xalan-C++ does not support creating extension elements at this time. Currently, support for extension elements is planned for version 1.4.

13.1.1.4 Limitations of Xalan-C++

There are several known limitations to the current version of Xalan-C++, including:

Does not support 20 or more digits of numerical precision after the decimal.
The namespace axis does not return the default "xml" namespace.
Does not support case-order and lang attributes in <xsl:sort>.
Does not support extension elements.

The Xalan development mailing list (xalan-dev@xml.apache.org) is a good place for users to report bugs and other issues. Any bugs that are reported should be specified as Xalan-C++ issues on the subject line.

13.1.1.5 Internationalization with Xalan-C++

Xalan-C++ provides support for internationalization with the addition of the International Components for Unicode (ICU)^[2] from IBM's developerWorks.

The ICU provides support for number formatting using the XSLT format-number() function, Unicode-style collation using the <xsl:sort> XSLT element, and character encoding using UTF-16.

Note

Xalan-C++ ignores the format pattern and optional decimal-format name arguments for format-number() unless you install the ICU.

To get the ICU:

Download and unzip the latest ICU source files from the IBM developerWorks open source page: http://oss.software.ibm.com/developerworks/opensource/icu/project/download/index.html
Do an ICU build according to the build instructions in the readme.html that is included with the download. When installing on Windows, the ICU should be on the same drive and at the same level as your installation of Xalan-C++.
Set the ICU_DATA environment variable as shown in the readme.html.

13.1.2 Xalan-J

Xalan-J takes a unique approach to the conventional representation of the input XML document instance. It can represent the XML document instance's nodes as an array, in a representation called the Document Table Model. This allows Xalan-J to outperform some other XSLT processors under certain conditions, such as with large documents. To learn more, consult the Xalan-J documentation, which gives details about the features listed below; we have only summarized them here to aid in your selection of an appropriate XSLT processor, and we've included the various connectivity options with Java and applets, or wrappers.

If you are familiar with Java or JavaScript, for example, and choose to add functions, Xalan-J uses the Bean Scripting Framework (BSF) for adding functionality. According to the documentation, both Java and JavaScript have been tested. This is the reason for including the bsf.jar and bsfengines.jar files in the command-line installation (described in Section 13.1.4). If you do not use extensions, you do not need to add these files to your CLASSPATH to use Xalan-J programmatically or through the command line.

We will present two ways to work with Xalan-J. The first is the conventional command-line method, and the second involves an excellent and extremely convenient GUI interface developed by Eric Lawson of ISOGEN/DataChannel. In fact, one of the features of Xalan-J is that it can be run either by command-line or by its Java API. Additionally, it can be run within an applet or servlet, similar to the interface provided by Lawson, included on the CD.

13.1.3 Using Xalan-J with Eric Lawson's GUI

Using Xalan-J with the GUI by Eric Lawson is simple and convenient. Assuming that you have Java installed as described in Chapter 12, you only need the XSLTConv.zip file included on the CD (XSLTConv.zip in the /software/Xalan/Lawson_GUI directory). The following instructions are derived from the readme file. The instructions below will work with either Windows or UNIX. Macintosh is slightly different, as a command-line is not so readily available in the Mac OS.

The application is basically a graphical front end for the Xalan-J XSLT processor, which takes advantage of Xalan's predilection for being easily run from within an applet. This application frees users from having to actually install Xalan, because it contains all the classes required to perform XSLT transformations. It also frees users from having to use Xalan-J through the command line, which can mean a lot of repetitive typing, and from having to deal with some of the underlying complexity of XSLT processors once Java is on the system.

Java version 1.1.6 or greater with Swing 1.01 or greater should be installed, or any Java 2 installation (1.2 or greater; we recommend Java 1.2). Simply copy the xsltconv.jar file into a directory of your choice. Then, type the following onto the command-line of a UNIX terminal window or an MS-DOS window (via Start, Run, cmd):

java -jar xsltconv.jar

Make sure you run this from the directory where you copied the XSLTConverter files. Once the XSLT Transform window comes up, using it is very simple. Either type in the locations and filenames of the files you want processed by Xalan, or click the Browse button next to each field (XML Source File, XSLT Stylesheet, and Output File) and, just as you would with any graphical File/Open command, point and click to select the files to be used.

One thing to remember is that you usually cannot browse to the output file because, until the transformation has run, it doesn't exist. Instead, you can click Browse, open up the File Chooser, and select the directory in which you want the output file to be placed. Then, type the name of the desired file into the File Chooser (as this file may not exist yet), and click Select File. To actually invoke your XSLT stylesheet and convert the files, simply click the Transform button and the XSLT processor will perform the transformation.

13.1.4 Installing the Basic Command-line Interface for Xalan-J

Using the command-line interface for Xalan-J affords access to a number of extensions and "switches" you can invoke when you run it. Xalan-J requires that you also have the Xerces parser, so you will need to download both it and Xalan-J from http://xml.apache.org/dist/xalan-j/ and http://xml.apache.org/dist/xerces-j/. If you plan to run XSLT extensions, you need bsf.jar and bsfengines.jar, both of which are included in the Xalan-Java distribution. Remember, we suggest that you use JDK 1.2+ so that your settings are simpler. To run the extensions, include bsf.jar and bsfengines.jar in the CLASSPATH.

Unzip the files into the directory you wish to use for running Xalan-J and make the following changes to your system.

At the very least, you must include xalan.jar and xerces.jar on the system CLASSPATH. Thus, where you had a basic ".", you would modify it as follows:
```
set CLASSPATH=/usr/bin/xalan/xerces.jar:/usr/bin/xalan/xalan.jar:.
```
In an autoexec.bat file on Windows, use this set form, but separate the entries with a semicolon (;) and write the paths with the reverse slash (\). In a .cshrc file on Solaris/UNIX, or to set the variable temporarily at a C shell prompt, use setenv, which takes the value as a separate argument rather than after an equals sign:
```
setenv CLASSPATH /usr/bin/xalan/xerces.jar:/usr/bin/xalan/xalan.jar:.
```
To run the sample applications, include xalansamples.jar. Thus, where you had a basic .:/usr/bin, you would modify it as follows:
```
set CLASSPATH=/usr/bin/xalan/xerces.jar:/usr/bin/xalan/xalan.jar:.:/usr/bin/xalan/xalansamples.jar
```
Again, use a semicolon separator and the reverse slash (\) in autoexec.bat on Windows. In .cshrc on Solaris/UNIX, use the colon separator:
```
setenv CLASSPATH /usr/bin/xalan/xerces.jar:/usr/bin/xalan/xalan.jar:.:/usr/bin/xalan/xalansamples.jar
```
To use the extensions, add bsf.jar and bsfengines.jar as follows:
```
set CLASSPATH=/usr/bin/xalan/xerces.jar:/usr/bin/xalan/xalan.jar:.:/usr/bin/xalan/xalansamples.jar:/usr/bin/xalan/bsf.jar:/usr/bin/xalan/bsfengines.jar
```
Once more, use a semicolon separator and the reverse slash (\) in autoexec.bat on Windows, and the colon separator in .cshrc on Solaris/UNIX:
```
setenv CLASSPATH /usr/bin/xalan/xerces.jar:/usr/bin/xalan/xalan.jar:.:/usr/bin/xalan/xalansamples.jar:/usr/bin/xalan/bsf.jar:/usr/bin/xalan/bsfengines.jar
```
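The settings above cover the C shell and Windows. For a Bourne-style shell (sh or bash), which this section does not otherwise address, the same CLASSPATH could be assembled as in the following sketch. The XALAN_DIR variable name is an assumption made for illustration; the directory and jar names are the ones used above.

```shell
# Directory used in the examples above.
XALAN_DIR=/usr/bin/xalan

# Required jars first, then the current directory, as in the chapter.
CLASSPATH="$XALAN_DIR/xerces.jar:$XALAN_DIR/xalan.jar:."

# Optional jars: the samples, then the two BSF jars needed for extensions.
for jar in xalansamples.jar bsf.jar bsfengines.jar; do
    CLASSPATH="$CLASSPATH:$XALAN_DIR/$jar"
done

export CLASSPATH
echo "$CLASSPATH"
```

Building the value in a loop avoids the stray spaces and misspelled jar names that are easy to introduce when a long CLASSPATH is typed on one line.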

13.1.5 Using Xalan-J with the Command-line Interface and Extensions

Once you have Xalan-J installed for command-line usage, you have several options when invoking it, which can affect the kinds of output and processing that take place. Be sure your CLASSPATH is set as above and that you have Java correctly installed. The basic invocation is as follows:

java org.apache.xalan.xslt.Process -in source.xml -xsl stylesheet.xsl -out output.xml

Notice that -in, -xsl, and -out precede the XML document instance, the XSLT stylesheet, and the result file, respectively. The java command starts the JVM, and org.apache.xalan.xslt.Process is the class that performs the XSLT processing. If you are using the Microsoft virtual machine, use jview instead of java.

Running Xalan-J from the command line offers several additional capabilities, such as setting a parameter value from "outside" the XSLT stylesheet. Thus, within the XSLT stylesheet, you might declare a parameter whose value can differ each time you run the stylesheet through Xalan.

<xsl:param name="birthday" select="'******'"/>

Here, you could supply the date on the command line, if your stylesheet was in some way dependent on the time at which it was run.

java org.apache.xalan.xslt.Process -PARAM birthday '09-13-63' -in source.xml -xsl process_date.xsl -out result.xml

Notice how -PARAM takes two arguments: the name declared in the name attribute of <xsl:param>, and the value to use for that parameter on this particular run of the XSLT stylesheet.
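On a UNIX shell, the quoting of the parameter value deserves care. Table 13-2 describes the second -PARAM argument as an expression, and a POSIX shell removes one level of quotes before the processor sees the argument. The sketch below, which assumes sh or bash and only prints the command rather than running Java, wraps the single quotes in double quotes so that they survive:

```shell
# Value to pass for the birthday parameter declared in the stylesheet.
value="09-13-63"

# The outer double quotes are consumed by the shell; the inner single quotes
# remain, so the processor receives the string literal '09-13-63'.
param="'$value'"

# Print the command instead of executing it, so this runs without Java.
echo java org.apache.xalan.xslt.Process -PARAM birthday "$param" \
    -in source.xml -xsl process_date.xsl -out result.xml
```

Without the inner quotes, the argument would reach the processor as 09-13-63, which an expression evaluator is likely to read as a subtraction rather than as a date string.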

Xalan-J accepts a range of command-line switches, each beginning with a hyphen (-). The command-line utility can take the flags and arguments listed in Table 13-2; these flags, or switches, are case-insensitive.

Table 13-2. Xalan-J arguments and flags to be invoked at runtime
Argument	Action/Effect
`-IN` filename	Input filename.
`-XSL` filename	XSL transformation URL.
`-OUT` filename	Output filename.
`-LXCIN` filename	Compiled stylesheet filename in.
`-LXCOUT` filename	Compiled stylesheet filename out.
`-PARSER` classname	Fully qualified class name of parser liaison.
`-V`	Displays version information.
`-QC`	Quiet pattern conflicts warning.
`-Q`	Quiet mode.
`-LF`	Use linefeeds only on output; default is CR/LF.
`-CR`	Use carriage returns only on output; default is CR/LF.
`-INDENT` number	Number of spaces to indent each level in output tree; default is 0.
`-TT`	Trace the templates as they are being called.
`-TG`	Trace each result tree generation event.
`-TS`	Trace each selection event.
`-TTC`	Trace the template children as they are being processed.
`-VALIDATE`	Validate the XML and XSL input; validation is off by default.
`-EDUMP` filename	Do stackdump on error, output to optional filename.
`-XML`	Use XML formatter and add XML header.
`-TEXT`	Use simple text formatter.
`-HTML`	Use HTML formatter.
`-PARAM` name expression	Set a stylesheet parameter.

13.1.6 Xalan-J Extensions

Xalan-J has three built-in extension elements that together provide the functionality to send the output to several different files from a single input XML document instance. These elements are <redirect:open>, <redirect:write>, and <redirect:close>; together, they are called the Xalan-J Redirect Extension. In addition, Xalan-J implements two extension elements, <lxslt:script> and <lxslt:component>, that allow user-defined extensions to be processed. The Xalan-J namespace must be declared on or before the Xalan-J extension elements as follows:

xmlns:lxslt="http://xml.apache.org/xslt"

Because the Xalan-J product was originally an implementation of the LotusXSL product, that namespace can also be used as follows:

xmlns:lxslt="http://xsl.lotus.com/"

13.1.6.1 Xalan-J Redirect Extension

The Redirect extension includes three elements that redirect portions of your XSLT stylesheet output to multiple files: <redirect:open>, <redirect:write>, and <redirect:close>. If you use the <redirect:write> element alone, the extension opens a file, writes to it, and closes the file immediately. If you want direct control over the opening and closing of files while your XSLT stylesheet is being processed by Xalan, use the <redirect:open> and <redirect:close> elements.

When the redirect extension elements are used, the redirect namespace should be declared, in addition to the Xalan-J namespace, using the following declaration:

xmlns:redirect="org.apache.xalan.xslt.extensions.Redirect"

Note that the redirect prefix should also be added to the extension-element-prefixes attribute on the document element. The following element model definitions show the structure of each of these elements:

<!-- Category: extension-element -->
<redirect:write
  select = expression
  file = string >
  <!-- content: (template) -->
</redirect:write>

<!-- Category: extension-element -->
<redirect:open
  select = expression
  file = string />

<!-- Category: extension-element -->
<redirect:close
  select = expression
  file = string />

Each of these elements includes an optional file attribute and/or an optional select attribute to designate the output file. If you use file, this attribute requires a string. It can be used to directly specify the output filename, in a sense "hardwiring" it. The select attribute takes an XPath expression, so you can use it to dynamically generate the output filename with the evaluation of the contents of the expression. Using both attributes causes the Redirect implementation to first evaluate the value of the select attribute. This is a sort of contingency processing model, as Xalan-J "falls back" to the string value of the file attribute if the select attribute expression does not return a valid filename.

The <redirect:open> and <redirect:close> elements must be used together. Both elements are empty, but the <redirect:close> element acts as a closing tag for the <redirect:open> element. A file that is opened with <redirect:open> must be closed with a <redirect:close> element that has the same file or select attribute value.

Example 13-1, from the Xalan-J documentation,^[3] shows the use of the Redirect extensions.

13.1.6.2 Xalan-J User-Defined Extensions

The Xalan-J namespace provides support for the <lxslt:component> extension element and its <lxslt:script> sub-element. Together these elements allow users to define their own extensions that will be implemented by the Xalan-J processor.

13.1.6.3 The Xalan-J `<lxslt:component>` Extension Element

The <lxslt:component> element is used to define the prefix for a user-defined namespace, as well as the names of any extension functions or elements that are being created. It has four attributes: prefix, functions, elements, and namespace-uri. The following element model definition shows the structure of the <lxslt:component> element:

<!-- Category: extension-element -->
<lxslt:component
  prefix = prefix
  functions = "func-1 func-2 ...func-n"
  elements = "elem-1 elem-2 ...elem-n"
  namespace-uri = string >
  <!-- content: (lxslt:script) -->
</lxslt:component>

This element contains the <lxslt:script> element used to define the extension function or element. The prefix attribute is used to specify the namespace prefix for the user-defined functions. The functions attribute is used to specify the names of the extension functions being defined, and the elements attribute is used to specify the names of the extension elements being defined. Both functions and elements attributes are required. The values for the functions and elements attributes are a list of names separated by whitespace. The namespace-uri attribute is used to specify a URI for the namespace prefix specified in the prefix attribute.

Example 13-1 Using Redirect with Xalan.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0"
     xmlns:lxslt="http://xml.apache.org/xslt"
     xmlns:redirect="org.apache.xalan.xslt.extensions.Redirect"
     extension-element-prefixes="redirect">

  <xsl:template match="/">
    <out>
      default output.
    </out>
    <redirect:open file="doc3.out"/>
    <redirect:write file="doc3.out">
      <out>
        <redirect:write file="doc1.out">
          <out>
            doc1 output.
            <redirect:write file="doc3.out">
              Some text to doc3
            </redirect:write>
          </out>
        </redirect:write>
        <redirect:write file="doc2.out">
          <out>
            doc2 output.
            <redirect:write file="doc3.out">
              Some more text to doc3
              <redirect:write select="doc/foo">
                text for doc4
              </redirect:write>
            </redirect:write>
          </out>
        </redirect:write>
      </out>
    </redirect:write>
    <redirect:close file="doc3.out"/>
  </xsl:template>

</xsl:stylesheet>

The Xalan-J documentation provides the following implicit DTD fragment for <lxslt:component>.

<!ELEMENT lxslt:component (lxslt:script)>
<!ATTLIST lxslt:component
  prefix CDATA #IMPLIED
  namespace-uri CDATA #IMPLIED
  elements NMTOKENS #REQUIRED
  functions NMTOKENS #REQUIRED>

13.1.6.4 The Xalan-J `<lxslt:script>` Extension Element

For each <lxslt:component> element you must include an <lxslt:script> element to define the extension element or function. The <lxslt:script> element is similar to the <xsl:script> element specified in the XSLT 1.1 WD.^[4] It has two attributes: lang and src. The following element model definition shows the structure of the <lxslt:script> element:

<!-- Category: top-level-extension-element -->
<lxslt:script
  lang = string
  src = string >
  <!-- content: (#PCDATA) -->
</lxslt:script>

The lang attribute is used to specify the name of the scripting language that the function uses, and the src attribute is used to specify the fully qualified class name. For example, if the extension is implemented in Java, the lang would be javaclass, the src would be the class name, and the <lxslt:script> element would be empty, as follows:

<lxslt:script lang="javaclass" src="classname"/>

If the extension is implemented in JavaScript, the lang would be javascript, and the <lxslt:script> element would contain the JavaScript code. Example 13-2, from the Xalan-J documentation, shows a JavaScript implementation.

Example 13-2 Using `<lxslt:script>` to define an extension using JavaScript.

<lxslt:component prefix="counter"
                 elements="init incr" functions="read">
  <lxslt:script lang="javascript">
    var counters = new Array();

    function init (xslproc, elem) {
      name = elem.getAttribute ("name");
      value = parseInt(elem.getAttribute ("value"));
      counters[name] = value;
      return null;
    }

    function read (name) {
      // Return a string.
      return "" + (counters[name]);
    }

    function incr (xslproc, elem)
    {
      name = elem.getAttribute ("name");
      counters[name]++;
      return null;
    }
  </lxslt:script>
</lxslt:component>

The new extension elements and function can then be used in a template as shown in Example 13-3.

Example 13-3 Using Xalan-J with User-defined Extensions.

<xsl:template match="/">
  <HTML>
    <H1>Names in alphabetical order</H1>
    <counter:init name="index" value="1"/>
    <xsl:for-each select="doc/name">
      <xsl:sort select="@last"/>
      <xsl:sort select="@first"/>
      <p>
      <xsl:text>[</xsl:text>
      <xsl:value-of select="counter:read('index')"/>
      <xsl:text>]. </xsl:text>
      <xsl:value-of select="@last"/>
      <xsl:text>, </xsl:text>
      <xsl:value-of select="@first"/>
      </p>
      <counter:incr name="index"/>
    </xsl:for-each>
  </HTML>
</xsl:template>

13.2 Saxon

Michael Kay has contributed what might be considered one of the most robust and versatile XSLT processors with his Saxon product. It has one of the largest sets of built-in extension top-level elements, instruction elements, and functions. It also runs on Java and is regularly updated at the source Web site, http://users.iclway.co.uk/mhkay/saxon.

Saxon includes a servlet that allows it to be invoked directly from a URL entered into a browser. You might think of Saxon as the "programmer's XSLT processor," due to its extended documentation for adding extensions, event handlers, and so forth (see the api-guide.html file in the Saxon user documentation).

The Saxon XSLT processor is available in two forms, a "complete" Saxon API for Java, and a simple command-line version of the processor, called Instant Saxon.

The complete Saxon API contains a Java library, which supports a similar processing model to XSL, but allows full programming capability, which you need if you want to perform complex processing of the data or to access external services such as a relational database. It includes a typical set of .jar files that are added to the CLASSPATH environment, and also contains utilities such as a DTD generator and other goodies, including documentation.

The simple version, Instant Saxon, runs straight from a Windows command-line. The Microsoft JVM must be installed on the system prior to using Instant Saxon. However, if you use Internet Explorer 4 or later, the JVM will already be on your system. The Instant Saxon installation does not emphasize the extras or the documentation, so it may be worth downloading both versions just to get these. The Instant Saxon installation comes bundled with the AElfred XML parser from Microstar.^[5]

13.2.1 Installing Full Saxon on Solaris/UNIX or Windows Java

If you are installing the full Saxon product, you will need JDK 1.2 (1.1.6+ will do, but is not recommended). Kay notes that the current version is compiled with Java 2 and will run with 1.1, but will not compile under 1.1. If you do not use the default AElfred parser included with Saxon, you will also need a SAX1 or SAX2 parser, such as XP.

The core program for working with objects is a JAR file, saxon.jar, which you must include on your CLASSPATH. We will continue to work with the model introduced above, which assumes you will put this in a directory under /usr/bin, likely called /usr/bin/saxon.

You can find additional user documentation, covering both the XSLT and Java interfaces, included in the Saxon package as JAVADOC specifications. These package summaries give an overview in the form of a user guide. In addition, there is an introductory overview, included with the documentation provided with Saxon.

Saxon comes with a bundled XML parser, a modified copy of the AElfred parser, adapted to report comments to the application. Saxon has been tested successfully in the past with Lark, MSXML, SUN Project X, Oracle XML, Xerces, xml4j, and XP. Use of a SAX2-compliant parser is preferred, as SAX1 does not allow XML comments to be passed to the application. However, Saxon works with either. All the relevant classes must be installed on your Java CLASSPATH. The following examples assume that you will use the xp.jar XML parser and that you have put it in your directory with Saxon.

At the very least, you must include saxon.jar and xp.jar on the system CLASSPATH. Thus, where you had a basic ".", you would modify it as follows:

set CLASSPATH=/usr/bin/saxon/xp.jar:/usr/bin/saxon/saxon.jar:.

In an autoexec.bat file on Windows, use the same syntax, but remember to use the semicolon (;) and reverse slash (\). In a .cshrc file on Solaris/UNIX, use setenv, which takes the value as a separate argument rather than after an equals sign:

setenv CLASSPATH /usr/bin/saxon/xp.jar:/usr/bin/saxon/saxon.jar:.

To run full Saxon, unless you've attached some applet wrapper or invoked it from a URL in a browser (in which case, you should review the Saxon documentation index.html file), open a command line window and run it with the following syntax:

java com.icl.saxon.StyleSheet [options] source.xml stylesheet.xsl [params . . .]

13.2.2 Installing Instant Saxon on Windows

All you need to install Instant Saxon is the download zip file located at http://users.iclway.co.uk/mhkay/saxon. You do not need to add any extra parsers or to modify PATH or CLASSPATH environment variables, provided you have IE 4+ (IE 5 recommended) on your Windows 95, 98, or NT/2000 machine. Unzip the file in the directory where you plan to use Saxon and you are ready to go.

To run Instant Saxon on Windows, use a command-line or DOS window (select Start, Run, and type cmd) and run it with the following syntax:

saxon [options] input.xml stylesheet.xsl [params . . .]

Options and parameters for Instant Saxon are described in the following sections. The input.xml and stylesheet.xsl represent filenames for the input XML document and the XSL stylesheet being used, respectively.

13.2.3 Saxon Options

Saxon has a number of command-line options that are used when invoking Saxon with an XSLT stylesheet (see Table 13-3). The options must precede the input.xml and the stylesheet.xsl filenames on the command-line:

saxon [options] input.xml stylesheet.xsl [params . . .]

Table 13-3. Command-line options for Saxon
Argument	Action/Effect
`-a`	Used with XML documents that directly contain a stylesheet. This means that the filename for the stylesheet on the command line is not required. See Chapter 2, Section 2.7 for more information on including XSLT stylesheets in an XML document.
`-ds | -dt`	Selects which internal tree model is to be used. -dt (which is the default) selects the "tinytree" model, and -ds selects the traditional tree model.
`-l`	Saxon implements a line numbering function, `saxon:line-number()`, to access the line number for each line in the input document. This option enables (turns on) the line numbering for the source document.
`-m` classname	Used with the `<xsl:message>` element to control the output of messages as a new document. Must be used with the com.icl.saxon.output.Emitter class.
`-r` classname	Used with the `document()` function and the `<xsl:include>` and `<xsl:import>` elements to resolve URIs into a source document. Also used with the -u option to process the URIs of the input file and stylesheet file provided on the command-line.
`-o` filename	Used to provide a filename for the output from the processor. This option checks the extension of the filename provided to determine the output file type if one is not explicitly specified with the method attribute of `<xsl:output>`.
`-t`	Displays the version and timing information.
`-T`	Displays stylesheet tracing information. Also enables (turns on) the line numbering for the source document.
`-TL` classname	Signals the processor to use a TraceListener. The name of a user-defined class, which must implement com.icl.saxon.trace.TraceListener, is specified with the classname.
`-u`	Provides the ability to use URLs for the input and stylesheet filenames on the command line. If the filenames start with "http:" or "file:" they are assumed to be URLs, and this option is not required.
`-w0`, `-w1`, or `-w2`	Saxon implements three levels of recovery when an error occurs. The level can be specified on the command-line as: w0, recover silently; w1, recover after writing out a warning message; w2, signal the error and do not attempt recovery. The default is w1.
`-x` classname	The SAX parser used to process the XML files can be specified using this option. The classname specifies a Java class that implements the org.xml.sax.Parser or org.xml.sax.XMLReader interface.
`-y` classname	The SAX parser used to process the XSLT files can be specified using this option. The classname specifies a Java class that implements the org.xml.sax.Parser or org.xml.sax.XMLReader interface.
`-?`	Displays the help for Saxon's command-line syntax.

13.2.4 Saxon Command-line Parameters

Saxon provides the ability to submit parameter values through the command-line at run-time to update global parameters defined in the stylesheet with the <xsl:param> top-level element. The parameters must follow the filenames for the input XML document and the XSLT stylesheet on the command-line as follows:

saxon [options] input.xml stylesheet.xsl [params . . .]

A parameter value is passed to the stylesheet in the form name=value, where name is the name of the parameter defined in the stylesheet with <xsl:param>, and value is the new value for the parameter. If the parameter is not declared in the stylesheet, the parameter from the command-line is ignored. Parameter values that contain spaces should be surrounded with double quotes on the command-line.
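As a minimal sketch of the name=value mechanism, the stylesheet below declares a global parameter with a default that applies when nothing is supplied on the command line. The parameter name title, its default value, and the report and heading elements are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Global parameter (hypothetical name). The select value is the
       default used when no title=... argument is given. -->
  <xsl:param name="title" select="'Untitled'"/>

  <xsl:template match="/">
    <report>
      <heading><xsl:value-of select="$title"/></heading>
      <xsl:apply-templates/>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

With this stylesheet, a call such as saxon input.xml stylesheet.xsl title="Quarterly Sales" would be expected to replace the default, the double quotes being needed because the value contains a space.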

13.2.5 Saxon Extensions

Saxon includes one of the largest collections of built-in extensions: extension top-level elements, extension functions, extension attributes, and extension instruction elements. The following material is excerpted and annotated from the extensions.html file in the documentation included with the current Saxon download. The most up-to-date documentation is available at http://users.iclway.co.uk/mhkay/saxon/. Kay provides the following preface to users of the Saxon extensions:

These extension functions and elements have been provided because there are things that are difficult to achieve, or inefficient, using standard XSLT facilities alone. As always, it is best to stick to standard if you possibly can: and most things are possible, even if it's not obvious at first sight.

13.2.5.1 Saxon Attribute Extensions

Saxon implements the following extension attributes: trace, allow-avt, disable-output-escaping,^[6] method,^[7] indent-spaces, character-representation, omit-meta-tag, and next-in-chain.

The use of the Saxon extension attributes requires that the Saxon namespace be declared either in the document element, an element that uses the extension, or an ancestor of the element that uses the extension. The Saxon namespace is declared using the following format:

xmlns:saxon="http://icl.com/saxon"

The `saxon:trace` Extension Attribute

This attribute can be used on either the document element or an <xsl:template> element, and turns on echoing of the instantiation for each template rule. The reporting is sent to the standard error output, whether the command-line or a GUI window, as implemented by the application.

If you use this attribute on the document element, all the top-level elements are listed along with their import precedence. All contained template rules are then traced as well. The default value for saxon:trace is no, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:trace (yes|no) "no" VALUE = (yes|no) "no"

Use this attribute on either the <xsl:stylesheet> or <xsl:transform> document elements, or a template rule as follows:

<xsl:template match="block" saxon:trace="yes">
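To show the namespace declaration and the attribute together, here is a minimal sketch of a complete stylesheet that traces one template rule and leaves another untraced. The title template and the literal result elements are hypothetical.

```xml
<?xml version="1.0"?>
<!-- The saxon prefix is declared on the document element, so it is in
     scope for every element in the stylesheet. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    exclude-result-prefixes="saxon">

  <!-- Tracing is requested for this template rule only. -->
  <xsl:template match="block" saxon:trace="yes">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- No saxon:trace here, so the default value of "no" applies. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

</xsl:stylesheet>
```

Moving saxon:trace="yes" from the template to the <xsl:stylesheet> element would, per the description above, extend the trace to all contained template rules.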

The `saxon:allow-avt` Extension Attribute

This extension attribute is used with the <xsl:call-template> instruction element. It allows the value given for the name attribute of <xsl:call-template> to be interpreted as an attribute value template when the value is surrounded by curly braces {} (see Chapter 6, Section 6.6.1). Since attribute value templates are not normally allowed as the value for the name in <xsl:call-template>, adding the extension attribute will prevent a processor error. The default value for saxon:allow-avt is no, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:allow-avt (yes|no) "no" VALUE = (yes|no) "no"

Use the saxon:allow-avt attribute as follows to permit AVTs in the value for name:

     <xsl:call-template name="{$some_variable}" saxon:allow-avt="yes" >
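A slightly fuller sketch shows where the variable might come from. The named templates brief and full, the item element, and its style attribute are hypothetical; the sketch assumes every style attribute holds the name of one of the two templates.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon">

  <!-- Hypothetical named templates, one per output style. -->
  <xsl:template name="brief">
    <xsl:value-of select="title"/>
  </xsl:template>

  <xsl:template name="full">
    <xsl:value-of select="title"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="summary"/>
  </xsl:template>

  <xsl:template match="item">
    <!-- The template to call is named by the item's style attribute,
         assumed to be either "brief" or "full". -->
    <xsl:variable name="some_variable" select="@style"/>
    <xsl:call-template name="{$some_variable}" saxon:allow-avt="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the attribute is specific to Saxon, a stylesheet written this way should not be expected to run under other processors.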

The `saxon:disable-output-escaping` Extension Attribute

The disable-output-escaping attribute has been implemented in the XSLT specification and is no longer a Saxon extension. Its use can be found in Chapter 3 in conjunction with the <xsl:value-of> element, and Chapter 6 in conjunction with the <xsl:text> element.

The `method` Attribute with Saxon

The method attribute of <xsl:output> and <xsl:document> is not an extension attribute, but its value can contain a QName that is governed by a processor. The prefix of the QName must be a valid namespace prefix. We use saxon as the prefix in the following examples; however, it can be any valid prefix. Saxon implements the method attribute with the values shown in Table 13-4.

Table 13-4. Values of QNames implemented by the Saxon processor^[a]
QName	Action
saxon:fop	Directs output to Apache's FOP processor (which must be installed separately from www.apache.org), which implements the developing W3C formatting objects, or FO, portion of XSL.
saxon:xhtml	Outputs the result tree in XHTML format. This follows the same rules as method="xml", except that it follows the guidelines for making the XML acceptable to legacy HTML browsers. Specifically, (a) empty elements such as <br/> are output as <br />, and (b) empty elements such as <p/> are output as <p></p>. The indent attribute defaults to "yes", and indenting follows the HTML rather than XML rules. Other attributes may be specified as for XML output, e.g., cdata-section-elements and omit-xml-declaration.
saxon:classname	The fully qualified class name of a class that implements either the SAX org.xml.sax.DocumentHandler interface, or the SAX2 org.xml.sax.ContentHandler interface, or that is a subclass of the com.icl.saxon.output.Emitter class. If such a value is specified, output is directed to the user-supplied class.

^[a] The information for this table comes directly from the Saxon 6.2.2 documentation.

Use the method attribute as follows:

<xsl:output method="saxon:fop"/>

The `saxon:indent-spaces` Extension Attribute

The saxon:indent-spaces attribute controls the amount of indentation that is generated when the file output method is XML or HTML, and indent is set to yes on either <xsl:output> or <xsl:document> elements. The value of the attribute must be an integer.

The value for saxon:indent-spaces is a number, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:indent-spaces NMTOKEN #IMPLIED VALUE = Number

Use the saxon:indent-spaces attribute as follows:

<xsl:output saxon:indent-spaces="10"/>

The `saxon:character-representation` Extension Attribute

This attribute is used with <xsl:output> or <xsl:document>, and controls how non-ASCII characters are represented in the output. It works with the two method values, xml and html.

When used with the xml method, its value can be either decimal or hex.

When used with the html method, the value consists of two strings, separated by a semicolon. The first string controls how non-ASCII characters within the character encoding are represented, the values being native, entity, decimal, or hex. The second string controls how characters outside the encoding are represented, the values being entity, decimal, or hex.

The value for saxon:character-representation is a string, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:character-representation CDATA #IMPLIED VALUE = String

Use the saxon:character-representation attribute as follows:

<xsl:output method="xml" saxon:character-representation="hex"/>
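For the html method, the two-part value might look like the following minimal sketch. The encoding and the particular pair of values are illustrative assumptions, not recommendations.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon">

  <!-- First string (native): non-ASCII characters that exist in the
       encoding are written as themselves.
       Second string (decimal): characters outside the encoding are
       written as decimal character references. -->
  <xsl:output method="html" encoding="iso-8859-1"
      saxon:character-representation="native;decimal"/>

  <xsl:template match="/">
    <html><body><xsl:apply-templates/></body></html>
  </xsl:template>

</xsl:stylesheet>
```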

The `saxon:omit-meta-tag` Extension Attribute

This attribute is used with <xsl:output> and the html method. The normal action of the html output method is to generate a <META> tag immediately after the <HEAD> tag, containing details of the media type and character encoding. Setting this attribute to "yes" causes this output to be suppressed.

The values for saxon:omit-meta-tag are yes or no, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:omit-meta-tag (yes|no) "no" VALUE = (yes|no) "no"

Use the saxon:omit-meta-tag attribute as follows:

<xsl:output method="html" saxon:omit-meta-tag="yes"/>

The `saxon:next-in-chain` Attribute

The saxon:next-in-chain attribute is used with either <xsl:output> or <xsl:document> to direct the output to another stylesheet. The output is then used as the input for the new stylesheet. The value of the attribute is the URL of the new stylesheet. The output stream must always be pure XML, and attributes that control the format of the output (e.g., method, cdata-section-elements, etc.) will be ignored. The output of the second stylesheet will be directed to the destination that would have been used for the first stylesheet if no saxon:next-in-chain attribute were present. When used with <xsl:output>, the original transformation result destination is used. When used with <xsl:document>, the file specified by the href attribute is used. The value for saxon:next-in-chain is a URL, as shown in the following attribute model definition:

EXTENSION ATTRIBUTE:  saxon:next-in-chain CDATA #IMPLIED VALUE = URL

Use the saxon:next-in-chain attribute as follows:

<xsl:output saxon:next-in-chain="http://mystyles/newstyle.xsl"/>

13.2.5.2 Saxon Extension Elements

Saxon adds four top-level extension elements: <saxon:handler>, <saxon:preview>, <saxon:function>, and <saxon:script>, as well as eight instruction extension elements: <saxon:assign>, <saxon:doctype>, <saxon:entity-ref>, <saxon:group>, <saxon:item>, <saxon:output>, <saxon:return>, and <saxon:while>.

To use Saxon extension elements, their namespace must be declared and the extension-element-prefixes attribute on the document element must include the saxon value.

All these extensions are available to either full Saxon or Instant Saxon. However, to use the external Java calls as with <saxon:output> you may need the accompanying documentation, which comes with full Saxon.

The `<saxon:handler>` Top-Level Extension Element

The <saxon:handler> top-level extension element is similar to <xsl:template>, and has the same uses for the match, mode, name, and priority attributes, as shown in the following element model definition:

<!-- Category: top-level-extension-element -->
<saxon:handler
  handler = classname
  match = pattern
  name = qname
  priority = number
  mode = qname />

This element is sorted for precedence of instantiation in equal standing with any other <xsl:template> element. Its function is to call a user-written JavaNodeHandler with the mandatory handler attribute. The JavaNodeHandler and the <saxon:handler> element are explained in detail in the Saxon documentation (begin with the extensions.html file in the Saxon documentation).

The `<saxon:preview>` Top-Level Extension Element

This top-level extension element is designed to facilitate more efficient handling of large documents. In the traditional XSLT stylesheet processing model, each template rule is evaluated for a match to determine if it will be instantiated in turn. This means that the entire input XML document instance is parsed for a match for every single template rule, which is very time- and system-resource consuming.

With <saxon:preview>, the relevant parts of the input source, those which find the template match, are processed as soon as they are parsed, then removed from the virtual document tree, saving on memory resources. In effect, it is possible to break the transformation of the document source into a series of separate smaller transformations. The elements listed in the mandatory elements attribute are "disregarded" by the Saxon processor after they have been treated according to whatever mode has been stipulated in the mandatory mode attribute. The results are written to the output result tree, but those elements in the input XML document instance are ignored in subsequent evaluation of other templates in the XSLT stylesheet. The following element model definition shows the structure of <saxon:preview>.

<!-- Category: top-level-extension-element -->
<saxon:preview
  mode = qname
  elements = qnames >
  <!-- Content: (xsl:param*, template) -->
</saxon:preview>

The <saxon:preview> element can be used to simply weed out undesired input elements by using it as a template that does nothing; in other words, give it no child instruction elements, only the list of elements to be ignored for that mode.

The `<saxon:function>` Top-Level Extension Element

The top-level <saxon:function> extension element is used to declare an extension function. It contains a template, preceded by zero or more <xsl:param> elements. It has a required name attribute whose value is a QName, evaluating to a URI, as shown in the following element model definition:

<!-- Category: top-level-extension-element -->
<saxon:function
  name = qname >
  <!-- Content: (xsl:param*, template?, saxon:return*, xsl:fallback?) -->
</saxon:function>

The function definition contains zero or more <saxon:return> instructions to define the return value. The Saxon documentation provides additional information for defining functions using the <saxon:function> element.

An example of using <saxon:function> from the Saxon Documentation is as follows:

<saxon:function name="my:initial">
    <xsl:param name="size"/>
    <saxon:return select="substring(.,1,$size)"/>
</saxon:function>

<xsl:template match="text()">
    <xsl:value-of select="my:initial(3)"/>
</xsl:template>

The `<saxon:script>` Top-Level Extension Element

The <saxon:script> element is a top-level element that is equivalent to <xsl:script>, defined in XSLT 1.1 WD. Saxon provides this element so that it can be used in stylesheets that are shared among different processors. Any processor other than Saxon will ignore this element.

For example, to use an extension function like xx:intersection(), you can define the Saxon implementation as follows:

<saxon:script implements-prefix="xx" language="java"
      src="java:com.icl.saxon.functions.Extensions"/>

The following element model definition shows the structure of the <saxon:script> element:

<!-- Category: top-level-extension-element -->
<saxon:script
  implements-prefix = ncname
  language = "ecmascript" | "javascript" | "java" | qname-but-not-ncname
  src = uri-reference
  archive = uri-references >
  <!-- Content: #PCDATA -->
</saxon:script>

The `<saxon:assign>` Extension Element

This element provides a very useful feature that allows XSLT variables and parameters to be dynamically updated in the context of a template rule. Currently, XSLT variables and parameters, as codified in the W3C specification, cannot be updated other than in the case of a parameter with the use of <xsl:with-param>, which has limited uses. For example, you might have declared a variable named birthday as follows:

<xsl:variable name="birthday" select="@date" saxon:assignable="yes" />

You can then update it to make an employee password a combination of start date, Social Security number, and birthdate.

<xsl:template match="password">
      <xsl:attribute name="password">
            <xsl:value-of select="@ssn" />
            <saxon:assign name="birthday"
                  select="concat($birthday, @start-date)" />
      </xsl:attribute>
</xsl:template>

This extension instruction element can also contain a template, as shown in the element model definition below:

<!-- Category: instruction-extension-element -->
<saxon:assign
  name = qname
  select = node-set-expression >
  <!-- Content: (template) -->
</saxon:assign>

The variable being updated must have been defined using the extension attribute saxon:assignable="yes". The value of the variable is determined either using the select attribute or by instantiating the template it contains.
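Putting these requirements together, a minimal sketch of a running counter might look as follows. The variable name count and the entry and line elements are hypothetical; the sketch relies only on the saxon:assignable attribute and the select attribute described above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <!-- Global variable marked as assignable so that saxon:assign
       is allowed to update it. -->
  <xsl:variable name="count" select="0" saxon:assignable="yes"/>

  <xsl:template match="entry">
    <!-- Replace the variable's value with the result of the
         select expression. -->
    <saxon:assign name="count" select="$count + 1"/>
    <line>
      <xsl:value-of select="$count"/>
      <xsl:text>. </xsl:text>
      <xsl:value-of select="."/>
    </line>
  </xsl:template>

</xsl:stylesheet>
```

A stylesheet that updates variables this way depends on the order in which templates are instantiated, which is one reason to prefer standard constructs such as position() or <xsl:number> where they suffice.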

The `<saxon:doctype>` Extension Element

The <saxon:doctype> instruction element is used to insert a document type declaration into the current output file. It has no attributes, and its content is a template, as shown in the element model definition below. The template is instantiated to create an XML document that represents the DTD to be generated.

<!-- Category: instruction-extension-element -->
<saxon:doctype>
  <!-- Content: (template) -->
</saxon:doctype>

Note

If this element is present, the doctype-system and doctype-public attributes of <xsl:output> are ignored.

The Saxon documentation provides detailed information on the output format and usage of the <saxon:doctype> element. An example of using <saxon:doctype> from the Saxon documentation is as follows:

<xsl:template match="/">
  <saxon:doctype xsl:extension-element-prefixes="saxon">
  <dtd:doctype name="booklist"
       xmlns:dtd="http://icl.com/saxon/dtd" xsl:exclude-result-prefixes="dtd">
    <dtd:element name="booklist" content="(book)*"/>
    <dtd:element name="book" content="EMPTY"/>
    <dtd:attlist element="book">
      <dtd:attribute name="isbn" type="ID" value="#REQUIRED"/>
      <dtd:attribute name="title" type="CDATA" value="#IMPLIED"/>
    </dtd:attlist>
    <dtd:entity name="blurb">'A <i>cool</i> book with &gt; 200 pictures!'</dtd:entity>
    <dtd:entity name="cover" system="cover.gif" notation="GIF"/>
    <dtd:notation name="GIF" system="http://gif.org/"/>
  </dtd:doctype>
  </saxon:doctype>
  <xsl:apply-templates/>
</xsl:template>

The `<saxon:entity-ref>` Extension Element

This instruction element allows HTML entities such as &nbsp; to be generated in HTML output when the <xsl:output> top-level element has a method attribute of html. Use the element as follows:

<saxon:entity-ref name="nbsp" />

This empty element has one required attribute, name, as shown in the element model definition below:

<!-- Category: instruction-extension-element --> <saxon:entity-ref  name = qname />

The `<saxon:group>` Extension Element

The grouping mechanism provided by <saxon:group> allows iteration over nodes selected in an expression returning a node-set. The required select attribute is used to define the nodes which will be used for the iteration, as shown in the following element model definition:

<!-- Category: instruction-extension-element -->
<saxon:group
  select = node-set-expression
  group-by = string >
  <!-- Content: (xsl:sort*, template?, saxon:item, template?) -->
</saxon:group>

This instruction element is similar in function to <xsl:for-each>. It also requires a group-by attribute, whose value is a string expression that is applied to each item selected by the select attribute, to determine how the grouping is to be done. This element can have <xsl:sort> children and must have a <saxon:item> child (see the section immediately below). The other instructions contained in <saxon:group> are performed once for each group of items selected by the select attribute.

The `<saxon:item>` Extension Element

This element is the required child of the <saxon:group> element, and stipulates the items within a group. XSLT instructions outside of <saxon:item> are executed once for each group that qualifies in the group-by attribute of the <saxon:group>. The XSLT instructions that are children of <saxon:item> are executed once per item. This element has no attributes, and contains a template, as shown in the element model definition below:

<!-- Category: extension-element -->
<saxon:item>
  <!-- Content: (template) -->
</saxon:item>
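The relationship between <saxon:group> and <saxon:item> can be sketched as follows. The booklist, book, and title elements and the category attribute are hypothetical, and the <xsl:sort> is included on the assumption that items sharing a group-by value should be brought together before grouping.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <xsl:template match="booklist">
    <saxon:group select="book" group-by="@category">
      <xsl:sort select="@category"/>
      <!-- Outside saxon:item: instantiated once per group. -->
      <h2><xsl:value-of select="@category"/></h2>
      <saxon:item>
        <!-- Inside saxon:item: instantiated once per book. -->
        <p><xsl:value-of select="title"/></p>
      </saxon:item>
    </saxon:group>
  </xsl:template>

</xsl:stylesheet>
```

The intended result is one heading per category followed by one paragraph per book in that category.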

The `<saxon:output>` Extension Element

This element redirects all result tree nodes produced within the <saxon:output> tags to a different output file. After its contents have been executed and written to that file, the output destination reverts to the previous output destination, initially the one stipulated when the XSLT stylesheet was invoked. This element is equivalent to the <xsl:document> element that is specified in XSLT 1.1 WD, which is implemented by many processors. Note that in previous versions of Saxon, <saxon:output> had additional functionality that has been removed. The <saxon:output> element is shown in the following element model definition:

<!-- Category: instruction-element -->
<saxon:output
  href = { uri-reference }
  method = { "xml" | "html" | "text" | qname-but-not-ncname }
  version = { nmtoken }
  encoding = { string }
  omit-xml-declaration = { "yes" | "no" }
  standalone = { "yes" | "no" }
  doctype-public = { string }
  doctype-system = { string }
  cdata-section-elements = { qnames }
  indent = { "yes" | "no" }
  media-type = { string } >
  <!-- Content: (template) -->
</saxon:output>
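As a minimal sketch of the model above, the following template writes each chapter to its own HTML file. The chapter and title elements and the id attribute are hypothetical; the curly braces in the element model indicate that href accepts an attribute value template, which is what the sketch relies on.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    extension-element-prefixes="saxon">

  <xsl:template match="chapter">
    <!-- The file name is computed from the chapter's id attribute. -->
    <saxon:output href="{@id}.html" method="html">
      <html>
        <head><title><xsl:value-of select="title"/></title></head>
        <body><xsl:apply-templates/></body>
      </html>
    </saxon:output>
    <!-- From here on, output goes to the previous destination again. -->
  </xsl:template>

</xsl:stylesheet>
```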

The `<saxon:return>` Extension Element

The <saxon:return> element is used to exit from a function, and provides a return value. It is only used within a <saxon:function> element, and it must not have any following sibling instructions other than <xsl:fallback>. However, there can be more than one <saxon:return> instruction in a function, for example, one in each branch of an <xsl:choose>.

The <saxon:return> element has one optional select attribute, whose value is an expression. The expression is evaluated and its value is sent as the return value of the function. If the select attribute is not used, the template in the <saxon:return> element is instantiated and the result is returned as a result tree fragment. The following element model definition shows the structure of the <saxon:return> element.

<!-- Category: extension-element -->
<saxon:return
  select = expression >
  <!-- Content: (template) -->
</saxon:return>

The `<saxon:while>` Extension Element

This element adds an iteration feature that processes as long as some given condition is true. The condition is a Boolean expression in the mandatory test attribute. To prevent endless looping, the <saxon:assign> element is required as a child to <saxon:while> and sets a variable that is updated at some point in the loop in order to terminate it.

<!-- Category: instruction-extension-element -->
<saxon:while
  test = expression >
  <!-- Content: (template?, saxon:assign, template?) -->
</saxon:while>

An example of using <saxon:while>, from the Saxon documentation, is as follows:

<xsl:variable name="i" select="0" saxon:assignable="yes"/>
<saxon:while test="$i &lt; 10">
    The value of i is <xsl:value-of select="$i"/>
    <saxon:assign name="i" select="$i+1"/>
</saxon:while>

13.2.5.3 Saxon Extension Functions

Saxon implements twenty-seven extension functions, ranging in application from basic existence to conditional functions. These functions include: saxon:after(), saxon:before(), saxon:difference(), saxon:distinct(), saxon:evaluate(), saxon:eval(), saxon:exists(), saxon:expression(), saxon:forAll(), saxon:getUserData(), saxon:hasSameNodes(), saxon:highest(), saxon:if(), saxon:ifNull(), saxon:intersection(), saxon:leading(), saxon:lineNumber(), saxon:lowest(), saxon:max(), saxon:min(), saxon:nodeSet(), saxon:path(), saxon:range(), saxon:setUserData(), saxon:sum(), saxon:systemId(), and saxon:tokenize().

To invoke a Saxon function, the Saxon namespace must be declared at or above the element calling the function. A typical use of a Saxon extension function is shown below:

<xsl:template match="something">
      <xsl:apply-templates
             select="saxon:distinct($some_nodeset)" />
</xsl:template>

More details about these functions and updates for newly added functions are available at http://users.iclway.co.uk/mhkay/saxon.

The documentation notes that, for the most part, these extension functions have very simple source code, which users can take as templates, or models, for writing their own extensions.

The `after()` Extension Function

Function: node-set after (node-set-1, node-set-2)

The after() function returns a node-set with all the nodes in node-set-2 that follow (in document order) at least one node of node-set-1. Its function return type is node-set, and it contains two node-set arguments.

The `before()` Extension Function

Function: node-set before (node-set-1, node-set-2)

The before() function returns a node-set with all the nodes in node-set-2 that precede (in document order) at least one node of node-set-1. Its function return type is node-set, and it contains two required node-set arguments.

The `difference()` Extension Function

Function: node-set difference (node-set-1, node-set-2)

The difference() function compares the two arguments and returns a node-set of those nodes in node-set-1 that are not in node-set-2. Its function return type is node-set, and it contains two required node-set arguments.

The `distinct()` Extension Function

Function: node-set distinct (node-set-1, stored-expression)

This function returns a node-set containing the nodes of the first argument with duplicates removed. Two nodes count as duplicates when the stored-expression in the second argument yields the same string value for both. Its function return type is node-set, and it takes two arguments, the first a required node-set and the second an optional string.

If the second argument is not used, the value used for the comparison is the string value of each node itself. Every following node is compared in turn, and any duplicates are removed.

An example from the Saxon documentation is as follows:

<xsl:for-each select="saxon:distinct(surname, saxon:expression('substring(.,1,1)'))">

This function will process the first surname starting with each letter of the alphabet in turn.

The `eval()` Extension Function

Function: string eval (stored-expression)

The eval() function evaluates the expression stored as its argument and returns the string value of that expression. See the saxon:expression() function for information about generating stored-expressions. The function return type is string, and it contains one string argument, which is an expression. The following example comes from the Saxon documentation:

saxon:eval(saxon:expression(concat(2, $op, 2)))

The `evaluate()` Extension Function

Function: string evaluate (string)

This function evaluates the expression that is passed in as a string argument and returns its value as a string. This allows the calculation of a variable, for instance, at runtime, based on the evaluation of this expression. One use might be to dynamically determine a sort key for <xsl:sort> based on different contingencies for various input XML document instances. The function saxon:evaluate(string) is shorthand for saxon:eval(saxon:expression(string)).
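The sort-key use mentioned above might be sketched as follows. The parameter name sortkey, its default, and the staff, employee, and surname elements are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://icl.com/saxon"
    exclude-result-prefixes="saxon">

  <!-- Hypothetical parameter holding an XPath expression as a string. -->
  <xsl:param name="sortkey" select="'surname'"/>

  <xsl:template match="staff">
    <staff>
      <xsl:for-each select="employee">
        <!-- The string in $sortkey is evaluated as an expression
             for each employee to produce the sort key. -->
        <xsl:sort select="saxon:evaluate($sortkey)"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </staff>
  </xsl:template>

</xsl:stylesheet>
```

Supplying a different expression at run time, for instance sortkey=@id on the command line, would then change the ordering without editing the stylesheet.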

The `exists()` Extension Function

Function: boolean exists (node-set-1, stored-expression)

The exists() function is used to test whether the value of the stored-expression in the second argument is true for any node in the node-set supplied in the first argument. The function return type is boolean, and it has two required arguments, a node-set and a string (expression).

The `expression()` Extension Function

Function: string expression (string)

This function is used to create a stored expression that can be used in other Saxon extension functions. It contains one required argument, a string which must be an expression. Its function return type is string.

The `forAll()` Extension Function

Function: boolean forAll (node-set-1, stored-expression)

This function tests each node in the node-set provided in the first argument against the expression in the second argument. If each node in the node-set evaluates to true, the function returns true. Otherwise it returns false. It has two required arguments, a node-set and a string (expression). Its function return type is Boolean.

An example of using this function, from the Saxon documentation, is as follows:

saxon:forAll(sale, saxon:expression('@price * @qty &gt; 1000'))

This will return true if for every child <sale> element of the context node, the product of price and qty exceeds 1000.

The `getUserData()` Extension Function

Function: string getUserData (string)

This function returns a string value of the predefined user data associated with the context node. The user data is predefined using the saxon:setUserData() function. It has one required argument, a string, and its function return type is a string.

The `hasSameNodes()` Extension Function

Function: boolean hasSameNodes (node-set-1, node-set-2)

The hasSameNodes() function returns a Boolean true if node-set-1 and node-set-2 have exactly the same nodes (not merely an intersection). This is different from the XSLT = operator, which only compares the string values of nodes. The function has two required arguments, both node-sets, and its function return type is Boolean.

The `highest()` Extension Function

Function: node-set highest (node-set1, stored-expression)

This function returns a node-set of the one node that has the highest numerical value, evaluated as if using the number() function. If the second argument is used, the expression is evaluated and the node that is returned is the one that has the highest value according to that expression. NaN values are ignored. The function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is node-set.

An example of using this function, from the Saxon documentation, is as follows:

saxon:highest(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the node for which this has the highest value.
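
The selection rule, including the skipping of NaN values, can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Saxon's implementation: the function name, the sample data, and the choice to return an empty list when no node yields a number are all hypothetical.

```python
import math

# Hypothetical stand-ins for <sale> elements; not Saxon code.
sales = [
    {"id": "a", "price": 10.0, "qty": 3},           # 30
    {"id": "b", "price": 4.0, "qty": 20},           # 80
    {"id": "c", "price": float("nan"), "qty": 5},   # NaN, ignored
]

def highest(nodes, expression):
    """Return a one-item list holding the node with the largest value.

    Nodes whose value is NaN are skipped. Returning an empty list when
    nothing qualifies is an assumption of this sketch.
    """
    candidates = [n for n in nodes if not math.isnan(expression(n))]
    if not candidates:
        return []
    return [max(candidates, key=expression)]

best = highest(sales, lambda s: s["price"] * s["qty"])
print(best[0]["id"])
```

Replacing max with min in the sketch gives the corresponding model for lowest().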

The `if()` Extension Function

Function: object if (condition, value1, value2)

This function allows conditionals as part of an XPath expression. The first argument must be a Boolean expression, such as a call to contains(), or a similar test. If it is true, the function returns the value of the second argument; if it is false, it returns the value of the third argument. The function has three arguments: the first is a Boolean, and the second and third are of type object (they can be of any type: node-set, string, number, or Boolean). Its function return type is object, the same type as the value of the argument being returned.

The `ifNull()` Extension Function

Function: boolean ifNull (java-object)

This function returns true if the java-object provided as the required argument is null. Its function return type is Boolean, and its one required argument is of type string (java-object).

The `intersection()` Extension Function

Function: node-set intersection (node-set-1, node-set-2)

This function returns a node-set containing only those nodes common to both node-set-1 and node-set-2, and discards all others. The function has two required arguments, both node-sets, and its function return type is node-set.

An added convenience is that the arguments can be a union of tests with the | operator to test one of several node-sets. This is very handy to use, for instance, with keys, as it can determine what both arguments have in common.

The `leading()` Extension Function

Function: node-set leading (node-set-1, stored-expression)

This function evaluates the expression in the second argument and returns each node in the node-set of the first argument that evaluates to true, up to, but not including, the first node that returns a false value. The function has two required arguments, a node-set and a string (expression), and its function return type is node-set.

An example of using this function, from the Saxon documentation, is as follows:

saxon:leading(following-sibling::*, saxon:expression('self::para'))

This will return the <para> elements following the current node, stopping at the first element that is not a <para>.
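
The "stop at the first failure" behavior is the same idea as a take-while operation. The following Python sketch models it with element names standing in for nodes; the function name leading and the sample list are hypothetical.

```python
from itertools import takewhile

def leading(nodes, predicate):
    """Return the nodes from the start of the list up to, but not
    including, the first node for which the predicate is false."""
    return list(takewhile(predicate, nodes))

# Element names of the following siblings, in document order.
following_siblings = ["para", "para", "note", "para"]

result = leading(following_siblings, lambda name: name == "para")
print(result)
```

Note that the final "para" is not returned, even though it passes the test, because it comes after the first node that fails.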

The `lineNumber()` Extension Function

Function: number lineNumber()

This function returns the line number, within the input XML document, of the node being processed at the point where the function is called. It can be used with <xsl:message>, for instance, to diagnose where a match is or is not happening for a given template rule. The function has no arguments, and its function return type is a number.

Make sure line numbering is turned on by adding the -l option on the command line.

The `lowest()` Extension Function

Function: node-set lowest (node-set-1, stored-expression)

This function returns a node-set of the one node that has the lowest numerical value, evaluated as if using the number() function. If the second argument is used, the expression is evaluated and the node that is returned is the one that has the lowest value according to that expression. NaN values are ignored. The function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is node-set.

An example of using this function, from the Saxon documentation, is as follows:

saxon:lowest(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the node for which this has the lowest value.

The `max()` Extension Function

Function: number max (node-set-1, stored-expression)

This function returns a number that is the highest value obtained by evaluating the expression in the second argument for each node in the node-set of the first argument. If there is no second argument, the number() function is applied implicitly to the string value of each node, and the highest of those values is returned. This function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is number.

An example of using this function, from the Saxon documentation, is as follows:

saxon:max(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the maximum amount.

The `min()` Extension Function

Function: number min (node-set-1, stored-expression)

This function returns a number that is the lowest value obtained by evaluating the expression in the second argument for each node in the node-set of the first argument. If there is no second argument, the number() function is applied implicitly to the string value of each node, and the lowest of those values is returned. This function has one required argument, a node-set, and one optional argument, a string (expression). Its function return type is number.

An example of using this function, from the Saxon documentation, is as follows:

saxon:min(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the minimum amount.

The `nodeSet()` Extension Function (obsolete)

Function: node-set nodeSet ($fragment)

This function is now obsolete: a result-tree-fragment is now converted implicitly to a node-set if it is used in a context where a node-set is required.

The `path()` Extension Function

Function: string path()

The path() function returns the string value of the path (XPath pattern expression) of the context node. It has no arguments, and its function return type is string.

The `range()` Extension Function

Function: node-set range (number-1, number-2)

The range() function converts its two arguments to numbers, as the XPath number() function would, and then rounds each to the nearest integer. It then builds a new node-set containing one node for each integer in the range, from the first number through the last, inclusive. Each integer is converted to a string and stored as the value of its node. Its two required arguments are both numbers, and its function return type is node-set.

For example, range(2, 5) creates a node-set with four nodes with string values 2, 3, 4, and 5.

The main intended usage, as stated in the Saxon documentation, is <xsl:for-each select="range($from, $to)"> which simulates a conventional for-loop in other programming languages.
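
The rounding and the inclusive end point can be modeled in Python as follows. This is a sketch, not Saxon's code: the name saxon_range is hypothetical, and rounding halves upward (as the XPath round() function does) is an assumption, since the description above says only "nearest integer".

```python
import math

def saxon_range(first, last):
    """Return the string values of the nodes range() would create.

    Both bounds are rounded to the nearest integer (halves round up,
    an assumption borrowed from XPath round()), and the end point is
    included.
    """
    start = math.floor(first + 0.5)
    end = math.floor(last + 0.5)
    return [str(i) for i in range(start, end + 1)]

# Mirrors <xsl:for-each select="range(2, 5)">
for value in saxon_range(2, 5):
    print(value)
```

Unlike Python's own range(), the upper bound is part of the result, which is why the sketch adds 1 to it.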

The `setUserData()` Extension Function

Function: string setUserData (string, value)

This function associates property information with the context node that can then be accessed with the getUserData() extension function (within the same stylesheet). It has two arguments, both strings, although the second string contains an expression. The string value of the first argument is used as the name for the property. The value of the property is assigned using the second argument, which is an expression. The setUserData() function itself returns an empty string, because the values are retrieved using the getUserData() function.

The `sum()` Extension Function

Function: number sum (node-set-1, stored-expression)

This function evaluates the expression in the second argument and applies it to each node in the node-set of the first argument. Each value is then added up to provide a total sum of the numbers of the nodes. If the result of any node is NaN, the total will be NaN.

An example of using this function, from the Saxon documentation, is as follows:

saxon:sum(sale, saxon:expression('@price * @qty'))

This will evaluate price times quantity for each child <sale> element, and return the total amount.
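
The way a single NaN spoils the total can be seen in a short Python model. The names to_number, stored_sum, and amount are hypothetical, and to_number is only a rough stand-in for the XPath number() function (Python's float() accepts some strings that number() would reject).

```python
import math

def to_number(text):
    """Rough stand-in for XPath number(): unparsable text becomes NaN."""
    try:
        return float(text)
    except ValueError:
        return float("nan")

def stored_sum(nodes, expression):
    """Add up expression(node) for every node; any NaN propagates."""
    return float(sum(expression(node) for node in nodes))

def amount(sale):
    # Plays the role of the stored expression '@price * @qty'.
    return to_number(sale["price"]) * to_number(sale["qty"])

good = [{"price": "10", "qty": "3"}, {"price": "4", "qty": "20"}]
bad = good + [{"price": "n/a", "qty": "1"}]

print(stored_sum(good, amount))
print(math.isnan(stored_sum(bad, amount)))
```

In practice this means it is worth filtering out nodes with missing or non-numeric attributes before summing them.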

The `systemId()` Extension Function

Function: string systemId()

This function returns the system identifier or URI of the XML entity that contains the context node. Its function return type is a string, and it has no arguments.

The `tokenize()` Extension Function

Function: node-set tokenize (string-1, string-2?)

This function builds a new node-set containing a node for each token in the first argument. The first argument is converted to a string, as with the XSLT string() function (see Chapter 5). This string is then treated as a whitespace-separated list of tokens. The second argument can set a delimiter other than whitespace, such as a comma. The function can be used, for example, to break out the contents of a sentence word by word. This function has one required argument and one optional argument, both of type string, and its function return type is node-set.
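
A rough Python model of the splitting behavior follows. It is a sketch under assumptions that the text above does not confirm: the second argument is treated as a set of single-character delimiters, and empty tokens are dropped. The function name is hypothetical.

```python
import re

def tokenize(text, delimiters=None):
    """Return the string values of the token nodes.

    With no delimiters, split on runs of whitespace. Otherwise each
    character of 'delimiters' is treated as a separator and empty
    tokens are discarded (both are assumptions of this sketch).
    """
    if delimiters is None:
        return text.split()
    pattern = "[" + re.escape(delimiters) + "]+"
    return [token for token in re.split(pattern, text) if token]

print(tokenize("the quick  brown\tfox"))
print(tokenize("red,green,,blue", ","))
```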

13.3 XT

James Clark has been a driving force in specification design and authorship for markup technology. He was instrumental in the codification of the W3C specification for XSLT as its editor and was also principal author of the W3C specification for XPath. His processor, XT, is universally acknowledged as far and away the leanest, meanest, and fastest. It was no small shock to the markup community when he announced that he is no longer building revisions and upgrades. There is a list at the end of this section of the limitations in XT, as documented on James Clark's Web site. There is good news, however; a group called 4XT is taking on the task of developing XT further, and their Web site, http://www.4xt.org, documents their efforts.

Clark's XML parser, XP, is also highly regarded and is the default in countless implementations for this reason. In addition, XP allows comments to be passed to the application.

XT is designed as a filter for SAX, the Simple API for XML. SAX is a parallel technology to the DOM and has some distinct advantages over it, apart from the wide industry preference for SAX. XT takes the stream of SAX events from the XML processor (e.g., XP) as input, and outputs the result tree as an additional stream of SAX events.
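
The idea of a component that receives SAX events and emits SAX events can be illustrated with Python's standard xml.sax module, which provides a filter base class. This sketch is not XT and performs no XSLT; the UppercaseNames class and run_filter function are hypothetical names, and the "transformation" is deliberately trivial.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

class UppercaseNames(XMLFilterBase):
    """A SAX filter: consumes parser events, passes altered events on."""

    def startElement(self, name, attrs):
        super().startElement(name.upper(), attrs)

    def endElement(self, name):
        super().endElement(name.upper())

def run_filter(xml_text):
    """Parse xml_text, push its events through the filter, serialize."""
    out = io.StringIO()
    parser = xml.sax.make_parser()             # upstream event source
    sax_filter = UppercaseNames(parser)        # the filter in the middle
    sax_filter.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    sax_filter.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()

result = run_filter("<doc><item>a</item></doc>")
print(result)
```

Because the filter never builds a tree of the whole document, components built this way can be chained, with each one consuming the event stream of the one before it.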

It is well worthwhile to begin with XT simply due to its speed, and then move to another processor if and when additional functionality is required. Note that while XT conforms meticulously to the W3C specification for XSLT and XPath, as you might expect from a processor born of the editor for both specs, it does not implement all of the spec, most notably and regrettably omitting support for key(). In addition, the <xsl:fallback> and <xsl:namespace-alias> elements, the extension-element-prefixes attribute, and the element-available() function are not implemented. The optional third argument to format-number() and the <xsl:decimal-format> element are not supported. XT does not allow access to the namespace axis, and you cannot add the xsl:exclude-result-prefixes attribute to literal result elements (it is allowed on the document element, however).

XT has a simple executable for Windows that runs in the same way as Instant Saxon. If you use it with a JVM other than the Internet Explorer engine, you will need to modify your CLASSPATH. When you download XT from Clark's site, the XP processor comes with it, so we will describe the installation of both together.

13.3.1 Installing XT for Windows

Download the xt.exe file from James Clark's site at http://www.jclark.com. You do not need to add any extra parsers or to modify PATH or CLASSPATH, provided you have IE 4+ (IE 5 recommended) on your Windows 95, 98, or NT/2000 machine. Unzip the file in a directory where you plan to use XT and you are ready to go.

To run Instant XT on Windows, select Start, Applications, MSDOS, and in the window, run it with the following syntax:

xt source.xml stylesheet.xsl output.xml [name=value]

The optional name=value pair sets a parameter at runtime: name is the name of a parameter declared with <xsl:param> in the stylesheet, and value is whatever value you assign to it at runtime. In place of the parameter, you can stipulate the name of your output file.

13.3.2 Installing XT and XP on UNIX

If you are running XT, you will need the JDK 1.2 (1.1.6+ will do, but is not recommended) installation, described in Chapter 12. You will also need XP, which is included with the download of the XT processor from James Clark's site. The core program for working with the objects is a JAR file, xt.jar, which you must include on your CLASSPATH. We will continue to work with the model introduced in Chapter 12, which assumes you will put this in a /usr/bin directory, likely called /usr/bin/XT.

If you have followed the instructions above, you will have JDK 1.2.2, which will work fine. The following examples assume that you will use the default xp.jar XML processor and that you have put it in your directory with XT.

At the very least, you must include xt.jar and xp.jar on the system CLASSPATH. Thus, where you had a basic "." you would modify it as follows:

If you know where your XSLT processor files are, you can do all this in one step by separating each entry with a colon (:). The following is for an installation of XT where the XT files are all in your /usr/bin directory (see details in Chapter 12):
1. For pre-JDK 1.2:
```
setenv CLASSPATH /usr/bin/xt/sax.jar:/usr/bin/xt/xt:/java/classes/classes.zip
```
2. For JDK 1.2+:
```
setenv CLASSPATH /usr/bin/xt/sax.jar:/usr/bin/xt/xt:.
```
You can also make this permanent if you're comfortable editing your .cshrc file; just add the following line to it (exactly as above, but with set rather than setenv, and remember that /java/classes/classes.zip is only for pre-JDK 1.2; otherwise only a "." period is needed):
1. For pre-JDK 1.2:
```
set CLASSPATH=/usr/bin/xt/sax.jar:/usr/bin/xt/xt:/java/classes/classes.zip 
```
2. For JDK 1.2+:
```
set CLASSPATH=/usr/bin/xt/sax.jar:/usr/bin/xt/xt:. 
```

13.3.3 Installing XT and XP on Macintosh

First, you need to download the Java for Mac that has JBindery. It comes with the MRJ SDK. You can find the MRJ 2.1 SDK (Macintosh Runtime for Java Software Development Kit) for free at http://developer.apple.com/java/text/download.html#sdk (it takes a few minutes from home, less on Ethernet). Now install it by double-clicking the icon and following all the automatic presets. When it's done, you will see the MRJ SDK 2.1 directory on your hard drive, with a JBindery folder (if not, trash MRJ, reinstall with custom, and select all components).

Now you need the files for XT, from James Clark's site. Be sure to choose the Java versions, not the Windows versions. You will need XT and XP. Get them from http://www.jclark.com/xml/xt.html, or by direct anonymous FTP from ftp://ftp.jclark.com/pub/xml/xp.zip and ftp://ftp.jclark.com/pub/xml/xt.zip. Unzip these and put all the XT files in one folder called "xt" inside the JBindery folder, which is nested several folders inside the MRJ folder on your hard drive (inside MRJ is a folder called Tools, and one called Application Builders; the JBindery folder is inside Application Builders). Do the same with XP, except put its files in their own folder called "xp" inside the JBindery folder.

If you want to make your own applets after you have more experience with XSLT, and change the input and output filenames, just run JBindery itself and change the filenames in the Command window, as shown in Figure 13-1.

Figure 13-1. Screenshot of the JBindery Command window.


Then Save as Application in the JBindery folder with whatever name you wish, which means you can then run these applets by double-clicking the application (that is, as long as you are not changing the input, stylesheet, and output filenames).

You may also encounter Not Found/Directory Path errors. These come from a trickier screen. If you look at the JBindery window, the left-hand icon list has a Properties icon, and this could be a source of trouble. You need to add the following two lines to the bottom left and bottom right windows, respectively (the text will spill over the viewable space of the window, so type carefully):

On the left window, add:
```
com.jclark.xsl.sax.parser
```
On the right window, add:
```
com.jclark.xml.sax.Driver 
```

If done correctly, it should look like the window in Figure 13-2. The top large window fills in of its own accord as you use JBindery.

Figure 13-2. The JBindery Properties window for XT.


You should also have the appropriate .jar files in your CLASSPATH window. Assuming you've unzipped everything into the JBindery directory, your window should look like Figure 13-3.

Figure 13-3. The JBindery CLASSPATH window.


If it does not, use the Add .zip File button to select the appropriate .jar files, as shown through the Finder interface it provides.

It's important to remember that you will primarily use the JBindery folder while you're learning. Over time, as you gain confidence, you will likely want to work with different directories, because the JBindery folder will likely get very crowded!

13.3.4 XT Extensions

XT implements one extension element: <xt:document>, and one extension attribute: xt:nxml. It also includes three extension functions: node-set(), intersection(), and difference().

The XT namespace must be declared if any extensions are to be used. The following example shows the proper way to declare the XT namespace:

xmlns:xt="http://www.jclark.com/xt"

13.3.4.1 The `xt:nxml` Extension Attribute

The xt:nxml value for the method attribute on <xsl:output> enables certain specific non-XML characters to be output which simply using <xsl:output> with method set to text would not allow.

Using xt:nxml as the output method stipulates a number of subelements to mark characters for a sort of "escaping" so the processor and parser do not signal an error. These subelements are shown in Table 13-5.

Table 13-5. Subelements escaped by `xt:nxml`
QName	Action
`<nxml></nxml>`	Contains the `<char>`, `<data>`, `<escape>`, and `<control>` elements.
`<char></char>`	Allows a non-XML character to be output, such as ASCII control characters outside the default accepted set.
`<data></data>`	Allows special characters to be, or to remain, escaped throughout processing.
`<escape></escape>`	Allows a special character and how it should be escaped to be defined by the user, for example, without using &.
`<control></control>`	Allows characters to be output directly with no escaping; it sort of forces a straight-through processing without additional treatment, not unlike CDATA sections.

James Clark's XT documentation shows the following example of using the xt:nxml method:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xt:nxml" xmlns:xt="http://www.jclark.com/xt"/>
<xsl:template match="/">
<nxml>
<escape char="\">\\</escape>
<data>&amp;&lt;&gt;\</data>
<control>&amp;&lt;&gt;\</control>
</nxml>
</xsl:template>
</xsl:stylesheet>

This will generate the following output:

&<>\\&<>\
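
One way to read that output is that the <escape> element defines a replacement for the backslash, <data> applies the replacement to its content, and <control> passes its content through untouched. The Python sketch below models only that reading of the example; the function names are hypothetical and this is not a description of XT's internals.

```python
# Hypothetical model of the example above; not XT code.

def nxml_data(text, escapes):
    """Emit text, replacing each character that has a defined escape."""
    return "".join(escapes.get(ch, ch) for ch in text)

def nxml_control(text):
    """Emit text directly, with no escaping applied."""
    return text

# <escape char="\">\\</escape>: a backslash is written as two backslashes.
escapes = {"\\": "\\\\"}

# After XML parsing, &amp;&lt;&gt;\ is the four characters & < > \
content = "&<>\\"

output = nxml_data(content, escapes) + nxml_control(content)
print(output)
```

The first half of the printed string ends in a doubled backslash (from <data>) and the second half in a single one (from <control>), matching the output shown above.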

13.3.4.2 The `<xt:document>` Extension Element

The <xt:document> element is used to produce multiple output files from a single XML document instance. It has a mandatory href attribute whose value must be the relative URL for an output file. Its value can be interpreted as an attribute value template. In addition, all the same attributes that are allowed on the <xsl:document> element can be used with <xt:document>. The content of the <xt:document> element is a template, as shown in the following element model definition.

<!-- Category: instruction-element -->
<xt:document
  href = { uri-reference }
  method = "xml" | "html" | "text" | qname-but-not-ncname
  version = nmtoken
  encoding = string
  omit-xml-declaration = "yes" | "no"
  standalone = "yes" | "no"
  doctype-public = string
  doctype-system = string
  cdata-section-elements = qnames
  indent = "yes" | "no"
  media-type = string >
  <!-- Content: template -->
</xt:document>

See Section 13.4 for an example of an XSLT stylesheet designed for multiple-document output from either Saxon, Xalan, or XT.

13.3.4.3 The `node-set()` Extension Function

Function: node-set node-set ($fragment)

This function takes a result tree fragment and converts it to a node-set. This function provides the ability to choose parts of the result tree while it is still in process and treat those parts with additional template rules. The function return type of this function is node-set, and its one required argument is a result tree fragment.

13.3.4.4 The `intersection()` Extension Function

Function: node-set intersection (node-set-1, node-set-2)

This function returns a node-set containing the nodes common to two node-sets. Nodes from the first argument that are also found in the node-set in the second are returned as a new node-set. This function has two required arguments, both of type node-set, and its function return type is also a node-set.

13.3.4.5 The `difference()` Extension Function

Function: node-set difference (node-set-1, node-set-2)

This function returns a node-set containing the difference between two node-sets. Nodes from the first argument that are not found in the node-set in the second are returned as a new node-set. This function has two required arguments, both of type node-set, and its function return type is also a node-set.
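
The intersection() and difference() functions behave like the familiar set operations, applied to nodes rather than to their string values. In the Python sketch below each node is represented by a hypothetical integer giving its position in document order; the function names mirror the XT functions but the code is only a model.

```python
# Hypothetical model: each integer stands for one node, numbered in
# document order. Not XT code.

def intersection(node_set_1, node_set_2):
    """Nodes present in both node-sets, in document order."""
    return sorted(set(node_set_1) & set(node_set_2))

def difference(node_set_1, node_set_2):
    """Nodes in the first node-set that are not in the second."""
    return sorted(set(node_set_1) - set(node_set_2))

all_streets = [1, 2, 3, 4, 5]
named_streets = [2, 4, 6]

print(intersection(all_streets, named_streets))
print(difference(all_streets, named_streets))
```

Note that difference() is not symmetric: swapping the arguments in the sketch returns the node numbered 6 instead.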

13.3.5 XT Limitations

James Clark lists the limitations and known bugs for the XT processor as follows. The following features of the XSLT PR are not yet implemented:

the element extension mechanism (the extension-element-prefixes and xsl:extension-element-prefixes attributes, the <xsl:fallback> element, and the element-available() function)
keys (the <xsl:key> element, and the key() function)
the <xsl:decimal-format> element and the optional third argument on the format-number() function
the namespace axis
forwards-compatible processing
the xsl:exclude-result-prefixes attribute on literal result elements (the exclude-result-prefixes attribute on <xsl:stylesheet> is implemented)
The xml output method ignores the encoding and cdata-section-elements attributes on <xsl:output>.

The following are some known bugs:

Many errors that the XSLT specification requires to be reported are silently ignored.
Comments and processing instructions occurring in the DTD are not excluded from the data model.
The node() node-test does not work in match patterns (it does work in expressions).
The document() function does not pay attention to the HTTP content-type header.
The <xsl:import> element does not conform to the requirement that when xsl:include is used to include a stylesheet, any <xsl:import> elements in the included document are moved up in the including document to after any existing <xsl:import> elements in the including document.
The HTML output method may get confused if you embed namespace-qualified XML elements with the HTML.

Improvement is needed in the following areas:

The implementation of the <xsl:number> element is slow.
Error reporting is often not as helpful as it might be.
No error recovery is attempted after an error is reported.
The document() function does not support fragment identifiers in URIs for any media types.

13.4 Generating Multiple Output Files Using Saxon, Xalan, or XT

Example 13-4, drawn from the earlier Markup City examples, shows how <xt:document>, <saxon:output>, and Xalan's Redirect extensions can be used in one XSLT stylesheet. We use <xsl:fallback> to make sure the stylesheet can run on systems using any of these processors with almost identical output. Note that even though XT does not support <xsl:fallback> and element-available(), it will find its own recognized extensions in this stylesheet and process them, thus still giving consistent output.

Example 13-4 XML for stylesheets using multiple processors.

<?xml version="1.0"?>
<parkway>
      <thoroughfare>Governor Drive</thoroughfare>
      <thoroughfare name="Whitesburg Drive">
            <sidestreet name="Bob Wallace Avenue">
                  <block>1st Street</block>
                  <block>2nd Street</block>
                  <block>3rd Street</block>
            </sidestreet>
            <sidestreet>Woodridge Street</sidestreet>
      </thoroughfare>
      <thoroughfare name="Bankhead">
            <sidestreet name="Tollgate Road">
                  <block>First Street</block>
                  <block>Second Street</block>
                  <block>Third Street</block>
            </sidestreet>
            <sidestreet>Oak Drive</sidestreet>
      </thoroughfare>
</parkway>

In our example, we might want to chop up the <thoroughfare> elements in Markup City into individual XML data instances for further detailed work. Think of it as separating the city into districts for specific councilpersons to represent. We might want to use Clark's <xt:document> instruction element (or the <saxon:output> element) to do this. The territory to be divvied up consists of the offshoots of <parkway>.

Our goal is to get a single HTML page for each <thoroughfare>. In Example 13-5, we'll use <xt:document> and several LREs to do this (note that we've declared the XT namespace URI properly in the document element).

Example 13-5 Using the `<xt:document>` extension element.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0"
      xmlns:xt="http://www.jclark.com/xt"
      extension-element-prefixes="xt">

<xsl:output omit-xml-declaration="yes"/>

<xsl:template match="text()"/>

<xsl:template match="thoroughfare">
      <xt:document href="{@name | text()}.html">
      <html>
      <head><title><xsl:value-of select="@name | text()"/></title></head>
      <body>
      <h2><xsl:value-of select="@name | text()"/></h2>
      <xsl:apply-templates/>
      </body>
      </html>
      </xt:document>
</xsl:template>

<xsl:template match="sidestreet">
      <dl>
            <dt>
                  <xsl:value-of select="@name | text()" />
            </dt>
            <dd>
                  <ul>
                        <xsl:apply-templates/>
                  </ul>
            </dd>
      </dl>
</xsl:template>

<xsl:template match="block">
      <li>
            <xsl:value-of select="@name | text()" />
      </li>
</xsl:template>

</xsl:stylesheet>

The attribute value template in the href attribute on <xt:document> provides access to the value of either the attribute or the text name of the <thoroughfare>.

The <xt:document> instruction element creates three new HTML files, one for each <thoroughfare>, containing the <sidestreet> and <block> children of each respective <thoroughfare>.
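
The splitting logic itself, one output document per <thoroughfare> with the file name taken from the name attribute or the text, can be modeled outside XSLT. The Python sketch below uses the standard xml.etree.ElementTree module on a trimmed copy of Example 13-4. The function names are hypothetical, the pages are kept in a dictionary rather than written to disk, and the HTML is simplified to a flat list of side streets.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the Markup City data (blocks omitted for brevity).
PARKWAY = """<parkway>
  <thoroughfare>Governor Drive</thoroughfare>
  <thoroughfare name="Whitesburg Drive">
    <sidestreet name="Bob Wallace Avenue"/>
    <sidestreet>Woodridge Street</sidestreet>
  </thoroughfare>
  <thoroughfare name="Bankhead">
    <sidestreet name="Tollgate Road"/>
    <sidestreet>Oak Drive</sidestreet>
  </thoroughfare>
</parkway>"""

def label(element):
    """Model of '@name | text()': the name attribute if present,
    otherwise the element's own text."""
    name = element.get("name")
    return name if name is not None else (element.text or "")

def split_by_thoroughfare(xml_text):
    """Return a dict mapping each output file name to its HTML text."""
    pages = {}
    for road in ET.fromstring(xml_text).findall("thoroughfare"):
        title = label(road)
        items = "".join(
            "<li>" + label(side) + "</li>" for side in road.findall("sidestreet")
        )
        pages[title + ".html"] = (
            "<html><head><title>" + title + "</title></head>"
            "<body><h2>" + title + "</h2><ul>" + items + "</ul></body></html>"
        )
    return pages

pages = split_by_thoroughfare(PARKWAY)
for filename in pages:
    print(filename)
```

Each key of the dictionary corresponds to one value of the href attribute value template, so the sample data yields three pages.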

It would be very difficult to build a contingency into this stylesheet if no available processor could handle multiple outputs. In fact, it basically couldn't be done. We can, however, make a contingency for which multiple-output processor is available, for instance XT or Saxon. In Example 13-6 we'll use a couple more XSLT instruction elements, <xsl:when> and <xsl:otherwise>, children of <xsl:choose>.

We basically repeated the entire XT-dependent template, which used <xt:document>, within the context of an <xsl:choose> element. Each <xsl:when> element tests for the availability of an extension element, using the element-available() function. If the element is available, then the instructions under the <xsl:when> element are instantiated. If element-available() returns false for all the <xsl:when> elements, then the <xsl:otherwise> is instantiated. The <xsl:document> element shown here is defined in the XSLT 1.1 WD. Note that, since XT does not support the element-available() function, the stylesheet will fail using XT.

Example 13-6 Using contingencies for extension elements.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0"
      xmlns:xt="http://www.jclark.com/xt"
      xmlns:saxon="http://icl.com/saxon"
      xmlns:lxslt="http://xml.apache.org/xslt"
      xmlns:redirect="org.apache.xalan.xslt.extensions.Redirect"
      extension-element-prefixes="xt saxon lxslt redirect">

<xsl:output omit-xml-declaration="yes"/>

<xsl:template match="text()"/>

<xsl:template match="thoroughfare">
<xsl:choose>
      <xsl:when test="element-available('xt:document')">
            <xt:document href="{@name | text()}.html">
                  <html>
                        <head><title><xsl:value-of select="@name | text()"/></title></head>
                        <body>
                              <h2><xsl:value-of select="@name | text()"/></h2>
                              <xsl:apply-templates/>
                        </body>
                  </html>
            </xt:document>
      </xsl:when>
      <xsl:when test="element-available('saxon:output')">
            <saxon:output file="{@name | text()}.html">
                  <!-- use Saxon pre 6.2.2 version and the "file"
                        attribute for saxon:output support -->
                  <html>
                        <head><title><xsl:value-of select="@name | text()"/></title></head>
                        <body>
                              <h2><xsl:value-of select="@name | text()"/></h2>
                              <xsl:apply-templates/>
                        </body>
                  </html>
            </saxon:output>
      </xsl:when>
      <xsl:when test="element-available('xsl:document')">
            <xsl:document href="{@name | text()}.html">
                  <html>
                        <head><title><xsl:value-of select="@name | text()"/></title></head>
                        <body>
                              <h2><xsl:value-of select="@name | text()"/></h2>
                              <xsl:apply-templates/>
                        </body>
                  </html>
            </xsl:document>
      </xsl:when>
      <xsl:when test="element-available('redirect:write')">
            <redirect:write select="{@name | text()}.html">
                  <html>
                        <head><title><xsl:value-of select="@name | text()"/></title></head>
                        <body>
                              <h2><xsl:value-of select="@name | text()"/></h2>
                              <xsl:apply-templates/>
                        </body>
                  </html>
            </redirect:write>
      </xsl:when>
      <xsl:otherwise>
            <h2><xsl:value-of select="@name | text()"/></h2>
            <xsl:apply-templates/>
      </xsl:otherwise>
</xsl:choose>
</xsl:template>

<xsl:template match="sidestreet">
      <dl>
            <dt>
                  <xsl:value-of select="@name | text()" />
            </dt>
            <dd>
                  <ul>
                        <xsl:apply-templates/>
                  </ul>
            </dd>
      </dl>
</xsl:template>

<xsl:template match="block">
      <li>
            <xsl:value-of select="@name | text()" />
      </li>
</xsl:template>

</xsl:stylesheet>
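
The control flow of the <xsl:choose> in this stylesheet amounts to "use the first extension element the processor reports as available, otherwise fall back." The Python sketch below models only that decision. The function name is hypothetical, and the sets passed in are made-up stand-ins for what a processor's element-available() would report, not documented facts about any processor.

```python
# Hypothetical model of the <xsl:choose> in the stylesheet above.

# Same order as the <xsl:when> elements in the stylesheet.
CANDIDATES = ("xt:document", "saxon:output", "xsl:document", "redirect:write")

def choose_output_element(available):
    """Return the first candidate found in 'available', or None when
    the <xsl:otherwise> branch would be instantiated."""
    for candidate in CANDIDATES:
        if candidate in available:
            return candidate
    return None

# Illustrative inputs only.
print(choose_output_element({"saxon:output", "xsl:document"}))
print(choose_output_element(set()))
```

Because <xsl:choose> stops at the first test that succeeds, the order of the <xsl:when> elements decides which element is used when a processor supports more than one of them.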

^[1] See http://www.w3.org/TR/REC-xml.

^[2] See http://oss.software.ibm.com/developerworks/opensource/icu/project/index.html.

^[3] This is from the Xalan-Java Class Redirect.htm file in the Xalan-J documentation.

^[4] See http://www.w3.org/TR/xslt11.

^[5] See www.microstar.com.

^[6] The disable-output-escaping attribute has been implemented in the XSLT specification and is no longer a Saxon extension.

^[7] The method attribute is from the XSLT1.0 specification, but Saxon adds support for QName values.

