Writing Portable Stylesheets | NetBeans™ IDE Field Guide: Developing Desktop, Web, Enterprise, and Mobile Applications (2nd Edition)

In this section we will examine a range of facilities that are included in XSLT to help you write portable stylesheets: that is, stylesheets that can run across different XSLT processors, possibly supporting different versions of the XSLT language.

We will look at the question of version compatibility: that is, how to write stylesheets that work with both XSLT 1.0 and XSLT 2.0. Then we will look at how to use vendor extensions, without sacrificing portability.

But before we do either of these things, I will describe a new feature that has been added to XSLT 2.0 to aid portability, namely the use-when attribute, which allows you to include or exclude stylesheet code conditionally at compile time.

Conditional Compilation

The use-when attribute serves a similar purpose to #ifdef in the C language: it allows you to define conditions under which a section of the stylesheet is included or excluded at compile time.

At the time of writing, this feature has been agreed by the XSL Working Group, but it is not present in any published language draft. Because it is a recent addition, the details could change, so check the latest specifications.

The use-when attribute can be used on any XSLT element. This includes declarations and instructions, and other elements such as <xsl:sort> and <xsl:with-param>. Written as «xsl:use-when», it is also allowed on literal result elements. The value of the attribute is a condition to be evaluated at compile time. If the condition is false, then the element and the subtree rooted at that element are effectively eliminated from the stylesheet, before any further processing takes place: it is as if the element were not there. One consequence is that no XSLT errors will be reported in respect of this element or its descendants.

Here is an example, which defines two alternative entry points, one for an XSLT 1.0 processor and one for an XSLT 2.0 processor. This assumes that the <xsl:stylesheet> element specifies «version="2.0"». This means that an XSLT 1.0 processor will be running in forwards-compatible mode (explained in the next section) and will therefore ignore attributes such as use-when that it does not understand. An XSLT 1.0 processor will use the first template rule as the entry point, because it has higher priority. An XSLT 2.0 processor, however, will behave as if the first template rule is not present, and will use the second one, which differs in that it invokes schema validation of the result document.

  <xsl:template match="/" priority="2"
      use-when="system-property('xsl:version')='1.0'">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="/" priority="1">
    <xsl:result-document validation="strict">
      <xsl:apply-templates/>
    </xsl:result-document>
  </xsl:template>

The expression contained in the use-when attribute is in principle any XPath expression; however, it is constrained to have a very restricted evaluation context. This means there is no context item, there are no variables available, and there is no access to external documents. In practice, the only thing the expression can usefully do is examine the results of functions such as system-property(), element-available(), and function-available(), to see what environment the stylesheet is running in. These three functions are fully described in Chapter 7.

One important reason for the introduction of the use-when attribute was to allow stylesheets that work both on schema-aware and non-schema-aware XSLT processors to be written. For example, you can use the system-property() function in a use-when attribute on the <xsl:import-schema> declaration so that a schema is imported only when using a schema-aware processor. For details of how schemas are imported into a stylesheet, see Chapter 4.
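As a minimal sketch of this technique, the declaration below imports a schema only when the processor reports that it is schema-aware. It assumes the processor exposes the xsl:is-schema-aware system property (the name used in the final XSLT 2.0 Recommendation; earlier drafts may differ), and the namespace URI and schema location are hypothetical placeholders.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: xsl:is-schema-aware returns 'yes' on a schema-aware processor.
       The namespace and schema-location values are hypothetical. -->
  <xsl:import-schema
      namespace="http://example.com/ns/invoice"
      schema-location="invoice.xsd"
      use-when="system-property('xsl:is-schema-aware') = 'yes'"/>

  <!-- This rule does not depend on the schema, so it is valid either way. -->
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

On a processor that is not schema-aware, the condition is false and the declaration is removed before compilation, so the stylesheet should not be rejected merely for containing it.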

Version Compatibility

Version compatibility is about how to achieve resilience to differences between versions of the XSLT standard.

There are currently two versions of the XSLT Recommendation, version 1.0, and a draft of version 2.0 (the many intermediate working drafts don't count). So compatibility between versions has now become an issue. However, the language designers had the foresight to anticipate that it would become an issue, and made provision even in version 1.0 to allow stylesheets to be written in a portable way.

The stylesheet is required to carry a version number (typically «version="1.0"» or «version="2.0"») as an attribute of the <xsl:stylesheet> element. Specifying «version="1.0"» declares that the stylesheet is designed primarily for use with XSLT 1.0 processors, while specifying «version="2.0"» indicates that it is designed for XSLT 2.0.

The term backwards compatibility refers to the ability of version N of a language to accept programs or data that worked under version N - 1, while forwards compatibility refers to the ability of programs that worked under version N to move forward to version N + 1. The two concepts are therefore opposite sides of the same coin. However, the XSLT language specification distinguishes carefully between them. As far as an XSLT 2.0 processor is concerned, a stylesheet that specifies «version="1.0"» is operating in backwards-compatible mode, while a stylesheet that specifies «version="3.0"» is operating in forwards-compatible mode.

If you specify «version="1.0"» in the stylesheet, then you are signaling the fact that the stylesheet was designed to be run under XSLT 1.0, and that in some specific cases where XSLT 2.0 defines different behavior from 1.0, the 1.0 behavior should be used. For example, if you supply a sequence of nodes as the value of the select attribute of <xsl:value-of>, the XSLT 1.0 behavior is to output the first value in the sequence and ignore the others; the XSLT 2.0 behavior is to output all the values, space separated.

Specifying «version="1.0"» does not mean that it is an error to use facilities that were newly introduced in XSLT 2.0. It only means that an XSLT 2.0 processor should use the 1.0 behavior in certain specific cases where there are incompatibilities.

If you specify «version="2.0"» in a stylesheet, and then run it under an XSLT 1.0 processor, you are indicating that the stylesheet makes use of facilities that were newly introduced in XSLT 2.0, and that the 1.0 processor should not treat these constructs as an error unless they are actually evaluated. The stylesheet can use various mechanisms to avoid evaluating the constructs that depend on XSLT 2.0 when these features are not available. This only works, of course, because the need for it was anticipated in the XSLT 1.0 specification, and even though no details were known of what new features would be introduced in a later version of the language, XSLT 1.0 processors were required to behave in a particular way (called forwards-compatibility mode) when the version attribute was set to a value other than «1.0». XSLT 2.0 similarly carries forward these provisions so that when the time comes, stylesheets that take advantage of new features in XSLT 3.0 or beyond will still be able to run under an XSLT 2.0 processor.

If you use facilities defined in XSLT version 1.0 only, but want your stylesheet to run under both XSLT 1.0 and 2.0 processors, then you should specify «version="1.0"», and every conformant XSLT processor will then handle the stylesheet correctly, unless you rely on one of the few areas that are incompatible even in backwards-compatible mode. There is a list of these in Appendix F, and for the most part they are things that few reasonable users would do.

If you use facilities that are new in XSLT version 2.0, and you don't need the stylesheet to run under an XSLT 1.0 processor, then it's best to specify «version="2.0"». If there are parts of the stylesheet that you haven't converted from XSLT 1.0, where you want backward-compatible behavior to be invoked, then you can leave those parts in a separate stylesheet module that specifies «version="1.0"». It's quite OK to mix versions like this. In fact XSLT 2.0 allows you to specify the version attribute at any level of granularity, for example on an <xsl:template> element, or even on an element that encloses one small part of a template. If you use it on a literal result element, the attribute should be named xsl:version to distinguish it from user-defined attributes. Bear in mind, however, that XSLT 1.0 processors allow the version attribute to appear only on the <xsl:stylesheet> element, or, as xsl:version, on a literal result element: it's not permitted, for example, on <xsl:template>.
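A minimal sketch of mixing versions within one module, intended for an XSLT 2.0 processor that supports backwards-compatible processing. The element names order and item and the mode name all-items are hypothetical, chosen only for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Unconverted rule: version="1.0" on the template requests 1.0 behavior,
       so xsl:value-of outputs only the first selected item. -->
  <xsl:template match="order" version="1.0">
    <first-item>
      <xsl:value-of select="item"/>
    </first-item>
  </xsl:template>

  <!-- Converted rule: inherits version="2.0" from xsl:stylesheet,
       so all selected items are output, separated by spaces. -->
  <xsl:template match="order" mode="all-items">
    <every-item>
      <xsl:value-of select="item"/>
    </every-item>
  </xsl:template>

</xsl:stylesheet>
```

The two rules contain the same xsl:value-of instruction; only the version attribute in scope decides which behavior applies.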

If you use facilities that are new in XSLT version 2.0, but you also want the stylesheet to run under an XSLT 1.0 processor, then you may need to write it in such a way that it defines fallback behavior to be invoked when running under 1.0. There are various techniques you can use to achieve this. You can use the element-available() function to test whether a particular XSLT instruction is implemented; you can use <xsl:fallback> to define what the processor should do if a construct is not available; or you can use the system-property() function (described in Chapter 7) to test which version of XSLT is supported, and execute different code, depending on the result. Whichever technique you use, you need to ensure that those parts of the stylesheet that use XSLT 2.0 facilities are within the scope of an element that specifies «version="2.0"», otherwise an XSLT 1.0 processor will reject them at compile time.
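The <xsl:fallback> technique might look like the following sketch. The output file name report.xml is a hypothetical placeholder. An XSLT 2.0 processor executes <xsl:result-document> and ignores the <xsl:fallback> child; an XSLT 1.0 processor in forwards-compatible mode does not recognize the instruction and executes the fallback instead.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- XSLT 2.0: write the result to a secondary output file (hypothetical name). -->
    <xsl:result-document href="report.xml">
      <xsl:apply-templates/>
      <!-- XSLT 1.0: the instruction is unknown, so this fallback runs
           and the result goes to the principal output instead. -->
      <xsl:fallback>
        <xsl:apply-templates/>
      </xsl:fallback>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```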

The following sections look in more detail at the rules for backwards-compatible and forwards-compatible behavior.

Forwards Compatibility in XSLT 1.0

At present, you are probably more concerned with migration of XSLT 1.0 stylesheets to XSLT 2.0 than with migration from 2.0 to 3.0, so it makes sense to look at the forwards-compatibility rules as they were defined in the XSLT 1.0 specification. In fact these rules are not greatly changed in XSLT 2.0, so if you are reading this perhaps in 2008 and planning the transition to a new version 3.0, the advice should still be relevant.

Forwards-compatibility mode is invoked, as far as an XSLT 1.0 processor is concerned, by setting the version attribute on the <xsl:stylesheet> element to any value other than «1.0» (even, surprisingly, a value lower than «1.0»). For an XSLT 2.0 processor, forwards-compatibility mode is invoked by a version attribute greater than «2.0».

This mode has static scope rather than dynamic scope: it affects the instructions in the stylesheet that are textually within the element that carries the relevant version attribute. It affects only the behavior of the compiler; it does not alter the way that any instruction or expression is evaluated at runtime.

In forwards-compatible mode, the XSLT processor must assume that the stylesheet is using XSLT facilities defined in a version of the standard that has been published since the software was released. The processor, of course, won't know what to do with these facilities, but it must assume that the stylesheet author is using them deliberately. It treats them in much the same way as vendor extensions that it doesn't understand:

It must report an error for XSLT elements it doesn't understand only if they are actually evaluated, and if there is no child <xsl:fallback> instruction.
It must ignore attributes it doesn't recognize, and unrecognized values for recognized attributes. One particular consequence of this is that if the stylesheet specifies «version="2.0"», then an XSLT 1.0 processor will ignore any «use-when» attributes that it finds on XSLT elements.
It must report an error for functions it doesn't recognize, or that have the wrong number of arguments, only if the function is actually called. You can avoid this error condition by using function-available() to test whether the function exists before calling it.
It must report syntax errors in XPath expressions that use syntax that isn't allowed in the relevant version of XPath only if the expression is actually evaluated. (XSLT 1.0 works only with XPath 1.0, while XSLT 2.0 works only with XPath 2.0.)
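The third rule above can be illustrated with a small sketch. It guards a call to the XPath 2.0 function current-date() with function-available(), so that an XSLT 1.0 processor in forwards-compatible mode never evaluates the call. The element name generated and the fallback text are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <generated>
      <xsl:choose>
        <!-- True on an XSLT 2.0 processor, false on an XSLT 1.0 processor. -->
        <xsl:when test="function-available('current-date')">
          <xsl:value-of select="current-date()"/>
        </xsl:when>
        <!-- The unrecognized function is never called on this branch. -->
        <xsl:otherwise>date not available</xsl:otherwise>
      </xsl:choose>
    </generated>
  </xsl:template>

</xsl:stylesheet>
```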

This behavior occurs only if the <xsl:stylesheet> element specifies a version other than «1.0» (or for XSLT 2.0, a value greater than «2.0»). Forwards-compatible mode can also be specified for a portion of a stylesheet by specifying the xsl:version attribute on any literal result element, and in the case of XSLT 2.0, by specifying the version attribute on any XSLT element. If forwards-compatible mode is not enabled, then any use of an XSLT element, attribute, or function that isn't in the version of XSLT that the processor supports, or any use of XPath syntax that isn't in the corresponding XPath specification, is an error and must be reported, whether it is actually executed or not.

If you specify «version="1.0"» and then use XSLT 2.0 facilities such as <xsl:result-document>, then an XSLT 1.0 processor will reject this as an error. An XSLT 2.0 processor, however, will process your stylesheet successfully. An XSLT 2.0 processor when given a stylesheet that specifies «version="1.0"» is not expected to check that the stylesheet actually conforms to XSLT 1.0.

Forwards-compatible processing was specified to allow you to write a stylesheet that exploits facilities in version 2.0 while still behaving sensibly when run with an XSLT processor that supports version 1.0 only, or, at some point in the future, to use facilities in version 3.0 and still behave sensibly with an XSLT 2.0 processor. To achieve this, you can use the system-property() function (described in Chapter 7) to discover which version of XSLT the processor implements, or which processor is being used. For example, you could write code such as the following.

  <xsl:if test="system-property('xsl:version')=2.0 or
                starts-with(system-property('xsl:vendor'), 'xalan')">
    <xsl:new-facility/>
  </xsl:if>

Relying on the version number this returns is a rather crude mechanism: there are likely to be processors around that implement some of the new features in XSLT 2.0 but not yet all of them. Testing which vendor's processor is in use is therefore handy for portability, especially when vendors have not kept strictly to the conformance rules. Another possibility is to use the element-available() and function-available() functions described later in the chapter: although these are primarily intended to allow you to test whether particular vendor or user-defined extensions are available, they can also be used to test for the availability of specific XSLT instructions and functions in the core language.
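A sketch of testing for a specific core instruction rather than for a version number. The source element names catalog and item and the category attribute are hypothetical. An XSLT 1.0 processor in forwards-compatible mode should report that xsl:for-each-group is unavailable and take the second branch.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="catalog">
    <summary>
      <xsl:choose>
        <!-- Ask about the instruction itself, not the processor version. -->
        <xsl:when test="element-available('xsl:for-each-group')">
          <xsl:for-each-group select="item" group-by="@category">
            <group name="{current-grouping-key()}"
                   count="{count(current-group())}"/>
          </xsl:for-each-group>
        </xsl:when>
        <!-- Simpler output when grouping is not supported: one entry per item. -->
        <xsl:otherwise>
          <xsl:for-each select="item">
            <item category="{@category}"/>
          </xsl:for-each>
        </xsl:otherwise>
      </xsl:choose>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```

This kind of test keeps working on a partially conformant processor, where the reported version number alone would not say whether grouping is implemented.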

Technically, a processor that implements some of the new XSLT 2.0 features but not all of them doesn't conform either with XSLT 1.0 or with XSLT 2.0. But since many XSLT processors are developed incrementally with new releases every few weeks, you might well find products that occupy this no man's land. A product will presumably return «2.0» as the value of «system-property('xsl:version')» when the vendor is confident that the product is "almost" conformant: past experience suggests that different vendors will interpret this in different ways.

There was a suggestion that one should ban processors from returning «2.0» unless they are fully conformant with the spec. But there is little point in such a prohibition, because a product that isn't fully conformant with the spec is by definition doing things that the spec doesn't allow.

Backwards Compatibility in XSLT 2.0

For an XSLT 2.0 processor, you can invoke backwards-compatibility mode by setting the version attribute on the <xsl:stylesheet> element (or in fact on any XSLT element) to the value «1.0». In fact, any value less than «2.0» will do. You can also set the xsl:version attribute on a literal result element in the same way.

Like the switch for forwards-compatibility mode, this has static scope: it applies to all instructions and XPath expressions contained within the element where the version attribute is set. Unlike forwards-compatibility mode, however, this mode affects the results of evaluating instructions and expressions, rather than being purely a compile-time switch.

XSLT 2.0 processors aren't obliged to support backwards-compatible processing. If they don't, they must reject any attempt to specify «version="1.0"» as an error. In the early life of XSLT 2.0, I imagine that most vendors will want to support backwards-compatibility mode because their customers are likely to require it. The reason it is optional is that this need may gradually decline as XSLT 1.0 recedes into history. If XSLT 2.0 turns out to have a long life, and is not superseded by a subsequent version for 5 years or so, it's quite likely that a vendor developing a new XSLT 2.0 processor might decide that there is no longer a market need for 1.0 backwards compatibility.

How does backwards compatibility actually affect the results of the stylesheet? One thing that it does not do is to say "process this according to the rules in the XSLT 1.0 specification." This wouldn't work, because the parts of the stylesheet that use 2.0 facilities and the parts that use backwards-compatibility mode need to work with the same data model, and the data model used by an XSLT 2.0 processor is the 2.0 data model, not the 1.0 data model. Instead, backwards-compatibility mode changes the behavior of a small number of specific XSLT and XPath constructs, in quite specific ways.

Here is a checklist of the things that are done differently. The left-hand column indicates the normal XSLT 2.0 (or XPath 2.0) behavior, the right-hand column the behavior in backwards-compatibility mode.

First, the differences covered by the XSLT 2.0 specification are given as follows.

2.0 Behavior	1.0 Behavior
When the value selected by the <xsl:value-of> instruction is a sequence, all the values are output, separated by spaces	When the value selected by the <xsl:value-of> instruction is a sequence, the first value is output, and the rest are ignored
When the value produced by an expression in an attribute value template is a sequence, all the values are output, separated by spaces	When the value produced by an expression in an attribute value template is a sequence, the first value is output, and the rest are ignored
When the value returned by the expression in the value attribute of <xsl:number> is a sequence, all the numbers in the sequence are output, according to the format described in the format attribute	When the value returned by the expression in the value attribute of <xsl:number> is a sequence, the first number in the sequence is output, and the rest are discarded
When the value of a sort key is a sequence containing more than one item, a type error is reported	When the value of a sort key is a sequence containing more than one item, the first item is used as the sort key, and remaining items are ignored
When <xsl:call-template> supplies a parameter that's not defined in the template being called, a static error is reported	When <xsl:call-template> supplies a parameter that's not defined in the template being called, the extra parameter is ignored

Backwards-compatibility mode also affects the way that XPath expressions in the stylesheet are evaluated. Here are the differences.

2.0 Behavior	1.0 Behavior
When a function expects a single node or a single item as an argument, and the supplied value of the argument is a sequence containing more than one item, a type error is reported	When a function expects a single node or a single item as an argument, and the supplied value of the argument is a sequence containing more than one item, all items except the first are ignored
When a function expects a string or a number as an argument, and the supplied value is of the wrong type, a type error is reported	When a function expects a string or a number as an argument, and the supplied value is of the wrong type, the system converts the supplied value to a string or a number using the string() or number() function as appropriate
When one of the operands to an operator such as «=» or «<» is a number, and the other is not, a type error is reported	When one of the operands to an operator such as «=» or «<» is a number, and the other is not, the non-numeric operand is converted to a number using the number() function (see the following note)
When one of the operands of an arithmetic operator such as «+» or «*» is a sequence containing more than one item, a type error is reported	When one of the operands of an arithmetic operator such as «+» or «*» is a sequence containing more than one item, all items except the first are ignored
When the operands of an arithmetic operator such as «+» or «*» have types for which this operator is not defined, a type error is reported	When the operands of an arithmetic operator such as «+» or «*» have types for which this operator is not defined, the supplied operands are converted to numbers (if possible) using the number() function
Note: The November 2003 draft of XPath 2.0 states that in backwards-compatible mode, non-numeric operands of an arithmetic operator are converted to numbers by casting. However, I believe this is incorrect and the conversion should be done using the number() function. The difference is that casting treats non-numeric values as an error, while the number() function converts them to the special value NaN.
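The distinction drawn in the note can be seen in a short sketch for an XSLT 2.0 processor. The template name main and the result element names are hypothetical. The cast is guarded with castable as, because an unguarded xs:double('abc') would raise a dynamic error.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template name="main">
    <results>
      <!-- number() never fails: a non-numeric string becomes NaN. -->
      <via-number>
        <xsl:value-of select="number('abc')"/>
      </via-number>
      <!-- A cast of the same string fails, so test with castable as first. -->
      <via-cast>
        <xsl:value-of select="if ('abc' castable as xs:double)
                              then xs:double('abc')
                              else 'cast would fail'"/>
      </via-cast>
    </results>
  </xsl:template>

</xsl:stylesheet>
```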

My personal preference when moving forward to a new software version or language version is to take the pain of the conversion all at once, and try to make the program or stylesheet look as if it had been written for the new version from the word go. Most of the changes listed above are to reflect the fact that XSLT 2.0 and XPath 2.0 no longer use the "first node" rule when a sequence is supplied in a context where a singleton is needed. You can always get the effect of selecting the first item in the sequence, by using the predicate «[1]». For example, if an XSLT 1.0 stylesheet contains the instruction:

  <xsl:value-of select="following-sibling::*"/>

then it should be changed to:

  <xsl:value-of select="following-sibling::*[1]"/>

But doing the conversion all at once is a luxury that you can't always afford. Backwards-compatibility mode is there to allow you to spread the cost of making the changes by doing it gradually.

Extensibility

Bitten by years of experience with proprietary vendor extensions to HTML, the W3C committee responsible for XSLT took great care to allow vendor extensions in a tightly controlled way.

The extensibility mechanisms in XSLT are governed by several unstated design principles:

Namespaces are used to ensure that vendor extensions cannot clash with facilities in the standard (including facilities introduced in future versions), or with extensions defined by a different vendor.
It is possible for an XSLT processor to recognize where extensions have been used, including extensions defined by a different vendor, and to fail cleanly if it cannot implement those extensions.
It is possible for the writer of a stylesheet to test whether particular extensions are available, and to define fallback behavior if they are not. For example, the stylesheet might be able to achieve the same effect in a different way, or it might make do without some special effect in the output.

The principal extension mechanisms are extension functions and extension instructions. However, it is also possible for vendors to define other kinds of extensions, or to provide mechanisms for users or third parties to do so. These include the following:

XSLT-defined elements can have additional vendor-defined attributes, provided they use a non-null namespace URI, and that they do not introduce nonconformant behavior for standard elements and attributes. For example, a vendor could add an attribute such as acme:debug to the <xsl:template> element, whose effect is to pause execution when the template is evaluated. But adding an attribute «acme:repeat="2"» whose effect is to execute the template twice would be against the conformance rules.
Vendors can define additional top-level elements; again provided that they use a non-null namespace URI, and that they do not cause nonconformant behavior for standard elements and attributes. An example of such an element is Microsoft's <msxsl:script> element, for defining external functions in VBScript or JScript. Any processor that doesn't recognize the namespace URI will ignore such top-level elements.
Certain XSLT-defined attributes have an open-ended set of values, where vendors have discretion on the range of values to be supported. Examples are the lang attribute of <xsl:number> and <xsl:sort>, which provides language-dependent numbering and sorting; the method attribute of <xsl:output>, which defines how the result tree is output to a file; and the format attribute of <xsl:number>, which allows the vendor to provide additional numbering sequences beyond those defined in the standard. The list of system properties supplied in the first argument of the system-property() function is similarly open ended.

Extension Functions

Extension functions provide a mechanism for extending the capabilities of XSLT by escaping into another language such as Java or JavaScript. The most usual reasons for doing this are as follows:

To improve performance
To exploit system capabilities and services
To reuse code that already exists in another language
For convenience, as complex algorithms and computations can be very verbose when written in XSLT

XSLT 2.0 allows functions to be written using the <xsl:function> declaration in a stylesheet (these are referred to as stylesheet functions, and they are not considered to be extension functions). This facility, together with the increase in the size of the core function library, greatly reduces the need to escape into other programming languages. However, it is still necessary if you need access to external resources or services from within the stylesheet.

The term extension function is used both for functions supplied by the vendor beyond the core functions defined in the XSLT and XPath standards (those described in Chapter 7 of this book, and Chapter 10 of XPath 2.0 Programmer's Reference), and also for functions written by users and third parties.

The XSLT Recommendation allows extension functions to be called, but does not define how they are written, or how they should be bound to the stylesheet, or which languages should be supported. This means that it is quite difficult to write extension functions that work with more than one vendor's XSLT processor, even though in the Java world there are some conventions that several vendors have adopted.

In December 2000 the XSL Working Group in W3C published a working draft for XSLT 1.1, which proposed detailed conventions for writing extension functions in Java and JavaScript. These proposals met a rather hostile reception, for a variety of reasons. The working draft was subsequently withdrawn and the work hasn't been taken forward. It is still available, if you are interested, at http://www.w3.org/TR/xslt11 .

A function name in the XPath expression syntax is a QName, that is, a name with an optional namespace prefix. Under XSLT, the default namespace for functions (the namespace that is assumed when the function name has no prefix) is always the namespace for the core function library. This function library includes the XSLT-defined functions listed in Chapter 7 of this book, as well as the functions defined in the XPath 2.0 specifications, which are listed in Chapter 10 of XPath 2.0 Programmer's Reference. For example, the core function not() can be invoked as follows.

  <xsl:if test="not( @name = 'Mozart' )">

If the function name has a prefix, the function can come from a number of other sources, depending on its namespace:

Functions in the XML Schema namespace http://www.w3.org/2001/XMLSchema (traditionally associated with the prefix «xs», though «xsd» is also used) are used to construct values of built-in types. For example, the function call «xs:date('2004-02-29')» is used to convert a string to an xs:date value.
Supplementing the XML Schema built-in types, there are four additional types defined in XPath 2.0: xdt:untyped, xdt:untypedAtomic, xdt:yearMonthDuration, and xdt:dayTimeDuration. The last three of these have constructor functions in the same way as the XML Schema built-in types. The «xdt» prefix here is used to represent the namespace http://www.w3.org/2003/11/xpath-datatypes . (There is no constructor function for xdt:untyped because it is a complex type rather than an atomic type.)
XSLT vendors will often provide additional functions in their own namespace. For example, Saxon provides a number of functions in the namespace http://saxon.sf.net/ . An example is saxon:evaluate(), which allows an XPath expression to be constructed dynamically from a string, and then executed.
Third parties may also define function libraries. Of particular note is the EXSLT library at http://www.exslt.org/ . This provides, among other things, a useful library of mathematical functions. (It also provides capabilities such as date and time handling, and regular expression processing, that have largely been overtaken by standard facilities in XSLT 2.0 and XPath 2.0.) This is primarily a library of function specifications, but implementations of the functions are available for many popular XSLT processors, either from the XSLT vendor or from some other party. This ensures that you can use these functions in a stylesheet and still retain portability across XSLT processors. Note, however, that implementations of these functions are not generally portable: an implementation of math:sqrt() that's written for MSXML3 won't work with Xalan, for example.
You can write your own functions in XSLT, using the <xsl:function> declaration. These will be completely portable across XSLT 2.0 implementations, but of course they are restricted to things that can be coded in XSLT and XPath. These functions can be in any namespace apart from a small number of reserved namespaces. The namespaces that are reserved are the obvious ones such as the XSLT, XML, and XML Schema namespaces.
The <xsl:function> element has an attribute override that can be set to «yes» or «no» to indicate whether the stylesheet function should override any vendor-defined function of the same name. This is useful because there might be a portable cross-platform implementation of a function such as math:sqrt() specified in a third-party library such as EXSLT, as well as a native implementation provided by the XSLT vendor. This attribute allows you to choose which implementation is preferred.
Finally, if the XSLT processor allows it, you may be able to write functions in an external programming language. Microsoft's XSLT processors, for example, allow you to invoke functions in scripting languages such as JavaScript, and all the Java-based processors such as Xalan-J, Saxon, and jd.xslt allow you to invoke methods written in Java. Saxon also allows you to call functions written in XQuery. Other processors will tend to support the native language of the processor: Xalan-C++ allows you to write extension functions in C++ (you need to be aware that installing these is a lot more complex than in the case of Java), while the 4XSLT processor (http://4suite.org) focuses on Python.
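Returning to the override attribute of <xsl:function>, the following is a minimal sketch of a portable stand-in for math:sqrt() that yields to a vendor-supplied implementation when one exists. The helper function local:newton, its namespace, and the fixed count of 30 iterations are assumptions made for illustration; the iteration count is adequate for moderate values only, and this is not the EXSLT reference implementation.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:math="http://exslt.org/math"
    xmlns:local="http://example.com/local"
    exclude-result-prefixes="xs math local">

  <!-- override="no": a vendor-supplied math:sqrt() takes precedence if present. -->
  <xsl:function name="math:sqrt" as="xs:double" override="no">
    <xsl:param name="n" as="xs:double"/>
    <xsl:sequence select="if ($n gt 0) then local:newton($n, $n, 30)
                          else if ($n eq 0) then 0e0
                          else number('NaN')"/>
  </xsl:function>

  <!-- Hypothetical helper: a fixed number of Newton-Raphson refinement steps. -->
  <xsl:function name="local:newton" as="xs:double">
    <xsl:param name="n" as="xs:double"/>
    <xsl:param name="guess" as="xs:double"/>
    <xsl:param name="steps" as="xs:integer"/>
    <xsl:sequence select="if ($steps le 0) then $guess
                          else local:newton($n, ($guess + $n div $guess) div 2, $steps - 1)"/>
  </xsl:function>

</xsl:stylesheet>
```

Setting override to «yes» instead would make the stylesheet function win even when the vendor supplies a native version, which may be preferable when identical results across processors matter more than speed.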

The language specification says nothing about how extension functions are written, and how they are linked to the stylesheet. The notes that follow are provided to give an indication of the kind of techniques you are likely to encounter.

In the case of Java, several processors have provided a mechanism in which the name of the Java class is contained in the namespace URI of the function, while the name of the method is represented by the local name. This mechanism means that all the information needed to identify and call the function is contained within the function name itself. For example, if you want to call the Java method random() in class java.lang.Math to obtain a random number between 0.0 and 1.0, you can write:

  <xsl:variable name="random-number"
      select="Math:random()"
      xmlns:Math="ext://java.lang.Math"/>

Unfortunately, each processor has slightly different rules for forming the namespace URI, as well as different rules for converting function arguments and results between Java classes and the XPath type system, so it won't always be possible to make such calls portable between XSLT processors. But the example above works with both Saxon and Xalan.

This example calls a static method in Java, but most products also allow you to call Java constructors to return object instances, and then to call instance methods on those objects. To make this possible, the processor needs to extend the XPath type system to allow expressions to return values that are essentially wrappers around external Java objects. The XSLT and XPath specifications are written to explicitly permit this, though the details are left to the implementation.

For example, suppose you want to monitor the amount of free memory that is available, perhaps to diagnose an "out of memory" error in a stylesheet. You could do this by writing:

  <xsl:message>
    <xsl:text>Free memory: </xsl:text>
    <xsl:value-of select="rt:freeMemory(rt:getRuntime())"
        xmlns:rt="ext://java.lang.Runtime"/>
  </xsl:message>

Again, this example is written to work with both Saxon and Xalan.

There are two extension function calls here: the call on getRuntime() calls a static method in the class java.lang.Runtime , which returns an instance of this class. The call on freeMemory() is an instance method in this class. By convention, instance methods are called by supplying the target instance as an extra first parameter in the call.
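The same convention extends to constructors. The fragment below is a sketch only: it assumes the processor uses the local name new to denote a constructor call (a convention that both Saxon and Xalan have followed) and that it accepts the ext:// form of namespace URI used in the examples above. Neither is defined by the XSLT specification, so check your processor's documentation. The template name show-timestamp is invented for the example.

```xml
<xsl:template name="show-timestamp"
    xmlns:Date="ext://java.util.Date">
  <!-- Date:new() is assumed to call the no-argument constructor java.util.Date() -->
  <xsl:variable name="now" select="Date:new()"/>
  <timestamp xsl:exclude-result-prefixes="Date">
    <!-- Instance method: the target object is supplied as the first argument -->
    <xsl:value-of select="Date:toString($now)"/>
  </timestamp>
</xsl:template>
```

The variable $now holds a wrapped Java object rather than an XPath value, so the only useful thing you can do with it is pass it to another extension function.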

Another technique that's used for linking an extension function is to use a declaration in the stylesheet. Microsoft's processors use this approach to bind functions written in scripting languages such as JavaScript or VBScript. Here is an example of a simple extension function implemented using this mechanism with Microsoft's MSXML3/4 processor, and an expression that calls it.

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:ms="javascript:my-extensions">
    <msxsl:script
        xmlns:msxsl="urn:schemas-microsoft-com:xslt"
        language="VBScript"
        implements-prefix="ms">
      Function ToMillimetres(inches)
        ToMillimetres = inches * 25.4
      End Function
    </msxsl:script>
    <xsl:template match="/">
      <xsl:variable name="test" select="12"/>
      <size><xsl:value-of select="ms:ToMillimetres($test)"/></size>
    </xsl:template>
  </xsl:stylesheet>

This is not a particularly well-chosen example, because it could easily be coded in XSLT, and it's generally a good idea to stick to XSLT code unless there is a very good reason not to; but it illustrates how it's done.
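For comparison, here is roughly how the same conversion might look as a pure XSLT 2.0 stylesheet function. The function name my:to-millimetres and its namespace URI are invented for this sketch; the point is only that nothing here depends on a particular vendor's processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <!-- Portable equivalent of the script function: inches to millimetres -->
  <xsl:function name="my:to-millimetres" as="xs:double">
    <xsl:param name="inches" as="xs:double"/>
    <xsl:sequence select="$inches * 25.4"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:variable name="test" select="12"/>
    <size><xsl:value-of select="my:to-millimetres($test)"/></size>
  </xsl:template>

</xsl:stylesheet>
```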

People sometimes get confused about the difference between script in the stylesheet, which is designed to be called as part of the transformation process, and script in the HTML output page, which is designed to be called during the display of the HTML in the browser. When the transformation is being done within the browser, and is perhaps invoked from script in another HTML page, it can be difficult to keep the distinctions clearly in mind. I find that it always helps in this environment to create a mock-up of the HTML page that you want to generate, test that it works as expected in the browser, and then start thinking about writing the XSLT code to generate it.

Sometimes you need to change configuration files or environment variables, or call special methods in the processor's API to make extension functions available; this is particularly true of products written in C or C++, which are less well suited to dynamic loading and linking.

In XSLT 2.0 (this is a change from XSLT 1.0), it is a static error if the stylesheet contains a call on a function that the compiler cannot locate. If you want to write code that is portable across processors offering different extension functions, you should therefore use the new use-when attribute to ensure that code containing such calls is not compiled unless the function is available. You can test whether a particular extension function is available by using the function-available() function. For example:

  <xsl:sequence xmlns:acme="http://acme.co.jp/xslt">
    <xsl:value-of select="acme:moonshine($x)"
        use-when="function-available('acme:moonshine')"/>
    <xsl:text use-when="not(function-available('acme:moonshine'))"
      >***  Sorry, moonshine is off today ***</xsl:text>
  </xsl:sequence>
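If the extension function is called from many places, one way to keep the use-when tests in a single spot is to hide the call behind a stylesheet function that has two alternative declarations. The following is a sketch under that assumption; the wrapper name my:moonshine and its namespace URI are invented, and acme:moonshine() is the same imaginary vendor function as above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:acme="http://acme.co.jp/xslt"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="acme my">

  <!-- Compiled only when the vendor function is available -->
  <xsl:function name="my:moonshine" as="item()*"
      use-when="function-available('acme:moonshine')">
    <xsl:param name="x" as="item()*"/>
    <xsl:sequence select="acme:moonshine($x)"/>
  </xsl:function>

  <!-- Compiled otherwise: a portable stand-in with the same name and arity -->
  <xsl:function name="my:moonshine" as="item()*"
      use-when="not(function-available('acme:moonshine'))">
    <xsl:param name="x" as="item()*"/>
    <xsl:sequence select="'*** Sorry, moonshine is off today ***'"/>
  </xsl:function>

  <xsl:template match="/">
    <result><xsl:value-of select="my:moonshine(.)"/></result>
  </xsl:template>

</xsl:stylesheet>
```

Because exactly one of the two declarations survives compilation, there is no conflict between them, and the rest of the stylesheet can call my:moonshine() unconditionally.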

Extension functions, because they are written in general-purpose programming languages, can have side effects. For example, they can write to databases, they can ask the user for input, or they can maintain counters. At one time Xalan provided a sample application to implement a counter using extension functions, effectively circumventing the restriction that XSLT variables cannot be modified in situ. However, extension functions with side effects should be used with great care, because the XSLT specification doesn't say what order things are supposed to happen in. For example, it doesn't say whether a variable is evaluated when its declaration is first encountered, or when its value is first used. The more advanced XSLT processors adopt a lazy evaluation strategy in which (for example) variables are not evaluated until they are used. If extension functions with side effects are used to evaluate such variables, the results can be very surprising, because the order in which the extension functions are called becomes quite unpredictable. For example, if one function writes to a log file and another closes this file, you could find that the log file is closed before it is written to. In fact, if a variable is never used, the extension function contained in its definition might not be evaluated at all.
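A typical way to fall into this trap is an attempt to time part of a transformation. The fragment below is a sketch of the hazard, not a recommended technique; it assumes a Java-based processor that accepts the ext:// namespace form used earlier, and it calls the static method currentTimeMillis() in java.lang.System.

```xml
<xsl:template match="/" xmlns:System="ext://java.lang.System">
  <!-- Intended as "time before", but a lazy processor need not call the
       function until $start is actually referenced -->
  <xsl:variable name="start" select="System:currentTimeMillis()"/>
  <xsl:variable name="result">
    <xsl:apply-templates/>
  </xsl:variable>
  <!-- Intended as "time after" -->
  <xsl:variable name="end" select="System:currentTimeMillis()"/>
  <xsl:copy-of select="$result"/>
  <xsl:message>
    <!-- Both calls may in fact be made here, one immediately after the other -->
    <xsl:text>Elapsed ms: </xsl:text>
    <xsl:value-of select="$end - $start"/>
  </xsl:message>
</xsl:template>
```

On a processor that evaluates variables eagerly this may well report a plausible elapsed time; on one that evaluates them lazily it may report a figure close to zero, and nothing in the specification says which behavior is correct.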

Before writing an extension function, there are a number of alternatives you should consider:

Can the function be written in XSLT, using an <xsl:function> element?
Is it possible to supply the required information as a stylesheet parameter? Generally this provides a cleaner and more portable solution.
Is it possible to get the result by calling the document() function, with a suitable URI? The URI passed to the document() function does not have to identify a static file; it could also invoke code on an ASP page or a Java servlet. The Java JAXP API allows you to write a URIResolver class that intercepts the call on the document() function, so the URIResolver can return the results directly without needing to access any external resources. The System.Xml.Xsl interface in the Microsoft .NET framework has a similar capability, referred to as an XmlResolver .
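The second and third of these alternatives might look something like the following sketch. The parameter names and the URI are invented for illustration; in practice the URI could identify a servlet or ASP page, or simply be a key that a URIResolver or XmlResolver recognizes and answers directly.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Supplied by the calling application; the default applies if it is not set -->
  <xsl:param name="user-name" select="'anonymous'"/>

  <!-- Hypothetical URI for dynamically generated XML -->
  <xsl:param name="rates-uri" select="'http://example.com/rates?currency=EUR'"/>

  <xsl:template match="/">
    <report user="{$user-name}">
      <!-- document() fetches the XML, or the resolver supplies it -->
      <xsl:copy-of select="document($rates-uri)/*"/>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

Neither technique needs any extension namespace, so the stylesheet itself remains portable; the processor-specific code lives in the calling application instead.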

Extension Instructions

An extension instruction is an element occurring within a sequence constructor, that belongs to a namespace designated as an extension namespace. A namespace is designated as an extension namespace by including its namespace prefix in the extension-element-prefixes attribute of the <xsl:stylesheet> element, or in the xsl:extension-element-prefixes attribute of the element itself, or of a containing extension element or literal result element.
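As a minimal sketch of the first of these options, the stylesheet below designates the namespace bound to the prefix acme as an extension namespace at the stylesheet level. The <acme:moonshine> instruction is imaginary, and the out namespace URI is invented for the example; the <xsl:fallback> child is there only so that the sketch can run on a processor that does not implement the instruction.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:acme="http://acme.co.jp/xslt"
    xmlns:out="http://example.com/output"
    extension-element-prefixes="acme">

  <xsl:template match="/">
    <!-- Literal result element: "out" is not an extension namespace,
         so this element is copied to the result -->
    <out:report>
      <!-- Extension instruction: "acme" is listed in extension-element-prefixes -->
      <acme:moonshine select=".">
        <xsl:fallback>
          <xsl:text>moonshine unavailable</xsl:text>
        </xsl:fallback>
      </acme:moonshine>
    </out:report>
  </xsl:template>

</xsl:stylesheet>
```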

For example, Saxon provides an extension instruction <saxon:while> to perform looping while a condition remains true. There is no standard XSLT construct for this because without side effects, a condition once true can never become false. But when used in conjunction with extension functions, <saxon:while> can be a useful addition.

Using an Extension Instruction

The following stylesheet uses the <saxon:while> element to process all the Java system properties. It can be run with any source document.

Stylesheet

The stylesheet calls five methods in the Java class library:

System.getProperties() to get a Properties object containing all the system properties
Properties.propertyNames() to get an Enumeration of the names of the system properties
Enumeration.hasMoreElements() to determine whether there are more system properties to come
Enumeration.nextElement() to get the next system property
Properties.getProperty() to get the value of the system property with a given name. For this method, the Properties object is supplied as the first argument, and the name of the required property in the second

  <xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:template match="/">
      <system-properties
          xmlns:System="ext://java.lang.System"
          xmlns:Properties="ext://java.util.Properties"
          xmlns:Enumeration="ext://java.util.Enumeration"
          xsl:exclude-result-prefixes="System Properties Enumeration">
        <xsl:variable name="props"
            select="System:getProperties()"/>
        <xsl:variable name="enum"
            select="Properties:propertyNames($props)"/>
        <saxon:while test="Enumeration:hasMoreElements($enum)"
            xsl:extension-element-prefixes="saxon"
            xmlns:saxon="http://saxon.sf.net/">
          <xsl:variable name="property-name"
              select="Enumeration:nextElement($enum)"/>
          <property name="{$property-name}"
              value="{Properties:getProperty($props, $property-name)}"/>
        </saxon:while>
      </system-properties>
    </xsl:template>
  </xsl:stylesheet>

Note that for this to work, "saxon" must be declared as an extension element prefix, otherwise <saxon:while> would be interpreted as a literal result element and would be copied to the output. The xsl:exclude-result-prefixes attribute is not strictly necessary, but it prevents the output being cluttered with unnecessary namespace declarations.

Technically, this code is unsafe. Although it appears that the extension functions are read-only, the Enumeration object actually contains information about the current position in a sequence, and the call to nextElement() modifies this information: it is therefore a function call with side effects. In practice you can usually get away with such calls. However, as optimizers become more sophisticated, stylesheets that rely on side effects can sometimes work with one version of an XSLT processor, and fail with the next version. So you should use such constructs only when you have no alternative.

A tip: if you have problems getting such stylesheets to work in Saxon, the -TJ option on the command line can be useful for debugging. It gives you diagnostic output showing which Java classes were searched to find methods matching the extension function calls.

As with extension functions, the term extension instruction covers both nonstandard instructions provided by the vendor, and nonstandard instructions implemented by a user or third party. There is no requirement that an XSLT implementation must allow users to define new extension instructions, only that it should behave in a particular way when it encounters extension instructions that it cannot process.

Where a product does allow users to implement extension instructions (two products that do so are Saxon and Xalan), the mechanisms and APIs involved are likely to be rather more complex than those for extension functions, and the task is not one to be undertaken lightly. However, extension instructions can offer capabilities that would be very hard to provide with extension functions alone.

If there is an extension instruction in a stylesheet, then all XSLT processors will recognize it as such, but in general some will be able to handle it and others won't (because it is defined by a different vendor). As with extension functions, the rule is that a processor mustn't fail merely because an extension instruction is present; it should fail only if an attempt is made to evaluate it.

There are two mechanisms to allow stylesheet authors to test whether a particular extension instruction is available: the element-available() function and the <xsl:fallback> instruction.

The element-available() function works in a very similar way to function-available(). You can use it in a use-when attribute to include stylesheet code conditionally. In this case, however, you can also do the test at evaluation time if you prefer, because calls to unknown extension instructions don't generate an error unless they are executed. For example:

  <xsl:choose xmlns:acme="http://acme.co.jp/xslt">
    <xsl:when test="element-available('acme:moonshine')">
      <acme:moonshine select="$x" xsl:extension-element-prefixes="acme"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:text>*** Sorry, moonshine is off today ***</xsl:text>
    </xsl:otherwise>
  </xsl:choose>

Note that at the time element-available() is called, the prefix for the extension element (here "acme") must have been declared in a namespace declaration, but it does not need to have been designated as an extension element.
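The compile-time form of the same test might be written as in the sketch below. Because <acme:moonshine> is not in the XSLT namespace, the attribute has to be written as xsl:use-when on that element, whereas on <xsl:text> it is plain use-when. The named template and its parameter are invented here simply to give $x a declaration.

```xml
<xsl:template name="try-moonshine"
    xmlns:acme="http://acme.co.jp/xslt">
  <xsl:param name="x"/>
  <!-- Kept only if the processor implements the instruction -->
  <acme:moonshine select="$x"
      xsl:extension-element-prefixes="acme"
      xsl:use-when="element-available('acme:moonshine')"/>
  <!-- Kept only if it does not -->
  <xsl:text use-when="not(element-available('acme:moonshine'))"
    >*** Sorry, moonshine is off today ***</xsl:text>
</xsl:template>
```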

The <xsl:fallback> instruction (which is fully described on page 271, in Chapter 5) provides an alternative way of specifying what should happen when an extension instruction is not available. The following example is equivalent to the previous one.

  <acme:moonshine select="$x"
      xmlns:acme="http://acme.co.jp/xslt"
      xsl:extension-element-prefixes="acme">
    <xsl:fallback>
      <xsl:text>*** Sorry, moonshine is off today ***</xsl:text>
    </xsl:fallback>
  </acme:moonshine>

When an extension instruction is evaluated, and the XSLT processor does not know what to do with it, it should evaluate any child <xsl:fallback> element. If there are several <xsl:fallback> children, it should evaluate them all. Only if there is no <xsl:fallback> element should it report an error. Conversely, if the XSLT processor can evaluate the instruction, it should ignore any child <xsl:fallback> element.

The specification doesn't actually say that an extension instruction must allow an <xsl:fallback> child to be present. There are plenty of XSLT instructions that do not allow <xsl:fallback> as a child, for example <xsl:copy-of> and <xsl:value-of> . However, an extension instruction that didn't allow <xsl:fallback> would certainly be against the spirit of the standard.

Vendor-defined or user-defined elements at the top level of the stylesheet are not technically extension instructions, because they don't appear within a sequence constructor; therefore the namespace they appear in does not need to be designated as an extension namespace.
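A common use of user-defined top-level elements is to hold lookup data inside the stylesheet. The sketch below assumes a namespace URI and element names of my own invention, and it relies on document('') returning the stylesheet module itself, which in turn requires the stylesheet to have a usable base URI.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://example.com/my-data"
    exclude-result-prefixes="my">

  <!-- User-defined top-level element: it is data, not an instruction, so
       "my" does not appear in extension-element-prefixes -->
  <my:units>
    <my:unit code="mm">millimetres</my:unit>
    <my:unit code="in">inches</my:unit>
  </my:units>

  <xsl:template match="/">
    <!-- Read the lookup table back out of the stylesheet module -->
    <unit>
      <xsl:value-of select="document('')/*/my:units/my:unit[@code = 'mm']"/>
    </unit>
  </xsl:template>

</xsl:stylesheet>
```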