TrAX


TrAX, the Transformations API for XML, is a Java API for performing XSLT transforms. It is sufficiently processor-independent that it can work with many different XSLT processors, including Xalan and Saxon. It is sufficiently model-independent that it can transform to and from XML streams, SAX event sequences, and DOM and JDOM trees.

TrAX is a standard part of JAXP and bundled with Java 1.4 and later. Furthermore, most current XSLT processors written in Java support TrAX, including Xalan-J 2.x, jd.xslt, LotusXSL, and Saxon. The specific implementation included with Java 1.4 is Xalan-J 2.2D10.

Note

Annoyingly, the Xalan-J classes included in Java 1.4 are zipped into the rt.jar archive, so it's hard to replace them with a less buggy release version of Xalan. It can be done, but you have to put the xalan.jar file in your $JAVA_HOME/lib/endorsed directory rather than in the normal jre/lib/ext directory. The exact location of $JAVA_HOME varies from system to system, but it's probably something like C:\j2sdk1.4.0 on Windows. None of this is an issue with Java 1.3 and earlier, which don't bundle these classes. On these systems, you just need to install whatever JAR files your XSLT engine vendor provides in the usual locations, the same as you would any other third-party library.


There are four main classes and interfaces in TrAX that you need to use, all in the javax.xml.transform package:

Transformer

The class that represents the stylesheet. It transforms a Source into a Result.

TransformerFactory

The class that represents the XSLT processor. This is a factory class that reads a stylesheet to produce a new Transformer.

Source

The interface that represents the input XML document to be transformed, whether presented as a DOM tree, an InputStream, or a SAX event sequence.

Result

The interface that represents the XML document produced by the transformation, whether generated as a DOM tree, an OutputStream, or a SAX event sequence.

To transform an input document into an output document, follow these steps:

  1. Load the TransformerFactory with the static TransformerFactory.newInstance() factory method.

  2. Form a Source object from the XSLT stylesheet.

  3. Pass this Source object to the factory's newTransformer() factory method to build a Transformer object.

  4. Build a Source object from the input XML document you wish to transform.

  5. Build a Result object for the target of the transformation.

  6. Pass both the source and the result to the Transformer object's transform() method.

Steps 4 through 6 can be repeated for as many different input documents as you want. You can reuse the same Transformer object repeatedly in series, but you can't use it in multiple threads in parallel.

For example, suppose you want to use the Fibonacci stylesheet in Example 17.5 to implement a simple XML-RPC server. The request document will arrive on an InputStream named in and will be returned on an OutputStream named out. Therefore, we'll use javax.xml.transform.stream.StreamSource as the Source for the input document and javax.xml.transform.stream.StreamResult as the Result for the output document. We will assume that the stylesheet itself lives at the relative URL FibonacciXMLRPC.xsl, and that it is also loaded into a javax.xml.transform.stream.StreamSource. The following code fragment performs that transform:

    try {
      TransformerFactory xformFactory
       = TransformerFactory.newInstance();
      Source xsl = new StreamSource("FibonacciXMLRPC.xsl");
      Transformer stylesheet = xformFactory.newTransformer(xsl);
      Source request  = new StreamSource(in);
      Result response = new StreamResult(out);
      stylesheet.transform(request, response);
    }
    catch (TransformerException e) {
      System.err.println(e);
    }
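The six steps can also be tried without a server or any files on disk. The following is only a minimal sketch under stated assumptions: the class name InMemoryTransform and the tiny inline stylesheet are hypothetical stand-ins for FibonacciXMLRPC.xsl, and strings take the place of the in and out streams.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class InMemoryTransform {

  // Hypothetical stylesheet (not from the book): writes the text of the
  // root greeting element as plain text.
  static final String XSL =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/'>"
    + "<xsl:value-of select='/greeting'/>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  static String transform(String xml) throws TransformerException {
    // Step 1: load the factory
    TransformerFactory xformFactory = TransformerFactory.newInstance();
    // Step 2: form a Source from the stylesheet
    Source xsl = new StreamSource(new StringReader(XSL));
    // Step 3: build the Transformer
    Transformer stylesheet = xformFactory.newTransformer(xsl);
    // Step 4: build a Source from the input document
    Source request = new StreamSource(new StringReader(xml));
    // Step 5: build a Result for the output
    StringWriter out = new StringWriter();
    Result response = new StreamResult(out);
    // Step 6: run the transform
    stylesheet.transform(request, response);
    return out.toString();
  }
}
```

Calling InMemoryTransform.transform("<greeting>Hello</greeting>") should hand back the text content of the greeting element. Only the Source and Result constructors differ from the stream-based fragment; the factory and transformer calls are the same.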

Thread Safety

Neither TransformerFactory nor Transformer is guaranteed to be thread safe. If your program is multithreaded, then the simplest solution is to give each separate thread its own TransformerFactory and Transformer objects. This can be expensive, especially if you frequently reuse the same large stylesheet, because it will need to be read from disk or the network and parsed every time you create a new Transformer object. There is also likely to be some overhead in building the processor's internal representation of an XSLT stylesheet from the parsed XML tree.

An alternative is to ask the TransformerFactory to build a Templates object. Templates is an interface that represents the parsed stylesheet, and you can ask a Templates object to give you as many separate Transformer objects as you need. Each of these can be created very quickly by copying the processor's in-memory data structures rather than by reparsing the entire stylesheet from disk or the network. The Templates object itself can be used safely across multiple threads.

For example, you might begin loading and compiling the stylesheet, like this:

    TransformerFactory xformFactory
     = TransformerFactory.newInstance();
    Source xsl = new StreamSource("FibonacciXMLRPC.xsl");
    Templates templates = xformFactory.newTemplates(xsl);

Then later in a loop, you would repeatedly load documents and transform them like this:

    while (true) {
      InputStream  in   = getNextDocument();
      OutputStream out  = getNextTarget();
      Source request    = new StreamSource(in);
      Result response   = new StreamResult(out);
      Transformer transformer = templates.newTransformer();
      transformer.transform(request, response);
    }

The thread-unsafe Transformer object is local to the while loop; therefore, references to it don't escape into other threads. This prevents the transform() method from being called concurrently. The Templates object may be shared among multiple threads. It is thread safe, so this isn't a problem. Furthermore, all of the time-consuming work is done when the Templates object is created. Calling templates.newTransformer() is very quick by comparison.
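To make the sharing pattern concrete, here is a rough sketch that compiles one Templates object and hands it to a small thread pool. The class name SharedTemplatesDemo, the inline stylesheet, and the use of java.util.concurrent are all assumptions made for illustration; they are not part of the Fibonacci example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SharedTemplatesDemo {

  // Hypothetical stylesheet: writes the text of the root element.
  static final String XSL =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/'><xsl:value-of select='/*'/></xsl:template>"
    + "</xsl:stylesheet>";

  static List<String> transformAll(List<String> documents, int threads)
      throws Exception {
    // Parse and compile the stylesheet once; every task shares the result.
    Templates templates = TransformerFactory.newInstance()
        .newTemplates(new StreamSource(new StringReader(XSL)));

    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<String>> pending = new ArrayList<>();
      for (String xml : documents) {
        Callable<String> task = () -> {
          // Each task creates its own Transformer, which never
          // leaves the thread that created it.
          Transformer transformer = templates.newTransformer();
          StringWriter out = new StringWriter();
          transformer.transform(
              new StreamSource(new StringReader(xml)),
              new StreamResult(out));
          return out.toString();
        };
        pending.add(pool.submit(task));
      }
      // Collect the results in submission order.
      List<String> results = new ArrayList<>();
      for (Future<String> future : pending) {
        results.add(future.get());
      }
      return results;
    } finally {
      pool.shutdown();
    }
  }
}
```

The only object that crosses thread boundaries here is the Templates instance. Each Transformer is a local variable inside one task, which is the same confinement the while loop relies on.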

This technique is particularly important in server environments in which the transform may be applied to thousands of different input documents, with potentially dozens being processed in parallel in separate threads. Example 17.6 demonstrates with yet another variation of the Fibonacci XML-RPC servlet. This is the first variation that does not implement the SingleThreadModel interface. It can safely run in multiple threads simultaneously.

Example 17.6 A Servlet That Uses TrAX and XSLT to Respond to XML-RPC Requests
    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;
    import javax.xml.transform.*;
    import javax.xml.transform.stream.*;

    public class FibonacciXMLRPCXSLServlet extends HttpServlet {

      private Templates stylesheet;

      // Load the stylesheet
      public void init() throws ServletException {

        try {
          TransformerFactory xformFactory
           = TransformerFactory.newInstance();
          Source source   = new StreamSource("FibonacciXMLRPC.xsl");
          this.stylesheet = xformFactory.newTemplates(source);
        }
        catch (TransformerException e) {
          throw new ServletException(
           "Could not load the stylesheet", e);
        }

      }

      // Respond to an XML-RPC request
      public void doPost(HttpServletRequest servletRequest,
       HttpServletResponse servletResponse)
       throws ServletException, IOException {

        servletResponse.setContentType("text/xml; charset=UTF-8");

        try {
          InputStream in  = servletRequest.getInputStream();
          Source source   = new StreamSource(in);
          PrintWriter out = servletResponse.getWriter();
          Result result   = new StreamResult(out);
          Transformer transformer = stylesheet.newTransformer();
          transformer.transform(source, result);
          servletResponse.flushBuffer();
          out.flush();
          out.println();
        }
        catch (TransformerException e) {
          // If we get an exception at this point, it's too late to
          // switch over to an XML-RPC fault.
          throw new ServletException(e);
        }

      }

    }

The init() method simply loads the stylesheet that will transform requests into responses. The doPost() method reads the request and returns the response. The Source is a StreamSource. The result is a StreamResult.

I'm not sure I would recommend this as the proper design for a servlet of this nature. The XSLT transform comes with a lot of overhead. At the least, I would definitely recommend doing the math in Java, as XSLT is not optimized for this sort of work. Still, I'm quite impressed with the simplicity and robustness of this code. The thread safety is just the first benefit. Shifting the XML generation into an XSLT document makes the whole program a lot more modular. It's easy to change the expected input or output format without even recompiling the servlet.

Locating Transformers

The javax.xml.transform.TransformerFactory Java system property determines which XSLT engine TrAX uses. Its value is the fully qualified name of the implementation of the abstract javax.xml.transform.TransformerFactory class. Possible values of this property include

  • Saxon 6.x: com.icl.saxon.TransformerFactoryImpl

  • Saxon 7.x: net.sf.saxon.TransformerFactoryImpl

  • Xalan: org.apache.xalan.processor.TransformerFactoryImpl

  • jd.xslt: jd.xml.xslt.trax.TransformerFactoryImpl

  • Oracle: oracle.xml.jaxp.JXSAXTransformerFactory

This property can be set in all the usual ways in which a Java system property can be set. TrAX picks from them in the following order:

  1. Invoking System.setProperty("javax.xml.transform.TransformerFactory", "classname").

  2. The value specified at the command line using the -Djavax.xml.transform.TransformerFactory=classname option to the java interpreter.

  3. The class named in the lib/jaxp.properties properties file in the JRE directory, in a line like this one:

     javax.xml.transform.TransformerFactory=classname

  4. The class named in the META-INF/services/javax.xml.transform.TransformerFactory file in the JAR archives available to the runtime.

  5. Finally, if all of the preceding options fail, TransformerFactory.newInstance() returns a default implementation. In Sun's JDK 1.4, this is Xalan-J 2.2D10.
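When it is unclear which of these mechanisms won, it can help to ask the factory itself. The following is a small sketch, assuming nothing beyond the standard API; the class name FactoryLocator and its two helper methods are hypothetical names chosen for illustration.

```java
import javax.xml.transform.TransformerFactory;

class FactoryLocator {

  static final String PROPERTY = "javax.xml.transform.TransformerFactory";

  // Returns the class name of the factory that TrAX actually loaded.
  static String loadedFactoryName() {
    return TransformerFactory.newInstance().getClass().getName();
  }

  // Reports what the system property requested (possibly nothing)
  // alongside the implementation that was really loaded.
  static String describe() {
    String requested = System.getProperty(PROPERTY);
    return "requested=" + (requested == null ? "(not set)" : requested)
         + ", loaded=" + loadedFactoryName();
  }
}
```

Printing FactoryLocator.describe() at startup shows whether a -D option or a jaxp.properties entry took effect. If the property names a class that cannot be loaded, TransformerFactory.newInstance() throws a TransformerFactoryConfigurationError rather than quietly falling back to the default.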

The xml-stylesheet Processing Instruction

XML documents may contain an xml-stylesheet processing instruction in their prologs that specifies the stylesheet to apply to the XML document. At a minimum, this has an href pseudo-attribute specifying the location of the stylesheet to apply, and a type pseudo-attribute specifying the MIME media type of the stylesheet. For XSLT stylesheets, the proper type is application/xml. For example, this xml-stylesheet processing instruction indicates the XSLT stylesheet found at the relative URL docbook-xsl-1.50.0/fo/docbook.xsl:

    <?xml-stylesheet href="docbook-xsl-1.50.0/fo/docbook.xsl"
                     type="application/xml"?>

This processing instruction is a hint. It is only a hint. Programs are not required to use the stylesheet that the document indicates. They are free to choose a different transform, multiple transforms, or no transform at all. Indeed, the purpose of this processing instruction is primarily browser display. Programs doing something other than loading the document into a browser for a human to read will likely want to use their own XSLT transforms for their own purposes.

Note

Contrary to what some other books will tell you, there is no such MIME media type as text/xsl, nor is it correct to use it as the value of the type pseudo-attribute. This alleged type is a figment of Microsoft's imagination. It has never been registered with the Internet Assigned Numbers Authority (IANA) as MIME types must be. It is not endorsed by the relevant W3C specifications for XSLT and attaching stylesheets to XML documents, and it is unlikely to be in the future.

Official registration of an XSLT-specific media type, application/xml+xslt, has begun, and this type may be used in the future to distinguish XSLT stylesheets from other kinds of XML documents. However, the registration has not been completed at the time of this writing.


In addition to the required href and type pseudo-attributes, the xml-stylesheet processing instruction can have up to four other optional pseudo-attributes:

alternate

no if this stylesheet is the primary stylesheet for the document; yes if it isn't. The default is no.

media

A string indicating in which kinds of environments this stylesheet should be used. Possible values include screen (the default), tty, tv, projection, handheld, print, braille, aural, and all.

charset

The character encoding of the stylesheet; for example, ISO-8859-1, UTF-8, or SJIS.

title

A name for the stylesheet.

For example, the following xml-stylesheet processing instructions point at two different XSLT stylesheets, one intended for print and found at the relative URL docbook-xsl-1.50.0/fo/docbook.xsl, and the other intended for on-screen display and found at docbook-xsl-1.50.0/html/docbook.xsl. Each is the primary stylesheet for its media.

    <?xml-stylesheet href="docbook-xsl-1.50.0/fo/docbook.xsl"
                     type="application/xml"
                     media="print"
                     title="XSL-FO"
                     charset="UTF-8"
                     alternate="no"?>
    <?xml-stylesheet href="docbook-xsl-1.50.0/html/docbook.xsl"
                     type="application/xml"
                     media="screen"
                     title="HTML"
                     charset="UTF-8"
                     alternate="no"?>

The TransformerFactory class has a getAssociatedStylesheet() method that loads the stylesheet indicated by such a processing instruction:

    public abstract Source getAssociatedStylesheet(Source xmlDocument,
     String media, String title, String charset)
     throws TransformerConfigurationException

This method reads the XML document indicated by the first argument, and looks in its prolog for the stylesheet that matches the criteria given in the other three arguments. If any of these are null, it ignores that criterion. The method then loads the stylesheet matching the criteria into a JAXP Source object and returns it. You can use the TransformerFactory.newTransformer() method to convert this Source into a Transformer object. For example, the following code fragment attempts to transform the document read from the InputStream in according to an xml-stylesheet processing instruction for print media found in that document's prolog. The title and charset of the stylesheet are not considered, and thus are set to null.

    // The InputStream in contains the XML document to be transformed
    try {
      Source inputDocument = new StreamSource(in);
      TransformerFactory xformFactory
       = TransformerFactory.newInstance();
      Source xsl = xformFactory.getAssociatedStylesheet(
       inputDocument, "print", null, null);
      Transformer stylesheet = xformFactory.newTransformer(xsl);
      Result outputDocument = new StreamResult(out);
      stylesheet.transform(inputDocument, outputDocument);
    }
    catch (TransformerConfigurationException e) {
      System.err.println(
       "Problem with the xml-stylesheet processing instruction");
    }
    catch (TransformerException e) {
      System.err.println("Problem with the stylesheet");
    }

A TransformerConfigurationException is thrown if there is no xml-stylesheet processing instruction that points to an XSLT stylesheet matching the specified criteria.

Features

Not all XSLT processors support exactly the same set of capabilities, even within the limits defined by XSLT 1.0. For example, some processors can only transform DOM trees, whereas others may require a sequence of SAX events, and still others may only be able to work with raw streams of text. TrAX uses URI-named features to indicate which of the TrAX classes any given implementation supports. It defines eight standard features as unresolvable URL strings, each of which is also available as a named constant in the relevant TrAX class:

  • StreamSource.FEATURE:

    http://javax.xml.transform.stream.StreamSource/feature

  • StreamResult.FEATURE:

    http://javax.xml.transform.stream.StreamResult/feature

  • DOMSource.FEATURE:

    http://javax.xml.transform.dom.DOMSource/feature

  • DOMResult.FEATURE:

    http://javax.xml.transform.dom.DOMResult/feature

  • SAXSource.FEATURE:

    http://javax.xml.transform.sax.SAXSource/feature

  • SAXResult.FEATURE:

    http://javax.xml.transform.sax.SAXResult/feature

  • SAXTransformerFactory.FEATURE:

    http://javax.xml.transform.sax.SAXTransformerFactory/feature

  • SAXTransformerFactory.FEATURE_XMLFILTER:

    http://javax.xml.transform.sax.SAXTransformerFactory/feature/xmlfilter

It's possible to test the boolean values of these features for the current XSLT engine with the getFeature() method in the TransformerFactory class:

    public abstract boolean getFeature(String name)

Note

These URLs are just identifiers like namespace URLs. They do not need to be and indeed cannot be resolved. A system does not need to be connected to the Internet to use a transformer that supports these features.


There's no corresponding setFeature() method in JAXP 1.1, because a TrAX feature reflects the nature of the underlying processor. Unlike a SAX feature, it is not something you can just turn on or off with a switch. A processor either supports DOM input or it doesn't. A processor either supports SAX output or it doesn't, and so on.

Example 17.7 is a simple program that tests an XSLT processor's support for the standard JAXP 1.1 features.

Example 17.7 Testing the Availability of TrAX Features
    import javax.xml.transform.*;
    import javax.xml.transform.dom.*;
    import javax.xml.transform.stream.*;
    import javax.xml.transform.sax.*;

    public class TrAXFeatureTester {

      public static void main(String[] args) {

        TransformerFactory xformFactory
         = TransformerFactory.newInstance();
        String name = xformFactory.getClass().getName();

        if (xformFactory.getFeature(DOMResult.FEATURE)) {
          System.out.println(name + " supports DOM output.");
        }
        else {
          System.out.println(name + " does not support DOM output.");
        }
        if (xformFactory.getFeature(DOMSource.FEATURE)) {
          System.out.println(name + " supports DOM input.");
        }
        else {
          System.out.println(name + " does not support DOM input.");
        }
        if (xformFactory.getFeature(SAXResult.FEATURE)) {
          System.out.println(name + " supports SAX output.");
        }
        else {
          System.out.println(name + " does not support SAX output.");
        }
        if (xformFactory.getFeature(SAXSource.FEATURE)) {
          System.out.println(name + " supports SAX input.");
        }
        else {
          System.out.println(name + " does not support SAX input.");
        }
        if (xformFactory.getFeature(StreamResult.FEATURE)) {
          System.out.println(name + " supports stream output.");
        }
        else {
          System.out.println(name
           + " does not support stream output.");
        }
        if (xformFactory.getFeature(StreamSource.FEATURE)) {
          System.out.println(name + " supports stream input.");
        }
        else {
          System.out.println(name
           + " does not support stream input.");
        }
        if (xformFactory.getFeature(SAXTransformerFactory.FEATURE)) {
          System.out.println(name + " returns SAXTransformerFactory "
           + "objects from TransformerFactory.newInstance().");
        }
        else {
          System.out.println(name
           + " does not use SAXTransformerFactory.");
        }
        if (xformFactory.getFeature(
         SAXTransformerFactory.FEATURE_XMLFILTER)) {
          System.out.println(
           name + " supports the newXMLFilter() methods.");
        }
        else {
          System.out.println(
           name + " does not support the newXMLFilter() methods.");
        }

      }

    }

Following are the results of running this program against Saxon 6.5.1:

    C:\XMLJAVA>java -Djavax.xml.transform.TransformerFactory=com.icl.saxon.TransformerFactoryImpl TrAXFeatureTester
    com.icl.saxon.TransformerFactoryImpl supports DOM output.
    com.icl.saxon.TransformerFactoryImpl supports DOM input.
    com.icl.saxon.TransformerFactoryImpl supports SAX output.
    com.icl.saxon.TransformerFactoryImpl supports SAX input.
    com.icl.saxon.TransformerFactoryImpl supports stream output.
    com.icl.saxon.TransformerFactoryImpl supports stream input.
    com.icl.saxon.TransformerFactoryImpl returns SAXTransformerFactory objects from TransformerFactory.newInstance().
    com.icl.saxon.TransformerFactoryImpl supports the newXMLFilter() methods.

As you can see, Saxon supports all eight features. Xalan also supports all eight features.
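If a program needs the answers as data rather than as printed messages, the same probing can be done in a loop over the eight constants. This is only a sketch; the class name FeatureTable and the probe() helper are hypothetical, and only getFeature() and the FEATURE constants come from the standard API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FeatureTable {

  // The eight standard feature URIs, taken from the named constants
  // so that the strings never have to be typed by hand.
  static final String[] FEATURES = {
    StreamSource.FEATURE,
    StreamResult.FEATURE,
    DOMSource.FEATURE,
    DOMResult.FEATURE,
    SAXSource.FEATURE,
    SAXResult.FEATURE,
    SAXTransformerFactory.FEATURE,
    SAXTransformerFactory.FEATURE_XMLFILTER
  };

  // Maps each feature URI to whether the given factory supports it,
  // preserving the order of the FEATURES array.
  static Map<String, Boolean> probe(TransformerFactory factory) {
    Map<String, Boolean> support = new LinkedHashMap<>();
    for (String feature : FEATURES) {
      support.put(feature, factory.getFeature(feature));
    }
    return support;
  }
}
```

A caller might check probe(factory).get(DOMSource.FEATURE) before deciding whether to hand the processor a DOM tree or to serialize the document to a stream first.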

XSLT Processor Attributes

Some XSLT processors provide nonstandard, custom attributes that control their behavior. Like features, these are also named via URIs. For example, Xalan-J 2.3 defines the following three attributes:

http://apache.org/xalan/features/optimize

By default, Xalan rewrites stylesheets in an attempt to optimize them (similar to the behavior of an optimizing compiler for Java or other languages). This can confuse tools that need direct access to the stylesheet, such as XSLT profilers and debuggers. If you're using such a tool with Xalan, you should set this attribute to false.

http://apache.org/xalan/features/incremental

Setting this to true allows Xalan to begin producing output before it has finished processing the entire input document. This may cause problems if an error is detected late in the process, but it shouldn't be a big problem in fully debugged and tested environments.

http://apache.org/xalan/features/source_location

Setting this to true tells Xalan to provide a JAXP SourceLocator that a program can use to determine the location (line numbers, column numbers, system IDs, and public IDs) of individual nodes during the transform. It engenders a substantial performance hit, so it's turned off by default.

Other processors define their own attributes. Although TrAX is designed as a generic API, it does let you access such custom features with these two methods:

    public abstract void setAttribute(String name, Object value)
     throws IllegalArgumentException
    public abstract Object getAttribute(String name)
     throws IllegalArgumentException

For example, the following code tries to turn on incremental output:

    TransformerFactory xformFactory
     = TransformerFactory.newInstance();
    try {
      xformFactory.setAttribute(
       "http://apache.org/xalan/features/incremental", Boolean.TRUE);
    }
    catch (IllegalArgumentException e) {
      // This XSLT processor does not support the
      // http://apache.org/xalan/features/incremental attribute,
      // but we can still use the processor anyway
    }

If you're using any processor except Xalan-J 2.x, this will neither fail nor succeed: setAttribute() throws an IllegalArgumentException, the catch block ignores it, and the processor carries on with its default behavior. Using nonstandard attributes may limit the portability of your programs. However, most attributes (and all of the Xalan attributes) merely adjust how the processor achieves its result; they do not change the final result in any way.
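Programs that set several such attributes may prefer to wrap the try-catch idiom in a helper that reports whether the processor accepted the attribute. The sketch below assumes nothing beyond setAttribute() itself; the class name AttributeHelper and the method trySetAttribute() are hypothetical.

```java
import javax.xml.transform.TransformerFactory;

class AttributeHelper {

  // Tries to set a processor-specific attribute. Returns true if the
  // processor accepted it and false if the processor rejected it with
  // an IllegalArgumentException.
  static boolean trySetAttribute(TransformerFactory factory,
      String name, Object value) {
    try {
      factory.setAttribute(name, value);
      return true;
    } catch (IllegalArgumentException e) {
      // Unrecognized attribute; the factory is still usable as it was.
      return false;
    }
  }
}
```

The boolean result lets the caller log which tuning options actually took effect on the current processor instead of silently assuming they did.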

URI Resolution

An XSLT stylesheet can use the document() function to load additional source documents for processing. It can also import or include additional stylesheets with the xsl:import and xsl:include instructions. In all three cases, the document to load is identified by a URI.

Normally a Transformer simply loads the document at that URL. However, by using a URIResolver, you can redirect the request to a proxy server, to local copies, or to previously cached copies. This interface, summarized in Example 17.8, returns Source objects for a specified URL and an optional base. It is similar in intent to SAX's EntityResolver except that EntityResolver is based on public and system IDs, whereas this interface is based on URLs and base URLs.

Example 17.8 The TrAX URIResolver Interface
    package javax.xml.transform;

    public interface URIResolver {

      public Source resolve(String href, String base)
       throws TransformerException;

    }

The resolve() method should return a Source object if it successfully resolves the URL. Otherwise it should return null to indicate that the default URL resolution mechanism should be used. Example 17.9 is a simple URIResolver implementation that looks for a gzipped version of a document (that is, a file name that ends with .gz). If it finds one, it uses the java.util.zip.GZIPInputStream class to build a StreamSource from the gzipped document. Otherwise, it returns null, and the usual methods for resolving URLs are followed.

Example 17.9 A URIResolver Class
    import javax.xml.transform.Source;
    import javax.xml.transform.URIResolver;
    import javax.xml.transform.stream.StreamSource;
    import java.util.zip.GZIPInputStream;
    import java.net.URL;
    import java.io.InputStream;

    public class GZipURIResolver implements URIResolver {

      public Source resolve(String href, String base) {

        try {
          // Look for a gzipped copy of the requested document
          href = href + ".gz";
          URL context = new URL(base);
          URL u = new URL(context, href);
          InputStream in = u.openStream();
          GZIPInputStream gin = new GZIPInputStream(in);
          return new StreamSource(gin, u.toString());
        }
        catch (Exception e) {
          // If anything goes wrong, just return null and let
          // the default resolver try.
        }
        return null;

      }

    }

The following two methods in TransformerFactory set and get the URIResolver that the Transformer objects it creates will use to resolve URIs:

    public abstract void setURIResolver(URIResolver resolver)
    public abstract URIResolver getURIResolver()

For example,

    URIResolver resolver = new GZipURIResolver();
    factory.setURIResolver(resolver);
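A resolver does not have to touch the network or the file system at all. As a further illustration of the "previously cached copies" case, the following sketch serves documents out of a map held in memory. The class name InMemoryURIResolver and its register() method are hypothetical, and the base URL is deliberately ignored to keep the example short.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

class InMemoryURIResolver implements URIResolver {

  // Maps an href, exactly as written in the stylesheet, to XML text.
  private final Map<String, String> documents = new HashMap<>();

  void register(String href, String xml) {
    documents.put(href, xml);
  }

  @Override
  public Source resolve(String href, String base) {
    String xml = documents.get(href);
    if (xml == null) {
      // Not cached; let the default resolution mechanism try.
      return null;
    }
    StreamSource source = new StreamSource(new StringReader(xml));
    // Give the source a system ID so that the processor has
    // something to identify the document by.
    source.setSystemId(href);
    return source;
  }
}
```

Once populated with register(), an instance can be passed to setURIResolver() in the same way as the GZipURIResolver. Any href that was never registered falls through to the processor's normal loading, because resolve() returns null for it.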

Error Handling

XSLT transformations can fail for any of several reasons, including the following:

  • The stylesheet is syntactically incorrect.

  • The source document is malformed.

  • Some external resource that the processor needs to load, such as a document referenced by the document() function or the .class file that implements an extension function, is unavailable.

By default, any such problems are reported by printing them on System.err. You also can provide more sophisticated error handling, reporting, and logging by implementing the ErrorListener interface. This interface, shown in Example 17.10, is modeled after SAX's ErrorHandler interface. Indeed, aside from all of the arguments being TransformerExceptions instead of SAXExceptions, it's almost identical.

Example 17.10 The TrAX ErrorListener Interface
    package javax.xml.transform;

    public interface ErrorListener {

      public void warning(TransformerException exception)
       throws TransformerException;
      public void error(TransformerException exception)
       throws TransformerException;
      public void fatalError(TransformerException exception)
       throws TransformerException;

    }

Example 17.11 demonstrates with a simple class that uses the java.util.logging package, introduced in Java 1.4, to report errors rather than printing them on System.err. Each exception is logged to a Logger specified in the constructor. Unfortunately, the Logging API doesn't really have separate categories for fatal and nonfatal errors, so I just classify them both as "severe." [2]

[2] You could define a custom subclass of Level that did differentiate fatal and nonfatal errors, but because this is not a book about the Logging API, I leave that as an exercise for the reader.

Example 17.11 An ErrorListener That Uses the Logging API
    import javax.xml.transform.*;
    import java.util.logging.*;

    public class LoggingErrorListener implements ErrorListener {

      private Logger logger;

      public LoggingErrorListener(Logger logger) {
        this.logger = logger;
      }

      public void warning(TransformerException exception) {

        logger.log(Level.WARNING, exception.getMessage(), exception);
        // Don't throw an exception and stop the processor
        // just for a warning; but do log the problem

      }

      public void error(TransformerException exception)
       throws TransformerException {

        logger.log(Level.SEVERE, exception.getMessage(), exception);
        // XSLT is not as draconian as XML. There are numerous errors
        // which the processor may but does not have to recover from;
        // e.g., multiple templates that match a node with the same
        // priority. I do not want to allow that, so I throw this
        // exception here.
        throw exception;

      }

      public void fatalError(TransformerException exception)
       throws TransformerException {

        logger.log(Level.SEVERE, exception.getMessage(), exception);
        // This is an error which the processor cannot recover from;
        // e.g., a malformed stylesheet or input document,
        // so I must throw this exception here.
        throw exception;

      }

    }

The following two methods appear in both TransformerFactory and Transformer. They enable you to set and get the ErrorListener to which the object will report problems.

    public abstract void setErrorListener(ErrorListener listener)
     throws IllegalArgumentException
    public abstract ErrorListener getErrorListener()

An ErrorListener registered with a Transformer will report errors with the transformation. An ErrorListener registered with a TransformerFactory will report errors with the factory's attempts to create new Transformer objects. For example, the following code fragment installs separate LoggingErrorListeners on the TransformerFactory, as well as the Transformer object it creates, which will record messages in two different logs.

    TransformerFactory factory = TransformerFactory.newInstance();
    Logger factoryLogger
     = Logger.getLogger("com.macfaq.trax.factory");
    ErrorListener factoryListener
     = new LoggingErrorListener(factoryLogger);
    factory.setErrorListener(factoryListener);
    Source source = new StreamSource("FibonacciXMLRPC.xsl");
    Transformer stylesheet = factory.newTransformer(source);
    Logger transformerLogger
     = Logger.getLogger("com.macfaq.trax.transformer");
    ErrorListener transformerListener
     = new LoggingErrorListener(transformerLogger);
    stylesheet.setErrorListener(transformerListener);
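A listener can also simply collect messages so that the calling code decides what to do with them. The sketch below is one possible arrangement, not a recommended design: the class name CollectingErrorListener and the compiles() helper are hypothetical, and exactly which callbacks a given processor invokes for a broken stylesheet is left unspecified here because it varies between processors.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

class CollectingErrorListener implements ErrorListener {

  // Every problem reported to this listener, in the order received.
  final List<String> messages = new ArrayList<>();

  public void warning(TransformerException exception) {
    // Record warnings but let the processor continue.
    messages.add("warning: " + exception.getMessage());
  }

  public void error(TransformerException exception)
      throws TransformerException {
    messages.add("error: " + exception.getMessage());
    throw exception;
  }

  public void fatalError(TransformerException exception)
      throws TransformerException {
    messages.add("fatal: " + exception.getMessage());
    throw exception;
  }

  // Returns true if the stylesheet text can be turned into a
  // Transformer, reporting any problems to the given listener.
  static boolean compiles(String xsl, CollectingErrorListener listener) {
    TransformerFactory factory = TransformerFactory.newInstance();
    factory.setErrorListener(listener);
    try {
      factory.newTransformer(new StreamSource(new StringReader(xsl)));
      return true;
    } catch (TransformerException e) {
      // Covers TransformerConfigurationException as well.
      return false;
    }
  }
}
```

After a failed call to compiles(), the listener's messages list holds whatever the processor chose to report, which the caller can display, log, or attach to a fault response.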

Passing Parameters to Stylesheets

Top-level xsl:param and xsl:variable elements both define variables by binding a name to a value. This variable can be dereferenced elsewhere in the stylesheet using the form $name. Once set, the value of an XSLT variable is fixed and cannot be changed. There is an exception, however: if the variable is defined with a top-level xsl:param element instead of an xsl:variable element, then the default value can be changed before the transformation begins.

For example, the DocBook XSL stylesheets I used to generate this book have a number of parameters that set various formatting options. I used these settings:

    <xsl:param name="fop.extensions">1</xsl:param>
    <xsl:param name="page.width.portrait">7.375in</xsl:param>
    <xsl:param name="page.height.portrait">9.25in</xsl:param>
    <xsl:param name="page.margin.top">0.5in</xsl:param>
    <xsl:param name="page.margin.bottom">0.5in</xsl:param>
    <xsl:param name="region.before.extent">0.5in</xsl:param>
    <xsl:param name="body.margin.top">0.5in</xsl:param>
    <xsl:param name="page.margin.outer">1.0in</xsl:param>
    <xsl:param name="page.margin.inner">1.0in</xsl:param>
    <xsl:param name="body.font.family">Times</xsl:param>
    <xsl:param name="variablelist.as.blocks" select="1"/>
    <xsl:param name="generate.section.toc.level" select="1"/>
    <xsl:param name="generate.component.toc" select="0"/>

You can change the initial (and thus final) value of any parameter inside your Java code using these three methods of the Transformer class:

    public abstract void setParameter(String name, Object value)
    public abstract Object getParameter(String name)
    public abstract void clearParameters()

The setParameter() method provides a value for a parameter that overrides any value used in the stylesheet itself. The processor is responsible for converting the Java object type passed to a reasonable XSLT equivalent. This should work well enough for String, Integer, Double, and Boolean, as well as for DOM types such as Node and NodeList. I wouldn't rely on it for anything more complex, such as a File or a Frame.

The getParameter() method returns the value of a parameter previously set by Java. It will not return any value from the stylesheet itself, even if it has not been overridden by the Java code. Finally, the clearParameters() method eliminates all Java mappings of parameters, so that those variables are returned to whatever value is specified in the stylesheet.

For example, in Java the preceding list of parameters for the DocBook stylesheets could be set with a JAXP Transformer object, like this:

 transformer.setParameter("fop.extensions", "1");
 transformer.setParameter("page.width.portrait", "7.375in");
 transformer.setParameter("page.height.portrait", "9.25in");
 transformer.setParameter("page.margin.top", "0.5in");
 transformer.setParameter("region.before.extent", "0.5in");
 transformer.setParameter("body.margin.top", "0.5in");
 transformer.setParameter("page.margin.bottom", "0.5in");
 transformer.setParameter("page.margin.outer", "1.0in");
 transformer.setParameter("page.margin.inner", "1.0in");
 transformer.setParameter("body.font.family", "Times");
 transformer.setParameter("variablelist.as.blocks", "1");
 transformer.setParameter("generate.section.toc.level", "1");
 transformer.setParameter("generate.component.toc", "0");

Here I used strings for all of the values, but in a few cases I could have used a Number of some kind instead.
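The interplay of the three parameter methods can be sketched with a small self-contained program. This is only an illustration under stated assumptions: the class name ParameterDemo, its helper methods, and the one-parameter inline stylesheet are hypothetical, and only the page.margin.top parameter name is borrowed from the list above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ParameterDemo {

  // Hypothetical stylesheet: one top-level parameter with a default value,
  // whose current value is written out as plain text.
  static final String STYLESHEET =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:param name='page.margin.top'>0.5in</xsl:param>"
    + "<xsl:template match='/'>"
    + "<xsl:value-of select='$page.margin.top'/>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  // Compiles the stylesheet into a Transformer.
  static Transformer newTransformer() throws TransformerException {
    TransformerFactory factory = TransformerFactory.newInstance();
    return factory.newTransformer(
        new StreamSource(new StringReader(STYLESHEET)));
  }

  // Runs the transform against a trivial input document and
  // returns the output as a string.
  static String run(Transformer transformer) throws TransformerException {
    StringWriter out = new StringWriter();
    transformer.transform(
        new StreamSource(new StringReader("<doc/>")),
        new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws TransformerException {
    Transformer transformer = newTransformer();

    // No Java-side value yet, so the stylesheet default is used.
    System.out.println(run(transformer));

    // Override the default from Java.
    transformer.setParameter("page.margin.top", "1.0in");
    System.out.println(transformer.getParameter("page.margin.top"));
    System.out.println(run(transformer));

    // Drop the Java-side value; the stylesheet default applies again.
    transformer.clearParameters();
    System.out.println(run(transformer));
  }
}
```

The same Transformer object is reused for all three runs, which is why clearParameters() is needed to get back to the stylesheet's own default.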

Output Properties

XSLT is defined in terms of a transformation from one tree to a different tree, all of which takes place in memory. The actual conversion of that tree to a stream of bytes or a file is an optional step. If that step is taken, the xsl:output instruction controls the details of serialization. For example, it can specify XML, HTML, or plain text output. It can specify the encoding of the output, what the document type declaration points to, whether the elements should be indented, what the value of the standalone declaration is, where CDATA sections should be used, and more. For example, adding this xsl:output element to a stylesheet would produce plain text output instead of XML:

 <xsl:output
   method="text"
   encoding="US-ASCII"
   media-type="text/plain"
 />

This xsl:output element asks for "pretty-printed" XML:

 <xsl:output
   method="xml"
   encoding="UTF-16"
   indent="yes"
   media-type="text/xml"
   standalone="yes"
 />

In all, there are ten attributes of the xsl:output element that control serialization of the result tree:

method="xml | html | text"

The output method. xml is the default. html uses classic HTML syntax, such as <hr> instead of <hr />. text outputs plain text but no markup.

version="1.0"

The version number used in the XML declaration. Currently, this should always have the value 1.0.

encoding="UTF-8 | UTF-16 | ISO-8859-1 | ..."

The encoding used for the output and in the encoding declaration of the output document.

omit-xml-declaration="yes | no"

yes if the XML declaration should be omitted, no if it should be included. The default is no.

standalone="yes | no"

The value of the standalone attribute for the XML declaration: either yes or no.

doctype-public="public ID"

The public identifier used in the DOCTYPE declaration.

doctype-system="URI"

The URL used as a system identifier in the DOCTYPE declaration.

cdata-section-elements="element_name_1 element_name_2 ..."

A white-space-separated list of the qualified names of the elements whose content should be output as a CDATA section.

indent="yes | no"

yes if extra white space should be added to "pretty print" the result, no otherwise. The default is no.

media-type="text/xml | text/html | text/plain | application/xml | ..."

The MIME media type of the output, such as text/html, application/xml, or image/svg+xml.

Note

All of these output properties are at the discretion of the XSLT processor. The processor is not required to serialize the result tree at all, much less to serialize it with extra white space, a document type declaration, and so forth. In particular, I have encountered XSLT processors that only partially support indent="yes".


You can also control these output properties from inside your Java programs using these four methods in the Transformer class. You can set them either one by one or as a group with the java.util.Properties class.

 public abstract void setOutputProperties(Properties outputFormat)
  throws IllegalArgumentException
 public abstract Properties getOutputProperties()
 public abstract void setOutputProperty(String name, String value)
  throws IllegalArgumentException
 public abstract String getOutputProperty(String name)

The keys and values for these properties are simply the string names established by the XSLT 1.0 specification. For convenience, the javax.xml.transform.OutputKeys class in Example 17.12 provides named constants for all of the property names.

Example 17.12 The TrAX OutputKeys Class
 package javax.xml.transform;

 public class OutputKeys {

   private OutputKeys() {}

   public static final String METHOD = "method";
   public static final String VERSION = "version";
   public static final String ENCODING = "encoding";
   public static final String OMIT_XML_DECLARATION
    = "omit-xml-declaration";
   public static final String STANDALONE = "standalone";
   public static final String DOCTYPE_PUBLIC = "doctype-public";
   public static final String DOCTYPE_SYSTEM = "doctype-system";
   public static final String CDATA_SECTION_ELEMENTS
    = "cdata-section-elements";
   public static final String INDENT = "indent";
   public static final String MEDIA_TYPE = "media-type";

 }

For example, the following Java code fragment has the same effect as the preceding xsl:output element:

 transformer.setOutputProperty(OutputKeys.METHOD, "xml");
 transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
 transformer.setOutputProperty(OutputKeys.INDENT, "yes");
 transformer.setOutputProperty(OutputKeys.MEDIA_TYPE, "text/xml");
 transformer.setOutputProperty(OutputKeys.STANDALONE, "yes");

In the event of a conflict between what the Java code requests (by setting output properties) and what the stylesheet requests (with an xsl:output element), the properties specified in the Java code take precedence.
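To see the precedence rule at work, here is a minimal sketch that sets properties as a group with java.util.Properties. The class name OutputPropertyDemo, its transform() helper, and the inline stylesheet are assumptions made up for this illustration. The stylesheet explicitly asks for an XML declaration, and the Java code asks for it to be omitted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OutputPropertyDemo {

  // Hypothetical stylesheet whose xsl:output element asks that the
  // XML declaration be included.
  static final String STYLESHEET =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='xml' omit-xml-declaration='no'/>"
    + "<xsl:template match='/'><greeting>hello</greeting></xsl:template>"
    + "</xsl:stylesheet>";

  // Transforms a trivial document. If overrides is non-null, those
  // output properties are set from Java before the transform runs.
  static String transform(Properties overrides) throws TransformerException {
    Transformer transformer = TransformerFactory.newInstance()
        .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
    if (overrides != null) {
      transformer.setOutputProperties(overrides);
    }
    StringWriter out = new StringWriter();
    transformer.transform(
        new StreamSource(new StringReader("<doc/>")),
        new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws TransformerException {
    // Stylesheet settings only: the XML declaration is written.
    System.out.println(transform(null));

    // The Java request conflicts with xsl:output and should win.
    Properties overrides = new Properties();
    overrides.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    System.out.println(transform(overrides));
  }
}
```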

Sources and Results

The Source and Result interfaces abstract out the API-dependent details of how an XML document is represented. You can construct sources from DOM nodes, SAX event sequences, and raw streams. You can target the result of a transform at a DOM Node, a SAX ContentHandler, or a stream-based target such as an OutputStream, Writer, File, or String. Other models may provide their own implementations of these interfaces. For example, JDOM has an org.jdom.transform package that includes a JDOMSource and JDOMResult class.

In fact, these different models have very little in common, other than that they all hold an XML document. Consequently, the Source and Result interfaces don't themselves provide much of the functionality you need, just methods to get and set the system ID of the document. Everything else is deferred to the implementations. Indeed, XSLT engines generally need to work directly with the implementing classes rather than with the generic interfaces, and not all engines are able to process all three kinds of sources and targets. Polymorphism just doesn't work very well here.

Note

It is important to set at least the system IDs of your sources, because some parts of the stylesheet may rely on this. In particular, if any of your xsl:import or xsl:include elements or document() functions contain relative URLs, then they will be resolved relative to the URL of the stylesheet source.


DOMSource and DOMResult

A DOMSource is a wrapper around a DOM Node. The DOMSource class, shown in Example 17.13, provides methods to set and get the node that serves as the root of the transform, as well as the system ID of that node.

Example 17.13 The TrAX DOMSource Class
 package javax.xml.transform.dom;

 public class DOMSource implements Source {

   public static final String FEATURE =
     "http://javax.xml.transform.dom.DOMSource/feature";

   public DOMSource() {}
   public DOMSource(Node node);
   public DOMSource(Node node, String systemID);

   public void   setNode(Node node);
   public Node   getNode();
   public void   setSystemId(String baseID);
   public String getSystemId();

 }

In theory, you should be able to convert any DOM Node object into a DOMSource and transform it. In practice, transforming document nodes is all that is truly reliable. (It's not even clear that the XSLT processing model applies to anything that isn't a complete document.) In my tests, Xalan-J could transform all of the nodes I threw at it. However, Saxon could only transform Document objects and Element objects that were part of a document tree.

A DOMResult is a wrapper around a DOM Document, DocumentFragment, or Element Node to which the output of the transform will be appended. The DOMResult class, shown in Example 17.14, provides constructors and methods to set and get the node that serves as the root of the transform, as well as the system ID of that node.

Example 17.14 The TrAX DOMResult Class
 package javax.xml.transform.dom;

 public class DOMResult implements Result {

   public static final String FEATURE =
     "http://javax.xml.transform.dom.DOMResult/feature";

   public DOMResult();
   public DOMResult(Node node);
   public DOMResult(Node node, String systemID);

   public void   setNode(Node node);
   public Node   getNode();
   public void   setSystemId(String systemId);
   public String getSystemId();

 }

If you specify a Node for the result, either via the constructor or by calling setNode(), then the output of the transform will be appended to that node's children. Otherwise, the transform output will be appended to a new Document or DocumentFragment Node. The getNode() method returns this Node.
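Both ways of using a DOMResult can be sketched with an identity transform, that is, a Transformer created by the no-argument newTransformer() method, which copies its input to its output. The class DomToDomDemo and its helper methods are hypothetical names invented for this illustration, as are the greeting and wrapper element names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomToDomDemo {

  // Builds a document with a single root element of the given name
  // that contains the given text (or no text if text is null).
  static Document newDocument(String rootName, String text) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    Document doc = factory.newDocumentBuilder().newDocument();
    Element root = doc.createElementNS(null, rootName);
    if (text != null) {
      root.appendChild(doc.createTextNode(text));
    }
    doc.appendChild(root);
    return doc;
  }

  // No node is supplied, so the processor creates the result node itself.
  static Node copyToNewNode(Document input) throws Exception {
    Transformer identity = TransformerFactory.newInstance().newTransformer();
    DOMResult result = new DOMResult();
    identity.transform(new DOMSource(input), result);
    return result.getNode();
  }

  // A node is supplied, so the output is appended to its children.
  static void appendTo(Document input, Node parent) throws Exception {
    Transformer identity = TransformerFactory.newInstance().newTransformer();
    identity.transform(new DOMSource(input), new DOMResult(parent));
  }

  public static void main(String[] args) throws Exception {
    Document input = newDocument("greeting", "hello");

    Node copy = copyToNewNode(input);
    System.out.println(copy.getFirstChild().getNodeName());

    Document target = newDocument("wrapper", null);
    appendTo(input, target.getDocumentElement());
    System.out.println(
        target.getDocumentElement().getFirstChild().getNodeName());
  }
}
```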

SAXSource and SAXResult

The SAXSource class, shown in Example 17.15, provides input to the XSLT processor that an XMLReader reads from a SAX InputSource.

Example 17.15 The TrAX SAXSource Class
 package javax.xml.transform.sax;

 public class SAXSource implements Source {

   public static final String FEATURE =
    "http://javax.xml.transform.sax.SAXSource/feature";

   public SAXSource();
   public SAXSource(XMLReader reader, InputSource inputSource);
   public SAXSource(InputSource inputSource);

   public void        setXMLReader(XMLReader reader);
   public XMLReader   getXMLReader();
   public void        setInputSource(InputSource inputSource);
   public InputSource getInputSource();
   public void        setSystemId(String systemID);
   public String      getSystemId();

   public static InputSource sourceToInputSource(Source source);

 }

The SAXResult class, shown in Example 17.16, receives output from the XSLT processor as a stream of SAX events fired at a specified ContentHandler and optional LexicalHandler.

Example 17.16 The TrAX SAXResult Class
 package javax.xml.transform.sax;

 public class SAXResult implements Result {

   public static final String FEATURE =
    "http://javax.xml.transform.sax.SAXResult/feature";

   public SAXResult();
   public SAXResult(ContentHandler handler);

   public void           setHandler(ContentHandler handler);
   public ContentHandler getHandler();
   public void           setLexicalHandler(LexicalHandler handler);
   public LexicalHandler getLexicalHandler();
   public void           setSystemId(String systemId);
   public String         getSystemId();

 }

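As a rough sketch of how these two classes fit together, the following program feeds a SAXSource through an identity transform into a SAXResult whose ContentHandler simply counts elements. The SaxPipelineDemo class, its ElementCounter handler, and the countElements() method are hypothetical names used only for this illustration. No XMLReader is supplied, so the choice of parser is left to the processor.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxPipelineDemo {

  // Receives the transform output as SAX events and counts start-tags.
  static class ElementCounter extends DefaultHandler {
    int count = 0;

    @Override
    public void startElement(String uri, String localName,
        String qName, Attributes atts) {
      count++;
    }
  }

  // Parses the XML text through a SAXSource, copies it unchanged with
  // an identity transform, and reports how many elements came out.
  static int countElements(String xml) throws TransformerException {
    SAXSource source = new SAXSource(new InputSource(new StringReader(xml)));
    ElementCounter counter = new ElementCounter();
    Transformer identity = TransformerFactory.newInstance().newTransformer();
    identity.transform(source, new SAXResult(counter));
    return counter.count;
  }

  public static void main(String[] args) throws TransformerException {
    System.out.println(countElements("<a><b/><b/></a>"));
  }
}
```

Because the result is delivered as events rather than as a tree or a byte stream, nothing is serialized here, and output properties such as indent have no text to act on.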
StreamSource and StreamResult

The StreamSource and StreamResult classes are used as sources and targets for transforms from sequences of bytes and characters. These include streams, readers, writers, strings, and files. What unifies these is that none of them know they contain an XML document. Indeed, on input they may not always contain an XML document, in which case an exception will be thrown as soon as you attempt to build a Transformer or a Templates object from the StreamSource.

The StreamSource class, shown in Example 17.17, provides constructors and methods to get and set the actual source of data.

Example 17.17 The TrAX StreamSource Class
 package javax.xml.transform.stream;

 public class StreamSource implements Source {

   public static final String FEATURE =
    "http://javax.xml.transform.stream.StreamSource/feature";

   public StreamSource();
   public StreamSource(InputStream inputStream);
   public StreamSource(InputStream inputStream, String systemID);
   public StreamSource(Reader reader);
   public StreamSource(Reader reader, String systemID);
   public StreamSource(String systemID);
   public StreamSource(File f);

   public void        setInputStream(InputStream inputStream);
   public InputStream getInputStream();
   public void        setReader(Reader reader);
   public Reader      getReader();
   public void        setPublicId(String publicID);
   public String      getPublicId();
   public void        setSystemId(String systemID);
   public String      getSystemId();
   public void        setSystemId(File f);

 }

Avoid specifying both an InputStream and a Reader. If both of these are specified, then which one the processor reads from is implementation dependent. If neither an InputStream nor a Reader is available, then the processor will attempt to open a connection to the URI specified by the system ID. Be sure to set the system ID even if you do specify an InputStream or a Reader, because this will be needed to resolve relative URLs that appear inside the stylesheet and input document.
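A short sketch may make this concrete. It builds each StreamSource from a Reader plus a system ID, and shows the exception that results when the stream does not contain an XML document. The StreamSourceDemo class, its helper methods, and the http://www.example.com/style.xsl URL are all hypothetical. Nothing is fetched from that URL, because the stylesheet text comes from the Reader and contains no relative references.

```java
import java.io.StringReader;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class StreamSourceDemo {

  // A minimal but complete stylesheet.
  static final String GOOD_STYLESHEET =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:template match='/'><out/></xsl:template>"
    + "</xsl:stylesheet>";

  // Wraps text in a StreamSource: exactly one Reader, plus a system ID
  // the processor could use as the base for relative URLs.
  static StreamSource fromString(String text, String systemId) {
    return new StreamSource(new StringReader(text), systemId);
  }

  // Returns true if the processor can build a Transformer from the text.
  // The processor may also report the problem on System.err.
  static boolean compiles(String stylesheetText) {
    try {
      TransformerFactory.newInstance().newTransformer(
          fromString(stylesheetText, "http://www.example.com/style.xsl"));
      return true;
    } catch (TransformerConfigurationException e) {
      // The stream did not hold a usable stylesheet.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(compiles(GOOD_STYLESHEET));
    System.out.println(compiles("this is not an XML document"));
  }
}
```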

The StreamResult class, shown in Example 17.18, provides constructors and methods to get and set the actual target of the data.

Example 17.18 The TrAX StreamResult Class
 package javax.xml.transform.stream;

 public class StreamResult implements Result {

   public static final String FEATURE =
    "http://javax.xml.transform.stream.StreamResult/feature";

   public StreamResult() {}
   public StreamResult(OutputStream outputStream);
   public StreamResult(Writer writer);
   public StreamResult(String systemID);
   public StreamResult(File f);

   public void         setOutputStream(OutputStream outputStream);
   public OutputStream getOutputStream();
   public void         setWriter(Writer writer);
   public Writer       getWriter();
   public void         setSystemId(String systemID);
   public void         setSystemId(File f);
   public String       getSystemId();

 }

Be sure to specify the system ID URL and only one of the other identifiers (File, OutputStream, Writer, or String). If you specify more than one possible target, then which one the processor chooses is implementation dependent.
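One practical difference among the targets is that an OutputStream receives bytes, so the processor itself applies the encoding requested through the output properties, whereas a Writer receives characters and encodes them however it was constructed. The following sketch, in which the StreamResultDemo class and its serialize() method are hypothetical names, writes to a single OutputStream target and leaves the system ID unset for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StreamResultDemo {

  // Copies the XML text with an identity transform into a byte array,
  // asking the processor to serialize in the given encoding.
  static byte[] serialize(String xml, String encoding)
      throws TransformerException {
    Transformer identity = TransformerFactory.newInstance().newTransformer();
    identity.setOutputProperty(OutputKeys.ENCODING, encoding);
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    // Exactly one target is specified: the OutputStream.
    identity.transform(
        new StreamSource(new StringReader(xml)),
        new StreamResult(bytes));
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws TransformerException {
    byte[] latin1 = serialize("<doc>caf\u00e9</doc>", "ISO-8859-1");
    // Decode with the same encoding that was requested for output.
    System.out.println(new String(latin1, StandardCharsets.ISO_8859_1));
  }
}
```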



Processing XML with Java. A Guide to SAX, DOM, JDOM, JAXP, and TrAX
ISBN: 0201771861
EAN: 2147483647
Year: 2001
Pages: 191
