Invoking the Saxon Processor | NetBeans™ IDE Field Guide: Developing Desktop, Web, Enterprise, and Mobile Applications (2nd Edition)

Saxon is written in Java and implements the transformation API defined in JAXP 1.2, which is described in detail in Appendix D. This allows it to be invoked from a Java application. There is also a command line interface. In addition, Saxon provides a servlet wrapper allowing a stylesheet to be invoked directly from a URL entered at a browser; however, this is more in the nature of a demonstration application than a real part of the product.

There is no graphical user interface; however, Saxon can be used with visual tools such as Stylus Studio (http://www.sonicsoftware.com) or XMLSpy (http://www.xmlspy.com) to provide a friendlier front end.

Using Saxon from the Command Line

If you are using Saxon on a Windows platform (and even more so if you are running on a Mac), then you may not be accustomed to using the command line to run applications. You can do this from the standard MS-DOS console that comes with Windows, but I wouldn't recommend it because it's too difficult to correct your typing mistakes and to stop output scrolling off the screen. It is far better to install a text editor that includes a Windows-friendly command line capability. I use the open-source jEdit editor (from www.jedit.org), mainly because it has good Unicode support. For jEdit you'll need to install the Console plugin, which is an optional component.

You can then run a transformation using Saxon. Assuming all the files are in your current directory, you can use the command

  java -jar saxon8.jar source.xml style.xsl

which will send the output of the transformation to the standard output (that is, to the console window).

This runs the main program contained in the JAR file, which is «net.sf.saxon.Transform» (or «com.saxonica.Transform» for the schema-aware version of the product). You can also invoke this entry point directly, once you have successfully added the JAR file to the classpath, by writing:

  java net.sf.saxon.Transform source.xml style.xsl

You will need to use this form if your stylesheet tries to load extension functions from the classpath.

There are a number of options you can use on the command line. These are written immediately before the name of the source file, for example:

  java net.sf.saxon.Transform -t -w2 source.xml style.xsl

The command line options are as follows:

Option	Description
-a	Use the <?xml-stylesheet?> processing instruction in the source document to identify the stylesheet to be used. The stylesheet argument should then be omitted
-c	Indicates that the stylesheet parameter is not a source XML stylesheet, but a stylesheet that has been previously compiled using the net.sf.saxon.Compile command
-ds -dt	Selects the implementation of the internal tree model. -dt selects the "tinytree" model (the default). -ds selects the traditional tree model. This is a performance tuning option: The tinytree model is faster to build, and occupies less memory, but is sometimes slower to navigate. The default is -dt
-im mode	Specifies the initial mode: The transformation will start by looking for a template rule that matches the document node of the source document in the specified mode
-it template	Specifies the name of the initial template. The transformation will start by evaluating this named template. In this case, the source filename argument should be omitted (the stylesheet must obtain all the data it needs using the document() function, or via stylesheet parameters)
-l	Switches line numbering on for the source document. Line numbers are accessible through the extension function saxon:line-number(), or from a trace listener. Line numbering will be switched on automatically if the -T option is used
-m classname	Specifies the full name of a Java class used to process the output of <xsl:message> instructions in the stylesheet. Details are available in the Saxon documentation
-noext	This option prevents the stylesheet calling extension functions, which is an important security measure if the stylesheet code is untrusted
-o filename	Defines a filename to contain the output of the transformation. You must specify this option if the stylesheet creates multiple output files, as the filenames for secondary output files (created using <xsl:result-document>) will be interpreted relative to the location of this primary output file
-r classname	Specifies the full name of a Java class that implements the JAXP URIResolver interface: This will be used to resolve all URIs used in <xsl:include>, <xsl:import>, or in the doc() and document() functions
-t	This option causes Saxon to display information about the Saxon and Java versions in use, and progress messages indicating which files are being processed and how long the key stages of processing took
-T	Traces execution of the stylesheet. Each instruction is traced as it is executed, identifying the instruction and the current location in the source document by line number. The trace is written to System.err. It is written in the form of an XML document, so if you want to analyze the trace, you can write a stylesheet to do it
-TJ	Traces the loading of Java extension functions. This is a useful debugging aid if you are having problems in this area
-TL classname	Traces execution with a user-defined trace routine. Details are available in the Saxon documentation
-u	Indicates that the names of the source document and stylesheet given on the command line are to be interpreted as URLs rather than file names. (If the names start with «http:» or «file:», this will be assumed automatically)
-v	Requests the XML parser to perform DTD-based validation of all source documents
-val	Performs schema-based validation of all source documents. This option is available only with the schema-aware version of the Saxon product
-wN (where N is 0, 1 or 2)	Indicates how XSLT-defined recoverable errors are to be handled. w0 means recover silently; w1 means output a warning message and continue; and w2 means treat the error as fatal.
-x classname	Defines the XML parser to be used for the source document, and for any additional document loaded using the document() function. The classname must be the name of a parser that implements the SAX2 org.xml.sax.XMLReader interface
-y classname	Defines the XML parser to be used for the stylesheet document, and for any additional stylesheet module loaded using <xsl:include> or <xsl:import>. This parser is also used when parsing a schema. The classname must be the name of a parser that implements the SAX2 org.xml.sax.XMLReader interface

(Why would you want to use different parsers for the source document and the stylesheet? One reason is that the source document might not really be XML; see the GEDCOM example in Chapter 11. Another reason is that you might want to use a validating parser for the source document, but not for the stylesheet.)

You can specify values for global parameters defined in the stylesheet using a keyword=value notation; for example:

  java net.sf.saxon.Transform source.xml style.xsl param1=value1 param2=value2

If the parameter names have a non-null namespace, you can use Clark notation for expanded names, for example «{namespace-uri}local-name». The parameter values are interpreted as strings. If the string contains a space, you should enclose it in quotes, for example «param1="John Brown"».

If you want to pass an XML document as a parameter to the stylesheet, you can do this by prefixing the parameter name with «+ » and supplying the name of the XML file as the parameter value. For example:

  java net.sf.saxon.Transform source.xml style.xsl +lookup=lookup.xml

The XML contained in lookup.xml will be parsed, and the document node of the resulting tree will be passed to the stylesheet as the value of the stylesheet parameter named «lookup».

You can also override <xsl:output> attributes using a similar notation, but prefixing the keyword with «!». For example, to get indented output write:

  java net.sf.saxon.Transform source.xml style.xsl !indent=yes
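For comparison, the same two ideas (supplying a stylesheet parameter and overriding an <xsl:output> attribute) can be expressed through the standard JAXP interface. The following is only a sketch: the class name ParamsAndOutput, the inline stylesheet, and the parameter name param1 are made up for illustration, and the code runs against whichever XSLT processor JAXP loads by default, not necessarily Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ParamsAndOutput {
    // Hypothetical stylesheet with one global parameter, kept to XSLT 1.0
    // so that it also runs on the processor bundled with the JDK.
    static final String STYLE =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='param1'/>"
      + "<xsl:template match='/'>"
      + "<out><greeting><xsl:value-of select='$param1'/></greeting></out>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xml, String value) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLE)));
        // Counterpart of param1="John Brown" on the command line.
        // A namespaced parameter would use Clark notation: "{namespace-uri}local-name".
        t.setParameter("param1", value);
        // Counterpart of !indent=yes on the command line.
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

Calling ParamsAndOutput.run("<doc/>", "John Brown") should produce a document whose greeting element contains the supplied string; the exact indentation depends on the processor.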

Using Saxon from a Java Application

Saxon can be invoked from a Java application by using the JAXP API, which is described in Appendix D. This allows you to compile a stylesheet into a Templates object, which can then be used repeatedly (in series or in multiple threads) to process different source documents through the same stylesheet. This can greatly improve throughput on a Web server. A sample application to achieve this, in the form of a Java servlet, is provided with the product.
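A minimal sketch of the compile-once, run-many pattern follows. It uses only the standard javax.xml.transform classes; the class name CompiledStylesheet and the inline stylesheet are assumptions made for the example, and the factory is whichever one JAXP selects (see the discussion of choosing Saxon explicitly later in this section).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CompiledStylesheet {
    // Hypothetical stylesheet: outputs the number of item elements as text.
    static final String STYLE =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='count(//item)'/></xsl:template>"
      + "</xsl:stylesheet>";

    // The Templates object is the compiled stylesheet. It can be shared
    // between threads, so it is built once and kept.
    private final Templates templates;

    public CompiledStylesheet() throws TransformerException {
        templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(STYLE)));
    }

    // A Transformer is not thread-safe, so each call creates its own;
    // this is cheap compared with recompiling the stylesheet.
    public String apply(String xml) throws TransformerException {
        Transformer t = templates.newTransformer();
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

In a servlet, the CompiledStylesheet instance would typically be created during initialization and apply() called once per request.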

Saxon implements the whole of the javax.xml.transform package, including the dom, sax, and stream subpackages, both for input and output. It also implements the SAXTransformerFactory, which allows you to do the transformation as part of a SAX pipeline.
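As an illustration of the SAX pipeline case, the sketch below feeds parser events straight into a TransformerHandler obtained from a SAXTransformerFactory. The class name SaxPipeline and the inline stylesheet are hypothetical, and the code uses only standard JAXP and SAX calls, so it is not specific to Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SaxPipeline {
    // Hypothetical stylesheet: outputs the number of LINE elements as text.
    static final String STYLE =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='count(//LINE)'/></xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xml) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        // Not every processor supports SAX pipelines, so check before casting.
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("processor does not support SAX pipelines");
        }
        SAXTransformerFactory stf = (SAXTransformerFactory) tf;

        // The handler is a SAX ContentHandler that applies the stylesheet
        // to the events it receives.
        TransformerHandler handler =
            stf.newTransformerHandler(new StreamSource(new StringReader(STYLE)));
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // Stylesheet processing requires a namespace-aware parser.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }
}
```

Any SAX filter could be inserted between the reader and the handler, which is the main attraction of this style.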

The saxon8.jar package includes a file that has the effect of causing the JAXP TransformerFactory to choose Saxon as the default XSLT processor. It can be tricky to ensure that Saxon is loaded, now that JDK 1.4 includes an XSLT implementation (Xalan) as a standard component. The best policy, if you require Saxon because your stylesheet is written in XSLT 2.0, is to select it explicitly. There are several ways this can be achieved:

You can choose Saxon by setting the Java system property named javax.xml.transform.TransformerFactory to the value net.sf.saxon.TransformerFactoryImpl. Use the -D option on the Java command when you invoke your application. Note that this goes before the name of the class to be executed:
```
 java -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl com.my-com.appl.Program
```
This all goes on one line. In practice of course you won't want to type this more than once, so create a batch file or shell script using your text editor, and invoke this instead.
Create a file called jaxp.properties within the directory $JAVA_HOME/lib (where $JAVA_HOME is the directory containing your Java installation), and include in this file a line of the form key=value, where key is the property key javax.xml.transform.TransformerFactory and value is the Saxon class net.sf.saxon.TransformerFactoryImpl.
Put the call
```
 System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl")
```
in your application, to be executed at runtime. This is the only technique that works if you want to run several different JAXP processors from the same application, perhaps in order to compare their results or to benchmark their performance.
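Whichever technique is used, it is worth checking which processor was actually loaded. The sketch below sets the system property at runtime and falls back to the platform default if the Saxon class cannot be found. The class name FactoryChooser and the fallback policy are assumptions for illustration; an application that requires XSLT 2.0 would more likely want to fail than fall back.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

public class FactoryChooser {
    static final String KEY = "javax.xml.transform.TransformerFactory";
    static final String SAXON = "net.sf.saxon.TransformerFactoryImpl";

    // Tries to load Saxon; if it is not on the classpath, restores the
    // previous setting and returns the default processor instead.
    static TransformerFactory chooseFactory() {
        String previous = System.getProperty(KEY);
        System.setProperty(KEY, SAXON);
        try {
            return TransformerFactory.newInstance();
        } catch (TransformerFactoryConfigurationError e) {
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
            return TransformerFactory.newInstance();
        }
    }
}
```

Printing FactoryChooser.chooseFactory().getClass().getName() shows at a glance whether Saxon or another implementation was selected.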

If you want to control the choice of XML parser within your application, or to configure the setting of the XML parser, the best approach is to supply source documents in the form of a SAXSource, which encapsulates the XML parser (an instance of org.xml.sax.XMLReader) to be used. To do this for documents loaded with the document() function, write your own custom URIResolver.
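The following sketch shows both halves of this advice using standard JAXP and SAX classes only. The class name ParserChoice, the method names, and the particular parser settings are illustrative assumptions; the resolver would be registered with Transformer.setURIResolver() so that it is consulted for document() calls.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.net.URI;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class ParserChoice {
    // Parser configuration chosen by the application (example settings).
    static SAXParserFactory parserFactory() {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setValidating(false);
        return spf;
    }

    // Copies a document through an identity transformation, parsing it
    // with the explicitly configured XMLReader wrapped in a SAXSource.
    static String copy(String xml) throws Exception {
        XMLReader reader = parserFactory().newSAXParser().getXMLReader();
        SAXSource source = new SAXSource(reader, new InputSource(new StringReader(xml)));
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    // A URIResolver that makes secondary documents use the same parser setup.
    static URIResolver resolver() {
        return (href, base) -> {
            try {
                String systemId = (base == null || base.isEmpty())
                    ? href
                    : new URI(base).resolve(href).toString();
                XMLReader reader = parserFactory().newSAXParser().getXMLReader();
                return new SAXSource(reader, new InputSource(systemId));
            } catch (Exception e) {
                throw new TransformerException(e);
            }
        };
    }
}
```

A new XMLReader is created for each document because a reader cannot safely parse two documents at once.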

Saxon Tree Models

Saxon defines an internal interface, the NodeInfo interface, to represent the XPath data model, and it is capable of transforming any data source that supplies an implementation of this interface. There are four implementations of this interface available:

The default is the tinytree, which as the name implies, is optimized for space, but also turns out to be the fastest implementation under many circumstances.
The original model is called the Standard Tree, now something of a misnomer, which is sometimes faster to navigate than the tinytree but takes longer to build and occupies more space.
There is an implementation of NodeInfo that wraps a standard level-2 DOM.
There is another implementation of NodeInfo that wraps a JDOM tree (see www.jdom.org).

If none of these are suitable, you can in principle write your own. For example, you could write an implementation of NodeInfo that fetches the underlying data from a relational database.

Using XPath Expressions in Saxon

It's likely that JDK 1.5 will define a standard interface for executing XPath expressions from Java, but in the absence of a standard, Saxon provides its own API. The relevant classes are in package net.sf.saxon.xpath.

In outline, what you need to do is:

Create a JAXP Source object, for example:

  SAXSource source = new SAXSource(new InputSource(new File("source.xml").toURI().toString()));

Create an XPathEvaluator:

  XPathEvaluator xpath = new XPathEvaluator(source);

If you want to define an expression that contains variables, declare the variables:

  StandaloneContext sc = (StandaloneContext)xpath.getStaticContext();
  Variable param = sc.declareVariable("p", "");

Define the XPath expression:

  XPathExpression search = xpath.createExpression("//LINE[contains(., $p)]");

Set the values of the variables:
```
  param.setValue("apple");
```
Evaluate the XPath expression:
```
  List results = search.evaluate();  
```

There are many variations on this theme: For example, you can get the results as an Iterator rather than as a List, and there is an evaluateSingle() method, which is useful when you know the XPath expression will return a single value.
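The standard interface anticipated above did appear, as the javax.xml.xpath package. As a point of comparison, the same sequence of steps can be sketched with that API in place of Saxon's own classes. Note that this is a substitute technique: it does not use net.sf.saxon.xpath, the class name LineSearch is made up, and the XPath processor bundled with the JDK supports XPath 1.0 only.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LineSearch {
    // Counts the LINE elements whose text contains the given word.
    static int countMatches(String xml, String word) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Variables are supplied through a resolver; here it simply looks
        // the name up in a map. It must be registered before compiling.
        Map<QName, Object> variables = new HashMap<>();
        xpath.setXPathVariableResolver(variables::get);

        // Define the XPath expression.
        XPathExpression search = xpath.compile("//LINE[contains(., $p)]");

        // Set the value of the variable.
        variables.put(new QName("p"), word);

        // Evaluate against the source document, asking for a node set.
        NodeList results = (NodeList) search.evaluate(
            new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return results.getLength();
    }
}
```

The compiled expression can be evaluated again after changing the value stored in the map, which mirrors calling setValue() on the Saxon Variable object.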