
  

6.4 XSLT tools

6.4.1 Processors

Saxon and the rest. After an XML editor, an XSLT processor is the first tool any XML web developer needs, and a lot of parameters of your project will depend on the capabilities of your chosen processor. Much of this book focuses on Saxon, [55] the primary reason being its support for XSLT and XPath 2.0. Saxon has many other benefits as well: Java extensibility, exceptional standards compliance, good performance, and active development.

[55] saxon.sf.net



Still, there are many other XSLT processors out there. This section is not intended to be a comprehensive directory; I mention only those processors that I tested and found interesting in some respect. Unless otherwise noted, all tools in this section are open source and support XSLT 1.0.

The speed race. Speed is usually one of the main concerns of those who shop around for an XSLT processor. This is understandable, given that XSLT is typically orders of magnitude slower than traditional non-XML text processing tools, even when their tasks appear comparable in complexity. Unfortunately, it is very hard to tell which processor is the fastest, as speed depends heavily on the stylesheet, the source documents, and (for Java- and Python-based processors) the virtual machine used. XSLTMark [56] is a benchmark suite often used to measure the performance of XSLT processors, but its results are to be taken with a grain of salt. So are the notes below, which nonetheless might give you some idea of what to expect from major processors.

[56] www.datapower.com/xml_community/xsltmark.html
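Because the numbers depend so much on your own stylesheet and documents, it can be worth timing them yourself. The following is only a minimal sketch, assuming a Java processor reachable through the standard JAXP API; the class and method names (XsltTiming, meanMillis) are made up for illustration. It compiles the stylesheet once, performs a few untimed warm-up runs so that the virtual machine has a chance to optimize, and then averages the timed runs.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of any processor's API.
class XsltTiming {

    /** Returns the mean time per transformation in milliseconds, excluding stylesheet compilation. */
    static double meanMillis(String xslt, String xml, int warmups, int runs) {
        try {
            // newInstance() returns whichever JAXP processor is configured on this JVM
            TransformerFactory factory = TransformerFactory.newInstance();
            Templates templates = factory.newTemplates(new StreamSource(new StringReader(xslt)));

            // Untimed warm-up runs: early runs on a JVM are not representative
            for (int i = 0; i < warmups; i++) {
                run(templates, xml);
            }

            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                run(templates, xml);
            }
            return (System.nanoTime() - start) / 1e6 / runs;
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Performs one transformation and returns the serialized result. */
    static String run(Templates templates, String xml) throws TransformerException {
        StringWriter out = new StringWriter();
        templates.newTransformer().transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

A figure obtained this way describes only one stylesheet, one document, and one virtual machine; it says little about startup time, which matters more for short command-line runs.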



6.4.1.1 Java

On the Java platform, the Apache XML Project [57] hosts a lot of XML-related open source software, including many products mentioned in this book (Cocoon, Batik, Xindice, FOP). Notably, Xerces [58] is perhaps the most robust and advanced Java-based validating XML parser (it can use both DTDs and XSDL schemas for validation). You can successfully use Saxon with Xerces instead of the JDK's default parser (or the Ælfred parser that was included with Saxon up to version 7.1).

[57] xml.apache.org

[58] xml.apache.org/xerces2-j
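One processor-neutral way to control which parser feeds a transformation is to create the SAX parser yourself and hand it to the processor as a SAXSource. The sketch below uses only standard JAXP calls; the class name ExplicitParser is hypothetical. Which parser SAXParserFactory actually returns depends on your classpath and JAXP configuration, so treat the idea that Xerces is picked up when its jar is present as an assumption about your setup.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper illustrating an explicitly chosen parser.
class ExplicitParser {

    /** Parses the document with an explicitly created SAX parser and copies it unchanged. */
    static String identityCopy(String xml) {
        try {
            // The factory returns the parser configured for this JVM
            // (assumption: with Xerces on the classpath, that would be Xerces)
            SAXParserFactory parsers = SAXParserFactory.newInstance();
            parsers.setNamespaceAware(true); // XSLT processing needs namespace-aware parsing
            XMLReader reader = parsers.newSAXParser().getXMLReader();

            // The processor reads the document through our reader, not its own default
            SAXSource source = new SAXSource(reader, new InputSource(new StringReader(xml)));

            StringWriter out = new StringWriter();
            // newTransformer() without a stylesheet performs an identity transformation
            TransformerFactory.newInstance().newTransformer()
                .transform(source, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The same SAXSource can be passed to a transformer built from a real stylesheet; the identity transformation is used here only to keep the example short.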

Xalan is the XSLT processor of the Apache XML Project. It exists in two versions, one written in C++ and the other in Java, but the C++ version lags behind in version numbers and appears to have stagnated. The Java version, called Xalan-J, [59] is actively developed, complete, and well tested (in part, perhaps, because it is usually the default XSLT processor used in other Apache XML projects).

[59] xml.apache.org/xalan-j

Xalan is extensible in Java and (indirectly) in many other languages, including JavaScript and Python. It offers a growing library of ready-to-run extensions, covering most of EXSLT (4.4.1).

Xalan-J is generally slower than Saxon, but it includes an XSLT compiler, XSLTC (originally developed by Sun), that creates a set of compiled Java classes out of an XSLT stylesheet. Such a compiled version of the stylesheet, called a translet ("transformation applet"), runs much faster than the original stylesheet interpreted by Xalan (although a translet still requires Xalan itself to be installed).
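From Java code, the general idea of compiling a stylesheet once and running it many times is available through the JAXP Templates interface. The sketch below is processor-neutral and does not use any XSLTC-specific API; whether the Templates object is backed by compiled classes (as with a translet) or by some other internal form depends on the processor behind TransformerFactory. The class name CompiledStylesheet is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical wrapper: compile in the constructor, transform as often as needed.
class CompiledStylesheet {
    private final Templates templates;

    CompiledStylesheet(String xslt) {
        try {
            // The expensive step: parsing and compiling the stylesheet happens once
            templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalArgumentException("stylesheet did not compile", e);
        }
    }

    /** Applies the precompiled stylesheet to one source document. */
    String apply(String xml) {
        try {
            StringWriter out = new StringWriter();
            // newTransformer() on Templates is cheap compared with recompiling
            templates.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A Templates object is safe to share between threads, while each Transformer obtained from it should be used by one thread at a time.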

XT, [60] originally written by James Clark, is another well-known XSLT processor in Java. XT is quite old and seems to be abandoned; it is small and very fast (definitely faster than Saxon), but its support of XSLT is incomplete. A relatively new processor, jd-xslt [61] by Johannes Döbler, is one of the fastest Java processors currently available (in my testing it was faster than XT); it appears to offer solid XSLT support and Java extensibility.

[60] www.blnz.com/xt

[61] www.aztecrider.com/xslt

6.4.1.2 Python

4xslt is part of an XML processing framework called 4Suite [62] published by Fourthought. The entire framework is written in Python and includes, along with an XSLT processor, an XML repository (database) and server as well as tools for RDF processing and support for XLink, XPointer, RELAX NG, and other standards. The 4xslt processor supports some EXSLT functions and is extensible in Python; the biggest problem with it is that it is quite slow compared to most other processors.

[62] www.4suite.org

6.4.1.3 Native binary

You might expect processors that are compiled into native binary executables, and therefore do not require any virtual machine, to be among the fastest. This is not so clear-cut, however.

Sablotron, [63] by Ginger Alliance, is a small processor written in C++ and running on many platforms. Surprisingly, Sablotron's performance for large transformations is not top of the class; it can be outperformed by a fast Java processor such as Saxon. However, as a native binary, it starts up much faster than any Java- or Python-based processor. Its quick startup makes Sablotron ideal for projects where simple XSLT transformations (such as filters or format converters) are chained or intermingled with non-XSLT processing. Sablotron supports a few EXSLT functions and is extensible in JavaScript.

[63] www.gingerall.com/charlie/ga/xml/p_sab.xml

Another multiplatform native-binary processor, this one written in C, is Gnome's libxslt. [64] It starts up a bit more slowly than Sablotron but easily wins on large input files. It also supports almost all EXSLT extensions and even some Saxon extensions.

[64] www.xmlsoft.org/XSLT

6.4.2 Generators

There are tools that claim to generate your stylesheet for you, based on some sort of a simplified description of the desired transformation, or controlled by an interactive interface.

If we ignore for a moment all the advanced XSLT tricks we did in Chapter 5, an XML-to-HTML stylesheet is little more than a simple mapping between source XML elements and areas of the rendered HTML page, that is, a list of instructions like "this thing goes here and that thing goes there." If you already have a sample XML source and a corresponding formatted web page (note that the latter can be built in a GUI without any manual HTML coding), such a basic mapping may be established simply by drag-and-drop.

This is the approach taken by Altova's Stylevision, [65] an XML and XSLT editor specifically geared toward web designers interested in "migration of traditional HTML web sites to advanced XML-based sites." This application is one of Altova's suite of Windows-only XML tools, which also includes a well-known XML editor called XML Spy.

[65] www.xmlspy.com/products_xsl.html

It goes without saying that such a "point and drool" interface is unable to create anything more complex than a primitive outline of a stylesheet. Stylevision just tears your HTML into pieces and wraps each piece into an xsl:template with a simple match attribute (usually containing only an element type name). Even this simple process may go awry, so you may have to manually move some misplaced HTML code into another template.
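To make the shape of such an outline concrete, here is a hand-written sketch of the kind of stylesheet described above: one template per element type, each wrapping a fragment of HTML. It is not actual Stylevision output, and the source element names (page, title, para) are invented for the example. The stylesheet is embedded in a small Java class (GeneratedOutline, also hypothetical) so that it can be run with any JAXP processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example of a one-template-per-element stylesheet outline.
class GeneratedOutline {

    // Each template matches a bare element type name and wraps a piece of HTML
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='page'>"
      +   "<html><body><xsl:apply-templates/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='title'>"
      +   "<h1><xsl:apply-templates/></h1>"
      + "</xsl:template>"
      + "<xsl:template match='para'>"
      +   "<p><xsl:apply-templates/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the outline stylesheet to a source document and returns the result. */
    static String render(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling GeneratedOutline.render("<page><title>Hi</title><para>Text</para></page>") wraps the title in an h1 and the paragraph in a p. Anything that needs a predicate, a mode, or a named template is beyond what this kind of outline expresses.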

Still, such a tool may be useful for quick XSLT prototyping if you plan to take the automatically generated stylesheet as a starting point for further development (which you can do in Stylevision as well, for it has a mode for editing XSLT and an XPath analyzer). This kind of interface seems to be better suited for database-like XML, since handling mixed content via drag-and-drop is less than intuitive. [66]

[66] "Mixed content <...> has always separated the men from the boys in the *ML editing sweepstakes." Tim Bray, co-editor of the XML specification.

6.4.3 IDEs

With most programming languages, you can conveniently program inside an IDE (Integrated Development Environment). XSLT is no exception.

An IDE is, basically, a text editor tweaked for convenient coding in the given language, with a number of specialized functions for running and debugging the program without leaving the IDE. Exactly how much stuff is added to the text editor foundation varies widely from language to language and from IDE to IDE.

6.4.3.1 Processor-neutral

XSLT is special in that to debug a stylesheet, you'll want to simultaneously see two inputs (the source document and the stylesheet) and two outputs (the result of the transformation and the stream of messages produced by the stylesheet). Treebeard [67] is a simple IDE that displays the source, stylesheet, and transformation output in the three panes of its window, while the messages are shown in a separate floating window. It also boasts detailed XSLT syntax coloring, with separate colors for functions, variables, axes, and so on.

[67] treebeard.sf.net

One area where Treebeard is lacking is debugging. When you run an external XSLT processor from within an IDE, you need to have support from that processor if you want to use facilities like breakpoints, step-by-step execution, and watched variables. XSLT processors vary widely in the amount of such debugging support they provide, and those that provide it do so each in its own special way. Hopefully, this area will be standardized one day.

Since Treebeard is supposed to be processor-neutral and works with many Java-based processors (including, by the way, Saxon 7), it cannot take advantage of any processor-specific debugging capabilities. Therefore, the only debugging features available with Treebeard are those provided by XSLT itself or your chosen processor's XSLT extensions (see 4.4.2.3 for Saxon's debugging features).
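The debugging facility that XSLT itself provides is the xsl:message instruction, which writes to the message stream rather than to the transformation result. The sketch below shows a stylesheet that reports every item it visits, run through JAXP with an ErrorListener that tries to collect the messages. Where messages actually end up is processor-dependent: some processors pass them to the listener as warnings, others print them to standard error, so the listener part is an assumption rather than guaranteed behavior. The class name MessageTrace and the item element are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example of tracing a stylesheet with xsl:message.
class MessageTrace {

    // Reports each <item> visited; the message is kept out of the result tree
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='item'>"
      +   "<xsl:message>visiting item <xsl:value-of select='@id'/></xsl:message>"
      +   "<xsl:value-of select='.'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Messages the processor chose to deliver as warnings (may stay empty)
    final List<String> messages = new ArrayList<>();

    String run(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setErrorListener(new ErrorListener() {
                public void warning(TransformerException e) { messages.add(e.getMessage()); }
                public void error(TransformerException e) throws TransformerException { throw e; }
                public void fatalError(TransformerException e) throws TransformerException { throw e; }
            });
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whatever happens to the messages, the transformation result contains only the item text, which is what makes xsl:message usable as a tracing aid in a production stylesheet.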

6.4.3.2 Processor-specific

More convenient for complex XSLT debugging is the XSLT-process [68] package for Emacs. This is a full-featured IDE that adds a lot of XSLT-related features to the already quite good XML editing support in Emacs.

[68] xslt-process.sf.net

Unlike Treebeard, XSLT-process supports only two processors: Saxon and Xalan, as these are the only ones with a "tracing" interface making it possible to track execution of a stylesheet and query the processor status. By using this interface, XSLT-process can provide a complete array of debugging tools, including breakpoints (both in the stylesheet and in the source document), step-by-step execution of the stylesheet, and display of both local and global variables at any point.

Figure 6.9 shows an XSLT-process session with the source view, stylesheet view, output view, and a debugging console where you can run debugger commands. In any of the panes, you can also view the stylesheet output or diagnostic messages. A vertical sidebar window on the right shows breakpoints, describes execution context (i.e., all the elements that were entered but not yet left) in both the source and the stylesheet, and lists the type and value of all global and local variables.

Figure 6.9. An XSLT-process session in XEmacs; from top to bottom: source, stylesheet (both with breakpoints highlighted), messages, debugging console, and output; the sidebar (right) lists breakpoints, execution context, and variables at the current breakpoint.




6.4.4 Profilers

Another good tool that takes advantage of Saxon's and Xalan's debugging interfaces is an XSLT profiler called catchXSL!. [69] It was designed to perform a single task: compile an execution profile of an XSLT stylesheet, that is, a list of all its instructions with information on how many times each one was called and how much time it took.

[69] www.xslprofiler.org. (No, I'm not that excited about the program; the exclamation mark is part of its name.)
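An execution profile of this kind is, at bottom, a table keyed by instruction with a call count and a total time. The sketch below shows one way such a table could be accumulated and sorted; it is an illustration of the data structure only, not the way catchXSL! is implemented, and it does not use Saxon's or Xalan's tracing interfaces. The class ExecutionProfile, its methods, and the instruction labels are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical profile table: one entry per stylesheet instruction.
class ExecutionProfile {

    /** Accumulated figures for a single instruction. */
    static final class Entry {
        final String instruction;
        long calls;
        long totalNanos;

        Entry(String instruction) {
            this.instruction = instruction;
        }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();

    /** Records one execution of an instruction and the time it took. */
    void record(String instruction, long nanos) {
        Entry e = entries.computeIfAbsent(instruction, Entry::new);
        e.calls++;
        e.totalNanos += nanos;
    }

    /** Returns the entries with the most expensive instruction first. */
    List<Entry> byTotalTimeDescending() {
        List<Entry> sorted = new ArrayList<>(entries.values());
        sorted.sort(Comparator.comparingLong((Entry e) -> e.totalNanos).reversed());
        return sorted;
    }
}
```

In a real profiler, record() would be driven by the processor's trace events. Sorting by total time rather than by call count is what brings the bottlenecks to the top: an instruction called once may still dominate the run.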

True, optimization should not be your priority until the stylesheet is fully debugged and proved stable (4.4.2.2). But then, only real bottlenecks need to be addressed, and profiling stylesheet execution is the best way to discover these bottlenecks. Even if you're not really interested in optimization at this time, seeing a detailed analysis of a stylesheet run may be very instructive.

catchXSL! (Figure 6.10) presents its findings both in a tree form and as a table that can be sorted on any column. The tree may reflect either the stylesheet elements' hierarchy ("template view") or the hierarchy of calls ("tree call view"). You can ask the program to run your stylesheet several times and average the results.

Figure 6.10. catchXSL!, an XSLT profiler, displays execution timings for each instruction in a stylesheet.




 
  