Recipe 14.15. Adding Extension Elements Using Java

Problem

You want to extend the functionality of XSLT by adding elements with custom behavior.

Solution

Prior sections considered how extensions provided by the XSLT implementers could be used to your advantage. This section develops your own extension elements from scratch. Unlike writing extension functions, creating extension elements requires much closer familiarity with a particular processor's implementation details. Because processor designs vary widely, much of the code will not be portable between processors.

This section begins with a simple extension that provides syntactic sugar rather than extended functionality. A common requirement in XSLT coding is to switch context to another node. Using an xsl:for-each is the idiomatic way of accomplishing this. The idiom is somewhat confusing because the intent is not to loop but to change context to the single node selected by the xsl:for-each's select attribute:

<xsl:for-each select="document('new.xml')">
     <!-- Process new document -->
</xsl:for-each>

You will implement an extension element called xslx:set-context, which acts exactly like xsl:for-each, but only on the first node of the node set defined by the select (normally, you have only one node anyway).

Saxon requires an implementation of the com.icl.saxon.style.ExtensionElementFactory interface for all extension elements associated with a particular namespace. The factory is responsible for creating the extension elements from the element's local name. The second extension, named templtext, is covered later:

package com.ora.xsltckbk;

import com.icl.saxon.style.ExtensionElementFactory;
import org.xml.sax.SAXException;

public class CkBkElementFactory implements ExtensionElementFactory {

    // Map an extension element's local name to its implementing class.
    public Class getExtensionClass(String localname) {
        if (localname.equals("set-context")) return CkBkSetContext.class;
        if (localname.equals("templtext")) return CkBkTemplText.class;
        return null;
    }
}

To use the extension in a stylesheet, bind a prefix to a namespace URI that ends in a / followed by the factory's fully qualified class name. The prefix must also appear in the xsl:stylesheet's extension-element-prefixes attribute:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xslx="http://com.ora.xsltckbk.CkBkElementFactory"
 extension-element-prefixes="xslx">

<xsl:template match="/">
  <xslx:set-context select="foo/bar">
    <xsl:value-of select="."/>
  </xslx:set-context>
</xsl:template>

</xsl:stylesheet>
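The naming rule is easy to check by hand. The following is a minimal sketch of the rule only, not Saxon's own lookup code: the class FactoryNamespace and the method factoryClassName are hypothetical names introduced for illustration. It assumes the rule exactly as stated above, namely that the factory class name is whatever follows the last / in the namespace URI.

```java
class FactoryNamespace {

    // Returns the text after the last '/' in the namespace URI, which by the
    // convention described above is the factory's fully qualified class name.
    // (Hypothetical helper; Saxon performs its own lookup internally.)
    public static String factoryClassName(String namespaceUri) {
        int slash = namespaceUri.lastIndexOf('/');
        if (slash < 0 || slash == namespaceUri.length() - 1) {
            throw new IllegalArgumentException(
                "Namespace does not end in '/' + class name: " + namespaceUri);
        }
        return namespaceUri.substring(slash + 1);
    }

    public static void main(String[] args) {
        System.out.println(
            factoryClassName("http://com.ora.xsltckbk.CkBkElementFactory"));
    }
}
```

Applied to the namespace used in the stylesheet, the helper yields com.ora.xsltckbk.CkBkElementFactory, because the last / in the URI is the second slash of http://.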

The set-context element implementation derives from com.icl.saxon.style.StyleElement and must implement prepareAttributes() and process(); it will usually implement the other methods shown in Table 14-3 as well.

Table 14-3. Important Saxon StyleElement methods

Method | Effect
isInstruction() | Extensions always return true.
mayContainTemplateBody() | Returns true if this element can contain child elements. Often returns true to allow an xsl:fallback child.
prepareAttributes() | Called at compile time to allow the class to parse information contained in the extension's attributes. It is also the time to do local validation.
validate() | Called at compile time after all stylesheet elements have done local validation. It allows cross validation between this element and its parents or children.
process(Context context) | Called at runtime to execute the extension. This method can access or modify information in the context, but must not modify the stylesheet tree.


The xslx:set-context element was easy to implement because the code was stolen from Saxon's XSLForEach implementation and modified to do what XSLForEach does, but only once:

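package com.ora.xsltckbk;

import javax.xml.transform.*;
import com.icl.saxon.*;
import com.icl.saxon.expr.*;
import com.icl.saxon.om.NodeInfo;
import com.icl.saxon.om.NodeEnumeration;
import com.icl.saxon.style.StandardNames;
import com.icl.saxon.tree.AttributeCollection;

public class CkBkSetContext extends com.icl.saxon.style.StyleElement {

    Expression select = null;

    public boolean isInstruction() {
        return true;
    }

    public boolean mayContainTemplateBody() {
        return true;
    }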

Here you make sure @select is present. If it is, call makeExpression, which parses it into an XPath expression:

    public void prepareAttributes()
            throws TransformerConfigurationException {

        StandardNames sn = getStandardNames();
        AttributeCollection atts = getAttributeList();

        String selectAtt = null;

        for (int a = 0; a < atts.getLength(); a++) {
            int nc = atts.getNameCode(a);
            int f = nc & 0xfffff;
            if (f == sn.SELECT) {
                selectAtt = atts.getValue(a);
            } else {
                checkUnknownAttribute(nc);
            }
        }

        if (selectAtt == null) {
            reportAbsence("select");
        } else {
            select = makeExpression(selectAtt);
        }
    }

    public void validate() throws TransformerConfigurationException {
        checkWithinTemplate();
    }

This code is identical to Saxon's for-each, except that instead of looping while selection.hasMoreElements() is true, it checks once, extracts the node, sets the context and current node, processes the children, and returns the result to the context:

    public void process(Context context) throws TransformerException
    {
        NodeEnumeration selection = select.enumerate(context, false);
        if (!(selection instanceof LastPositionFinder)) {
            selection = new LookaheadEnumerator(selection);
        }

        Context c = context.newContext();
        c.setLastPositionFinder((LastPositionFinder)selection);
        int position = 1;

        // Unlike xsl:for-each, visit only the first selected node.
        if (selection.hasMoreElements()) {
            NodeInfo node = selection.nextElement();
            c.setPosition(position++);
            c.setCurrentNode(node);
            c.setContextNode(node);
            processChildren(c);
            context.setReturnValue(c.getReturnValue());
        }
    }
}

The next example extension is not as simple because it extends XSLT's capabilities rather than creating an alternate implementation for existing functionality.

Because a whole chapter of this book is dedicated to code generation, you can tell that the task interests me. However, although XSLT is near optimal in its XML manipulation capabilities, its text-output capabilities suffer from XML's verbosity. Consider a simple C++ code generation task in native XSLT, starting from the following input:

<classes>
  <class>
    <name>MyClass1</name>
  </class>

  <class>
    <name>MyClass2</name>
  </class>

  <class>
    <name>MyClass3</name>
    <bases>
      <base>MyClass1</base>
      <base>MyClass2</base>
    </bases>
  </class>

</classes>

A stylesheet that transforms this XML into C++ might look like this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

<xsl:template match="class">
class <xsl:value-of select="name"/> <xsl:apply-templates select="bases"/>
{
public:

  <xsl:value-of select="name"/>() ;
  ~<xsl:value-of select="name"/>() ;
  <xsl:value-of select="name"/>(const <xsl:value-of select="name"/>&amp; other) ;
  <xsl:value-of select="name"/>&amp; operator =(const <xsl:value-of select="name"/> &amp; other) ;
} ;
</xsl:template>

<xsl:template match="bases">
<xsl:text>: public </xsl:text>
<xsl:for-each select="base">
  <xsl:value-of select="."/>
  <xsl:if test="position() != last()">
    <xsl:text>, public </xsl:text>
  </xsl:if>
</xsl:for-each>
</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

This code is tedious to write and difficult to read because the C++ is lost in a rat's nest of markup.

The extension xslx:templtext addresses this problem by creating an alternate implementation of xsl:text that can contain special escapes indicating special processing. An escape is delimited by surrounding backslashes (\) and comes in two forms. An obvious alternative would be to use { and } to mimic attribute value templates and XQuery; however, because these characters are so common in generated code, I opted for backslashes.

Escape | Equivalent XSLT
\expression\ | <xsl:value-of select="expression"/>
\expression%delimit\ [3] | <xsl:for-each select="expression"><xsl:value-of select="."/><xsl:if test="position() != last()"><xsl:value-of select="delimit"/></xsl:if></xsl:for-each>


[3] XSLT 2.0 will provide this functionality via <xsl:value-of select="expression" separator="delimit" />.

Given this facility, your code generator would look as follows:

<xsl:stylesheet
 version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xslx="http://com.ora.xsltckbk.CkBkElementFactory"
 extension-element-prefixes="xslx">

<xsl:output method="text"/>

<xsl:template match="class">
<xslx:templtext>
class \name\ <xsl:apply-templates select="bases"/>
{
public:

  \name\() ;
  ~\name\() ;
  \name\(const \name\&amp; other) ;
  \name\&amp; operator =(const \name\&amp; other) ;
} ;
</xslx:templtext>
</xsl:template>

<xsl:template match="bases">
<xslx:templtext>: public \base%', public '\</xslx:templtext>
</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

This code is substantially easier to read and write. This facility is applicable to any context where a lot of boilerplate text will be generated. An XSLT purist may frown on such an extension because it introduces a foreign syntax into XSLT that is not subject to simple XML manipulation. This argument is valid; however, from a practical standpoint, many developers would reject XSLT (in favor of Perl) for boilerplate generation simply because it lacks a concise and unobtrusive syntax for getting the job done. So enough hemming and hawing; let's just code it:

package com.ora.xsltckbk;

import java.util.Vector ;
import java.util.Enumeration ;
import com.icl.saxon.tree.AttributeCollection;
import com.icl.saxon.*;
import com.icl.saxon.expr.*;
import javax.xml.transform.*;
import com.icl.saxon.output.*;
import com.icl.saxon.trace.TraceListener;
import com.icl.saxon.om.NodeInfo;
import com.icl.saxon.om.NodeEnumeration;
import com.icl.saxon.style.StyleElement;
import com.icl.saxon.style.StandardNames;
import com.icl.saxon.tree.NodeImpl;

Your extension class first declares constants that will be used in a simple state machine that parses the escapes:

public class CkBkTemplText extends com.icl.saxon.style.StyleElement
{
  private static final int SCANNING_STATE = 0 ;
  private static final int FOUND1_STATE   = 1 ;
  private static final int EXPR_STATE     = 2 ;
  private static final int FOUND2_STATE   = 3 ;
  private static final int DELIMIT_STATE  = 4 ;
...

Then define four private classes that implement the mini-language contained within the xslx:templtext element. The base class, CkBkTemplParam, captures literal text that may come before an escape:

  private class CkBkTemplParam
  {
    public CkBkTemplParam(String prefix)
    {
      m_prefix = prefix ;
    }

    // Write the literal text with output escaping turned off.
    public void process(Context context) throws TransformerException
    {
      if (!m_prefix.equals(""))
      {
        Outputter out = context.getOutputter();
        out.setEscaping(false);
        out.writeContent(m_prefix);
        out.setEscaping(true);
      }
    }

    protected String m_prefix ;
  }

The CkBkValueTemplParam class derives from CkBkTemplParam and implements the behavior of a simple value-of escape, \expr\. To simplify the implementation in this example, output escaping is always disabled inside an xslx:templtext element:

  private class CkBkValueTemplParam extends CkBkTemplParam
  {
    public CkBkValueTemplParam(String prefix, Expression value)
    {
      super(prefix) ;
      m_value = value ;
    }

    public void process(Context context) throws TransformerException
    {
      super.process(context) ;
      Outputter out = context.getOutputter();
      out.setEscaping(false);
      if (m_value != null)
      {
        m_value.outputStringValue(out, context);
      }
      out.setEscaping(true);
    }

    private Expression m_value ;
  }

The CkBkListTemplParam class implements the \expr%delimit\ behavior, largely by mimicking the behavior of Saxon's XSLForEach class:

  private class CkBkListTemplParam extends CkBkTemplParam
  {
    public CkBkListTemplParam(String prefix, Expression list,
                              Expression delimit)
    {
      super(prefix) ;
      m_list = list ;
      m_delimit = delimit ;
    }

    public void process(Context context) throws TransformerException
    {
      super.process(context) ;
      if (m_list != null)
      {
        NodeEnumeration m_listEnum = m_list.enumerate(context, false);

        Outputter out = context.getOutputter();
        out.setEscaping(false);
        while(m_listEnum.hasMoreElements())
        {
          NodeInfo node = m_listEnum.nextElement();
          if (node != null)
          {
            node.copyStringValue(out);
          }
          // Emit the delimiter between nodes, but not after the last one.
          if (m_listEnum.hasMoreElements() && m_delimit != null)
          {
            m_delimit.outputStringValue(out, context);
          }
        }
        out.setEscaping(true);
      }
    }

    private Expression m_list = null;
    private Expression m_delimit = null ;
  }

The last private class is CkBkStyleTemplParam, and it is used as a holder of elements nested within the xslx:templtext, for example, xsl:apply-templates:

  private class CkBkStyleTemplParam extends CkBkTemplParam
  {
    public CkBkStyleTemplParam(StyleElement snode)
    {
      super("") ;   // a nested element carries no literal prefix
      m_snode = snode ;
    }

    public void process(Context context) throws TransformerException
    {
      if (m_snode.validationError != null)
      {
        fallbackProcessing(m_snode, context);
      }
      else
      {
        try
        {
          context.setStaticContext(m_snode.staticContext);
          m_snode.process(context);
        }
        catch (TransformerException err)
        {
          throw m_snode.styleError(err);
        }
      }
    }

    private StyleElement m_snode ;
  }

The next three methods are standard. If you wanted the standard disable-output-escaping attribute to control output escaping, you would capture its value in prepareAttributes(). The Saxon XSLText.java source provides the necessary code:

  public boolean isInstruction()
  {
    return true;
  }

  public boolean mayContainTemplateBody()
  {
    return true;
  }

  public void prepareAttributes() throws TransformerConfigurationException
  {
    StandardNames sn = getStandardNames();
    AttributeCollection atts = getAttributeList();
    // No attributes are defined, so every attribute is checked as unknown.
    for (int a = 0; a < atts.getLength(); a++)
    {
      int nc = atts.getNameCode(a);
      checkUnknownAttribute(nc);
    }
  }

The validate stage is an opportunity to parse the contents of the xslx:templtext element, looking for escapes. Every text node is sent to a parser function. Element content is converted into instances of CkBkStyleTemplParam. The member m_TemplParms is a vector in which the results of parsing are stored:

  public void validate() throws TransformerConfigurationException
  {
    checkWithinTemplate();
    m_TemplParms = new Vector() ;

    NodeImpl node = (NodeImpl)getFirstChild();
    String value ;
    while (node != null)
    {
      if (node.getNodeType() == NodeInfo.TEXT)
      {
        parseTemplText(node.getStringValue()) ;
      }
      else
      if (node instanceof StyleElement)
      {
        StyleElement snode = (StyleElement) node;
        m_TemplParms.addElement(new CkBkStyleTemplParam(snode)) ;
      }
      node = (NodeImpl)node.getNextSibling();
    }
  }

The process method loops over m_TemplParms and calls each implementation's process method:

  public void process(Context context) throws TransformerException
  {
    Enumeration iter = m_TemplParms.elements() ;
    while (iter.hasMoreElements())
    {
      CkBkTemplParam param = (CkBkTemplParam) iter.nextElement() ;
      param.process(context) ;
    }
  }
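The division of labor between validate() and process() can be illustrated without Saxon: the template is parsed into a list of parts once, and that list is replayed for each context. This is a minimal sketch under stated assumptions, not the extension itself. The names Part, TemplateSketch, literal, value, list, and render are hypothetical, and a Map of string lists stands in for the XPath context that the real classes evaluate against.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for CkBkTemplParam: one pre-parsed piece of a template.
interface Part {
    void process(Map<String, List<String>> data, StringBuilder out);
}

class TemplateSketch {

    // Literal text, like the prefix written by CkBkTemplParam.
    public static Part literal(String text) {
        return (data, out) -> out.append(text);
    }

    // Like \key\: emits the first value stored under key, or nothing.
    public static Part value(String key) {
        return (data, out) -> {
            List<String> v = data.getOrDefault(key, Collections.emptyList());
            if (!v.isEmpty()) {
                out.append(v.get(0));
            }
        };
    }

    // Like \key%delim\: emits every value, separated by delim.
    public static Part list(String key, String delim) {
        return (data, out) -> out.append(
            String.join(delim, data.getOrDefault(key, Collections.emptyList())));
    }

    // Runtime step, analogous to process(): replay the parts in order.
    public static String render(List<Part> parts, Map<String, List<String>> data) {
        StringBuilder out = new StringBuilder();
        for (Part p : parts) {
            p.process(data, out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "Compile" step, analogous to validate(): build the part list once.
        List<Part> parts = Arrays.asList(
            literal("class "), value("name"), literal(" : public "),
            list("base", ", public "), literal(" { };"));

        Map<String, List<String>> data = new HashMap<>();
        data.put("name", Arrays.asList("MyClass3"));
        data.put("base", Arrays.asList("MyClass1", "MyClass2"));
        System.out.println(render(parts, data));
    }
}
```

Because the parts hold no per-run state, the same list can be rendered against any number of data sets, which is why the extension can do all of its parsing at compile time.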

The following private functions implement a simple state-machine-driven parser. It would be easier to implement with a regular-expression engine (which Java provides in java.util.regex as of Version 1.4). The parser treats two consecutive backslashes (\\) as a request for a literal backslash. Likewise, %% is translated into a single %:

  private void parseTemplText(String value)
      throws TransformerConfigurationException
  {
    //This state machine parses the text looking for parameters
    int ii = 0 ;
    int len = value.length() ;

    int state = SCANNING_STATE ;
    StringBuffer temp = new StringBuffer("") ;
    StringBuffer expr = new StringBuffer("") ;
    while(ii < len)
    {
      char c = value.charAt(ii++) ;
      switch (state)
      {
        case SCANNING_STATE:
        {
          if (c == '\\')
          {
            state = FOUND1_STATE ;
          }
          else
          {
            temp.append(c) ;
          }
        }
        break ;

        case FOUND1_STATE:
        {
          if (c == '\\')
          {
            //A doubled backslash is a literal backslash
            temp.append(c) ;
            state = SCANNING_STATE ;
          }
          else
          {
            expr.append(c) ;
            state = EXPR_STATE ;
          }
        }
        break ;

        case EXPR_STATE:
        {
          if (c == '\\')
          {
            state = FOUND2_STATE ;
          }
          else
          {
            expr.append(c) ;
          }
        }
        break ;

        case FOUND2_STATE:
        {
          if (c == '\\')
          {
            state = EXPR_STATE ;
            expr.append(c) ;
          }
          else
          {
            processParam(temp, expr) ;
            state = SCANNING_STATE ;
            temp = new StringBuffer("") ;
            temp.append(c) ;
            expr = new StringBuffer("") ;
          }
        }
        break ;
      }
    }
    if (state == FOUND1_STATE || state == EXPR_STATE)
    {
      compileError("xslx:templtext dangling \\");
    }
    else
    if (state == FOUND2_STATE)
    {
      processParam(temp, expr) ;
    }
    else
    {
      processParam(temp, new StringBuffer("")) ;
    }
  }

  private void processParam(StringBuffer prefix, StringBuffer expr)
      throws TransformerConfigurationException
  {
    if (expr.length() == 0)
    {
      m_TemplParms.addElement(new CkBkTemplParam(new String(prefix))) ;
    }
    else
    {
      processParamExpr(prefix, expr) ;
    }
  }

  //Errors are propagated to the caller rather than swallowed in an
  //empty catch block, so a bad escape is reported at compile time.
  private void processParamExpr(StringBuffer prefix, StringBuffer expr)
      throws TransformerConfigurationException
  {
    int ii = 0 ;
    int len = expr.length() ;

    int state = SCANNING_STATE ;
    StringBuffer list = new StringBuffer("") ;
    StringBuffer delimit = new StringBuffer("") ;
    while(ii < len)
    {
      char c = expr.charAt(ii++) ;
      switch (state)
      {
        case SCANNING_STATE:
        {
          if (c == '%')
          {
            state = FOUND1_STATE ;
          }
          else
          {
            list.append(c) ;
          }
        }
        break ;

        case FOUND1_STATE:
        {
          if (c == '%')
          {
            //A doubled % is a literal %
            list.append(c) ;
            state = SCANNING_STATE ;
          }
          else
          {
            delimit.append(c) ;
            state = DELIMIT_STATE ;
          }
        }
        break ;

        case DELIMIT_STATE:
        {
          if (c == '%')
          {
            state = FOUND2_STATE ;
          }
          else
          {
            delimit.append(c) ;
          }
        }
        break ;
      }
    }
    if (state == FOUND1_STATE)
    {
      compileError("xslx:templtext trailing %");
    }
    else
    if (state == FOUND2_STATE)
    {
      compileError("xslx:templtext extra %");
    }
    else
    if (state == SCANNING_STATE)
    {
      String prefixStr = new String(prefix) ;
      Expression value = makeExpression(new String(list)) ;
      m_TemplParms.addElement(
             new CkBkValueTemplParam(prefixStr, value)) ;
    }
    else
    {
      String prefixStr = new String(prefix) ;
      Expression listExpr = makeExpression(new String(list)) ;
      Expression delimitExpr = makeExpression(new String(delimit)) ;
      m_TemplParms.addElement(
        new CkBkListTemplParam(prefixStr, listExpr, delimitExpr)) ;
    }
  }

  //A vector of CkBkTemplParam objects parsed from the text
  private Vector m_TemplParms = null;
}
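For comparison, here is a rough sketch of how the same escapes might be recognized with java.util.regex instead of a hand-written state machine. It is an approximation under stated assumptions rather than a drop-in replacement: the class TemplTextRegexSketch and its methods are hypothetical, it returns descriptive strings instead of building Saxon expressions, and it reports errors with IllegalArgumentException rather than compileError.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplTextRegexSketch {

    private static final String BS = "\\\\";   // regex source matching one backslash

    // Either a doubled backslash, or \expr\ where expr starts with a
    // non-backslash character and may contain doubled backslashes.
    private static final Pattern ESCAPE = Pattern.compile(
        BS + BS + "|" + BS + "([^" + BS + "](?:[^" + BS + "]|" + BS + BS + ")*+)" + BS);

    // A list part (in which %% means a literal %), optionally followed by
    // a single % and a non-empty delimiter that contains no %.
    private static final Pattern EXPR =
        Pattern.compile("((?:[^%]|%%)*)(?:%([^%]+))?");

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder literal = new StringBuilder();
        Matcher m = ESCAPE.matcher(text);
        int pos = 0;
        while (m.find()) {
            appendLiteral(literal, text.substring(pos, m.start()));
            if (m.group(1) == null) {
                literal.append('\\');            // doubled backslash
            } else {
                flush(literal, tokens);
                tokens.add(describe(m.group(1).replace("\\\\", "\\")));
            }
            pos = m.end();
        }
        appendLiteral(literal, text.substring(pos));
        flush(literal, tokens);
        return tokens;
    }

    // Any backslash left outside a match is an unterminated escape.
    private static void appendLiteral(StringBuilder literal, String gap) {
        if (gap.indexOf('\\') >= 0) {
            throw new IllegalArgumentException("xslx:templtext dangling \\");
        }
        literal.append(gap);
    }

    private static void flush(StringBuilder literal, List<String> tokens) {
        if (literal.length() > 0) {
            tokens.add("TEXT[" + literal + "]");
            literal.setLength(0);
        }
    }

    private static String describe(String expr) {
        Matcher m = EXPR.matcher(expr);
        if (!m.matches()) {
            throw new IllegalArgumentException("xslx:templtext misplaced %");
        }
        String list = m.group(1).replace("%%", "%");
        return m.group(2) == null
            ? "VALUE[" + list + "]"
            : "LIST[" + list + " | " + m.group(2) + "]";
    }

    public static void main(String[] args) {
        System.out.println(
            tokenize("class \\name\\ : public \\base%', public '\\ {}"));
    }
}
```

The possessive quantifier (*+) keeps the matcher from backtracking into a doubled backslash, so an unterminated escape is reported as an error instead of being silently reinterpreted.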

You can make some useful enhancements to the functionality of xslx:templtext. For example, you could expand the functionality of the list escape to multiple lists (e.g., \expr1%delim1%expr2%delim2\). This enhancement would roughly translate into the following XSLT equivalent:

<xsl:for-each select="expr1">
  <xsl:variable name="pos" select="position()"/>
  <xsl:value-of select="."/>
  <xsl:if test="$pos != last()">
    <xsl:value-of select="delim1"/>
  </xsl:if>
  <xsl:value-of select="expr2[$pos]"/>
  <xsl:if test="$pos != last()">
    <xsl:value-of select="delim2"/>
  </xsl:if>
</xsl:for-each>

This facility would be useful when pairs of lists need to be sequenced into text. For example, consider a C++ function's parameters, which consist of name and type pairs. The XSLT code is only a rough specification of semantics because it assumes that the node sets specified by expr1 and expr2 have the same number of elements. I believe that an actual implementation would continue to expand the lists as long as any set still had nodes, suppressing delimiters for those that did not. Better yet, the behavior could be controlled by attributes of xslx:templtext.

Discussion

Space does not permit full implementations of these extension elements in Xalan. However, based on the information provided in the introduction, the path should be relatively clear.

See Also

Developers interested in extending Saxon should read Michael Kay's article on Saxon design (http://www-106.ibm.com/developerworks/library/x-xslt2).




XSLT Cookbook
XSLT Cookbook: Solutions and Examples for XML and XSLT Developers, 2nd Edition
ISBN: 0596009747
EAN: 2147483647
Year: 2003
Pages: 208
Authors: Sal Mangano
