Recipe 16.2. Creating Generic Element Aggregation Functions

Problem

You want to create reusable templates that perform a wide variety of node-set aggregation operations.

Solution

A fully generic extensible solution exploits the template-tagging method discussed in this chapter's introduction:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:generic="http://www.ora.com/XSLTCookbook/namespaces/generic"
  xmlns:aggr="http://www.ora.com/XSLTCookbook/namespaces/aggregate"
  extension-element-prefixes="generic">

  <xsl:variable name="generic:public-generics"
                select="document('')/*/generic:*"/>

  <xsl:variable name="generic:generics" select="$generic:public-generics"/>

  <!-- Primitive generic functions on x -->

  <generic:func name="identity"/>
  <xsl:template match="generic:func[@name='identity']">
    <xsl:param name="x"/>
    <xsl:value-of select="$x"/>
  </xsl:template>

  <generic:func name="square"/>
  <xsl:template match="generic:func[@name='square']">
    <xsl:param name="x"/>
    <xsl:value-of select="$x * $x"/>
  </xsl:template>

  <generic:func name="cube"/>
  <xsl:template match="generic:func[@name='cube']">
    <xsl:param name="x"/>
    <xsl:value-of select="$x * $x * $x"/>
  </xsl:template>

  <generic:func name="incr" param1="1"/>
  <xsl:template match="generic:func[@name='incr']">
    <xsl:param name="x"/>
    <xsl:param name="param1" select="@param1"/>
    <xsl:value-of select="$x + $param1"/>
  </xsl:template>

  <!-- Primitive generic aggregators -->

  <generic:aggr-func name="sum" identity="0"/>
  <xsl:template match="generic:aggr-func[@name='sum']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:value-of select="$x + $accum"/>
  </xsl:template>

  <generic:aggr-func name="product" identity="1"/>
  <xsl:template match="generic:aggr-func[@name='product']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:value-of select="$x * $accum"/>
  </xsl:template>

  <!-- Generic aggregation template -->
  <xsl:template name="generic:aggregation">
    <xsl:param name="nodes"/>
    <xsl:param name="aggr-func" select=" 'sum' "/>
    <xsl:param name="func" select=" 'identity' "/>
    <xsl:param name="func-param1"
               select="$generic:generics[self::generic:func and
                                         @name = $func]/@param1"/>
    <xsl:param name="i" select="1"/>
    <xsl:param name="accum"
               select="$generic:generics[self::generic:aggr-func and
                                         @name = $aggr-func]/@identity"/>

    <xsl:choose>
      <xsl:when test="$nodes">
        <!-- Compute func($x) -->
        <xsl:variable name="f-of-x">
          <xsl:apply-templates
               select="$generic:generics[self::generic:func and
                                         @name = $func]">
            <xsl:with-param name="x" select="$nodes[1]"/>
            <xsl:with-param name="i" select="$i"/>
            <xsl:with-param name="param1" select="$func-param1"/>
          </xsl:apply-templates>
        </xsl:variable>

        <!-- Aggregate current $f-of-x with $accum -->
        <xsl:variable name="temp">
          <xsl:apply-templates
               select="$generic:generics[self::generic:aggr-func and
                                         @name = $aggr-func]">
            <xsl:with-param name="x" select="$f-of-x"/>
            <xsl:with-param name="accum" select="$accum"/>
            <xsl:with-param name="i" select="$i"/>
          </xsl:apply-templates>
        </xsl:variable>

        <!-- We tail recursively process the remaining nodes using position() -->
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="$nodes[position()!=1]"/>
          <xsl:with-param name="aggr-func" select="$aggr-func"/>
          <xsl:with-param name="func" select="$func"/>
          <xsl:with-param name="func-param1" select="$func-param1"/>
          <xsl:with-param name="i" select="$i + 1"/>
          <xsl:with-param name="accum" select="$temp"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$accum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>

The generic code has three basic parts.

The first part consists of tagged generic functions on a single variable x. These functions let you aggregate a function of each input node rather than the node itself. The simplest such function is identity, which is used when you want to aggregate the input set itself. The square, cube, and incr functions are also predefined. Users of the stylesheet can define other functions.

The second part consists of tagged generic aggregator functions. Two common aggregators are implemented: sum and product. Again, importing stylesheets can add other forms of aggregation.

The third part consists of the generic aggregation algorithm. It accepts as parameters a set of nodes to aggregate, the name of an aggregator function (the default is sum), and the name of a single-element function (the default is identity). The $i parameter tracks the position of the node currently being processed and is made available to both the element and aggregation functions, should they need it. The $accum parameter holds the working value of the aggregation. Notice how its default value is initialized from the @identity attribute kept with the aggregate function's tag. This initialization demonstrates a powerful feature of the generic approach: metadata can be associated with the function tags. This feature is reminiscent of the way C++-based generic programming uses traits classes.

The first step to understanding this code is to look at a simple application that both uses and extends the aggregation facilities, as shown in Example 16-1.

Example 16-1. Using and extending generic aggregation
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:generic="http://www.ora.com/XSLTCookbook/namespaces/generic"
  xmlns:aggr="http://www.ora.com/XSLTCookbook/namespaces/aggregate"
  extension-element-prefixes="generic aggr">

  <xsl:import href="aggregation.xslt"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Extend the available generic functions -->
  <xsl:variable name="generic:generics"
                select="$generic:public-generics | document('')/*/generic:*"/>

  <!-- Add a generic element function for computing reciprocal -->
  <generic:func name="reciprocal"/>
  <xsl:template match="generic:func[@name='reciprocal']">
    <xsl:param name="x"/>
    <xsl:value-of select="1 div $x"/>
  </xsl:template>

  <!-- Add generic aggregators for computing the min and the max values
       in a node set -->
  <generic:aggr-func name="min" identity=""/>
  <xsl:template match="generic:aggr-func[@name='min']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:choose>
      <xsl:when test="$accum = @identity or $accum >= $x">
        <xsl:value-of select="$x"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$accum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <generic:aggr-func name="max" identity=""/>
  <xsl:template match="generic:aggr-func[@name='max']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:choose>
      <xsl:when test="$accum = @identity or $accum &lt; $x">
        <xsl:value-of select="$x"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$accum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Test aggregation functionality -->
  <xsl:template match="numbers">
    <results>

      <!-- Sum the numbers -->
      <sum>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
        </xsl:call-template>
      </sum>

      <!-- Sum the squares -->
      <sumSq>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="func" select=" 'square' "/>
        </xsl:call-template>
      </sumSq>

      <!-- Product of the reciprocals -->
      <prodRecip>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="aggr-func" select=" 'product' "/>
          <xsl:with-param name="func" select=" 'reciprocal' "/>
        </xsl:call-template>
      </prodRecip>

      <!-- Maximum -->
      <max>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="aggr-func" select=" 'max' "/>
        </xsl:call-template>
      </max>

      <!-- Minimum -->
      <min>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="aggr-func" select=" 'min' "/>
        </xsl:call-template>
      </min>

    </results>
  </xsl:template>

</xsl:stylesheet>

Example 16-1 shows how new element and aggregation functions can be added to those prepackaged with aggregation.xslt. You might not initially expect that computing minimums and maximums can be accomplished with this generic code, but it is quite easy to do.

You can test this code against the following input:

<numbers>
  <number>1</number>
  <number>2</number>
  <number>3</number>
</numbers>

The result is:

<?xml version="1.0" encoding="utf-8"?>
<results>
   <sum>6</sum>
   <sumSq>14</sumSq>
   <prodRecip>0.16666666666666666</prodRecip>
   <max>3</max>
   <min>1</min>
</results>
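The metadata mechanism described earlier also applies to element functions: the incr tag carries a param1 attribute, and generic:aggregation reads it as the default for its func-param1 parameter. The following is a minimal sketch, assuming aggregation.xslt is the stylesheet from the Solution. The element names sumIncr and sumIncr10 are made up for illustration, and exclude-result-prefixes is used here simply to keep the generic namespace out of the output.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:generic="http://www.ora.com/XSLTCookbook/namespaces/generic"
  exclude-result-prefixes="generic">

  <xsl:import href="aggregation.xslt"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="numbers">
    <results>

      <!-- incr with its default increment, taken from param1="1" on the tag -->
      <sumIncr>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="func" select=" 'incr' "/>
        </xsl:call-template>
      </sumIncr>

      <!-- Same function, but the caller overrides the increment with 10 -->
      <sumIncr10>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="func" select=" 'incr' "/>
          <xsl:with-param name="func-param1" select="10"/>
        </xsl:call-template>
      </sumIncr10>

    </results>
  </xsl:template>

</xsl:stylesheet>
```

Because this stylesheet adds no functions of its own, it does not need to redefine $generic:generics. Against the three-number input above, the first call should produce 2 + 3 + 4 = 9 and the second 11 + 12 + 13 = 36.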

Discussion

The "Solution" section shows only the tip of the iceberg in relation to what can be done with this generic aggregation framework. For example, nothing says you must aggregate numbers. The following code shows how this generic code can be applied to strings as well:

<strings>
  <string>camel</string>
  <string>through</string>
  <string>the</string>
  <string>eye</string>
  <string>of</string>
  <string>needle</string>
</strings>

<!DOCTYPE stylesheet [
     <!ENTITY % standard SYSTEM "../strings/standard.ent">
     %standard;
]>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:generic="http://www.ora.com/XSLTCookbook/namespaces/generic"
  extension-element-prefixes="generic">

  <xsl:import href="aggregation.xslt"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Extend the available generic functions -->
  <xsl:variable name="generic:generics"
                select="$generic:public-generics | document('')/*/generic:*"/>

  <!-- Add a generic element function for converting first character of $x
       to uppercase -->
  <generic:func name="upperFirst"/>
  <xsl:template match="generic:func[@name='upperFirst']">
    <xsl:param name="x"/>
    <!-- See Recipe 2.8 for an explanation of LOWER_TO_UPPER -->
    <xsl:variable name="upper"
         select="translate(substring($x,1,1),&LOWER_TO_UPPER;)"/>
    <xsl:value-of select="concat($upper, substring($x,2))"/>
  </xsl:template>

  <!-- Add generic aggregator that concatenates -->
  <generic:aggr-func name="concat" identity=""/>
  <xsl:template match="generic:aggr-func[@name='concat']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:value-of select="concat($accum,$x)"/>
  </xsl:template>

  <!-- Test aggregation functionality -->
  <xsl:template match="strings">
    <results>
      <camelCase>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="string"/>
          <xsl:with-param name="aggr-func" select=" 'concat' "/>
          <xsl:with-param name="func" select=" 'upperFirst' "/>
        </xsl:call-template>
      </camelCase>
    </results>
  </xsl:template>

</xsl:stylesheet>

<results>
   <camelCase>CamelThroughTheEyeOfNeedle</camelCase>
</results>

Aggregation can also compute the statistical functions average and variance. Here you exploit the $i index parameter. You need to be a little crafty to compute variance: you must maintain three values in the $accum parameter, namely the sum, the sum of the squares, and the running variance. You can do this by using an element with attributes. The only downside is that you are forced to use a node-set function in XSLT 1.0:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:generic="http://www.ora.com/XSLTCookbook/namespaces/generic"
  xmlns:exslt="http://exslt.org/common"
  extension-element-prefixes="generic exslt">

  <xsl:import href="aggregation.xslt"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Extend the available generic functions -->
  <xsl:variable name="generic:generics"
                select="$generic:public-generics | document('')/*/generic:*"/>

  <!-- Add generic aggregators for computing the average and the variance
       of the values in a node set -->
  <generic:aggr-func name="avg" identity="0"/>
  <xsl:template match="generic:aggr-func[@name='avg']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:param name="i"/>
    <xsl:value-of select="(($i - 1) * $accum + $x) div $i"/>
  </xsl:template>

  <generic:aggr-func name="variance" identity=""/>
  <xsl:template match="generic:aggr-func[@name='variance']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:param name="i"/>
    <xsl:choose>
      <xsl:when test="$accum = @identity">
        <!-- Initialize the sum, sum of squares, and variance.
             The variance of a single number is zero -->
        <variance sum="{$x}" sumSq="{$x * $x}">0</variance>
      </xsl:when>
      <xsl:otherwise>
        <!-- Use node-set to convert $accum to a node set containing
             the variance element -->
        <xsl:variable name="accumElem" select="exslt:node-set($accum)/*"/>
        <!-- Aggregate the sum of $x component -->
        <xsl:variable name="sum" select="$accumElem/@sum + $x"/>
        <!-- Aggregate the sum of $x squared component -->
        <xsl:variable name="sumSq" select="$accumElem/@sumSq + $x * $x"/>
        <!-- Return the element with attributes and the current variance
             as its value -->
        <variance sum="{$sum}" sumSq="{$sumSq}">
          <xsl:value-of
               select="($sumSq - ($sum * $sum) div $i) div ($i - 1)"/>
        </variance>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="numbers">
    <results>

      <!-- Average -->
      <avg>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="aggr-func" select=" 'avg' "/>
        </xsl:call-template>
      </avg>

      <!-- Average of the squares -->
      <avgSq>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="func" select=" 'square' "/>
          <xsl:with-param name="aggr-func" select=" 'avg' "/>
        </xsl:call-template>
      </avgSq>

      <!-- Variance -->
      <variance>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="number"/>
          <xsl:with-param name="aggr-func" select=" 'variance' "/>
        </xsl:call-template>
      </variance>

    </results>
  </xsl:template>

</xsl:stylesheet>
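The recurrences that the avg and variance aggregators implement can be written out as follows. The symbols are introduced here only for illustration and do not appear in the stylesheet: x_i is the i-th value, a_i the running average, S_i and Q_i the running sum and sum of squares (the sum and sumSq attributes), and v_i the running sample variance.

```latex
\begin{align*}
a_i &= \frac{(i-1)\,a_{i-1} + x_i}{i}, & a_0 &= 0 \\
S_i &= S_{i-1} + x_i, & Q_i &= Q_{i-1} + x_i^2 \\
v_i &= \frac{Q_i - S_i^2 / i}{i - 1} \quad (i \ge 2), & v_1 &= 0
\end{align*}
```

For the values 1, 2, 3 the running average goes 1, 1.5, 2. The variance steps are v_2 = (5 - 9/2) / 1 = 0.5 and v_3 = (14 - 36/3) / 2 = 1, so the stylesheet should report an average of 2 and a variance of 1 for that input.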

The following example shows how you can use the aggregation facilities to aggregate polymorphic functions:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:generic="http://www.ora.com/XSLTCookbook/namespaces/generic"
  xmlns:aggr="http://www.ora.com/XSLTCookbook/namespaces/aggregate"
  xmlns:exslt="http://exslt.org/common"
  extension-element-prefixes="generic aggr">

  <xsl:import href="aggregation.xslt"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Extend the available generic functions -->
  <xsl:variable name="generic:generics"
       select="$generic:public-generics | document('')/*/generic:*"/>

  <!-- Extend the primitives to compute commission -->
  <generic:func name="commission"/>
  <xsl:template match="generic:func[@name='commission']">
    <xsl:param name="x"/>
    <!-- Defer actual computation to a polymorphic template using
         mode commission -->
    <xsl:apply-templates select="$x" mode="commission"/>
  </xsl:template>

  <!-- By default salespeople get 2% commission and no base salary -->
  <xsl:template match="salesperson" mode="commission">
    <xsl:value-of select="0.02 * sum(product/@totalSales)"/>
  </xsl:template>

  <!-- Salespeople with seniority > 4 get $10000.00 base + 5% commission -->
  <xsl:template match="salesperson[@seniority > 4]" mode="commission"
                priority="1">
    <xsl:value-of select="10000.00 + 0.05 * sum(product/@totalSales)"/>
  </xsl:template>

  <!-- Salespeople with seniority > 8 get (seniority * $2000.00) base
       + 8% commission -->
  <xsl:template match="salesperson[@seniority > 8]" mode="commission"
                priority="2">
    <xsl:value-of select="@seniority * 2000.00 + 0.08 *
                          sum(product/@totalSales)"/>
  </xsl:template>

  <xsl:template match="salesBySalesperson">
    <results>
      <result>
        <xsl:text>Total commission = </xsl:text>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="*"/>
          <xsl:with-param name="aggr-func" select=" 'sum' "/>
          <xsl:with-param name="func" select=" 'commission' "/>
        </xsl:call-template>
      </result>
      <result>
        <xsl:text>Min commission = </xsl:text>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="*"/>
          <xsl:with-param name="aggr-func" select=" 'min' "/>
          <xsl:with-param name="func" select=" 'commission' "/>
        </xsl:call-template>
      </result>
      <result>
        <xsl:text>Max commission = </xsl:text>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="*"/>
          <xsl:with-param name="aggr-func" select=" 'max' "/>
          <xsl:with-param name="func" select=" 'commission' "/>
        </xsl:call-template>
      </result>
      <result>
        <xsl:text>Avg commission = </xsl:text>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="*"/>
          <xsl:with-param name="aggr-func" select=" 'avg' "/>
          <xsl:with-param name="func" select=" 'commission' "/>
        </xsl:call-template>
      </result>
      <result>
        <xsl:text>Avg sales = </xsl:text>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="*/product/@totalSales"/>
          <xsl:with-param name="aggr-func" select=" 'avg' "/>
        </xsl:call-template>
      </result>
      <result>
        <xsl:text>Min sales = </xsl:text>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="*/product/@totalSales"/>
          <xsl:with-param name="aggr-func" select=" 'min' "/>
        </xsl:call-template>
      </result>
      <result>
        <xsl:text>Max sales = </xsl:text>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="*/product/@totalSales"/>
          <xsl:with-param name="aggr-func" select=" 'max' "/>
        </xsl:call-template>
      </result>
    </results>
  </xsl:template>

</xsl:stylesheet>

The result when run against salesBySalesperson.xml (see Chapter 4) is:

<results xmlns:exslt="http://exslt.org/common">
   <result>Total commission = 471315</result>
   <result>Min commission = 19600</result>
   <result>Max commission = 364440</result>
   <result>Avg commission = 117828.75</result>
   <result>Avg sales = 584636.3636363636</result>
   <result>Min sales = 5500.00</result>
   <result>Max sales = 2920000.00</result>
</results>

This section has demonstrated that many of the recipes implemented separately in Chapter 3 can be implemented easily in terms of this single generic example. In fact, this generic example can compute an open-ended range of aggregation-like functions over a set of nodes. Unfortunately, this flexibility and generality are not free. A generic implementation will typically be 40% slower than a custom hand-coded solution. If speed is the most important consideration, then you may want to consider an optimized hand-coded solution. However, if you need to rapidly implement a complex piece of XSLT that performs a wide variety of aggregation operations, a generic solution will speed up development substantially.[1] One of the tricks to getting the most mileage out of this approach is to have many common generic element and aggregation functions ready to use. In the actual implementation of aggregation.xslt, I have all the functions from this example (and several others). You can access the complete code at the book's web site (http://www.oreilly.com/catalog/xsltckbk).

[1] Not everyone would agree with this assessment. In fact, some would argue that using this approach slows down development because of the complexity caused by extra levels of indirection. However, repeated usage often makes the complex appear idiomatic. For example, recall how you felt when you first struggled with vanilla XSLT.
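For contrast, a hand-coded equivalent of two of the aggregations from Example 16-1 might look like the sketch below. The template name sum-of-squares is hypothetical, and the stylesheet is only meant to show what the trade-off looks like in code: there is no tag lookup and no apply-templates dispatch per node, but each new aggregate needs its own template.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hand-coded sum of squares: the square and the addition are
       written directly into the recursion -->
  <xsl:template name="sum-of-squares">
    <xsl:param name="nodes"/>
    <xsl:param name="accum" select="0"/>
    <xsl:choose>
      <xsl:when test="$nodes">
        <xsl:call-template name="sum-of-squares">
          <xsl:with-param name="nodes" select="$nodes[position() != 1]"/>
          <xsl:with-param name="accum"
                          select="$accum + $nodes[1] * $nodes[1]"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$accum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="numbers">
    <results>
      <!-- A plain sum needs no recursion at all in XPath 1.0 -->
      <sum><xsl:value-of select="sum(number)"/></sum>
      <sumSq>
        <xsl:call-template name="sum-of-squares">
          <xsl:with-param name="nodes" select="number"/>
        </xsl:call-template>
      </sumSq>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

Run against the numbers 1, 2, 3, this should give the same sum (6) and sumSq (14) as the generic version in Example 16-1.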

In cases when the aggregate function is not symmetric, you might need to aggregate over a node list in reverse order. This aggregation requires only a minor change to the generic aggregation function:

  <xsl:template name="generic:reverse-aggregation">
    <xsl:param name="nodes"/>
    <xsl:param name="aggr-func" select=" 'sum' "/>
    <xsl:param name="func" select=" 'identity' "/>
    <xsl:param name="func-param1"
               select="$generic:generics[self::generic:func and
                                         @name = $func]/@param1"/>
    <xsl:param name="i" select="1"/>
    <xsl:param name="accum"
               select="$generic:generics[self::generic:aggr-func and
                                         @name = $aggr-func]/@identity"/>

    <xsl:choose>
      <xsl:when test="$nodes">
        <!-- Compute func($x), taking $x from the end of the node list -->
        <xsl:variable name="f-of-x">
          <xsl:apply-templates
               select="$generic:generics[self::generic:func and
                                         @name = $func]">
            <xsl:with-param name="x" select="$nodes[last()]"/>
            <xsl:with-param name="i" select="$i"/>
            <!-- Passed for consistency with generic:aggregation -->
            <xsl:with-param name="param1" select="$func-param1"/>
          </xsl:apply-templates>
        </xsl:variable>

        <!-- Aggregate current $f-of-x with $accum -->
        <xsl:variable name="temp">
          <xsl:apply-templates
               select="$generic:generics[self::generic:aggr-func and
                                         @name = $aggr-func]">
            <xsl:with-param name="x" select="$f-of-x"/>
            <xsl:with-param name="accum" select="$accum"/>
            <xsl:with-param name="i" select="$i"/>
          </xsl:apply-templates>
        </xsl:variable>

        <!-- Recurse on all nodes except the last -->
        <xsl:call-template name="generic:reverse-aggregation">
          <xsl:with-param name="nodes"
                          select="$nodes[position()!=last()]"/>
          <xsl:with-param name="aggr-func" select="$aggr-func"/>
          <xsl:with-param name="func" select="$func"/>
          <xsl:with-param name="func-param1" select="$func-param1"/>
          <xsl:with-param name="i" select="$i + 1"/>
          <xsl:with-param name="accum" select="$temp"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$accum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
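String concatenation is a convenient way to see the effect of the reversed order, because the concat aggregator from the earlier strings example is not symmetric. The following is a minimal sketch, assuming generic:reverse-aggregation has been added to aggregation.xslt alongside generic:aggregation. The result element names forward and reverse are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:generic="http://www.ora.com/XSLTCookbook/namespaces/generic"
  exclude-result-prefixes="generic">

  <!-- Assumption: aggregation.xslt also contains generic:reverse-aggregation -->
  <xsl:import href="aggregation.xslt"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Make the locally defined aggregator visible to the generic code -->
  <xsl:variable name="generic:generics"
                select="$generic:public-generics | document('')/*/generic:*"/>

  <!-- Same concat aggregator as in the strings example -->
  <generic:aggr-func name="concat" identity=""/>
  <xsl:template match="generic:aggr-func[@name='concat']">
    <xsl:param name="x"/>
    <xsl:param name="accum"/>
    <xsl:value-of select="concat($accum,$x)"/>
  </xsl:template>

  <xsl:template match="strings">
    <results>
      <!-- First node is aggregated first -->
      <forward>
        <xsl:call-template name="generic:aggregation">
          <xsl:with-param name="nodes" select="string"/>
          <xsl:with-param name="aggr-func" select=" 'concat' "/>
        </xsl:call-template>
      </forward>
      <!-- Last node is aggregated first -->
      <reverse>
        <xsl:call-template name="generic:reverse-aggregation">
          <xsl:with-param name="nodes" select="string"/>
          <xsl:with-param name="aggr-func" select=" 'concat' "/>
        </xsl:call-template>
      </reverse>
    </results>
  </xsl:template>

</xsl:stylesheet>
```

With the strings input shown earlier in the Discussion, the forward call should produce camelthroughtheeyeofneedle and the reverse call needleofeyethethroughcamel.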

See Also

FXSL (see the "See Also" section of this chapter's introduction) has fold and foldr functions that are similar to generic:aggregation and generic:reverse-aggregation, respectively.




XSLT Cookbook
XSLT Cookbook: Solutions and Examples for XML and XSLT Developers, 2nd Edition
ISBN: 0596009747
EAN: 2147483647
Year: 2003
Pages: 208
Authors: Sal Mangano
