Recipe 9.5. Joins

Problem

You want to relate elements in a document to other elements in the same document or in a different one.

Solution

A join considers every pair of elements drawn from two sets (i.e., their Cartesian product) and keeps only those pairs that satisfy the join relationship (usually equality).

To demonstrate, I have adapted the supplier-parts database found in Date's An Introduction to Database Systems (Addison-Wesley, 1986) to XML:

<database>
  <suppliers>
    <supplier name="Smith" status="20" city="London"/>
    <supplier name="Jones" status="10" city="Paris"/>
    <supplier name="Blake" status="30" city="Paris"/>
    <supplier name="Clark" status="20" city="London"/>
    <supplier name="Adams" status="30" city="Athens"/>
  </suppliers>
  <parts>
    <part name="Nut" color="Red" weight="12" city="London"/>
    <part name="Bult" color="Green" weight="17" city="Paris"/>
    <part name="Screw" color="Blue" weight="17" city="Rome"/>
    <part name="Screw" color="Red" weight="14" city="London"/>
    <part name="Cam" color="Blue" weight="12" city="Paris"/>
    <part name="Cog" color="Red" weight="19" city="London"/>
  </parts>
  <inventory>
    <invrec s p qty="300"/>
    <invrec s p qty="200"/>
    <invrec s p qty="400"/>
    <invrec s p qty="200"/>
    <invrec s p qty="100"/>
    <invrec s p qty="100"/>
    <invrec s p qty="300"/>
    <invrec s p qty="400"/>
    <invrec s p qty="200"/>
    <invrec s p qty="200"/>
    <invrec s p qty="300"/>
    <invrec s p qty="400"/>
  </inventory>
</database>

The join to be performed will answer the question, "Which suppliers and parts are in the same city (co-located)?"

You can use two basic techniques to approach this problem in XSLT. The first uses nested for-each loops:

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="supplier" select="."/>
      <xsl:for-each select="/database/parts/*[@city=current()/@city]">
        <colocated>
          <xsl:copy-of select="$supplier"/>
          <xsl:copy-of select="."/>
        </colocated>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

The second approach uses apply-templates:

<xsl:template match="/">
  <result>
    <xsl:apply-templates select="database/suppliers/supplier"/>
  </result>
</xsl:template>

<xsl:template match="supplier">
  <xsl:apply-templates select="/database/parts/part[@city = current()/@city]">
    <xsl:with-param name="supplier" select="."/>
  </xsl:apply-templates>
</xsl:template>

<xsl:template match="part">
  <xsl:param name="supplier" select="/.."/>
  <colocated>
    <xsl:copy-of select="$supplier"/>
    <xsl:copy-of select="."/>
  </colocated>
</xsl:template>

If one of the sets of elements to be joined has a large number of members, then consider using xsl:key to improve performance:

<xsl:key name="part-city" match="part" use="@city"/>

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="supplier" select="."/>
      <xsl:for-each select="key('part-city',$supplier/@city)">
        <colocated>
          <xsl:copy-of select="$supplier"/>
          <xsl:copy-of select="."/>
        </colocated>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

Each stylesheet produces the same result:

<result>
  <colocated>
    <supplier name="Smith" status="20" city="London"/>
    <part name="Nut" color="Red" weight="12" city="London"/>
  </colocated>
  <colocated>
    <supplier name="Smith" status="20" city="London"/>
    <part name="Screw" color="Red" weight="14" city="London"/>
  </colocated>
  <colocated>
    <supplier name="Smith" status="20" city="London"/>
    <part name="Cog" color="Red" weight="19" city="London"/>
  </colocated>
  <colocated>
    <supplier name="Jones" status="10" city="Paris"/>
    <part name="Bult" color="Green" weight="17" city="Paris"/>
  </colocated>
  <colocated>
    <supplier name="Jones" status="10" city="Paris"/>
    <part name="Cam" color="Blue" weight="12" city="Paris"/>
  </colocated>
  <colocated>
    <supplier name="Blake" status="30" city="Paris"/>
    <part name="Bult" color="Green" weight="17" city="Paris"/>
  </colocated>
  <colocated>
    <supplier name="Blake" status="30" city="Paris"/>
    <part name="Cam" color="Blue" weight="12" city="Paris"/>
  </colocated>
  <colocated>
    <supplier name="Clark" status="20" city="London"/>
    <part name="Nut" color="Red" weight="12" city="London"/>
  </colocated>
  <colocated>
    <supplier name="Clark" status="20" city="London"/>
    <part name="Screw" color="Red" weight="14" city="London"/>
  </colocated>
  <colocated>
    <supplier name="Clark" status="20" city="London"/>
    <part name="Cog" color="Red" weight="19" city="London"/>
  </colocated>
</result>
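The Problem statement also mentions relating elements across documents, which the stylesheets above do not cover. The following is a minimal sketch of the keyed join in that setting. It assumes the parts have been moved to a separate file, hypothetically named parts.xml, whose root element is <parts>, while the suppliers stay in the source document. The file name and the parts-doc variable are illustrative only.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: parts.xml is a hypothetical second document that holds
       the <part> elements; the URI is resolved relative to the stylesheet. -->
  <xsl:variable name="parts-doc" select="document('parts.xml')"/>

  <xsl:key name="part-city" match="part" use="@city"/>

  <xsl:template match="/">
    <result>
      <xsl:for-each select="database/suppliers/supplier">
        <xsl:variable name="supplier" select="."/>
        <!-- In XSLT 1.0, key() searches only the document that contains
             the context node, so switch the context to the second
             document before calling it. -->
        <xsl:for-each select="$parts-doc">
          <xsl:for-each select="key('part-city', $supplier/@city)">
            <colocated>
              <xsl:copy-of select="$supplier"/>
              <xsl:copy-of select="."/>
            </colocated>
          </xsl:for-each>
        </xsl:for-each>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

The extra xsl:for-each over $parts-doc exists only to change the context document. If you do not need the key, a plain path such as $parts-doc/parts/part[@city = $supplier/@city] should give the same pairs under the same assumptions.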

Discussion

XSLT 1.0

The join you performed is called an equi-join because the elements are related by equality. More generally, joins can be formed using other relations. For example, consider the query, "Select all combinations of supplier and part information for which the supplier city follows the part city in alphabetical order."

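It would be nice if you could simply write the following stylesheet, but XPath 1.0 defines the relational operators (<, <=, >, >=) only for numbers. It converts both operands to numbers, so comparing two city names compares NaN with NaN and is always false: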

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="supplier" select="."/>
      <!-- This does not work! -->
      <xsl:for-each select="/database/parts/*[current()/@city > @city]">
        <colocated>
          <xsl:copy-of select="$supplier"/>
          <xsl:copy-of select="."/>
        </colocated>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

Instead, you must use xsl:sort to build a lookup table that maps each city name to an integer reflecting its position in the sort order. The stylesheet here relies on Saxon's ability to treat a variable containing a result-tree fragment as a node set when the version is set to 1.1. Alternatively, you can use the node-set extension function of your particular XSLT 1.0 processor or use an XSLT 2.0 processor:

<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:variable name="unique-cities"
    select="//@city[not(. = ../preceding::*/@city)]"/>

<xsl:variable name="city-ordering">
  <xsl:for-each select="$unique-cities">
    <xsl:sort select="."/>
    <city name="{.}" order="{position()}"/>
  </xsl:for-each>
</xsl:variable>

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="s" select="."/>
      <xsl:for-each select="/database/parts/*">
        <xsl:variable name="p" select="."/>
        <xsl:if
            test="$city-ordering/*[@name = $s/@city]/@order &gt;
                  $city-ordering/*[@name = $p/@city]/@order">
          <supplier-city-follows-part-city>
            <xsl:copy-of select="$s"/>
            <xsl:copy-of select="$p"/>
          </supplier-city-follows-part-city>
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

</xsl:stylesheet>

This query results in the following output:

<result>
  <supplier-city-follows-part-city>
    <supplier name="Jones" status="10" city="Paris"/>
    <part name="Nut" color="Red" weight="12" city="London"/>
  </supplier-city-follows-part-city>
  <supplier-city-follows-part-city>
    <supplier name="Jones" status="10" city="Paris"/>
    <part name="Screw" color="Red" weight="14" city="London"/>
  </supplier-city-follows-part-city>
  <supplier-city-follows-part-city>
    <supplier name="Jones" status="10" city="Paris"/>
    <part name="Cog" color="Red" weight="19" city="London"/>
  </supplier-city-follows-part-city>
  <supplier-city-follows-part-city>
    <supplier name="Blake" status="30" city="Paris"/>
    <part name="Nut" color="Red" weight="12" city="London"/>
  </supplier-city-follows-part-city>
  <supplier-city-follows-part-city>
    <supplier name="Blake" status="30" city="Paris"/>
    <part name="Screw" color="Red" weight="14" city="London"/>
  </supplier-city-follows-part-city>
  <supplier-city-follows-part-city>
    <supplier name="Blake" status="30" city="Paris"/>
    <part name="Cog" color="Red" weight="19" city="London"/>
  </supplier-city-follows-part-city>
</result>
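For processors that do not honor the version 1.1 convention, the same query can be sketched with an explicit node-set conversion. This version assumes your XSLT 1.0 processor implements the EXSLT common module (the exsl:node-set function in the http://exslt.org/common namespace). Other processors expose an equivalent function under their own namespace, so treat the prefix and namespace here as something to adapt. The variable name city-ordering-rtf is introduced only for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                exclude-result-prefixes="exsl">

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:variable name="unique-cities"
                select="//@city[not(. = ../preceding::*/@city)]"/>

  <!-- In XSLT 1.0 this variable holds a result-tree fragment,
       which cannot be navigated with a path expression. -->
  <xsl:variable name="city-ordering-rtf">
    <xsl:for-each select="$unique-cities">
      <xsl:sort select="."/>
      <city name="{.}" order="{position()}"/>
    </xsl:for-each>
  </xsl:variable>

  <!-- Assumption: the processor supports exsl:node-set, which turns the
       fragment into a node set whose children are the <city> elements. -->
  <xsl:variable name="city-ordering"
                select="exsl:node-set($city-ordering-rtf)"/>

  <xsl:template match="/">
    <result>
      <xsl:for-each select="database/suppliers/*">
        <xsl:variable name="s" select="."/>
        <xsl:for-each select="/database/parts/*">
          <xsl:variable name="p" select="."/>
          <!-- The @order values are compared as numbers, which
               XPath 1.0 does support. -->
          <xsl:if
              test="$city-ordering/*[@name = $s/@city]/@order &gt;
                    $city-ordering/*[@name = $p/@city]/@order">
            <supplier-city-follows-part-city>
              <xsl:copy-of select="$s"/>
              <xsl:copy-of select="$p"/>
            </supplier-city-follows-part-city>
          </xsl:if>
        </xsl:for-each>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Only the variable declarations differ from the version 1.1 stylesheet; the template is unchanged because $city-ordering is a node set in both cases.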

XSLT 2.0

In XSLT 2.0, the comparison operators compare untyped attribute values such as these city names as strings, so the simpler form works:

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="supplier" select="."/>
      <!-- This is okay in 2.0 -->
      <xsl:for-each select="/database/parts/*[current()/@city > @city]">
        <colocated>
          <xsl:copy-of select="$supplier"/>
          <xsl:copy-of select="."/>
        </colocated>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>
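As a complete-stylesheet variant, here is a minimal sketch that makes the string comparison explicit with the standard compare() function rather than the > operator. It assumes an XSLT 2.0 processor and a stylesheet that declares version="2.0". With version="1.0", a 2.0 processor runs in backwards-compatible mode, where the > operator is expected to fall back to the numeric comparison described earlier. The result element name is borrowed from the XSLT 1.0 discussion above.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <result>
      <xsl:for-each select="database/suppliers/supplier">
        <xsl:variable name="supplier" select="."/>
        <!-- compare() returns 1 when its first argument sorts after its
             second under the default collation, 0 when they are equal,
             and -1 otherwise. -->
        <xsl:for-each
            select="/database/parts/part[compare($supplier/@city, @city) gt 0]">
          <supplier-city-follows-part-city>
            <!-- XSLT 2.0 allows a sequence expression in select -->
            <xsl:copy-of select="$supplier, ."/>
          </supplier-city-follows-part-city>
        </xsl:for-each>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

compare() also accepts a collation URI as an optional third argument, which may be useful when the default collation does not give the alphabetical order you expect.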




XSLT Cookbook
XSLT Cookbook: Solutions and Examples for XML and XSLT Developers, 2nd Edition
ISBN: 0596009747
Year: 2003
Pages: 208
Authors: Sal Mangano
