Recipe 8.7. Flattening an XML Hierarchy

Problem

You have a document with elements organized in a more deeply nested fashion than you would prefer. You want to flatten the tree.

Solution

If your goal is simply to flatten without regard to the information encoded by the deeper structure, then you need to apply an overriding copy. The overriding template must match the elements you wish to discard and apply templates without copying.
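The stylesheets in this recipe import a file named copy.xslt whose contents are not shown here. A minimal sketch of what such a file might contain, assuming it is the usual XSLT 1.0 identity transform, is given below. Templates in the importing stylesheet have higher import precedence, so they override this template for the nodes they match.

```xml
<!-- copy.xslt: hypothetical contents; the recipe imports this file but does not show it -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute, then recurse -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```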

Consider the following input, which segregates people into two categories, salaried and union:

<people>
  <union>
    <person>
      <firstname>Warren</firstname>
      <lastname>Rosenbaum</lastname>
      <age>37</age>
      <height>5.75</height>
    </person>
    <person>
      <firstname>Dror</firstname>
      <lastname>Seagull</lastname>
      <age>28</age>
      <height>5.10</height>
    </person>
    <person>
      <firstname>Mike</firstname>
      <lastname>Heavyman</lastname>
      <age>45</age>
      <height>6.0</height>
    </person>
    <person>
      <firstname>Theresa</firstname>
      <lastname>Archul</lastname>
      <age>37</age>
      <height>5.5</height>
    </person>
  </union>
  <salaried>
    <person>
      <firstname>Sal</firstname>
      <lastname>Mangano</lastname>
      <age>37</age>
      <height>5.75</height>
    </person>
    <person>
      <firstname>Jane</firstname>
      <lastname>Smith</lastname>
      <age>28</age>
      <height>5.10</height>
    </person>
    <person>
      <firstname>Rick</firstname>
      <lastname>Winters</lastname>
      <age>45</age>
      <height>6.0</height>
    </person>
    <person>
      <firstname>James</firstname>
      <lastname>O'Riely</lastname>
      <age>33</age>
      <height>5.5</height>
    </person>
  </salaried>
</people>

This stylesheet simply discards the extra structure:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="copy.xslt"/>

  <xsl:output method="xml" version="1.0" encoding="UTF-8"/>

  <xsl:template match="people">
    <xsl:copy>
      <!-- discard parents of person elements -->
      <xsl:apply-templates select="*/person"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
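To make the effect concrete, here is a sketch of the result for the sample input, assuming copy.xslt contains an identity transform (an assumption; the file is not shown in this recipe). The fragment is abbreviated and re-indented for readability, so the exact whitespace produced by a given processor may differ.

```xml
<people>
  <person>
    <firstname>Warren</firstname>
    <lastname>Rosenbaum</lastname>
    <age>37</age>
    <height>5.75</height>
  </person>
  <!-- the three remaining persons from union follow, in document order -->
  <person>
    <firstname>Sal</firstname>
    <lastname>Mangano</lastname>
    <age>37</age>
    <height>5.75</height>
  </person>
  <!-- the three remaining persons from salaried follow -->
</people>
```

The union and salaried wrappers are gone, and nothing in the output records which group a person came from.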

Discussion

Having additional structure in a document is generally good because it usually makes the document easier to process with XSLT. However, too much structure bloats the document and makes it harder for people to understand. Humans generally prefer to infer relationships by spatial text organization rather than with extra syntactic baggage.

In the example input, the extra structure is not superfluous: it records whether each person is union or salaried. If you want to retain that information while flattening, you should probably create an attribute or child element to capture it.

This stylesheet creates an attribute:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="copy.xslt"/>

  <xsl:output method="xml" version="1.0" encoding="UTF-8"
    omit-xml-declaration="yes"/>

  <!-- discard parents of person elements -->
  <xsl:template match="*[person]">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="person">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="class">
        <xsl:value-of select="local-name(..)"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
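For the sample input, the attribute version would produce output along the following lines, again assuming copy.xslt is an identity transform. The fragment is abbreviated and re-indented; actual whitespace depends on the processor and on the whitespace in the input.

```xml
<people>
  <person class="union">
    <firstname>Warren</firstname>
    <lastname>Rosenbaum</lastname>
    <age>37</age>
    <height>5.75</height>
  </person>
  <!-- remaining union persons, each with class="union" -->
  <person class="salaried">
    <firstname>Sal</firstname>
    <lastname>Mangano</lastname>
    <age>37</age>
    <height>5.75</height>
  </person>
  <!-- remaining salaried persons, each with class="salaried" -->
</people>
```

The value of class comes from local-name(..), so it is simply the name of whichever element contained the person in the input.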

This variation creates an element:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="copy.xslt"/>

  <xsl:strip-space elements="*"/>

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- discard parents of person elements -->
  <xsl:template match="*[person]">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="person">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:element name="class">
        <xsl:value-of select="local-name(..)"/>
      </xsl:element>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

This variation uses xsl:strip-space and sets indent="yes" on the xsl:output element. Without them, the output contains a whitespace gap where the discarded parent elements used to be, as shown here:

<people>
...
    <person>
      <class>union</class>
      <firstname>Warren</firstname>
      <lastname>Rosenbaum</lastname>
      <age>37</age>
      <height>5.75</height>
    </person>
                                      <-- Whitespace gap here!
    <person>
      <class>salaried</class>
      <firstname>Sal</firstname>
      <lastname>Mangano</lastname>
      <age>37</age>
      <height>5.75</height>
    </person>
...
</people>
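The gap comes from whitespace-only text nodes that sat around and inside the discarded union and salaried elements and are still copied to the output. If stripping whitespace from every element is more than you want, one possible variation is to name only the container elements in xsl:strip-space. The sketch below is written for the sample input's element names and assumes copy.xslt is an identity transform; it is a suggestion, not part of the original recipe.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- copy.xslt is assumed to hold an identity transform -->
  <xsl:import href="copy.xslt"/>

  <!-- Strip whitespace-only text nodes only under the container elements -->
  <xsl:strip-space elements="people union salaried"/>

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Discard parents of person elements -->
  <xsl:template match="*[person]">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Copy each person and record the name of its former parent -->
  <xsl:template match="person">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:element name="class">
        <xsl:value-of select="local-name(..)"/>
      </xsl:element>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Whitespace inside each person element is left alone here, so how the final output is indented still depends on the processor's handling of indent="yes".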

XSLT Cookbook: Solutions and Examples for XML and XSLT Developers, 2nd Edition
ISBN: 0596009747
EAN: 2147483647
Year: 2003
Pages: 208
Authors: Sal Mangano