Handling Whitespace

Handling whitespace is always an involved topic in XSLT. Inserting a single space, " ", isn't difficult if you use the <xsl:text> element, which inserts text directly into the output document. This element has only one attribute: disable-output-escaping. Set this attribute to "yes" to have characters like < and > output literally rather than as &lt; and &gt;. The default is "no".

This element can only contain a text node. Here's an example where we're using <xsl:text> to insert a space between an element value and the element's units:

 
<xsl:template match="mass">
    <xsl:value-of select="."/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="@units"/>
</xsl:template>

Using <xsl:text> explicitly like this lets you insert whitespace into the output document; otherwise, the XSLT processor would strip a whitespace-only text node like this one from the stylesheet by default. You can use this element to insert any text in the output document, not just whitespace, but because non-whitespace text is copied by default anyway, this element is most often used to handle whitespace.
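As a rough illustration of the disable-output-escaping attribute described earlier, here is a minimal sketch. The <note> element and the <p> wrapper are hypothetical names chosen for the example, and the stylesheet declares version 1.0 on the assumption that you're using an XSLT 1.0 processor. Keep in mind that a processor is not required to support disabling output escaping, so treat the second case as something to test with your own processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>

    <!-- Hypothetical template: the note element name is an assumption -->
    <xsl:template match="note">
        <p>
            <!-- Default escaping: serialized as &lt;b&gt; -->
            <xsl:text>&lt;b&gt;</xsl:text>
            <!-- Escaping disabled: serialized as a literal <b>,
                 provided the processor honors the request -->
            <xsl:text disable-output-escaping="yes">&lt;b&gt;</xsl:text>
        </p>
    </xsl:template>
</xsl:stylesheet>
```

Because the second form can produce output that is not well-formed, it is usually worth reserving for cases where no other technique will do.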

Formally speaking, whitespace nodes are text nodes that contain only whitespace (that is, spaces, carriage returns, line feeds, and tabs). These nodes are copied by default when they come from the source document. However, you can also have whitespace nodes in your stylesheets, as here:

 
<xsl:template match="planets">
    <xsl:copy>
        <xsl:apply-templates select="planet"/>
    </xsl:copy>
</xsl:template>

Here, we're using spaces to indent the stylesheet elements, as well as carriage returns, to spread things out. Pure whitespace nodes like these are not copied from the stylesheet to the output document.

Note, however, that the whitespace in this <TITLE> element in the stylesheet will be copied to the output, because it's not in a pure whitespace node (the text node also contains the text "My Summer Vacation"):

 
<xsl:template match="/data">
    <HTML>
        <HEAD>
            <TITLE>  My Summer Vacation  </TITLE>
            .
            .
            .

If you want to eliminate that whitespace, you can use empty <xsl:text> elements to split the text node, so that the leading and trailing whitespace become pure whitespace nodes, which are then stripped:

 
<xsl:template match="/data">
    <HTML>
        <HEAD>
            <TITLE>  <xsl:text/>My Summer Vacation<xsl:text/>  </TITLE>
            .
            .
            .

Pure whitespace nodes are not copied from the stylesheet to the output document unless they're inside an <xsl:text> element, or an enclosing element has the xml:space attribute set to "preserve".
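The three cases in the preceding rule can be sketched side by side. This is only an illustration: the <stripped>, <kept>, and <viaText> element names are made up for the example, and the template assumes a source document with <planet> elements that have <name> children, like the planets documents used in this chapter.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>

    <xsl:template match="planet">
        <!-- Stripped: the spaces around xsl:value-of are pure
             whitespace nodes in the stylesheet -->
        <stripped>  <xsl:value-of select="name"/>  </stripped>

        <!-- Kept: the enclosing element sets xml:space to preserve.
             The xml:space attribute itself is also copied to the result. -->
        <kept xml:space="preserve">  <xsl:value-of select="name"/>  </kept>

        <!-- Kept: the spaces are the content of an xsl:text element -->
        <viaText><xsl:text>  </xsl:text><xsl:value-of select="name"/></viaText>
    </xsl:template>
</xsl:stylesheet>
```

Of the two ways to keep the whitespace, <xsl:text> is usually the less intrusive one, because it doesn't add an attribute to the result document.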

On the other hand, by default, XSLT 1.0 preserves whitespace text nodes in the source document and copies them to the result document. That's what's happening in the copying stylesheet that we've already seen:

 
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>
    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

When you apply this stylesheet to ch05_01.xml, all the whitespace we've used in ch05_01.xml is copied over to the result document as well:

 
<?xml version="1.0" encoding="UTF-8"?>
<planets>
    <planet>
        <name>Mercury</name>
        <mass units="(Earth = 1)">.0553</mass>
        <day units="days">58.65</day>
        <radius units="miles">1516</radius>
        <density units="(Earth = 1)">.983</density>
        <distance units="million miles">43.4</distance><!--At perihelion-->
    </planet>
        .
        .
        .

However, there are times you want to remove the whitespace used to format input documents, and you can do that with the <xsl:strip-space> element. There is only one attribute for this element: elements, which is mandatory and which specifies the elements whose whitespace-only child text nodes should be stripped. You set this attribute to a whitespace-separated list of names or names with wildcards. This element contains no content.
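As a small sketch of the whitespace-separated list form, the following stylesheet names specific elements instead of using a wildcard. It assumes a document shaped like ch05_01.xml, where the formatting whitespace sits directly inside <planets> and <planet>; whitespace-only text nodes that are children of any other element would be left alone.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Strip whitespace-only text nodes whose parent is planets or planet -->
    <xsl:strip-space elements="planets planet"/>
    <xsl:output method="xml"/>

    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```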

You can see an example that strips all whitespace nodes from ch05_01.xml using <xsl:strip-space elements="*"/> in ch05_09.xsl in Listing 5.8.

Listing 5.8 An XSLT Stylesheet That Copies Elements and Attributes (ch05_09.xsl)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml"/>
    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Here's the result document you get when you apply this stylesheet to ch05_01.xml. Note that all whitespace has been stripped out, including all carriage returns (a few had to be added to fit this result on the page; the actual result is just one long string):

 
<?xml version="1.0" encoding="UTF-8"?>
<planets><planet><name>Mercury</name><mass>.0553</mass><day>58.65</day><radius>
1516</radius><density>.983</density><distance>43.4</distance></planet><planet>
<name>Venus</name><mass>.815</mass><day>116.75</day><radius>3716</radius>
<density>.943</density><distance>66.8</distance></planet><planet><name>Earth
</name><mass>1</mass><day>1</day><radius>2107</radius><density>1</density>
<distance>128.4</distance></planet></planets>
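Notice that the units attributes are missing from this result: a template that matches "*" handles only elements, and <xsl:apply-templates/> with no select attribute processes only child nodes. If you also want attributes (and comments) carried over, one common variation is the identity template sketched below. This is not the book's listing, just a familiar XSLT 1.0 idiom combined with the same <xsl:strip-space> element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Remove whitespace-only text nodes from every source element -->
    <xsl:strip-space elements="*"/>
    <xsl:output method="xml"/>

    <!-- Identity template: copies each attribute and node,
         then recurses into its attributes and children -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```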

On the other hand, you might not want to remove all the whitespace nodes throughout a document, and you can use the <xsl:preserve-space> element to indicate which elements you want to preserve whitespace nodes in. This element has the same attribute as <xsl:strip-space>, elements.

What this means is that if you've used <xsl:strip-space> , you can still indicate what element or elements you want whitespace nodes preserved in by setting the elements attribute in <xsl:preserve-space> to a list of elements like this:

 
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:preserve-space elements="name distance"/>
    <xsl:output method="xml"/>
    <xsl:template match="*">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

PRESERVING WHITESPACE BY DEFAULT

Preserving whitespace is actually the default for all elements in XSLT; in other words, whitespace nodes from the input document are preserved unless you explicitly strip them.


There's also an easy way to work with whitespace if you just want to indent the result document. The <xsl:output> element supports an attribute called indent, which you can set to "yes" or "no" to tell the XSLT processor whether you want the result document indented.

Often, indenting the result document doesn't matter very much, because that document is targeted to an application that doesn't care about indenting, such as a browser. But there are times when you'd like to view the result document as straight text, and in such cases, indenting that document can help.

How an XSLT processor uses the indent attribute varies by processor, because the W3C recommendation permits, but does not require, a processor to add whitespace when indent is set to "yes". Say, for example, that you have a version of ch05_01.xml without any indentation at all, which appears in ch05_10.xml in Listing 5.9.

Listing 5.9 An XML Document with No Indentation (ch05_10.xml)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xml" href="planets.xsl"?>
<planets>
<planet>
<name>Mercury</name>
<mass units="(Earth = 1)">.0553</mass>
<day units="days">58.65</day>
<radius units="miles">1516</radius>
<density units="(Earth = 1)">.983</density>
<distance units="million miles">43.4</distance><!--At perihelion-->
</planet>
<planet>
<name>Venus</name>
<mass units="(Earth = 1)">.815</mass>
<day units="days">116.75</day>
<radius units="miles">3716</radius>
<density units="(Earth = 1)">.943</density>
<distance units="million miles">66.8</distance><!--At perihelion-->
</planet>
<planet>
<name>Earth</name>
<mass units="(Earth = 1)">1</mass>
<day units="days">1</day>
<radius units="miles">2107</radius>
<density units="(Earth = 1)">1</density>
<distance units="million miles">128.4</distance><!--At perihelion-->
</planet>
</planets>

To indent this document, you can use an XSLT processor that supports the <xsl:output indent="yes"/> element. A stylesheet that uses this element appears in ch05_11.xsl in Listing 5.10.

Listing 5.10 Using <xsl:output indent="yes"/> (ch05_11.xsl)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.1"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output indent="yes"/>
 <xsl:template match="/planets">
<HTML>
<HEAD>
<TITLE>
          The Planets Table
      </TITLE>
</HEAD>
<BODY>
<H1>
          The Planets Table
      </H1>
<TABLE BORDER="2">
<TD>Name</TD>
<TD>Mass</TD>
<TD>Radius</TD>
<TD>Day</TD>
<xsl:apply-templates/>
</TABLE>
</BODY>
</HTML>
</xsl:template>
<xsl:template match="planet">
<TR>
<TD><xsl:value-of select="name"/></TD>
<TD><xsl:value-of select="mass"/></TD>
<TD><xsl:value-of select="radius"/></TD>
<TD><xsl:value-of select="day"/></TD>
</TR>
</xsl:template>
</xsl:stylesheet>

Xalan doesn't indent documents this way, but the Saxon XSLT processor will. Here's the result using Saxon, indented as we wanted:

 
<HTML>
   <HEAD>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <TITLE>
         The Planets Table
      </TITLE>
   </HEAD>
   <BODY>
      <H1>
         The Planets Table
      </H1>
      <TABLE BORDER="2">
         <TD>Name</TD>
         <TD>Mass</TD>
         <TD>Radius</TD>
         <TD>Day</TD>
         <TR>
            <TD>Mercury</TD>
            <TD>.0553</TD>
            <TD>1516</TD>
            <TD>58.65</TD>
         </TR>
         <TR>
            <TD>Venus</TD>
            <TD>.815</TD>
            <TD>3716</TD>
            <TD>116.75</TD>
         </TR>
         <TR>
            <TD>Earth</TD>
            <TD>1</TD>
            <TD>2107</TD>
            <TD>1</TD>
         </TR>
      </TABLE>
   </BODY>
</HTML>
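If what you want is a re-indented XML document rather than HTML, one possible approach is to combine the two techniques from this section: strip the existing formatting whitespace with <xsl:strip-space>, and then ask the processor to indent the result. The stylesheet below is a sketch under that assumption, not one of the book's listings. Because indentation is left to the processor, the exact layout (or whether any indentation is added at all) depends on the processor you run it with.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Discard the source document's own formatting whitespace -->
    <xsl:strip-space elements="*"/>
    <!-- Ask the processor to indent the serialized result -->
    <xsl:output method="xml" indent="yes"/>

    <!-- Copy everything else unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Stripping first matters because a processor that indents typically adds whitespace only where the result has none, so leftover line breaks from the source can otherwise produce uneven output.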

As you can see, handling whitespace takes a little bit of thought in XSLT, but it's easier if you know what's going on.

People often use XSLT to work with the data in XML documents without resorting to programming, but in fact, you can do a bit of programming using XSLT when you use the <xsl:if> and <xsl:choose> elements. We'll take a look at these XSLT elements next.



XPath Kick Start: Navigating XML with XPath 1.0 and 2.0
ISBN: 0672324113
Year: 2002
Pages: 131
