5.4 String Functions

The string functions return strings, Booleans, or numbers. The XPath string functions are concat( ), contains( ) (returns a Boolean), normalize-space( ), starts-with( ) (also returns a Boolean), string( ), string-length( ) (returns a number), substring( ), substring-after( ), substring-before( ), and translate( ). The XSLT functions format-number( ), unparsed-entity-uri( ), and generate-id( ) also return strings. You saw substring( ) and string-length( ) in action in Chapter 3.
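As a quick orientation, here is a minimal sketch that calls several of these functions on literal strings. Because it matches the root node and never reads the source, it should run against any well-formed XML document. The stylesheet is my own illustration, not one of the book's example files, and the values in the comments are what the XPath 1.0 rules lead me to expect.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<!-- Match the root node so that any input document will do -->
<xsl:template match="/">
 <!-- contains( ) and starts-with( ) return Booleans -->
 <xsl:value-of select="contains('stylesheet', 'sheet')"/>       <!-- true -->
 <xsl:text>&#10;</xsl:text>
 <xsl:value-of select="starts-with('stylesheet', 'style')"/>    <!-- true -->
 <xsl:text>&#10;</xsl:text>
 <!-- string-length( ) returns a number -->
 <xsl:value-of select="string-length('stylesheet')"/>           <!-- 10 -->
 <xsl:text>&#10;</xsl:text>
 <!-- substring-before( ) and substring-after( ) split on the first match -->
 <xsl:value-of select="substring-before('poem.xml', '.')"/>     <!-- poem -->
 <xsl:text>&#10;</xsl:text>
 <xsl:value-of select="substring-after('poem.xml', '.')"/>      <!-- xml -->
 <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

With the text output method, xsl:value-of writes Booleans as the words true and false, so each call prints on its own line.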

5.4.1 The concat( ) Function

I'll demonstrate how to use concat( ) here. The file poem.xml holds a limerick written by XML mensch John Cowan:

<?xml version="1.0" encoding="UTF-8"?>

<poem>
 <line>My corporate data's a mess!</line>
 <line>It's all semi-structured, no less.</line>
 <line>But I'll be carefree</line>
 <line>Using XSLT</line>
 <line>In an XML DBMS.</line>
 <attribution>John Cowan</attribution>
</poem>

You could format the poem in any number of ways, but I'll show you one way to do it with concat( ). The stylesheet limerick.xsl does most of its work with the concat( ) function:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="poem">
 <xsl:value-of select="concat(line[1], '&#10;',
                              line[2], '&#10;',
                              '&#32;&#32;&#32;',
                              line[3], '&#10;',
                              '&#32;&#32;&#32;',
                              line[4], '&#10;',
                              line[5], '&#10;',
                              '&#9;&#9;-',
                              attribution)"/>
</xsl:template>

</xsl:stylesheet>

The concat( ) function takes two or more strings as arguments and concatenates them. In this stylesheet, concat( ) concatenates 14 strings, collecting 5 of them from line elements. It inserts whitespace directly (linefeeds, spaces, and tabs) and picks up one last string from the attribution element.

Processing poem.xml with limerick.xsl produces a nicely formatted limerick:

My corporate data's a mess!
It's all semi-structured, no less.
   But I'll be carefree
   Using XSLT
In an XML DBMS.
                -John Cowan
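The arguments to concat( ) do not have to be strings to begin with; anything else is converted to a string first. As a small sketch of that, the following variation numbers each line of the limerick. This stylesheet is a hypothetical illustration of mine, not part of the book's examples, and it assumes poem.xml as shown above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="poem">
 <xsl:for-each select="line">
  <!-- position() returns a number; concat() converts it to a string -->
  <xsl:value-of select="concat(position(), ': ', ., '&#10;')"/>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Applied to poem.xml, this should write the five lines prefixed with 1: through 5: and leave out the attribution, because the template never selects it.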

5.4.2 The normalize-space( ), translate( ), and substring( ) Functions

Now let's use normalize-space( ), translate( ), and substring( ) together to convert a Microsoft file path to a Unix one. The document path.xml contains a Microsoft path that includes a filename:

<ms>

C:\LearningXSLT\examples\ch05\path.xml

</ms>

Let's suppose you want to convert this path to Unix and get rid of linefeeds that surround the path. The fix.xsl stylesheet can do this with a single expression:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"/>

<xsl:template match="ms">
 <unix>/usr/mike<xsl:value-of select="normalize-space(translate(substring(.,5),'\','/'))"/></unix>
</xsl:template>

</xsl:stylesheet>

Processing path.xml with fix.xsl using:

xalan path.xml fix.xsl

produces:

<?xml version="1.0" encoding="UTF-8"?>
<unix>/usr/mike/LearningXSLT/examples/ch05/path.xml</unix>

The innermost function called is the XPath function substring(.,5). The single period (.) refers to the current node, ms, whose string value is the content of its child text node. (The single period is generally a synonym for the current( ) function, which returns the current node; the two can differ inside a predicate, where the context node changes.) The second argument, 5, is the position of the character at which the substring begins. Positions are counted from 1, and because the path is preceded by two linefeeds, the fifth character is the backslash that immediately follows the colon. substring( ) has an optional third argument (not shown in this example), a number that sets the length of the substring.
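To make the counting concrete, here is a minimal sketch of substring( ) with two and with three arguments. It works on a literal string, so any well-formed input document will do. The stylesheet is my own illustration rather than one of the book's files; the two calls are the ones the XPath 1.0 Recommendation itself uses as examples.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="/">
 <!-- Two arguments: from position 2 to the end of the string -->
 <xsl:value-of select="substring('12345', 2)"/>     <!-- 2345 -->
 <xsl:text>&#10;</xsl:text>
 <!-- Three arguments: from position 2, for a length of 3 characters -->
 <xsl:value-of select="substring('12345', 2, 3)"/>  <!-- 234 -->
 <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

Note that the first character is at position 1, not 0 as in many programming languages.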

The translate( ) function takes three arguments. The first is the string to be translated, in this example the lopped-off string produced by substring( ). The second is the list of characters that you want to translate; this example lists only one, \. The third lists the replacement characters, here just /. If the second and third arguments list more than one character, each character in the second argument is replaced by the character at the same position in the third.

You have to use caution with translate( ), as it is easy to swap characters that you do not intend to swap. The conversion takes place for every instance of the character in the string passed as the first argument, not just the first one.
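The following sketch shows translate( ) with more than one character in its lists. It uses literal strings only, so it should run against any well-formed document; the stylesheet and the sample path C:\temp\a.txt are hypothetical illustrations of mine. The first two calls are the examples given in the XPath 1.0 Recommendation. The second one also shows a behavior not discussed above: a character in the second argument that has no counterpart in the third is removed from the result.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="/">
 <!-- a becomes A and b becomes B; r is not listed, so it is untouched -->
 <xsl:value-of select="translate('bar', 'abc', 'ABC')"/>          <!-- BAr -->
 <xsl:text>&#10;</xsl:text>
 <!-- The hyphen has no counterpart in the third argument, so it is removed -->
 <xsl:value-of select="translate('--aaa--', 'abc-', 'ABC')"/>     <!-- AAA -->
 <xsl:text>&#10;</xsl:text>
 <!-- Every backslash in the string is replaced, not just the first -->
 <xsl:value-of select="translate('C:\temp\a.txt', '\', '/')"/>    <!-- C:/temp/a.txt -->
 <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```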


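Finally, the normalize-space( ) function normalizes the whitespace in the resulting string: it trims leading and trailing whitespace and collapses any internal run of whitespace into a single space. Here, it removes the linefeeds that follow the path.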
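If the nested expression in fix.xsl is hard to read, one way to follow it is to split it into steps. The sketch below is my own rearrangement, not one of the book's files: it stores each intermediate result in a variable (the names cut, slashed, and trimmed are hypothetical) and prints the final string between square brackets so that any stray whitespace would be visible. It assumes that the path in path.xml is preceded by exactly two linefeeds.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="ms">
 <!-- Step 1: drop the two leading linefeeds and the drive letter with its colon -->
 <xsl:variable name="cut" select="substring(., 5)"/>
 <!-- Step 2: turn every backslash into a forward slash -->
 <xsl:variable name="slashed" select="translate($cut, '\', '/')"/>
 <!-- Step 3: trim the linefeeds left over after the path -->
 <xsl:variable name="trimmed" select="normalize-space($slashed)"/>
 <xsl:value-of select="concat('[', $trimmed, ']')"/>
</xsl:template>

</xsl:stylesheet>
```

Under that assumption, the output should be [/LearningXSLT/examples/ch05/path.xml]. The single expression in fix.xsl is more compact; the variables only make each stage easier to inspect.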

5.4.3 The generate-id( ) Function

As discussed earlier in Section 5.2.1, an ID is a unique identifier in XML. The generate-id( ) function returns a string that uniquely identifies a node: within a single transformation it always returns the same string for the same node and different strings for different nodes. The stylesheet generate-id.xsl uses it to give a unique ID to each new welcome element that it creates:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:output doctype-system="welcome.dtd"/>

<xsl:template match="greet">
 <xsl:element name="greeting">
  <xsl:apply-templates select="greeting"/>
 </xsl:element>
</xsl:template>

<xsl:template match="greeting">
 <xsl:element name="welcome">
  <xsl:attribute name="xml:lang"><xsl:value-of select="@xml:lang"/></xsl:attribute>
  <xsl:attribute name="id"><xsl:value-of select="generate-id(.)"/></xsl:attribute>
  <xsl:value-of select="current()"/>
 </xsl:element>
</xsl:template>

</xsl:stylesheet>

The stylesheet also passes the xml:lang attributes and their values from the source to the result, and it creates a document type declaration that associates the result document with the DTD called welcome.dtd, shown here:

<!ELEMENT greeting (welcome+)>
<!ELEMENT welcome (#PCDATA)>
<!ATTLIST welcome id ID #REQUIRED
                  xml:lang CDATA #REQUIRED>

This DTD declares an id attribute of type ID for the generated, unique ID values. Create the result document with this command:

xalan -i 1 -o welcome.xml greet.xml generate-id.xsl

welcome.xml looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE greeting SYSTEM "welcome.dtd">
<greeting>
 <welcome xml:lang="en" >Welcome</welcome>
 <welcome xml:lang="fr" >Bienvenue</welcome>
 <welcome xml:lang="es" >Bienvenido</welcome>
 <welcome xml:lang="de" >Willkommen</welcome>
</greeting>

welcome.xml is valid with regard to welcome.dtd. The id attributes are of type ID in the DTD, and each of the id values is unique.

welcome.xsl extracts a German welcome from welcome.xml while showing its ID:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="greeting">
 <xsl:apply-templates select="welcome[lang('de')]"/>
</xsl:template>

<xsl:template match="welcome[lang('de')]">
 <xsl:text>German: </xsl:text>
 <xsl:value-of select="."/>
 <xsl:text> (ID: </xsl:text>
 <xsl:value-of select="@id"/>
 <xsl:text>)</xsl:text>
</xsl:template>

</xsl:stylesheet>

Now validate welcome.xml by using the -v option with Xalan while you transform it with welcome.xsl:

xalan -v welcome.xml welcome.xsl

Here is the result of the transformation:

German: Willkommen (ID: N003EBD80.00483778)
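The actual ID strings differ from one XSLT processor to the next, and a processor is not obliged to produce the same strings each time it runs, so it is safer to compare IDs than to depend on their spelling. The following sketch is a hypothetical check of mine, not one of the book's files. It assumes that greet.xml has a greet document element with greeting children, as the match patterns in generate-id.xsl suggest.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="greet">
 <xsl:for-each select="greeting">
  <!-- Same node asked twice: the two IDs are identical -->
  <xsl:value-of select="generate-id(.) = generate-id(current())"/>
  <xsl:text> </xsl:text>
  <!-- A node and its parent are different nodes: the IDs differ -->
  <xsl:value-of select="generate-id(.) = generate-id(..)"/>
  <xsl:text>&#10;</xsl:text>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

For each greeting element, this should print true false: the first comparison asks for the ID of the same node twice, and the second compares a greeting with its parent.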


Learning XSLT
ISBN: 0596003277
Year: 2003
Pages: 164
