Recipe2.6.Reversing a String


Recipe 2.6. Reversing a String

Problem

You need to reverse the characters of a string.

Solution

XSLT 1.0

This template reverses $input in a subtle yet effective way:

<xsl:template name="reverse">      <xsl:param name="input"/>      <xsl:variable name="len" select="string-length($input)"/>      <xsl:choose>           <!-- Strings of length less than 2 are trivial to reverse -->           <xsl:when test="$len &lt; 2">                <xsl:value-of select="$input"/>           </xsl:when>           <!-- Strings of length 2 are also trivial to reverse -->           <xsl:when test="$len = 2">                <xsl:value-of select="substring($input,2,1)"/>                <xsl:value-of select="substring($input,1,1)"/>           </xsl:when>           <xsl:otherwise>                <!-- Swap the recursive application of this template to                 the first half and second half of input -->                <xsl:variable name="mid" select="floor($len div 2)"/>                <xsl:call-template name="reverse">                     <xsl:with-param name="input"                          select="substring($input,$mid+1,$mid+1)"/>                </xsl:call-template>                <xsl:call-template name="reverse">                     <xsl:with-param name="input"                          select="substring($input,1,$mid)"/>                </xsl:call-template>           </xsl:otherwise>      </xsl:choose> </xsl:template>

XSLT 2.0

Reversing is trivial in 2.0.

<xsl:function name="ckbk:reverse">     <xsl:param name="input" as="xs:string"/>     <xsl:sequence select="codepoints-to-string(                            reverse(string-to-codepoints($input)))"/>   </xsl:function>

Discussion

XSLT 1.0

The algorithm shown in the solution is not the most obvious, but it is efficient. In fact, this algorithm successfully reverses even very large strings, whereas other more obvious algorithms either take too long or fail with a stack overflow. The basic idea behind this algorithm is to swap the first half of the string with the second half and to keep applying the algorithm to these halves recursively until you are left with strings of length two or less, at which point the reverse operation is trivial. The following example illustrates how this algorithm works. At each step, I placed a + where the string was split and concatenated.

  1. reverse("abcdef") (input)

  2. reverse(def)+reverse("abc")

  3. reverse("ef") + "d" + reverse("bc") + "a"

  4. "f" + "e" + "d" + "c" + "b" + "a"

  5. fedcba (result)

Considering more obvious XSLT implementations of reverse is instructive because they provide lessons in how and how not to implement recursive solutions in other contexts.

One of the worst algorithms is probably the one that many would think of on their first try. The idea is to swap the first and last character of the string, continue to the second and next to last, and so on until you reach the middle, at which point you are done. A C programmer might come up with this solution, since it is a perfectly efficient iterative solution in a language like C in which you can read and write individual characters of the string randomly and iteration rather than recursion is the norm. However, in XSLT you must implement this algorithm, shown in Example 2-1, in a recursive fashion, and you do not have the luxury of manipulating variables in place.

Example 2-1. A very poor implementation of reverse
<xsl:template name="reverse">         <xsl:param name="input"/>      <xsl:variable name="len" select="string-length($input)"/>      <xsl:choose>           <!-- Strings of length less than 2 are trivial to reverse -->           <xsl:when test="$len &lt; 2">                <xsl:value-of select="$input"/>           </xsl:when>           <!-- Strings of length 2 are also trivial to reverse -->           <xsl:when test="$len = 2">                <xsl:value-of select="substring($input,2,1)"/>                <xsl:value-of select="substring($input,1,1)"/>           </xsl:when>           <xsl:otherwise>                <!-- Concatenate the last + reverse(middle) + first -->                <xsl:value-of select="substring($input,$len,1)"/>                <xsl:call-template name="reverse">                     <xsl:with-param name="input"                          select="substring($input,2,$len - 2)"/>                 </xsl:call-template>                <xsl:value-of select="substring($input,1,1)"/>           </xsl:otherwise>      </xsl:choose> </xsl:template>

A major problem with this solution makes it useless for all but very short strings. The problem is that the solution is not tail recursive (see the Tail Recursion sidebar for an explanation of tail recursion). Many XSLT processors (such as Saxon) optimize for tail recursion, so you are advised to structure your code to benefit from this significant optimization. Example 2-2 makes this version of reverse tail recursive by moving only the last character in the string to the front on each recursive call. This puts the recursive call at the end and thus subject to the optimization.

Example 2-2. An inefficient tail recursive implementation
<xsl:template name="reverse">         <xsl:param name="input"/>      <xsl:variable name="len" select="string-length($input)"/>      <xsl:choose>           <!-- Strings of length less than 2 are trivial to reverse -->           <xsl:when test="$len &lt; 2">                <xsl:value-of select="$input"/>           </xsl:when>           <!-- Strings of length 2 are also trivial to reverse -->           <xsl:when test="$len = 2">                <xsl:value-of select="substring($input,2,1)"/>                <xsl:value-of select="substring($input,1,1)"/>           </xsl:when>           <!-- Concatenate the last + reverse(rest) -->           <xsl:otherwise>             <xsl:value-of select="substring($input,$len,1)"/>               <xsl:call-template name="reverse">                <xsl:with-param name="input" select="substring($input,1,$len - 1)"/>                </xsl:call-template>           </xsl:otherwise>      </xsl:choose> </xsl:template>

This change prevents reverse from overflowing the stack, but it is still inefficient for large strings. First, notice that each step results in the movement of only a single character. Second, each recursive call must process a string that is just one character shorter than the current string. For very large strings, this call will potentially overstress the memory management subsystem of the XSLT implementation. In editing this recipe, Jeni Tennison pointed out that another method of making the version tail recursive would pass the remaining (reverse) string and $len as a parameter to the template. This, in general, is a good strategy for achieving tail recursion. In this particular case, it improved matters but did not do as well as the solution.

An important goal in all recursive implementations is to try to structure the algorithm so that each recursive call sets up a subproblem that is at least half as large as the current problem. This setup causes the recursion to "bottom out" more quickly. Following this advice results in the solution to reverse, shown in Example 2-3.

Example 2-3. An efficient (but not ideal) implementation
<xsl:template name="reverse">      <xsl:param name="input"/>          <xsl:variable name="len" select="string-length($input)"/>      <xsl:choose>           <xsl:when test="$len &lt; 2">                <xsl:value-of select="$input"/>           </xsl:when>           <xsl:otherwise>                <xsl:variable name="mid" select="floor($len div 2)"/>                <xsl:call-template name="reverse">                     <xsl:with-param name="input"                          select="substring($input,$mid+1,$mid+1)"/>                </xsl:call-template>                <xsl:call-template name="reverse">                     <xsl:with-param name="input"                          select="substring($input,1,$mid)"/>                </xsl:call-template>           </xsl:otherwise>      </xsl:choose> </xsl:template>

This solution is the first one I came up with, and it works well even on large strings (1,000 characters or more). It has the added benefit of being shorter than the implementation shown in the "Solution" section. The only difference is that this implementation considers only strings of length zero or one as trivial. The slightly faster implementation cuts the number of recursive calls in half by also trivially dealing with strings of length two.

All the implementations shown here actually perform the same number of concatenations, and I do not believe there is any way around this without leaving the confines of XSLT. However, my testing shows that on a string of length 1,000, the best solution is approximately 5 times faster than the worst. The best and second-best solutions differ by only a factor of 1.3.

Tail Recursion

A recursive call is tail recursive if, when the call returns, the returned value is immediately returned from the function. The term "tail" is attributed to the recursive call, which comes at the end. Tail recursion is important because it can be implemented more efficiently than general recursion. A general recursive call must establish a new stack frame to store local variables and other bookkeeping items. Thus, a general recursive implementation can quickly exhaust the stack space on large inputs. However, tail-recursive implementations can be transformed internally into iterative solutions by an XSLT processor capable of recognizing tail recursion.


XSLT 2.0

The XSLT 1.0 solution manipulates the string as substrings because there is no way to get to the Unicode character level. The 2.0 solution uses the functions string-to-codepoints and codepoints-to-string, which is probably faster in most 2.0 implementations because internally strings are just arrays of Unicode integer values.




XSLT Cookbook
XSLT Cookbook: Solutions and Examples for XML and XSLT Developers, 2nd Edition
ISBN: 0596009747
EAN: 2147483647
Year: 2003
Pages: 208
Authors: Sal Mangano

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net