3.2 Creating Word Documents


It's very easy to create Word documents from XSLT. We saw the definitive "Hello, World" example for WordprocessingML in Chapter 2. Example 3-1 shows the "Hello, World" example for creating a Word document from XSLT.

Example 3-1. Creating a Word document from XSLT
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

  <xsl:template match="/">
    <xsl:processing-instruction name="mso-application">
      <xsl:text>progid="Word.Document"</xsl:text>
    </xsl:processing-instruction>
    <w:wordDocument>
      <xsl:attribute name="xml:space">preserve</xsl:attribute>
      <w:body>
        <w:p>
          <w:r>
            <w:t>Hello, World!</w:t>
          </w:r>
        </w:p>
      </w:body>
    </w:wordDocument>
  </xsl:template>

</xsl:stylesheet>

As you can see, there's little to it, beyond slapping xsl:stylesheet and xsl:template elements around the w:wordDocument element. The only additional provisions you need to make are for generating the mso-application PI and the xml:space="preserve" directive in the result. (Using the xsl:attribute element as opposed to a literal xml:space attribute ensures that whitespace will be preserved in the result but not in the stylesheet.)
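
To make those two provisions concrete, here is roughly what an XSLT processor should emit for Example 3-1, whatever the input document. Treat it as a sketch rather than captured output: it assumes the PI's text is progid="Word.Document" (the value Word 2003 looks for), and the exact XML declaration, attribute order, and line breaks depend on your processor's serializer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xml:space="preserve"><w:body><w:p><w:r><w:t>Hello, World!</w:t></w:r></w:p></w:body></w:wordDocument>
```

Because whitespace-only text nodes in the stylesheet are stripped before processing, none of the stylesheet's indentation reaches the result. The xml:space attribute appears only on the generated w:wordDocument element.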

Obviously, Example 3-1 isn't terribly interesting in its own right. What is interesting is how you can extend it. With XSLT's power and a basic knowledge of WordprocessingML at your disposal, you can create dynamic Word documents quite easily. We'll take a look at one example of doing this: generating data-driven tables in Word.

3.2.1 Generating Data-Driven Tables

Word documents often need to contain tabular data. After all, that's what tables were made for. But it can be quite a pain to update tabular data in Word manually, especially when the data is large or changes frequently, as when you generate reports from a database. When that data is exposed as XML (a feature increasingly supported among the latest database products), it becomes quite easy to generate data-driven Word tables using XSLT. Example 3-2 shows an XML document as output from Microsoft Office Access 2003. This example comes straight out of Chapter 8; we've added some indentation for readability.

Example 3-2. An example XML document generated from a database, books.xml
<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="books.xsd"
    generated="2003-03-26T13:49:17">
  <books>
    <ISBN>0596005385</ISBN>
    <Title>Office 2003 XML Essentials</Title>
    <Tagline>Integrating Office with the World</Tagline>
    <Short_x0020_Description>Microsoft has added enormous XML functionality to Word, Excel, and Access, as well as a new application, Microsoft InfoPath. This book gets readers started in using those features.
    </Short_x0020_Description>
    <Long_x0020_Description>Microsoft has added enormous XML functionality to Word, Excel, and Access, as well as a new application, Microsoft InfoPath. This book gets readers started in using those features.
    </Long_x0020_Description>
    <PriceUS>34.95</PriceUS>
  </books>
  <books>
    <ISBN>0596002920</ISBN>
    <Title>XML in a Nutshell, 2nd Edition</Title>
    <Tagline>A Desktop Quick Reference</Tagline>
    <Short_x0020_Description>This authoritative new edition of XML in a Nutshell provides developers with a complete guide to the rapidly evolving XML space. </Short_x0020_Description>
    <Long_x0020_Description>This authoritative new edition of XML in a Nutshell provides developers with a complete guide to the rapidly evolving XML space.  Serious users of XML will find topics on just about everything they need, including fundamental syntax rules, details of DTD and XML Schema creation, XSLT transformations, and APIs used for processing XML documents.  Simply put, this is the only references of its kind among XML books.
    </Long_x0020_Description>
    <PriceUS>39.95</PriceUS>
  </books>
  <books>
    <ISBN>0596002378</ISBN>
    <Title>SAX2</Title>
    <Tagline>Processing XML Efficiently with Java</Tagline>
    <Short_x0020_Description>This concise book gives you the information you need to effectively use the Simple API for XML, the dominant API for efficient XML processing with Java.</Short_x0020_Description>
    <Long_x0020_Description>This concise book gives you the information you need to effectively use the Simple API for XML, the dominant API for efficient XML processing with Java.</Long_x0020_Description>
    <PriceUS>29.95</PriceUS>
  </books>
</dataroot>

Let's say you want to display only the ISBN, title, tagline, and price of each book. You would start by creating an example four-column table from within Word, formatted however you wish. Figure 3-1 shows one such table.

Figure 3-1. An example table created from within Word


The table headings in Figure 3-1 are formatted differently than the rest of the cells, using a character style called "CellHeading." The rest of the table cells (containing the data) take on the document's "Normal" paragraph formatting.

Once the table template looks how you want it to look, you would save the document as XML. Then, from a text editor, you would adapt the WordprocessingML into an XSLT stylesheet that generates dynamic tables, using documents like books.xml (Example 3-2) as input. Example 3-3 shows just such a stylesheet (booktable.xsl). The key parts of the stylesheet that make the resulting table dynamic are highlighted.

Example 3-3. Stylesheet for creating a dynamic books table in Word, booktable.xsl
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

  <xsl:output omit-xml-declaration="no" encoding="UTF-8"/>

  <xsl:template match="/">
    <xsl:processing-instruction name="mso-application">
      <xsl:text>progid="Word.Document"</xsl:text>
    </xsl:processing-instruction>
    <w:wordDocument>
      <xsl:attribute name="xml:space">preserve</xsl:attribute>
      <w:styles>
        <xsl:copy-of select="$styles"/>
      </w:styles>
      <w:body>
        <w:tbl>
          <w:tblPr>
            <w:tblStyle w:val="TableGrid"/>
          </w:tblPr>
          <xsl:copy-of select="$heading-row"/>
          <xsl:apply-templates select="/dataroot/books"/>
        </w:tbl>
      </w:body>
    </w:wordDocument>
  </xsl:template>

  <xsl:template match="books">
    <w:tr>
      <xsl:apply-templates select="ISBN"/>
      <xsl:apply-templates select="Title"/>
      <xsl:apply-templates select="Tagline"/>
      <xsl:apply-templates select="PriceUS"/>
    </w:tr>
  </xsl:template>

  <xsl:template match="books/*">
    <w:tc>
      <w:p>
        <w:r>
          <w:t>
            <xsl:value-of select="."/>
          </w:t>
        </w:r>
      </w:p>
    </w:tc>
  </xsl:template>

  <xsl:variable name="heading-row">
    <w:tr>
      <w:tc>
        <w:tcPr>
          <w:tcW w:w="1216" w:type="dxa"/>
        </w:tcPr>
        <w:p>
          <w:r>
            <w:rPr>
              <w:rStyle w:val="CellHeading"/>
            </w:rPr>
            <w:t>ISBN</w:t>
          </w:r>
        </w:p>
      </w:tc>
      <w:tc>
        <w:tcPr>
          <w:tcW w:w="3032" w:type="dxa"/>
        </w:tcPr>
        <w:p>
          <w:r>
            <w:rPr>
              <w:rStyle w:val="CellHeading"/>
            </w:rPr>
            <w:t>Title</w:t>
          </w:r>
        </w:p>
      </w:tc>
      <w:tc>
        <w:tcPr>
          <w:tcW w:w="3770" w:type="dxa"/>
        </w:tcPr>
        <w:p>
          <w:r>
            <w:rPr>
              <w:rStyle w:val="CellHeading"/>
            </w:rPr>
            <w:t>Tagline</w:t>
          </w:r>
        </w:p>
      </w:tc>
      <w:tc>
        <w:tcPr>
          <w:tcW w:w="838" w:type="dxa"/>
        </w:tcPr>
        <w:p>
          <w:r>
            <w:rPr>
              <w:rStyle w:val="CellHeading"/>
            </w:rPr>
            <w:t>Price</w:t>
          </w:r>
        </w:p>
      </w:tc>
    </w:tr>
  </xsl:variable>

  <xsl:variable name="styles">
    <!-- list of w:style elements -->
  </xsl:variable>

</xsl:stylesheet>

The root template rule in Example 3-3 looks similar to Example 3-1; it creates the mso-application PI, the w:wordDocument root element, and the xml:space attribute:

  <xsl:template match="/">
    <xsl:processing-instruction name="mso-application">
      <xsl:text>progid="Word.Document"</xsl:text>
    </xsl:processing-instruction>
    <w:wordDocument>
      <xsl:attribute name="xml:space">preserve</xsl:attribute>

Since our result document contains some custom styles, the stylesheet needs to output a w:styles element. To save space and reduce clutter, we've encapsulated all of the w:style definitions into a global variable, $styles, and our stylesheet copies that into the w:styles literal result element:

      <w:styles>
        <xsl:copy-of select="$styles"/>
      </w:styles>

Next, we create the w:body and w:tbl elements. The resulting table is associated with the TableGrid style, which is defined in the result document's w:styles element:

  <w:body>
    <w:tbl>
      <w:tblPr>
        <w:tblStyle w:val="TableGrid"/>
      </w:tblPr>

Then, we create the first table row, which is the heading for our table. Just as we did with the w:style elements, we put this row definition in another global variable, $heading-row, and copy it into the result:

  <xsl:copy-of select="$heading-row"/>

The heading row dictates the width of each column, which means we don't have to define the column width for each of the remaining rows. Word automatically gives them the same width as the heading row.

Finally, we begin processing each books element in the source document:

  <xsl:apply-templates select="/dataroot/books"/>

Elsewhere in the stylesheet, we define the template rules that create the rows and columns for our dynamic table. The template rule for table rows matches up each books element in the source document with a table row in the result. Then, inside the table row, we process the ISBN, Title, Tagline, and PriceUS elements, in that order:

  <xsl:template match="books">
    <w:tr>
      <xsl:apply-templates select="ISBN"/>
      <xsl:apply-templates select="Title"/>
      <xsl:apply-templates select="Tagline"/>
      <xsl:apply-templates select="PriceUS"/>
    </w:tr>
  </xsl:template>
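
Because each row comes from a template rule, you can change which rows appear without touching the row-building logic. As a hypothetical variation (not part of the original example), the wrapper stylesheet below imports booktable.xsl and adds an empty rule for any books element whose PriceUS exceeds 35. The threshold and the idea of a separate wrapper file are our own assumptions, chosen only for illustration.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pull in every rule and variable from Example 3-3.
       xsl:import must come before any other top-level element. -->
  <xsl:import href="booktable.xsl"/>

  <!-- Empty rule: a books element priced above 35 produces no w:tr.
       All other books elements fall through to the imported rule. -->
  <xsl:template match="books[PriceUS &gt; 35]"/>

</xsl:stylesheet>
```

Applied to books.xml, this should drop the XML in a Nutshell row (39.95) and keep the other two, since rules in the importing stylesheet take precedence over imported ones.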

The template rule for table cells is quite simple. For each element inside the books element that is processed, it creates a table cell containing a paragraph containing a run containing text. The text is simply the string value of the current element in the source document:

  <xsl:template match="books/*">
    <w:tc>
      <w:p>
        <w:r>
          <w:t>
            <xsl:value-of select="."/>
          </w:t>
        </w:r>
      </w:p>
    </w:tc>
  </xsl:template>
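
If one column needs different treatment from the rest, a more specific rule can handle it. The sketch below is a hypothetical customization, not part of the original example: it assumes a separate stylesheet stored next to booktable.xsl and overrides only the cell rule for PriceUS, so that prices are right-aligned and always show two decimal places.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

  <!-- Reuse Example 3-3 unchanged; rules defined here take precedence -->
  <xsl:import href="booktable.xsl"/>

  <!-- Price cells only: right-justified paragraph, two decimal places -->
  <xsl:template match="books/PriceUS">
    <w:tc>
      <w:p>
        <w:pPr>
          <w:jc w:val="right"/>
        </w:pPr>
        <w:r>
          <w:t>
            <xsl:value-of select="format-number(., '0.00')"/>
          </w:t>
        </w:r>
      </w:p>
    </w:tc>
  </xsl:template>

</xsl:stylesheet>
```

Import precedence is what lets this rule win over the generic books/* rule. If you paste the rule directly into booktable.xsl instead, give it priority="1", because books/PriceUS and books/* otherwise share the same default priority (0.5).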

Now, let's take a look at what the result looks like. Figure 3-2 shows the result of applying booktable.xsl (Example 3-3) to books.xml (Example 3-2).

Figure 3-2. The result of applying booktable.xsl to books.xml


Creating dynamic Word documents is now so easy with Word 2003 that it just might be WordprocessingML's "killer app." But before we jump to any conclusions, let's look at some of the other fun things we can do with WordprocessingML.

While most constructs in WordprocessingML are straightforward to generate using XSLT, there are certain things, such as VBA macros and embedded images, that cannot be generated using vanilla XSLT. That's because they are encoded in WordprocessingML as Base64 binary, and XSLT has no built-in facilities for processing or generating binary data. However, by utilizing XSLT extension functions, you can get around the limitations of standard XSLT. Oleg Tkachenko has demonstrated in a blog entry how an XSLT stylesheet can generate images in a Word document, using XSLT extensions. For more information, see http://www.tkachenko.com/blog/archives/000106.html.




Office 2003 XML
ISBN: 0596005385
Year: 2003
Pages: 135
