Section 3.4. Transform XML Spreadsheets


3.4. Transform XML Spreadsheets

XML spreadsheets provide Excel data in a format that can be easily used by other applications or transformed into presentation documents, such as HTML web pages. For either task, you often need to modify the content of the XML spreadsheet, and the best way to do that is with XSLT.

You can use XSLT to perform a wide variety of transformations, such as:

  • Extract specific items from a spreadsheetsuch as retrieving only worksheets containing data

  • Transform the spreadsheet into HTML

  • Make global changes to the spreadsheet

  • Highlight significant items, such as high or low outlier numbers


Note: Now that you've saved a workbook as XML, what can you do with it? XSLT lets you transform XML into other formats, extract or highlight information, or make other changes.

3.4.1. How to do it

To transform an XML spreadsheet, follow these general steps:

  1. Create an XSLT file to perform the transformation, using Notepad or some other editor.

  2. Perform the transformation in code, from the command line, or by including a processing instruction.

  3. Save the result.

There are three different ways to perform the transformation. The following table compares each of those techniques, and the sections that follow describe each of the preceding steps in more detail.

Transformation

Use to

Advantages

Disadvantages

Code

Automatically generate the result from within Visual Basic.

Can be performed with a single click by the user, or in response to an event.

Requires Excel to be running.

Command line

Perform batch transformations.

Transformed file is generally smaller than source file.

Uses commandline interface; utility must be downloaded.

Processing instruction

Dynamically transform the file when it is viewed in the browser.

Changes to the XSLT are reflected automatically, underlying source is preserved.

File is generally larger and displays more slowly, because it is transformed on the client. Doesn't work in all browsers.


3.4.2. How to create XSLT

XSLT is a simple language containing looping, decision-making, evaluation, branching, and functional statements. It follows the same conventions as XML and its sole purpose is to interpret and transform valid XML documents into some other text.

Excel qualifies the names of the XML nodes it creates with namespaces from the Microsoft Offices schemas. An Excel workbook defines the following namespaces:

      <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"       xmlns:o="urn:schemas-microsoft-com:office:office"       xmlns:x="urn:schemas-microsoft-com:office:excel"       xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"       xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"       xmlns:html="http://www.w3.org/TR/REC-html40"       xmlns:x2="http://schemas.microsoft.com/office/excel/2003/xml">

Notice that the default namespace (xmlns, highlighted in bold ) is urn:schemas-microsoft-com:office:spreadsheet . This is the same as the namespace for the ss prefix (xmlns:ss, also in bold ). You use this ss namespace prefix when referring to workbook nodes in your XSLT file.

Different nodes in the XML spreadsheet use different default namespaces. For instance, the DocumentProperties node uses the following default namespace:

      <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">

Therefore, when referring to the DocumentProperties node or its children, define a prefix for the namespace urn:schemas-microsoft-com:office:office in your XSLT, and use that prefix to refer to those nodes. If you don't do this, your XSLT won't recognize the nodes in the XML spreadsheet. This is a very common problem, and it should be the first place you look if your XSLT outputs nothing.

It is convenient to copy the namespace definitions from the XML spreadsheet Worksheet node to your XSLT stylesheet. For instance, the following XSLT example uses the copied ss namespace to locate nodes in an XML spreadsheet:

      <?xml version="1.0"?>      <!-- Strip.xslt transforms a XML Spreadsheet to its bare essentials -->      <xsl:stylesheet version="1.0"          xmlns="urn:schemas-microsoft-com:office:spreadsheet"         xmlns:xsl="http://www.w3.org/1999/XSL/Transform"          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">          <xsl:output method="xml" indent="yes" />          <xsl:template match="ss:Workbook">          <xsl:processing-instruction name="mso-application">prog      </xsl:processing-instruction>          <xsl:element name="ss:Workbook">              <xsl:copy-of select="ss:Styles" />              <xsl:for-each select="ss:Worksheet">                  <xsl:if test="count(ss:Table/ss:Row/ss:Cell/ss:Data) &gt; 0">                      <xsl:copy-of select="." />                  </xsl:if>              </xsl:for-each>          </xsl:element>          </xsl:template>      </xsl:stylesheet>

The preceding transformation copies worksheets that contain data, and formatting styles used by cells in those worksheets, into a new XML spreadsheet file. Empty worksheets, document properties, and other items are simply omitted. Excel still recognizes the resulting output as an XML spreadsheet since it conforms to the Excel schema and contains the mso-application processing instruction.

To see how this transformation works:

  1. Create a workbook in Excel and enter some data in its worksheets.

  2. Save the workbook as an XML spreadsheet named TestStrip.xml .

  3. Process the XML spreadsheet using the sample file XSLT. Ways to process the XML file are described in the following sections.

  4. In Windows Explorer, double-click on the output file. Excel will display the transformed XML (Figure 3-2).

Figure 3-2. An XML spreadsheet with empty worksheets removed by a transformation


3.4.3. How to transform in code

As mentioned previously, there are several ways to transform XML. Transforming XML from Visual Basic code uses the Microsoft XML object library to call the Microsoft XML Parser (msxml4.dll). The Microsoft XML object library also provides a means to create new XML files; navigate between nodes; copy, delete, and add nodes; and more.

To perform a transformation in code, follow these steps:

  1. In Visual Basic, add a reference to the Microsoft XML object library. The Microsoft XML object library provides the DOMDocument object, which is used to load, transform, and save XML documents.

  2. In code, create two instances of DOMDocument objects from the Microsoft XML object library.

  3. Load the XML spreadsheet in the first DOMDocument object.

  4. Load the XSLT file in the second DOMDocument object.

  5. Use the transformNode method of the first DOMDocument object to perform the transformation.

For example, the following code loads the TestStrip.xml XML spreadsheet and Strip.xslt transformation, processes the transformation, and saves the result:

      Sub Strip(  )          ' Requires reference to Microsoft XML          Dim xdoc As New DOMDocument, xstyle As New DOMDocument          Dim xml As String          xdoc.Load (ThisWorkbook.path & "\TestStrip.xml")          xstyle.Load (ThisWorkbook.path & "\Strip.xslt")          xml = xdoc.transformNode(xstyle)          SaveFile xml, "Out.xml"      End Sub      Sub SaveFile(content As String, fileName As String)          ' Requires reference to Microsoft Scripting Runtime          Dim fso As New FileSystemObject, strm As TextStream          fileName = ThisWorkbook.path & "\" & fileName          If fso.FileExists(fileName) Then fso.DeleteFile(fileName)          Set strm = fso.CreateTextFile(fileName)          strm.Write (content)          strm.Close      End Sub

The preceding SaveFile helper procedure is necessary because the transformNode method returns a string containing the XML created by the transformation. Once the XML is saved, you can open the file by double-clicking on it in Windows Explorer, or by using the following code:

      Application.Workbooks.Open ("out.xml")

3.4.4. How to transform from the command line

You can also perform transformations using the command-line transformation utility (msxsl.exe). Msxsl.exe is available from Microsoft for free in the MSDN download area. It is a small shell executable that simply calls the Microsoft XML Parser to perform the transformation.

For example, the following command line transforms the TestStrip.xml file, using the Strip.xslt transformation shown previously, and writes the output to Out.xml :

      msxsl TestStrip.xml Strip.xslt -o Out.xml

The output is the same as that created by using the DOMDocument object's TRansformNode method shown in the preceding section. The command-line utility allows you to automate transformations using batch files rather than Visual Basic code.

3.4.5. How to transform in the browser

Another way to perform a transformation is to include an xml-stylesheet processing instruction in the XML spreadsheet. The mso-application instruction supersedes other instructions, so you must replace that processing instruction in order to have a browser perform the translation. The following XML shows the changes you must make to the XML spreadsheet file: deletions are shown in strikethrough and additions are shown in bold :

      <?xml version="1.0"?>      <?xml-stylesheet type="text/xsl" href="worksheet.xslt"?>      <?mso-application prog?>      <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"      xmlns:o="urn:schemas-microsoft-com:office:office"            ...

Now when a user opens the XML file, the browser transforms and displays the file (Figure 3-3).


Note: Not all browsers support this technique. Be sure to test your transformations in non-Microsoft browsers.

Figure 3-3. Transforming an XML spreadsheet in the browser


3.4.6. How it works

The transformation shown in Figure 3-3 converts cells in a worksheet to HTML table elements. It also displays document properties of the workbook. The transformation is performed by the following XSLT fragment:

      <xsl:template match="ss:Workbook">          <html>              <body>              <h1>Display XML Spreadsheets as HTML Tables</h1>              Author:       <xsl:value-of select="o:DocumentProperties/o:Author" />      <br />                LastSaved:       <xsl:value-of select="o:DocumentProperties/o:LastSaved" />      <br />                Number of worksheets:                   <xsl:value-of select="count(ss:Worksheet)" />                <xsl:for-each select="ss:Worksheet">                    <h2><xsl:value-of select="@ss:Name" /></h2>                    <table border="1" frame="box">                    <xsl:for-each select="ss:Table/ss:Row">                    <tr>                        <xsl:for-each select="ss:Cell/ss:Data">                            <td><xsl:value-of select="." /></td>                        </xsl:for-each>                    </tr>                            </xsl:for-each>                    </table>                </xsl:for-each>                </body>            </html>

The advantage of using a processing instruction to perform the transformation is that you don't significantly alter the spreadsheet. You can switch the file back to an Excel XML spreadsheet simply by removing the xml-stylesheet instruction and replacing the mso-application instruction, as shown in the following XSLT fragment:

      <xsl:template match="ss:Workbook">           <xsl:processing-instruction name="xml-stylesheet">      type="text/xsl" href="Worksheet.xslt"</xsl:processing-instruction>          <xsl:copy-of select="." />      </xsl:template>

To transform the file back into an XML spreadsheet, simply change the xsl:processing-instruction element:

      <xsl:processing-instruction name="mso-application">      prog</xsl:processing-instruction>

When a user requests an XML file that includes an xml-stylesheet processing instruction, the file is downloaded and the transformation is processed on the user's machine. That takes more time than if the XML file had already been transformed, but any changes to the XSLT are automatically reflected, because the transformation is performed dynamically.

3.4.7. How to reset file associations

In order for the mso-application processing instruction to work correctly when the user opens the file from Windows Explorer, the XML file type must be associated with the Microsoft Office XML Editor (MsoXmlEd.Exe). Some applications, such as Mozilla, can change this association and break the ability to easily open these files from Explorer.

If this happens, you can reset the XML file type by following these steps:

  1. From Windows Explorer, right-click an XML file and choose Open With Choose Program. Windows displays the Open With dialog box.

  2. Select XML Editor and select Always use the selected program to open this type of file, then click OK. Windows updates the system registry to associate XML files with the Microsoft Office XML Editor.

3.4.8. What about...

To get

Look here

Free IE XML Validation/XSL Transformation viewer

Search for "Validating XSLT" at www.microsoft.com/downloads/

Free XML/XSL Editor

xmlcooktop.com

XML/XSL Debugger (free trial of a good product)

new.xmlspy.com/products_ide. html




    Excel 2003 Programming. A Developer's Notebook
    Excel 2003 Programming: A Developers Notebook (Developers Notebook)
    ISBN: 0596007671
    EAN: 2147483647
    Year: 2004
    Pages: 133
    Authors: Jeff Webb

    flylib.com © 2008-2017.
    If you may any questions please contact us: flylib@qtcs.net