7.4 Creating XML Spreadsheets


There are two basic routes to creating a SpreadsheetML document. The first route is perhaps best described as "start from scratch," where you assemble a spreadsheet using the XML vocabulary. The other route uses Excel to build a template for the spreadsheets you create, accepting a certain amount of overhead for the convenience of using a familiar GUI (rather than a collection of XML parts) to create a spreadsheet. In general, especially where styles are involved, I strongly recommend using Excel to generate an initial SpreadsheetML file you can use as a model.

Whichever approach you choose, you don't need to provide as much information in your SpreadsheetML as Excel provides when you save information out. Most of the metadata can be discarded, and Excel can also reconstruct named ranges if necessary from the NamedCell elements inside of cells. Some data, like the ss:ExpandedColumnCount and ss:ExpandedRowCount attributes on the Table element, may actually be better left out, as it takes extra effort to generate and may produce errors when the spreadsheet is loaded if it's wrong. For the most part, you'll want to focus on creating the basic row and cell structures, along with styles.

You can use whatever tool you like to generate SpreadsheetML. XSLT, Java, C#, PHP, Perl, Python, Visual Basic, and many more will all work perfectly well. For complex spreadsheets with a lot of cross-references, I recommend working in whatever environment you're most comfortable in, as getting large numbers of cross-references right is a challenge, especially if they link among themselves. For simpler spreadsheets, though, XSLT's ready ability to take existing XML and add extra instructions to it makes it a very convenient tool for generating SpreadsheetML.

To demonstrate, the stylesheet in Example 7-10 will take the XML shown earlier in Example 7-8 and convert it back into SpreadsheetML. Critical pieces of logic are highlighted in bold.

Example 7-10. A stylesheet for generating SpreadsheetML
<xsl:stylesheet version="1.0"    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"   xmlns:d="http://simonstl.com/ns/dinosaurs/"   xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"   xmlns="urn:schemas-microsoft-com:office:spreadsheet"   xmlns="urn:schemas-microsoft-com:office:spreadsheet"   xmlns:o="urn:schemas-microsoft-com:office:office"   xmlns:x="urn:schemas-microsoft-com:office:excel"   xmlns:html="http://www.w3.org/TR/REC-html40"  > <xsl:output method="xml" omit-xml-declaration="no" indent="yes" encoding="US-    ASCII"/> <xsl:template match="d:dinosaurs"> <xsl:processing-instruction name="mso-application">progid=   "Excel.Sheet"</xsl:processing-instruction> <Workbook> <!--Namespace declarations moved from Workbook to xsl:stylesheet-->  <Styles>   <Style ss: ss:Name="Normal">    <Alignment ss:Vertical="Bottom"/>    <Borders/>    <Font/>    <Interior/>    <NumberFormat/>    <Protection/>   </Style>   <Style ss:>    <NumberFormat ss:Format="mmm\-yy"/>   </Style>   <Style ss:>    <NumberFormat ss:Format="&quot;$&quot;#,##0.00"/>   </Style>  </Styles>  <Worksheet ss:Name="Sheet1">   <Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="{count(d:sale)+4}"       x:FullColumns="1"       x:FullRows="1">    <Column ss:AutoFitWidth="0" ss:Width="73.5"/>    <Column ss:AutoFitWidth="0" ss:Width="96.75"/>    <Column ss:Index="5" ss:AutoFitWidth="0" ss:Width="56.25"/>    <Row>     <Cell><Data ss:Type="String">Sales for:</Data></Cell>     <Cell ss:Style><Data ss:Type="DateTime"><xsl:value-of         select="d:date"/></Data></Cell>    </Row>    <Row ss:Index="3">     <Cell><Data ss:Type="String">ID Number</Data></Cell>     <Cell><Data ss:Type="String">Critter</Data></Cell>     <Cell><Data ss:Type="String">Price</Data></Cell>     <Cell><Data ss:Type="String">Quantity</Data></Cell>     <Cell><Data ss:Type="String">Total</Data></Cell>    </Row> <xsl:apply-templates select="d:sale" />    <Row>     <Cell ss:Index="4"><Data ss:Type="String">Total:</Data></Cell>     <Cell ss:Style ss:Formula="=SUM(R[-{count(d:sale)}]C:R[-1]C)">       <Data ss:Type="Number"></Data></Cell>    </Row>   </Table>   <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">    <Print>     <ValidPrinterInfo/>     <HorizontalResolution>600</HorizontalResolution>     <VerticalResolution>600</VerticalResolution>    </Print>    <Selected/>    <Panes>     <Pane>      <Number>3</Number>      <ActiveRow>12</ActiveRow>      <ActiveCol>1</ActiveCol>     </Pane>    </Panes>    <ProtectObjects>False</ProtectObjects>    <ProtectScenarios>False</ProtectScenarios>   </WorksheetOptions>  </Worksheet>  <Worksheet ss:Name="Sheet2">   <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">    <ProtectObjects>False</ProtectObjects>    <ProtectScenarios>False</ProtectScenarios>   </WorksheetOptions>  </Worksheet>  <Worksheet ss:Name="Sheet3">   <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">    <ProtectObjects>False</ProtectObjects>    <ProtectScenarios>False</ProtectScenarios>   </WorksheetOptions>  </Worksheet> </Workbook> </xsl:template> <xsl:template match="d:sale">    <Row>     <Cell><Data ss:Type="Number"><xsl:value-of         select="d:IDnum" /></Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String"><xsl:value-of select="d:critter" /></ Data><NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number"><xsl:value-of         select="d:price" /></Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number"><xsl:value-of select="d:quantity" /></Data>        <NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        <xsl:value-of select="d:total" /></Data></Cell>    </Row> </xsl:template> <xsl:template match="d:date" /> <xsl:template match="d:total" /> </xsl:stylesheet>

There are a few pieces of this worth special attention. First, note that the SpreadsheetML is wrapped in XSLT; the SpreadsheetML becomes part of the stylesheet. There's one extra namespace declaration:

  xmlns:d="http://simonstl.com/ns/dinosaurs/"

XSLT requires that references to parts of XML documents that have namespace URIs also have namespace prefixes. As a result, all references in the stylesheet to elements in the original document will look like d:sale instead of just sale.

There's also one piece of the SpreadsheetML we need to recreate explicitly, and not just by including it in the document: the processing instruction noted earlier that tells Windows this is an Excel spreadsheet. For that, we have to use:

<xsl:processing-instruction name="mso-application">progid=    "Excel.Sheet"</xsl:processing-instruction>

Because the named ranges will vary depending on the number of sale elements in the original, this stylesheet won't generate the Names element and its contents. Excel will recreate the named ranges from the NamedCell elements in any case.

This stylesheet creates a Table element complete with (accurate) ss:ExpandedColumnCount and ss:ExpandedRowCount attributes.

  <Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="{count(d:sale)+4}"       x:FullColumns="1"  x:FullRows="1">

If calculating the number of rows or columns in your spreadsheet is going to be difficult, it will be better to leave off this information, as it produces an error if wrong but little benefit if right.

The first row of the spreadsheet contains the date:

   <Row>     <Cell><Data ss:Type="String">Sales for:</Data></Cell>     <Cell ss:Style><Data ss:Type="DateTime"><xsl:value-of select="d:date"/></Data></Cell>    </Row>

The xsl:value-of element pulls the information from the date element of the XML document and puts its value into the Data element. As we'll see at the end of the spreadsheet, regular processing of the date element (and the total element, which is handled similarly) will have to be suppressed.

The heart of this stylesheet is again the part that generates the Row and Cell elements, like:

<xsl:template match="d:sale">    <Row>     <Cell><Data ss:Type="Number"><xsl:value-of select="d:IDnum" /></Data><NamedCell ss:Name="ID"/></Cell>

The xsl:template element will collect every sale element in the original and produce a Row element which itself contains Cell elements matching its contents. Each Row contains the contents of one sale element. To keep XSLT from applying its default templates to the date and total elements, which would drop their values into the SpreadsheetML as (unexpected) text, the last code snippet explicitly specifies no processing for them with empty xsl:template elements.

<xsl:template match="d:date" /> <xsl:template match="d:total" />

The SpreadsheetML created by this stylesheet from the XML data in Example 7-8 looks like Example 7-11.

Example 7-11. A SpreadsheetML document created with XSLT
<?xml version="1.0"?> <?mso-application prog?> <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"  xmlns:o="urn:schemas-microsoft-com:office:office"  xmlns:x="urn:schemas-microsoft-com:office:excel"  xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"  xmlns:html="http://www.w3.org/TR/REC-html40"  xmlns:d="http://simonstl.com/ns/dinosaurs/">  <Styles>   <Style ss: ss:Name="Normal">    <Alignment ss:Vertical="Bottom"/>    <Borders/>    <Font/>    <Interior/>    <NumberFormat/>    <Protection/>   </Style>   <Style ss:>    <NumberFormat ss:Format="mmm\-yy"/>   </Style>   <Style ss:>    <NumberFormat ss:Format="&quot;$&quot;#,##0.00"/>   </Style>  </Styles>  <Worksheet ss:Name="Sheet1">   <Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="12" x:FullColumns="1"    x:FullRows="1">    <Column ss:AutoFitWidth="0" ss:Width="73.5"/>    <Column ss:AutoFitWidth="0" ss:Width="96.75"/>    <Column ss:Index="5" ss:AutoFitWidth="0" ss:Width="56.25"/>    <Row>     <Cell><Data ss:Type="String">Sales for:</Data></Cell>     <Cell ss:Style><Data ss:Type="DateTime">2004-01-01T00:00:00.000</Data></Cell>    </Row>    <Row ss:Index="3">     <Cell><Data ss:Type="String">ID Number</Data></Cell>     <Cell><Data ss:Type="String">Critter</Data></Cell>     <Cell><Data ss:Type="String">Price</Data></Cell>     <Cell><Data ss:Type="String">Quantity</Data></Cell>     <Cell><Data ss:Type="String">Total</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">4627</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Diplodocus</Data><NamedCell ss:Name="Critters"/>        </Cell>     <Cell ss:Style><Data ss:Type="Number">22.5</Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">127</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        2857.5</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">3912</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Brontosaurus</Data>        <NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number">17.5</Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">74</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        1295</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">9845</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Triceratops</Data>        <NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number">12</Data>        <NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">91</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        1092</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">9625</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Vulcanodon</Data>        <NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number">19</Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">108</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        2052</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">5903</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Stegosaurus</Data>        <NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number">18.5</Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">63</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        1165.5</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">1824</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Monoclonius</Data>        <NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number">16.5</Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">133</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        2194.5</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">9728</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Megalosaurus</Data>        <NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number">23</Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">128</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        2944</Data></Cell>    </Row>    <Row>     <Cell><Data ss:Type="Number">8649</Data><NamedCell ss:Name="ID"/></Cell>     <Cell><Data ss:Type="String">Barosaurus</Data>        <NamedCell ss:Name="Critters"/></Cell>     <Cell ss:Style><Data ss:Type="Number">17</Data><NamedCell       ss:Name="Price"/></Cell>     <Cell><Data ss:Type="Number">91</Data><NamedCell ss:Name="Quantity"/></Cell>     <Cell ss:Style ss:Formula="=RC[-2]*RC[-1]"><Data ss:Type="Number">        1547</Data></Cell>    </Row>    <Row>     <Cell ss:Index="4"><Data ss:Type="String">Total:</Data></Cell>     <Cell ss:Style ss:Formula="=SUM(R[-8]C:R[-1]C)">        <Data ss:Type="Number">15147.5</Data></Cell>    </Row>   </Table>   <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">    <Print>     <ValidPrinterInfo/>     <HorizontalResolution>600</HorizontalResolution>     <VerticalResolution>600</VerticalResolution>    </Print>    <Selected/>    <Panes>     <Pane>      <Number>3</Number>      <ActiveRow>12</ActiveRow>      <ActiveCol>1</ActiveCol>     </Pane>    </Panes>    <ProtectObjects>False</ProtectObjects>    <ProtectScenarios>False</ProtectScenarios>   </WorksheetOptions>  </Worksheet>  <Worksheet ss:Name="Sheet2">   <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">    <ProtectObjects>False</ProtectObjects>    <ProtectScenarios>False</ProtectScenarios>   </WorksheetOptions>  </Worksheet>  <Worksheet ss:Name="Sheet3">   <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">    <ProtectObjects>False</ProtectObjects>    <ProtectScenarios>False</ProtectScenarios>   </WorksheetOptions>  </Worksheet> </Workbook>

If you open the SpreadsheetML this stylesheet produces (which looks much like that in Example 7-2, minus some named ranges, metadata, and formatting) in Excel, we get the result shown in Figure 7-8.

Figure 7-8. A spreadsheet generated as SpreadsheetML
figs/oxml_0708.gif




Office 2003 XML
Office 2003 XML
ISBN: 0596005385
EAN: 2147483647
Year: 2003
Pages: 135

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net