XML::LibXSLT Perl Module


The XML::LibXSLT Perl module was written by Matt Sergeant and is based on the GNOME Project's libxslt C library. The libxslt C library is a fast, stable, and portable XSLT processor library. Another important feature is that it strictly follows the XSLT standard. Let's take a look at an example that uses the XML::LibXSLT Perl module.

XML::LibXSLT Perl Module Example

This example will demonstrate how to use a Perl XSLT processor to convert an XML document into an HTML document. Depending on your requirements, this process can be performed offline, or it can reside on a web server to produce HTML documents based on dynamically generated XML documents.

Input XML Document Format

For this example, let's assume that we have an XML document stored as a file and want to transform the XML document to an HTML document. The XML document to be transformed is shown in Listing 8.11. This XML document is similar to the XML document that was used for an example back in Chapter 4. Our task for this example is to convert the XML document to HTML and display the contents of the file in a table. We won't be doing any filtering in this example so all the elements (and any corresponding attributes) from the input XML document will appear in the output HTML document.

As you can see in Listing 8.11, each customer element has one attribute (account_number) and three child elements (name, balance, and due_date). Based on the structure of our input XML document, we'd like to have one row in the output table per customer. Now that we've seen the input data, let's take a look at the XSLT stylesheet that will support our requirements.

Listing 8.11 Input XML document for XSLT transformation. (Filename: ch8_libxslt_customers.xml)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customer_data SYSTEM "customers.dtd">
<customer_data>
   <customer account_number="cid_1">
      <name>Joseph Burns</name>
      <balance>19.95</balance>
      <due_date>May 5</due_date>
   </customer>
   <customer account_number="cid_2">
      <name>Kayla Burns</name>
      <balance>29.95</balance>
      <due_date>May 12</due_date>
   </customer>
   <customer account_number="cid_3">
      <name>Roger Smith</name>
      <balance>100.25</balance>
      <due_date>May 19</due_date>
   </customer>
   <customer account_number="cid_4">
      <name>James Kennedy</name>
      <balance>0.00</balance>
      <due_date>N/A</due_date>
   </customer>
   <customer account_number="cid_5">
      <name>Margaret Pelligrino</name>
      <balance>0.00</balance>
      <due_date>N/A</due_date>
   </customer>
   <customer account_number="cid_6">
      <name>Michael Harwell</name>
      <balance>1000.00</balance>
      <due_date>May 22</due_date>
   </customer>
   <customer account_number="cid_7">
      <name>Riley Corgi</name>
      <balance>100.00</balance>
      <due_date>June 1</due_date>
   </customer>
</customer_data>

XSLT Stylesheet to Convert XML to HTML

The XSLT stylesheet that will generate an HTML table based on our requirements is shown in Listing 8.12. Listing 8.12 may seem like a long stylesheet (and it is); however, you'll see when we walk through it that it breaks down into a few simple sections. It's not as intimidating as it may first appear.

Listing 8.12 XSLT stylesheet to perform our XML to HTML transformation. (Filename: ch8_libxslt_customers.xslt)
1.   <?xml version="1.0"?>
2.   <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
3.
4.   <xsl:template match="/">
5.   <html>
6.      <head>
7.         <title>Customer Report</title>
8.      </head>
9.      <body>
10.     <h2>Customer Report</h2>
11.        <table border="2">
12.           <tr align="center">
13.              <th>Account Number</th>
14.              <th>Name</th>
15.              <th>Balance</th>
16.              <th>Due Date</th>
17.           </tr>
18.
19.           <xsl:for-each select="customer_data/customer">
20.              <tr align="center">
21.                 <td><xsl:value-of select="@account_number" /></td>
22.                 <td><xsl:value-of select="name" /></td>
23.                 <td><xsl:value-of select="balance" /></td>
24.                 <td><xsl:value-of select="due_date" /></td>
25.              </tr>
26.           </xsl:for-each>
27.
28.        </table>
29.     </body>
30.  </html>
31.  </xsl:template>
32.
33.  </xsl:stylesheet>

1-2 The first two lines of the XSLT stylesheet identify it as an XML document and also as an XSLT stylesheet.

1.   <?xml version="1.0"?>
2.   <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

4-17 This section of the stylesheet starts off with an <xsl:template> element whose match="/" attribute indicates that the template applies to the root of the XML document (the document node that sits above the <customer_data> element). This portion contains the opening <html>, <head>, and <body> tags that will appear in the output HTML document. Basically, the contents of our entire XML document will be wrapped by the <html> and <body> elements. In this section, we're also defining the table column titles.

4.   <xsl:template match="/">
5.   <html>
6.      <head>
7.         <title>Customer Report</title>
8.      </head>
9.      <body>
10.     <h2>Customer Report</h2>
11.        <table border="2">
12.           <tr align="center">
13.              <th>Account Number</th>
14.              <th>Name</th>
15.              <th>Balance</th>
16.              <th>Due Date</th>
17.           </tr>

19-26 This section contains an <xsl:for-each> element, which acts as an iterator that loops through a list of matched elements. Here, the select attribute holds the XPath expression customer_data/customer, so the body of the <xsl:for-each> element is processed once for each <customer> element in the document. For each match, we extract the information contained in that customer element: the value of the account_number attribute (the '@' symbol indicates an attribute) and the contents of the <name>, <balance>, and <due_date> elements. Each <customer> element occupies one row in the generated table. Note that the values for each customer are wrapped by a <tr> tag, and each column value is wrapped by a <td> tag.

19.           <xsl:for-each select="customer_data/customer">
20.              <tr align="center">
21.                 <td><xsl:value-of select="@account_number" /></td>
22.                 <td><xsl:value-of select="name" /></td>
23.                 <td><xsl:value-of select="balance" /></td>
24.                 <td><xsl:value-of select="due_date" /></td>
25.              </tr>
26.           </xsl:for-each>
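
To get a feel for what these XPath expressions select, it can help to run them directly from Perl with XML::LibXML. The following is only a rough sketch: the inline two-customer document (with the DOCTYPE line left out so that no DTD file is needed), the variable names, and the ' | ' separator are assumptions made for illustration and are not part of the listings in this chapter.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumed sample data: two customers in the same shape as Listing 8.11.
my $xml = <<'XML';
<customer_data>
   <customer account_number="cid_1">
      <name>Joseph Burns</name>
      <balance>19.95</balance>
      <due_date>May 5</due_date>
   </customer>
   <customer account_number="cid_2">
      <name>Kayla Burns</name>
      <balance>29.95</balance>
      <due_date>May 12</due_date>
   </customer>
</customer_data>
XML

my $doc = XML::LibXML->new()->parse_string($xml);

# findnodes() plays the role of <xsl:for-each select="...">: one node per customer.
my @rows;
foreach my $customer ($doc->findnodes('/customer_data/customer')) {
    # findvalue() plays the role of <xsl:value-of select="...">,
    # evaluated relative to the current customer element.
    push @rows, join(' | ',
        $customer->findvalue('@account_number'),
        $customer->findvalue('name'),
        $customer->findvalue('balance'),
        $customer->findvalue('due_date'),
    );
}

print "$_\n" for @rows;
```

Each entry in @rows corresponds to one <tr> in the generated table, and each joined field corresponds to one <td>.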

28-33 The last section of the XSLT stylesheet contains all the required closing tags. Note that each opening tag has a corresponding closing tag.

28.        </table>
29.     </body>
30.  </html>
31.  </xsl:template>
32.
33.  </xsl:stylesheet>

Now that I've shown you the XSLT stylesheet, let's take a look at the Perl program that actually performs the transformation from XML to HTML. As you will see, most of the hard work has been done using the Perl XML::LibXSLT module; the transformation is actually the easy part of the process.

XML::LibXSLT-Based Perl Program

The Perl program that performs the transformation for us is shown in Listing 8.13. As you can see, it is probably one of the shorter Perl programs I've discussed so far. Most of the work in a task such as this is spent in designing the XSLT stylesheet, and writing the Perl program is usually the easy part. Let's walk through this program and explain the steps required to perform the transformation.

Listing 8.13 XML::LibXSLT-based Perl program to generate the output HTML file. (Filename: ch8_libxslt_app.pl)
1.   use strict;
2.   use XML::LibXML;
3.   use XML::LibXSLT;
4.
5.   # Instantiate parser and xslt objects.
6.   my $parserObject = XML::LibXML->new();
7.   my $xsltObject = XML::LibXSLT->new();
8.
9.   # Open the XML document and the XSLT stylesheet.
10.  my $inputXmlObject = $parserObject->parse_file("ch8_libxslt_customers.xml");
11.  my $inputStylesheetObject = $parserObject->parse_file("ch8_libxslt_customers.xslt");
12.
13.  # Parse the stylesheet, transform the input XML document,
14.  # and output the result to the $htmlFile scalar.
15.  my $stylesheetObject = $xsltObject->parse_stylesheet($inputStylesheetObject);
16.  my $resultsObject = $stylesheetObject->transform($inputXmlObject);
17.  my $htmlFile = $stylesheetObject->output_string($resultsObject);
18.
19.  # Write the results to an output file.
20.  open (HTML_REPORT, "> ch8_libxslt_customer_report.html")
21.    or die "Can't open ch8_libxslt_customer_report.html $!\n";
22.
23.  print HTML_REPORT $htmlFile;
24.
25.  close (HTML_REPORT);

1-3 The opening section of the Perl program has the standard use strict pragma. In addition, you need to load the XML::LibXML and XML::LibXSLT modules with use statements. XML::LibXSLT depends on the XML::LibXML module because it expects XML::LibXML::Document objects as inputs.

1.   use strict;
2.   use XML::LibXML;
3.   use XML::LibXSLT;

5-11 In this section of the program, we're creating all the objects required for the transformation. First, we create a new XML::LibXML parser object and a new XML::LibXSLT object. After we've created these new objects, we parse the input XML document and the XSLT stylesheet document and create XML::LibXML::Document objects. These are the trees that we discussed during the high-level discussion about XSLT processors.

5.   # Instantiate parser and xslt objects.
6.   my $parserObject = XML::LibXML->new();
7.   my $xsltObject = XML::LibXSLT->new();
8.
9.   # Open the XML document and the XSLT stylesheet.
10.  my $inputXmlObject = $parserObject->parse_file("ch8_libxslt_customers.xml");
11.  my $inputStylesheetObject = $parserObject->parse_file("ch8_libxslt_customers.xslt");
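
If you want to convince yourself that both inputs really are XML::LibXML::Document trees, a small check along the following lines may help. This is a sketch under stated assumptions: the tiny inline documents and the variable names are invented for illustration, and the one-line stylesheet simply emits a <report/> element.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

my $parser = XML::LibXML->new();
my $xslt   = XML::LibXSLT->new();

# Assumed stand-ins for the input XML document and the stylesheet file.
my $source_doc = $parser->parse_string('<customer_data/>');
my $style_doc  = $parser->parse_string(<<'XSLT');
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><report/></xsl:template>
</xsl:stylesheet>
XSLT

# Both parsed inputs are document trees; the stylesheet is ordinary XML
# until parse_stylesheet() compiles it.
for my $doc ($source_doc, $style_doc) {
    print ref($doc), "\n";
}

my $stylesheet = $xslt->parse_stylesheet($style_doc);
my $result     = $stylesheet->transform($source_doc);

# The result of the transformation is a document tree as well.
print ref($result), "\n";
```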

13-25 This portion of the program does all the difficult work, and note that it is only a few lines long. Now that we've built the tree objects containing the contents of both the input XML document and the XSLT stylesheet, we can perform the transformation. The first step is to call the parse_stylesheet() method, which returns an XML::LibXSLT::Stylesheet object. After we have the XML::LibXSLT::Stylesheet object, we can call its transform() method. Note that the arguments to both of these methods are XML::LibXML::Document objects.

At this point, the input XML document has been transformed to HTML, and the only remaining task is to call the output_string() method, which returns a scalar containing the resulting output document. After calling output_string(), the HTML document is stored in the $htmlFile scalar. Because our task was to generate an HTML file for this example, we simply write it to an output file.

13.  # Parse the stylesheet, transform the input XML document,
14.  # and output the result to the $htmlFile scalar.
15.  my $stylesheetObject = $xsltObject->parse_stylesheet($inputStylesheetObject);
16.  my $resultsObject = $stylesheetObject->transform($inputXmlObject);
17.  my $htmlFile = $stylesheetObject->output_string($resultsObject);
18.
19.  # Write the results to an output file.
20.  open (HTML_REPORT, "> ch8_libxslt_customer_report.html")
21.    or die "Can't open ch8_libxslt_customer_report.html $!\n";
22.
23.  print HTML_REPORT $htmlFile;
24.
25.  close (HTML_REPORT);
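
As a possible variation on the output step, the sketch below traps errors with eval (both modules report parse and transformation problems by dying) and writes the report through a lexical filehandle with the three-argument form of open. It is not the listing's own code: the inline one-customer document, the trimmed-down stylesheet, the variable names, and the temporary report path are all assumptions made so that the fragment runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::LibXML;
use XML::LibXSLT;

my $parser = XML::LibXML->new();
my $xslt   = XML::LibXSLT->new();

# Assumed inline stand-ins for the customer file and the stylesheet file.
my $xml_string = '<customer_data><customer account_number="cid_1">'
               . '<name>Joseph Burns</name></customer></customer_data>';
my $xslt_string = <<'XSLT';
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body><table>
      <xsl:for-each select="customer_data/customer">
        <tr>
          <td><xsl:value-of select="@account_number"/></td>
          <td><xsl:value-of select="name"/></td>
        </tr>
      </xsl:for-each>
    </table></body></html>
  </xsl:template>
</xsl:stylesheet>
XSLT

# Run the parse/transform steps inside eval so a malformed input
# produces one clear message instead of an unhandled exception.
my $html = eval {
    my $source_doc = $parser->parse_string($xml_string);
    my $style_doc  = $parser->parse_string($xslt_string);
    my $stylesheet = $xslt->parse_stylesheet($style_doc);
    $stylesheet->output_string($stylesheet->transform($source_doc));
};
die "Transformation failed: $@" unless defined $html;

# Hypothetical output location; a temporary directory keeps the sketch tidy.
my $report_path = tempdir(CLEANUP => 1) . '/customer_report.html';

open(my $report_fh, '>', $report_path)
    or die "Can't open $report_path: $!\n";
print {$report_fh} $html;
close($report_fh)
    or die "Can't close $report_path: $!\n";
```

The lexical filehandle is closed automatically if it goes out of scope, and checking the return value of close catches write errors that would otherwise go unnoticed.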

Working with Dynamically Generated Files

If the XML document and XSLT stylesheet were dynamically generated (for example, because of a user's selection on a web page), the basic procedure is the same, with only subtle differences. For this example, we would only need to make minor changes: the calls to parse_file() become calls to parse_string() or parse_fh(), depending on whether the generated document is available as a scalar or as an open filehandle.
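
A minimal sketch of the string-based variant might look like the following. The helper name transform_strings(), the inline one-customer document, and the shortened stylesheet are assumptions made for illustration; in a real application the two strings would come from whatever code generates them.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Hypothetical helper: takes the XML and the stylesheet as scalars
# and returns the transformed output as a scalar.
sub transform_strings {
    my ($xml_string, $xslt_string) = @_;

    my $parser = XML::LibXML->new();
    my $xslt   = XML::LibXSLT->new();

    # parse_string() replaces parse_file(); the rest is unchanged.
    my $source_doc = $parser->parse_string($xml_string);
    my $style_doc  = $parser->parse_string($xslt_string);

    my $stylesheet = $xslt->parse_stylesheet($style_doc);
    my $results    = $stylesheet->transform($source_doc);

    return $stylesheet->output_string($results);
}

# Assumed sample inputs standing in for dynamically generated content.
my $xml_string = '<customer_data><customer account_number="cid_7">'
               . '<name>Riley Corgi</name></customer></customer_data>';

my $xslt_string = <<'XSLT';
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body><table>
      <xsl:for-each select="customer_data/customer">
        <tr>
          <td><xsl:value-of select="@account_number"/></td>
          <td><xsl:value-of select="name"/></td>
        </tr>
      </xsl:for-each>
    </table></body></html>
  </xsl:template>
</xsl:stylesheet>
XSLT

my $html = transform_strings($xml_string, $xslt_string);
print $html;
```

On a web server, the returned scalar could be sent back to the browser directly rather than written to a file.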

Generated HTML Output File

Listing 8.14 shows the HTML file that was generated by our Perl program.

Listing 8.14 HTML file generated by XSLT transformation. (Filename: ch8_libxslt_customer_report.html)
<html>
   <head>
      <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
   </head>
   <body>
      <span style="font-weight:bold">Customer Report</span><br><br>
      <table border="1"><thead>
         <tr>
            <td align="center"><span style="font-weight:bold">Account Number</span></td>
            <td align="center"><span style="font-weight:bold">Name</span></td>
            <td align="center"><span style="font-weight:bold">Balance</span></td>
            <td align="center"><span style="font-weight:bold">Due Date</span></td>
         </tr></thead><tbody>
         <tr>
            <td align="center">cid_1</td>
            <td align="center">Joseph Burns</td>
            <td align="center">19.95</td>
            <td align="center">May 5</td>
         </tr>
         <tr>
            <td align="center">cid_2</td>
            <td align="center">Kayla Burns</td>
            <td align="center">29.95</td>
            <td align="center">May 12</td>
         </tr>
         <tr>
            <td align="center">cid_3</td>
            <td align="center">Roger Smith</td>
            <td align="center">100.25</td>
            <td align="center">May 19</td>
         </tr>
         <tr>
            <td align="center">cid_4</td>
            <td align="center">James Kennedy</td>
            <td align="center">0.00</td>
            <td align="center">N/A</td>
         </tr>
         <tr>
            <td align="center">cid_5</td>
            <td align="center">Margaret Pelligrino</td>
            <td align="center">0.00</td>
            <td align="center">N/A</td>
         </tr>
         <tr>
            <td align="center">cid_6</td>
            <td align="center">Michael Harwell</td>
            <td align="center">1000.00</td>
            <td align="center">May 22</td>
         </tr>
         <tr>
            <td align="center">cid_7</td>
            <td align="center">Riley Corgi</td>
            <td align="center">100.00</td>
            <td align="center">June 1</td>
         </tr>
         </tbody>
      </table>
   </body>
</html>

Output Report Viewed in a Browser

Figure 8.3 shows the output customer report as it appears in a web browser. As you can see, it has all the columns and is formatted as I described when I discussed the XSLT stylesheet.

Figure 8.3. Output customer report as viewed in a browser.


This was a fairly straightforward example that demonstrated the power of transforming XML documents to HTML documents. As I mentioned, the most difficult part of the entire process is developing the XSLT stylesheet; the Perl XSLT module handles all the remaining details.

In the next example, we'll take a look at another Perl XSLT module; however, our output result will be a filtered XML document rather than an HTML document.

Note

For additional information on the XML::LibXSLT Perl module, please see perldoc XML::LibXSLT.




XML and Perl
ISBN: 0735712891
EAN: 2147483647
Year: 2002
Pages: 145
