17.2 Writing an XSLT Processor with C#

C# is Microsoft's evolution of C++ and Java. It's similar to Java, so I've found it easy to learn. C# takes some interesting forks from Java, such as its use of properties, delegates, and so forth. However, exploring the virtues and foibles of C# is not my mission here. I'm just going to show you how to create an XSLT processor in C#, which is really only a simple command-line interface to .NET's underlying XSLT engine. It's easy to do once you have the right pieces.

C# comes as part of Microsoft's .NET Framework 1.1 SDK, which is the version this example uses. You can download the Framework by following the .NET download link on http://www.microsoft.com/net. It's well over 100 megabytes, so it takes some time to download, especially if you don't have a fast Internet connection. The Framework SDK installs only on Windows 2000 or Windows XP, so you need one of them for this exercise. .NET applications will run on other Windows operating systems, but that requires extra steps that I won't go into here.

The Mono Project includes an open source version of C# that was declared code complete about mid-2003. The Mono version of C# runs on Windows, Linux, FreeBSD, and Mac OS X. I have not tested the C# code in this chapter with Mono, but it's likely to work.


17.2.1 The Pax Code

In examples/ch17/Pax.cs, you will also find the C# source code for the Pax XSLT processor, shown in Example 17-2.

Example 17-2. The Pax code for running an XSLT processor
/*
 * Pax C# XSLT Processor
 */

using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public class Pax {

    public static void Main(String[] args)
    {
        // Output file flag
        bool file = false;

        // Usage strings
        string info = "Pax C# XSLT processor";
        string usage = "\nUsage: Pax source stylesheet [result]";

        // Test arguments
        if (args.Length == 0) {
            Console.WriteLine(info + usage);
            Environment.Exit(1);
        } else if (args.Length == 3) {
            // Third argument = output to file
            file = true;
        } else if (args.Length > 3) {
            Console.WriteLine("Too many arguments; exit.");
            Environment.Exit(1);
        }

        // Create the XslTransform
        XslTransform xslt = new XslTransform();

        // Load the XML document, create XPathNavigator for transform
        XPathDocument doc = new XPathDocument(args[0]);
        XPathNavigator nav = doc.CreateNavigator();

        // Load a stylesheet
        xslt.Load(args[1]);

        // Create the XmlTextWriter
        XmlTextWriter writer;
        if (file) {
            // Output to file with ASCII encoding
            writer = new XmlTextWriter(args[2], Encoding.ASCII);
        } else {
            // Output to console
            writer = new XmlTextWriter(Console.Out);
        }

        // Write XML declaration
        writer.WriteStartDocument();

        // Set indentation to 1
        writer.Formatting = Formatting.Indented;
        writer.Indentation = 1;

        // Transform file
        xslt.Transform(nav, null, writer, null);

        // Close XmlTextWriter
        writer.Close();

    }

}

Right away, you should notice that Pax.cs and Moxie.java are very similar. A C# programmer should be able to figure out this code in a few glances, but if you're not familiar with C#, you can read the following section, which walks through the program nearly line by line.

17.2.2 Looking at the Pax Code

C# uses the same comment characters as Java. Instead of packages, C# uses namespaces, importing them at the very beginning of the program with the reserved word using:

using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

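C# has no direct counterpart to Java's single-class import: a using directive names a namespace, such as System.Xml.Xsl, and makes every type in that namespace available to the program by its short name. (A using alias can give a short name to one type, but that is a separate form of the directive.)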
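As a small illustration of the namespace rules just described, here is a sketch that refers to the same class three ways: through a using directive, through its fully qualified name, and through a using alias. The class name UsingDemo and the alias Sb are made up for this example and are not part of Pax.

```csharp
using System;
using System.Text;                     // brings in every type in System.Text
using Sb = System.Text.StringBuilder;  // alias for one type (hypothetical name)

public class UsingDemo {

    // Builds one string from three StringBuilder instances, each
    // declared with a different way of naming the same class.
    public static string Build()
    {
        StringBuilder a = new StringBuilder("Pax");
        System.Text.StringBuilder b = new System.Text.StringBuilder(" C#");
        Sb c = new Sb(" XSLT");
        return a.ToString() + b.ToString() + c.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

All three declarations name the same class, so the choice is a matter of readability rather than behavior.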

Following this, the Pax class is defined and its Main( ) method is declared. Main( ) is the entry point that the runtime calls when the program starts. The command-line arguments to Main( ) are, as in Moxie.java, evaluated with an if statement. The three possible arguments represent files:

  1. The first argument (args[0]) represents an XML source document that you want to transform.

  2. The next argument (args[1]) represents the XSLT stylesheet for the transformation.

  3. The optional third argument (args[2]) represents the name of the file where the result tree will be stored, if it is used. If it is absent, the result tree will appear on Console.Out (C#'s name for standard output or the screen). The file variable (of type bool) indicates whether the third argument is present. file is set to false by default, but if the third argument is on the command line, file is set to true, and the program will know that a file should be written for the result tree.
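The argument handling described in this list can be sketched as a small helper. Note that this is a variation, not the Pax code itself: the hypothetical Plan( ) method below also rejects a single argument, a case that Pax does not test for. The names ArgsDemo and Plan are assumptions made for the example.

```csharp
using System;

public class ArgsDemo {

    // Decides what to do with the command line.
    // Returns "usage" when the source or stylesheet is missing,
    // "too many" when there are more than three arguments,
    // "file:<name>" when a result file is named, and
    // "console" when the result should go to Console.Out.
    public static string Plan(string[] args)
    {
        if (args.Length < 2) {
            return "usage";
        }
        if (args.Length > 3) {
            return "too many";
        }
        if (args.Length == 3) {
            return "file:" + args[2];
        }
        return "console";
    }

    public static void Main(string[] args)
    {
        Console.WriteLine(Plan(args));
    }
}
```

Keeping the decision in one method that returns a value, rather than calling Environment.Exit( ) in several places, makes the logic easier to check in isolation.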

The XslTransform class comes from the System.Xml.Xsl namespace. This line instantiates a transformer named xslt:

XslTransform xslt = new XslTransform();

The classes that follow are in the System.Xml.XPath namespace:

XPathDocument doc = new XPathDocument(args[0]);
XPathNavigator nav = doc.CreateNavigator();

An XPathDocument provides a cache for performing the transformation, and the CreateNavigator( ) method from XPathDocument creates an XPathNavigator for navigating the cached document.
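To get a feel for what the navigator does apart from a transformation, the following sketch loads a small document from a string instead of a file and walks the nodes selected by an XPath expression. The class name NavigatorDemo, the method CountMethods( ), and the sample XML are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public class NavigatorDemo {

    // Loads the XML text into an XPathDocument, selects every
    // method element under methods, prints each one's text,
    // and returns how many were found.
    public static int CountMethods(string xml)
    {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator it = nav.Select("/methods/method");
        int count = 0;
        while (it.MoveNext()) {
            Console.WriteLine(it.Current.Value);
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string xml = "<methods><method>transform</method>"
                   + "<method>getParameter</method></methods>";
        Console.WriteLine(CountMethods(xml));
    }
}
```

Pax never calls Select( ) itself. It hands the navigator to the transformer, which does the navigating on the stylesheet's behalf.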

The Load( ) method from XslTransform loads the stylesheet from the second argument (args[1]) to the program:

xslt.Load(args[1]);

In C#, the XmlTextWriter class from the System.Xml namespace creates a writer for XML output:

XmlTextWriter writer;
if (file) {
    // Output to file with ASCII encoding
    writer = new XmlTextWriter(args[2], Encoding.ASCII);
} else {
    // Output to console
    writer = new XmlTextWriter(Console.Out);
}

If a third argument is present on the command line, file is set to true, and the output from the program will be written to a file encoded as US-ASCII. Encoding is a class in the System.Text namespace, and ASCII is one of its static properties. Other properties of the class include UTF8 for UTF-8 output, Unicode for UTF-16 (little-endian) output, and BigEndianUnicode for UTF-16BE. If file is false, the output will be written to the console using the console's own encoding, which is IBM437 on a Windows command prompt that uses code page 437.
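The link between the Encoding you pass in and what ends up in the output can be tried out without touching the filesystem. The sketch below, which is not part of Pax, writes a tiny document to a MemoryStream with a given encoding and returns the text, so you can inspect the XML declaration the writer produces. EncodingDemo and Declare( ) are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public class EncodingDemo {

    // Writes an XML declaration and an empty methods element
    // using the given encoding, then decodes the bytes back
    // into a string with that same encoding.
    public static string Declare(Encoding enc)
    {
        MemoryStream stream = new MemoryStream();
        XmlTextWriter writer = new XmlTextWriter(stream, enc);
        writer.WriteStartDocument();
        writer.WriteStartElement("methods");
        writer.WriteEndElement();
        writer.Flush();
        string text = enc.GetString(stream.ToArray());
        writer.Close();
        return text;
    }

    public static void Main()
    {
        Console.WriteLine(Declare(Encoding.ASCII));
        Console.WriteLine(Declare(Encoding.UTF8));
        Console.WriteLine(Declare(Encoding.Unicode));
    }
}
```

Each declaration should name the encoding that was handed to the writer. Encodings that emit a byte order mark, such as UTF-8 here, may also leave that mark at the start of the returned string.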

The following line tells the writer to use an XML declaration in the output:

writer.WriteStartDocument();

Without this line, no XML declaration is written. These lines of code set the indentation of the output to a single space per level of nesting:

writer.Formatting = Formatting.Indented;
writer.Indentation = 1;
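To see the effect of these two settings in isolation, here is a sketch that writes a two-level document to a string with a chosen indentation. It is only an illustration. IndentDemo and Write( ) are names invented for this example.

```csharp
using System;
using System.IO;
using System.Xml;

public class IndentDemo {

    // Writes <methods><method>transform()</method></methods>
    // with indented formatting and returns the resulting text.
    public static string Write(int indentation)
    {
        StringWriter text = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(text);
        writer.Formatting = Formatting.Indented;
        writer.Indentation = indentation;   // spaces per nesting level
        writer.WriteStartElement("methods");
        writer.WriteElementString("method", "transform()");
        writer.WriteEndElement();
        writer.Close();
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Write(1));
        Console.WriteLine(Write(4));
    }
}
```

With an indentation of 1, the method element starts one space in from its parent. With 4, it starts four spaces in.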

Formatting and Indentation are properties from the XmlTextWriter class. The next line performs the actual transformation:

xslt.Transform(nav, null, writer, null);

The XslTransform instance xslt loaded the stylesheet earlier with its Load( ) method. The first argument to Transform( ) is an XPathNavigator instance, and the third argument is an XmlTextWriter instance. The second argument, which is null here, can be an XsltArgumentList that provides parameters or extension objects to the transform. The fourth argument, also null, can be an XmlResolver for resolving external resources such as those named by the XSLT document( ) function. The final statement in the program closes the XmlTextWriter object writer, automatically closing any elements or attributes that might still be open:

writer.Close();
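Pax passes null where an XsltArgumentList could go. As a sketch of what a non-null value would look like, the following program passes one stylesheet parameter and runs the whole transformation in memory. The class ParamDemo, the method Run( ), the label parameter, and the inline source and stylesheet are all assumptions made for illustration. Later versions of the .NET Framework mark XslTransform as obsolete in favor of XslCompiledTransform, so a newer compiler will probably emit warnings for this code.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public class ParamDemo {

    const string Source =
        "<methods><method>a</method><method>b</method></methods>";

    // Hypothetical stylesheet: reports the number of method
    // elements and copies the label parameter into an attribute.
    const string Stylesheet =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='label'/>"
      + "<xsl:template match='/'>"
      + "<result label='{$label}'>"
      + "<xsl:value-of select='count(//method)'/>"
      + "</result>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    public static string Run(string label)
    {
        // Load the stylesheet from a string rather than a file
        XslTransform xslt = new XslTransform();
        xslt.Load(new XmlTextReader(new StringReader(Stylesheet)));

        // Cache the source document and get a navigator for it
        XPathDocument doc = new XPathDocument(new StringReader(Source));
        XPathNavigator nav = doc.CreateNavigator();

        // Supply a value for the stylesheet's label parameter
        XsltArgumentList xsltArgs = new XsltArgumentList();
        xsltArgs.AddParam("label", "", label);

        // Collect the result tree in a string
        StringWriter text = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(text);
        xslt.Transform(nav, xsltArgs, writer, null);
        writer.Close();
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run("demo"));
    }
}
```

The second argument to AddParam( ) is the parameter's namespace URI, which is empty here because the stylesheet declares label without a prefix.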

17.2.3 Running Pax

A compiled version of Pax is in examples/ch17 (Pax.exe). To run Pax, type the following line at a Windows 2000/XP command prompt:

pax

If all is well, the program will return some usage information to you:

Pax C# XSLT processor
Usage: Pax source stylesheet [result]

To transform test.xml with test.xsl, type:

pax test.xml test.xsl

With this command, you will get the following results:

<?xml version="1.0" encoding="IBM437"?>
<methods>
 <method>clearParameters()</method>
 <method>getErrorListener()</method>
 <method>getOutputProperties()</method>
 <method>getOutputProperty(String name)</method>
 <method>getParameter(String name)</method>
 <method>getURIResolver()</method>
 <method>setErrorListener(ErrorListener listener)</method>
 <method>setOutputProperties(Properties oformat)</method>
 <method>setOutputProperty(String name, String value)</method>
 <method>setParameter(String name, Object value)</method>
 <method>setURIResolver(URIResolver resolver)</method>
 <method>transform(Source xmlSource, Result outputTarget)</method>
</methods>

The output encoding is set to IBM437 for screen output. You can also save the output to a file using:

pax test.xml test.xsl pax.xml

The output encoding in pax.xml is US-ASCII as set by the Encoding.ASCII property. If you want to alter this program, you'll also need to know how to recompile it.

17.2.4 Compiling Pax

With the .NET Framework Version 1.1 downloaded and installed on your system, you should be able to access the C# compiler csc. If you type csc at a command prompt with no options, you should see this:

Microsoft (R) Visual C# .NET Compiler version 7.10.2292.4
for Microsoft (R) .NET Framework version 1.1.4322
Copyright (C) Microsoft Corporation 2001-2002. All rights reserved.

fatal error CS2008: No inputs specified

To view the many options available with csc, enter:

csc /help

To compile Pax, type this command:

csc Pax.cs

Upon success, the compiler will produce a new version of Pax.exe. For more information on C#, study the vast documentation provided with the Version 1.1 Framework download. You can access the documentation by clicking on the Documentation link under Microsoft .NET Framework SDK v1.1 under Programs or All Programs on the Start menu.



Learning XSLT
ISBN: 0596003277
Year: 2003
Pages: 164