How Does XSL Work? | XML and SOAP Programming for BizTalk(TM) Servers (DV-MPS Programming)


W3C XSL has two major purposes: formatting objects and transforming XML documents. To format a document, XSL reads an XML document and applies a set of transformation processes to create another XML document, called a result tree. This result tree adheres to the formatting object namespace, which contains hundreds of elements and attributes that describe the presentation of the XML document. For example, the result tree indicates whether a particular textual object will be bold or italic, red or salmon, inline or block. The result tree does not contain instructions for any particular typesetting language. Those instructions are generated in the next step of the XSL process: formatting object interpretation.

This result tree is read into a formatting object interpreter, which interprets the formatting object elements and attributes and outputs typesetting codes for a particular typesetter. Figure 6-1 illustrates this process.
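The first step, producing a result tree, can be sketched in a few lines of Python. This is only an illustration of the idea, not an XSL processor: the input document and its `<date>` element are hypothetical, the helper name `to_result_tree` is made up for this sketch, and a real formatting object document needs more structure (page masters, flows) than the bare `fo:root` and `fo:block` shown here. The namespace URI is the one used by the XSL formatting object vocabulary.

```python
import xml.etree.ElementTree as ET

# Namespace of the XSL formatting object vocabulary.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Hypothetical input document; the <memo> and <date> names are assumptions.
SOURCE = "<memo><date>December 3, 1997</date></memo>"


def to_result_tree(source_xml: str) -> ET.Element:
    """Transformation step: map each <date> to an fo:block that carries
    generic presentation properties (no typesetter-specific codes)."""
    source = ET.fromstring(source_xml)
    root = ET.Element(f"{{{FO_NS}}}root")
    for date in source.iter("date"):
        block = ET.SubElement(
            root,
            f"{{{FO_NS}}}block",
            {"font-style": "italic", "color": "green"},
        )
        block.text = date.text
    return root


if __name__ == "__main__":
    # Serialize the simplified result tree for inspection.
    print(ET.tostring(to_result_tree(SOURCE), encoding="unicode"))
```

The result tree says only that the date is italic and green. Nothing in it is specific to HTML, RTF, or any other output format.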

In this example, if a designer wants to display a particular piece of text in green italics, all she needs to do is indicate those requirements in generic terms. The font style and color attributes are in the schema referenced by the formatting object namespace. These attributes are set to italic and green. This declarative way of indicating output transcends any particular output medium, which means that the designer needn't worry about particular typesetting codes.

Let's use HTML to show how XSL formatting works. In HTML, inline cascading style sheet (CSS) styles indicate font style and color. The formatting object interpreter for HTML renders the green italic object as STYLE="font-style:italic;color:green;". This string is readable by an HTML typesetter (a Web browser). The final paragraph tag looks like this:

 <P STYLE="font-style:italic;color:green;">December 3, 1997</P>

Suppose you want a paper (rather than an HTML) document. To render our document on paper, we could use a formatting object interpreter that understands the rich text format, or RTF. (Microsoft created RTF syntax in the mid-1980s as a 7-bit ASCII representation of richly formatted word-processing documents. Because RTF was plain text, it was easy to transmit over e-mail and other early transport protocols.) Our example's formatting object interpreter transforms the font-style="italic" property into \i, which turns the text that follows the command into italics. The color="green" property is transformed into \cf6, the RTF foreground-color command, indicating that the color is defined by entry 6 of the color table at the top of the RTF document. The resulting RTF document fragment might look something like this:

 {\cf6\i December 3, 1997\par}
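The second step, formatting object interpretation, can be sketched the same way: one set of generic properties, two interpreters. This is a minimal sketch under stated assumptions, not a real interpreter. The function names `render_html` and `render_rtf` and the color table index are hypothetical, and the RTF branch uses `\cf`, the standard RTF control word for foreground color.

```python
from html import escape

# Generic, medium-independent properties for one block of text.
PROPS = {"font-style": "italic", "color": "green"}
TEXT = "December 3, 1997"

# Hypothetical RTF color table: color name -> index in the table
# that would appear at the top of the RTF document.
RTF_COLOR_INDEX = {"green": 6}


def render_html(text: str, props: dict) -> str:
    """Interpret the properties as an inline CSS STYLE attribute."""
    style = "".join(f"{name}:{value};" for name, value in props.items())
    return f'<P STYLE="{escape(style)}">{escape(text)}</P>'


def rtf_escape(text: str) -> str:
    """Escape the three characters that are special in RTF text."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def render_rtf(text: str, props: dict) -> str:
    """Interpret the same properties as RTF control words."""
    codes = []
    if "color" in props:
        codes.append(f"\\cf{RTF_COLOR_INDEX[props['color']]}")
    if props.get("font-style") == "italic":
        codes.append("\\i")
    return "{" + "".join(codes) + " " + rtf_escape(text) + "\\par}"


if __name__ == "__main__":
    print(render_html(TEXT, PROPS))
    print(render_rtf(TEXT, PROPS))
```

Both functions read the same properties. Only the interpreter knows the codes of its target typesetter, which is why the designer never has to.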


Figure 6-1. The two parts of the W3C XSL presentation process: First the input XML document is transformed into a result tree, and then the result tree is interpreted by a formatting object interpreter optimized for a particular output.

The XSL presentation process provides a powerful model because it allows an organization to get any number of outputs from the same XML inputs and style sheets. Of course, this model is advantageous only if you have support for a formatting object interpreter for the types of outputs you are considering.

Microsoft's Implementation of XSL

As of this writing, the XSL specification is still under development. The formatting object libraries are complex, and many outstanding issues still need resolution.

In 1998, Microsoft judged the transformation part of XSL to be stable enough to implement on its own, working from a working draft of the XSL specification. Microsoft introduced this part of XSL in the MSXML parser, which shipped with Microsoft Internet Explorer 5. This gave developers access to a general-purpose mechanism for XML-to-XML transformation.

XSLT and XPath Breakout

Although some people criticized the Microsoft XSL implementation as an incomplete part of a W3C specification, many others used it to understand the power of a declarative transformation language. Partly as a result of this widespread use, the W3C XSL Working Group extracted the transformation part from XSL and created a new W3C Recommendation, XSL Transformations (XSLT) Version 1.0. While XSL now points to XSLT as its transformation engine, XSL still contains all the formatting object support.

XSLT requires a syntax that enables the selection of certain parts of an XML document. For example, to render a chapter title one way and a section title another way, two rules must be able to specify a path to the appropriate objects in a particular context. Because XSLT is not the only W3C standard that requires this syntax, it was extracted from the XSLT specification as the XML Path Language (XPath). The W3C adopted both XPath and XSLT as Recommendations in November 1999.
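The idea of two rules, each selecting its targets by path, can be illustrated with Python's standard library, whose `xml.etree.ElementTree` module supports a limited subset of XPath. This is a rough sketch, not XSLT: the sample document, its element names, the `RULES` table, and the `apply_rules` helper are all assumptions made for illustration, and the rules run in rule order rather than in document order as an XSLT processor would.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element names are assumptions.
BOOK = """
<book>
  <chapter>
    <title>How Does XSL Work?</title>
    <section>
      <title>Microsoft's Implementation of XSL</title>
    </section>
    <section>
      <title>XSLT and XPath Breakout</title>
    </section>
  </chapter>
</book>
"""

# Each rule pairs a path (which nodes to select) with an output tag
# (how to render them). Both <title> elements share a name, so only
# the path distinguishes a chapter title from a section title.
RULES = [
    ("chapter/title", "H1"),
    ("chapter/section/title", "H2"),
]


def apply_rules(root: ET.Element) -> list:
    """Apply each rule to the nodes its path selects, in rule order."""
    output = []
    for path, tag in RULES:
        for node in root.findall(path):
            output.append(f"<{tag}>{node.text}</{tag}>")
    return output


if __name__ == "__main__":
    for line in apply_rules(ET.fromstring(BOOK)):
        print(line)
```

The element name alone cannot tell the two kinds of title apart, so a selection syntax has to express where a node sits in the tree.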