Transforming and Formatting Content | Content Networking Fundamentals

You can use XSL to transform, filter, sort, and format your content. The XSL family consists of XSLT, XPath, and XSL-FO. Use XSLT for transforming XML documents, XPath for addressing parts of an XML document, and XSL-FO for formatting XML documents.

You have two options to transform your XML documents for the purpose of publishing them to the web:

Transforming XML to XHTML/HTML: You can translate your XML documents into HTML or XHTML and apply CSSs to the documents for display in a web browser.
Transforming XML to XSL-FO: You can translate your XML documents directly into XSL-FO. You can then use a third-party program to convert the standard XSL-FO into HTML/XHTML/CSS (as mentioned previously), Braille, bar codes, Adobe PDF, PostScript, SVG, Abstract Windowing Toolkit (AWT), or Maker Interchange Format (MIF).

Note

You can apply XSL stylesheets to your content within your network using the Cisco Application Oriented Network (AON) network modules for the Catalyst 6500 series switches and Cisco 2600/2800/3700/3800 Series routers. For more information on the AON, refer to its product documentation on Cisco.com.

Transforming XML to XHTML/HTML

Consider an application where you would like to publish an outline of this book on the web. The outline will contain the book title, author name, and Chapter titles, each followed by the goals within that Chapter. The content of each Chapter's sections and subsections will not be included in the outline, but assume that the entire book is available and marked up with the elements discussed previously in the simple DTD in Table 7-1. Figure 7-7 gives the sample XML file containing the outline of the first two chapters.

Figure 7-7. A Sample Book Outline XML File
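
The figure is not reproduced in this text. As a stand-in, the following is a minimal sketch of what such an outline file could look like. The element and attribute names (book, chapter, chaptertitle, chaptergoals, goal, and the title and author attributes) are taken from the XPath expressions used in Example 7-5; the author name, chapter titles, and goals are placeholder values, not the book's actual content.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: every text value below is a placeholder. -->
<book title="Content Networking Fundamentals" author="Author Name">
  <chapter>
    <chaptertitle>Chapter 1: First Chapter Title</chaptertitle>
    <chaptergoals title="Chapter Goals">
      <goal>First goal of the first chapter.</goal>
      <goal>Second goal of the first chapter.</goal>
    </chaptergoals>
  </chapter>
  <chapter>
    <chaptertitle>Chapter 2: Second Chapter Title</chaptertitle>
    <chaptergoals title="Chapter Goals">
      <goal>First goal of the second chapter.</goal>
      <goal>Second goal of the second chapter.</goal>
    </chaptergoals>
  </chapter>
</book>
```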

Because you define custom element names in XML, name conflicts may occur when the same name from different DTDs is used to describe two different types of elements. You can use XML namespaces to provide unique element names within an XML document. In Figure 7-8, the namespace prefix is the string "xsl:" that precedes all of the XSL elements. You are required to use a namespace to differentiate elements among languages. The particular application that parses the document will know what to do with the specific elements based on the namespace that the prefix is bound to. For example, an XSLT parser will look for the namespace URI bound to the xsl: prefix and perform the intended actions based on the elements in the document. Similarly, an XSL-FO parser will see the namespace bound to the "fo:" prefix and perform the appropriate actions using the respective elements.

Figure 7-8. Sample XSLT File Transforming XML to HTML

Parsers do not use the URL of the namespace to retrieve a DTD or schema for the namespace; the URL is simply a unique identifier within the document. According to W3C, the definition of a namespace simply defines a two-part naming system and nothing else. However, you must define namespaces of individual markup languages with a specific URL for the parsing application to take action on tags within the context of the language. For example, you must define the XSLT namespace with the URL "http://www.w3.org/1999/XSL/Transform" in your documents. Additionally, you must define the XSL-FO namespace with "http://www.w3.org/1999/XSL/Format". That said, many simple XML parsing applications do not require namespaces in order to differentiate elements; they simply treat all elements as within the same namespace. However, this Chapter uses namespaces strictly for illustration purposes.
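
To make the distinction between prefix and URL concrete, here is a small sketch. It binds the XSLT namespace to the prefix "t" instead of "xsl"; the choice of "t" is arbitrary and made only for illustration. A conforming XSLT 1.0 processor should treat the t: elements as XSLT instructions because it matches on the namespace URL, and it should copy the fo:block element to the output as ordinary content.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The prefix "t" is an arbitrary choice; the URL it is bound to
     is what identifies these elements as XSLT. -->
<t:stylesheet version="1.0"
    xmlns:t="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <t:output method="xml" indent="yes"/>
  <t:template match="/">
    <!-- fo:block is not in the XSLT namespace, so it is written
         to the result unchanged. -->
    <fo:block>
      <t:value-of select="book/@title"/>
    </fo:block>
  </t:template>
</t:stylesheet>
```

Changing either URL, even by one character, would break the association, whereas renaming a prefix consistently would not.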

In Figure 7-8, the namespace for XSL is defined so that the XSLT parser recognizes the XSL-specific elements value-of, for-each, and number. An XSLT parser takes the XSLT file and the XML source file as input. It processes these two files and outputs an HTML file as a well-formed XML document. There are many other XSLT elements available to you, but you should know at least these three to understand the content transformations in this section.

The first line in Figure 7-8 defines the XML version and encoding scheme. The second line defines the namespace for the XSL elements within the document. The XSL element "template" imposes a logical template for the whole document. The parser outputs the <head> and <html> tags without modification. The two <h1> tags within the <head> section are output containing the title and author name attributes from the source file.

Note

The elements <h1>, <h2>, and <h3> have specific implied formats when read by browsers in HTML. However, you can adjust the implied formats of these tags using CSS, as discussed in the next section.

The XSLT language organizes the XML elements into a tree structure, using XPath, similar to the way in which a standard computer file system organizes files. You reference element or attribute nodes by specifying the path to them, starting at the current location in the tree. In this case, the root element contains the desired attribute, so you should use the path "book/@title". The "@" character tells the parser to select an attribute as opposed to an element.
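
As a rough illustration of these paths, the sketch below prints three values as plain text. It assumes a source document whose root element is book, with chapter, chaptertitle, chaptergoals, and goal elements beneath it, as in the paths used in Example 7-5.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- The current location here is the document root, so each
       path starts with the book element. -->
  <xsl:template match="/">
    <!-- "@" selects an attribute of book. -->
    <xsl:value-of select="book/@title"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Without "@", each step selects child elements. In XSLT 1.0,
         value-of prints only the first matching node. -->
    <xsl:value-of select="book/chapter/chaptertitle"/>
    <xsl:text>&#10;</xsl:text>
    <!-- A path can also feed an XPath function such as count(). -->
    <xsl:value-of select="count(book/chapter/chaptergoals/goal)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```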

The parser then reads the content of the book from the XML source file. The for-each element iterates through each of the elements given within the select attribute. Within the outer for-each element, the parser first outputs the Chapter goals title and then begins another for-each loop to iterate through the list of Chapter goals. Figure 7-9 gives the output from the XSLT translation file in Figure 7-8. The text view is the exact text output by the XSLT file, and the browser view is how the HTML looks from an HTML 4.0-based web browser's interpretation of the tags.
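
The figures are not reproduced in this text, so the following is only a sketch of a stylesheet along the lines described above, not the original listing. It assumes the element names used in Example 7-5 and places the headings in the body of the page, which is an assumption about the layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>
  <xsl:template match="/">
    <!-- Literal HTML tags are copied to the output as written. -->
    <html>
      <head>
        <title><xsl:value-of select="book/@title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="book/@title"/></h1>
        <h1>By: <xsl:value-of select="book/@author"/></h1>
        <!-- Outer loop: one pass per chapter element. -->
        <xsl:for-each select="book/chapter">
          <h2><xsl:value-of select="chaptertitle"/></h2>
          <h3><xsl:value-of select="chaptergoals/@title"/></h3>
          <!-- Inner loop: one numbered paragraph per goal. -->
          <xsl:for-each select="chaptergoals/goal">
            <p>
              <xsl:number value="position()" format="1. "/>
              <xsl:value-of select="."/>
            </p>
          </xsl:for-each>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Inside each for-each, paths are relative to the node being visited, which is why the inner expressions no longer begin with book.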

Figure 7-9. Output HTML from XSLT Transformation

Note

To transform the XML source file into a WML file instead, you require a new XSL transformation file. Instead of outputting HTML tags, you output WML tags, leaving the overall flow of the XSLT file the same.
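
A minimal sketch of such a WML variant follows. It assumes the WML 1.1 document type and the element names used in Example 7-5; the card id and title values are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Emit XML with the WML 1.1 DOCTYPE (assumed target version). -->
  <xsl:output method="xml" indent="yes"
      doctype-public="-//WAPFORUM//DTD WML 1.1//EN"
      doctype-system="http://www.wapforum.org/DTD/wml_1.1.xml"/>
  <xsl:template match="/">
    <wml>
      <!-- A WML deck is made of cards; one card is enough here. -->
      <card id="outline" title="Book Outline">
        <p><b><xsl:value-of select="book/@title"/></b></p>
        <xsl:for-each select="book/chapter">
          <p><xsl:value-of select="chaptertitle"/></p>
          <xsl:for-each select="chaptergoals/goal">
            <p>
              <xsl:number value="position()" format="1. "/>
              <xsl:value-of select="."/>
            </p>
          </xsl:for-each>
        </xsl:for-each>
      </card>
    </wml>
  </xsl:template>
</xsl:stylesheet>
```

The two for-each loops are unchanged from the HTML case; only the literal output tags differ.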

Using Cascading Style Sheets

You can use CSSs to separate the formatting of a web document from the content in the document. Style sheets are useful because you can locate them in files that are separate from the content, allowing for multiple formats for the same content. For example, you can create two versions of your website: a standard style and a style for the visually impaired containing clearer images and larger font. Another example is the format specific to the different series of Cisco IP phones. Each series of IP phone has a different size display and requires special consideration with respect to content placement.

The concept of CSSs gives your authors the ability to blend different style sheets into the same document, as opposed to using completely separate styles for different groups of end users or different displays. For example, the author of a Cisco.com page can apply three different style sheets for a page within the Cisco TAC website. The first style sheet may impose the Cisco corporate look and feel. The second style sheet may apply to the standard TAC presentation, and the third may apply a format for the series of TAC documents that the author is writing for, such as network troubleshooting topics.
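
One common way to blend style sheets is to link several of them from the same page. The sketch below shows the idea in XHTML; the three file names and the page title are hypothetical and stand in for the corporate, TAC, and topic-specific style sheets described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Network Troubleshooting Topic</title>
    <!-- Sheets are applied in order. Where two rules of equal
         specificity conflict, the later sheet wins. -->
    <link rel="stylesheet" type="text/css" href="corporate.css"/>
    <link rel="stylesheet" type="text/css" href="tac.css"/>
    <link rel="stylesheet" type="text/css" href="tac-troubleshooting.css"/>
  </head>
  <body>
    <h1>Network Troubleshooting Topic</h1>
    <p>Page content goes here.</p>
  </body>
</html>
```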

You can use the CSS file in Example 7-2 to format the HTML generated in Figure 7-9.

Example 7-2. Sample CSS File for Formatting a Standard HTML Document

body { background-color: #FFFFFF; }
h1 { font-family: Arial, sans-serif; font-size: 20px; color: #660000; text-align: center; }
h2 { font-family: Arial, sans-serif; font-size: 16px; color: #660000; }
h3 { font-family: Arial, sans-serif; font-size: 14px; color: #003333; }
p { font-family: Arial, sans-serif; font-size: 12px; color: #003333; }

Alternatively, you can generate HTML using XSLT to include CSS classes. Figure 7-10 illustrates how you can use XSLT to generate HTML with CSS classes, to provide a robust formatting solution to your XML documents.

Figure 7-10. XSLT for Generating HTML with Embedded CSS Classes

The XSLT file in Figure 7-10 will generate the formatted document in Figure 7-11.
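
Because the figures are not reproduced in this text, here is a sketch of how an XSLT file can attach CSS classes to the HTML it generates. It is not the original listing: the class names (booktitle, author, chaptertitle, goalstitle, goal) and the outline.css file name are hypothetical, and the element names are those used in Example 7-5.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>
  <xsl:template match="/">
    <html>
      <head>
        <title><xsl:value-of select="book/@title"/></title>
        <!-- Hypothetical external style sheet that defines the classes. -->
        <link rel="stylesheet" type="text/css" href="outline.css"/>
      </head>
      <body>
        <p class="booktitle"><xsl:value-of select="book/@title"/></p>
        <p class="author">By: <xsl:value-of select="book/@author"/></p>
        <xsl:for-each select="book/chapter">
          <p class="chaptertitle"><xsl:value-of select="chaptertitle"/></p>
          <p class="goalstitle"><xsl:value-of select="chaptergoals/@title"/></p>
          <xsl:for-each select="chaptergoals/goal">
            <p class="goal">
              <xsl:number value="position()" format="1. "/>
              <xsl:value-of select="."/>
            </p>
          </xsl:for-each>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With classes, the same HTML tag can carry different formatting depending on its role, instead of one format per tag name.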

Figure 7-11. Sample HTML Document Formatted with CSS

You can use the CSS file in Figure 7-12 to provide the format attributes for the classes described previously.

Figure 7-12. Sample CSS File Using Classes

Style sheets are beneficial when you need to render a large number of documents in the same style. With a standard XML format, your authors can create content and use a given set of markup tags to describe the content. If the documents require different versions, such as XHTML and HTML for online viewing or Braille and PDF for printing, you will require a separate XSL transformation file for each. At any time, you can create new versions of the content by writing a new transformation file without changing the XML source files. Moreover, you can further separate the style and formatting using CSS. In the future, if a style or layout change to the documents is required, only the style sheets require modification, not the source XML file or XSL transformation file.

Transforming XML to XSL-FO

Now that you have a solid understanding of XML, XSL, and CSSs, you can tackle the more complex and highly powerful style sheet formatting language called XSL Formatting Objects (XSL-FO). As with CSS, you can use XSL-FO to format XML data for output. However, unlike CSS, XSL-FO is XML-based, and you can use it to further mark up XML by including descriptive formatting elements. Once marked up with XSL-FO, the formatted XML files can be output into various formats using third-party XSL-FO processors. The output formats can include any of the online display markup languages discussed in this Chapter, such as HTML, XHTML, and WML. However, the most common use of XSL-FO is to produce typeset documents for print in Adobe PDF format.

Example 7-3 shows the general format of an XSL-FO file. Notice that the "fo:" namespace prefix precedes all the XSL-FO elements in the document.

Example 7-3. The General Format for an XSL-FO Document.

<?xml version="1.0" encoding="ISO-8859-1"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="A4">
      <!-- Page template goes here -->
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="A4">
    <!-- Page content goes here -->
  </fo:page-sequence>
</fo:root>

The <fo:layout-master-set> element declares the page layouts for the document. For your book outline project, you need only a single page layout. XSL-FO divides each page into regions; this example uses three of them, region-before, region-body, and region-after, and names them HEADER, CONTENT, and FOOTER for clarity. Within the page master, called BOOK-OUTLINE, you define the characteristics of each region. When supplying the content of the page, you reference the page master BOOK-OUTLINE and the particular regions in which the content will reside.

The last part of the document specifies the actual content for output. The page-sequence element specifies the format for each page in your output document and references BOOK-OUTLINE for the placement and structure of content on the page. In the book outline example, the HEADER region contains the book title and author name, the CONTENT region contains the book outline, and the FOOTER region contains the copyright information. The HEADER and FOOTER content does not change. As such, you should define the HEADER and FOOTER content with <fo:static-content> elements. If the data in this example happened to span multiple printed pages, the header and footer data would repeat unchanged on every page. Conversely, if content is not destined for print, as in previous examples in this Chapter, the header and footer remain at the top and bottom of the page.

For content that changes from page to page, use the <fo:flow> element to output the content. The <fo:block> element specifies each area within a flow. Attributes of the <fo:block> element give the specific formatting for each block. For example, the block containing the content for the Chapter goals would contain the formatting attributes in Example 7-4.

Example 7-4. Sample "Chapter Goal" XSL-FO Block

<fo:block
    background-color="#EEEFEE"
    margin="30px"
    border="1px solid #003333"
    border-left="15px solid #003333"
    text-indent="40px"
    line-height="25px"
    font-family="Times"
    font-size="11pt"
    color="black">
</fo:block>

The attributes in Example 7-4 produce the indentation, fonts, and colors that you see in the final output in Figure 7-13. To simplify Figure 7-13, none of the format attributes are included in the XSL-FO output. As an exercise, you can add format attributes based on those provided in Example 7-4 and the CSS example in Figure 7-12 to produce the same results.

Figure 7-13. Sample XML Document Formatted with XSL-FO and Generated into an Adobe PDF Document Using an XSL-FO Processor

In Figure 7-13, the XSL-FO document includes the content from the source XML file. In order to automatically populate the content from a source XML file, you can use the XSLT document in Example 7-5. The XSLT elements from the previous examples are in bold; the parser generates the non-bolded text as the XSL-FO output in Figure 7-13.

Example 7-5. XSL File That Generates XSL-FO with Embedded XML Content

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <xsl:template match="/">
    <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <!-- Page layout: one page master with three named regions -->
      <fo:layout-master-set>
        <fo:simple-page-master master-name="BOOK-OUTLINE">
          <!-- region-body is listed first, as the XSL-FO content model requires -->
          <fo:region-body margin-top="60px" region-name="CONTENT"/>
          <!-- extent chosen to match the body's top margin -->
          <fo:region-before extent="60px" region-name="HEADER"/>
          <fo:region-after extent="30px" region-name="FOOTER"/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="BOOK-OUTLINE">
        <!-- Static content repeats unchanged on every page -->
        <fo:static-content flow-name="HEADER">
          <fo:block><xsl:value-of select="book/@title"/></fo:block>
          <fo:block>By: <xsl:value-of select="book/@author"/></fo:block>
        </fo:static-content>
        <fo:static-content flow-name="FOOTER">
          <fo:block>
            Copyright &#xA9; Cisco Press 2005
          </fo:block>
        </fo:static-content>
        <!-- The flow must sit inside the page-sequence, after the static content -->
        <fo:flow flow-name="CONTENT">
          <xsl:for-each select="book/chapter">
            <fo:block>
              <xsl:value-of select="chaptertitle"/>
            </fo:block>
            <fo:block>
              <xsl:value-of select="chaptergoals/@title"/>
            </fo:block>
            <fo:block>
              <!-- One numbered block per chapter goal -->
              <xsl:for-each select="chaptergoals/goal">
                <fo:block>
                  <xsl:number value="position()" format="1. "/>
                  <xsl:value-of select="."/>
                </fo:block>
              </xsl:for-each>
            </fo:block>
          </xsl:for-each>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
</xsl:stylesheet>