2.1 The XML family of Recommendations

2.1.1 Extensible Markup Language (XML)

To maintain text-based information in a hierarchical structure, the Extensible Markup Language (XML) describes a class of data objects called "XML documents" and partially describes the behavior of computer programs that process these objects:

http://www.w3.org/TR/REC-xml

A Recommendation fulfilling two objectives for information representation. We use XML to capture our information in a markup language defined by a vocabulary of elements and attributes described by a document model. We can presume a vocabulary informally using only XML-defined constraints, or we can formally declare the grammar, or model, of the vocabulary so that we may validate that our information adheres to our additional constraints. This vocabulary represents the labels and the granularity of the concepts we use for the expression of our information.

Nothing in XML is related to presentation or rendition. When we present documents visually or aurally, we are rendering the content according to the presentation semantics our stylesheets confer on the content. Nothing inherent in XML is related to presentation or rendition; it is entirely up to other standards to define presentation semantics and the syntax with which to engage them, and to presentation software to interpret these vocabularies and implement the semantics associated with the presentation vocabularies.

The vocabulary of elements and attributes used in an instance can be validated. An XML document is considered well-formed when it adheres to the constraints defined by the XML Recommendation. A set of user-defined constraints on the allowed elements, attributes, and text of the vocabulary can be specified at a grammar level by a declarative document model, or implemented at a semantic level by the application processing the information in the instance.

A grammar or document model can specify structural validation for the nesting and order of elements, their use of attributes, and the use of text. Certain aspects of content can be lexically validated and checked for referential integrity. A distinct process, separate from the applications acting on the information, can analyze an XML document against the formalized user-defined constraints of the document model expressed in an XML 1.0 Document Type Definition (DTD). More recently, other validation mechanisms became available, including RELAX NG, Schematron, W3C's XML Schema, and the upcoming Document Schema Definition Language (DSDL, ISO/IEC 19757).

When the constraints cannot be defined with the expressiveness of a document model technology, an application can algorithmically validate an XML document by interpreting the semantics associated with the labels used in the structure of the document. Information "means" exactly what an application processing the information wants it to mean. Thus, an application can analyze the structure and content of an XML document for appropriateness to the application's purpose. It can test conditions or constraints that cannot be expressed in a formal document model syntax. It can algorithmically determine validity to support requirements that are not easily expressed declaratively.

Consider the simple well-formed XML instance purc.xml in Example 2-1.

Example 2-1 A well-formed XML purchase order instance

<?xml version="1.0"?>
<purchase>
  <customer db="cust123"/>
  <product db="prod345">
    <amount>23.45</amount>
  </product>
</purchase>

The constructs in Example 2-1 represent purchasing information only because we recognize the names and assume they reflect the concepts that we understand. If we misunderstand the names used, yet believe we understand them correctly, we can "process" the information using our assumed semantics without invalidating the information as presented to us.

In the same way, we can feed this information to any XML-based application and the application can act on the names it has been programmed to recognize, thus interpreting the semantics of the information in the only way it knows. This may not, however, be the semantics assumed by the author of the document.

Indeed it is a common misconception that our document models somehow formally describe the semantics of our information. In fact, they only describe the vocabulary with which we identify components of our information. All we can use a document model for is validating that the structure and content of a character stream conforms to the constraints described using the features and limitations of the expression of the model description. XML 1.0 describes the Document Type Definition (DTD) expression of the grammar of an XML vocabulary. Other approaches mentioned before offer different benefits and limitations compared to using a DTD, and are candidates for validating the contents of XML documents. In all these cases the semantics represented by the vocabulary are defined in prose comments or supplemental documents.
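As a sketch of such a declared grammar, the instance of Example 2-1 could carry a DTD in its internal declaration subset. The content models below are assumptions made for illustration (for example, that a purchase holds one customer followed by one or more products); nothing in Example 2-1 itself dictates them.

```xml
<?xml version="1.0"?>
<!DOCTYPE purchase [
  <!-- Assumed structure: one customer, then one or more products -->
  <!ELEMENT purchase (customer, product+)>
  <!ELEMENT customer EMPTY>
  <!ATTLIST customer db CDATA #REQUIRED>
  <!-- Assumed structure: each product carries exactly one amount -->
  <!ELEMENT product (amount)>
  <!ATTLIST product db CDATA #REQUIRED>
  <!-- A DTD can only say "text here"; it cannot say "a number" -->
  <!ELEMENT amount (#PCDATA)>
]>
<purchase>
  <customer db="cust123"/>
  <product db="prod345">
    <amount>23.45</amount>
  </product>
</purchase>
```

Note what the model does not capture: it names and orders the constructs, but it says nothing about what an amount means, and it would accept any text at all as the content of amount. Those matters are left to the prose documentation and to the applications processing the instance.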

The semantics of our information are formally described (i.e. interpreted) by the applications we use to process our documents, because a document only means what our applications think it means, through the processes they employ against our information by following the corresponding labels they find therein.

XML vocabularies can be translated to an application's specific vocabulary. If our documents are composed in the same structure we wish to use for presentation, we can choose to decorate our structures with recognized formatting properties if that is sufficient to the presentation environment. If an alternative structure is needed, either in the same vocabulary or a vocabulary specific to the presentation technology, then we must rearrange and/or transform documents from our vocabulary into the presentation vocabulary.

In practice, it is far more beneficial in the long run to design your document structures according to your business practices and your plans for creating and maintaining your information. You should not be designing your models according to how you plan to present your information, as you may have many different ways you wish to do the presentation, both current and unexpected in the future. Through transformation, you can rearrange any information you create and maintain into the order in which you wish to present it. It may be difficult to accommodate business practices and information access requirements if you lock your information into a single presentation.

Markup-based rendering agents are programmed to recognize vocabularies geared for the presentation of information. These vocabularies may be purely attribute-based, as is true for Cascading Style Sheets (CSS). Other vocabularies are composed of both elements and attributes, as is true for the HyperText Markup Language (HTML) and Scalable Vector Graphics (SVG). When we want our information to be rendered by one of these agents, we must understand the presentation semantics implemented by the rendering agents.

To present our information, we are in effect interpreting the semantics of our vocabularies and choosing to represent our information packaged into the semantics of a rendering device. We must, therefore, transform instances of our vocabularies into instances of the rendering vocabularies in order for the rendering agents to present our information the way we wish. For example, we can transform our XML vocabularies into a combination of HTML and CSS vocabularies to render our XML documents in a wide range of web browsers that may, or may not, support CSS stylesheets. Those browsers recognizing CSS will make use of CSS presentation semantics, while browsers not recognizing CSS will use the accepted presentation semantics inferred by convention for the HTML vocabulary.

Namespaces distinguish constructs from different vocabularies. Properly identifying constructs in the information through their labels is essential to implementing the semantics of our data in our applications. A rigorous method of identifying constructs in XML information uses namespaces by prefixing element-type names with lengthy URI reference strings governed by ownership through domain name registration.

Such URI prefixes are included in names by associating a namespace prefix with the URI reference string and then using that prefix in the markup of the document to identify the names of element types and attributes.

But, again, these names are merely labels. The use of namespaces provides a more powerful labeling mechanism by employing names that incorporate the essence of ownership through the domain names of URI references. When we model our vocabularies, we must decide if we are going to use simple non-namespace-based naming conventions or namespace-based naming conventions, and accommodate the presence of the namespace URI in our applications if necessary.

An application can, therefore, be rigorous in recognizing namespace-based names to which it applies the assumed semantics of the information. With proper namespace maintenance, this eliminates any risk of improperly recognizing a construct's label, though of course this does not prevent an application from making an incorrect assumption of the semantics of the information based on a correct label.

Moreover, an application should be prepared to accommodate namespaces it does not recognize, either through defined processing or perhaps by reporting an error.
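To illustrate the prefix mechanism described above, here is a sketch of the purchase order of Example 2-1 recast with namespace-based names. Both URI reference strings and both prefixes are hypothetical, chosen only for this illustration.

```xml
<?xml version="1.0"?>
<!-- The prefix "po" is only shorthand; the URI is the real identifier -->
<po:purchase xmlns:po="http://www.example.com/ns/purchase"
             xmlns:ship="http://www.example.org/ns/shipping">
  <!-- Unprefixed attributes such as db are in no namespace -->
  <po:customer db="cust123"/>
  <po:product db="prod345">
    <po:amount>23.45</po:amount>
  </po:product>
  <!-- A construct from a second vocabulary that a purchasing
       application may not recognize -->
  <ship:method>courier</ship:method>
</po:purchase>
```

An application comparing names rigorously would match on the pair of URI and local name, not on the prefix, so a document using a different prefix for the same URI identifies the same constructs. The ship:method element stands in for the unrecognized-namespace case: the application must decide whether to skip it, process it in some defined way, or report an error.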

2.1.2 Document Style Semantics and Specification Language (DSSSL)

There is a large constituency of people who are not fond of electronic displays that are difficult to navigate when a lot of information appears to flow down infinitely long windows of information. We are all familiar with the printed form of presenting information and the roles played by collections of pages, as well as with the tools used to comfortably navigate through a large collection of pages. The Document Style Semantics and Specification Language (DSSSL) gives information providers the technology to paginate information incorporating familiar page-based navigational tools:

ISO/IEC 10179:1996
http://www.y12.doe.gov/sgml/wg8/dsssl/readme.htm

Transforming and formatting structured information. The Document Style Semantics and Specification Language (DSSSL) is the International Standard (IS) for styling information. This IS includes a transformation language to rearrange structured information, pagination semantics for presenting it in printed form, and an extension mechanism for implementing arbitrary formatting semantics.

A programming language for transforming structured information. DSSSL incorporates both a transformation language and a styling language, implemented using a side-effect-free dialect of the Scheme language. This functional programming language is very powerful, and complex algorithms can be implemented succinctly with its tight LISP-like syntax.

Unfortunately, the abundant parentheses in the specification language scared a lot of people away from DSSSL and it never achieved the recognition or acceptance it should have in the industry. As a result of shying away from the specification language, our markup community never learned the style semantics side of this international standard, and DSSSL was often ignored when it should have been an important contribution to a number of efforts.

A standardized set of formatting semantics for paginated output. DSSSL includes extensive pagination semantics: a set of characteristics and their values that are used both to flow information on folios (e.g. printed pages) and to format the appearance of that information. Both simple and complex page geometries can be specified. Users specify the desired intent of the result of the formatting process by using the semantics described in the standard.

The design of DSSSL is different from traditional publishing software applications and employs an arms-length model regarding rendering. DSSSL does not specify the rendering process itself, only the interpretation of the formatting intent into what needs to be rendered. A DSSSL stylesheet specifies the intent of what is desired, and a DSSSL engine interprets that intent on the given rendering medium.

The members of the DSSSL development group represented a wide range of users and formatting software vendors. Between them they isolated essential formatting concepts (i.e. semantics), gave them labels (i.e. names), and specified their possible properties and values.

DSSSL is truly internationalized. The semantics have no bias to any particular writing direction. For example, a stylesheet written for the left-to-right writing direction can simultaneously support top-to-bottom or right-to-left writing systems without changes.

A framework for implementation-defined sets of formatting semantics. DSSSL is extensible so that a stylesheet writer can utilize any set of semantics defined by a given DSSSL processor. Stylesheets can declare the existence of a formatting concept and then use that concept in the intent for the result.

James Clark, the author of the JADE DSSSL formatting engine (http://www.jclark.com/jade/jade.htm), specified and implemented a set of formatting semantics that represents markup syntax. Using these semantics, one can effect an instance transformation by "styling" one's input document into output markup. The OpenJade project (openjade.sf.net) continues the development of JADE.

Custody of ISO/IEC JTC 1/SC 34/WG 2. The International Organization for Standardization (ISO) has many committees for standardization work in various aspects of our daily lives. The joint technical committee (JTC) with the International Electrotechnical Commission (IEC) is responsible for information technology. The subcommittee (SC) for document description and processing languages is numbered 34 and the second working group (WG) of this subcommittee is responsible for DSSSL and other formatting issues such as fonts.

The full title of the working group is ISO/IEC JTC 1/SC 34/WG 2. Prior designations for the committee that has worked on DSSSL from its inception include ISO/IEC JTC 1/WG 4, and ISO/IEC JTC 1/SC 18/WG 8. The working group continues to support the evolution of the DSSSL International Standard.

Members of the original DSSSL working group are members of the W3C XSL Working Group.

2.1.3 Cascading Style Sheets (CSS)

When displaying web documents in user agents, we are often not satisfied with the expressiveness of the limited formatting properties assumed for HTML documents. The CSS1 and CSS2 Recommendations describe a set of formatting semantics through a collection of property names and values:

http://www.w3.org/TR/REC-CSS1
http://www.w3.org/TR/REC-CSS2

Formatting property assignment for web documents (HTML and XML). Not accepting the presentation semantics inferred by HTML browsers for HTML-marked-up information, the web community developed a robust and coherent formatting model for electronic presentation of information. The CSS model standardizes a set of formatting semantics tied to a vocabulary of attribute values that can be attached to hierarchically structured web documents for rendering in browsers. This addressed the incompatibilities being added to HTML interpretation by browser manufacturers.

Unfortunately, the designers of CSS did not adopt the pre-existing DSSSL terminology for the identical formatting semantics, nor did they use all of the value sets for the CSS properties. This resulted in a different set of properties for the same concepts and purposes.

Initially designed for HTML, a subsequent release of CSS described the application of formatting properties to XML documents. These formatting properties do not involve any manipulation of the abstract document tree created from an input file; they are merely attached to the document tree as ornaments and are interpreted by a CSS-aware user agent that can effect the formatting inferred by the semantics triggered by the properties.

Inherent in the formatting model is the notion that the width and length of the presentation canvas are not fixed. The length of the presentation is essentially infinite, in that the browser agent shows a document in its entirety regardless of the length of the information included in the instance. The technology of the presentation is essentially electronic, in that the reader of the information can dynamically change the width of the canvas, requiring the user agent to re-flow the content within the new dimensions.

Ornamentation of the document tree. A CSS-aware user agent views our information as the document tree represented by the markup we choose to use. The stylistic information with which we decorate the document tree dictates how we want the user agent to render the content.

The formatting model provides for document content to be prefixed and suffixed with supplemental information found in the stylesheet. White space around information can be controlled, and we can place our content in overlapping and transparent rectangular regions of the canvas.

Inheritance plays an important role in CSS. The "cascade" is the application of inherited formatting properties when a given construct does not explicitly supply an inheritable value. Inheritance first starts "up" the ancestry of the document tree, looking for an applicable property specification. The cascade then continues looking at external stylesheets of lesser priority than internal stylesheets, finally accepting the built-in presentation semantics assumed by HTML user agents.

Multiple media type support. The CSS formatting model incorporates a number of layout constructs available to flow our information into a given desired presentation. For example, table-oriented constructs are available for us to present our information in a tabular form, even if the information isn't in HTML table markup.

Important to the accessibility of information to all users of the Web, CSS introduced aural presentation properties we can use to shape our information for the visually impaired. Note that people with sight disabilities are not the only visually impaired users: sighted users of the Web may be in situations where they are unable to use their sight (e.g. mobile applications, such as browsing for information while in a car). Information providers will find clever use of aural properties a boon to their users' surfing experience.

Doesn't (shouldn't) interfere with legacy browsers not supporting CSS. CSS properties are expressed in HTML documents through reserved attributes and document metadata. Unfortunately, many legacy HTML browsers were not true SGML applications but rather simple "angle bracket processors" unaware of the rules of formal markup practice. Hence, arcane methods of capturing stylesheet information are sometimes required to be resilient to legacy browsers that do not properly implement markup. Even still, there are some non-CSS-aware browsers that end up exposing property specifications on the user's canvas instead of properly recognizing their role as supplemental information to be kept off the canvas.

The property sheets can be external to the document itself, and must indeed be so in XML documents that are not utilizing namespaces.

A browser could choose to render XML documents that are using namespaces by recognizing CSS properties embedded in style attributes from the HTML vocabulary, though this is not seen in practice.
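As a sketch of the external association described above, an XML instance can point a CSS-aware user agent at a stylesheet with the xml-stylesheet processing instruction, which is defined by a separate W3C Recommendation on associating style sheets with XML documents. The file name purc.css and the rules suggested in the comment are hypothetical.

```xml
<?xml version="1.0"?>
<!-- The processing instruction sits in the prolog and names an
     external property sheet; the instance itself is untouched. -->
<?xml-stylesheet type="text/css" href="purc.css"?>
<!-- purc.css might hold rules attaching properties by element name,
     for example:  customer { display: block; font-style: italic }  -->
<purchase>
  <customer db="cust123"/>
  <product db="prod345">
    <amount>23.45</amount>
  </product>
</purchase>
```

The document tree is not rearranged in any way; the properties are only attached to it as ornaments. In this instance the customer reference lives in an attribute, so decorating the customer element alone would not put the reference on the canvas, which hints at why a transformation step is often needed before formatting.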

Working group is producing a common formatting model for web documents. A W3C working group is responsible for the common formatting model for web documents. Many W3C Recommendations need to specify formatting or display properties at times, and where applicable, designers of new Recommendations are asked to use the CSS semantics and the property names for those semantics. Members of the original CSS working group are members of the XSL Working Group.

This promotes a widely understood specification of the common requirements for formatting, including properties related to font, spacing, and a number of other useful presentation areas.

2.1.4 Styling structured information

2.1.4.1 Styling is transforming and formatting information

Styling is the rendering of information into a form suitable for consumption by a target audience. Since the audience can change for a given body of information, we often need to apply different styling for that information, to obtain dissimilar renderings to meet the needs of each audience. Perhaps some information needs to be rearranged to make more sense for the reader. Perhaps some information needs to be highlighted differently to bring focus to key content.

It is important, when we think about styling information, to remember that two distinct processes are involved, not just one. First, we must transform the information from the organization used when it was created into the organization needed for consumption. Second, when rendering, we must express the aspects of the appearance of the reorganized information, whatever the target medium.

Consider the flow of information as a streaming process where information is created upstream and processed or consumed downstream. Upstream, in the early stages, we should be expressing the information abstractly, thus preventing any early binding of concrete or final-form concepts. Midstream, or even downstream, we can exploit the information as long as it remains flexible and abstract. Late binding of the information to a final form can be based on the target use of the final product; by delaying this binding until late in the process, we preserve the original information for exploitation for other purposes along the way.

It is a common but misdirected practice to model information based on how you plan to use it downstream. It does not matter if your target is, for example, a presentation-oriented structure, or a structure that is appropriate for another markup-based system. Modeling practice should focus on both the business reasons and inherent relationships existing in the semantics behind the information being described (so, the vocabularies are content-oriented at this stage). For example, emphasized text is often confused with a particular format in which it is rendered. Where we might want to model information using a b element type for eventual rendering in a bold face, we would be better off using an emph element type. In this way, we capture the reason for marking up the information (the fact that it is emphasized compared to surrounding information), and we do not lock any of the downstream targets into only using a bold face for rendering.
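A minimal sketch of this late binding, assuming an XSLT stylesheet is the downstream process and HTML is one of the targets: the source keeps the abstract emph label, and only the stylesheet for a given target decides that emphasis is shown in bold.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- The binding is made here, downstream: for this target,
       emphasized content happens to be rendered in a bold face. -->
  <xsl:template match="emph">
    <b><xsl:apply-templates/></b>
  </xsl:template>
</xsl:stylesheet>
```

A second stylesheet for another audience could map the same emph elements to italics, or to an aural property, without any change to the source information.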

Many times the midstream or downstream processes need only rearrange, re-label or synthesize the information for a target purpose and never apply any semantics of style for rendering purposes. Transformation tasks stand alone in such cases, meeting the processing needs without introducing rendering issues.

One caveat regarding modeling content-oriented information is that there are applications where the content orientation is, indeed, presentation-oriented. Consider book publishing where the abstract content is based on presentational semantics. This is meaningful because there is no abstraction beyond the appearance or presentation of the content.

Consider the customer information in Example 2-1. A web user agent doesn't know how to render an element named customer. The HTML vocabulary used to render the customer information could be as in Example 2-2.

Example 2-2 HTML rendering semantics markup for Example 2-1

<p>From: <i>(Customer Reference) <b>cust123</b></i>
</p>

The rendering result would then be as in Figure 2-1, with the user agent interpreting the markup for italics and boldface presentation semantics.

Figure 2-1. HTML rendering for example

The figure illustrates these two distinct styling steps: transforming the instance of the XML vocabulary into a new instance according to a vocabulary of rendering semantics; and formatting the instance of the rendering vocabulary in the user agent.

2.1.4.2 W3C XSL Working Group

This working group was chartered to define a style specification language that covers at least the formatting functionality of both CSS and DSSSL. The end result was not intended to replace CSS, but to provide functionality beyond that defined by CSS, such as element reordering and pagination semantics.

2.1.4.3 Two W3C Recommendations

To meet these two distinct processes in a detached (yet related) fashion, the W3C XSL Working Group split the original drafts of their work into two separate Recommendations: one for transforming information and the other for paginating information.

The XSL Transformations (XSLT) Recommendation describes a vocabulary recognized by an XSLT processor to transform information from an organization in the source file into a different organization suitable for continued downstream processing.

The Extensible Stylesheet Language (XSL) Recommendation describes a vocabulary (often called XSL-FO for "Formatting/Flow Objects," even by the W3C, though the use is unofficial and not formally part of the Recommendation) reflecting the semantics of paginating a stream of information into individual pages. The semantics are defined by a set of formatting objects, properties, and property values, expressible in an XML vocabulary.

The XSL Recommendation normatively includes XSLT by reference in Chapter 2, and historically both Recommendations were expressed in a single document. Indeed, XSLT was designed for use with XSL-FO, incorporating features to make working with the XSL-FO vocabulary easier.
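As a rough illustration of the pagination vocabulary described above, the following sketch shows a small but complete XSL-FO instance presenting the customer reference of Example 2-1. The page geometry (an A4 page with 20mm margins) and the master name are assumptions made only for this illustration.

```xml
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Page geometry: one simple page layout (assumed dimensions) -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
                           page-height="297mm" page-width="210mm"
                           margin-top="20mm" margin-bottom="20mm"
                           margin-left="20mm" margin-right="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- Content: a stream of information flowed into as many pages
       of that geometry as are needed -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block space-before.optimum="20pt" font-size="20pt">From: <fo:inline font-style="italic">(Customer Reference) <fo:inline font-weight="bold">cust123</fo:inline></fo:inline></fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

The instance expresses formatting intent only: formatting objects and properties state what is wanted, and the formatter determines how to lay it out on the pages.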

Both XSLT and XSL-FO are endorsed by members of WSSSL, an association of researchers and developers passionate about the application of markup technologies in today's information technology infrastructure.

2.1.5 Extensible Stylesheet Language Transformations (XSLT)

We all need to transform our structured information when the order in which it was created is not appropriate for another purpose. The XSLT 1.0 Recommendation describes a transformation instruction vocabulary of constructs that can be expressed in an XML model of elements and attributes:

http://www.w3.org/TR/xslt

2.1.5.1 Transformation by example

We can characterize XSLT among other techniques for transmuting information by regarding it simply as "Transformation by Example," whereas many other techniques are better described as "Transformation by Program Logic." This perspective focuses on the fact that we do not need to tell an XSLT processor how to effect the changes we need; rather, we tell an XSLT processor what we want as the end result, and it is the processor's responsibility to do the dirty work.

The XSLT Recommendation gives us, in effect, a templating language. It is a vocabulary for specifying templates that represent "examples of the result." Based on how we instruct the XSLT processor to access the source of the data being transformed, the processor will incrementally build the result by adding the filled-in templates.

We write our stylesheets, or "transformation specifications," primarily with declarative constructs, though we can employ imperative techniques (also known as procedural techniques) if and when needed. We assert the desired behavior of the XSLT processor based on conditions found in our source. We supply examples of how each component of our result is formulated and indicate the conditions of the source that trigger which component is added to the result next. Alternatively, we can selectively add components to the result on demand.

Note:

Many programmers unfairly deride XSLT for not being a good programming language, when in fact it is a templating language and not a programming language at all. The idea of declaratively supplying templates of the result and the matching conditions of source tree nodes to the templates is a paradigm that is very different from imperative programming. I find that by far the most disparaging and vociferous attacks against XSLT are from programmers who are unable to follow, or are awkwardly trying to follow, an algorithm-based imperative approach to the problem instead of the assertion-based declarative approach inherent in the language design.

XSLT is not a panacea, and there are many algorithmic situations (particularly in character-level text manipulation) where XSLT is not the appropriate tool to use. Node tree rearrangement, and in particular mixed content processing, can be handled far more easily declaratively in XSLT than in many imperative approaches. This templating approach is ideal for the rearrangement of information for use with XSL formatting semantics. Critics will continue to claim that XSLT is a "bad" programming language until they stop using it as an incorrect pigeonhole for certain classes of problems.

XSLT is similar to other transmutation approaches in that we deal with our information represented as trees of abstract nodes. We don't deal with the raw markup of our source data. Unlike these other approaches, however, the primary memory management and information manipulation (node traversal and node creation) are handled by the XSLT processor, not by the stylesheet writer. This is a significant difference between XSLT and a transformation programming language or interface, such as the Document Object Model (DOM), where the programmer is responsible for handling the low-level manipulation of information constructs.

Our objective as stylesheet writers is to supply the XSLT processor with enough "templates of the result" so that the processor can build the result we desire when triggered by information in our source. Our data file becomes a hierarchy of nodes in our source tree. Our templates become a hierarchy of nodes in our stylesheet tree. The processor is doing the work building the result node tree from nodes in our stylesheet and source trees. We don't have to be programmers to manipulate the node trees or serialize the result node tree into our result file. It isn't our responsibility to worry about the angle brackets and ampersands that may be needed in the result markup.

Consider once again the customer information in our purchase order in Example 2-1. An example of the HTML vocabulary supplied to the XSLT processor to produce the markup in Example 2-2 could be as in Example 2-3.

Example 2-3 An XSLT template rule for the HTML vocabulary

<xsl:template match="customer">
  <p><xsl:text>From: </xsl:text>
    <i><xsl:text>(Customer Reference) </xsl:text>
      <b><xsl:value-of select="@db"/></b></i></p>
</xsl:template>

An example of XSL vocabulary supplied to the XSLT processor to produce the markup in Example 2-7 is shown in Example 2-4.

Example 2-4 Example XSLT template rule for the XSL vocabulary

<xsl:template match="customer">
  <fo:block space-before.optimum="20pt" font-size="20pt">
    <xsl:text>From: </xsl:text>
    <fo:inline font-style="italic">
      <xsl:text>(Customer Reference) </xsl:text>
      <fo:inline font-weight="bold">
        <xsl:value-of select="@db"/>
      </fo:inline></fo:inline></fo:block>
</xsl:template>

Comparing Example 2-3 and Example 2-4, we see that our practices as stylesheet writers are not different in any way. The templates are different in that they express different vocabularies for the elements and attributes in the result tree of nodes, but our methodology is not different. Each template is the example of the desired result for the given customer element as expressed in each of two different presentation vocabularies.
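A template rule on its own is only a fragment. As a sketch of how the rule of Example 2-3 might sit in a complete stylesheet, the following wraps it with the required stylesheet element and namespace declaration. The rule matching the root node, and the html and body wrapper it supplies, are assumptions added for this illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Start at the root of the source tree and ask the processor
       to handle only the customer information. -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="purchase/customer"/>
      </body>
    </html>
  </xsl:template>

  <!-- The template rule of Example 2-3, unchanged. -->
  <xsl:template match="customer">
    <p><xsl:text>From: </xsl:text>
      <i><xsl:text>(Customer Reference) </xsl:text>
        <b><xsl:value-of select="@db"/></b></i></p>
  </xsl:template>
</xsl:stylesheet>
```

Applied to purc.xml from Example 2-1, a conforming processor should produce the paragraph of Example 2-2 inside the wrapper elements, though the exact serialization (line breaks, indentation) is the processor's choice. Selecting purchase/customer explicitly keeps the built-in rules from copying the text of the amount element into the result.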

Comparing the style shown in both examples above to imperative programming techniques, one can see the XSLT stylesheet writer is not responsible for low-level node manipulation or markup generation. By declaring the nodes to be used in the result tree, one is describing the construction through the use of examples. These templates represent the information we want in the result tree that the processor must effect however it needs to, in order for the information in the example to be correctly included in the result. The processor only takes what is given as an example and is free to use whatever syntactic constructs it wishes; the downstream processor interpreting the result will use these constructs to understand the same information being represented in the template.

XSLT includes constructs that can be used to identify and iterate over structures found in the source information. The information being transformed can be traversed in any order needed and as many times as required to produce the desired result. We can visit source information numerous times if the result of transformation requires that information to be present numerous times.

Users of XSLT also don't have the burden of implementing numerous practical algorithms required to present information. XSLT specifies a number of algorithms that are implemented within the processor itself, and we engage these algorithms declaratively. High-level functions such as sorting and counting are available on demand when we need them. Low-level functions, such as memory management, node manipulation, and garbage collection, are all integral to the XSLT processor.

This declarative nature of the stylesheet markup makes XSLT much more accessible to non-programmers than the imperative nature of procedurally-oriented transformation languages. Writing a stylesheet is as simple as using markup to declare the behavior of the XSLT processor, much like HTML is used to declare the behavior of the web browser to paint information on the screen.

Not all examples of the result are fixed monolithic sequences of markup, however, as XSLT can conditionally include portions of a template based on testable conditions expressed by the stylesheet writer. Other constructs allow templates to be fragmented and added to the result on demand based on stylesheet logic. Templates can be parameterized to be used in different contexts corresponding to different parameter values.

In this way, XSLT accommodates the programmer as well as the non-programmer, in that there is sufficient expressiveness in the declarative constructs so they can be used in an imperative fashion. XSLT is (in programming theory) "Turing complete," thus any arbitrarily complex algorithm could (theoretically) be implemented using the constructs available. While there will always be a trade-off between extending the processor to implement something internally and writing an elaborate stylesheet to implement something portably, there is sufficient expressive power to implement some algorithmic business rules and semantic processing in XSLT constructs.

In short, straightforward and common requirements can be satisfied in a straightforward fashion, while unconventional requirements can be satisfied to an extent with some programming effort.

Note:

Theory aside, the necessarily verbose XSLT syntax dictated by its declarative nature and the use of XML markup makes the coding of some complex algorithms a bit awkward. I have implemented some very complex traversals and content generation successfully, but with code that could be difficult to maintain (my own valiant, if not always satisfactory, documentation practices notwithstanding).

Users of XSLT often need to maintain large transformation specifications, and many need to tap prior accomplishments when writing stylesheets. A number of constructs are included supporting the management, maintenance and exploitation of existing stylesheets. Organizations can build libraries of stylesheet components for sharing among their colleagues. Stylesheet writers can tweak the results of a transformation by writing shell specifications that include or import other stylesheets known to solve the problems they are addressing. Stylesheet fragments can be written for particular vocabulary fragments; these fragments can subsequently be used in concert, as part of an organization's strategy for common information description in numerous markup models.

2.1.5.2 Not intended for general purpose XML transformations

It is important to remember that XSLT is primarily for transforming XML vocabularies to the XSL formatting vocabulary. This doesn't preclude us from using XSLT for other transformation requirements, but it does influence the design of the language and it does constrain some of the functionality from being truly general purpose.

For this reason, the specification cannot claim XSLT is a general purpose transformation language. However, it is still powerful enough for all downstream processing transformation needs within the assumptions of use of the transformation results. XSLT stylesheets are often called XSLT transformation scripts because they can be used in many areas not at all related to stylesheet rendering. Consider an electronic commerce environment where transformation is not used for presentation purposes. In this case, the XSLT processor may transform a source instance, which is based on a particular vocabulary, and deliver the results to a legacy application that expects a different vocabulary as input. In other words, we can use XSLT in a non-rendering situation when it doesn't matter what markup is utilized to represent the content; when only the parsed result of the markup is material.

An example of a template rule that produces such a legacy vocabulary is shown in Example 2-5.

Example 2-5 An XSLT template rule for a legacy vocabulary

    <xsl:template match="customer">
      <buyer><xsl:value-of select="@db"/></buyer>
    </xsl:template>

The transformation would then produce a result acceptable to the legacy application, as shown in Example 2-6.

Example 2-6 A legacy vocabulary for customer information

    <buyer>cust123</buyer>

XSLT assumes that results of transformation will be processed by a rendering agent or some other application employing an XML processor as the means to access the information in the result. The information being delivered represents the serialized result of working with the information in the XML instance and, if a document model is supplied, the information set augmentation that model defines, all expressed as a tree of nodes. The actual markup within either the source XML instance or the XSLT stylesheet is, therefore, not considered material to the application and need not be preserved during transformation. All that counts is that the underlying content of the input is found where required in the structure of the resulting output, regardless of the markup used to represent that result.

Because of this focus on the processed result for downstream applications, there is little or no control in an XSLT stylesheet over the actual XML markup constructs found within the input documents, or over the actual XML markup constructs utilized in the resulting output document. This prevents a stylesheet from being aware of such constructs or controlling how such constructs are used. Any transformation requirement that includes "original markup syntax preservation" would not be suited for XSLT transformations.

Therefore, in comparison to imperative languages and interfaces offering the programmer tight control over the markup of the result of transformation, XSLT cannot be considered a general purpose transformation language because of the lack of control over the markup. For example, when using XSLT one cannot specify the order of attributes in a start tag of the serialized result tree, nor can one specify the technique by which sensitive markup characters present in #PCDATA content are escaped.
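A small Python sketch, using only the standard library parser, suggests why such markup choices are immaterial to a downstream XML processor. The three serializations below are invented for illustration; they differ in attribute order, in quoting, and in the technique used to escape the ampersand, yet an application sees the same information after parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical serializations of the same information.
variants = [
    '<buyer id="cust123" region="north">Smith &amp; Sons</buyer>',
    "<buyer region='north' id='cust123'>Smith &#38; Sons</buyer>",
    '<buyer id="cust123" region="north"><![CDATA[Smith & Sons]]></buyer>',
]

def parsed_view(markup):
    """Return what an application sees after parsing: name, attributes, text."""
    element = ET.fromstring(markup)
    return (element.tag, dict(element.attrib), element.text)

views = [parsed_view(v) for v in variants]
all_equal = all(view == views[0] for view in views)
```

An XSLT processor is free to emit any of these forms, which is acceptable only when the consumer of the result is itself an XML processor.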

When working with the XSL-FO vocabulary, the result of the XSLT transformation is going to be processed by the XML processor inside the XSL-FO processor; therefore, the markup of the result is immaterial as long as it is well formed. The transformation process is, indeed, absolutely general purpose when the result is going to be interpreted for pagination.

2.1.6 Extensible Stylesheet Language (XSL/XSL-FO)

XSL (or XSL-FO) describes formatting and flow semantics for paginated presentation that can be expressed using an XML vocabulary of elements and attributes:

http://www.w3.org/TR/xsl

2.1.6.1 Paginated flow and formatting semantics vocabulary

This hierarchical vocabulary captures formatting semantics for rendering textual and graphic information in different media in a paginated form. A rendering agent is responsible for interpreting an instance of the vocabulary for a given medium to reify a final result.

This is no different, in concept and architecture, from using HTML and Cascading Style Sheets (CSS) as a hierarchical vocabulary and formatting properties for rendering a body of information in a web browser. Such user agents are not pagination-oriented and effectively have an infinite page length and variable page width.

Indeed, the printed paged output from a browser of an HTML page is often less than satisfactory. Paginated information includes navigation tools such as page numbers, page number citations, headers, footers, etc. to give the reader methods of finding information or identifying the current location in a printed document.

In essence, when doing any kind of presentation, we are transforming our XML documents into a final display form by transforming instances of our XML vocabularies into instances of a particular rendering vocabulary that expresses the formatting semantics of our desired result. The vocabulary we choose must be able to express the nature of the formatting we want accomplished. We can choose to transform our information into a combination of HTML and CSS for web browsers and can choose an alternate transformation into XSL-FO for paginated display (be that paginated to a screen, to paper, or perhaps even aurally using sound).

In this way XSL-FO can be considered a pagination markup language.

2.1.6.2 Target of transformation

When using the XSL-FO vocabulary as the rendering language, the objective for a stylesheet writer is to convert an XML instance of some arbitrary XML vocabulary into an instance of the formatting semantics vocabulary. This formatting instance is the information rearranged into an expression of the intent of the paginated result as a collection of layout constructs populated with the content to be laid out on the rendered pages.

This result of transformation cannot contain any user-defined vocabulary constructs (such as "address," "customer identifier," or "purchase order number" constructs) because the rendering agent would not know what to do with constructs labeled with these foreign, unknown identifiers.

Consider again the two examples: HTML for rendering on a single page of infinite length in a web browser window, and XSL-FO for rendering on multiple separated pages on a screen, on paper, or audibly. In both cases, the rendering agents only understand the vocabulary expressing their respective formatting semantics and wouldn't know what to do with alien element types defined by the user.

Just as with HTML, a stylesheet writer utilizing XSL-FO for pagination must transform each and every user construct into a rendering construct to direct the rendering agent to produce the desired result. By learning and understanding the semantics behind the constructs of XSL-FO, the stylesheet writer can create an instance of the formatting vocabulary expressing the desired layout of the final result (e.g. area geometry, spacing, font metrics, etc.), with each piece of information in the result coming from either the source data or the stylesheet itself.

Consider once more the customer information in Example 2-1. An XSL-FO rendering agent doesn't know how to render a marked up construct named <customer>. The XSL-FO vocabulary used to render the customer information could be as in Example 2-7.

Example 2-7 XSL-FO rendering semantics markup for Example 2-1

    <fo:block space-before.optimum="20pt" font-size="20pt">From:
    <fo:inline font-style="italic">(Customer Reference)
    <fo:inline font-weight="bold">cust123</fo:inline>
    </fo:inline>
    </fo:block>

The result rendered in the Portable Document Format (PDF) would then be as in Figure 2-2, with an intermediate PDF generation step interpreting the XSL-FO markup for italics and boldface presentation semantics.

Figure 2-2. XSL-FO rendering for example

graphics/02fig02.jpg

The figure again illustrates the two distinctive styling steps: transforming the instance of the XML vocabulary into a new instance according to a vocabulary of rendering semantics; and formatting the instance of the rendering vocabulary in the user agent.

The formatting semantics of the XSL-FO vocabulary are described for both visual and aural targets, so we can use one set of constructs regardless of the rendering medium. It is the rendering agent's responsibility to interpret these constructs accordingly. In this way, the XSL-FO semantics can be interpreted for print, display, audio, or other presentations. There are, indeed, some specialized semantics we can use to influence rendering on particular media, though these are just icing on the cake. Dynamic behaviors can be specified for interactive media that would not function at all, obviously, in the paper form.

2.1.7 Styling semantics and vocabularies

XSLT and XSL-FO processors implement styling semantics. XSLT and XSL-FO processors are rigorous in implementing styling semantics, or behaviors, for the XSL vocabularies. Well-defined semantics are captured in the two XML vocabularies of elements and their attributes that represent, respectively, instructions and their controls for XSLT and formatting objects and their properties for XSL-FO. Namespaces are important to an XSLT or XSL-FO processor to recognize not only the constructs from their respective vocabularies, but also any extensions to the vocabulary that are specific to a particular brand of processor.

In addition, namespaces allow for rendering vocabularies to be freely used for foreign objects in the XSL-FO stream, and an XSL-FO formatting engine can properly forward rendered content in arbitrary namespaces to the rendering processes incorporated in the tool.

But, as with all XML applications, the assumption is that the names of element types and attributes are just labels referencing the semantics as defined by the specification, and it is up to the user to respect that assumption in order to get the desired formatted result. The use of the XSLT or XSL-FO namespace does not magically confer semantics on the elements; rather, an XSLT or XSL-FO processor assumes that when the names from these vocabularies are used, the application of the semantics defined by the Recommendations is what is desired by the user.

We learn the XSLT and XSL-FO XML vocabularies as representations of the semantics assumed by the processors, and we engage those processors to transform, render, and paginate our information accordingly.

XSLT and XSL-FO document type definitions are described using prose. There are no standardized XML 1.0 DTD representations of the grammar of these vocabularies because DTD semantics and syntax are unable to fully express all of the grammatical constraints of instances representing XSLT and XSL-FO transformations and stylesheets.

The XSLT 1.0 and XSL 1.0 Recommendations describe the document types in English rather than in a formal notation, and it is up to processors to do all aspects of validation and interpretation of the document structure according to the respective document type.

There are, however, snippets of content model-like syntax with Kleene operators ("?" for zero or one, "*" for zero or more, and "+" for one or more) used in the XSL documentation because of the reader's assumed familiarity with DTD syntax.

Otherwise, the normative description of the vocabularies is essentially the detailed prose of the corresponding Recommendations. This prose may override some of the strictness of the abbreviated expression that uses the DTD syntax.
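For readers less familiar with the Kleene operators, the following Python sketch checks an ordered list of child element names against a content model. The model (title?, para+, note*) and its element names are hypothetical and are not taken from either Recommendation; the regular expression simply reuses the same three operators with the same meanings.

```python
import re

# Hypothetical content model in DTD-like notation: (title?, para+, note*)
CONTENT_MODEL = re.compile(r"(title )?(para )+(note )*")

def matches_model(child_names):
    """Return True when the ordered child names satisfy the model."""
    # Each name is followed by one space so that names cannot run together.
    sequence = "".join(name + " " for name in child_names)
    return CONTENT_MODEL.fullmatch(sequence) is not None
```

A sequence such as title, para, para satisfies this model, while para, note, para does not, because no para may follow a note.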

2.1.8 Transforming and rendering XML information using XSLT and XSL-FO

When the result tree in an XSLT process is specified to utilize the XSL-FO pagination vocabulary, the normative behavior of an XSL-FO processor incorporating an XSLT processor is to interpret the result tree. This interpretation reifies the semantics expressed in the constructs of the result tree to some medium, for example pixels on a screen, dots on paper, or sound through a synthesis device (see Figure 2-3).

Figure 2-3. Transformation from XML to XSL Formatting Semantics

graphics/02fig03.gif

The stylesheets used in this scenario contain the transformation vocabulary and any custom extensions, as well as the desired result XSL-FO formatting vocabulary and any foreign object vocabularies. There are no element types from other XML vocabularies in the result. If there were, rendering processors would not inherently know what to do with such constructs, for example, with an element of type custnbr representing a customer number. It is the stylesheet writer's responsibility to transform the information into information recognized by the rendering agent.
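As a diagnostic aid, a check along the following lines could report constructs that the stylesheet failed to transform. This is only a sketch: the function name unrecognized_elements is an assumption, and for simplicity it treats everything inside fo:instream-foreign-object as legitimate foreign content without examining it further.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
FOREIGN_HOST = "{%s}instream-foreign-object" % FO_NS

def unrecognized_elements(element):
    """List element names outside the XSL-FO namespace, skipping foreign objects."""
    found = []
    if not element.tag.startswith("{%s}" % FO_NS):
        found.append(element.tag)
    if element.tag != FOREIGN_HOST:
        # Content of fo:instream-foreign-object belongs to another vocabulary by design.
        for child in element:
            found.extend(unrecognized_elements(child))
    return found

# Hypothetical faulty result: the custnbr element was copied instead of transformed.
result = ET.fromstring(
    '<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">'
    "From: <custnbr>cust123</custnbr></fo:block>"
)
leftovers = unrecognized_elements(result)
```

Here leftovers names the stray custnbr element, which a rendering agent would not know how to format.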

There is no obligation for the formatter to serialize the result tree created during transformation. The feature of serializing the result tree to XML markup is, however, quite useful as a diagnostic tool, revealing to us what we really asked to be rendered, instead of what we thought we were asking to be rendered when we saw incorrect results. There may also be performance considerations of taking the reified result tree in XML markup and rendering it in other media without incurring the overhead of performing the transformation repeatedly.

2.1.9 Interpreting XSL-FO instances directly

The XSL-FO and foreign object vocabularies can also be used in a standalone XML instance, perhaps as the result of an XSLT transformation using an outboard XSLT processor, as shown in Figure 2-4. The XSLT processor serializes a physical entity from the transformation result tree, and that XML file of XSL-FO vocabulary is then interpreted by a standalone XSL-FO processor.

Figure 2-4. Creating standalone XML instances of XSL vocabulary

graphics/02fig04.gif

Figure 2-4 delineates three distinct phases of the process. These phases also exist when the XSLT and XSL-FO processors are combined into a single application. The transformation phase creates the XSL-FO expressing our intent for formatting the source XML. The XSL-FO processor first interprets our intent into the information that is to be rendered on the target device, then effects the rendering to reify the result.

2.1.10 Generating XSL-FO instances

XSL-FO need not be generated by XSLT in order to be useful, as shown in Figure 2-5. Consider that when we learned HTML as the rendering vocabulary for a web user agent, we either coded it by hand or wrote applications that generated HTML from our information. This information may have come from some source, such as a database.

Figure 2-5. Generating XML instances of XSL vocabulary

graphics/02fig05.gif

Having learned XSLT, we can express our information in XML and then either transform the XML into HTML to send to the user agent, or send the XML directly to an XSLT process in the user agent.

The typical generation of XSL-FO would be from our XML using an XSLT stylesheet, though this need not be the case at all. We may have situations where our applications need to express information in a paginated form, and these applications could generate instances of the XSL-FO vocabulary directly to be interpreted for the output medium.
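An application generating XSL-FO directly, without XSLT, might look something like the following Python sketch. The helper names, the master name "page", and the page dimensions are assumptions chosen for illustration; the formatting objects and properties themselves come from the XSL-FO vocabulary, and the content mirrors Example 2-7 in simplified form.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)  # serialize with the conventional prefix

def fo(name):
    """Qualify a formatting object name with the XSL-FO namespace."""
    return "{%s}%s" % (FO_NS, name)

def build_fo_document(customer_id):
    """Build a one-page XSL-FO instance; master name and page size are illustrative."""
    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "page", "page-height": "297mm", "page-width": "210mm"})
    ET.SubElement(page, fo("region-body"))
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "page"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"), {"font-size": "20pt"})
    block.text = "From: "
    inline = ET.SubElement(block, fo("inline"), {"font-style": "italic"})
    inline.text = "(Customer Reference) " + customer_id
    return root

document = ET.tostring(build_fo_document("cust123"), encoding="unicode")
```

The resulting string is a standalone XSL-FO instance that could be handed to an XSL-FO processor in the same way as the serialized result of a transformation.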

We need to remember that XSL-FO is just another vocabulary that can be expressed as an XML instance, requiring an application to interpret our intent for formatting in order to effect the result. This is no different than the use of the HTML vocabulary for a web browser.

The sole requirement is that the namespace of the vocabulary in the instance be "http://www.w3.org/1999/XSL/Format" for the labeled information in the instance to be recognized as expressing the semantics described by the XSL-FO Recommendation.

Note:

The default namespace may be used for the XSL-FO vocabulary, just as is true with any vocabulary. Personally, I don't use the popular "fo:" prefix in my stylesheets, as it is my habit to use the default namespace and not prefix my XSL-FO names in any way.

This practice reinforces for me that this is just as simple as HTML, where I don't use any namespace at all in my own stylesheets.

There are processors that interpret standalone XSL-FO instances interactively on the screen in a GUI environment. To learn the nuances of XSL-FO, I often hand-author XSL-FO instances experimenting with various objects and properties in elements and attributes, tweaking values repeatedly, and examining the results interactively with the formatting tool. Having hand-authored HTML, using the default namespace for XSL-FO is very natural and saves on the amount of typing as well.
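That the choice between a prefix and the default namespace is immaterial can be seen with a namespace-aware parser. In this short Python sketch (the two one-line instances are invented for illustration), both forms yield the same expanded element name, which is all that a processor looks at.

```python
import xml.etree.ElementTree as ET

prefixed = ('<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">'
            'cust123</fo:block>')
defaulted = ('<block xmlns="http://www.w3.org/1999/XSL/Format">'
             'cust123</block>')

# ElementTree reports names in {namespace-URI}local-name form.
tag_prefixed = ET.fromstring(prefixed).tag
tag_defaulted = ET.fromstring(defaulted).tag
```

Both tags carry the XSL-FO namespace URI followed by the local name block, so either instance is recognized as the same formatting object.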

2.1.11 Using XSL-FO on a server

A typical web-based use of XSL-FO is in a three-tiered environment shown in Figure 2-6 where the server is producing "printable versions" of information that people are browsing in HTML using web browsers. In such an architecture, a single XML document is transformed into HTML using an XSLT stylesheet specifically designed for the best presentation of the information with browser features.

Figure 2-6. Using XSL-FO in a server environment

graphics/02fig06.gif

When the user requests a rendition of the information suitable for printing on paper, a separate XSLT stylesheet is applied to the same XML document to produce an XSL-FO structure representing the information on a printed page. An XSL-FO process interprets this structure to produce a representation of the printed output suitable for the user's environment. The Portable Document Format (PDF) is a ubiquitous final-form print format and free readers are available to both view and print paginated documents represented in PDF.