Chapter 9. Cross-Document Links

  •  9.1 Cross-Document Links
  •  9.2 Indexing and Tables of Contents
  •  9.3 Running Headers

To quote from the specification:

Because XML, unlike HTML, has no built-in semantics, there is no built-in notion of a hypertext link.

So why does XSL-FO address links? Certainly for paper-based output, an active hyperlink isn't much use, though for a screen-based presentation using PDF, it might be. While XSL-FO is used primarily for paper-based output today, it would be a mistake to think this is XSL-FO's only purpose. XSL-FO is designed to present XML across several media, including interactive media; to do that, it needs to support hyperlinking. In its simplest form, a link is useful for cross-referencing content, locations within the document, and specific structural elements. For web-based delivery, it is handy to have an active link; for print output, the content of the link element itself needs to be meaningful. This facility is offered in the first version of the specification.

XSL-FO has a formatting object named fo:basic-link, which provides the basic linking capability. Example 9-1 and Figure 9-1 show this in use.

Example 9-1. A basic link
XML source:

<para>...see the figure on page <link idref="fig53"/> </para>

Stylesheet:

<xsl:template match="link">
  <fo:basic-link background-color="lightblue"
      internal-destination="{@idref}">Page
    <fo:page-number-citation ref-id="{@idref}"/>
  </fo:basic-link>
</xsl:template>
Figure 9-1. A basic link


Here, the link is shown inline and shaded, referencing a page number. The formatter replaces the page-number-citation with the page number on which the link target is present while laying out the document. Some formatters will create a clickable link, others will not. Be warned: there is no requirement to do so! Note the use of the attribute value template in this example to insert the id value of the link target, using the idref attribute value on this element.
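
For a link that points outside the document, fo:basic-link also accepts an external-destination property in place of internal-destination. The following is a minimal sketch, assuming a hypothetical source element named ulink with a url attribute (neither appears in the examples in this chapter). As with internal links, whether the result is clickable depends on the formatter.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical source markup, for illustration only:
       <ulink url="http://www.example.com/">Example site</ulink> -->
  <xsl:template match="ulink">
    <!-- external-destination takes a URI specification of the form url(...) -->
    <fo:basic-link external-destination="url({@url})" color="blue">
      <xsl:apply-templates/>
    </fo:basic-link>
  </xsl:template>

</xsl:stylesheet>
```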

Now, let's have a look at the various uses for links.

9.1 Cross-Document Links

The simplest syntax for this use is the id/idref pair. This is for a case in which the document being transformed contains both the source and target of the link. This way, the XSLT engine can resolve the link target using the id( ) function.

Note that for id/idref to work properly, the id attribute must be declared by the DTD as being of type ID. A common error (and one that I frequently make) is to style a part of a document and forget the DTD inclusion.
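
One way to guard against this is to carry the declarations with the fragment itself, in an internal DTD subset. The sketch below assumes the chapter and xref elements used in the examples in this chapter; the doc root element is invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE doc [
  <!-- Declare id as type ID so that the XSLT id() function can resolve it -->
  <!ATTLIST chapter id    ID    #IMPLIED>
  <!ATTLIST xref    idref IDREF #REQUIRED>
]>
<doc>
  <chapter>
    <para>A link to <xref idref="ch2"/>.</para>
  </chapter>
  <chapter id="ch2">
    <title>Second chapter</title>
  </chapter>
</doc>
```

Because the declarations sit in the internal subset, a non-validating parser should still pick up the attribute types.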

When the link runs between separate documents that are combined only at styling time to form a single document for paper, a different form of cross-document link is needed. If the source documents are parsed as a single entity, the id/idref pair presents no problem. If they are also to be used in other ways, the source of the link needs to use something other than an IDREF attribute, to avoid unresolved cross-references.
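
One possible alternative is an xsl:key lookup, which needs no DTD because it matches on the attribute name rather than on its declared type. This is a sketch under stated assumptions: the key name target-by-id is invented for illustration, the linking attribute is treated as plain CDATA, and the source documents have already been assembled into a single tree before the transformation runs.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Index every element that carries an id attribute, whatever its declared type -->
  <xsl:key name="target-by-id" match="*[@id]" use="@id"/>

  <xsl:template match="xref">
    <fo:basic-link internal-destination="{@idref}">
      <!-- key() stands in for id(); it searches the document of the context node -->
      <xsl:value-of select="key('target-by-id', @idref)/title"/>
      <xsl:text>, page </xsl:text>
      <fo:page-number-citation ref-id="{@idref}"/>
    </fo:basic-link>
  </xsl:template>

</xsl:stylesheet>
```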

The contents of the basic-link element could be literal content or content retrieved from the target, such as the title of a chapter, a page, or a section number, using the functionality of XSLT. Example 9-2 and Example 9-3 show such an example, with the generated fo shown in Example 9-4.

The cross-references are actually within the fo file. If you receive warnings about unresolved page-number-citations or reference id values, it's possible that you have forgotten to add the id values to the targets.

Example 9-2. Cross-references using target content, XML source
<chapter>
  <para>A link to <xref idref="ch2" />.
  </para>
</chapter>
<chapter id="ch2">
  <title>Second chapter</title>
</chapter>
Example 9-3. Cross-references using target content, XSLT stylesheet
<xsl:template match="chapter">
  <!-- [1] -->
  <fo:block id="{@id}">
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

<xsl:template match="xref">
  <fo:inline>
    <!-- [2] -->
    <fo:basic-link internal-destination="{@idref}">
      Chapter
      <!-- [3] -->
      <xsl:for-each select="id(@idref)">
        <xsl:number level="multiple" count="chapter" format="1 "/>
      </xsl:for-each>,
      <xsl:value-of select="id(@idref)/title"/>
      on Page
      <fo:page-number-citation ref-id="{@idref}"/>,
    </fo:basic-link>
  </fo:inline>
</xsl:template>
  1. The example shows the chapter being wrapped in a block. An alternative is to use an empty block with the id value set.

  2. The xref template creates the source link, with content obtained from the target (Chapter 2 title); its number is calculated from its position within the document, and the page number is added.

  3. xsl:number numbers the current node, so xsl:for-each is used to switch the context to the target chapter before numbering it.

Example 9-4. Cross-references using target content, resulting FO
<!-- [1] -->
<fo:block font-family="Times"
    font-size="12pt" space-before="12pt" space-after="12pt"
    text-align="justify">A link to <fo:inline>
    <fo:basic-link
        internal-destination="ch2">Chapter 2, Second chapter on Page
      <fo:page-number-citation ref-id="ch2"/>, </fo:basic-link>
  </fo:inline>.
</fo:block>
</fo:block>
  1. The resulting output in the fo namespace indicates the processing that the formatter has to do, replacing the page-number-citation while generating the link.

Note that the whitespace in this example is there for readability.

9.1.1 Page Numbering

Page numbering can sometimes cause problems. As mentioned earlier in the book, page-number restarts are possible for any layout. Another thing to consider is the appearance of the page numbers themselves. Front matter and main matter may require different formats: for example, Roman numerals for the front matter and Arabic numerals for the main matter. The number format is not defined in XSL-FO itself but in XSLT, which the XSL-FO Recommendation references for this purpose. xsl:number has an attribute named format, which takes a value such as 1, a, A, i, or I to produce Arabic numerals, lowercase letters, uppercase letters, lowercase Roman numerals, and uppercase Roman numerals, respectively. The same attribute is available on the fo:page-sequence element, where it defaults to 1. Select one of the other values to format the page numbers in Roman numerals or another style. For example, to have a page-sequence numbered using uppercase Roman numerals, you might specify:

<fo:page-sequence master-reference="only" format="I">
  <fo:flow flow-name="xsl-region-body"> ....
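
To make the front matter and main matter case concrete, here is a sketch of a complete FO document with two page sequences. The page dimensions, margins, and block content are assumptions chosen for illustration; only the format and initial-page-number properties carry the point.

```xml
<?xml version="1.0"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Page size and margins are illustrative values only -->
    <fo:simple-page-master master-name="only"
        page-height="297mm" page-width="210mm" margin="25mm">
      <fo:region-body margin-bottom="15mm"/>
      <fo:region-after extent="10mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- Front matter: lowercase Roman page numbers (i, ii, iii, ...) -->
  <fo:page-sequence master-reference="only" format="i"
      initial-page-number="1">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center"><fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Preface</fo:block>
    </fo:flow>
  </fo:page-sequence>

  <!-- Main matter: Arabic page numbers, restarted at 1 -->
  <fo:page-sequence master-reference="only" format="1"
      initial-page-number="1">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center"><fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Chapter text</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```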

Next, let's put the basic link to use.

9.2 Indexing and Tables of Contents

The most common uses of links, index generation and table of contents generation, share two characteristics. First, they answer the old problem of changing content. If the table of contents or index is generated automatically, there is no frantic rush near publishing time to get all the page numbering correct. Second, the actual page numbers (or section or chapter numbers) don't need to be hardcoded into the document. This way, content reorganization is not a problem. Take a book with chapters and sections within chapters: if all cross-references are to chapter id values, then no matter how much reorganization is done, the cross-references will remain valid, the table of contents will be accurate, and the indexing is done as part of the transformation.

Let's take the previous example further by producing a table of contents showing the title and page number. We need to do that for chapters and the sections they contain. Dot leaders are needed for each entry, and second-level entries are indented by four character widths with respect to their parent. The source might look like Example 9-5.

Example 9-5. Source XML requiring a table of contents
<chapter><title>one </title>
  <section><title>one one </title></section>
  <section><title>one two </title></section>
  <section><title>one three </title></section>
</chapter>
<chapter><title>two </title>
  <section><title>two one</title></section>
  <section><title>two two</title></section>
  <section><title>two three, with a long title to show
    the effect of wrapping on long lines in this mode.
    Normal layout provides a reasonable solution</title></section>
  <section><title>two four</title></section>
  <section><title>two five</title></section>
  <section><title>two six</title></section>
</chapter>

The stylesheet section to generate the table of contents needs to be called at the appropriate time in the output generation and might look like Example 9-6. The example uses templates with a mode attribute set to the value toc, to enable out-of-line processing. An appropriate header must be included.

Example 9-6. Table of contents stylesheet extract
<xsl:template match="chapter" mode="toc">
  <fo:block text-align-last="justify">
    <fo:inline><xsl:value-of select="title"/>
      <fo:leader leader-pattern="dots"/>
      <fo:page-number-citation ref-id="{@id}"/>
    </fo:inline>
  </fo:block>
  <xsl:apply-templates select="section" mode="toc"/>
</xsl:template>

<xsl:template match="section" mode="toc">
  <fo:block text-align-last="justify"
      text-indent="-1em" start-indent="1em">
    <fo:inline padding-start="1em"><xsl:value-of select="title"/>
      <fo:leader leader-pattern="dots"/>
      <fo:page-number-citation ref-id="{@id}"/>
    </fo:inline>
  </fo:block>
</xsl:template>

Leaders are used to separate the title contents from the page number. The only other difference is the use of the text-align-last attribute. This expands content across the page to give the presentation shown in Figure 9-2.

Figure 9-2. A table of contents example


Figure 9-2 shows how to control the wrapping of long lines by using the text-indent and start-indent combination. The first line is outdented, with the whole block indented by the same amount.
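
To show where the toc mode might be invoked, here is a sketch of a root template that runs over the chapters twice, once in toc mode and once in the default mode. The book root element, the master name only, the margin, and the heading text are assumptions made for illustration; the chapter templates themselves are taken to be those of Example 9-3 and Example 9-6.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- "book" is a hypothetical root element wrapping the chapters -->
  <xsl:template match="/book">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="only">
          <fo:region-body margin="25mm"/>
        </fo:simple-page-master>
      </fo:layout-master-set>

      <!-- First pass: out-of-line processing in toc mode -->
      <fo:page-sequence master-reference="only" format="i">
        <fo:flow flow-name="xsl-region-body">
          <fo:block font-weight="bold">Contents</fo:block>
          <xsl:apply-templates select="chapter" mode="toc"/>
        </fo:flow>
      </fo:page-sequence>

      <!-- Second pass: the chapters themselves, in the default mode -->
      <fo:page-sequence master-reference="only" initial-page-number="1">
        <fo:flow flow-name="xsl-region-body">
          <xsl:apply-templates select="chapter"/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

</xsl:stylesheet>
```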

Indexing is managed in a similar manner. Each term is identified, perhaps with a specific attribute or even using an id attribute. The index is then generated automatically, using the id/idref pair again. More complex indexing would require both primary and secondary (or even tertiary) annotations to indicate the level of indexing for that usage. This is simply a case of using indentation to lay out the index, perhaps using bold to indicate the primary entries and normal weight for other levels. Example 9-7 shows a simple text example illustrating the source XML.

Example 9-7. Source XML
<para>This is a page layout using the
  <term id="front-page">page </term> format....
....
<idx>
  <item idref="front-page">Background Image</item>
  <item idref="b">Bold</item>
  <item idref="cen">Centered text</item>
  <item idref="sect4">Columns</item>
</idx>

The indexed term is identified and referenced by page number. At the end of the document, the term is identified and the index text is inserted, which in this case expands on the content for clarity. The XSLT stylesheet to produce the overall index is shown in Example 9-8.

Example 9-8. Stylesheet for the index
<xsl:template match="idx">
  <fo:block
      start-indent="0.5in"
      end-indent="0.5in"
      font-size="{$base-font-spec}"
      text-align-last="justify">
    <fo:inline font-weight="bold">Item</fo:inline>
    <fo:leader leader-pattern="dots"/>
    <fo:inline font-weight="bold">Page</fo:inline>
  </fo:block>
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="idx/item">
  <fo:block
      start-indent="0.5in"
      end-indent="0.5in"
      font-size="{$base-font-spec}"
      text-align-last="justify">
    <xsl:apply-templates/>
    <fo:leader leader-pattern="dots"/><fo:basic-link
        internal-destination="{@idref}">
      <fo:page-number-citation
          color="blue" ref-id="{@idref}"/></fo:basic-link>
  </fo:block>
</xsl:template>

The example provides the header for the index, the column headings, the term, and the page number. Note that I have used a variable to specify font-size, which permits varying the font size of the whole document for different readers. The content of the item element is left-justified with an indent, followed by a leader with its pattern set to dots to carry the dot leaders out to the right margin, where the page-number-citation formatting object supplies the actual page number. The result is shown in Figure 9-3.

Figure 9-3. Resulting index output


If secondary terms are required, the method shown for the table of contents example could be used (see Example 9-6, earlier).
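
As one possible sketch of that approach, the stylesheet below adds a template for second-level index entries. It assumes a hypothetical level attribute on item (for example, level="2"), which is not part of Example 9-7, and a placeholder default for the base-font-spec parameter. The explicit priority is needed because this pattern and the idx/item pattern of Example 9-8 would otherwise have the same default priority.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Placeholder value; the real stylesheet defines this elsewhere -->
  <xsl:param name="base-font-spec" select="'10pt'"/>

  <!-- Hypothetical markup: <item idref="b" level="2">Bold</item> -->
  <xsl:template match="idx/item[@level='2']" priority="1">
    <!-- Deeper start-indent and normal weight mark the secondary level -->
    <fo:block
        start-indent="0.75in"
        end-indent="0.5in"
        font-size="{$base-font-spec}"
        font-weight="normal"
        text-align-last="justify">
      <xsl:apply-templates/>
      <fo:leader leader-pattern="dots"/>
      <fo:basic-link internal-destination="{@idref}">
        <fo:page-number-citation color="blue" ref-id="{@idref}"/>
      </fo:basic-link>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```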

9.3 Running Headers

Because the method used to produce running headers is effectively a link, it is included here with other cross-referencing techniques. For those not familiar with the term, running headers are the lines of text that run across the tops of book pages, sometimes including chapter titles.

XSL-FO uses a scheme that I don't find particularly clear. As explained earlier in this book, headers are seen as static content related to page layout, rather than to flowed content. This is reasonable for the majority of cases, but I find it counter-intuitive for running headers. As of Release 1 of the Recommendation, little has been provided that is dependent on the page position. The XSL Working Group has openly stated that there are clear requirements for this that will be addressed, but it has yet to address them.

A key point here is that static content is defined in the page layout specification and remains static until a new page layout is used. This means items such as page numbers and running headers that are required to change over a single layout must be treated specially. The running header issue is resolved in XSL-FO by means of two formatting objects, marker and retrieve-marker.

Let's first identify the content for the running header (or footer; the same principles apply). Assuming that a chapter title is required for the running header, we may have something like Example 9-9 as the XML source file.

Example 9-9. XML source for a running header
<chapter><title>Introducing markers</title>  ....

The stylesheet for this "Introducing markers" header is as shown in Example 9-10.

Example 9-10. Marker usage
<xsl:template match="chapter/title">
  <fo:block xsl:use-attribute-sets="head1"
      break-before="page">
    <fo:marker marker-class-name="sect-head">
      <fo:block><xsl:value-of select="."/></fo:block>
    </fo:marker>
    <xsl:apply-templates/>
  </fo:block>
</xsl:template>

Within the title block, a marker is specified and given a class name of sect-head, and its contents are enclosed in a block (which is not strictly necessary). The marker-class-name must be unique among the markers that share the same parent formatting object. Note that the marker itself produces no output ahead of the title contents in the formatted result. It simply says, "Use this content when you want the contents of the marker named sect-head." This identifies the content, so now we need to use it.

As part of the page layout, the header is specified as static-content. Assuming a justified layout of title contents, we can use text-align-last for this justification, with each element as a child of the block with that property set to justify. The header stylesheet might look like Example 9-11.

Example 9-11. Retrieve marker usage in a header
<fo:static-content
    flow-name="xsl-region-before">
  <fo:block>
    <fo:retrieve-marker
        retrieve-class-name="sect-head"/>
  </fo:block>
</fo:static-content>

The static content for the header is specified to contain the contents of the marker named sect-head, contained within a block. This produces a header that changes within the same page-sequence to reflect the changing contents of the chapter title. I'll leave it up to you to format the other typical contents of a header.
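
As a starting point, here is a sketch of a header that places the retrieved title on the left and the page number on the right, using text-align-last as described above. It assumes the marker holds inline content rather than the block used in Example 9-10 (a retrieved block would put the title on a line of its own), and the font size and retrieve-position choice are illustrative.

```xml
<fo:static-content flow-name="xsl-region-before"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- text-align-last="justify" pushes the page number to the right edge -->
  <fo:block text-align-last="justify" font-size="9pt">
    <!-- Take the first marker on the page, or the one carried over
         from an earlier page if this page starts none -->
    <fo:retrieve-marker
        retrieve-class-name="sect-head"
        retrieve-position="first-including-carryover"
        retrieve-boundary="page-sequence"/>
    <fo:leader leader-pattern="space"/>
    <fo:page-number/>
  </fo:block>
</fo:static-content>
```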

9.3.1 Footnotes

Though footnotes are not links, it seems appropriate to discuss them here. The element (in the fo namespace) to use is fo:footnote. The two children of this element are an inline for the footnote reference and a footnote-body for the actual content of the footnote. In Example 9-13, I've used a decorative horizontal rule to separate the footnotes from the page text and created a list to lay out the footnote content. I'll leave the addition of superscripting to you.

Both the reference and the content of the footnote are included at the same place, with the formatter doing the hard work of laying them out on the same page. Example 9-12 through Example 9-14 show the XML source, the transformation, and the resulting FO.

Example 9-12. Footnote example, source XML
<para> the bicameral<footnote>The Latin
  alphabet, which you are reading, is an example of a
  bicameral font; it has an uppercase and
  lowercase. Unicameral alphabets (the Arabic and Hebrew
  alphabets) only have one case.</footnote> font presents us with two forms of
  presentation..</para>
Example 9-13. Footnote example, the transformation
<xsl:template match='para'>
  <fo:block><xsl:apply-templates/></fo:block>
</xsl:template>

<xsl:template match='footnote'>
  <fo:footnote>
    <!-- [1] -->
    <fo:inline>1</fo:inline>
    <fo:footnote-body>
      <!-- [2] -->
      <fo:block text-align-last="justify">
        <fo:leader leader-pattern="rule"/>
      </fo:block>
      <fo:list-block>
        <fo:list-item>
          <fo:list-item-label end-indent="label-end()">
            <!-- [3] -->
            <fo:block>1</fo:block>
          </fo:list-item-label>
          <fo:list-item-body start-indent="body-start()">
            <!-- [4] -->
            <fo:block><xsl:apply-templates/></fo:block>
          </fo:list-item-body>
        </fo:list-item>
      </fo:list-block>
    </fo:footnote-body>
  </fo:footnote>
</xsl:template>
  1. The footnote reference, which is an inline.

  2. A leader, drawn as a rule, separates the footnotes from the page text.

  3. A list lays out the footnote.

  4. The actual footnote content.

Example 9-14. Footnote example, the resulting FO
<fo:block>the bicameral<fo:footnote>
    <fo:inline>1</fo:inline>
    <fo:footnote-body>
      <fo:block text-align-last="justify">
        <fo:leader leader-pattern="rule"/>
      </fo:block>
      <fo:list-block>
        <fo:list-item>
          <fo:list-item-label end-indent="label-end()">
            <fo:block>1</fo:block>
          </fo:list-item-label>
          <fo:list-item-body start-indent="body-start()">
            <fo:block>The Latin
              alphabet, which you are reading, is an example of a
              bicameral font; it has an uppercase and
              lowercase. Unicameral alphabets (the Arabic and Hebrew
              alphabets) only have one case.</fo:block>
          </fo:list-item-body>
        </fo:list-item>
      </fo:list-block>
    </fo:footnote-body>
  </fo:footnote> font presents us with two forms of presentation..
</fo:block>

The output is shown in Figure 9-4. The body content is simply the inline number one, to which a prefix could be added during the transformation stage.

Figure 9-4. Resultant footnote presentation
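
For readers who want a starting point for the superscripting left open above, here is one possible variation on Example 9-13, not the transformation used to produce Figure 9-4. It numbers footnotes automatically with xsl:number instead of hardcoding 1, and raises the reference with baseline-shift. The font size and the list distances are illustrative assumptions, and the separator rule is omitted for brevity.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="footnote">
    <!-- Count every footnote in the document, in document order -->
    <xsl:variable name="n">
      <xsl:number level="any" count="footnote" format="1"/>
    </xsl:variable>
    <fo:footnote>
      <!-- The reference in the body text, raised and reduced in size -->
      <fo:inline baseline-shift="super" font-size="8pt">
        <xsl:value-of select="$n"/>
      </fo:inline>
      <fo:footnote-body>
        <fo:list-block provisional-distance-between-starts="1.5em"
            provisional-label-separation="0.5em">
          <fo:list-item>
            <fo:list-item-label end-indent="label-end()">
              <fo:block><xsl:value-of select="$n"/></fo:block>
            </fo:list-item-label>
            <fo:list-item-body start-indent="body-start()">
              <fo:block><xsl:apply-templates/></fo:block>
            </fo:list-item-body>
          </fo:list-item>
        </fo:list-block>
      </fo:footnote-body>
    </fo:footnote>
  </xsl:template>

</xsl:stylesheet>
```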



Xsl Fo
ISBN: 0596003552
Year: 2002
Pages: 24
Authors: Dave Pawson