Chapter 3. Pagination

CONTENTS
  •  3.1 Document Classes
  •  3.2 The Main Parts of an XSL-FO Document
  •  3.3 Simple Page Master
  •  3.4 Complex Pagination
  •  3.5 Page Sequences

Practical publishing projects start with a number of constraints. Unless you are experimenting, you will already know if your XML source is targeted at a book, an article, a business form, or a newsletter. In other words, the final product will be a concrete instance of what I will informally call document categories or document classes.

Decades and centuries of usage have established publishing conventions for many document classes. Readers who have experience with at least one desktop publishing system or formatting application will be familiar with the standard document classes. These conventions suggest rules to be followed at all levels of the formatting process. It is at the pagination level, however, where the effect of these rules is most strongly felt. This chapter discusses XSL pagination: how to design pages and how to put them together.

3.1 Document Classes

The rules and conventions that apply to a given document class will determine the presence and structure of the three major divisions of any single document: the front matter, the main matter (probably most commonly known as the body), and the back matter. (These terms are generally used in connection with only certain types of documents. Because the concepts have more general utility, they will be extended to all documents.) The front matter obtains its fullest form in books and typically contains most of the following: a title page, a copyright page, a preface, a table of contents, and lists of figures or other illustrations. Dedications and similar material also belong to the front matter of a document.

The main matter of a document consists of the actual content: everything from the introduction to the appendixes. The back matter may contain an index, more acknowledgments, a glossary, a bibliography, a colophon, and so forth. It is worth pointing out at this stage of the discussion (and I will repeat this point often) that these are logical structures. They may be present and identifiable in your source XML, but will not be the same in the FO document. Equally, items like a table of contents will not exist in the source but will be generated when transforming to the fo namespace.

Depending on the specific type of document, the front matter may be greatly abbreviated, may be missing altogether, or may be combined with the main matter. This is typical of articles and reports. Letters and business forms may have only main matter. Books have all three, and these contain nearly all of the listed sections.

3.2 The Main Parts of an XSL-FO Document

XML instances in the fo namespace, or XSL-FO stylesheets, consist of two major parts. The first part describes the general layout of all possible pages and provides instructions to the formatter regarding which page templates to use. The second part assigns the actual content of the document to the pages and describes the formatting of the content. The general pagination problem consists of properly and fully constructing the first part and in making the proper assignment of content flows. This chapter will cover all of this in detail. The formatting of content remains for subsequent chapters.

The top-level element of an FO document is the fo:root element.[1]

One important attribute on the fo:root element is the source-document, which has been added such that the source document may be accessed from the XSL-FO document. It's a good habit to pass this to the XSLT stylesheet as a parameter for inclusion.

The children of the fo:root element consist of:

  • One layout-master-set

  • An optional declarations

  • One or more page-sequence
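As a rough sketch of this structure, a minimal fo:root might look like the following. The master-name value page is a placeholder chosen for illustration, and the optional declarations element is only indicated by a comment.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Exactly one layout-master-set: the page descriptions. -->
  <fo:layout-master-set>
    <!-- "page" is a hypothetical name used only in this sketch. -->
    <fo:simple-page-master master-name="page"
        page-height="29.7cm" page-width="21cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- An optional fo:declarations element would go here. -->

  <!-- One or more page-sequences: the content. -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Content goes here.</fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```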

Figure 3-1 shows a very useful diagram from the XSL specification that illustrates the pagination formatting objects.

Figure 3-1. Pagination formatting objects

figs/xslf_0301.gif

The declarations element, if used, contains one or more color-profile children. declarations are a wrapper for formatting objects whose content is to be used as a resource to the formatting process. This element groups global declarations for the FO file. See Chapter 7 for a discussion on color profiles.

The layout-master-set corresponds to the first major part of an FO file that I mentioned earlier. Its function is to fully specify the pages to be used in the document. The children of this element consist of simple-page-master elements and page-sequence-master elements. You must have at least one simple-page-master defined. It is good practice to organize your simple-page-masters and page-sequence-masters in a way that suggests how they will be used. This will make more sense once you see some examples.

The simple-page-master has a master-name attribute by which it is referenced, and the page-sequence element has a master-reference attribute that refers back to one of the simple-page-master elements. Similarly, the page-sequence-master has a master-name attribute. This is how content is assigned to one or another layout within the formatting operation. The significance of the master-name attribute is that this is how masters are referenced by content flows.

The value of the master-name attribute must be unique across all the content of the layout-master-set. The formatter will treat an empty or conflicting master-name attribute as an error, and may or may not continue to process the FO file.

The page-sequence-master is simply a way of sequencing the use of simple-page-master elements as more content is added. A typical use of this element is to specify that left- and righthand page layouts (properly, verso and recto) are to be used alternately throughout the pages of a book. More on this later.

Let us make one more connection. As can be seen from Example 3-3, the selection of content for a particular type of page layout (the simple-page-master elements) is achieved by either the use of one or more page-sequence elements that follow the layout-master-set, or, indirectly, from references within the page-sequence-master element's repeatable-page-master-reference element (see Section 3.4.3 for more detail). Each page-sequence has a master-reference attribute, and the value of this attribute for a given page-sequence designates the simple-page-master or page-sequence-master that will paginate the content contained in that page-sequence. I think of this as the base relationship between this piece of content and that particular type of page, so I might want to put all chapter elements into a standard page layout, where standard is the master-name attribute of the simple-page-master. The master-name attribute uniquely identifies this particular page specification, which may be one of many such specifications.

The formatter will treat an empty or conflicting master-reference attribute as an error, and may or may not continue to process the FO file.

 

There is no requirement for master-references to be unique across page-sequence elements. Several page-sequence elements may point at the same simple-page-master or page-sequence-master. (The state of page-sequence-master elements is not shared across page-sequence elements. We will see what this means when we examine the use of multiple page-sequence elements.)
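To illustrate, here is a minimal sketch in which two page-sequence elements point at the same simple-page-master. The master-name standard follows the naming suggested above; the chapter text is placeholder content.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- One page description, referenced twice below. -->
    <fo:simple-page-master master-name="standard"
        page-height="29.7cm" page-width="21cm"
        margin-top="2cm" margin-bottom="2cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- First page-sequence: master-reference matches master-name. -->
  <fo:page-sequence master-reference="standard">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Chapter 1 content (placeholder).</fo:block>
    </fo:flow>
  </fo:page-sequence>

  <!-- Second page-sequence reuses the same master. -->
  <fo:page-sequence master-reference="standard">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Chapter 2 content (placeholder).</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Each page-sequence starts on a new page, so this sketch should produce at least two pages from a single page description.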

3.3 Simple Page Master

XSL 1.0 specifies just one way of laying out a page: the page description. We use the simple-page-master element for this page description. Any discussion of page masters presupposes the concept of a page. It may seem self-evident at this point that we do have a page, but there is actually more to this concept in XSL than immediately meets the eye.

The CSS and XSL specifications overlap, and this is reflected in shared models at various levels. CSS originally approached pagination from the web point of view: a single unlimited canvas, effectively restricted in the horizontal, but not in the vertical, direction. XSL is heavily weighted towards paged media; this distinction operates primarily at the level of page master selection, not in the description of single pages. However, CSS is actively embracing paper media (in CSS2), and XSL from the start has acknowledged formats other than print, namely HTML.

This means in XSL, we must deal with the idea of non-paged media and viewports. Non-paged effectively means one page with flexible boundaries, which is obviously not the case with print. Hence, if you are reading this on a web browser, you are effectively viewing it in a non-paged form. Viewports introduce the ideas of clipping and scrolling, again, not things we will encounter in print. Fortunately, these are XSL capabilities that may be ignored by readers interested in print; implementors are not so lucky. I will sufficiently explain viewport concepts so you will be able to read the spec without confusion.

In XSL under normal (meaning print) circumstances, we use the page-width and page-height attributes on the simple-page-master element. In a production context, these attributes are obvious candidates for XSL parameterization. A simple model of the page is illustrated in Figure 3-2. Note that the labeling of the outer regions supposes a left-to-right, top-to-bottom (lr-tb) writing mode.

Figure 3-2. Simple page model

figs/xslf_0302.gif

The page-viewport-area content rectangle is the outermost rectangle, and for any media, this represents the physical bounds of the output medium, e.g., the edges of the sheet of paper.

These might typically be set, for an A4 sheet, using:

<fo:simple-page-master
    master-name="simple"
    page-height="29.7cm"
    page-width="21cm"
    ...

For U.S. letter or another paper size, substitute the appropriate dimensions. For U.S. letter, the page-height would be "11in", and the page-width would be "8.5in".
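One possible way to parameterize the page dimensions in the XSLT stylesheet that generates the FO file is sketched below. The parameter names page-height and page-width are assumptions made for this illustration, and the A4 defaults can be overridden when the transformation is invoked (how parameters are passed depends on your XSLT processor).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical parameters; the defaults describe an A4 sheet. -->
  <xsl:param name="page-height" select="'29.7cm'"/>
  <xsl:param name="page-width" select="'21cm'"/>

  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <!-- Attribute value templates insert the parameter values. -->
        <fo:simple-page-master master-name="simple"
            page-height="{$page-height}"
            page-width="{$page-width}">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="simple">
        <fo:flow flow-name="xsl-region-body">
          <fo:block><xsl:apply-templates/></fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

</xsl:stylesheet>
```

Passing 11in and 8.5in for the two parameters would then yield a U.S. letter page without editing the stylesheet.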

The margin properties on the simple-page-master (see Section 3.3.1) determine the size and position of the page-reference-area content rectangle relative to the content rectangle of the page-viewport-area. For page-masters, there is no ambiguity about the meaning of top, bottom, left, or right when discussing the page-viewport-area edges, and therefore no ambiguity with respect to the corresponding margins (see Section 3.5.2).

These might be set using:

<fo:simple-page-master
    master-name="simple"
    page-height="29.7cm"
    page-width="21cm"
    margin-top="1cm"
    margin-bottom="2cm"
    margin-left="2.5cm"
    margin-right="2.5cm">

Note the height and width properties here. One simple way of obtaining a landscape page is to shift vertical and horizontal properties, providing a greater width than height.
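A landscape A4 page master along these lines might be sketched as follows. The master-name landscape and the margin values are chosen for illustration only.

```xml
<!-- Landscape A4: the portrait height and width values are swapped. -->
<fo:simple-page-master
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="landscape"
    page-height="21cm"
    page-width="29.7cm"
    margin-top="2cm"
    margin-bottom="2cm"
    margin-left="2.5cm"
    margin-right="2.5cm">
  <fo:region-body/>
</fo:simple-page-master>
```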

3.3.1 Margin Properties for Blocks

XSL-FO defines what are called the common margin properties-block. These are also applicable in the page context. The common margin properties consist of margin-top, margin-bottom, margin-left, margin-right, space-before, space-after, start-indent, and end-indent. Note that how these properties map onto the actual page depends on the writing mode and reference orientation selected.

The value of each property may be either an absolute length or a percentage of the applicable dimension of the containing block or page.

In other words, for the page-reference-area content rectangle, we have:

  1. content-rectangle width = page-width - margin-left - margin-right

  2. content-rectangle height = page-height - margin-top - margin-bottom

The page-height is the distance from top to bottom, and the page-width is the distance from left to right.
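As a worked instance of the two formulas above, take the A4 page master with margins shown earlier in this section. The arithmetic is given in the comment; the empty region-body is added only to make the fragment complete.

```xml
<!-- Content rectangle of the page-reference-area for this master:
       width  = 21cm   - 2.5cm - 2.5cm = 16cm
       height = 29.7cm - 1cm   - 2cm   = 26.7cm -->
<fo:simple-page-master
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="simple"
    page-height="29.7cm"
    page-width="21cm"
    margin-top="1cm"
    margin-bottom="2cm"
    margin-left="2.5cm"
    margin-right="2.5cm">
  <fo:region-body/>
</fo:simple-page-master>
```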

Two other attributes that may be set on simple-page-master are writing-mode and reference-orientation. We will shortly examine their impact on the placement of regions.

Be aware that the page-reference-area may not have borders or padding. This is an XSL 1.0 limitation.

The following is a rough description of simple-page-master and its contents:

Element

simple-page-master

Purpose

Defines the basic page master used in XSL 1.0

Properties
  • Common margin properties-block

  • master-name

  • page-height

  • page-width

  • reference-orientation

  • writing-mode

Content model

(region-body,region-before?,region-after?,region-start?,region-end?)

3.3.2 Regions

Figure 3-2 indicates the five regions that make up any page that can be created by using simple-page-master. All four outer regions, which correspond to the header, footer, left side-bar, and right side-bar, are optional. These elements are children of the simple-page-master element. Example 3-1 provides a simple-page-master that includes all five regions.

Example 3-1. Region example
<fo:simple-page-master
    master-name="simple"
    page-height="29.7cm"
    page-width="21cm"
    margin-top="1cm"
    margin-bottom="2cm"
    margin-left="2.5cm"
    margin-right="2.5cm">
  <fo:region-body
      margin-top="1cm"/>
  <fo:region-before
      extent="3cm"/>
  <fo:region-after
      extent="1.5cm"/>
  <fo:region-start
      extent="2cm"/>
  <fo:region-end
      extent="2cm"/>
</fo:simple-page-master>

3.3.3 Absolute and Relative Directions

Directions and how they are specified are key concepts in XSL. They figure prominently throughout. To understand some aspects of pagination, we must begin to discuss them here.

A number of formatting objects, including simple-page-master, define so-called reference areas. The important characteristic of such elements is that they may have reference-orientation and writing-mode attributes. That is, they can define coordinate systems.

reference-orientation defines the top for the content-rectangle of the reference area in question, with respect to the containing reference area. Permitted values are 0, 90, 180, and 270, together with their negative counterparts -90, -180, and -270; these specify counter-clockwise (CCW) rotations in degrees. Thus, 90 puts top at 9 o'clock, and -90 puts top at 3 o'clock. The default, or initial, value of reference-orientation is 0, so that if you do not explicitly set any other value, the top of all areas will be the same as the top of the sheet of paper, which is the normal requirement for Western usage.

3.3.4 Writing Mode

A clear understanding of writing-mode is necessary both for background and to facilitate the insertion of content that does not use the default.

Loosely speaking, writing-mode specifies the progression direction of blocks (lines and paragraphs, for example) as they are laid out on a page and the progression direction of characters and words within a line. For our purposes, it is sufficient to know that we must use writing-mode to fix both progression directions; this can then be used to determine before, after, start, and end, and we can then map these relative directions to some permutation of top, bottom, left, and right. The specific permitted values for writing-mode are lr-tb (left to right, top to bottom), rl-tb, and tb-rl. The relative directions, as determined by writing-mode, are shown in Figure 3-3.

Figure 3-3. Writing mode and relative directions

figs/xslf_0303.gif

There are other possibilities for writing modes, but these are the three that can be currently specified. It should be clear that the writing-mode uniquely determines the two progression directions, one for blocks and one for inlines; and this, in turn, uniquely fixes the four relative directions.

I don't find these terms particularly intuitive and, hence, keep a small diagram in front of me that shows (for Western use) the before direction at 12 o'clock, the after direction at 6 o'clock, the end direction at 3 o'clock, and the start direction at 9 o'clock. This equates with the first diagram in Figure 3-3. If I ever change the writing-mode, all I need to do is rotate the diagram to maintain my orientation. When I want to specify a border at the left edge of my page, I translate this as being at 9 o'clock for my usage. Further examples of orientation of writing-mode can be found in Section 8.1.2.

The writing-mode property also determines the four edges of an area. Specification of writing-mode on the simple-page-master identifies the before, after, start, and end edges of the page-reference-area content-rectangle, relative to the top of that content-rectangle, as determined by the reference-orientation we have specified on the simple-page-master. This is illustrated in Figure 3-4.

Figure 3-4. The regions of a page

figs/xslf_0304.gif

Rule of thumb: the direction of top is specified by reference-orientation. This is determined first, whether by explicit specification on the formatting object, by inheritance, or by using an initial value. Only then is the writing-mode used to figure out the meaning of before, after, start, and end. When reading writing-modes, bear in mind that lr, rl, and tb are short forms, and correspond to lr-tb, rl-tb, and tb-rl, respectively. These are all related to your manner of writing; for instance, English uses lr-tb, that is, left to right, top to bottom. Other languages have different writing directions. The first part of the writing-mode is the inline-progression-direction, which determines start and end. Similarly, the second part of the writing-mode is the block-progression-direction, which determines before and after.

To illustrate, if you lay out text with blocks (paragraphs) stacking from right to left, and words and characters stacking from bottom to top, this would be a bt-rl writing-mode. If the current absolute orientation of the area in question had top at the (real-world) left (at 90), then bottom-to-top (the inline-progression-direction) runs from -90 (start) to 90 (end), and right-to-left runs from 0 (before) to 180 (after).

Now you will understand exactly how we placed the regions for the diagram in Figure 3-2. The writing-mode is taken as lr-tb, and the reference-orientation on the simple-page-master is assumed to be 0 degrees. Hence, region-before is at 12 o'clock, or the top; region-after is at 6 o'clock, or the bottom; region-start is at 9 o'clock, or the left; and region-end is at 3 o'clock, or the right.

Each of the four outside optional regions is flush with the edge of the page-reference-area content-rectangle of the same name. That is, the before edge of region-before is flush with the before edge of the page-reference-area content-rectangle. The same is true for the other three regions.

The single dimension that can be specified on any of the four optional regions is the extent. This is the value of the extent attribute. The extent is the size of the region measured perpendicularly from the flush edge. It is specified either as an explicit length or as a percentage of the corresponding height or width of the page; the default is 0.0pt. Figure 3-5 illustrates this for Example 3-1. The region-body is not displayed, for clarity's sake.

Figure 3-5. Region extents

figs/xslf_0305.gif

The region-body is sized differently. This formatting object has margins, just as does the simple-page-master. For example, let us assume that we have specified a reference-orientation of 90 on the simple-page-master. top for the simple-page-master points to real-world 9 o'clock. This means that the absolute directions for the page-reference-area content-rectangle, which contains all the regions, are top at 90 (9 o'clock), bottom at -90 (3 o'clock), left at 180 (6 o'clock), and right at 0 (12 o'clock). Margins on the region-body use these directions; margin-top on the region-body therefore is taken from the page-reference-area content-rectangle edge at an absolute direction of 90 (9 o'clock).

Specifying a reference-orientation on the region-body does not affect the determination of margin directions for the region. The reference-orientation on the region establishes a coordinate system for its descendant areas.

The four margins (margin-top, margin-bottom, margin-right, and margin-left) on the region-body are used to size and position that region, relative to the edges of the content rectangle of the page-reference-area. It is important to understand that the positioning and size of the region-body are therefore independent of the extents of any of the four optional outer regions, present or not.

It is up to you to ensure that the margins for the region-body are equal to or exceed the extent of the outer region on each corresponding edge. If you do not explicitly specify any margin properties, they will be set to 0. If percentages are used, the containing block is the page-reference-area content-rectangle. The percentages are therefore mismatched between the region extents and the margins of the region-body. On region-start, a 10% extent is 10% of the page width; on region-body, a 10% margin-left is 10% of the page-reference-area (that is, less than the extent!).
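A minimal sketch of a page master whose region-body margins match the extents of the outer regions, assuming the default lr-tb writing mode and a reference-orientation of 0, could look like this. The master-name matched and all dimensions are illustrative; absolute lengths are used to sidestep the percentage mismatch just described.

```xml
<fo:simple-page-master
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="matched"
    page-height="29.7cm"
    page-width="21cm"
    margin-top="1cm"
    margin-bottom="1cm"
    margin-left="2cm"
    margin-right="2cm">
  <!-- Each body margin equals the extent of the region on that edge,
       so the body does not overlap the header, footer, or sidebars. -->
  <fo:region-body
      margin-top="2cm"
      margin-bottom="1.5cm"
      margin-left="2cm"
      margin-right="2cm"/>
  <fo:region-before extent="2cm"/>
  <fo:region-after extent="1.5cm"/>
  <fo:region-start extent="2cm"/>
  <fo:region-end extent="2cm"/>
</fo:simple-page-master>
```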

Each region establishes a viewport-area/reference-area pair, as does the simple-page-master. The reference-orientation of the reference-area, which receives the actual content, is 0, so it has the same top as the corresponding viewport. The overflow property controls behavior when the content "overflows" the viewport. This has relevance to printed media: the default value of auto allows for user-agent-dependent behavior, and none of the other choices (hidden, visible, scroll, or error-if-overflow) can be translated into well-defined behavior in a print environment. If this is of concern, refer to Section 7.20.2 in the XSL specification and the formatter documentation.

None of the region reference-areas may have any borders or padding. This is an XSL-FO 1.0 limitation. When combined with the similar injunction placed on the simple-page-master, it means that if you want page borders, you will have to do fairly convoluted things with block formatting objects.

You may also need to control the along-edge dimension of the four outer regions. The degree of control is limited, but you can specify how the regions overlap at the corners. You may specify a value of true or false for the precedence property on the region-before or region-after. The default, or initial, value is false. The block-progression-direction of the region-start or region-end extends to the page margins (to the start or end edge of the content-rectangle of the page-reference-area) if the value of the precedence on the adjacent region-before or region-after is false; otherwise, if the precedence of the region-before or region-after is true, then those regions float into the area that would be otherwise occupied by region-start or region-end. In other words, if you specify true for the precedence on the header (region-before), it will cover the top left and right corners; if you specify false, the left and right areas will cover the corners.

Let us work up a fairly complex simple-page-master and depict the resulting regions:

<fo:simple-page-master
    master-name="recto"              [1]
    page-height="11in"
    page-width="8.5in"
    margin-top="1in"
    margin-bottom="1in"
    margin-left="0.75in"
    margin-right="0.5in"
    reference-orientation="90"       [2]
    writing-mode="tb-rl">            [2]
  <fo:region-body
      reference-orientation="90"
      margin-top="3in"
      margin-bottom="1in"
      margin-left="1.5in"
      margin-right="1.25in"/>
  <fo:region-before
      precedence="true"
      extent="2in"/>
  <fo:region-start
      extent="1in"/>                 [3]
  <fo:region-end
      extent="1in"/>                 [3]
</fo:simple-page-master>

  1. A page-sequence references this simple-page-master through the master-name recto.

  2. The reference-orientation of 90 and the writing-mode of tb-rl mean that top for the page-reference-area content-rectangle is at 9 o'clock; with an inline-progression-direction of tb, start is at 9 o'clock, and end is at 3 o'clock. Similarly, blocks are stacked rl, so before is at 12 o'clock, and after is at 6 o'clock.

  3. The extents of the region-start and region-end are set to 1 inch to be the same as the margins.

To associate content with regions of a page, each region must have a region-name property. The defaults are listed in Table 3-1.

Table 3-1. Default region-names

Region           Default region-name
region-body      xsl-region-body
region-before    xsl-region-before
region-after     xsl-region-after
region-start     xsl-region-start
region-end       xsl-region-end

The default values are reserved for the specific regions mentioned in the table. For example, you may not assign a value of xsl-region-before to the region-start.

The region-name property may be assigned a value of your choice, other than the default for that region. region-names must be unique within a single simple-page-master. Finally, you may reuse your own names across page-masters, but they must refer to the same region class. Example 3-2 shows a full document with these areas, using the default names correctly.

Example 3-2. Correct region names
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master
        master-name="odd"
        page-height="11in"
        page-width="8.5in"
        margin-top="1in"
        margin-bottom="1in"
        margin-left="1.25in"
        margin-right="0.75in">
      <fo:region-body
          region-name="xsl-region-body"
          margin-top="0.6in"
          margin-bottom="0.6in"
          margin-left="0.6in"
          margin-right="0.6in"/>
      <fo:region-before
          precedence="true"
          border="thin black solid"
          region-name="xsl-region-before"
          extent="0.5in"/>
      <fo:region-after
          border="thin black solid"
          region-name="xsl-region-after"
          extent="0.5in"
          precedence="true"/>
      <fo:region-start
          region-name="xsl-region-start"
          border="thin black solid"
          extent="0.5in"/>
      <fo:region-end
          border="thin black solid"
          region-name="xsl-region-end"
          extent="0.5in"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="odd" format="A">
    <fo:static-content
        flow-name="xsl-region-start">
      <fo:block> <fo:page-number/>
        <fo:block>Ch 1 </fo:block>
      </fo:block>
    </fo:static-content>
    <fo:static-content
        flow-name="xsl-region-end">
      <fo:block>Page <fo:page-number/>
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-before">
      <fo:block display-align="before">Part 1
      </fo:block>
    </fo:static-content>
    <fo:static-content
        flow-name="xsl-region-after"
        display-align="after">
      <fo:block
          text-align="center">Page <fo:page-number/>
      </fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block> The quick brown fox jumps over the lazy dog.
(fill out with further content to show the full page)
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
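Example 3-2 relies on the default region names. For comparison, here is a small sketch that uses names of its own choosing. The names chapter, chapter-body, and chapter-header are hypothetical; what matters is that each flow-name matches the region-name of the region it is meant to fill.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="chapter"
        page-height="11in" page-width="8.5in"
        margin-top="1in" margin-bottom="1in"
        margin-left="1in" margin-right="1in">
      <!-- Custom names, unique within this simple-page-master. -->
      <fo:region-body region-name="chapter-body" margin-top="0.5in"/>
      <fo:region-before region-name="chapter-header" extent="0.5in"/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <fo:page-sequence master-reference="chapter">
    <!-- flow-name matches the region-name on region-before. -->
    <fo:static-content flow-name="chapter-header">
      <fo:block>Running header (placeholder)</fo:block>
    </fo:static-content>
    <!-- flow-name matches the region-name on region-body. -->
    <fo:flow flow-name="chapter-body">
      <fo:block>Body content (placeholder).</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```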

The following summary provides a rough description of region-body and its contents:

Element

region-body

Purpose

Region containing the body content for a page

Properties
  • Common border, padding, and background properties

  • Common margin properties-block

  • clip

  • column-count

  • column-gap

  • display-align

  • overflow

  • region-name

  • reference-orientation

  • writing-mode

Content model

EMPTY

The region-body can be specified to be multicolumn. Although I will not discuss the complex structure of the resulting areas in a region-body in this chapter, suffice it to say that the column-count property indicates the number of columns on every page instance formatted using the simple-page-master to which this region-body belongs. The column-count must be a positive integer greater than or equal to 1. The default is 1.

If a column-count of greater than 1 is specified, a value may be specified for the column-gap property; the default is 12.0pt. The value is either an explicit length or a percentage of the inline-progression-dimension of the content rectangle of the region-body.
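A two-column region-body might be specified as in the sketch below. The master-name two-column and the dimensions are illustrative assumptions.

```xml
<fo:simple-page-master
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="two-column"
    page-height="11in"
    page-width="8.5in"
    margin-top="1in"
    margin-bottom="1in"
    margin-left="1in"
    margin-right="1in">
  <!-- Two columns separated by a quarter-inch gap
       (the default gap would be 12.0pt). -->
  <fo:region-body
      column-count="2"
      column-gap="0.25in"/>
</fo:simple-page-master>
```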

The following summary provides a rough description of the region elements and their contents:

Elements

region-before, region-after, region-start, region-end

Purpose

Regions serving as the header, footer, left sidebar, and right sidebar for a page

Properties
  • Common border, padding, and background properties

  • clip

  • display-align

  • extent

  • overflow

  • precedence (region-before and region-after only)

  • region-name

  • reference-orientation

  • writing-mode

Content model

EMPTY

Each region also has a display-align property. This has the default value of auto, and may be assigned the values auto, before, center, and after. The display-align property controls the alignment of the child areas of the region in the block-progression-direction (top to bottom for a lr-tb page). A detailed explanation of the nuances of this property requires concepts not yet discussed; just be content knowing you can, in fact, influence the vertical placement of content on a page.

Because the default value of auto is treated as before for regions, it works well in region-before to keep the content away from the content of the region-body, but unless you explicitly set display-align to after for region-after, the footer content will meet the contents of the region-body, which is generally not desired. In footers, display-align is generally better set to after to separate the footers from the main page content.
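A sketch of a footer region that pushes its content to the bottom of the region, away from the region-body, follows. The master-name with-footer and the dimensions are placeholders chosen for this illustration.

```xml
<fo:simple-page-master
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="with-footer"
    page-height="29.7cm"
    page-width="21cm"
    margin-top="1cm"
    margin-bottom="1cm"
    margin-left="2.5cm"
    margin-right="2.5cm">
  <fo:region-body margin-top="2cm" margin-bottom="2cm"/>
  <!-- Header content sits at the before edge of its region. -->
  <fo:region-before extent="2cm" display-align="before"/>
  <!-- Footer content is aligned to the after edge of its region. -->
  <fo:region-after extent="2cm" display-align="after"/>
</fo:simple-page-master>
```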

3.3.5 Content Flows

The page-sequence element contains the content to fill a sequence of pages. This element is a wrapper for content; the semantics of it derive entirely from its association with either a single simple-page-master or a page-sequence-master. A single page-sequence-master can adequately describe the pagination requirements for one chapter of a book; hence, we consider a page-sequence to be the vehicle for encapsulating the content for a chapter.

A page sequence consists of one primary stream of content, contained within the flow. It may also contain as many content chunks, described by static-content elements, as are required by the header, footer, and sidebar regions of the simple-page-masters ultimately referenced by the page-sequence.

Both fo:flow and fo:static-content are referred to as flows. The terminology is confusing because of the existence of the element of that name, but it is hard to devise anything better. Both are flows in the sense that they provide content to be laid out into regions of pages. A fo:flow is intended to supply content for the region-body and, as content is consumed, it will not be reused. Static-content elements, on the other hand, are reusable content chunks that provide content for headers, for footers, and for the region-start and region-end (also known as left and right sidebars).[2] They can be customized, normally in ways that derive from the specific page on which they are currently being placed. Page numbering and running headers and footers are examples of content that depends either on the current page or on the content delivered by the fo:flow that has been placed on the current page. This is derivative content.

3.3.6 A Basic Example

Example 3-3 demonstrates the most basic concepts that we have discussed so far.

Example 3-3. A Hello World example
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master
        master-name="simple"             [1]
        page-height="29.7cm"
        page-width="21cm"
        margin-left="2.5cm"
        margin-right="2.5cm">
      <fo:region-body margin-top="3cm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence
      master-reference="simple">
    <fo:flow
        flow-name="xsl-region-body">     [2]
      <fo:block>Hello, World</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>

  1. The page-sequence references the simple-page-master through its master-reference attribute, whose value simple matches this master-name.

  2. The flow-name on fo:flow specifies the fo:region-body using its default (implicit) region-name.

An A4-sized page is used (again, use your own dimensions if needed), with reasonable margins; the only content is the single block within the flow. This is about the smallest XSL-FO document that neither relies entirely on default values nor lets content run to the edges of the page.

3.4 Complex Pagination

We have, so far, developed a reasonably complete understanding of simple-page-masters, but now it is time to examine complex pagination. What mechanism is available to us to specify the sequence of simple-page-masters that will be used to format a given page-sequence and the flows contained within it? For this purpose, XSL 1.0 provides the page-sequence-master element.

This section will look at how the children of a page-sequence-master may be used to vary the selection of page masters.

A page-sequence may select a simple-page-master directly, using the master-reference attribute. This simple-page-master then generates every page required by the flows contained in that page-sequence. In other words, the page master is referenced as many times as is needed. This is shown in Figure 3-6.

Figure 3-6. Single simple-page-master

figs/xslf_0306.gif

A page-sequence may alternatively select a page-sequence-master, also through use of the master-reference attribute. The master-reference on the page-sequence matches the master-name on the page-sequence-master. This is most often useful when a single layout is not enough and the page-sequence needs several simple-page-masters, as is the case when recto and verso pages differ.

A page-sequence is not constrained to use a page-sequence-master that has not been used already. page-sequence-masters are not stateful, in this sense, and effectively "reset" themselves when called upon to supply page-masters to a new page-sequence.

The page-sequence-master is a container for so-called sub-sequence-specifiers, which, by definition, are children of the page-sequence-master. Each of the sub-sequence-specifiers defines a subsequence of the page-sequence in question; the sum of all subsequences is the sequence of pages that results from completely formatting the flow in that page-sequence.

The following summary provides a rough description of page-sequence-master and its contents:

Element

page-sequence-master

Purpose

Specifies the constraints on, and the order in which, a certain set of page-masters generates a sequence of pages

Property

master-name

Content model

(single-page-master-reference|repeatable-page-master-reference|repeatable-page-master-alternatives)+

The XSL specification requires that sufficient virtual page-master capacity be available in the page-sequence-master, as provided through its children, to accommodate the needs of the page-sequence. In other words, if the last subsequence runs out of page-masters and the fo:flow is not exhausted, it is an error. A formatter may recover by using the last page-master.

The mapping of sub-sequence-specifiers to the subsequences that comprise the page-sequence is ordered, and there must be at least as many sub-sequence-specifiers as there are subsequences of pages that are satisfied by the specifiers. In plain English, this means it is acceptable for the flow to finish and leave a number of unused sub-sequence-specifiers. The general idea is depicted in Figure 3-7.

Figure 3-7. Subsequences

figs/xslf_0307.gif

During the processing of one page-sequence, once a page-sequence-master is selected, the sub-sequence-specifiers are used, in order, starting with the first, and without breaks,[3] until the flow is completely processed. Sub-sequence-specifiers may not be reused within the context of formatting a single page-sequence.
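
As a sketch of how specifiers are consumed in order, the following layout-master-set fragment pairs a single-page-master-reference with a repeatable-page-master-reference. The master names cover, body, and report are illustrative assumptions, not names used elsewhere in this chapter. The first page of any page-sequence that references report is generated by cover; every later page is generated by body.

```xml
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- hypothetical page master for the opening page -->
  <fo:simple-page-master master-name="cover"
      page-height="29.7cm" page-width="21cm"
      margin-left="2.5cm" margin-right="2.5cm">
    <fo:region-body margin-top="8cm" margin-bottom="3cm"/>
  </fo:simple-page-master>
  <!-- hypothetical page master for all following pages -->
  <fo:simple-page-master master-name="body"
      page-height="29.7cm" page-width="21cm"
      margin-left="2.5cm" margin-right="2.5cm">
    <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
  </fo:simple-page-master>
  <fo:page-sequence-master master-name="report">
    <!-- first subsequence: exactly one page -->
    <fo:single-page-master-reference master-reference="cover"/>
    <!-- second subsequence: as many pages as the flow still needs -->
    <fo:repeatable-page-master-reference master-reference="body"/>
  </fo:page-sequence-master>
</fo:layout-master-set>
```

If the flow fits on a single page, the second specifier is simply left unused, which is acceptable.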

3.4.1 Single-page-masters

The single-page-master-reference causes exactly one page to be generated. The subsequence that corresponds to this specifier consists of one page. The simple-page-master that is to be used is identified using the master-reference attribute on the single-page-master-reference itself.

This sub-sequence-specifier is especially useful for front matter and back matter. It is most commonly found at the beginning of page-sequence-masters. Example 3-4 shows a single-page-master-reference.

Example 3-4. A single-page-master-reference
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master
        master-name="single"
        page-height="11in"
        page-width="8.5in"
        margin-top="1in"
        margin-bottom="1in"
        margin-left="0.5in"
        margin-right="0.5in">
      <fo:region-body
          margin-top="0.5in"
          margin-bottom="0.5in"/>
    </fo:simple-page-master>
    <fo:page-sequence-master
        master-name="single-page">
      <!-- master-reference points at the simple-page-master named "single" -->
      <fo:single-page-master-reference
          master-reference="single"/>
    </fo:page-sequence-master>
  </fo:layout-master-set>
  <fo:page-sequence
      master-reference="single-page">
    ... CONTENT ...
  </fo:page-sequence>
</fo:root>

Note the difference between using a single-page-master-reference and using a simple-page-master directly. In the latter case, the number of instances of pages that are generated is potentially unbounded. In the former case, the renderer will produce an error or warning if the page-sequence contains more than one page of content.

Note also the required link-back from the page-sequence back to the page-sequence-master, which in turn links back to the simple-page-master. This is the basic mechanism by which a page layout is selected for any content.

3.4.2 Constructing Runs of Identical Pages

The repeatable-page-master-reference causes a bounded or unbounded sequence of pages to be generated using the same page-master. The simple-page-master is referenced using the master-reference attribute on repeatable-page-master-reference. The maximum-repeats attribute can be used to set an upper limit on the number of pages that may be generated using this specifier.

The maximum-repeats attribute is typically used to restrict a flow to a fixed number of pages. Use this if, for example, you require a particular content to be limited to 10 pages.

The initial, or default, value of maximum-repeats is no-limit, meaning it will generate a subsequence of pages that consume the rest of the current fo:flow. Other permitted values are integers, from 0 to N. A value of 0 indicates that this sub-sequence-specifier maps to a page subsequence of zero length. Negative values are rounded to 0; positive fractions are rounded up to the nearest integer. Example 3-5 shows the use of a repeatable-page-master-reference.

Example 3-5. A repeatable-page-master-reference
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master
        master-name="many"
        page-height="11in"
        page-width="8.5in"
        margin-top="1in"
        margin-bottom="1in"
        margin-left="0.5in"
        margin-right="0.5in">
      <fo:region-body
          margin-top="0.5in"
          margin-bottom="0.5in"/>
    </fo:simple-page-master>
    <fo:page-sequence-master
        master-name="many-pages">
      <!-- at most 10 pages are generated from the "many" page master -->
      <fo:repeatable-page-master-reference
          master-reference="many"
          maximum-repeats="10"/>
    </fo:page-sequence-master>
  </fo:layout-master-set>
  <fo:page-sequence
      master-reference="many-pages">
    ... CONTENT ...
  </fo:page-sequence>
</fo:root>
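
If the content may run past the limit, one way to avoid exhausting the page-sequence-master is to follow the bounded specifier with an unbounded one. The fragment below is a sketch under that assumption. The name many is the page master from Example 3-5; overflow and capped-pages are hypothetical names, and overflow would need its own simple-page-master in the layout-master-set.

```xml
<fo:page-sequence-master xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="capped-pages">
  <!-- at most 10 pages use the "many" layout -->
  <fo:repeatable-page-master-reference
      master-reference="many"
      maximum-repeats="10"/>
  <!-- maximum-repeats defaults to no-limit, so this absorbs any remaining content -->
  <fo:repeatable-page-master-reference
      master-reference="overflow"/>
</fo:page-sequence-master>
```

With this arrangement, an eleventh page is generated from overflow instead of triggering the error described earlier for an exhausted page-sequence-master.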

3.4.3 Conditional Selection of Page Masters

The most powerful and challenging sub-sequence-specifier is the repeatable-page-master-alternatives formatting object. Some of the nuances will become clearer when we discuss page-breaking in Section 5.1.3. This element does not have a master-reference attribute, because it doesn't reference page masters directly; its children do.

Use this element to select one from a number of alternatives for content. A number of conditions may be tested, related to a page's position within a sequence, the page number, or whether or not a particular page is blank.

The children of the repeatable-page-master-alternatives element are known as alternatives. Each alternative is represented using the conditional-page-master-reference formatting object, and each one refers to a specific simple-page-master by name, using the master-reference attribute. The formatter considers each alternative in order. The first alternative for which all of the subconditions are true is selected, and the simple-page-master it references generates the current page. The repeatable-page-master-alternatives element may contain one or more alternatives, although in practice there are rarely more than three or four. The alternatives have traits, specified using properties on each conditional-page-master-reference, that specify the conditions that must be satisfied for this particular page layout to become active. Example 3-6 uses odd, even, blank, or last pages.

The primary use of this sub-sequence-specifier is to vary the page layout according to the position of each page in the formatted output or according to its content.

It is considered good practice to supply a final conditional-page-master-reference that has a condition that must always be true. This is akin to a default: statement inside a C or Java switch block. If, at some point during use of a repeatable-page-master-alternatives, no condition is true, use of this sub-sequence-specifier will terminate.
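
A catch-all alternative can be written by leaving every condition unspecified, since each of the three condition properties defaults to any. The fragment below is a minimal sketch; chapter-with-default, chapter-first, and chapter-default are hypothetical names, and the two page masters they refer to are assumed to be defined in the same layout-master-set.

```xml
<fo:page-sequence-master xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="chapter-with-default">
  <fo:repeatable-page-master-alternatives>
    <!-- tested first: matches only the first page of the page-sequence -->
    <fo:conditional-page-master-reference
        master-reference="chapter-first"
        page-position="first"/>
    <!-- no conditions given, so all default to any: always true -->
    <fo:conditional-page-master-reference
        master-reference="chapter-default"/>
  </fo:repeatable-page-master-alternatives>
</fo:page-sequence-master>
```

Because alternatives are tested in order, the catch-all belongs last; placed first, it would match every page and the other alternatives would never be reached.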

Three properties may be used to specify the conditions upon which the selection of the alternative is made:

  • page-position

  • odd-or-even

  • blank-or-not-blank

The page-position trait may take the values first, last, rest, or any. The default is any. The values are interpreted as follows:

first

The subcondition is true if the current page is the first page in the page-sequence.

last

The subcondition is true if the current page is the last page in the page-sequence.

rest

The subcondition is true if the current page is neither the first nor the last page in the page-sequence.

any

Always true.

first or last relates to the formatted output of the flow. Think of it as pouring text into containers. The first piece of content poured has the value of first; the last piece of content has the value of last, in this sense. This might be the first or last piece of content of a chapter or article, as templates are applied to produce the content within the flow.

The odd-or-even trait may have the values odd, even, or any. The default is any. The parity of the page-number is determined with respect to the page-number trait for the current page; see Section 3.5.

The values are interpreted as follows:

odd

The subcondition is true if the current page number is odd.

even

The subcondition is true if the current page number is even.

any

Always true.

This property is used to select the formatting required for odd or even pages. A typical example is a book layout in which the margins nearer the gutter are larger than the opposite margins; this provides a more even appearance when the book is laid open.
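
The gutter arrangement just described might be sketched as follows. The names recto, verso, and book-body are illustrative assumptions, and the margin values are arbitrary; odd pages are assumed to fall on the right-hand side of the spread, so their gutter is the left margin.

```xml
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- odd (right-hand) pages: wider margin on the left, next to the gutter -->
  <fo:simple-page-master master-name="recto"
      page-height="29.7cm" page-width="21cm"
      margin-left="3.5cm" margin-right="1.5cm">
    <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
  </fo:simple-page-master>
  <!-- even (left-hand) pages: wider margin on the right, next to the gutter -->
  <fo:simple-page-master master-name="verso"
      page-height="29.7cm" page-width="21cm"
      margin-left="1.5cm" margin-right="3.5cm">
    <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
  </fo:simple-page-master>
  <fo:page-sequence-master master-name="book-body">
    <fo:repeatable-page-master-alternatives>
      <fo:conditional-page-master-reference
          master-reference="recto" odd-or-even="odd"/>
      <fo:conditional-page-master-reference
          master-reference="verso" odd-or-even="even"/>
    </fo:repeatable-page-master-alternatives>
  </fo:page-sequence-master>
</fo:layout-master-set>
```

Every page number is either odd or even, so one of the two alternatives always matches and no separate catch-all is needed here.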

The blank-or-not-blank trait may have the values blank, not-blank, or any. The default is any. It may not be immediately obvious that a blank page would be generated; one possibility is to use force-page-count, mentioned in Section 3.5; a discussion of other possibilities will have to wait for the examination of page breaks. The values of this property are interpreted as follows:

blank

The subcondition is true if the current page contains no areas generated from the fo:flow.

not-blank

The subcondition is true if the current page contains areas from the fo:flow.

any

Always true.

Note that we are concerned with areas generated by the fo:flow, not by fo:static-content. So-called "blank" pages will often end up with headers or footers. If you do not yet know what page-master you will use, how can you make a determination of what, if any, the applicable static-contents are? But you can always determine whether the current page will contain areas from the fo:flow. So this condition concerns itself only with fo:flow, not all content on the page.
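
The following sketch shows a blank page that still carries static content. All names (content-page, blank-page, with-blanks, blank-notice) and the notice text are assumptions made for illustration. The flow is short enough to fit on one page, and force-page-count="even" is used to request a second page that receives no areas from the flow.

```xml
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="content-page"
        page-height="29.7cm" page-width="21cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
    </fo:simple-page-master>
    <fo:simple-page-master master-name="blank-page"
        page-height="29.7cm" page-width="21cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
      <!-- a named region that exists only on blank pages -->
      <fo:region-before region-name="blank-notice" extent="3cm"/>
    </fo:simple-page-master>
    <fo:page-sequence-master master-name="with-blanks">
      <fo:repeatable-page-master-alternatives>
        <!-- tested first, so a blank page never falls through to content-page -->
        <fo:conditional-page-master-reference
            master-reference="blank-page" blank-or-not-blank="blank"/>
        <fo:conditional-page-master-reference
            master-reference="content-page"/>
      </fo:repeatable-page-master-alternatives>
    </fo:page-sequence-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="with-blanks" force-page-count="even">
    <!-- only laid out on pages whose master has a region named blank-notice -->
    <fo:static-content flow-name="blank-notice">
      <fo:block text-align="center">This page intentionally left blank.</fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>A single page of content.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

How the extra page is actually produced is up to the formatter, so it is worth checking the output of the formatter you use.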

Static content (as it is referred to in the specification) isn't exactly static. The idea is that, compared to page body content, the headers and footers change relatively little, hence, they are said to be "static." I find this confusing because the most common content for a header or footer is the page number, which changes every page! However, the specification calls it static content, so that's that. It is also static in that it is the same size and location for each page layout, so perhaps we might sway towards agreeing with the spec writers. For simplicity's sake, where you see static content, think headers and footers.

The repeatable-page-master-alternatives formatting object has a maximum-repeats property, which has the same meaning and default value as it does for the repeatable-page-master-reference formatting object. In effect, the number of times we test the conditions supplied by the repeatable-page-master-alternatives to select appropriate page masters may be bounded.

Let us construct an example and use it to clarify everything we have learned so far about repeatable-page-master-alternatives, conditional-page-master-reference, conditions, and subconditions.

3.4.4 Page Conditions

I have already mentioned that a fo:page-sequence is often used to model a chapter of a book or a complete article. In this capacity, let us surmise that we wish to prepare a page-sequence-master that can handle chapters with the following structure:

  • First page

  • The rest of the pages, except for the last

  • Last page

Let us also consider that the chapters have some internal structure that might result in blank pages (page break conditions, which are mentioned later, could cause this). We also want to handle internal blank pages and blank last pages differently. Because we want new chapters to start on an odd page, we set the force-page-count property on the page-sequence to even, which ensures that the total page count of each chapter is even. Note that force-page-count is an extended property that should probably be avoided in favor of initial-page-number="auto-odd"; that is, a more portable alternative is to set the initial-page-number attribute on the following page-sequence. Separate page masters will be employed for the first page, for the last page, for blank pages, and for even and odd pages.

The comments prior to each block of code in Example 3-6 explain the purpose of that block.

Example 3-6. Conditional page selection
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- layout for the first page -->
    <fo:simple-page-master master-name="first"
        page-height="29.7cm" page-width="21.0cm"
        margin-top="2cm" margin-bottom="2cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="10cm" margin-bottom="2cm"/>
      <fo:region-after region-name="non-blank-after" extent="2cm"/>
    </fo:simple-page-master>
    <!-- layout for odd pages -->
    <fo:simple-page-master master-name="odd"
        page-height="29.7cm" page-width="21.0cm"
        margin-top="2cm" margin-bottom="2cm"
        margin-left="3.5cm" margin-right="1.5cm">
      <fo:region-body margin-top="2cm" margin-bottom="2cm"/>
      <fo:region-before region-name="odd-before" extent="2cm"/>
      <fo:region-after region-name="non-blank-after" extent="2cm"/>
    </fo:simple-page-master>
    <!-- layout for even pages -->
    <fo:simple-page-master master-name="even"
        page-height="29.7cm" page-width="21.0cm"
        margin-top="2cm" margin-bottom="2cm"
        margin-left="1.5cm" margin-right="3.5cm">
      <fo:region-body margin-top="2cm" margin-bottom="2cm"/>
      <fo:region-before region-name="even-before" extent="2cm"/>
      <fo:region-after region-name="non-blank-after" extent="2cm"/>
    </fo:simple-page-master>
    <!-- layout for odd last page, blank or not-blank -->
    <!-- Note that this is redundant in the example -->
    <!-- layout for even last page, blank or not-blank -->
    <fo:simple-page-master master-name="last_even"
        page-height="29.7cm" page-width="21.0cm"
        margin-top="2cm" margin-bottom="2cm"
        margin-left="1.5cm" margin-right="3.5cm">
      <fo:region-body margin-top="2cm" margin-bottom="2cm"/>
      <fo:region-before region-name="even-last-before" extent="2cm"/>
      <fo:region-after region-name="last-after" extent="2cm"/>
    </fo:simple-page-master>
    <!-- layout for blank pages (non-last) -->
    <fo:simple-page-master master-name="blank"
        page-height="29.7cm" page-width="21.0cm"
        margin-top="2cm" margin-bottom="2cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="2cm" margin-bottom="2cm"/>
      <fo:region-before region-name="blank-before" extent="2cm"/>
      <fo:region-after region-name="blank-after" extent="2cm"/>
    </fo:simple-page-master>
    <fo:page-sequence-master master-name="chapter">
      <fo:repeatable-page-master-alternatives>
        <fo:conditional-page-master-reference
            master-reference="odd"
            page-position="rest"
            odd-or-even="odd"/>
        <fo:conditional-page-master-reference
            master-reference="even"
            page-position="rest"
            odd-or-even="even"/>
        <fo:conditional-page-master-reference
            master-reference="first"
            page-position="first"/>
        <fo:conditional-page-master-reference
            master-reference="last_even"
            odd-or-even="even"
            page-position="last"/>
        <fo:conditional-page-master-reference
            master-reference="blank"
            blank-or-not-blank="blank"/>
      </fo:repeatable-page-master-alternatives>
    </fo:page-sequence-master>
  </fo:layout-master-set>
  <!-- end: defines page layout -->
  <!-- actual layout -->
  <fo:page-sequence
      master-reference="chapter"
      force-page-count="even"
      initial-page-number="1">
    <fo:static-content flow-name="non-blank-after">
      <fo:block><fo:page-number/>
        <!-- content for non-blank page footers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="blank-before">
      <fo:block><fo:page-number/>
        <!-- content for blank page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="blank-after">
      <fo:block><fo:page-number/>
        <!-- content for blank page footers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="odd-before">
      <fo:block><fo:page-number/>
        <!-- content for odd page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="even-before">
      <fo:block><fo:page-number/>
        <!-- content for even page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="even-last-before">
      <fo:block><fo:page-number/>
        <!-- content for even last page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="last-after">
      <fo:block><fo:page-number/>
        <!-- content for last page footers -->
      </fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>
        Insert sufficient content for 35 pages to complete
        this example
      </fo:block>
      <fo:block break-before="page"/>
    </fo:flow>
  </fo:page-sequence>
  <fo:page-sequence
      master-reference="chapter"
      force-page-count="even"
      initial-page-number="39">
    <fo:static-content flow-name="non-blank-after">
      <fo:block><fo:page-number/>
        <!-- content for non-blank page footers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="blank-before">
      <fo:block><fo:page-number/>
        <!-- content for blank page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="blank-after">
      <fo:block><fo:page-number/>
        <!-- content for blank page footers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="odd-before">
      <fo:block><fo:page-number/>
        <!-- content for odd page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="even-before">
      <fo:block><fo:page-number/>
        <!-- content for even page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="even-last-before">
      <fo:block><fo:page-number/>
        <!-- content for even last page headers -->
      </fo:block>
    </fo:static-content>
    <fo:static-content flow-name="last-after">
      <fo:block><fo:page-number/>
        <!-- content for last page footers -->
      </fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>
        Insert sufficient content for some more pages
        to complete this example
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>

The repeatable-page-master-alternatives formatting object has a maximum-repeats attribute, which is used in exactly the same fashion as described for repeatable-page-master-reference, except that the individual pages are instances of simple-page-masters that are chosen according to conditions. Figure 3-8 demonstrates how these alternatives may work.

Figure 3-8. Repeatable-page-master-alternatives

figs/xslf_0308.gif

3.5 Page Sequences

So far, I have talked about aspects of fo:page-sequence (its children, and the page-masters to which it points in one fashion or another) that pertain to how sequences of pages are married with their page-masters. I have also talked at length about the structure of a page. I have left out several properties that deal with page numbering, but I'll introduce them soon. There are also two other properties that I will talk about that introduce elements of internationalization.

3.5.1 Page Numbering

The initial-page-number property fixes the page number for the first page of the page-sequence to which it applies. The values of the property and its interpretation are listed as follows:

auto

If this is the first page-sequence, the initial page number becomes 1. If it is not the first page-sequence, the initial page number of the current page-sequence becomes the page number of the last page of the preceding page-sequence, plus 1. That is, it simply continues numbering pages sequentially.

auto-odd

As for auto. If the resulting value is even, add 1.

auto-even

As for auto. If the resulting value is odd, add 1.

[number]

A positive integer, that is, 1 or greater. If a non-positive integer is supplied, this number is rounded to the nearest positive integer.

To force content to be numbered starting at, say, page 51, simply use this property, as in Example 3-7.

Example 3-7. Forced page numbering
<fo:page-sequence  master-reference="chapter"  initial-page-number="51" ...

If the first page-sequence has no value specified for initial-page-number, the default of auto is used, and hence, the first page is numbered as 1.
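
The keyword values can be sketched in the same way. In the following document, the master name page and the block text are assumptions made for illustration; each chapter asks to start on an odd page number by setting initial-page-number to auto-odd.

```xml
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- "page" is a hypothetical master name -->
    <fo:simple-page-master master-name="page"
        page-height="29.7cm" page-width="21cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- first page-sequence: auto gives 1, which is already odd -->
  <fo:page-sequence master-reference="page" initial-page-number="auto-odd">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Chapter 1 content.</fo:block>
    </fo:flow>
  </fo:page-sequence>
  <!-- later page-sequence: continues the numbering, adding 1 if the result is even -->
  <fo:page-sequence master-reference="page" initial-page-number="auto-odd">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Chapter 2 content.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Whether a blank page appears in front of the second chapter when a number is skipped depends on force-page-count and on the formatter, so check the result in the tool you use.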

The force-page-count property imposes a condition on the number of pages in a page-sequence. The condition is one of parity, applied either to the page count or to the number of the last page. If the condition is not satisfied, one page is added to the current page-sequence. The values of the property and their interpretation are as follows:

auto

The action taken depends on the existence of a succeeding page-sequence and the value of its initial-page-number property. If there is a succeeding page-sequence and an odd or even initial-page-number is explicitly specified on it, the current page-sequence must adapt so that its last page has the opposite parity. Otherwise, no page is forced.

even

Force an even page count for the page-sequence.

odd

Force an odd page count for the page sequence.

end-on-even

Force the last page to have an even page number.

end-on-odd

Force the last page to have an odd page number.

no-force

Do not force any page count.

Note that the default value is auto, which is mostly what is wanted. As a starting point, try the initial-page-number property, which is the easier option here.

Consider what will happen if we set various values for the force-page-count property on the first page-sequence, and modify the second page-sequence to have an explicit initial-page-number value of 39. Let us also assume that the first page-sequence now formats out to fewer than 39 pages, say 37. (If the first page-sequence instead formatted out to only 15 pages, the renderer would, by default, add one page so that the sequence ends on an even page and the next page-sequence can start on an odd page, but it would not add the extra pages needed to fill the numeric gap.)

If the default, auto, is in effect, then because the next page-sequence is required to start with a page number of 39, the last page of this one must be even. Currently, it is 37, so a page must be added.

If we set force-page-count to even, a page must be added, bringing the page count up to 38. The last page is numbered 38. Again, note that this could be achieved more portably using simply the initial-page-number property. If we set force-page-count to odd, no action need be taken. If we set force-page-count to end-on-even, a page must be added, so that the last page is numbered as 38. If we set force-page-count to end-on-odd, no action need be taken.
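
The scenario can be set up with markup along the following lines. This is a sketch only: the master name page is hypothetical, the first flow is a placeholder that you would need to fill with 37 pages of content, and end-on-even is just one of the values discussed above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- "page" is a hypothetical master name -->
    <fo:simple-page-master master-name="page"
        page-height="29.7cm" page-width="21cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- try even, odd, end-on-even, end-on-odd, no-force, or auto here -->
  <fo:page-sequence master-reference="page" force-page-count="end-on-even">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Placeholder: supply enough content to fill 37 pages.</fo:block>
    </fo:flow>
  </fo:page-sequence>
  <!-- explicit starting number for the second page-sequence -->
  <fo:page-sequence master-reference="page" initial-page-number="39">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Second page-sequence.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Changing only the force-page-count value and reformatting is a quick way to compare a formatter's behavior against the cases described above.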

If you use the initial-page-number attribute and leave force-page-count at its default value of auto, this should normally result in the output needed. It is simple and reliable.

Do not assume that an actual formatter will follow a particular method of forcing the page count, such as adding blank pages. An implementation may elect to use a different strategy to satisfy this constraint, and this may result in an unexpected blank page, or no blank page where you would expect one.

If you are concerned with this class of problem, it may well be worth experimenting with content and these properties to fully come to terms with them. It is also wise to check which of these properties are supported by the formatter of your choice.

Four properties that influence the formatting of the page number if it is requested are format, letter-value, grouping-separator, and grouping-size. These properties are defined in Section 7.7.1, "Number to String Conversion," of the XSLT specification.[4] Read that W3C Recommendation for a full exposition; here's a short synopsis.

The common values that the format property may include are 1, which results in a sequence of the form 1, 2, 3, ..., 10, 11, 12, ..., 100, 101, 102, ..., or variants such as 001, which results in a sequence of the form 001, 002, 003, ..., 010, 011, 012, ..., 100, 101, 102, .... A generates an uppercase sequence of the form A, B, C, ..., AA, AB, AC, .... a generates a lowercase sequence of the form a, b, c, ..., aa, ab, ac, .... i generates a sequence of lowercase Roman numerals, and I, a sequence of uppercase Roman numerals. There are other possibilities. Be advised that numbering is influenced by language (see the next section) and that these examples are true for Western scripts, not necessarily others.

The letter-value property disambiguates between an alphabetic letter sequence, such as that realized in English by the format token a and some other assignment of numbers to letters, such as the Roman numeral system in English. The first is obtained by specifying the value of alphabetic, and the other is obtained by specifying a value of traditional. The property is used when the format token would be the same; in other words, the first member of the alphabetic sequence is the same as the first member of the traditional sequence. This is not an issue in English.

For readability, long numbers are frequently grouped, e.g., 10000 becomes 10,000. The grouping-separator property specifies the separator, in this example, a comma; and the grouping-size property specifies the size of the grouping, in this example, 3. For a grouping-separator of . and a grouping-size of 2, 10000 would become 1.00.00, which may or may not be meaningful to you.
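
These properties are set on the page-sequence. The sketch below assumes a common book convention, lowercase Roman numerals for front matter and Arabic numerals for the body; the master name page and the block text are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- "page" is a hypothetical master name -->
    <fo:simple-page-master master-name="page"
        page-height="29.7cm" page-width="21cm"
        margin-left="2.5cm" margin-right="2.5cm">
      <fo:region-body margin-top="3cm" margin-bottom="3cm"/>
      <fo:region-after extent="2cm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- front matter: page numbers formatted as i, ii, iii, ... -->
  <fo:page-sequence master-reference="page" format="i">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center"><fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Preface.</fo:block>
    </fo:flow>
  </fo:page-sequence>
  <!-- body: numbering restarts at 1; grouping only shows once numbers reach four digits -->
  <fo:page-sequence master-reference="page"
      initial-page-number="1" format="1"
      grouping-separator="," grouping-size="3">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center"><fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Chapter 1.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

The fo:page-number element itself carries no format; it picks up the formatting specified on the page-sequence that contains it.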

3.5.2 Country and Language

The page-sequence country property is specified either as the value none, which is the default, or as an ISO-3166 country specifier; for example, United States is us, United Kingdom is gb, Canada is ca, and Estonia is ee. See http://www.ietf.org/rfc/rfc3066.txt for more information.

The page-sequence language property is specified either as the value none (the default), or as an ISO-639 language code, listed at http://xml.coverpages.org/languageIdentifiers.html. This is a two-letter tag. Again, if you have programmed for web services to any extent, you will be somewhat familiar with such codes as en for English, fr for French, and et for Estonian.

The values of the language and country properties affect the formatting of the fo:block and fo:character elements, which you'll learn about in upcoming chapters. Both of the properties in combination influence hyphenation, line justification, and line breaking.
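
As a small sketch, the fragment below marks a page-sequence as British English. It assumes the page master simple from Example 3-3, and it switches on the hyphenate property on the block so that the language setting has something to act on; whether hyphenation patterns for a given language are available depends on the formatter.

```xml
<fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-reference="simple"
    language="en"
    country="gb">
  <fo:flow flow-name="xsl-region-body">
    <!-- hyphenation is requested per block; language and country guide how it is done -->
    <fo:block hyphenate="true" text-align="justify">
      Placeholder text that is long enough to be broken across several lines.
    </fo:block>
  </fo:flow>
</fo:page-sequence>
```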

[1]  I will adhere to the convention of using an fo: prefix to refer to elements in the FO, or http://www.w3.org/1999/XSL/Format namespace. You may use anything you like in place of fo:. For sake of simplicity, I will not generally use prefixes in the main narrative, except where necessary to avoid confusion (as when referring to the fo:flow element as a flow).

[2]  In this context, a sidebar refers to the content placed into region-start or region-end. Be aware that sidebars, in a more general sense, are blocks of explanatory text taken out of the normal narrative flow and visually set apart.

[3]  It is possible for a sub-sequence-specifier to match a subsequence containing zero pages.

[4]  You can find this at http://www.w3.org/TR/xslt.

XSL-FO
ISBN: 0596003552
Year: 2002
Pages: 24
Authors: Dave Pawson
