Installing and Configuring FOP


FOP requires JDK 1.3 or later. Before you can do anything with FOP, you need to download and install a build. You can obtain one from http://xml.apache.org/dist/fop, which lists the current official FOP builds. Each version of FOP is distributed in two parts: a binary distribution and a source distribution. Let’s use FOP 0.20.5 as an example. The binary distribution of FOP 0.20.5 is in a file named fop-0.20.5-bin.xxx, where xxx is either zip or tar.gz, depending on which compressed archive format you need. Typically, people on Windows use a .zip file, whereas people on Mac OS X, Linux, and UNIX of various sorts use a .tar.gz file. There are also .xxx.asc files, which are detached PGP signatures of the corresponding .xxx files. So, fop-0.20.5-bin.zip.asc contains the signature for the fop-0.20.5-bin.zip distribution file. You can use PGP and the signature file to verify that the contents of the distribution haven’t been tampered with. In addition to the binary distribution, you can download a source distribution, fop-0.20.5-src.zip, if you want to look at or modify the sources.

We’ll focus on installing the binary distribution. Once you’ve downloaded a binary distribution, unpack it with a tool that matches the archive format you chose, such as WinZip for a .zip file or tar for a .tar.gz file. Doing so creates a directory called fop-0.20.5 in either the current directory or the directory you specified to your archiving utility. The key files and directories in this directory are as follows:

  • build—A directory containing fop.jar and the FOP site documentation (in build/site).

  • conf—A directory containing the configuration files for FOP.

  • examples—The FOP sample programs.

  • lib—A directory containing the jar files FOP needs. FOP includes versions of jars that have been tested with that version of FOP. In particular, versions of Xerces, Xalan, and Batik are included in the distribution. You may have more recent versions of these libraries on your system. We’ll discuss this issue more in a moment.

  • fop.bat—A Windows batch file for running FOP from the command line.

  • fop.sh—A UNIX/Linux shell script for running FOP from the command line.

You must include all the jars in the lib directory in your Java classpath in order to use FOP in your application. There are a variety of ways to accomplish this, including setting the CLASSPATH environment variable in your DOS Command window or UNIX shell window. You can also set the CLASSPATH for the application server you’re using. The issue is complicated a bit by the fact that FOP includes copies of the Xerces, Xalan, and Batik jars. Depending on what your application does, you may need specific versions of these jars that differ from the versions provided with FOP. In most cases, you can use the jars you need, but in some cases you may run into conflicts because of bugs that have been fixed or worked around in the different versions of Xerces, Xalan, or Batik.
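As a rough illustration of the classpath requirement, the following Java sketch assembles a classpath string from an unpacked distribution. It assumes the layout described above (fop.jar in build, dependency jars in lib) and a much newer JDK than the 1.3 minimum, because it relies on java.nio.file. The class name FopClasspath and the default directory fop-0.20.5 are hypothetical choices made for this example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of FOP: collects the jars an application
// would need on its classpath from an unpacked FOP directory.
final class FopClasspath {

    // Returns build/fop.jar followed by every *.jar in lib, sorted by path.
    static List<String> jars(Path fopHome) throws IOException {
        List<String> result = new ArrayList<>();
        result.add(fopHome.resolve("build").resolve("fop.jar").toString());

        List<String> libJars = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(fopHome.resolve("lib"), "*.jar")) {
            for (Path jar : stream) {
                libJars.add(jar.toString());
            }
        }
        Collections.sort(libJars); // directory order is unspecified, so sort
        result.addAll(libJars);
        return result;
    }

    // Joins the jar paths with the platform separator (";" or ":").
    static String classpath(Path fopHome) throws IOException {
        return String.join(File.pathSeparator, jars(fopHome));
    }

    public static void main(String[] args) throws IOException {
        Path home = Paths.get(args.length > 0 ? args[0] : "fop-0.20.5");
        System.out.println(classpath(home));
    }
}
```

The printed string could be assigned to the CLASSPATH environment variable or passed to java -cp. If your application needs its own versions of Xerces, Xalan, or Batik, you would filter or reorder the list before joining it.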

Hyphenation

FOP provides hyphenation on a per-language basis. The standard FOP installation comes with support for English, Spanish, Finnish, Hungarian, Italian, Polish, Portuguese, and Russian. If you need hyphenation support for a language that isn’t in this list, you’ll need to create a new hyphenation pattern file.

FOP’s hyphenation rules use an XML-based pattern scheme based on the hyphenation rules found in TeX. TeX is a powerful document-formatting system popular in academia, particularly in math and the sciences. If TeX has a hyphenation file for the language you’re interested in, you can use it to help generate a file that FOP can use.

FOP’s hyphenation file is an XML file that looks like this:

  1: <hyphenation-info>
  2:  <hyphen-char value="-"/>
  3:  <hyphen-min before="3" after="2"/>
  4:  <classes>aA bB cC ... zZ</classes>
  5:  <exceptions>as-so-ciate present ta-ble</exceptions>
  6:  <patterns>.custom5 .di4al. a5tel.</patterns>
  7: </hyphenation-info>

The root element is <hyphenation-info>. The <hyphen-char> element is used to say which character is used as a hyphen. In this example, the character is the expected “-” character (line 2). The <hyphen-min> element specifies the minimum number of characters that must appear on a line before and after a hyphenated word break. In this example file, there must be three characters before the hyphen and two characters after the hyphen (line 3).
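To make the structure concrete, here is a minimal sketch that reads the hyphen character and the minimum counts from such a file with the standard JAXP DOM parser. It is not how FOP itself loads hyphenation files. The class name HyphenationInfoReader and the embedded SAMPLE string are assumptions made for illustration; the element and attribute names come from the listing above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical reader for the settings at the top of a hyphenation file.
final class HyphenationInfoReader {

    // A shortened sample file, mirroring the listing in the text.
    static final String SAMPLE =
          "<hyphenation-info>"
        + "<hyphen-char value=\"-\"/>"
        + "<hyphen-min before=\"3\" after=\"2\"/>"
        + "<classes>aA bB cC</classes>"
        + "<exceptions>as-so-ciate present ta-ble</exceptions>"
        + "<patterns>.custom5 .di4al. a5tel.</patterns>"
        + "</hyphenation-info>";

    // Parses the XML text and returns the <hyphenation-info> root element.
    static Element parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    // Returns the first descendant element with the given tag name.
    static Element child(Element root, String name) {
        return (Element) root.getElementsByTagName(name).item(0);
    }

    static char hyphenChar(Element root) {
        return child(root, "hyphen-char").getAttribute("value").charAt(0);
    }

    static int minBefore(Element root) {
        return Integer.parseInt(child(root, "hyphen-min").getAttribute("before"));
    }

    static int minAfter(Element root) {
        return Integer.parseInt(child(root, "hyphen-min").getAttribute("after"));
    }

    public static void main(String[] args) throws Exception {
        Element root = parse(SAMPLE);
        System.out.println("hyphen char: " + hyphenChar(root));
        System.out.println("min before: " + minBefore(root));
        System.out.println("min after: " + minAfter(root));
    }
}
```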

Lines 4-6 are shortened versions of what would appear in a real hyphenation file. They are there to show you the order in which the <classes>, <exceptions>, and <patterns> elements must appear. The <classes> element contains sets of characters separated by whitespace. The hyphenation engine treats all characters in a set as equivalent for hyphenation purposes. So, sets like aA, bB, cC, and so on tell the engine to treat the uppercase and lowercase versions of the same letter as equivalents for the purpose of hyphenation. We’ll skip the <exceptions> element for now and come back to it after we describe what the <patterns> element contains.
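One way to picture what the <classes> element does is as a lookup table that maps every character in a set to a single representative. The sketch below does that in Java. Using the first character of each set as the representative is an assumption of this example, as is the class name CharClasses; the text only says that the characters in a set are treated as equivalent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of character equivalence classes.
final class CharClasses {

    // Maps each character in a whitespace-separated set to the set's
    // first character, e.g. "aA bB" yields a->a, A->a, b->b, B->b.
    static Map<Character, Character> parse(String classesText) {
        Map<Character, Character> map = new HashMap<>();
        for (String set : classesText.trim().split("\\s+")) {
            if (set.isEmpty()) {
                continue; // empty input produces one empty token
            }
            char representative = set.charAt(0);
            for (char c : set.toCharArray()) {
                map.put(c, representative);
            }
        }
        return map;
    }

    // Rewrites a word so that equivalent characters become identical.
    // Characters that belong to no set are left unchanged.
    static String normalize(String word, Map<Character, Character> classes) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toCharArray()) {
            Character representative = classes.get(c);
            sb.append(representative != null ? representative : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Character, Character> classes = parse("aA bB cC");
        System.out.println(normalize("CAB", classes));
    }
}
```

After normalization, a pattern written in lowercase can be compared against a word regardless of how the word was capitalized.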

The <patterns> element contains pattern strings separated by whitespace. Within a pattern, each non-numeric character is a character that may appear in a word. The period (.) character represents a word boundary—either the beginning or end of a word, depending on where it appears. If a number appears in the pattern, it’s a score indicating the desirability of a hyphen at the position where the number appears. The scoring system is a bit complicated. If the number is odd, then the location is desirable for a hyphen; 5 is the score for most desirable and 1 is the score for least desirable. If the number is even, then the location is undesirable for a hyphen; 0 (which is assumed if no number is present) isn’t desirable, whereas 4 is extremely undesirable. In the sample file, the patterns say “it’s very desirable to have a hyphen after custom, when custom is the start of the word,” “it’s extremely undesirable to hyphenate dial between the i and the a,” and “if a word ends in atel, then the best place for the hyphen is between the a and the t.”
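The scoring rules can be sketched in a few lines of Java. This example follows the approach of TeX’s pattern algorithm, on which the text says FOP’s scheme is based: every pattern that matches a word contributes its numbers, the highest number at each position wins, and an odd winner permits a hyphen. That FOP resolves overlapping patterns in exactly this way is an assumption of the sketch. The class name PatternSketch is hypothetical, toLowerCase stands in for the <classes> mapping, and the <hyphen-min> limits are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of pattern scoring, not FOP's implementation.
final class PatternSketch {

    // The pattern with its digits removed, e.g. ".custom5" gives ".custom".
    static String letters(String pattern) {
        return pattern.replaceAll("[0-9]", "");
    }

    // scores[i] is the digit written before letter i of the pattern;
    // the last slot is the digit after the final letter. Missing digits are 0.
    static int[] scores(String pattern) {
        int[] scores = new int[letters(pattern).length() + 1];
        int position = 0;
        for (char c : pattern.toCharArray()) {
            if (Character.isDigit(c)) {
                scores[position] = c - '0';
            } else {
                position++;
            }
        }
        return scores;
    }

    // Returns the permitted break points, each expressed as the number of
    // word characters that precede the hyphen.
    static List<Integer> breakPoints(String word, String[] patterns) {
        String dotted = "." + word.toLowerCase() + "."; // mark word boundaries
        int[] best = new int[dotted.length() + 1];      // best[k]: gap before dotted[k]
        for (String pattern : patterns) {
            String letters = letters(pattern);
            int[] scores = scores(pattern);
            for (int at = dotted.indexOf(letters); at >= 0;
                     at = dotted.indexOf(letters, at + 1)) {
                for (int i = 0; i < scores.length; i++) {
                    best[at + i] = Math.max(best[at + i], scores[i]);
                }
            }
        }
        List<Integer> breaks = new ArrayList<>();
        // Only gaps strictly inside the word are candidates.
        for (int k = 2; k <= word.length(); k++) {
            if (best[k] % 2 == 1) {
                breaks.add(k - 1);
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        String[] patterns = {".custom5", ".di4al.", "a5tel."};
        System.out.println(breakPoints("customer", patterns));
        System.out.println(breakPoints("dial", patterns));
    }
}
```

With the three sample patterns, "customer" gets a single break point after "custom", and "dial" gets none, which matches the reading of the patterns given above.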

The <exceptions> element contains a list of words separated by whitespace. This list overrides the rules in the <patterns> element. Two kinds of entries appear in the list: words containing hyphens, which may be broken only at the points where a hyphen appears, and words containing no hyphens, which are never hyphenated. In the sample file, associate may be broken only as as-so-ciate, table only as ta-ble, and present is never broken.
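The exception list is simple enough to turn into a lookup table directly. The following sketch maps each word, with its hyphens removed, to the positions where a break is allowed; an empty list means the word must not be hyphenated. The class name ExceptionSketch and the choice of data structure are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of how an exception list could be interpreted.
final class ExceptionSketch {

    // Maps each word (hyphens removed) to its allowed break points, each
    // expressed as the number of characters that precede the hyphen.
    static Map<String, List<Integer>> parse(String exceptionsText) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String entry : exceptionsText.trim().split("\\s+")) {
            if (entry.isEmpty()) {
                continue; // empty input produces one empty token
            }
            StringBuilder word = new StringBuilder();
            List<Integer> breaks = new ArrayList<>();
            for (char c : entry.toCharArray()) {
                if (c == '-') {
                    breaks.add(word.length()); // break after the letters so far
                } else {
                    word.append(c);
                }
            }
            result.put(word.toString(), breaks);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("as-so-ciate present ta-ble"));
    }
}
```

A hyphenation routine would consult this table first and fall back to the patterns only when the word is not listed.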




Professional XML Development with Apache Tools: Xerces, Xalan, FOP, Cocoon, Axis, Xindice (Wrox Professional Guides)
ISBN: 0764543555
Year: 2003
Pages: 95
