International Features


  • Mark: In typography, a glyph for a character like a diacritic or tone mark that combines with other marks or characters. Marks can be spacing or nonspacing glyphs in a font.
  • Cursive attachment: Used when adjacent glyphs need to be positioned in order to join them cursively. It is heavily utilized in fonts that support cursive scripts like Arabic.
  • Mark attachment: Used in typography when marks need to be attached or positioned to other marks, base glyphs, or ligatures.

While TrueType worked well for fonts for languages written in a Latin script, designing fonts for languages that use complex scripts, such as Arabic and the Indic scripts, required more flexibility. OpenType Layout provides the infrastructure for laying out text in multilingual and non-Latin environments by storing the necessary layout information in five tables in OpenType fonts. (See Table 20-1.)

Table 20-1 OpenType Layout tables.

Baseline table (BASE): Makes information available about baseline offsets when composing multilingual text in a line.

Glyph Definition table (GDEF): Contains information about the type of glyphs (simple, ligature, mark) used in the GSUB and GPOS tables.

Glyph Positioning table (GPOS): Contains information about x and y positioning of glyphs; positioning can be achieved using single glyph adjustment, pair adjustment, cursive attachments, mark attachments, or contextual positioning.

Glyph Substitution table (GSUB): Contains information about a number of glyph substitution types such as single glyph substitution, substitution of alternate forms of glyphs, ligature substitutions, one-to-many substitutions (decomposing a glyph), and contextual substitutions.

Justification table (JSTF): Contains justification information for glyphs, as well as white-space adjustments and kashidas (continuous horizontal lines inserted between adjoining characters that make a word occupy more space for text-justification purposes).

Layout data in the GSUB and GPOS tables is organized in a well-defined structure consisting of script, language system, typographic feature, and lookup. Scripts, which are defined at the top level, help identify the collection of glyphs in the font that will be used to represent one or more languages. Language systems, which are defined within scripts, can modify the function or appearance of glyphs in a script for a particular language. Language systems define features, which are essentially rules for using glyphs to represent that language. Features are classified based on function into the following categories: those that influence metric behavior, those that address linguistic requirements, and those that allow typographic enhancements. Features, in turn, are implemented by using data in lookups. This data provides information about glyphs that are affected by an operation, the type of operation (substitution or positioning), and the resulting glyph output. The data is used by a text-processing client to substitute or position glyphs. Figure 20-1 depicts the structure of layout data in GSUB and GPOS tables.

Figure 20-1 Structure of layout data in GSUB and GPOS tables.
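The script, language system, feature, and lookup hierarchy described above can be modeled with a few small classes. The following is a minimal Python sketch, not the binary layout of a real GSUB or GPOS table: the class names and the lookups_for helper are hypothetical, and features are keyed directly by tag for brevity. The tags 'deva', 'HIN ' and 'rphf' are used here as the registered tags for the Devanagari script, the Hindi language system, and the Reph form feature.

```python
from dataclasses import dataclass


@dataclass
class Lookup:
    kind: str    # "substitution" or "positioning"
    rules: dict  # input glyph sequence -> output glyphs (illustrative only)


@dataclass
class LangSys:
    feature_tags: list  # tags of the features this language system uses


@dataclass
class Script:
    lang_systems: dict  # language system tag -> LangSys


@dataclass
class LayoutTable:
    scripts: dict   # script tag -> Script
    features: dict  # feature tag -> list of indices into `lookups`
    lookups: list   # every Lookup in the table

    def lookups_for(self, script_tag, lang_tag):
        """Resolve script -> language system -> features -> lookups."""
        lang_sys = self.scripts[script_tag].lang_systems[lang_tag]
        indices = []
        for tag in lang_sys.feature_tags:
            indices.extend(self.features[tag])
        # Return each referenced lookup once, in lookup-list order.
        return [self.lookups[i] for i in sorted(set(indices))]


# A one-lookup table holding the Reph rule discussed later in this section.
gsub = LayoutTable(
    scripts={"deva": Script({"HIN ": LangSys(["rphf"])})},
    features={"rphf": [0]},
    lookups=[Lookup("substitution", {("hiRaFull", "hiHalant"): ("hiReph",)})],
)
found = gsub.lookups_for("deva", "HIN ")
```

A client would walk the same path: pick the script, pick the language system, collect its features, and then run the lookups those features point to.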

Scripts, language systems, and layout features are identified using tags made of 4-byte character strings. Each tag has a specific meaning and conveys precise information to developers and text-processing applications. Clients or text-processing applications retrieve and parse information in the GPOS and GSUB tables and then use it to lay out text.
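As an illustration of the 4-byte tags, the following Python sketch packs a tag into four bytes and back. The function names encode_tag and decode_tag are hypothetical; the sketch assumes that tags are printable ASCII and that tags shorter than four characters are padded with trailing spaces (as in 'HIN ' for Hindi).

```python
def encode_tag(tag: str) -> bytes:
    """Pack a tag into exactly 4 bytes, padding short tags with spaces."""
    if not 1 <= len(tag) <= 4:
        raise ValueError("a tag has between 1 and 4 characters")
    raw = tag.ljust(4).encode("ascii")  # raises for non-ASCII characters
    if any(b < 0x20 or b > 0x7E for b in raw):
        raise ValueError("tag characters must be printable ASCII")
    return raw


def decode_tag(raw: bytes) -> str:
    """Turn 4 bytes read from a font back into a tag string."""
    if len(raw) != 4:
        raise ValueError("a tag is exactly 4 bytes long")
    return raw.decode("ascii")


script_tag = encode_tag("deva")   # Devanagari script
lang_tag = encode_tag("HIN")      # Hindi language system, padded to 'HIN '
# A tag can also be compared as one 32-bit big-endian integer.
script_as_int = int.from_bytes(script_tag, "big")
```

Because every tag has the same fixed width, a client can compare tags as plain integers instead of comparing strings.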

There are times when a glyph needs to be substituted with an alternative glyph depending on the particular context involved. As mentioned, the GSUB table contains information on glyph substitution types, alternate forms, and so on. The following section examines a lookup from a GSUB table that is used for glyph substitution.

Substitution of Alternative Glyph Forms

The following code, from a GSUB table in a Devanagari font, is a lookup that replaces the glyph sequence hiRaFull hiHalant with hiReph. The hiRaFull glyph contains outlines for the consonant "Ra" and is mapped to U+0930. The hiHalant glyph contains outlines for Halant (or Virama), and is mapped to U+094D. The Halant (or Virama) is a vowel omission sign whose purpose is to cancel the inherent vowel of the consonant to which it is applied. The hiReph glyph contains outlines for the Reph, an orthographic abbreviation for displaying the sequence of "Ra" Halant (Virama) when a syllable in Devanagari begins with these characters. The substitution of the Reph is a linguistic feature that is required in this script.

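 Lookup LookupReph
 4                  ; Lookup Type = ligature subst
 0                  ; LookupFlag = 0
 1                  ; Number of subtables
 SubstTableReph     ; Offset to subtable

 LigatureSubstFormat1 SubstTableReph
 1                  ; Format
 CoverageReph       ; Offset to Coverage
 1                  ; Ligature Set Count
 LigSetRephRaFull   ; Offset to the ligature set for hiRaFull

 LigatureSet LigSetRephRaFull
 1                  ; Number of ligatures in this set
 LigReph            ; Offset to Ligature table

 Ligature LigReph
 hiReph             ; Output ligature glyph
 2                  ; Number of components, including the first
 hiHalant           ; Second component (the first is the covered glyph)

 CoverageFormat1 CoverageReph
 1                  ; Coverage Format
 1                  ; Number of glyphs in coverage
 hiRaFull           ; First component of the ligature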

Figure 20-2 illustrates the substitution rule shown in the previous lookup code.

Figure 20-2 Substitution rule in the GSUB lookup code.
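To make the effect of this lookup concrete, the following Python sketch applies a ligature rule of the same shape to a list of glyph names. It is a simplified model rather than a font parser: the function apply_ligature_lookup and the glyph name hiKa are hypothetical, and the LookupFlag is ignored. The dictionary mirrors the listing: the key plays the role of the Coverage table (first component), and each entry lists the remaining components and the resulting ligature glyph.

```python
def apply_ligature_lookup(glyphs, ligature_sets):
    """Replace matching glyph sequences with their ligature glyph.

    ligature_sets maps a first glyph to a list of
    (remaining components, ligature glyph) pairs, tried in order.
    """
    out = []
    i = 0
    while i < len(glyphs):
        for rest, ligature in ligature_sets.get(glyphs[i], []):
            end = i + 1 + len(rest)
            if tuple(glyphs[i + 1:end]) == tuple(rest):
                out.append(ligature)  # whole sequence becomes one glyph
                i = end
                break
        else:
            out.append(glyphs[i])     # no rule matched; keep the glyph
            i += 1
    return out


# The Reph rule from the lookup: hiRaFull + hiHalant -> hiReph.
REPH_LOOKUP = {"hiRaFull": [(("hiHalant",), "hiReph")]}

shaped = apply_ligature_lookup(["hiRaFull", "hiHalant", "hiKa"], REPH_LOOKUP)
unchanged = apply_ligature_lookup(["hiRaFull", "hiKa"], REPH_LOOKUP)
```

In this sketch, shaped holds hiReph followed by hiKa, while unchanged keeps its input because hiRaFull is not followed by hiHalant.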

Up until now, you have seen some of the advantages that OpenType provides, as well as how OpenType is able to handle layout, substitutions, and so on. However, what enables OpenType to offer capabilities such as the shaping and positioning of glyph strings or the reordering of characters? The following section explores this question.

How OpenType Works

OpenType fonts for international complex scripts rely on Windows operating-system services like Uniscribe (also known as the "Unicode Script Processor") to shape and then position the glyph strings to which character strings are mapped. Uniscribe, in turn, uses the OpenType Layout services library, which is a set of helper functions. These functions retrieve glyph substitution and positioning data in a font and lay out the glyph string during text processing.

Uniscribe is a collection of application programming interfaces (APIs) that enables a text-layout client to format complex scripts. Within Uniscribe are multiple shaping engines that contain layout knowledge for scripts such as Arabic, Devanagari, Hebrew, and Thai, some of which require characters to be reordered within character "clusters." Uniscribe creates and manages a buffer of appropriately reordered character codes. It then obtains the corresponding glyph string by passing the reordered character string to the glyph substitution function of the OpenType Layout services. Because glyph strings are obtained from reordered character strings, the layout features in such fonts are encoded to map reordered characters (and combinations of characters) to their corresponding glyphs. Because Uniscribe performs the standard character-reordering operations, font developers are spared several layers of complexity when they define features. If a glyph string needs to be both substituted and positioned, substitution is always done before any positioning operation. Figure 20-3 illustrates the processing of a sample complex-script string and shows how the substitution rule is applied.

Figure 20-3 Using the substitution lookup in a Devanagari string.
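The ordering rule (substitution first, positioning second) can be sketched as a small pipeline. This is an illustrative Python model, not the Uniscribe API or the OpenType Layout services library: all function names are hypothetical, the advance widths are made-up numbers, and the character-reordering step performed by the shaping engine is omitted.

```python
# Character-to-glyph mapping for the two code points named in the text.
CMAP = {0x0930: "hiRaFull", 0x094D: "hiHalant"}
# The Reph substitution rule, keyed by the full input sequence.
LIGATURES = {("hiRaFull", "hiHalant"): "hiReph"}
# Made-up advance widths in font units, for illustration only.
ADVANCES = {"hiRaFull": 500, "hiHalant": 0, "hiReph": 120}


def map_to_glyphs(text):
    """Map each character to its default glyph."""
    return [CMAP[ord(ch)] for ch in text]


def substitute(glyphs):
    """Apply two-glyph ligature rules from left to right."""
    out, i = [], 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def position(glyphs):
    """Assign an x offset to each glyph from its advance width."""
    x, placed = 0, []
    for glyph in glyphs:
        placed.append((glyph, x))
        x += ADVANCES[glyph]
    return placed


def lay_out(text):
    # Substitution runs before positioning, so positions are computed
    # for the final glyphs (hiReph), not the original ones.
    return position(substitute(map_to_glyphs(text)))


result = lay_out("\u0930\u094D")
```

Running positioning first would compute offsets for hiRaFull and hiHalant, which no longer exist once the ligature has replaced them; that is the practical reason for the fixed order.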

Specifications for creating a wide range of OpenType fonts used in complex scripts are available on the Microsoft Typography site.

Other tables that provide pointers to scripts supported in a font are the cmap and OS/2 tables. However, the information these tables give is not used for OpenType Layout purposes.

Microsoft Corporation - Developing International Software
ISBN: 0735615837
Year: 2003
Pages: 198