The Text Processing Pipeline


The text processing pipeline on Mac OS X begins with a block of text, often one typed in by the user, and ends with that text being drawn into a graphics context. Along the way the computer has to work with the writing system that generated the text, the way the text is represented, and visual attributes like fonts and styles.

Storing Text

The first step in rendering text is storing the text itself. Computers take written language and break it down into a series of characters. Each character corresponds to a specific symbol or idea in a particular writing system. For example, when writing English text, the uppercase letter "A," the number three, and punctuation marks like periods and question marks are all characters. There is no way to draw a character; characters represent conceptual elements of the writing system. The computer stores a series of characters in memory using an encoding scheme. ASCII was an encoding scheme for representing the characters used to write English text in early computers. Modern computers use a character encoding scheme called Unicode. By using Unicode, Mac OS X gains the ability to represent and store most of the languages of the world in a single encoding.
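The distinction between a character and its stored form can be illustrated with a short Python sketch. The helper names code_points and stored_bytes are invented for this example and are not part of any Mac OS X API; the sketch only shows that the same abstract characters become different byte sequences under different encoding schemes.

```python
# Illustrative sketch: characters are abstract; an encoding scheme
# decides which bytes actually end up in memory.
# The helper names below are hypothetical, chosen for this example.

def code_points(text):
    """Return the Unicode code point (a number) of each character."""
    return [ord(ch) for ch in text]

def stored_bytes(text, encoding):
    """Return the bytes used to store the text under the given encoding."""
    return list(text.encode(encoding))

sample = "A3?"  # an uppercase letter, a digit, and a punctuation mark

print(code_points(sample))                 # one number per character
print(stored_bytes(sample, "ascii"))       # one byte per character
print(stored_bytes(sample, "utf-16-be"))   # two bytes per character here
print(stored_bytes("\u00e9", "utf-8"))     # e with acute accent: two bytes
```

For plain English text the ASCII bytes and the Unicode code points coincide, which is why the first two lines print the same numbers. The accented letter in the last line has no ASCII representation at all, so encoding it as ASCII would raise an error.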

If you want to learn more about Unicode and the various ways computers use Unicode to represent text, you can visit the Unicode Consortium web site at

http://www.Unicode.org/


Representing Text Visually

When the computer wants to draw a Unicode string, it has to come up with a group of pictures to draw that represent the characters. These pictures are called glyphs. Figure 11.1 shows several different glyphs that all represent the character for the number 7.

Figure 11.1. Seven Glyphs Representing the Number Seven


In the figure, each of the glyphs came from a different font. A font is simply a collection of related glyphs that the computer uses to draw some characters. Usually all the glyphs in a font will have similar drawing characteristics. The people who design fonts also collect individual fonts together into font families, and each font in a family will often differ from the others by a particular stylistic variation. For example, the Times font family traditionally has four variations or styles. The first is often called the "plain" or "normal" style of the font (it is also called the regular style). Other fonts in the family include a boldfaced variation (a.k.a. bold), an italic variation, and a version that is both bold and italic.

Developers familiar with QuickDraw who are moving to more modern text drawing APIs may find themselves confused by the terminology surrounding fonts. In the QuickDraw model, the computer works with "fonts" and "styles." These terms correspond roughly to "families" and "fonts," respectively, in modern text systems. QuickDraw had the ability to synthesize some styles, such as bold and italic, from other font information. If a font had additional variations, like an ornaments version or a book style, the computer had to create artificial font names and synthetic styles to try to represent them.

The modern APIs use the families and fonts directly. A particular font family can contain fonts in hundreds of variations like ornaments, black, medium, book, and condensed or extended. This allows the computer to present the fonts as the original typeface designer created them and in some ways simplifies the user interface because the fonts you see are the fonts you have on your computer.


Using fonts and other text drawing information, the computer creates a mapping from the characters in the string to the glyphs used to draw text. This mapping itself can be quite complex. Depending on the writing system and the contents of the string, the mapping between glyphs and characters may not be one-to-one. A particular string might require more glyphs than characters in one font and fewer glyphs than characters in another. This is the situation demonstrated in Figure 11.2.

Figure 11.2. Text Sample with Ligatures


The first word of Figure 11.2 consists of three glyphs. The first glyph in the figure is a ligature. It is a single glyph that represents a combination of the "f" and "i" characters. The second word also contains a ligature for the sequence of two "f" characters followed by an "l." Ligatures are a typographer's tool used to improve the look of some text. The point to take away from this example is that the number of glyphs does not always match the number of characters, adding to the complexity of displaying text.
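A toy model makes the mismatch between characters and glyphs concrete. The following Python sketch is not how ATSUI or any real font selects glyphs; the shape function, the ligature table, the glyph names, and the sample words "fish" and "waffle" are all assumptions made for illustration. It simply replaces known character sequences with a single glyph name, trying longer sequences first.

```python
# Toy character-to-glyph mapping. Real fonts carry their own substitution
# tables; this hypothetical table stands in for one font's ligatures.
FONT_WITH_LIGATURES = {"fi": "f_i", "ffl": "f_f_l"}
FONT_WITHOUT_LIGATURES = {}

def shape(text, ligatures):
    """Map a string of characters to a list of glyph names."""
    glyphs = []
    i = 0
    # Try longer sequences first so a longer ligature wins over a shorter one.
    sequences = sorted(ligatures, key=len, reverse=True)
    while i < len(text):
        for seq in sequences:
            if text.startswith(seq, i):
                glyphs.append(ligatures[seq])  # several characters, one glyph
                i += len(seq)
                break
        else:
            glyphs.append(text[i])  # no ligature applies: one glyph per character
            i += 1
    return glyphs

for word in ("fish", "waffle"):
    print(word, len(word), shape(word, FONT_WITH_LIGATURES))
    print(word, len(word), shape(word, FONT_WITHOUT_LIGATURES))
```

With the ligature table, the four characters of "fish" become three glyphs and the six characters of "waffle" become four. With the empty table, each word produces exactly one glyph per character, so the glyph count depends on the font as well as on the string.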

Once the text system has selected glyphs, it arranges the glyphs on a line. During this layout phase, the computer takes into account features like tracking and kerning that affect the spacing of glyphs. It also considers text attributes like the writing direction when laying out the glyphs on the line. If the system is trying to put the text on a line with a fixed width, the computer determines how much text will fit on each line and wraps the text as necessary. With the wrapping done, the layout engine can justify each line within its drawing space if necessary.
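The wrapping and justification steps can be sketched in a few lines of Python. This is a deliberately simplified model, not the algorithm a real layout engine uses: the advance widths are invented numbers, kerning, tracking, and writing direction are ignored, and the function names measure, wrap, and extra_gap are hypothetical.

```python
# Hypothetical advance widths in arbitrary units: a space is 5 units wide
# and every other glyph is 10. Real fonts supply per-glyph advances.
SPACE_ADVANCE = 5
DEFAULT_ADVANCE = 10

def measure(text):
    """Width of a run of text: the sum of its glyph advances."""
    return sum(SPACE_ADVANCE if ch == " " else DEFAULT_ADVANCE for ch in text)

def wrap(text, line_width):
    """Greedy wrapping: keep adding words until the next one will not fit."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and measure(candidate) > line_width:
            lines.append(current)  # the line is full; start a new one
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines

def extra_gap(line, line_width):
    """Extra space to add to each inter-word gap to justify the line."""
    gaps = line.count(" ")
    if gaps == 0:
        return 0.0
    return (line_width - measure(line)) / gaps

lines = wrap("draw the glyphs on a line", 100)
for line in lines:
    print(repr(line), measure(line), extra_gap(line, 100))
```

With a line width of 100 units, the sample text breaks into three lines. The first line measures 75 units, so justifying it means adding 25 units to its single gap; the second line already fills the width exactly.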

This description of the text layout process has been greatly simplified but hits many of the high points. Overall, in order to draw text, the computer has to combine Unicode text with text style information, including fonts, to select a series of glyphs that represent the text. The computer then arranges those glyphs according to very specific rules. Once this is done, the computer can draw the glyphs on the output device. It is only this last part, drawing the glyphs on the output device, that is accomplished with Quartz 2D.

ATSUI and Cocoa Text

ATSUI is an acronym for "Apple Type Services for Unicode Imaging." It is the fundamental technology that manages the text rendering pipeline in Mac OS X. The Cocoa application frameworks include a number of classes, built on top of ATSUI and Quartz 2D, that are collectively referred to as the Cocoa Text system. These classes present the text rendering pipeline with concepts and idioms that will be familiar to Cocoa application developers. Any application that wants to take full advantage of the text rendering capabilities of Mac OS X will need to go through ATSUI or Cocoa Text.

In addition to ATSUI and Cocoa Text, the application frameworks contain a number of other programming interfaces and systems for drawing text. Generally these routines allow you to avoid the full complexity of the rendering pipeline when all you want to do is draw a simple text string quickly. For example, the HIToolbox of the Carbon framework includes routines like HIThemeDrawTextBox. This routine uses ATSUI to draw text, usually in some variation of the system font, for use on custom controls. Similarly, the NSString class in Cocoa has an extension with methods like drawAtPoint:withAttributes:, which allow you to draw a single string easily.

The convenience routines that the application frameworks provide shield your code from a lot of the complexity involved in setting up the text drawing pipeline. Unfortunately, however, they also must set up and tear down the text drawing pipeline on your behalf each time they are called. As a result, these routines are not the most efficient way to draw a great deal of text. If your profiling reveals that text drawing is a performance bottleneck, you might be able to improve the situation by taking more responsibility for the text pipeline.


Both ATSUI and Cocoa Text are feature-rich software systems, worthy of entire books of their own. We won't be able to cover them in depth here. The following links may help you to explore those technologies and incorporate them into your application.

The book "Rendering Unicode Text With ATSUI" is an excellent introduction to the text rendering pipeline. The chapter on Typography Concepts, with its section on Text Layout, is particularly interesting if you want to know more about the glyph layout process. The text also introduces much of the terminology that surrounds fonts and text layout. Understanding this terminology can be an important first step to helping you understand text processing in general. You can read this book on Apple's developer documentation site:

http://developer.apple.com/documentation/Carbon/Conceptual/ATSUI_Concepts/index.html

Cocoa developers may want to look at the first few chapters of the book just mentioned. A good overview of the Cocoa Text system is given in the Cocoa Text System Architecture documentation at

http://developer.apple.com/documentation/Cocoa/Conceptual/TextArchitecture/index.html

A particularly interesting topic to read in this document is "Assembling the Text System by Hand." Even if you never need to individually put together the different classes of the Cocoa text system, reading about how to do it may help you understand how Cocoa works with text.

If your Cocoa application has modest text rendering needs or if you want a very detailed explanation of the Cocoa text rendering process, you should read the Text Layout Programming Guide at

http://developer.apple.com/documentation/Cocoa/Conceptual/TextLayout/index.html

This document is also particularly interesting if you are working with the Cocoa Text system and want to ensure that you are getting the best possible performance.




Quartz 2D Graphics for Mac OS X Developers
ISBN: 0321336631
Year: 2006
Pages: 100
