1.9. Working with Fonts

In a word processor like Microsoft Word, it is deceptively simple to change the overall font, or the font of some particular piece of text. You can paint a piece of text with the mouse and select a font for it from a drop-down menu. In web authoring, it is not much more difficult, especially if you use authoring software that resembles a word processor. However, things become difficult if the chosen font does not contain all the characters you need.

Figure 1-19. The "Languages" settings in the Opera browser


Each computer system is shipped with some repertoire of fonts, which may be insufficient for working with a large character repertoire even if the system is basically "Unicode enabled."


1.9.1. Installing Additional Support

For example, a typical Windows system might not have any font that is rich enough to present all the characters you need. Unfortunately, Windows has often been preinstalled without full "multilingual support." You may therefore need to install additional fonts.

On Windows XP, you would do this as follows:

  1. Select Start → Control Panel → Regional and Language Options.

On older Windows systems, you may need to select Control Panel → Add/Remove Programs, click on Multilanguage Support, and then Details. Make sure a checkmark appears beside the language or languages you want to use, and then click on OK.

Support for many languages, for different versions of Windows, is available on the Windows Update site http://windowsupdate.microsoft.com. The site also contains important security updates. However, even if your computer has been configured to download and install security updates automatically, this does not cover the extra language support. You need to download and install it separately.

If you have installed MS Office, you have probably got some important additional fonts, such as Arial Unicode MS, which is not a complete Unicode font but is rather extensive (though it exists in different versions). However, it is possible that this font was not included when MS Office was installed; in that case, you may need to install it separately from the MS Office CD.

There are some additional instructions for installing fonts in a few environments on the page "Display Problems?" at http://www.Unicode.org/help/display_problems.html.

As a quick check, access http://www.Unicode.org/standard/WhatIsUnicode.html, which contains the document "What is Unicode?" in English but also, under the heading "Translations," links to versions of the document in many other languages. Do the link texts look meaningful (though perhaps all Greek or all Hebrew to you), or are there boxes or question marks that look like symbols of unrepresentable characters? This test is best carried out using a Mozilla or Opera browser rather than Internet Explorer 6, which will only use characters in the currently selected font. Note that it is rare, these days, to be able to see all the link texts there properly, since some of them contain characters that are not present in relatively large fonts. Figure 1-20 shows the test page viewed on Opera, on a system where the font support is relatively good.

Of course, installing additional fonts and language support on your computer does not make documents created by you behave any better on other computers. If you had to install something extra in order to type some Chinese, the odds are that if you send a Chinese document you composed to your neighbor, he might not see it properly without installing something extra, too. We cannot expect all, or even most, computers to be able to display the full Unicode repertoire, or even anything close to it.

The situation is expected to change in time in the sense that new systems will have a few complete Unicode fonts installed. Additional fonts may still be needed for typographic reasons. In general, any font that has a large character repertoire cannot be typographically optimal for any particular writing system. The font needs to have many characters that are distinguishable from each other, and this imposes restrictions on the design of characters.

1.9.2. Font Support in Web Browsers

There is a major difference between Internet Explorer (at least up to Version 6) and more modern browsers. IE basically uses a single font for a piece of text, as specified on the page itself or in the browser settings. If the text contains a character that is not present in the font, IE shows just a small rectangular box to indicate the lack of glyph, as shown in Figure 1-21. Firefox, for example, is capable of picking up glyphs from different fonts, if the primary font is not sufficient for all characters.

Figure 1-20. A page with links containing characters from several languages, therefore suitable for testing font support

Figure 1-21. A word with a special character in Internet Explorer

When a browser, or other program, uses glyphs from different fonts, the situation is not as happy as you might think. The problem is that a font typically has a distinctive style and flavor, and mixing fonts often produces typographically poor results. This is illustrated in Figure 1-22, where a Romanian word containing the letter "t" with a comma below is rendered using a different font for that character. (In practice, the letter "t" with a comma below is almost always replaced by the letter "t" with a cedilla, which is much better supported in fonts, making it possible to present words like "Constanţa" in a typographically suitable way.)

Figure 1-22. A word with a special character in Firefox
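
The replacement of "t" with a comma below by "t" with a cedilla can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name prefer_cedilla is hypothetical, only the letter "t" is handled (as in the text above), and the substitution changes the identity of the character, so it trades correctness for font support.

```python
import unicodedata

# Map the comma-below forms of t to the cedilla forms that fonts are
# more likely to contain. Only the letter t is handled here.
COMMA_TO_CEDILLA = str.maketrans({
    "\u021B": "\u0163",  # small t with comma below -> small t with cedilla
    "\u021A": "\u0162",  # capital T with comma below -> capital T with cedilla
})


def prefer_cedilla(text: str) -> str:
    """Replace t-with-comma-below by t-with-cedilla (hypothetical helper)."""
    return text.translate(COMMA_TO_CEDILLA)


word = "Constan\u021Ba"
replaced = prefer_cedilla(word)

# Show which character sits at the affected position before and after.
for ch in (word[7], replaced[7]):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The two characters look almost identical in many fonts, which is why printing the code point and the Unicode name is a more reliable check than inspecting the rendered text.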


1.9.3. Font Substitution: a Solution and a Problem

The font problem discussed above appears in contexts other than web pages, too. In composing a text in a word processor, you may have decided (or someone may have decided for you) on the fonts to be used. It may well happen that you need a character that does not appear in the font you use, and you need to pick it up from another font. A program might do this for you, by automatically switching to a substitute font.

The presentation of special symbols like in a different font need not have drastic effects, though it may cause uneven line spacing. Font changes inside a word are often much worse. Thus, it is best to design the use of fonts so that the primary font is sufficient at least for all

When using fonts like Arial Unicode MS or Geneva as "backup fonts," you might run into problems with italicized or bold text. The reason is that some fonts exist in one version only, not in italics or bold versions. Many programs still display them in italics or in bold by "faking" the typographic features by modifying the shapes of characters. However, some programs don't do this, and those that do may produce typographically poor results. Thus, try to limit the use of "backup fonts" to nonstylized copy text.

When deciding on a font, you should use some test files that contain both typical text and some less typical "exotic" characters that may appear in actual documents. The Common Locale Data Repository (CLDR), discussed in Chapter 11, contains lists of "exemplary characters" for different languages. These lists include both characters normally used in the language and characters that often appear in names in texts in that language, due to cultural connections. For example, text in Spanish may well contain names in Portuguese or French. Therefore, a good test file for Spanish contains more than just Spanish characters.
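
A test file of this kind can be assembled and checked mechanically. The following Python sketch assumes that a font's repertoire is available as a plain set of characters. The character sets, the sample sentence, and the function names are all made up for illustration; real lists of exemplary characters would come from CLDR, and real repertoires from the font files themselves.

```python
import string

# Hypothetical samples; real lists would come from CLDR exemplar data.
SPANISH = set("áéíóúüñ¿¡")
NEIGHBOURS = set("ãõçàâêèëîïôœù")  # Portuguese and French letters seen in names


def build_test_text(base_text, *extra_sets):
    """Append every extra character that base_text does not already contain."""
    extras = set().union(*extra_sets) - set(base_text)
    return base_text + " " + "".join(sorted(extras))


def missing_characters(text, repertoire):
    """Return the non-space characters of text that the repertoire lacks."""
    return sorted(ch for ch in set(text) if not ch.isspace() and ch not in repertoire)


# A made-up font repertoire that covers Spanish but nothing beyond it.
font = set(string.ascii_letters + string.punctuation) | SPANISH
sample = build_test_text("El niño João viajó a Besançon.", SPANISH, NEIGHBOURS)
print(missing_characters(sample, font))
```

With this made-up repertoire, the report lists the Portuguese and French letters but none of the Spanish ones, which is exactly the kind of gap that a test file limited to Spanish characters would not reveal.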

Moreover, depending on the topic areas of texts, various special symbols will be needed. If you design the use of fonts in a publication on technology, you should probably pay attention to the availability of technical symbols such as µ and ⌀ in the fonts. That way, you can avoid embarrassing problems that you might encounter when you have selected a font but later find out that it lacks some essential characters. For example, the site http://www.fileformat.info/info/Unicode/char/ contains tools for finding fonts (within some repertoire of fonts) that support a particular character. Figure 1-23 shows an example of such information, which indicates here that support for the diameter sign is rather limited, despite the rather common use of this character in technical contexts. Among the fonts listed, Arial Unicode MS is the most realistic alternative, since it is shipped with MS Office products, though not always installed along with them.

Figure 1-23. Information on font support for a character
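
Before looking up font support for a symbol, it helps to know its exact code point and Unicode name, since similar-looking symbols are easily confused. A minimal Python sketch using the standard unicodedata module follows; the helper name describe is an assumption for illustration.

```python
import unicodedata


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME' using the Unicode character database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


# The micro sign and the diameter sign, two common technical symbols.
for ch in "\u00B5\u2300":
    print(describe(ch))
```

This prints U+00B5 MICRO SIGN and U+2300 DIAMETER SIGN, the identifiers you would use when searching a character database.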

Font problems are one reason why the common use of "Lorem ipsum" texts (i.e., meaningless pseudo-Latin filler texts) in visual design is not such a good idea. Those texts seldom contain anything but the basic Latin letters and a few punctuation characters. It is safer to play with more realistic texts with a richer character repertoire. This does not mean that you need to reject all fonts that do not contain all the characters you might imaginably need in the texts. Rather, the suggested testing ensures that the most important characters work well, and you can prepare for possible problems with less common characters.

1.9.4. Printer Fonts

When a document is printed, the fonts used in it may need to be replaced by printer fonts. It may well happen that the document looks fine on screen but some characters are lost or distorted in printing. The situation may vary by printer. A printer may use a font shipped with the printer itself (on ROM), or a font that has been separately installed into it, or a downloaded font, i.e., a font sent by the program to the printer.

It often happens that a document with special characters looks good on screen, but some characters are wrong when the document is printed.


Therefore, you may need to test how your fonts print, especially before making an important decision on using particular fonts in publications. Make sure that your test file is extensive enough to cover even less common characters. Typical printer test pages contain a relatively limited repertoire of characters.

Figure 1-24. Text samples in some large-repertoire fonts


1.9.5. Finding Fonts

Typographically good fonts are usually commercial products, sold either as such or packaged into text-processing, publishing, or other programs. However, there are some fonts with large character repertoires available as freeware or shareware, and they can be useful for special applications or as general "backup fonts." The following fonts are illustrated in Figure 1-24; the rectangular boxes indicate a lack of glyph in the font.


Doulos SIL (http://scripts.sil.org/DoulosSILfont)

A free font family that contains a large repertoire that is suitable for almost any text based on a Latin or Cyrillic script. It also contains a rich set of phonetic symbols and is therefore useful for linguistics.


Code2000, Code2001, Code2002 (http://home.att.net/~jameskass/)

Large shareware fonts. Often used as the ultimate backup because of their coverage, but not typographically suitable for normal use.


Everson Mono (http://www.evertype.com/emono/)

A simple, monospace font, which is legible even in rather small size. Shareware.

For many additional fonts, please refer to http://www.alanwood.net/Unicode/fonts.html.

Installing new fonts is typically easy. On Windows, having downloaded a font, you can open the Control Panel via the Start menu, open the Fonts folder, and select File → Install New Font. Then find the folder where you saved the downloaded font; the font will appear in a list, from which you can select it for installation. After installation, you can check that the font is available by opening a program's font menu.

1.9.6. Fonts in Web Authoring

Originally, web pages had no font information; each browser used its own font. Soon after, <font> tags were introduced and gained popularity among web authors. This meant a seemingly simple way to specify a font: <font face="Arial">Hello world</font>, or perhaps with face="Arial,Helvetica,Geneva" to provide a list of alternatives, because not all browsers have a font named Arial. However, this approach is inflexible and makes things difficult to modify and maintain.

The more modern approach is to specify font usage separately from the HTML markup through the use of Cascading Style Sheets (CSS). For example, you could specify the font for an entire page in CSS as follows: body { font-family: Arial, Helvetica, Geneva; }. Modern browsers effectively turn <font> tags into CSS rules internally.

1.9.6.1. The fallback problem

Since many fonts used on computers have rather small character repertoires, the question arises what to do if your document contains more or less "unusual" characters. You might wish to use, say, the Arial font in general, but you may need other fonts for some special characters.

Recommendations by the World Wide Web Consortium (W3C) suggest that an author include a generic font name as the last name in a font list. These names (serif, sans-serif, monospace, cursive, and fantasy) correspond to broad classes of fonts. The idea is that normally the font list consists of fonts of the same class, and the last item there would effectively tell the browser to use some font of the class, if it cannot find any of the specific fonts listed. A typical example would thus be body { font-family: Arial, Helvetica, Geneva, sans-serif; }.
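
If font lists are generated by a script, the recommendation can be enforced in code. The Python sketch below is hypothetical (the function name font_family_rule and its parameters are not from any library): it builds a rule that ends with a generic family name, and it quotes specific font names containing spaces, which CSS permits and generally recommends.

```python
GENERIC_FAMILIES = {"serif", "sans-serif", "monospace", "cursive", "fantasy"}


def font_family_rule(selector, fonts, generic):
    """Build a CSS rule whose font list ends with a generic family name."""
    if generic not in GENERIC_FAMILIES:
        raise ValueError(f"not a generic font family: {generic!r}")
    # Quote specific names that contain spaces; leave simple names bare.
    names = [f'"{name}"' if " " in name else name for name in fonts]
    # Generic names are keywords and must stay unquoted.
    names.append(generic)
    return f"{selector} {{ font-family: {', '.join(names)}; }}"


print(font_family_rule("body", ["Arial", "Helvetica", "Geneva"], "sans-serif"))
```

For the list used in the text, this prints body { font-family: Arial, Helvetica, Geneva, sans-serif; }.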

In principle, generic font family names are supposed to be mapped to typical fonts in a category. Thus, sans-serif would not mean any sans-serif font that a browser may have but the most typical representative of the class of sans-serif fonts. However, typical fonts tend to be old fonts, and old fonts tend to have relatively small character repertoires.

The CSS font model (font-matching algorithm) is based on the idea of determining (at the conceptual level at least) the font individually for each character. If none of the fonts declared contains a glyph for the character, the browser is supposed to use a browser-dependent default list of fonts. This is how, for example, the Opera and Firefox browsers work. The problem is with IE, which more or less just uses the declared font-family value and picks up the first font there that is available on the system, without checking whether it contains all the characters that appear in the text.

This means that with the above-mentioned CSS rule, a piece of text containing, for example, the diameter sign ⌀ (U+2300) will contain a small rectangle in place of the character on IE. The browser sees the name Arial in the font list and uses it, since a font with that name exists. Characters not present in that font will then be replaced by a symbol for a missing character. On Opera and Firefox, the lack of the character is detected and the list of fonts is scanned further. Here, Helvetica and Geneva are probably not of use, whereas the font used as a generic sans-serif font might contain the character. Even if it does not, the browser proceeds to its internal list of fallback fonts. In effect, if any font on the system contains the character, it will be shown. This often results in a typographically poor rendering, but at least the character is displayed.
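
The difference between the two behaviors can be simulated in Python. This is a simplified model, not a description of how any browser is implemented: fonts are represented as plain sets of characters, the installed fonts and their repertoires are invented, the mapping of the generic sans-serif name is left out, and the font name "Hypothetical Symbols" stands in for a browser's internal fallback list.

```python
import string

MISSING = "\u25A1"  # WHITE SQUARE, standing in for the "missing glyph" box


def render_single_font(text, font_list, installed):
    """Single-font behavior as the text describes for IE 6.

    The first listed font that is installed is used for every character,
    without checking whether it actually contains the character.
    """
    primary = next((name for name in font_list if name in installed), None)
    repertoire = installed.get(primary, set())
    return [(ch if ch in repertoire else MISSING, primary) for ch in text]


def render_per_character(text, font_list, installed, fallbacks):
    """Per-character matching: declared fonts first, then browser fallbacks."""
    candidates = list(font_list) + list(fallbacks)
    rendered = []
    for ch in text:
        font = next((n for n in candidates if ch in installed.get(n, set())), None)
        rendered.append((ch if font else MISSING, font))
    return rendered


# Hypothetical system: the repertoires below are made up for illustration.
installed = {
    "Arial": set(string.ascii_letters + string.digits + " "),
    "Hypothetical Symbols": {"\u2300"},
}
declared = ["Arial", "Helvetica", "Geneva"]
text = "\u2300 25"

print(render_single_font(text, declared, installed))
print(render_per_character(text, declared, installed, ["Hypothetical Symbols"]))
```

In the first result the diameter sign is replaced by the box and attributed to Arial; in the second it is kept but drawn from a different font than the digits, which is the typographic mismatch discussed earlier.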

The practical conclusion is that for web authoring, the font names in a font-family list should be chosen so that each of them contains all the characters that appear in the text of an element. At least as long as IE Version 6 is widely used, we should avoid relying on the defined fallback mechanism.

Generally, if a document or part of a document contains characters that do not appear in commonly available fonts, there are two things you can do. You can specify no font, leaving it to users to select the best font they can in their browsers. Or, according to another school of thought, you can try to identify some fonts so that any of them alone is sufficient for all of your characters. This typically means declarations like font-family: Arial Unicode MS, Lucida Sans Unicode. The choice of fonts cannot be an exact science, since fonts with exactly the same name (such as Arial Unicode MS) often exist in different versions, with different character repertoires.

1.9.6.2. Effects of browser settings

As a user of a browser such as Opera or Firefox, you can affect the default fonts used by the browser. You can define what fonts the generic font names serif, sans-serif, etc. map to. You can also specify which default font is used for texts for which no font suggestion is made on a page. The latter possibility exists on IE as well, and it typically contains settings for different writing systems. That is, you can select one font for texts in Latin letters, another font for texts in Cyrillic letters, etc. The details can be somewhat obscure and depend on the browser, but the point is that there can be three kinds of "default fonts" in a browser:

  • The font used (when possible) for a character for which no font information is given on a web page. This font typically depends on the writing system that the character belongs to. Note that a web page very often specifies the overall font to be used, and in that case, this setting has no effect.

  • The font used when a page specifies a generic font name.

  • The internal list of fonts to be used when everything else fails. Typically, this cannot be changed in normal browser settings but only through a configuration file.

For example, in Firefox, you can select Tools → Options → General → Languages to open a "Fonts & Colors" settings window like the one in Figure 1-25. It lets you specify, for different writing systems, whether a sans serif or a serif font is used by default and what the default serif, sans serif, and monospace (teletype) fonts are, as well as the default font size. The effect of these settings is somewhat complicated, since browsers may recognize the script (writing system) of text from some technical matters, not from the text itself.

The "Fonts for" menu in Figure 1-25 has the options shown in Figure 1-26. You should be mildly surprised at seeing a mixture of names, some of which seem to refer to scripts, some not. Unicode is of course not a script, and Western, Central European, Baltic, and Turkish refer to character codes rather than scripts. The logic of deciding what such names really mean varies by browser, but at the implementation level, the assignments are for languages in Firefox. It uses various methods to deduce or guess the language. Here "Unicode" refers to anything that does not fall into any other category.

Figure 1-25. In the Firefox browser, the default fonts can be specified as different for different scripts; here they are set for the Thai script




Unicode Explained
ISBN: 059610121X
Year: 2006
Pages: 139
