2.1. Method Varieties

There is no single answer to a question like "How do I write the character...?" The methods vary by program and equipment. In any given situation, there are usually several ways to write a character.

When you give instructions to one person, or when you are solving your own problem of typing some characters, you should normally try to find one way to input the characters, preferably the most convenient one. However, as usual, convenience is relative. It does not pay to look for a clever way of producing a character if you need it only once and you already know a general, if clumsy, way to input it. When you give general instructions to many people, especially to people who work in different environments, you should try to explain a few alternatives, since different people are likely to need or prefer different methods.

2.1.1. A Simple Way or a Universal Way?

There are many different methods for typing characters, often available in parallel. Some of them are very general, allowing even the insertion of any Unicode character. Some methods have been tailored for very special purposes, perhaps even for the entry of one particular character that would otherwise be difficult to produce. This chapter aims at clarifying things by explaining typical approaches. The multitude of methods can be divided into a few basic categories, to make things more understandable.

When you select methods to be explained to users of an application, it is usually best to aim at systematic ways rather than the fastest ways. That is, opt for a method that works for all the characters needed rather than an eclectic combination of tricks. The same may apply to your own use, e.g., when you need to type particular characters frequently.

Appendix A contains a collection of methods for some commonly needed characters. For casual use, pick whatever works for you and suits you. For more regular use, it is better to analyze your needs and make some deliberate choices.

Suppose, for example, that in some application or document, the only special characters needed are superscript two ² and the less-than-or-equal-to sign ≤. In a Windows XP environment, the former can be typed rather fast as b2 Alt-X, whereas the latter can be typed with fewer keystrokes: Alt-8804. Understanding and remembering two different methods might be an unnecessary burden. So instead of optimizing each case separately, it might be best to teach (or learn) a single systematic method: either b2 Alt-X and 2264 Alt-X, or Alt-0178 and Alt-8804. On the other hand, if the application is widely used or the document is large, it might pay off to spend some time in customizing things to achieve something more natural. For example, MS Word can be rather easily configured to automatically turn ^2 into ² and <= into ≤.
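The two sets of codes in this example are the same Unicode code points written in different bases: the Alt-X method takes the number in hexadecimal, whereas the Alt-8804 style takes it in decimal. The following Python sketch shows the relationship. The helper names are illustrative only and are not part of any Windows interface; the leading zero in Alt-0178 belongs to the keystroke convention and is not produced here.

```python
def alt_x_code(ch: str) -> str:
    """Hexadecimal digits to type before Alt-X (illustrative helper)."""
    return format(ord(ch), "x")


def alt_decimal_code(ch: str) -> str:
    """Decimal code point of the character, as in the Alt-8804 example."""
    return str(ord(ch))


# U+00B2 is superscript two, U+2264 is the less-than-or-equal-to sign.
for ch in "\u00b2\u2264":
    print(ch, alt_x_code(ch), alt_decimal_code(ch))
# ² b2 178
# ≤ 2264 8804
```

Seeing both notations side by side can make it easier to settle on one systematic method and derive the codes for it.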

2.1.2. An Overview of Methods

For practical reasons, the methods presented in this chapter are mostly from MS Windows and MS Office environments, but various alternatives (such as Unicode editors) are also discussed. The HTML and XML character reference and entity reference techniques are presented as well. The chapter ends with an exercise for writing some specialized texts using some of the techniques presented.

To illustrate the variation in the ways of writing characters, Table 2-1 shows some ways of writing the copyright sign ©, U+00A9. In each program or other environment, there might be several ways to write this. We have omitted the most obvious way, using a keyboard key for the character, since © is hardly ever found on keyboards. Each of the ways will be discussed in more detail in this chapter.

Table 2-1. Typical methods of writing special characters

Program or context   Method                Remarks
-------------------  --------------------  ------------------------------------------
Windows Notepad      Alt-0169              169 is decimal code for ©
Win XP WordPad       a9 Alt-X              Uses Uniscribe
Mozilla Thunderbird  Insert Characters...  Common symbols, select ©
Microsoft Word       (c)                   Word converts to ©
XML                  &#xa9;                Character reference
XML                  &#169;                Character reference using decimal notation
HTML                 &copy;                Entity reference
CSS                  \a9                   Has a trailing space
TeX or LaTeX         \copyright            \symbol{'251} works, too
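Several rows of the table are simply the code point U+00A9 written in different notations (hexadecimal, decimal, or octal). As a rough sketch, the following Python function derives those notations from a character. The function name and the dictionary keys are made up for this illustration; the entity name is looked up in the standard library's html.entities table.

```python
from html.entities import codepoint2name


def notations(ch: str) -> dict:
    """Return markup notations for one character (illustrative helper)."""
    cp = ord(ch)
    result = {
        "hex reference": f"&#x{cp:x};",        # XML or HTML
        "decimal reference": f"&#{cp};",       # XML or HTML
        "CSS escape": f"\\{cp:x} ",            # note the trailing space
        "TeX symbol": f"\\symbol{{'{cp:o}}}",  # octal, as in '251
    }
    name = codepoint2name.get(cp)  # only some characters have a named entity
    if name is not None:
        result["HTML entity"] = f"&{name};"
    return result


for label, text in notations("\u00a9").items():
    print(f"{label}: {text!r}")
```

For the copyright sign this yields the same strings as the XML, HTML, and CSS rows of the table, plus the \symbol alternative mentioned for TeX.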


The methods can be classified roughly as follows:


Key combinations

These use keyboard keys, often with modifier keys like Alt and often referring to characters by their code numbers. After using a combination, you see the desired character appear.


Character sequences

These resemble key combinations, but a sequence of characters produced using keyboard keys appears in the data (file) as such. It will only later be rendered as the intended character, for example by a web browser.


Command menus

You select a command and subcommand from a program's menu. Such tools are almost self-documenting but often very limited, letting you produce just a few commonly used characters. Typically, these commands can also be invoked using keyboard shortcuts.


Selection from a table

You invoke a function of a program (e.g., using a menu command), and a window containing a table of characters appears on the screen. By clicking on a character, you select it. So-called virtual keyboards can, in part, be regarded as a special case of this, though the characters appear there in a keyboard setup and not in a rectangular grid.
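The difference between a key combination and a character sequence can be made concrete with a small Python sketch. Here the standard library function html.unescape stands in for the processing that a browser performs on references; that substitution is a simplification made for this illustration, and the function name render is hypothetical.

```python
import html


def render(stored: str) -> str:
    """Resolve character and entity references, roughly as a browser would."""
    return html.unescape(stored)


# What the file contains: plain ASCII character sequences.
stored = "&copy; 2006 &#xa9; &#169;"

# What the reader eventually sees: the intended characters.
print(stored)
print(render(stored))
# &copy; 2006 &#xa9; &#169;
# © 2006 © ©
```

With a key combination such as Alt-0169, by contrast, the file itself would already contain the character ©, and no later resolution step would be needed.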

2.1.3. Choosing Fonts

Some methods of typing characters produce just an abstract character; others include font information. For example, when using Notepad, no font information is included, though whenever someone looks at the characters, some font needs to be used. Databases normally contain character data as coded characters only, with no font information. In MS Word, fonts are an integral part of the content, although the use of fonts can be controlled in a disciplined way by using styles in Word.

Figure 2-1. A character from a different font can be a disturbance


In any case, when text data is to be presented visibly, font issues are essential. You do not need to worry about them in database design and data entry, but when printing out strings from a database, a font or fonts need to be selected at some point. In web design, for example, we can choose to leave the font selection to browsers and users, but font problems still need to be anticipated. You do not want to create a document that most people will not see correctly.

The larger the number of different characters, the more you have problems with typographic quality, for several reasons:

  • The character requirements reduce the number of fonts that can be chosen. Many typographically ambitious fonts have fairly small repertoires of characters, and many large fonts are typographically rather questionable.

  • If you use characters from different fonts, the results are often poor, at least if you are not careful. In Figure 2-1, the "s" with caron (š) is disturbingly different from the general style of letters, because it is a "loan" character from another font.

  • A large repertoire often contains characters that can easily be confused with each other. Their design in a font should thus be sufficiently different. This may exclude an otherwise excellent font, or it may lead you into mixing fonts.

The moral is that you should look out for typographic discrepancies when you enter characters. If possible, use the same font throughout. If you need to use characters from different fonts, try to use some rather large font as a backup, such as Arial Unicode MS or Lucida Sans Unicode when the basic font is a sans-serif font, and Times New Roman or Code2000 for serif fonts. Such large fonts tend to be relatively neutral in typographic design, so they can work reasonably in the midst of text in another font of the same class.

When designing a publication or series and selecting fonts for it, try to analyze the repertoire of characters that will be used. Consider especially the potential needs for additional letters in foreign words, special (e.g., mathematical) symbols, and different types of punctuation.
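One simple way to start such an analysis is to list the characters outside ASCII that actually occur in representative texts. The following Python sketch does this with the standard unicodedata module. The function name is made up for this illustration, and treating everything above U+007F as "special" is an assumption that suits English-language material.

```python
import unicodedata


def repertoire(text: str) -> dict:
    """Map each non-ASCII, non-space character in text to its Unicode name."""
    found = {}
    for ch in text:
        if ord(ch) > 0x7F and not ch.isspace():
            found[ch] = unicodedata.name(ch, "<unnamed>")
    # Sort by code point so that reports are stable and easy to compare.
    return dict(sorted(found.items()))


sample = "His fianc\u00e9e M\u00e4rtha visited Rh\u00f4ne."
for ch, name in repertoire(sample).items():
    print(f"U+{ord(ch):04X} {ch} {name}")
# U+00E4 ä LATIN SMALL LETTER A WITH DIAERESIS
# U+00E9 é LATIN SMALL LETTER E WITH ACUTE
# U+00F4 ô LATIN SMALL LETTER O WITH CIRCUMFLEX
```

Running such a listing over existing issues of a publication gives a concrete set of characters against which candidate fonts can be checked.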


The following true story illustrates the risks of insufficient analysis. A public institution was redesigning the format of its printed serial publication, and this included the choice of a new font. Among other things, the publication discussed orthographic questions such as the difference between the hyphen "-" and the en dash "–" and the importance of choosing correctly between the two. The embarrassing thing was that the chosen font made a very small difference between the two characters.

Typographers and designers often use "Lorem ipsum" texts in sample documents. Lorem ipsum is a piece of text that looks like Latin to a person who does not know Latin too well, and it contains only basic Latin letters and a few punctuation marks. It is therefore not suitable for judging how real-world texts appear in the chosen font, and it is better to design your own sample text and use it. Its content should depend on the nature of the real texts that will be used, but the following short sample text can be suitable as a starting point for typical non-specialized texts in English (see Table 2-4 at the end of this chapter for more specialized samples):

The quick brown fox jumps over the lazy dog. 1234567890.
THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG.
"You 'quote' inside quoted text this way – in U.S. style."
'You "quote" inside quoted text this way in British style.'
His fiancée Märtha visited Rhône.
áà éè íì óò úù âêîôû æœ äëïöüÿ åø çñß

It is not rare to see fonts that look good for normal mixed-case text but poor for all-caps text. Similarly, letters with diacritic marks can cause surprises: the accents might look as if they were just thrown in, instead of sitting nicely near the base character. Consider the importance of using typographically suitable dashes and quotation marks, too. The last line of the sample lists letters that relatively often appear in foreign names in English texts, according to one version of the Common Locale Data Repository (CLDR).
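When you design your own sample text, it can be useful to check mechanically that it really contains every character you want to evaluate. The following Python sketch reports which required characters are missing from a sample. The function name and the choice of required characters are assumptions made for this illustration.

```python
import string

SAMPLE = (
    "The quick brown fox jumps over the lazy dog. 1234567890.\n"
    "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG.\n"
)


def missing_from_sample(sample: str, required: str) -> str:
    """Return the required characters that do not occur in the sample."""
    present = set(sample)
    return "".join(ch for ch in required if ch not in present)


# The first two lines of the sample cover all basic letters and digits...
print(repr(missing_from_sample(SAMPLE, string.ascii_letters + string.digits)))
# ''

# ...but not, for example, the en dash or a letter with an acute accent.
print(repr(missing_from_sample(SAMPLE, "\u2013\u00e9")))
# '–é'
```

A check like this does not judge typographic quality; it only confirms that the sample gives you something to look at for each character of interest.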



Unicode Explained
ISBN: 059610121X
EAN: 2147483647
Year: 2006
Pages: 139
