Section 1.3. Variation of Writing Systems


The most widely used writing systems, or scripts, can be classified as follows:


Alphabetic scripts

Denote sounds with letters, though usually not in a strict one-to-one manner. Examples: Latin, Greek, and Cyrillic scripts, each of which exists in different versions.


Consonant scripts, or abjads

Basically denote consonants, leaving vowels to be inferred; however, consonant scripts may have letters for long vowels, and in some situations even short vowels are written using small signs attached to consonants. Examples: Hebrew and Arabic scripts.


Abugida scripts

These use consonant letters that, in their base form, imply a particular vowel after the consonant. Other vowels, or the absence of a vowel, are indicated by additional marks. Many South and Southeast Asian scripts belong to this category, e.g., the Devanagari script used for many Indic languages.


Syllabic scripts

Use basically one character for each syllable. Examples: the Hiragana and Katakana scripts, used for Japanese.


Ideographic scripts

Use basically one character for one (short) word. The most widely known ideographic script is Han, often known as Chinese script. Because it is also used (in part) for other languages, especially Japanese and Korean, it is often called "CJK."
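
The classification above can be explored with a small Python sketch. It guesses a character's script from the first word of its Unicode character name. This is only a heuristic for illustration, not the Unicode Script property (which Python's standard library does not expose); the function name script_guess and the sample table are made up for this example.

```python
import unicodedata

def script_guess(ch: str) -> str:
    """Guess a script from the first word of the character's Unicode name.

    Heuristic for illustration only; this is not the Unicode Script property.
    """
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

# Sample characters, labeled with the script type from the list above.
samples = {
    "A": "alphabetic",
    "\u03B1": "alphabetic",   # Greek small letter alpha
    "\u0434": "alphabetic",   # Cyrillic small letter de
    "\u05D0": "abjad",        # Hebrew letter alef
    "\u0628": "abjad",        # Arabic letter beh
    "\u0915": "abugida",      # Devanagari letter ka
    "\u3042": "syllabic",     # Hiragana letter a
    "\u30AB": "syllabic",     # Katakana letter ka
    "\u4E2D": "ideographic",  # a Han (CJK) ideograph
}

for ch, kind in samples.items():
    print(f"U+{ord(ch):04X}  {script_guess(ch):<10}  {kind}")
```

Note that Han characters get names of the form CJK UNIFIED IDEOGRAPH-4E2D, so the heuristic reports "CJK" for them rather than "Han."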

Figure 1-4. Sample information on a character in the eki.ee database


Consonantal writing may sound impossible, because it introduces so much ambiguity. However, although an individual written form of a word is often ambiguous, the ambiguities are usually resolved easily from the context by a person who understands the language well. Moreover, languages written with a consonantal script typically have a structure that makes this easier than for English, for example. When vowels are mainly used to express variations of a common theme expressed by a word root, consisting of a pattern described by a combination of consonants, the vowels can usually be inferred from the grammatical context.
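
As a rough illustration of vowels written as small signs attached to consonants, the following Python sketch takes a pointed spelling of the Hebrew word "shalom" and removes the points, leaving the usual consonantal spelling. It assumes that dropping all characters of general category Mn (nonspacing mark) is an acceptable way to remove the points; the helper name strip_points is invented for this example.

```python
import unicodedata

def strip_points(text: str) -> str:
    """Drop nonspacing marks (category Mn), which include Hebrew vowel points."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# shin + qamats + shin dot, lamed, vav + holam, final mem
pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"
plain = strip_points(pointed)

print(len(pointed), len(plain))  # 7 code points with points, 4 without
for ch in pointed:
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")
```

The four remaining characters are all letters; a reader who knows the language supplies the vowels from context.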

The word "script" is often used in character code contexts instead of "writing system." It is important to distinguish it from the use of the word "script" to denote a programming concept: a certain type of computer program, such as a Perl script.

Some scripts, such as the Latin script, are written with spaces between words, and a space is normally a permissible line break point. Hyphenation may introduce other break points. Other scripts may permit line breaks more freely.
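
A deliberately simplified Python sketch of this idea follows. It treats only two things as break opportunities: the position after a space, and the position after a soft hyphen (U+00AD), which marks a possible hyphenation point. The real rules are far more detailed (the Unicode Line Breaking Algorithm, UAX #14); the function name break_opportunities is hypothetical.

```python
SOFT_HYPHEN = "\u00AD"  # invisible hint that a word may be hyphenated here

def break_opportunities(text: str) -> list[int]:
    """Return indices at which a line break is allowed, under a toy rule:
    after a space or after a soft hyphen, but never at the very end."""
    return [
        i + 1
        for i, ch in enumerate(text[:-1])
        if ch == " " or ch == SOFT_HYPHEN
    ]

sample = "writing sys\u00ADtems"
print(break_opportunities(sample))  # [8, 12]
```

Text in a script written without spaces would yield no break opportunities under this toy rule, which is one reason real line breaking needs script-specific knowledge.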

The Latin script and many other scripts are written left to right, with lines proceeding from top to bottom. These are not universal properties of human writing, and even the Latin script is historically based on a script that was written right to left. Unicode addresses the problem of left-to-right versus right-to-left writing in two ways: by defining inherent directionality for characters and by defining control characters for affecting writing direction. For example, Hebrew and Arabic letters have inherent right-to-left directionality. Special methods are needed when text in such letters contains names or quotations that have the opposite directionality, or vice versa.
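
Both mechanisms can be inspected from Python's standard unicodedata module, which reports the bidirectional class assigned to each character. The sketch below is only a lookup, not an implementation of the bidirectional algorithm; the wrapper name bidi_class is invented for this example.

```python
import unicodedata

def bidi_class(ch: str) -> str:
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)

samples = [
    "A",        # Latin letter: L (left to right)
    "\u05D0",   # Hebrew letter alef: R (right to left)
    "\u0628",   # Arabic letter beh: AL (right to left, Arabic)
    "\u200F",   # RIGHT-TO-LEFT MARK, an invisible control character: R
    "\u202B",   # RIGHT-TO-LEFT EMBEDDING, a directional control: RLE
]

for ch in samples:
    print(f"U+{ord(ch):04X}  {bidi_class(ch):<4} {unicodedata.name(ch)}")
```

The first three entries show inherent directionality of letters; the last two are control characters whose only purpose is to affect writing direction.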

Figure 1-5. The four contextual forms of the Arabic letter "ba"


In the Latin script, each character is normally displayed as a separate image on screen or paper, though the spacing between characters may vary. In other scripts, formatting text for visual presentation can be considerably more difficult: the shape of a character may depend on context; adjacent characters can be written together (using a ligature or using cursive writing where characters join smoothly); and a character might be displayed as an auxiliary symbol above, below, before, or after another character.
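
Some of these phenomena leave traces in the character data itself, which the Python sketch below inspects. It assumes the compatibility characters U+FE8F through U+FE92 (the isolated, final, initial, and medial presentation forms of the Arabic letter beh) and the ligature character U+FB01 as examples. Normal text uses the plain letters and leaves the shaping to the rendering software, so this shows how the forms are recorded, not how rendering is done.

```python
import unicodedata

# Presentation forms of Arabic beh, each mapping back to U+0628.
beh_forms = [chr(cp) for cp in range(0xFE8F, 0xFE93)]
for ch in beh_forms:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  "
          f"[{unicodedata.decomposition(ch)}]")

# Compatibility normalization folds every form into the plain letter.
folded = {unicodedata.normalize("NFKC", ch) for ch in beh_forms}

# A Latin ligature character folds into its two letters.
ligature_folded = unicodedata.normalize("NFKC", "\uFB01")

# Marks attached to a base character: general category and combining class.
acute_category = unicodedata.category("\u0301")     # combining acute accent
acute_class = unicodedata.combining("\u0301")       # nonzero for such marks
vowel_sign_category = unicodedata.category("\u093F")  # Devanagari vowel sign i

print(folded, ligature_folded, acute_category, acute_class, vowel_sign_category)
```

The Devanagari vowel sign i is a spacing mark (category Mc) that is stored after its consonant but displayed before it, an instance of a character shown as an auxiliary symbol next to another character.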



Unicode Explained
ISBN: 059610121X
Year: 2006
Pages: 139
