8.5. Other European Alphabetic Scripts

Some writing systems in Europe follow the same structural principle as the Latin script (i.e., they are alphabetic) but use different letters. Some of the letters look similar or even identical to Latin letters, largely due to common origin. Beware of the differences, though. For example, the Greek capital letter rho, Ρ, and the Cyrillic capital letter er, Р, look very similar to the Latin capital letter "P," but they denote an "r" sound and historically relate to R rather than P.
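Because such look-alikes are distinct characters, one way to tell them apart is to inspect their code points and names. The following is a minimal sketch using Python's standard unicodedata module; the helper name describe is made up for this illustration.

```python
import unicodedata

def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Three visually similar capitals from three different scripts:
# Latin P, Greek rho, Cyrillic er.
for ch in ("P", "\u03A1", "\u0420"):
    print(ch, describe(ch))
```

Although the glyphs may be indistinguishable in many fonts, comparing the strings with == treats them as different characters, which matters for searching and sorting.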

8.5.1. Greek Script

The letters α, β, γ,... used in modern Greek are included in the Greek and Coptic block (U+0370..U+03FF), whose layout is similar to that of an 8-bit character code, ISO 8859-7. This code, in turn, deviates from windows-1253 in a few code points, in addition to the difference that windows-1253 contains some extra characters in the range 80..9F (hexadecimal). Although there is variation in encodings, the characters themselves are well supported.
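The deviations between the two 8-bit codes can be listed mechanically. The sketch below assumes the codec names iso8859_7 and cp1253 as shipped with Python's standard library; the function name differing_bytes is hypothetical. It compares how each byte in the range A0..FF decodes under both encodings.

```python
def differing_bytes(enc_a, enc_b, start=0xA0, end=0xFF):
    """Map each byte value that decodes differently to its two decodings."""
    diffs = {}
    for value in range(start, end + 1):
        raw = bytes([value])
        # Unassigned positions decode to U+FFFD instead of raising an error.
        a = raw.decode(enc_a, errors="replace")
        b = raw.decode(enc_b, errors="replace")
        if a != b:
            diffs[value] = (a, b)
    return diffs

diffs = differing_bytes("iso8859_7", "cp1253")
for value, (a, b) in sorted(diffs.items()):
    print(f"{value:02X}: ISO 8859-7 {a!r}  windows-1253 {b!r}")
```

The basic letters decode identically in both, so text mislabeled as the other encoding often looks almost right, with only a few characters (such as capital alpha with tonos) wrong.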

For ancient Greek as written in modern times, however, other characters are needed. They include vowels with different diacritic marks, indicating three kinds of intonation of stressed vowels. The term polytonic Greek is used to denote such a form of written Greek. The marks were preserved (until the 20th century) long after the intonation had been lost. Modern Greek has only one type of stress mark (called tonos), and Greek written this way is called monotonic Greek.

The additional characters needed for polytonic Greek, as well as some other characters, are included in the Greek Extended block, U+1F00..U+1FFF. Basically, you need Unicode to write polytonic Greek properly. On the other hand, various font-based techniques have been used for polytonic Greek, i.e., encodings implicitly defined by the design of an 8-bit font.
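As an illustration of the two blocks, the following sketch checks which block a letter belongs to and decomposes a polytonic letter into its base letter and combining marks. It uses Python's standard unicodedata module; the helper name in_greek_extended and the choice of sample letters are assumptions made for this example.

```python
import unicodedata

def in_greek_extended(ch):
    """True if the character lies in the Greek Extended block."""
    return 0x1F00 <= ord(ch) <= 0x1FFF

polytonic = "\u1F04"  # GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA
monotonic = "\u03AC"  # GREEK SMALL LETTER ALPHA WITH TONOS

# Canonical decomposition: base letter followed by combining marks.
decomposed = unicodedata.normalize("NFD", polytonic)
print([f"U+{ord(c):04X}" for c in decomposed])

# Canonical composition restores the single precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)
```

Both forms represent the same text; normalizing to one form before comparing strings avoids spurious mismatches.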

8.5.2. Cyrillic Script

The Cyrillic script is historically derived from a version of the Greek script, with many modifications, including the addition of some characters taken from the Hebrew script. Although you may know the Cyrillic script primarily as used for Russian, it is used (in many variants) for many other Slavic and non-Slavic languages as well. Throughout history, the writing system of some languages has been changed from Latin to Cyrillic or vice versa for political reasons.

The Cyrillic letters as used in Russian are covered by several 8-bit encodings. Among them, the most common are KOI8-R and windows-1251. KOI8-R is specifically for Russian and does not cover most other languages that use the Cyrillic script. The ISO 8859-5 and windows-1251 encodings cover the Cyrillic letters used for Slavic languages, though not many of the letters in other languages using the Cyrillic script.
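The coverage differences can be probed by attempting to encode sample words. This is a minimal sketch assuming the codec names koi8_r, iso8859_5, and cp1251 from Python's standard library; the helper can_encode and the sample words are chosen only for illustration.

```python
def can_encode(text, encoding):
    """True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Illustrative words: Russian, Ukrainian (contains ї), Kazakh (contains ә).
samples = {"Russian": "Привет", "Ukrainian": "їжак", "Kazakh": "әлем"}

for encoding in ("koi8_r", "iso8859_5", "cp1251"):
    for language, word in samples.items():
        print(encoding, language, can_encode(word, encoding))
```

A check like this is useful before saving text in a legacy encoding, since a single uncovered letter makes the whole conversion fail or lose data.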

Even when Unicode is used, problems may arise. Russian is normally written without accent marks, despite the fact that the stress varies and can be distinctive. However, an acute accent is often used in dictionaries and textbooks, and occasionally in normal text as well, e.g., to distinguish бо́льшая "bigger" from большая "big" (with stress on the second syllable). This creates a problem, since Unicode does not contain Cyrillic vowel letters with acute accent as precomposed characters. Consequently, you need to use the combining acute accent U+0301 after a vowel letter and try to use software that can handle this. Unfortunately, the result is often typographically poor, though there is more and more software that implements combining diacritic marks well.
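The lack of precomposed characters can be observed with Unicode normalization. The sketch below, using Python's standard unicodedata module, inserts U+0301 after a vowel and shows that composition (NFC) leaves the Cyrillic sequence as two characters, whereas the look-alike Latin sequence collapses into one. The helper add_stress and the sample word are assumptions made for this example.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"

def add_stress(word, vowel_index):
    """Insert a combining acute accent after the character at vowel_index."""
    return word[:vowel_index + 1] + COMBINING_ACUTE + word[vowel_index + 1:]

# Mark stress on the first vowel (index 1) of an illustrative Russian word.
stressed = add_stress("большая", 1)

# NFC has no precomposed Cyrillic letter to compose into,
# so the string is unchanged.
cyrillic_nfc = unicodedata.normalize("NFC", stressed)

# The Latin letter o plus U+0301 does compose, into U+00F3.
latin_nfc = unicodedata.normalize("NFC", "o" + COMBINING_ACUTE)

print(len(stressed), len(cyrillic_nfc), len(latin_nfc))
```

Code that counts characters, truncates strings, or searches Russian text therefore needs to treat the vowel and the following accent as one unit.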

When Cyrillic text is transliterated into a Latin script, confusion is often caused by varying transliteration systems. Without knowing the transliteration method, it is impossible to know the original Cyrillic spelling (and hence pronunciation).

8.5.3. Armenian and Georgian Scripts

Characters needed for writing the Armenian and Georgian languages, spoken in the Caucasus, are included in separate blocks named after the languages. The languages have relatively small sets of letters, so each can also be written using an 8-bit encoding.

Modern Georgian makes no case distinction for letters. (Old Georgian had separate upper- and lowercase, though.) In fact, such a situation is common in the writing systems of the world, though most European scripts are an exception. Writing without case is also the older practice; the case distinction was invented in medieval Europe.



Unicode Explained
ISBN: 059610121X
Year: 2006
Pages: 139