Character Sets and Encodings


The world runs on a wide variety of character sets. This chapter describes the many encodings for these sets and lists the characters in them. We also describe how conversions between the encodings can be performed, either with the functions of commonly used programs or with separate converters. This chapter also discusses practical use of the character sets in different contexts, such as email, Internet discussion forums, and document interchange.

Using Unicode does not free you from knowing about encodings. You will inevitably encounter non-Unicode data as well, and you need to be able to work with it, even if this only means converting it into Unicode. Moreover, Unicode itself can be represented in different encodings, such as UTF-8 and UTF-16.
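As a minimal sketch of such a conversion, the following Python fragment decodes a few bytes of legacy data into Unicode text and then writes that text out again in two Unicode encodings. The sample bytes and the assumption that they are ISO-8859-1 are illustrative only; real data requires knowing (or being told) its actual encoding.

```python
# Illustrative input: the word "café" as it might be stored by a legacy
# system. That the bytes are ISO-8859-1 is an assumption of this example.
legacy_bytes = b"caf\xe9"

# Step 1: convert the non-Unicode data into Unicode text.
text = legacy_bytes.decode("iso-8859-1")

# Step 2: the same Unicode text can be represented in different encodings.
utf8_bytes = text.encode("utf-8")       # "é" becomes two octets
utf16_bytes = text.encode("utf-16-be")  # every character here is two octets

print(text)
print(utf8_bytes.hex(" "))
print(utf16_bytes.hex(" "))
```

The text is identical in both cases; only the sequence of octets used to store it differs.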

Mostly you don't need to know the details of encodings. You certainly don't have to know the code numbers of characters in each encoding, let alone memorize them. What you need is an overview of the world of encodings, general information about the suitability of each encoding for various purposes, and tools for mapping between encodings.

The presentation of encodings in this chapter is practical rather than historical. For history, one place to refer to is "A Brief History of Character Codes in North America, Europe, and East Asia" at http://tronweb.super-nova.co.jp/characcodehist.html.

As explained in Chapter 1, the phrase "character set" is confusing and vague. It is therefore mostly avoided in this book, but you will often see it elsewhere. It may mean any of the following, and often two or three of these at the same time:

  • A collection of characters (character repertoire)

  • A mapping of characters into the mathematical set of integers (character code)

  • A mapping of characters (or their numbers) into sequences of octets (character encoding)
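The three meanings listed above can be told apart in a short Python sketch. The tiny repertoire and the choice of the euro sign are arbitrary examples, and the variable names are made up for illustration; `ord` and `str.encode` are standard Python.

```python
# Character repertoire: simply a collection of characters (a toy example).
repertoire = {"A", "\u00e9", "\u20ac"}  # A, é, €

ch = "\u20ac"  # EURO SIGN

# Character code: a mapping from the character to an integer.
# In Unicode, the euro sign is assigned the number 0x20AC.
code = ord(ch)

# Character encoding: a mapping from the character (or its number)
# to a sequence of octets. Different encodings give different octets.
encodings = {name: ch.encode(name) for name in ("utf-8", "utf-16-be", "cp1252")}

print(ch in repertoire)
print(hex(code))
for name, octets in encodings.items():
    print(name, octets.hex(" "))
```

Note that the windows-1252 result is a single octet whose value has nothing to do with the Unicode code number, which is one reason the loose phrase "character set" causes confusion.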



Unicode Explained
ISBN: 059610121X
Year: 2006
Pages: 139