Section 2.1. Character Set


2.1. Character Set

JavaScript programs are written using the Unicode character set. Unlike the 7-bit ASCII encoding, which is useful only for English, and the 8-bit ISO Latin-1 encoding, which is useful only for English and major Western European languages, the 16-bit Unicode encoding can represent virtually every written language in common use on the planet. This is an important feature for internationalization and is particularly important for programmers who do not speak English.

American and other English-speaking programmers typically write programs using a text editor that supports only the ASCII or Latin-1 character encodings, and thus they don't have easy access to the full Unicode character set. This is not a problem, however, because both the ASCII and Latin-1 encodings are subsets of Unicode, so any JavaScript program written using those character sets is perfectly valid. Programmers who are used to thinking of characters as 8-bit quantities may be disconcerted to know that JavaScript represents each character using 2 bytes, but this fact is actually transparent to the programmer and can simply be ignored.

Although the ECMAScript v3 standard allows Unicode characters anywhere in a JavaScript program, versions 1 and 2 of the standard allow Unicode characters only in comments and quoted string literals; all elements are restricted to the ASCII character set. Versions of JavaScript that predate ECMAScript standardization typically do not support Unicode at all.




JavaScript. The Definitive Guide
JavaScript: The Definitive Guide
ISBN: 0596101996
EAN: 2147483647
Year: 2004
Pages: 767

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net