Java APIs for XML Kick Start By Aoyon Chowdhury, Parag Choudhary
Appendix B. XML: A Quick Tour
In contrast to SGML and HTML, which were designed to handle ASCII-based languages such as English and other European languages (including the Scandinavian ones), XML is based on the Unicode and ISO/IEC 10646 standards and is geared to support languages such as Hindi, Arabic, and Chinese.
However, a problem arises if you want to use characters from such languages in your XML document and your keyboard cannot produce them. To handle this, you can use character references. A character reference consists of the string &#, followed by the decimal number of the character in the ISO/IEC 10646 character set, and is terminated by a semicolon (;). A hexadecimal number can be used instead by starting the reference with &#x. For example, to represent the copyright (©) symbol, you will need to use the following character reference:
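The copyright sign is character number 169 (hexadecimal A9) in ISO/IEC 10646, so its decimal character reference is &#169;. As a minimal sketch of how a parser resolves such a reference, the following Java program builds the reference from the code point and parses a small document with the standard JAXP DOM parser. The class name CharRefDemo, the helper methods toCharRef and rootText, and the sample <notice> element with its "2003" text are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of any library.
class CharRefDemo {

    // Builds a decimal character reference, e.g. 169 becomes "&#169;".
    public static String toCharRef(int codePoint) {
        return "&#" + codePoint + ";";
    }

    // Parses an XML string with the JAXP DOM parser and returns
    // the text content of the root element, with references resolved.
    public static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative document; only ASCII characters are typed here.
        String xml = "<notice>" + toCharRef(0x00A9) + " 2003</notice>";
        System.out.println(xml);                 // <notice>&#169; 2003</notice>

        // After parsing, the first character is the copyright sign itself.
        String text = rootText(xml);
        System.out.println(text.codePointAt(0)); // 169
    }
}
```

The program prints the code point rather than the character so that the result does not depend on the console encoding. The hexadecimal form &#xA9; parses to the same character.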