6.2. Character References

HTML and XHTML documents use the standard ASCII character set (these are the characters you see printed on the keys of your keyboard). To represent characters that fall outside the ASCII range, you must refer to the character by using a character reference. This is known as escaping the character.
In HTML and XML documents, some ASCII characters that you intend to be rendered in the browser as part of the text content must be escaped so that the user agent does not interpret them as code. For example, the less-than symbol (<) must be escaped so that it is not mistaken for the beginning of an element start tag. Other characters that must be escaped are the greater-than symbol (>), ampersand (&), single quote ('), and double quotation marks ("). In XML documents, all literal ampersands must be escaped or the document won't validate.

There are two types of character references: Numeric Character References (NCR) and character entities.

6.2.1. Numeric Character References

A Numeric Character Reference (NCR) refers to the character by its Unicode code point (introduced earlier in this chapter). NCRs always begin with &# and end with a ; (semicolon). The numeric value may be provided in decimal or hexadecimal. Hexadecimal values are indicated by an x before the value. For example, the copyright symbol (©), which occupies the 169th position in Unicode (U+00A9), may be represented by its hexadecimal NCR &#xA9; or its decimal equivalent, &#169;. Decimal values are more common in practice. Note that the zeros at the beginning of the code point may be omitted in the numeric character reference, so &#x00A9; and &#xA9; refer to the same character.
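As a minimal sketch of how the two NCR forms relate to a code point, the following Python snippet builds the decimal and hexadecimal references for a character and decodes a zero-padded reference. The helper name `to_ncr` is hypothetical and introduced only for illustration; `html.unescape` is part of the Python standard library.

```python
import html


def to_ncr(ch: str, hexadecimal: bool = False) -> str:
    """Return the numeric character reference for a single character.

    to_ncr is a hypothetical helper, not part of any library.
    """
    code_point = ord(ch)  # Unicode code point, e.g. 169 for the copyright sign
    if hexadecimal:
        # Hexadecimal form: an "x" precedes the value, leading zeros omitted
        return f"&#x{code_point:X};"
    # Decimal form
    return f"&#{code_point};"


decimal_ref = to_ncr("©")                 # decimal NCR
hex_ref = to_ncr("©", hexadecimal=True)   # hexadecimal NCR

# Leading zeros in the code point are optional, so a padded
# reference decodes to the same character.
padded = html.unescape("&#x00A9;")
```

Here `decimal_ref` holds `&#169;` and `hex_ref` holds `&#xA9;`, matching the copyright example above.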
6.2.2. Character Entities

Character entities use abbreviations or words instead of numbers to represent characters, which can make them easier to remember. In this sense, entities are merely a convenience. Character entities must be predefined in the DTD of a markup language to be available for use. For example, the copyright symbol may be referred to as &copy;, because that entity has been declared in the DTD. The character entities defined in HTML 4.01 and XHTML are listed in Appendix C (a list of the most common is also provided in Chapter 10). XML defines five character entities for use with all XML languages:

&lt;      Less-than symbol (<)
&gt;      Greater-than symbol (>)
&amp;     Ampersand (&)
&apos;    Single quote (')
&quot;    Double quotation marks (")
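The relationship between a named entity and its code point, and the five predefined XML entities (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), can be checked with a short Python sketch. It relies on the standard library's `html.entities.name2codepoint` table and `xml.sax.saxutils.escape`; the wrapper `escape_xml` is a hypothetical name used here for illustration only.

```python
import html
from html.entities import name2codepoint
from xml.sax.saxutils import escape

# The named entity "copy" maps to the same code point as the NCR &#169;
copy_code_point = name2codepoint["copy"]

# Named entities decode just like numeric references
decoded = html.unescape("&copy; 2006")


def escape_xml(text: str) -> str:
    """Escape text using the five predefined XML entities.

    escape_xml is a hypothetical wrapper. saxutils.escape() handles
    &, < and > on its own; the two quote characters are passed in
    as extra mappings.
    """
    return escape(text, {"'": "&apos;", '"': "&quot;"})


escaped = escape_xml('Tom & "Jerry" <b>it\'s</b>')
```

The ampersand is replaced first, so the entity references produced for the other characters are not escaped a second time.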
6.2.3. Escapes in CSS

It may be necessary to escape a character in a style sheet if the value of a property contains a non-ASCII character. In CSS, the escape mechanism is a backslash followed by the hexadecimal Unicode code point value. The escape is terminated with a space instead of a semicolon. For example, a font name starting with a capital letter C with a cedilla (Ç) needs to be escaped in the style rule, as shown here.

    p { font-family: \C7 elikfont; }

When the special character appears in a style attribute value, it is possible to use its NCR, entity, or CSS escape. The CSS escape is recommended because it makes the rule easier to move to a style sheet later.
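The CSS escape rule described here (backslash, hexadecimal code point, terminating space) can be sketched in Python as follows. The function name `css_escape` is hypothetical, and the sketch only handles the non-ASCII case discussed in this section, not every character that CSS allows to be escaped.

```python
def css_escape(text: str) -> str:
    """Replace each non-ASCII character with a CSS escape.

    css_escape is a hypothetical helper. Each escape is a backslash,
    the hexadecimal code point, and a single terminating space.
    """
    parts = []
    for ch in text:
        if ord(ch) > 127:
            # Non-ASCII: emit backslash + hex code point + space
            parts.append(f"\\{ord(ch):X} ")
        else:
            # ASCII characters pass through unchanged
            parts.append(ch)
    return "".join(parts)


# Build the style rule for a font whose name starts with Ç (U+00C7)
rule = f"p {{ font-family: {css_escape('Çelikfont')}; }}"
```

With the font name from the example, `rule` reproduces the style rule shown above: `p { font-family: \C7 elikfont; }`.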