6.2. Character References

HTML and XHTML documents use the standard ASCII character set (these are the characters you see printed on the keys of your keyboard). To represent characters that fall outside the ASCII range, you must refer to the character by using a character reference. This is known as escaping the character.

Declaring Encoding in Style Sheets

It is also possible to declare the encoding of an external style sheet by including a statement at the beginning of the .css document (it must be the first thing in the file):

 @charset "utf-8"; 

It is important to do this if your style sheet includes non-ASCII characters in property values such as quotation characters used in generated content, font names, and so on.


In HTML and XML documents, some ASCII characters that you intend to be rendered in the browser as part of the text content must be escaped so that the user agent does not interpret them as code. For example, the less-than symbol (<) must be escaped so that it is not mistaken for the beginning of an element start tag. Other characters that must be escaped are the greater-than symbol (>), ampersand (&), single quote ('), and double quotation mark ("). In XML documents, every ampersand that does not itself begin a character reference must be escaped, or the document will not be well-formed.
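
As a rough illustration of the escaping described above, the following sketch uses the html module from Python's standard library. Python is used here only for demonstration and is not part of the original text; the sample string is made up.

```python
import html

# Hypothetical sample text containing all five characters discussed above.
raw = '<p title="Q&A">Tom & Jerry\'s</p>'

# html.escape replaces &, <, and >. With quote=True (the default) it also
# replaces " with &quot; and ' with the numeric reference &#x27;.
escaped = html.escape(raw, quote=True)
print(escaped)

# html.unescape reverses the process and restores the original text.
restored = html.unescape(escaped)
print(restored == raw)
```

Note that this particular function escapes the single quote with a numeric reference rather than a named entity.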

There are two types of character references: Numeric Character References (NCR) and character entities.

6.2.1. Numeric Character References

A Numeric Character Reference (NCR) refers to the character by its Unicode code point (introduced earlier in this chapter). NCRs are always preceded by &# and end with a ; (semicolon). The numeric value may be provided in decimal or hexadecimal. Hexadecimal values are indicated by an x before the value.

For example, the copyright symbol (©), whose Unicode code point is 169 in decimal (U+00A9), may be represented by its hexadecimal NCR &#xA9; or its decimal equivalent, &#169;. Decimal values are more common in practice. The leading zeros of the code point may be omitted in the numeric character reference.
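
The relationship between a character, its code point, and its two NCR forms can be sketched in a few lines of Python. The helper names decimal_ncr and hex_ncr are hypothetical, chosen only for this illustration.

```python
import html

def decimal_ncr(ch: str) -> str:
    """Build the decimal NCR for a single character (hypothetical helper)."""
    return f"&#{ord(ch)};"

def hex_ncr(ch: str) -> str:
    """Build the hexadecimal NCR, marked by an x before the value."""
    return f"&#x{ord(ch):X};"

copyright_sign = "\u00A9"  # the copyright symbol, U+00A9

dec = decimal_ncr(copyright_sign)
hexa = hex_ncr(copyright_sign)
print(dec, hexa)

# Both forms, with or without leading zeros, decode to the same character.
decoded = [html.unescape(ref) for ref in (dec, hexa, "&#x00A9;")]
```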

Handy charts of every character in the Basic Multilingual Plane are maintained as a labor of love by Jens Brueckmann at his site J-A-B.net. The Unicode code point and the decimal and hexadecimal NCRs are provided for every character. The charts are available at www.j-a-b.net/web/char/char-unicode-bmp.


6.2.2. Character Entities

Character entities represent characters with abbreviations or words, which may be easier to remember than numbers. In this sense, entities are merely a convenience. Character entities must be predefined in the DTD of a markup language to be available for use. For example, the copyright symbol may be referred to as &copy;, because that entity has been declared in the DTD. The character entities defined in HTML 4.01 and XHTML are listed in Appendix C (a list of the most common is also provided in Chapter 10). XML defines five character entities for use with all XML languages:


&lt;      Less than (<)
&gt;      Greater than (>)
&amp;     Ampersand (&)
&apos;    Apostrophe (')
&quot;    Quotation mark (")
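
To see how entity names map back to characters, the sketch below uses Python's html.entities table and html.unescape. This is an illustration only: Python's tables are a stand-in for the DTD declarations discussed above, not the DTD itself.

```python
import html
from html.entities import name2codepoint

# name2codepoint maps HTML 4.01 entity names to Unicode code points.
copy_code_point = name2codepoint["copy"]
print(copy_code_point)  # the code point of the copyright symbol

# The five entities that XML predefines, decoded to their characters.
xml_entities = ["&lt;", "&gt;", "&amp;", "&apos;", "&quot;"]
decoded = [html.unescape(entity) for entity in xml_entities]
print(decoded)

# A named entity and the equivalent NCR resolve to the same character.
same_character = html.unescape("&copy;") == html.unescape("&#169;")
```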

6.2.3. Escapes in CSS

It may be necessary to escape a character in a style sheet if the value of a property contains a non-ASCII character. In CSS, the escape mechanism is a backslash followed by the hexadecimal Unicode code point value. The escape is terminated with a space instead of a semicolon. For example, a font name starting with a capital letter C with a cedilla (Ç) needs to be escaped in the style rule, as shown here.

     p { font-family: \C7 elikfont; } 
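
The same escape can be generated programmatically. The following Python sketch assumes a hypothetical helper, css_escape_non_ascii, that replaces every non-ASCII character with a backslash, the hexadecimal code point, and a terminating space. It is a minimal illustration, not a complete CSS serializer.

```python
def css_escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with a CSS escape (hypothetical helper)."""
    parts = []
    for ch in text:
        if ord(ch) > 127:
            # Backslash, hexadecimal code point, then a terminating space.
            parts.append(f"\\{ord(ch):X} ")
        else:
            parts.append(ch)
    return "".join(parts)

font_name = "\u00C7elikfont"  # begins with a capital C with a cedilla
escaped_name = css_escape_non_ascii(font_name)
rule = f"p {{ font-family: {escaped_name}; }}"
print(rule)
```

The helper always emits the terminating space. This keeps the sketch simple and prevents a following character such as e from being read as part of the hexadecimal value.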

When the special character appears in a style attribute value, it is possible to use its NCR, entity, or CSS escape. The CSS escape is recommended because it makes the rule easier to move to a style sheet later.

For guidelines on declaring character encodings and using escapes, see the W3C's Authoring Techniques for XHTML & HTML Internationalization available at www.w3.org/TR/i18n-html-tech-char/.





Web Design in a Nutshell
Web Design in a Nutshell: A Desktop Quick Reference (In a Nutshell (OReilly))
ISBN: 0596009879
Year: 2006
Pages: 325
