Section 10.8. Character Entity References


Characters not found in the normal alphanumeric character set, such as < and &, must be specified in HTML and XHTML documents using character references. This is known as escaping the character. Using the standard desktop publishing keyboard commands (such as Option-G for the © symbol) within an HTML document will not produce the desired character when the document is rendered in a browser. In fact, the browser generally displays the numeric entity for the character.

In (X)HTML documents, escaped characters are indicated by character references that begin with & and end with ;. The character may be referred to by its Numeric Character Reference (NCR) or a predefined character entity name.

A Numeric Character Reference refers to a character by its Unicode code point in either decimal or hexadecimal form (for more information on Unicode and code points, see Chapter 6). Decimal character references use the syntax &#nnnn;. Hexadecimal values are indicated by an "x": &#xhhhh;. For example, the less-than (<) character could be identified as &#60; (decimal) or &#x3C; (hexadecimal).
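As a minimal sketch of the two NCR forms, the following Python builds decimal and hexadecimal references from a character's code point. The helper names `to_decimal_ncr` and `to_hex_ncr` are hypothetical, chosen for illustration; `html.unescape` is from the Python standard library and is used here only to confirm that both forms resolve to the same character.

```python
import html


def to_decimal_ncr(ch: str) -> str:
    """Return the decimal Numeric Character Reference for one character."""
    return f"&#{ord(ch)};"


def to_hex_ncr(ch: str) -> str:
    """Return the hexadecimal Numeric Character Reference for one character."""
    return f"&#x{ord(ch):X};"


print(to_decimal_ncr("<"))  # &#60;
print(to_hex_ncr("<"))      # &#x3C;

# Both forms decode back to the same character.
print(html.unescape("&#60;") == html.unescape("&#x3C;") == "<")  # True
```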

Character entities are abbreviated names for characters, such as &lt; for the less-than symbol. Character entities are predefined in the DTDs of markup languages such as HTML and XHTML as a convenience to authors, because they may be easier to remember than Numeric Character References.

XHTML includes the XML entity declaration for the apostrophe (&apos;). In HTML, the apostrophe character entity was curiously omitted, so its numeric reference (&#39;) must be used instead.
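The relationship between entity names and numeric references can be explored with a short script. This sketch assumes Python's standard `html.entities.name2codepoint` table, which maps HTML 4 entity names to code points; the function `describe_entity` is a hypothetical helper written for this illustration.

```python
from html.entities import name2codepoint


def describe_entity(name: str) -> str:
    """Show the numeric references equivalent to an HTML 4 entity name."""
    code = name2codepoint.get(name)
    if code is None:
        return f"&{name}; is not an HTML 4 named entity"
    return f"&{name}; = &#{code}; = &#x{code:X}; ({chr(code)})"


print(describe_entity("lt"))    # &lt; = &#60; = &#x3C; (<)
print(describe_entity("copy"))  # &copy; = &#169; = &#xA9; (©)
print(describe_entity("apos"))  # &apos; is not an HTML 4 named entity
```

The last call reflects the omission described above: the HTML 4 table has no entry for the apostrophe, so the numeric reference is the portable choice.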


Table 10-3 presents the (X)HTML character entities and numeric character references for commonly used special characters. The complete list of character entities defined in HTML 4.01 and XHTML 1.0/1.1 appears in Appendix C.

Table 10-3. Common special characters and their character entities

| Character | Description | Entity | Decimal | Hex |
|-----------|-------------|--------|---------|-----|
|   | Character space (nonbreaking space) | &nbsp; | &#160; | &#x00A0; |
| & | Ampersand | &amp; | &#038; | &#x26; |
| < | Less-than sign (useful for displaying markup on a web page) | &lt; | &#060; | &#x3C; |
| > | Greater-than sign (useful for displaying markup on a web page) | &gt; | &#062; | &#x3E; |
| ' | Apostrophe | &apos; (XHTML only) | &#039; | &#x27; |
| “ | Left curly quotes | &ldquo; | &#8220; | &#x201C; |
| ” | Right curly quotes | &rdquo; | &#8221; | &#x201D; |
| ™ | Trademark | &trade; | &#8482; | &#x2122; |
| £ | Pound symbol | &pound; | &#163; | &#xA3; |
| ¥ | Yen symbol | &yen; | &#165; | &#xA5; |
| © | Copyright symbol | &copy; | &#169; | &#xA9; |
| ® | Registered trademark | &reg; | &#174; | &#xAE; |
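Characters such as those in Table 10-3 that fall outside ASCII can be converted to decimal references mechanically. The sketch below relies on Python's built-in `xmlcharrefreplace` encoding error handler; the wrapper name `ascii_with_ncrs` is hypothetical. Note that it leaves ASCII markup characters such as < and & untouched, so those still need to be escaped separately.

```python
def ascii_with_ncrs(text: str) -> str:
    """Replace every non-ASCII character with its decimal NCR."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(ascii_with_ncrs("© 2006, £5"))  # &#169; 2006, &#163;5
```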


XML Character Entities

XML 1.0 defines five character entities that must be supported by all XML processors. The XHTML DTDs explicitly declare these entities as well, in keeping with recommended practice for XML languages.

Less than (<) &lt; &#60;

Greater than (>) &gt; &#62;

Ampersand (&) &amp; &#38;

Apostrophe (') &apos; &#39;

Quotation mark (") &quot; &#34;

The only significant change is that XHTML includes an entity for the apostrophe character (&apos;), which was curiously omitted from HTML. For backward compatibility, it is recommended that authors use the numeric reference for apostrophe (&#39;) instead.
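A minimal sketch of an escaping routine that follows this recommendation appears below. It maps the five characters to the entities listed above, except that the apostrophe is written as the numeric reference &#39; rather than &apos;. The names `XML_ESCAPES` and `escape_markup` are hypothetical and not part of any library.

```python
# Replacements for the five characters predefined by XML 1.0.
# The apostrophe uses its numeric reference for HTML compatibility.
XML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
}


def escape_markup(text: str) -> str:
    """Escape markup-significant characters one character at a time."""
    # Working per character avoids re-escaping the "&" of an earlier replacement.
    return "".join(XML_ESCAPES.get(ch, ch) for ch in text)


print(escape_markup("<a href='x'>Q&A</a>"))
# &lt;a href=&#39;x&#39;&gt;Q&amp;A&lt;/a&gt;
```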





Web Design in a Nutshell: A Desktop Quick Reference (In a Nutshell (O'Reilly))
ISBN: 0596009879
Year: 2006
Pages: 325