27.2 HTML4 Entity Sets

   

HTML 4.0 predefines several hundred named entities, many of which are quite useful. For instance, the nonbreaking space is &nbsp;. XML, however, defines only five named entities:


&amp;

The ampersand (&)


&lt;

The less-than sign (<)


&gt;

The greater-than sign ( > )


&quot;

The straight double quote (")


&apos;

The straight single quote (')
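The difference between the five predefined entities and an HTML name such as nbsp can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the element name p and the variable names are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# The five predefined entities resolve without any DTD.
doc = ET.fromstring("<p>&amp; &lt; &gt; &quot; &apos;</p>")
predefined_text = doc.text

# nbsp is an HTML name, not an XML one. A document that uses it
# without declaring it is rejected by the parser.
try:
    ET.fromstring("<p>a&nbsp;b</p>")
    nbsp_rejected = False
except ET.ParseError:
    nbsp_rejected = True

print(repr(predefined_text), nbsp_rejected)
```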

Other needed characters can be inserted with character references in decimal or hexadecimal format. For instance, the nonbreaking space is Unicode character 160 (decimal), which is A0 in hexadecimal. Therefore, you can insert it in your document as either &#160; or &#xA0;. If you really want to type it as &nbsp;, you can define this entity reference in your DTD. Doing so requires you to use a character reference:

 <!ENTITY nbsp "&#160;"> 
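As a quick check that the three spellings are equivalent, the sketch below parses the same content written with a decimal reference, a hexadecimal reference, and an nbsp entity declared in the internal DTD subset. It assumes Python's standard xml.etree.ElementTree module; the element name p is illustrative only.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal character references name the same code point.
decimal = ET.fromstring("<p>a&#160;b</p>").text
hexadecimal = ET.fromstring("<p>a&#xA0;b</p>").text

# Declaring nbsp in the internal DTD subset makes &nbsp; usable too.
declared = ET.fromstring(
    '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]>'
    "<p>a&nbsp;b</p>"
).text

# All three strings contain U+00A0 between "a" and "b".
print(decimal == hexadecimal == declared)
```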

The XHTML 1.0 specification includes three DTD fragments that define the familiar HTML character entity references:


Latin-1 characters (http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent)

The non-ASCII, graphic characters included in ISO-8859-1 from code points 160 through 255, shown in Table 27-4


Special characters (http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent)

A few useful letters and punctuation marks not included in Latin-1


Symbols (http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent)

The Greek alphabet, plus various arrows, mathematical operators, and other symbols used in mathematics

Feel free to borrow these entity sets for your own use. They should be included in your document's DTD with these parameter entity references and PUBLIC identifiers:

 <!ENTITY % HTMLlat1 PUBLIC
    "-//W3C//ENTITIES Latin 1 for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
 %HTMLlat1;
 <!ENTITY % HTMLspecial PUBLIC
    "-//W3C//ENTITIES Special for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
 %HTMLspecial;
 <!ENTITY % HTMLsymbol PUBLIC
    "-//W3C//ENTITIES Symbols for XHTML//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent">
 %HTMLsymbol;

However, we recommend saving local copies and changing the system identifiers to match the new location, rather than downloading the files from http://www.w3.org every time you need to parse a document. You may import just one, two, or all three of them, depending on what you need. There are no interdependencies.
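Where neither remote nor local copies of the .ent files are convenient, one possible workaround (not part of the XHTML specification) is to generate equivalent declarations and place them in the internal DTD subset. The sketch below assumes Python's html.entities.name2codepoint table as the source of names and code points; the function name latin1_internal_subset is hypothetical.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint


def latin1_internal_subset():
    """Build <!ENTITY ...> declarations for code points 160 through 255."""
    pairs = sorted(
        (cp, name) for name, cp in name2codepoint.items() if 160 <= cp <= 255
    )
    # Each entity is defined with a decimal character reference.
    return "\n".join(f'<!ENTITY {name} "&#{cp};">' for cp, name in pairs)


subset = latin1_internal_subset()
document = f"<!DOCTYPE p [\n{subset}\n]>\n<p>caf&eacute; &copy; 2003</p>"

# The parser expands the declared entities while reading the document.
text = ET.fromstring(document).text
print(text)
```

This trades a larger document prolog for independence from external files, so it suits small, self-contained documents better than large collections that share one DTD.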

Alternatively, you can simply use the character references shown in Tables 27-4, 27-5, and 27-6.
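Converting existing HTML-style text to numeric character references can be automated. The following is a minimal sketch, assuming Python's html.entities.name2codepoint table for the lookup; the function name to_numeric_refs is hypothetical. It leaves the five XML predefined entities and any unrecognized names untouched.

```python
import re
from html.entities import name2codepoint

# These five are predefined in XML and need no conversion.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def to_numeric_refs(text):
    """Replace HTML named entity references with decimal character references."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave as written
        return f"&#{name2codepoint[name]};"

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)


converted = to_numeric_refs("a&nbsp;b &amp; &copy; &bogus;")
print(converted)
```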

Table 27-4. The HTML Latin-1 entity set



XML in a Nutshell
XML in a Nutshell, Third Edition
ISBN: 0596007647
EAN: 2147483647
Year: 2003
Pages: 232


| Character | Meaning | XHTML entity reference | Hexadecimal character reference | Decimal character reference |
|---|---|---|---|---|
|   | Nonbreaking space | &nbsp; | &#xA0; | &#160; |
| ¡ | Inverted exclamation mark | &iexcl; | &#xA1; | &#161; |
| ¢ | Cent sign | &cent; | &#xA2; | &#162; |
| £ | Pound sign | &pound; | &#xA3; | &#163; |
| ¤ | Currency sign | &curren; | &#xA4; | &#164; |
| ¥ | Yen sign, Yuan sign | &yen; | &#xA5; | &#165; |
| ¦ | Broken vertical bar | &brvbar; | &#xA6; | &#166; |
| § | Section sign | &sect; | &#xA7; | &#167; |
| ¨ | Dieresis, spacing dieresis | &uml; | &#xA8; | &#168; |
| © | Copyright sign | &copy; | &#xA9; | &#169; |
| ª | Feminine ordinal indicator | &ordf; | &#xAA; | &#170; |
| « | Left-pointing double angle quotation mark, left-pointing guillemet | &laquo; | &#xAB; | &#171; |
| ¬ | Not sign | &not; | &#xAC; | &#172; |
| - | Soft hyphen, discretionary hyphen | &shy; | &#xAD; | &#173; |
| ® | Registered trademark sign | &reg; | &#xAE; | &#174; |
| ¯ | Macron, overline, APL overbar | &macr; | &#xAF; | &#175; |
| ° | Degree sign | &deg; | &#xB0; | &#176; |
| ± | Plus-or-minus sign | &plusmn; | &#xB1; | &#177; |
| ² | Superscript digit two, squared | &sup2; | &#xB2; | &#178; |
| ³ | Superscript digit three, cubed | &sup3; | &#xB3; | &#179; |