2.4 XML Names

   


The XML specification can be quite legalistic and picky at times. Nonetheless, it tries to be efficient where it can. One way it does so is by reusing the same rules for different items. For example, the rules for XML element names are also the rules for XML attribute names, as well as for the names of several less common constructs. Collectively, these are referred to simply as XML names.

Element and other XML names may contain essentially any alphanumeric character. This includes the standard English letters A through Z and a through z as well as the digits 0 through 9. XML names may also include non-English letters, numbers, and ideograms, such as Ω and 串. They may also include these three punctuation characters:

_ The underscore
- The hyphen
. The period

XML names may not contain other punctuation characters such as quotation marks, apostrophes, dollar signs, carets, percent symbols, and semicolons. The colon is allowed, but its use is reserved for namespaces as discussed in Chapter 4. XML names may not contain whitespace of any kind, whether a space, a carriage return, a line feed, a nonbreaking space, or anything similar. Finally, all names beginning with the string "XML" (in any combination of case) are reserved for standardization in W3C XML-related specifications.
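
The character rules above can be sketched as a small Python check. This is only an approximation: the function name name_character_problems is our own invention, and Python's str.isalnum() stands in for the explicit character tables of the XML specification, so a few characters (combining marks, for example) are classified differently than a real parser would classify them.

```python
# A rough sketch of the character rules described above, not a full
# implementation of the XML specification's name productions.

ALLOWED_PUNCTUATION = {"_", "-", "."}


def name_character_problems(name):
    """Return a list of problems found in a candidate XML name.

    An empty list means this (approximate) check found nothing wrong.
    """
    problems = []
    # Names beginning with "xml" in any combination of case are reserved.
    if name[:3].lower() == "xml":
        problems.append("starts with reserved prefix 'xml'")
    for ch in name:
        if ch.isspace():
            # Covers spaces, carriage returns, line feeds, nonbreaking spaces.
            problems.append(f"whitespace character {ch!r}")
        elif ch == ":":
            # Legal, but reserved for namespaces.
            problems.append("colon is reserved for namespaces")
        elif not (ch.isalnum() or ch in ALLOWED_PUNCTUATION):
            # Assumption: str.isalnum() approximates "letter, digit, ideograph".
            problems.append(f"disallowed character {ch!r}")
    return problems


if __name__ == "__main__":
    for candidate in ["first_name", "Driver's_License_Number", "first name", "XmlThing"]:
        print(candidate, "->", name_character_problems(candidate) or "no problems found")
```

A check like this is handy for catching obvious mistakes before a document is generated, but the parser remains the final authority on what is and is not a legal name.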

The primary new feature in XML 1.1 is that XML names may contain characters only defined in Unicode 3.0 and later. XML 1.0 is limited to the characters defined as of Unicode 2.0. Additional scripts enabled for names by XML 1.1 include Burmese, Mongolian, Thaana, Cambodian, Yi, and Amharic. (All of these scripts are legal in text content in XML 1.0. You just can't use them to name elements, attributes, and entities.) XML 1.1 offers little to no benefit to developers who don't need to use these scripts in their markup.

XML 1.1 also allows names to contain some uncommon symbols such as the musical symbol for a six-string fretboard and even a million or so code points that aren't actually mapped to particular characters. However, taking advantage of this is highly unwise. We strongly recommend that even in XML 1.1 you limit your names to letters, digits, ideographs, and the specifically allowed ASCII punctuation marks.


XML names may only start with letters, ideograms, or the underscore character. They may not start with a digit, hyphen, or period. There is no limit to the length of an element or other XML name. Thus these are all well-formed elements:

  • <Drivers_License_Number>98 NY 32</Drivers_License_Number>

  • <month-day-year>7/23/2001</month-day-year>

  • <first_name>Alan</first_name>

  • <_4-lane>I-610</_4-lane>

  • <téléphone>011 33 91 55 27 55 27</téléphone>


These are not acceptable elements:

  • <Driver's_License_Number>98 NY 32</Driver's_License_Number>

  • <month/day/year>7/23/2001</month/day/year>

  • <first name>Alan</first name>

  • <4-lane>I-610</4-lane>
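
One way to confirm examples like these is to hand them to a real parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the SAMPLES list are our own, chosen for illustration. It checks well-formedness only, which is all that the naming rules affect.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document):
    """Return True if the parser accepts the string as well-formed XML."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Elements taken from the two lists above.
SAMPLES = [
    "<Drivers_License_Number>98 NY 32</Drivers_License_Number>",
    "<month-day-year>7/23/2001</month-day-year>",
    "<first_name>Alan</first_name>",
    "<_4-lane>I-610</_4-lane>",
    "<Driver's_License_Number>98 NY 32</Driver's_License_Number>",
    "<month/day/year>7/23/2001</month/day/year>",
    "<first name>Alan</first name>",
    "<4-lane>I-610</4-lane>",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        verdict = "well-formed" if is_well_formed(sample) else "rejected   "
        print(verdict, sample)
```

The first four samples are accepted and the last four are rejected. Note that the parser reports only that a document is malformed, not which naming rule was broken, so the error message for a bad name is often a generic complaint about an invalid token.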



XML in a Nutshell
XML in a Nutshell, Third Edition
ISBN: 0596007647
Year: 2003
Pages: 232
