Attributes provide additional information about elements and can be used for a wide variety of tasks. They make it possible to define relationships between elements, no matter where those elements appear in the document. You can declare all attributes for an element in one attribute-list declaration, or you can spread them across several attribute-list declarations for the same element.
Start tags and empty-element tags can contain attributes, which take the form of name-value pairs joined by an equals sign ("="). To declare an attribute in the DTD, use an "<!ATTLIST>" declaration with the following general format:
<!ATTLIST ELEMENT_NAME ATTRIBUTE_NAME TYPE DEFAULT_VALUE>
ELEMENT_NAME is the name of the element in which the attribute appears.
ATTRIBUTE_NAME is the name of the attribute.
TYPE identifies the kind of attribute in use for this element.
DEFAULT_VALUE is the value that the parser uses if the document creator specifies none.
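As a minimal sketch of the general format above, the following DTD fragment attaches two attributes to a hypothetical BOOK element. The element name, attribute names, and values are assumptions chosen for illustration. The first form declares both attributes in a single attribute-list declaration; the second, equivalent form uses one declaration per attribute. The #REQUIRED keyword is one of the attribute defaults described in Section 4.4.2.

```xml
<!-- Form 1: both attributes of BOOK in a single ATTLIST declaration -->
<!ATTLIST BOOK
    ISBN   CDATA                   #REQUIRED
    FORMAT (HARDCOVER | PAPERBACK) "PAPERBACK">

<!-- Form 2: the same attributes, one ATTLIST declaration each -->
<!ATTLIST BOOK ISBN   CDATA                   #REQUIRED>
<!ATTLIST BOOK FORMAT (HARDCOVER | PAPERBACK) "PAPERBACK">
```

A DTD would normally use one form or the other. If the same attribute is declared more than once for an element, XML 1.0 treats the first declaration as binding and ignores the later ones.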
4.4.1 Attribute Types
Ten kinds of attribute types are available in XML 1.0. As shown in Table 4-4, an attribute type identifies the kind of content allowed for an attribute. During validation, the processor examines the attribute values in the document to determine whether they conform to the requirements of the type declared for each attribute.
Table 4-4. Attribute Types
|Type ||Attribute Value and Meaning |
|CDATA ||The value is character data consisting of any string of legal XML characters. CDATA is text that is not markup; it cannot contain a literal ampersand ("&") or less-than sign ("<"), nor the quotation mark that delimits the value. Use the entity references &amp;amp;, &amp;lt;, and &amp;quot; to include those characters. |
|ENUMERATED ||A value from a list of possible values delimited by the vertical bar symbol. The value must appear in an enumerated list. The document author chooses only one such value. The keyword ENUMERATED is not actually used. |
|ID ||The value is a unique ID for the element such that no other ID type attribute in the document shares this value. An ID attribute must be declared #REQUIRED or #IMPLIED; it can never have a literal or fixed default value. If an element has multiple attributes, only one can be of type ID. The attribute value for type ID must be a valid XML name. |
|IDREF ||The value is the ID of another element. It specifies that the value of one attribute refers to an element found elsewhere in the document, where the value of the IDREF is the ID value of the referenced element. |
|IDREFS ||A list of tokens separated by white space, each of which is an IDREF. |
|ENTITY ||The value is the name of an external unparsed entity declared in the DTD. An image is a typical use of an ENTITY attribute: the attribute names an entity whose binary data is available from another URL. |
|ENTITIES ||Same as ENTITY except that the value is a list of entity names separated by white space. Each name in the list must match an external unparsed entity declared in the DTD. |
|NMTOKEN ||Restricts the value of the attribute to a valid XML name token. The attribute value may contain only letters, digits, periods, hyphens, underscores, colons, combining characters, and extenders. No white space or other characters can appear. NMTOKEN can be useful when you need to map an attribute value to a name that isn't part of XML but does meet the requirements for XML name tokens. |
|NMTOKENS ||The value is a list of valid XML name tokens. This attribute is less common than, but similar to, NMTOKEN, except that multiple name values can appear; these name values must be separated from each other by white space. |
|NOTATION ||The value is the name of a notation declared in the DTD. Use this type when the application should interpret the element according to the named notation. The keyword NOTATION must be followed by a parenthesized list of notation names separated by vertical bars, and the attribute value must be one of those names. |
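To illustrate the ENTITY type from Table 4-4, here is a small sketch of an attribute that names an external unparsed entity. The element names, the entity name, the notation identifier, and the file path are all hypothetical and exist only for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE CATALOG [
  <!ELEMENT CATALOG (COVER*)>
  <!ELEMENT COVER EMPTY>
  <!-- Notation that names the format of the external data -->
  <!NOTATION GIF SYSTEM "image/gif">
  <!-- External unparsed entity: NDATA ties it to the GIF notation -->
  <!ENTITY cover1 SYSTEM "images/cover1.gif" NDATA GIF>
  <!-- IMAGE must hold the name of a declared unparsed entity -->
  <!ATTLIST COVER IMAGE ENTITY #REQUIRED>
]>
<CATALOG>
  <COVER IMAGE="cover1"/>
</CATALOG>
```

Note that the attribute value is the bare entity name (cover1), not an entity reference such as &amp;cover1;. The parser does not read the image itself; it passes the entity's identifiers and notation to the application.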
The ID attribute is very convenient for labeling elements but inconvenient in that it requires a DTD and a validating processor. This requirement conflicts with one of the goals of XML, namely that documents be usable without a DTD. The flexibility of being able to name the ID attribute for an element anything you want is more or less useless. Given that at most one ID attribute can exist for any element, it would have been better to just pick a fixed name or use some unique syntax for it, such as a double equals sign followed by the ID label; then you would not need a DTD: "<element =="tag123" OtherAttribute="foo"/>".
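As a concrete illustration of the ID, IDREF, and IDREFS types, the following sketch links books to their authors. All element names, attribute names, and values are hypothetical. The attribute holding the ID is named KEY here only to show that the name is up to the DTD author.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE LIBRARY [
  <!ELEMENT LIBRARY (AUTHOR | BOOK)*>
  <!ELEMENT AUTHOR (#PCDATA)>
  <!ELEMENT BOOK (#PCDATA)>
  <!-- Each AUTHOR carries a label that is unique within the document -->
  <!ATTLIST AUTHOR KEY ID #REQUIRED>
  <!-- Each token in WRITTEN_BY must match the ID of some element -->
  <!ATTLIST BOOK WRITTEN_BY IDREFS #REQUIRED>
]>
<LIBRARY>
  <AUTHOR KEY="a1">First Author</AUTHOR>
  <AUTHOR KEY="a2">Second Author</AUTHOR>
  <BOOK WRITTEN_BY="a1">A Solo Work</BOOK>
  <BOOK WRITTEN_BY="a1 a2">A Joint Work</BOOK>
</LIBRARY>
```

A validating processor would report an error if two KEY values were identical or if a WRITTEN_BY token matched no ID in the document. A non-validating processor is not required to perform either check.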
Enumerated Attribute Type
Example 4-3 shows a DTD that includes an enumerated attribute type. "FICTION" is the default attribute value in this example.
Example 4-3 Enumerated attribute
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE LIBRARY_DEPARTMENTS [
  <!ELEMENT LIBRARY_DEPARTMENTS ANY>
  <!ELEMENT CATAGORY EMPTY>
  <!ATTLIST CATAGORY TYPE (FICTION | BIOGRAPHY | HISTORY | PHILOSOPHY) "FICTION">
]>
<LIBRARY_DEPARTMENTS>
  <CATAGORY TYPE="BIOGRAPHY"/>
  <CATAGORY TYPE="HISTORY"/>
  <CATAGORY/>
</LIBRARY_DEPARTMENTS>
Attribute names are case sensitive. Document authors often declare multiple attributes for a single element. Attributes can hold only simple strings and must appear in the start tag (or empty-element tag) for an element. Note the following:
You may not include attributes in end tags.
You must surround attribute values with quotes (single or double).
You must begin an attribute name with a letter or an underscore (a colon is also permitted but is reserved for namespace use).
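The rules above can be illustrated with a short fragment. The element and attribute names are hypothetical. The malformed variants are shown inside a comment so that the listing as a whole remains well formed.

```xml
<!-- Well formed: values quoted with either quote style, attributes in a
     start tag or an empty-element tag, and two attribute names that
     differ only in case (so they are distinct attributes) -->
<BOOK title="Walden" Title='A second, distinct attribute'>
  <COVER color="green"/>
</BOOK>

<!-- Not well formed:
     <BOOK title=Walden>        the value is not quoted
     <BOOK 1st="yes">           the name begins with a digit
     </BOOK title="Walden">     an attribute appears in an end tag
-->
```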
Document authors can attach a special attribute named xml:space to an element to signal their intention that the application should preserve the white space in that element or handle it using the application default. For a document containing this attribute to be valid, its author must declare the attribute in the DTD. Declare it as an enumerated type whose values are one or both of "default" and "preserve".
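A declaration of xml:space might look like the following sketch. The POEM element and its content are hypothetical, and the choice of "preserve" as the default value is an assumption made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE POEM [
  <!ELEMENT POEM (#PCDATA)>
  <!-- xml:space as an enumerated type; "preserve" is the default here -->
  <!ATTLIST POEM xml:space (default | preserve) "preserve">
]>
<POEM xml:space="preserve">
  Line one
      indented line two
</POEM>
```

The attribute only signals intent. The parser passes all white space in the content to the application either way, and it is the application that decides how to honor the setting.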
4.4.2 Attribute Defaults
Specifying a default value for an attribute ensures that the attribute will receive a value even if the XML document didn't include it. Four types of attribute defaults exist, as shown in Table 4-5. The syntax for attribute defaults has the following format:
<!ATTLIST element-name attribute-name attribute-type "default-value">
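As a sketch of how the defaults look in practice, the following fragment declares one attribute for each of the four kinds of default that XML 1.0 defines: #REQUIRED, #IMPLIED, #FIXED with a value, and a plain default value. The BOOK element and its attribute names and values are hypothetical.

```xml
<!-- ISBN:     #REQUIRED, the document must supply a value
     SUBTITLE: #IMPLIED, the attribute is optional and has no default
     LANGUAGE: #FIXED, the value is always "en" and cannot be overridden
     FORMAT:   plain default, "PAPERBACK" is used when the attribute is omitted -->
<!ATTLIST BOOK
    ISBN     CDATA                   #REQUIRED
    SUBTITLE CDATA                   #IMPLIED
    LANGUAGE CDATA                   #FIXED "en"
    FORMAT   (HARDCOVER | PAPERBACK) "PAPERBACK">
```

Under these declarations, <BOOK ISBN="0000000000"/> would be treated as if it also carried LANGUAGE="en" and FORMAT="PAPERBACK", while SUBTITLE would simply be absent.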