Chapter 4. DTDs: Entities and Attributes

CONTENTS
  •  Entities
  •  Attributes
  •  Creating Internal General Entities
  •  Creating External General Entities
  •  Building a Document from Pieces
  •  Predefined General Entity References
  •  Creating Internal Parameter Entities
  •  External Parameter Entities
  •  Using INCLUDE and IGNORE
  •  All About Attributes
  •  Embedding Non-XML Data in a Document
  •  Embedding Multiple Unparsed Entities in a Document

In Chapter 3, "Valid XML Documents: Creating Document Type Definitions," I discussed creating DTDs and declaring the elements that you use in XML documents. But there's more to DTDs than that: you can also declare attributes and entities, and we're going to do that in this chapter. We'll also take a look at embedding non-XML data in XML documents.

Entities

In the previous chapter, we received an introduction to the idea of entities in XML documents. Two kinds of entities exist: general entities and parameter entities. General entities are probably used by more XML authors because you use them in the content of your XML document, but parameter entities, which you use in a document's DTD, are also available and are very powerful.

So what exactly is an entity? An entity is simply XML's way of referring to a data item; entities are usually text, but they can also be binary data. You declare an entity in a DTD and then refer to it by reference in your document. General entity references start with & and end with ;, whereas parameter entity references start with % and end with ;. For text entities, the entity reference is replaced by the entity itself when parsed by an XML processor.

In other words, you declare an entity in the DTD and refer to it with an entity reference, either in the document's content for general entities, or in the DTD for parameter entities.

Entities can be internal or external. An internal entity is defined completely inside the XML document that references it (and, in fact, the document itself is considered an entity in XML). External entities, on the other hand, derive their content from an external source, such as a file, and a reference to them usually includes a URI at which they may be found. Entities can also be parsed or unparsed. The content of parsed entities is well-formed XML text, and unparsed entities hold data that you don't want parsed, such as simple text or binary data. We'll see how to deal with all kinds of entities here.

In fact, we've already seen the five predefined general entity references in XML: &lt;, &gt;, &amp;, &quot;, and &apos;. They stand for the characters <, >, &, ", and ' respectively. Because these entities are predefined in XML, you don't need to define them in a DTD; for example, here's a document that uses all five predefined entity references:

<?xml version = "1.0" standalone="yes"?>
<TEXT>
    This text about the &quot;S&amp;O Railroad&quot;
    is the &lt;TEXT&gt; element&apos;s content.
</TEXT>

Each entity reference is replaced by the appropriate character when parsed by an XML processor. For example, Figure 4.1 shows this document open in Internet Explorer. Notice that every entity reference has indeed been replaced.

Figure 4.1. Using the predefined entities in Internet Explorer.

The five predefined entity references are very useful when you want to use as text the specific characters that are interpreted as markup.

You can also define your own entities by declaring them in a DTD. To declare an entity, you use an <!ENTITY> declaration (just as you use an <!ELEMENT> declaration to declare an element). Declaring a general entity looks like this:

<!ENTITY NAME DEFINITION>

Here, NAME is the entity's name and DEFINITION is its definition. The name of the entity is just the name that you want to use to refer to the entity, and the entity's definition can take several different forms, as we'll see in this chapter.

The simplest possible entity definition is just the text with which you want a reference to that entity to be replaced. Here's an example showing how that looks. In this case, I'm defining a general entity named TODAY to hold a date, October 15, 2001, in this DTD:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY TODAY "October 15, 2001">
]>
    .
    .
    .

And that's all it takes. Now when I put a reference to this entity, &TODAY;, into the document, it will be replaced with the text October 15, 2001 by the XML processor:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY TODAY "October 15, 2001">
]>
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$14.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$1.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

Figure 4.2 shows the results of this document in Internet Explorer. Notice that the &TODAY; entity references have been replaced with the full text that we've specified.

Figure 4.2. Using user-defined entities in Internet Explorer.

Besides general entities as in this example, we'll also see parameter entities in this chapter, which are designed to be used in DTDs themselves. Declaring a parameter entity looks like this (notice the %):

<!ENTITY % NAME DEFINITION>

Besides setting up your own entities in XML, you can customize your document's elements by declaring attributes for those elements.

Attributes

We've already discussed attributes in some detail; they're those name/value pairs that you can use in start tags and empty tags to provide additional information for an element. Here's an example; in this case, I'm adding an attribute named TYPE to the <CUSTOMER> tag to indicate what type of customer a person is:

<CUSTOMER TYPE = "excellent">
    <NAME>
        <LAST_NAME>Smith</LAST_NAME>
        <FIRST_NAME>Sam</FIRST_NAME>
    </NAME>
    <DATE>October 15, 2001</DATE>
    .
    .
    .

You can use attributes like this one and assign them values in XML documents, but unless you also declare them, your document won't be valid. You can declare a list of attributes for an element with an <!ATTLIST> declaration in the DTD. Here's the general form of an <!ATTLIST> declaration:

<!ATTLIST ELEMENT_NAME
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE
    .
    .
    .
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE>

In this case, ELEMENT_NAME is the name of the element that you're declaring attributes for, ATTRIBUTE_NAME is the name of an attribute that you're declaring, TYPE is the attribute's type, and DEFAULT_VALUE specifies its default value. As we'll see in this chapter, DEFAULT_VALUE can take several forms.

Here's an example in which I'll declare the TYPE attribute that we used previously. In this case, I'll use the simplest kind of declaration, making the attribute's type CDATA, which is simple character data, and using an #IMPLIED default value, which means that you can use this attribute in an element or skip it entirely. This is what the document looks like, including the DTD:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    TYPE CDATA #IMPLIED>
]>
<DOCUMENT>
    <CUSTOMER TYPE = "excellent">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>
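As a further sketch of the general <!ATTLIST> form, a single declaration can list more than one attribute for the same element. The ID_NUMBER and REGION attribute names below are hypothetical (they are not part of this chapter's running example), and the <CUSTOMER> content model is simplified to plain text to keep the example short. All three attributes use the CDATA type and the #IMPLIED default, so each may be supplied or omitted:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!-- Simplified content model, assumed for this sketch only -->
<!ELEMENT CUSTOMER (#PCDATA)>
<!-- One ATTLIST declaring three optional attributes -->
<!ATTLIST CUSTOMER
    TYPE CDATA #IMPLIED
    ID_NUMBER CDATA #IMPLIED
    REGION CDATA #IMPLIED>
]>
<DOCUMENT>
    <!-- All three attributes supplied -->
    <CUSTOMER TYPE = "excellent" ID_NUMBER = "1024" REGION = "east">Sam Smith</CUSTOMER>
    <!-- Only TYPE supplied; still valid because the others are #IMPLIED -->
    <CUSTOMER TYPE = "good">Polly Jones</CUSTOMER>
</DOCUMENT>
```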

That introduces us to the idea of declaring attributes in DTDs. I'll get into the details on entities and attributes now, starting with entities: first general entities and then parameter entities.

Creating Internal General Entities

As discussed at the beginning of the chapter, entities can either be internal or external. We've already seen how to create an internal general entity in this chapter, when we declared one named TODAY and referenced it as &TODAY; in the document:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY TODAY "October 15, 2001">
]>
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

Refer to Figure 4.2 to see the results.

There are a few things to note here; one is that you can nest general entity references inside entity definitions, like this:

<!ENTITY NAME "Alfred Hitchcock">
<!ENTITY SIGNATURE "&NAME; 14 Mystery Drive">
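To see the nesting at work, here is a minimal sketch of a complete document built around those two declarations. The <LETTER> element is a hypothetical name chosen for illustration. When the processor expands &SIGNATURE;, it also expands the &NAME; reference inside it:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE LETTER [
<!ELEMENT LETTER (#PCDATA)>
<!ENTITY NAME "Alfred Hitchcock">
<!-- The definition of SIGNATURE refers to NAME -->
<!ENTITY SIGNATURE "&NAME; 14 Mystery Drive">
]>
<!-- After parsing, the content reads:
     Yours truly, Alfred Hitchcock 14 Mystery Drive -->
<LETTER>Yours truly, &SIGNATURE;</LETTER>
```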

Another point is that entity references can't be circular, or you'll drive the XML processor crazy. Here's an example:

<!ENTITY NAME "Alfred Hitchcock &SIGNATURE;">
<!ENTITY SIGNATURE "&NAME; 14 Mystery Drive">

In this case, when the XML processor tries to resolve the &NAME; reference, it finds that it needs to substitute the text for the SIGNATURE entity in the text for the NAME entity, but the SIGNATURE entity needs the NAME entity's text, and so on, around in a circle that never ends. The result is that circular entity references are illegal in XML: a document that contains one is not even well-formed, let alone valid.

Also, it's worth noting that you can't use general entity references to insert text that is supposed to be used only in the DTD, not in the document content itself. Here's an example of something that's considered illegal:

<!ENTITY TAGS "(NAME,DATE,ORDERS)">
<!ELEMENT CUSTOMER &TAGS;>

The correct way to do this is with parameter entities, not general entities, and I'll cover them in a few pages. You can use general entities in the DTD to insert text that will become part of the document body, however.
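One place where a general entity reference is legal inside the DTD is an attribute's default value, because that text ends up in the document itself. The sketch below assumes that a quoted string is one of the DEFAULT_VALUE forms (a value that the processor supplies when the attribute is omitted); the RATING entity name and the simplified <CUSTOMER> content model are hypothetical:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!-- Simplified content model, assumed for this sketch only -->
<!ELEMENT CUSTOMER (#PCDATA)>
<!-- The entity must be declared before the default value that uses it -->
<!ENTITY RATING "excellent">
<!-- Legal: the reference sits inside a default value, which becomes
     part of the document, not part of the declaration syntax -->
<!ATTLIST CUSTOMER
    TYPE CDATA "&RATING;">
]>
<DOCUMENT>
    <!-- No TYPE given, so the processor supplies TYPE="excellent" -->
    <CUSTOMER>Sam Smith</CUSTOMER>
</DOCUMENT>
```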

Creating External General Entities

Besides internal entities, entities can also be external, which means that you should provide a URI directing the XML processor to the entity. You can use references to external entities to embed those entities in your document. As we'll see near the end of this chapter, you can also indicate that an external entity should not be parsed, which means that you can associate binary data with a document (much like associating images with an HTML document).

External entities can be simple strings of text, they can be entire documents, or they can be sections of documents. All that matters is that when they are inserted into the document's content, the XML processor is satisfied that the document is well-formed and valid.

As with DTDs, you can declare external entities using the SYSTEM or PUBLIC keywords. Entities declared with the SYSTEM keyword are for private use by an organization or individuals, and entities declared with PUBLIC are public, so they need a formal public identifier (FPI; see Chapter 3 for the rules on creating FPIs). Here's how you use the SYSTEM and PUBLIC keywords to declare an external entity:

<!ENTITY NAME SYSTEM URI>
<!ENTITY NAME PUBLIC FPI URI>

For example, say that you've stored a date as the text October 15, 2001 in a file named date.xml. Here's how you could set up an entity named TODAY connected to that file:

<!ENTITY TODAY SYSTEM "date.xml">

And here's how you could use a reference to that entity to insert the data into a document's content (notice that I changed the value of the standalone attribute from "yes" to "no" here because we're working with an external entity):

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY TODAY SYSTEM "date.xml">
]>
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$14.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$1.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

Notice how powerful this technique is: now you can create documents that are themselves pieced together from other documents. If you wanted to use a public entity instead of a private one, you could use the PUBLIC keyword with an FPI, like this:

<!ENTITY TODAY PUBLIC "-//starpowder//Custom Entity Version 1.0//EN" "date.xml">

Defining external entities makes them available for multiple documents. This is useful, for example, in case you want to have the same text appear as a signature in all your documents, or you work with text that will change frequently (such as a greeting for the day) that you want to edit in only one place.

Here's another note: Often nonvalidating XML processors (such as Internet Explorer) will read a DTD to pick up any entity declarations that you may have put there, even though they don't use the DTD to validate the document. This means that XML authors sometimes even add partial DTDs to documents that would not be considered valid so that they can use entity references (this is just an expedient programming practice, not a good one). Here's an example (note that this DTD is not complete by any means and that it carries only the declaration for the entity TODAY):

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ENTITY TODAY SYSTEM "date.xml">
]>
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>&TODAY;</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

Building a Document from Pieces

One way to use external general entities is to build a document from pieces, in which you treat each piece as a general entity. Here's an example; in this case, I'm including an entity that refers to the file data.xml in my document:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY data SYSTEM "data.xml">
]>
<DOCUMENT>
    &data;
</DOCUMENT>

The file data.xml itself holds the actual data for the document:

<CUSTOMER>
    <NAME>
        <LAST_NAME>Smith</LAST_NAME>
        <FIRST_NAME>Sam</FIRST_NAME>
    </NAME>
    <DATE>October 15, 2001</DATE>
    <ORDERS>
        <ITEM>
            <PRODUCT>Tomatoes</PRODUCT>
            <NUMBER>8</NUMBER>
            <PRICE>$1.25</PRICE>
        </ITEM>
        <ITEM>
            <PRODUCT>Oranges</PRODUCT>
            <NUMBER>24</NUMBER>
            <PRICE>$4.98</PRICE>
        </ITEM>
    </ORDERS>
</CUSTOMER>
<CUSTOMER>
    <NAME>
        <LAST_NAME>Jones</LAST_NAME>
        <FIRST_NAME>Polly</FIRST_NAME>
    </NAME>
    <DATE>October 20, 2001</DATE>
    <ORDERS>
        <ITEM>
            <PRODUCT>Bread</PRODUCT>
            <NUMBER>12</NUMBER>
            <PRICE>$14.95</PRICE>
        </ITEM>
        <ITEM>
            <PRODUCT>Apples</PRODUCT>
            <NUMBER>6</NUMBER>
            <PRICE>$1.50</PRICE>
        </ITEM>
    </ORDERS>
</CUSTOMER>
<CUSTOMER>
    <NAME>
        <LAST_NAME>Weber</LAST_NAME>
        <FIRST_NAME>Bill</FIRST_NAME>
    </NAME>
    <DATE>October 25, 2001</DATE>
    <ORDERS>
        <ITEM>
            <PRODUCT>Asparagus</PRODUCT>
            <NUMBER>12</NUMBER>
            <PRICE>$2.95</PRICE>
        </ITEM>
        <ITEM>
            <PRODUCT>Lettuce</PRODUCT>
            <NUMBER>6</NUMBER>
            <PRICE>$11.50</PRICE>
        </ITEM>
    </ORDERS>
</CUSTOMER>

In this way, you can put documents together from various pieces, choosing the pieces that you want.
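To sketch the idea of choosing pieces, the variation below assumes that the customer records are split across two hypothetical files, smith.xml and jones.xml, each holding one complete <CUSTOMER> element. Adding or removing a reference in the document body selects which pieces are assembled:

```xml
<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!-- One external general entity per piece (file names are assumptions) -->
<!ENTITY smith SYSTEM "smith.xml">
<!ENTITY jones SYSTEM "jones.xml">
]>
<DOCUMENT>
    &smith;
    &jones;
</DOCUMENT>
```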

Predefined General Entity References

As we already know, there are five predefined entity references in XML, and they stand for characters that can be interpreted as markup or other control characters:

  • &amp; becomes the & character

  • &apos; becomes the ' character

  • &gt; becomes the > character

  • &lt; becomes the < character

  • &quot; becomes the " character

It turns out that you can create entity references for individual characters yourself in XML. All you have to do is to specify the character's code in a character reference. Character references always use Unicode code points, no matter what encoding the document is stored in. For example, the character code for @ is 64, which you write as &#64; (a plain # introduces a decimal value; to give the value in hexadecimal, you use &#x instead, as in &#x40;), so you can define an entity named, say, at_new, so that references to at_new will be replaced by @ when parsed. Here's how that entity would look:

<!ENTITY at_new "&#64;">
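As a small illustration, a character reference can be written in decimal (&#64;) or in hexadecimal (&#x40;, with an x after the #). The sketch below defines both forms; the at_hex entity name and the e-mail address are made up for the example:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE TEXT [
<!ELEMENT TEXT (#PCDATA)>
<!-- Decimal form: 64 is the code for @ -->
<!ENTITY at_new "&#64;">
<!-- Hexadecimal form of the same code: 40 in base 16 equals 64 -->
<!ENTITY at_hex "&#x40;">
]>
<!-- Both references expand to @, so the address appears twice, identically -->
<TEXT>sam&at_new;example.com sam&at_hex;example.com</TEXT>
```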

In fact, you can even define the predefined entity references yourself, in case you run across an XML processor that doesn't understand them. Here's how I modify the example document at the beginning of this chapter that uses those entity references; this time, I define the entities myself:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE TEXT [
<!ENTITY amp_new "&#38;#38;">
<!ENTITY apos_new "&#39;">
<!ENTITY gt_new "&#62;">
<!ENTITY lt_new "&#38;#60;">
<!ENTITY quot_new "&#34;">
]>
<TEXT>
    This text about the &quot_new;S&amp_new;O Railroad&quot_new;
    is the &lt_new;TEXT&gt_new; element&apos_new;s content.
</TEXT>

Creating Internal Parameter Entities

As we've seen, you use general entity references in documents so that the XML processor will replace them with the entity to which they refer. However, you can use general entities only in a limited way in DTDs; that is, you can use them to insert text that will itself be inserted into the document content, but you can't use them to work with the declarations themselves in the DTD.

To actually work with element and attribute declarations, you use parameter entities. Parameter entity references can be used only in the DTD. In fact, there's an additional restriction: any parameter entity reference that appears inside another DTD declaration must be in the DTD's external subset (the part of the DTD that is stored outside the document). You can use parameter entities in the internal subset, but only in a limited way, as we'll see.

Unlike general entity references, parameter entity references start with %, not &. Creating a parameter entity is just like creating a general entity, except that you include a % in the <!ENTITY> declaration, like this:

<!ENTITY % NAME DEFINITION>

You can also declare external parameter entities using the SYSTEM and PUBLIC keywords, like this (where FPI stands for a formal public identifier):

<!ENTITY % NAME SYSTEM URI>
<!ENTITY % NAME PUBLIC FPI URI>

Here's an example using an internal parameter entity; in this case, I'll declare a parameter entity named BR that stands for the text <!ELEMENT BR EMPTY> inside this DTD:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ENTITY % BR "<!ELEMENT BR EMPTY>">
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
]>
    .
    .
    .

Now I can reference that parameter entity this way to include the element declaration <!ELEMENT BR EMPTY> in the DTD:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ENTITY % BR "<!ELEMENT BR EMPTY>">
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
%BR;
]>
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

Notice that I haven't really saved much time here; I might as well have just put the declaration <!ELEMENT BR EMPTY> directly into the DTD. On the other hand, you can't do much more with internal parameter entities (those that are defined in the DTD's internal subset) because you can't use them inside any other declarations. If you want to find out what people really use parameter entities for, we have to take a look at external parameter entities.
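To make the restriction concrete, here is a sketch of what does not work. The hypothetical record entity below is referenced inside an <!ELEMENT> declaration in the internal subset, which the XML specification does not allow, so a conforming processor should reject the document as not well-formed:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ENTITY % record "(NAME,DATE,ORDERS)">
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!-- Not allowed here: a parameter entity reference inside another
     declaration in the internal subset -->
<!ELEMENT CUSTOMER %record;>
]>
<DOCUMENT/>
```

The same two declarations are legal when they are placed in an external DTD file instead.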

External Parameter Entities

When you use a parameter entity in the DTD's external subset, you can reference that entity anywhere in the DTD, including in element declarations. Here's an example; in this case, I'm using an external DTD named order.dtd for this document:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT SYSTEM "order.dtd">
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

In the external DTD subset, order.dtd, I'm going to set things up so that the <DOCUMENT> element can contain not only <CUSTOMER> elements, but also <BUYER> and <DISCOUNTER> elements. Each of these two new elements, <BUYER> and <DISCOUNTER>, has the same content model as the <CUSTOMER> element (that is, they can contain <NAME>, <DATE>, and <ORDERS> elements), so to save a little time, I'll assign that content model, (NAME,DATE,ORDERS), to a parameter entity named record:

<!ENTITY % record "(NAME,DATE,ORDERS)">
<!ELEMENT DOCUMENT (CUSTOMER | BUYER | DISCOUNTER)*>
    .
    .
    .

Now I'm free to refer to the record parameter entity where I like; in this case, that means using it to declare the <CUSTOMER>, <BUYER>, and <DISCOUNTER> elements:

<!ENTITY % record "(NAME,DATE,ORDERS)">
<!ELEMENT DOCUMENT (CUSTOMER | BUYER | DISCOUNTER)*>
<!ELEMENT CUSTOMER %record;>
<!ELEMENT BUYER %record;>
<!ELEMENT DISCOUNTER %record;>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>

Now the document works and parses as expected: I can use <CUSTOMER>, <BUYER>, and <DISCOUNTER> elements inside the <DOCUMENT> element, and all three of those elements have the same content model:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT SYSTEM "order.dtd">
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <BUYER>
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>October 20, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$14.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$1.50</PRICE>
            </ITEM>
        </ORDERS>
    </BUYER>
    <DISCOUNTER>
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>October 25, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </DISCOUNTER>
</DOCUMENT>

This example points out probably the biggest reason people use parameter entities: to handle text that's repeated often in element declarations in a DTD. In this case, I specified the content model of three elements using the same parameter entity, but I could just as easily have set up a parameter entity to let me specify an attribute list that was the same for as many elements as I like. In this way, you can control the declarations of many elements and attributes, even in a huge DTD. And if you need to modify a declaration, you need to modify only the parameter entity, not each declaration in detail.

For example, you might divide your attributes in a big DTD into various types. When you declare some new element, you might want to give it only the image-handling and URI-handling attributes, which you could do like this (in fact, this is the way the XHTML DTDs are built):

<!ATTLIST NEW_ELEMENT %image_attributes; %URI_attributes;>
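To make the declaration above concrete, here is a minimal sketch of how the two parameter entities might be defined. The attribute names (SRC, WIDTH, HEIGHT, HREF) and the EMPTY content model are assumptions chosen for illustration; they are not taken from the XHTML DTDs. Note that parameter entity references inside a markup declaration, as in the <!ATTLIST> here, are permitted only in the external DTD subset, so this fragment is meant to live in an external DTD file:

```xml
<!-- Hypothetical attribute groups; the attribute names are illustrative only. -->
<!ENTITY % image_attributes
    "SRC    CDATA #REQUIRED
     WIDTH  CDATA #IMPLIED
     HEIGHT CDATA #IMPLIED">

<!ENTITY % URI_attributes
    "HREF   CDATA #IMPLIED">

<!-- Both groups are expanded in place when the declaration is read. -->
<!ELEMENT NEW_ELEMENT EMPTY>
<!ATTLIST NEW_ELEMENT
    %image_attributes;
    %URI_attributes;>
```

With these definitions, adding an attribute to every image-handling element is a matter of editing the image_attributes entity once.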

Here's another example showing how to use parameter entities; in this case, I'm going to base my document on the XHTML 1.0 transitional DTD, adding a few elements of my own to XHTML. To do that, I declare the elements that I want to use and then simply include the entire XHTML 1.0 transitional DTD, using a parameter entity reference like this:

<!ENTITY % record "(NAME,DATE,ORDERS)">
<!ELEMENT DOCUMENT (CUSTOMER | BUYER | DISCOUNTER)*>
<!ELEMENT CUSTOMER %record;>
<!ELEMENT BUYER %record;>
<!ELEMENT DISCOUNTER %record;>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ENTITY % XHTML1-t.dtd PUBLIC
    "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
%XHTML1-t.dtd;

Using INCLUDE and IGNORE

Two important DTD directives are often used with parameter entities: INCLUDE and IGNORE. You use these directives to include or remove sections of a DTD; here's how you use them: <![ INCLUDE [DTD Section]]> and <![ IGNORE [DTD Section]]>. Using these directives, you can customize your DTD.

Here's an example showing what these two directives look like in practice:

<![ INCLUDE [
<!ELEMENT PRODUCT_ID (#PCDATA)>
<!ELEMENT SHIP_DATE (#PCDATA)>
<!ELEMENT SKU (#PCDATA)>
]]>

<![ IGNORE [
<!ELEMENT PRODUCT_ID (#PCDATA)>
<!ELEMENT SHIP_DATE (#PCDATA)>
<!ELEMENT SKU (#PCDATA)>
]]>

You might wonder what the big deal is here; after all, you can just use a comment to hide sections of a DTD. The usefulness of INCLUDE and IGNORE sections becomes more apparent when you use them together with parameter entities to parameterize DTDs. When you parameterize a DTD, you can include or ignore multiple sections of a DTD simply by changing the value of a parameter entity from IGNORE to INCLUDE or back again.

Here's an example; in this case, I'm going to let XML authors include or ignore sections of a DTD just by changing the value of a parameter entity named includer. To use a parameter entity in INCLUDE and IGNORE sections, you must work with the external DTD subset, so I'll set up an external DTD subset named order.dtd:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT SYSTEM "order.dtd">
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

Here's what order.dtd looks like; first I set up the includer parameter entity, setting it to the text "INCLUDE" by default:

<!ENTITY % includer "INCLUDE">
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>

Now I can use the value of this entity to set up an INCLUDE (or IGNORE) section in the DTD like this:

<!ENTITY % includer "INCLUDE">
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<![ %includer; [
<!ELEMENT PRODUCT_ID (#PCDATA)>
<!ELEMENT SHIP_DATE (#PCDATA)>
<!ELEMENT SKU (#PCDATA)>
]]>

At this point, you can include or ignore the indicated section of the DTD just by changing the value of the includer entity. Using a technique like this makes it easy to centralize the entities that you need to use to customize a whole DTD at one time.
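You don't necessarily have to edit order.dtd itself to flip the switch. The following is a sketch of one common approach, assuming the order.dtd shown above: a document redeclares includer in its internal DTD subset. The internal subset is read before the external subset, and when an entity is declared more than once the first declaration is the one that counts, so the value given here takes precedence over the default in order.dtd:

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE DOCUMENT SYSTEM "order.dtd" [
<!-- Read before order.dtd, so this declaration overrides the "INCLUDE" default there. -->
<!ENTITY % includer "IGNORE">
]>
<DOCUMENT></DOCUMENT>
```

For this document, the PRODUCT_ID, SHIP_DATE, and SKU declarations are skipped; a document without the override still gets them.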

In fact, that's the way the XHTML 1.1 DTD works; XHTML is expressly built to be modular to allow devices that can't handle full XHTML to support partial implementations. The main XHTML 1.1 DTD is actually a DTD driver, which means that it includes the various XHTML 1.1 modules using parameter entities. For example, here's how the XHTML 1.1 DTD includes the DTD module (that is, a section of a DTD) that supports HTML tables, xhtml11-table-1.mod; note that it declares a parameter entity corresponding to that module and then uses an entity reference to include the actual module:

<!-- Tables Module ............................................... -->
<!ENTITY % xhtml-table.mod
     PUBLIC "-//W3C//ELEMENTS XHTML 1.1 Tables 1.0//EN"
            "xhtml11-table-1.mod" >
%xhtml-table.mod;

However, not all devices that support XHTML might be capable of supporting tables (for example, cell phones or PDAs). So, the XHTML 1.1 DTD also defines a parameter entity named xhtml-table.module that's set to "INCLUDE" by default and includes the table module with an INCLUDE section like this:

<!-- Tables Module ............................................... -->
<!ENTITY % xhtml-table.module "INCLUDE" >
<![%xhtml-table.module;[
<!ENTITY % xhtml-table.mod
     PUBLIC "-//W3C//ELEMENTS XHTML 1.1 Tables 1.0//EN"
            "xhtml11-table-1.mod" >
%xhtml-table.mod;]]>

Now you can customize the XHTML 1.1 DTD by changing the value of xhtml-table.module to "IGNORE" to exclude support for tables. Because all the various XHTML 1.1 DTD modules are part of INCLUDE sections based on parameter entities like this, that DTD is considered fully parameterized.

All About Attributes

Attributes are name/value pairs that you can use in start tags and empty-element tags to supply additional information. We've already seen in Chapter 2, "Creating Well-Formed XML Documents," that you can set up attributes as easily in XML as in HTML. Here's an example showing several attributes:

<CUSTOMER LAST_NAME="Smith" FIRST_NAME="Sam"
    DATE="October 15, 2001" PURCHASE="Tomatoes"
    PRICE="$1.25" NUMBER="8" />

In this case, I'm indicating that the customer's last name is Smith; that his first name is Sam; that the date of the current purchase is October 15, 2001; and that Sam purchased eight tomatoes for a total cost of $1.25.

Because you can declare elements in DTDs, you might expect that you can declare attributes as well, and you'd be right. In fact, there's good support for attribute declarations in DTDs, and we'll take a look at how it works now.

Declaring Attributes in DTDs

Declaring attributes and their types is very useful in XML. If you want your document to be valid, you must declare any attributes you use before using them. You can give attributes default values and even require XML authors who use your DTD to assign values to attributes.

As we saw at the beginning of this chapter, you declare a list of attributes for an element with an <!ATTLIST> declaration:

<!ATTLIST ELEMENT_NAME
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE
    .
    .
    .
    ATTRIBUTE_NAME TYPE DEFAULT_VALUE>

In this case, ELEMENT_NAME is the name of the element that you're declaring attributes for, ATTRIBUTE_NAME is the name of an attribute that you're declaring, TYPE is the attribute's type, and DEFAULT_VALUE represents its default value.

Here are the possible TYPE values that you can use:

Type        Description
CDATA       Simple character data (that is, text that does not include any markup)
ENTITIES    Multiple entity names (which must be declared in the DTD), separated by whitespace
ENTITY      Names an entity (which must be declared in the DTD)
Enumerated  Represents a list of values; any one item from the list is an appropriate attribute value (and you must use one of the items from the list)
ID          A proper XML name that must be unique (that is, not shared by any other attribute of the ID type)
IDREF       Holds the value of an ID attribute of some element, usually another element to which the current element is related
IDREFS      Multiple IDs of elements, separated by whitespace
NMTOKEN     A name token: one or more XML name characters (such as letters, digits, periods, hyphens, underscores, and colons), with no whitespace
NMTOKENS    Multiple name tokens in a list, separated by whitespace
NOTATION    A notation name (which must be declared in the DTD)

I'll take a look at all these possibilities in this chapter.

Here are the possible DEFAULT_VALUE settings that you can use:

Setting       Description
VALUE         A simple text value, enclosed in quotes.
#IMPLIED      Indicates that there is no default value for this attribute, and this attribute need not be used.
#REQUIRED     Indicates that there is no default value, but that a value must be assigned to this attribute.
#FIXED VALUE  In this case, VALUE is the attribute's value, and the attribute must always have this value.
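As a quick side-by-side view of the four settings, here is a small sketch that uses each one in a single <!ATTLIST> declaration. The attribute names STATUS and NICKNAME, the default value "active", and the simplified (#PCDATA) content model are assumptions made up for this illustration:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!ATTLIST CUSTOMER
    STATUS   CDATA "active"
    NICKNAME CDATA #IMPLIED
    OWES     CDATA #REQUIRED
    LANGUAGE CDATA #FIXED "EN">
]>
<DOCUMENT>
    <!-- Only OWES has to be written out; STATUS and LANGUAGE fall back to their declared values. -->
    <CUSTOMER OWES="$0">Sam Smith</CUSTOMER>
</DOCUMENT>
```

Here the application would see STATUS="active" and LANGUAGE="EN" on the <CUSTOMER> element even though neither was typed in, while NICKNAME is simply absent.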

We saw a simple example at the beginning of this chapter; in this case, I declare a TYPE attribute of the CDATA type for the <CUSTOMER> element and indicate that this attribute can be used as the author prefers:

<?xml version = "1.0" standalone="yes"?> <!DOCTYPE DOCUMENT [ <!ELEMENT DOCUMENT (CUSTOMER)*> <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)> <!ELEMENT NAME (LAST_NAME,FIRST_NAME)> <!ELEMENT LAST_NAME (#PCDATA)> <!ELEMENT FIRST_NAME (#PCDATA)> <!ELEMENT DATE (#PCDATA)> <!ELEMENT ORDERS (ITEM)*> <!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)> <!ELEMENT PRODUCT (#PCDATA)> <!ELEMENT NUMBER (#PCDATA)> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST CUSTOMER     TYPE CDATA #IMPLIED> ]> <DOCUMENT>     <CUSTOMER TYPE = "excellent">         <NAME>             <LAST_NAME>Smith</LAST_NAME>             <FIRST_NAME>Sam</FIRST_NAME>         </NAME>         <DATE>October 15, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Tomatoes</PRODUCT>                 <NUMBER>8</NUMBER>                 <PRICE>$1.25</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Oranges</PRODUCT>                 <NUMBER>24</NUMBER>                 <PRICE>$4.98</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER TYPE = "lousy">         <NAME>             <LAST_NAME>Jones</LAST_NAME>             <FIRST_NAME>Polly</FIRST_NAME>         </NAME>         <DATE>October 20, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Bread</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$14.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Apples</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$1.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER TYPE="good">         <NAME>             <LAST_NAME>Weber</LAST_NAME>             <FIRST_NAME>Bill</FIRST_NAME>         </NAME>         <DATE>October 25, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Asparagus</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$2.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Lettuce</PRODUCT>                 
<NUMBER>6</NUMBER>                 <PRICE>$11.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER> </DOCUMENT>

This example shows how to declare a single attribute, but as its name implies, you can use <!ATTLIST> to declare an entire list of attributes for an element; here's an example where I declare the attributes OWES, LAYAWAY, and DEFAULTS for the <CUSTOMER> element all at once:

<?xml version = "1.0" standalone="yes"?> <!DOCTYPE DOCUMENT [ <!ELEMENT DOCUMENT (CUSTOMER)*> <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)> <!ELEMENT NAME (LAST_NAME,FIRST_NAME)> <!ELEMENT LAST_NAME (#PCDATA)> <!ELEMENT FIRST_NAME (#PCDATA)> <!ELEMENT DATE (#PCDATA)> <!ELEMENT ORDERS (ITEM)*> <!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)> <!ELEMENT PRODUCT (#PCDATA)> <!ELEMENT NUMBER (#PCDATA)> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST CUSTOMER     OWES CDATA "0"     LAYAWAY CDATA "0"     DEFAULTS CDATA "0"> ]> <DOCUMENT>     <CUSTOMER OWES="$12.13" LAYAWAY="$0" DEFAULTS="0">         <NAME>             <LAST_NAME>Smith</LAST_NAME>             <FIRST_NAME>Sam</FIRST_NAME>         </NAME>         <DATE>October 15, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Tomatoes</PRODUCT>                 <NUMBER>8</NUMBER>                 <PRICE>$1.25</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Oranges</PRODUCT>                 <NUMBER>24</NUMBER>                 <PRICE>$4.98</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER OWES="$132.69" LAYAWAY="$44.99" DEFAULTS="0">         <NAME>             <LAST_NAME>Jones</LAST_NAME>             <FIRST_NAME>Polly</FIRST_NAME>         </NAME>         <DATE>October 20, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Bread</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$14.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Apples</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$1.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER OWES="$0" LAYAWAY="$1.99" DEFAULTS="0">         <NAME>             <LAST_NAME>Weber</LAST_NAME>             <FIRST_NAME>Bill</FIRST_NAME>         </NAME>         <DATE>October 25, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Asparagus</PRODUCT>                 <NUMBER>12</NUMBER>                 
<PRICE>$2.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Lettuce</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$11.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER> </DOCUMENT>

Now that I've declared these attributes, the document is valid.

Setting Default Values for Attributes

I'm going to start the examination of attribute declarations in DTDs by looking at the kinds of default values you can specify for attributes.

Immediate Values

You can supply a default value for an attribute simply by giving that value in quotes in the attribute's declaration in the <!ATTLIST> declaration, as we've seen:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    OWES CDATA "0"
    LAYAWAY CDATA "0"
    DEFAULTS CDATA "0">
]>
    .
    .
    .

However, you can also use other keywords here, such as #REQUIRED.

#REQUIRED

When you use the #REQUIRED keyword as an attribute's default value, it means that you're actually not providing a default value, but that you're requiring anyone using this DTD to do so. Here's an example in which I'm requiring anyone who uses this DTD to supply the <CUSTOMER> element with an OWES attribute:

<?xml version = "1.0" standalone="yes"?> <!DOCTYPE DOCUMENT [ <!ELEMENT DOCUMENT (CUSTOMER)*> <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)> <!ELEMENT NAME (LAST_NAME,FIRST_NAME)> <!ELEMENT LAST_NAME (#PCDATA)> <!ELEMENT FIRST_NAME (#PCDATA)> <!ELEMENT DATE (#PCDATA)> <!ELEMENT ORDERS (ITEM)*> <!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)> <!ELEMENT PRODUCT (#PCDATA)> <!ELEMENT NUMBER (#PCDATA)> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST CUSTOMER     OWES CDATA #REQUIRED> ]> <DOCUMENT>     <CUSTOMER OWES="$0">         <NAME>             <LAST_NAME>Smith</LAST_NAME>             <FIRST_NAME>Sam</FIRST_NAME>         </NAME>         <DATE>October 15, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Tomatoes</PRODUCT>                 <NUMBER>8</NUMBER>                 <PRICE>$1.25</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Oranges</PRODUCT>                 <NUMBER>24</NUMBER>                 <PRICE>$4.98</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER OWES="$599.99">         <NAME>             <LAST_NAME>Jones</LAST_NAME>             <FIRST_NAME>Polly</FIRST_NAME>         </NAME>         <DATE>October 20, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Bread</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$14.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Apples</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$1.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER OWES="$29.99">         <NAME>             <LAST_NAME>Weber</LAST_NAME>             <FIRST_NAME>Bill</FIRST_NAME>         </NAME>         <DATE>October 25, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Asparagus</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$2.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Lettuce</PRODUCT>                 <NUMBER>6</NUMBER>  
               <PRICE>$11.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER> </DOCUMENT>

Requiring a value for an attribute is useful for those cases in which a document should be customized, as when you want to list the document author's name or email. It's also useful, of course, when the element needs more information, such as when you use a URI attribute for an element that displays an image or loads an applet.
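To see what #REQUIRED rules out, here is a deliberately cut-down sketch in which one element omits the required attribute. The (#PCDATA) content model for <CUSTOMER> is a simplification assumed for brevity; the second <CUSTOMER> element is intentionally invalid:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!ATTLIST CUSTOMER
    OWES CDATA #REQUIRED>
]>
<DOCUMENT>
    <!-- Valid: the required attribute is present. -->
    <CUSTOMER OWES="$0">Sam Smith</CUSTOMER>
    <!-- Not valid: OWES is missing, so a validating parser should report an error here. -->
    <CUSTOMER>Polly Jones</CUSTOMER>
</DOCUMENT>
```

The document is still well-formed, so a nonvalidating parser would typically read it without complaint; only validation catches the missing attribute.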

#IMPLIED

You use the #IMPLIED keyword when you don't have a default value for an attribute in mind, and you want to indicate that the document author doesn't even have to use this attribute at all. XML processors will know about this attribute and will not be disturbed if the attribute is not used. (Note that some XML processors will explicitly inform the underlying software application that no value is available for this attribute if no value is given.) The #IMPLIED keyword is the one to use when you want to allow the document author to include this attribute but not require it.

Here's an example in which I'm making the OWES attribute of the <CUSTOMER> element implied, which means that not every <CUSTOMER> element needs to use it:

<?xml version = "1.0" standalone="yes"?> <!DOCTYPE DOCUMENT [ <!ELEMENT DOCUMENT (CUSTOMER)*> <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)> <!ELEMENT NAME (LAST_NAME,FIRST_NAME)> <!ELEMENT LAST_NAME (#PCDATA)> <!ELEMENT FIRST_NAME (#PCDATA)> <!ELEMENT DATE (#PCDATA)> <!ELEMENT ORDERS (ITEM)*> <!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)> <!ELEMENT PRODUCT (#PCDATA)> <!ELEMENT NUMBER (#PCDATA)> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST CUSTOMER      OWES CDATA #IMPLIED> ]> <DOCUMENT>     <CUSTOMER OWES="$23.99">         <NAME>             <LAST_NAME>Smith</LAST_NAME>             <FIRST_NAME>Sam</FIRST_NAME>         </NAME>         <DATE>October 15, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Tomatoes</PRODUCT>                 <NUMBER>8</NUMBER>                 <PRICE>$1.25</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Oranges</PRODUCT>                 <NUMBER>24</NUMBER>                 <PRICE>$4.98</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER>         <NAME>             <LAST_NAME>Jones</LAST_NAME>             <FIRST_NAME>Polly</FIRST_NAME>         </NAME>         <DATE>October 20, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Bread</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$14.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Apples</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$1.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER>         <NAME>             <LAST_NAME>Weber</LAST_NAME>             <FIRST_NAME>Bill</FIRST_NAME>         </NAME>         <DATE>October 25, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Asparagus</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$2.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Lettuce</PRODUCT>                 <NUMBER>6</NUMBER>                 
<PRICE>$11.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER> </DOCUMENT>

It's very common to declare attributes as #IMPLIED because that means they either can appear in elements or not, as the document author prefers.

#FIXED

You can even set the value of an attribute so that it must always have that value. To do that, you use the #FIXED keyword, which sets a fixed value for the attribute, and then specify the value that you want the attribute to have.

Here's an example in which I'm setting the LANGUAGE attribute of the <CUSTOMER> elements to English, EN, and specifying that this is the only valid value for the attribute. This assumes that the underlying application can handle only English and thus needs data provided in that language:

<?xml version = "1.0" standalone="yes"?> <!DOCTYPE DOCUMENT [ <!ELEMENT DOCUMENT (CUSTOMER)*> <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)> <!ELEMENT NAME (LAST_NAME,FIRST_NAME)> <!ELEMENT LAST_NAME (#PCDATA)> <!ELEMENT FIRST_NAME (#PCDATA)> <!ELEMENT DATE (#PCDATA)> <!ELEMENT ORDERS (ITEM)*> <!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)> <!ELEMENT PRODUCT (#PCDATA)> <!ELEMENT NUMBER (#PCDATA)> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST CUSTOMER     LANGUAGE CDATA #FIXED "EN"> ]> <DOCUMENT>     <CUSTOMER>         <NAME>             <LAST_NAME>Smith</LAST_NAME>             <FIRST_NAME>Sam</FIRST_NAME>         </NAME>         <DATE>October 15, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Tomatoes</PRODUCT>                 <NUMBER>8</NUMBER>                 <PRICE>$1.25</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Oranges</PRODUCT>                 <NUMBER>24</NUMBER>                 <PRICE>$4.98</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER>         <NAME>             <LAST_NAME>Jones</LAST_NAME>             <FIRST_NAME>Polly</FIRST_NAME>         </NAME>         <DATE>October 20, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Bread</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$14.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Apples</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$1.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER>         <NAME>             <LAST_NAME>Weber</LAST_NAME>             <FIRST_NAME>Bill</FIRST_NAME>         </NAME>         <DATE>October 25, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Asparagus</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$2.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Lettuce</PRODUCT>                 <NUMBER>6</NUMBER>                 
<PRICE>$11.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER> </DOCUMENT>

Note that I didn't even use the LANGUAGE attribute in the <CUSTOMER> elements here; the XML processor passes that attribute and its value to the underlying application anyway because I've declared it #FIXED. If you do explicitly use this attribute, you must set its value to the value that you've fixed in the DTD, or a validating XML processor will generate an error.
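The three cases for a #FIXED attribute can be laid out in a short sketch. As before, the (#PCDATA) content model is a simplification assumed for brevity, and the last <CUSTOMER> element is intentionally invalid:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!ATTLIST CUSTOMER
    LANGUAGE CDATA #FIXED "EN">
]>
<DOCUMENT>
    <!-- Valid: attribute omitted; the processor supplies LANGUAGE="EN". -->
    <CUSTOMER>Sam Smith</CUSTOMER>
    <!-- Valid: attribute written out with exactly the fixed value. -->
    <CUSTOMER LANGUAGE="EN">Polly Jones</CUSTOMER>
    <!-- Not valid: the value differs from the fixed value. -->
    <CUSTOMER LANGUAGE="FR">Bill Weber</CUSTOMER>
</DOCUMENT>
```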

That covers the possible default value types that you can specify when declaring attributes; I'll take a look at the possible attribute types next.

Attribute Types

So far, I've used just the CDATA attribute type when declaring attributes and, in fact, that's probably the most common declaration type for attributes because it allows you to use simple text for the attribute's value. However, you can specify a number of different attribute types, and I'll take a look at them here. These types are not (not yet, anyway) detailed enough to indicate specific data types such as float, int, or double, but they can provide you with some ability to check the syntax of a document.

CDATA

The simplest attribute type that you can have is CDATA, which is simple character data. This means that the attribute may be set to a value that is any string of text, as long as the string does not contain markup. That requirement excludes any string that includes the characters < or &, as well as the quotation mark that delimits the value. If you want to use those characters, use their predefined entity references (&lt;, &amp;, and &quot;) instead because these entity references will be parsed and replaced with the corresponding characters. (Because these attribute values are parsed, you must be careful about including anything that looks like markup. Note that the term for this type is CDATA, not PCDATA, which is parsed character data.)
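As a small illustration of these escaping rules, the following sketch shows CDATA attribute values that contain the special characters. The NOTE attribute, its values, and the EMPTY content model are hypothetical:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER EMPTY>
<!ATTLIST CUSTOMER
    NOTE CDATA #IMPLIED>
]>
<DOCUMENT>
    <!-- The application sees the value: Smith & Sons <preferred> -->
    <CUSTOMER NOTE="Smith &amp; Sons &lt;preferred&gt;"/>
    <!-- A double quote inside a double-quoted value is written as an entity reference. -->
    <CUSTOMER NOTE="Nickname: &quot;Sam&quot;"/>
    <!-- Alternatively, delimit the value with single quotes. -->
    <CUSTOMER NOTE='Nickname: "Sam"'/>
</DOCUMENT>
```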

We've already seen a number of examples of attributes declared with the CDATA type, as here:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    OWES CDATA "0"
    LAYAWAY CDATA "0"
    DEFAULTS CDATA "0">
]>
    .
    .
    .

The CDATA type is the most general type of attribute; from here, we get into more specific types, such as the enumerated type.

Enumerated

The enumerated type does not use a keyword like the other attribute types do; instead, the enumerated type provides a list (or enumeration) of possible values. Each possible value must be a valid XML name token, that is, a string made up only of name characters such as letters, digits, periods, hyphens, underscores, and colons, with no whitespace.

Here's an example; in this case, I'm declaring an attribute named CREDIT_OK that can have only one of two possible values, "TRUE" or "FALSE", and that has the default value "TRUE":

<?xml version = "1.0" standalone="yes"?> <!DOCTYPE DOCUMENT [ <!ELEMENT DOCUMENT (CUSTOMER)*> <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)> <!ELEMENT NAME (LAST_NAME,FIRST_NAME)> <!ELEMENT LAST_NAME (#PCDATA)> <!ELEMENT FIRST_NAME (#PCDATA)> <!ELEMENT DATE (#PCDATA)> <!ELEMENT ORDERS (ITEM)*> <!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)> <!ELEMENT PRODUCT (#PCDATA)> <!ELEMENT NUMBER (#PCDATA)> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST CUSTOMER     CREDIT_OK (TRUE | FALSE) "TRUE"> ]> <DOCUMENT>     <CUSTOMER CREDIT_OK = "FALSE">         <NAME>             <LAST_NAME>Smith</LAST_NAME>             <FIRST_NAME>Sam</FIRST_NAME>         </NAME>         <DATE>October 15, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Tomatoes</PRODUCT>                 <NUMBER>8</NUMBER>                 <PRICE>$1.25</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Oranges</PRODUCT>                 <NUMBER>24</NUMBER>                 <PRICE>$4.98</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER>         <NAME>             <LAST_NAME>Jones</LAST_NAME>             <FIRST_NAME>Polly</FIRST_NAME>         </NAME>         <DATE>October 20, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Bread</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$14.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Apples</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$1.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER CREDIT_OK="TRUE">         <NAME>             <LAST_NAME>Weber</LAST_NAME>             <FIRST_NAME>Bill</FIRST_NAME>         </NAME>         <DATE>October 25, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Asparagus</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$2.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Lettuce</PRODUCT>                 
<NUMBER>6</NUMBER>                 <PRICE>$11.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER> </DOCUMENT>

Using enumerations like this is great if you want to set the possible range of values an attribute can take; for example, you might want to restrict an attribute named WEEKDAY to these possible values: "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", or "Saturday".
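The WEEKDAY restriction just described might be declared as in the following sketch. The <DELIVERY> element is a hypothetical name introduced only to carry the attribute, and the second <DELIVERY> element is intentionally invalid:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (DELIVERY)*>
<!ELEMENT DELIVERY EMPTY>
<!ATTLIST DELIVERY
    WEEKDAY (Sunday | Monday | Tuesday | Wednesday |
             Thursday | Friday | Saturday) #REQUIRED>
]>
<DOCUMENT>
    <!-- Valid: the value is one of the enumerated items. -->
    <DELIVERY WEEKDAY="Friday"/>
    <!-- Not valid: matching is case-sensitive, and "friday" is not in the list. -->
    <DELIVERY WEEKDAY="friday"/>
</DOCUMENT>
```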

NMTOKEN

Document authors also commonly use another attribute type: NMTOKEN. An attribute of this type can take only values that are XML name tokens; that is, the value must be made up entirely of name characters such as letters, digits, periods, hyphens, underscores, and colons. Unlike an XML name, a name token may begin with any of those characters, including a digit. In particular, note that NMTOKEN values cannot include whitespace.

Using NMTOKEN attribute values can be useful in some applications; note, for example, that name tokens that begin with a letter or underscore and avoid punctuation are very close to the names that are legal for variables in C++, Java, and JavaScript, which means that you could even use those values in underlying applications in fancy ways. NMTOKEN also means that attribute values must consist of a single word because whitespace of any kind is not allowed; that can be a useful restriction.

Here's an example; in this case, I'm declaring an attribute named SHIP_STATE to hold the two-letter state code to which an order was shipped. Declaring that attribute with NMTOKEN rules out the possibility of values that are longer than a single term:

<?xml version = "1.0" standalone="yes"?> <!DOCTYPE DOCUMENT [ <!ELEMENT DOCUMENT (CUSTOMER)*> <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)> <!ELEMENT NAME (LAST_NAME,FIRST_NAME)> <!ELEMENT LAST_NAME (#PCDATA)> <!ELEMENT FIRST_NAME (#PCDATA)> <!ELEMENT DATE (#PCDATA)> <!ELEMENT ORDERS (ITEM)*> <!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)> <!ELEMENT PRODUCT (#PCDATA)> <!ELEMENT NUMBER (#PCDATA)> <!ELEMENT PRICE (#PCDATA)> <!ATTLIST CUSTOMER     SHIP_STATE NMTOKEN #REQUIRED> ]> <DOCUMENT>     <CUSTOMER SHIP_STATE = "CA">         <NAME>             <LAST_NAME>Smith</LAST_NAME>             <FIRST_NAME>Sam</FIRST_NAME>         </NAME>         <DATE>October 15, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Tomatoes</PRODUCT>                 <NUMBER>8</NUMBER>                 <PRICE>$1.25</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Oranges</PRODUCT>                 <NUMBER>24</NUMBER>                 <PRICE>$4.98</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER SHIP_STATE = "LA">         <NAME>             <LAST_NAME>Jones</LAST_NAME>             <FIRST_NAME>Polly</FIRST_NAME>         </NAME>         <DATE>October 20, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Bread</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$14.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Apples</PRODUCT>                 <NUMBER>6</NUMBER>                 <PRICE>$1.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER>     <CUSTOMER SHIP_STATE = "MA">         <NAME>             <LAST_NAME>Weber</LAST_NAME>             <FIRST_NAME>Bill</FIRST_NAME>         </NAME>         <DATE>October 25, 2001</DATE>         <ORDERS>             <ITEM>                 <PRODUCT>Asparagus</PRODUCT>                 <NUMBER>12</NUMBER>                 <PRICE>$2.95</PRICE>             </ITEM>             <ITEM>                 <PRODUCT>Lettuce</PRODUCT>              
   <NUMBER>6</NUMBER>                 <PRICE>$11.50</PRICE>             </ITEM>         </ORDERS>     </CUSTOMER> </DOCUMENT>
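
For contrast, here is a short sketch of values that an NMTOKEN declaration accepts and rejects. The EMPTY content model is a simplification assumed for brevity, and the last two <CUSTOMER> elements are intentionally invalid:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER EMPTY>
<!ATTLIST CUSTOMER
    SHIP_STATE NMTOKEN #REQUIRED>
]>
<DOCUMENT>
    <!-- Valid: a single token. -->
    <CUSTOMER SHIP_STATE="CA"/>
    <!-- Not valid: the space splits the value into two tokens. -->
    <CUSTOMER SHIP_STATE="New York"/>
    <!-- Not valid: the dollar sign is not a name character. -->
    <CUSTOMER SHIP_STATE="$CA"/>
</DOCUMENT>
```

Note that NMTOKEN checks only the form of the value; it cannot confirm that "CA" is a real state code. An enumerated type would be needed for that.
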
NMTOKENS

You can even specify that an attribute value must be made up of NMTOKENs separated by whitespace if you use the NMTOKENS attribute type. For example, here I'm giving the attribute CONTACT_NAME the type NMTOKENS to allow attribute values to hold first and last names, separated by whitespace:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    CONTACT_NAME NMTOKENS #IMPLIED>
]>
<DOCUMENT>
    <CUSTOMER CONTACT_NAME = "George Starr">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER CONTACT_NAME = "Ringo Harrison">
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>October 20, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$14.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$1.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER CONTACT_NAME = "Paul Lennon">
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>October 25, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>
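
As a minimal sketch of what the NMTOKENS type accepts and rejects, here is a trimmed-down document. Reducing <CUSTOMER> to plain text content is an assumption made only to keep the sketch short; the attribute declaration is the same as the one used with the NMTOKENS type above. The value that should fail validation appears only inside a comment, so the document itself stays valid:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!ATTLIST CUSTOMER
    CONTACT_NAME NMTOKENS #IMPLIED>
]>
<DOCUMENT>
    <!-- Two name tokens separated by whitespace: valid -->
    <CUSTOMER CONTACT_NAME = "George Starr">Sam Smith</CUSTOMER>
    <!-- A single name token is also a valid NMTOKENS value -->
    <CUSTOMER CONTACT_NAME = "Ringo">Polly Jones</CUSTOMER>
    <!-- Not valid: a comma is not a name character, so
         CONTACT_NAME = "Lennon, Paul" should be rejected
         by a validating processor -->
    <CUSTOMER>Bill Weber</CUSTOMER>
</DOCUMENT>
```

The same restriction rules out other punctuation that often turns up in names, such as apostrophes, so NMTOKENS suits simple tokens better than free-form text.
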
ID

You can also declare another very important attribute type: ID. XML gives special meaning to an element's ID value because that's the value that applications typically use to identify elements. For that reason, validating XML processors are supposed to make sure that no two elements in a document have the same value for an attribute of type ID (and you can give an element only one attribute of this type). The actual value that you assign to an attribute of this type must be a proper XML name.

Applications can use the ID value of elements to uniquely identify those elements. Note that you don't have to name the attribute "ID", as you do in HTML, because simply declaring an attribute to be of the ID type makes it an ID attribute. Here's an example in which I add an ID attribute named CUSTOMER_ID to the <CUSTOMER> elements in this document:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    CUSTOMER_ID ID #REQUIRED>
]>
<DOCUMENT>
    <CUSTOMER CUSTOMER_ID = "C1232231">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER CUSTOMER_ID = "C1232232">
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>October 20, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$14.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$1.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER CUSTOMER_ID = "C1232233">
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>October 25, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

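Note that you cannot use the ID type with #FIXED attributes (because all #FIXED attributes have the same value), and for the same reason you cannot give an ID attribute a default value. An ID attribute must be declared as either #REQUIRED or #IMPLIED; you usually use the #REQUIRED keyword.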

Because ID values must be proper XML names, they can't be simple numbers like 12345; these values can't start with a digit.
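
To make these rules concrete, here is a minimal sketch. Reducing <CUSTOMER> to plain text content and the ID value _1232232 are assumptions made only for illustration. The values that should fail validation are shown in comments, so the document itself remains valid:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!ATTLIST CUSTOMER
    CUSTOMER_ID ID #REQUIRED>
]>
<DOCUMENT>
    <!-- Valid: starts with a letter and is unique in the document -->
    <CUSTOMER CUSTOMER_ID = "C1232231">Sam Smith</CUSTOMER>
    <!-- Valid: an underscore is also allowed as the first character -->
    <CUSTOMER CUSTOMER_ID = "_1232232">Polly Jones</CUSTOMER>
    <!-- Not valid: CUSTOMER_ID = "1232233" starts with a digit -->
    <!-- Not valid: CUSTOMER_ID = "C1232231" repeats an existing ID -->
</DOCUMENT>
```

A common workaround when your data uses purely numeric keys is to prefix each one with a letter, as the C in C1232231 does here.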

IDREF

The IDREF attribute type lets you use attributes to specify something about a document's structure; in particular, something about the relationships that exist between elements. An IDREF attribute holds the ID value of another element in the document, and a validating processor checks that the value really does match the ID of some element.

For example, say that you wanted to set up a parent-child relationship between elements that was not reflected in the normal nesting structure of the document. In that case, you could set an IDREF attribute of an element to the ID of its parent. An application could then check the attribute with the IDREF type to determine the child's parent.

Here's an example; in this case, I'm declaring two attributes, a CUSTOMER_ID attribute of type ID and an EMPLOYER_ID attribute of type IDREF that holds the ID value of the customer's employer:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    CUSTOMER_ID ID #REQUIRED
    EMPLOYER_ID IDREF #IMPLIED>
]>
<DOCUMENT>
    <CUSTOMER CUSTOMER_ID = "C1232231">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER CUSTOMER_ID = "C1232232" EMPLOYER_ID="C1232231">
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>October 20, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$14.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$1.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER CUSTOMER_ID = "C1232233">
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>October 25, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

An XML processor can pass on the ID and IDREF structure of a document to an underlying application, which can then use that information to reconstruct the relationships of the elements in the document.
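
One consequence worth seeing in a small sketch is that an IDREF value must match an ID that actually appears in the document. The trimmed-down <CUSTOMER> content and the value C9999999 are assumptions made only for illustration. The dangling reference is shown in a comment, so the document itself remains valid:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!ATTLIST CUSTOMER
    CUSTOMER_ID ID #REQUIRED
    EMPLOYER_ID IDREF #IMPLIED>
]>
<DOCUMENT>
    <CUSTOMER CUSTOMER_ID = "C1232231">Sam Smith</CUSTOMER>
    <!-- Valid: C1232231 is the ID of another element in this document -->
    <CUSTOMER CUSTOMER_ID = "C1232232" EMPLOYER_ID = "C1232231">Polly Jones</CUSTOMER>
    <!-- Not valid: EMPLOYER_ID = "C9999999" would match no ID value
         in the document, so a validating processor should report it -->
    <CUSTOMER CUSTOMER_ID = "C1232233">Bill Weber</CUSTOMER>
</DOCUMENT>
```

The referenced element may appear before or after the element that refers to it; what matters is that the ID exists somewhere in the same document.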

ENTITY

You can also specify that an attribute be of type ENTITY, which means that the attribute can be set to the name of an entity you've declared. For example, say that I declared an entity named SNAPSHOT1 that referred to an external image file. I could then create a new attribute named, say, IMAGE, that I could set to the entity name SNAPSHOT1. Here's how that looks:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    IMAGE ENTITY #IMPLIED>
<!ENTITY SNAPSHOT1 SYSTEM "image.gif">
]>
<DOCUMENT>
    <CUSTOMER IMAGE="SNAPSHOT1">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

This shows how to use the ENTITY attribute type, but it's not a complete example: a validating processor requires the value of an ENTITY attribute to be the name of an unparsed entity, and there's a specific way to declare entities that refer to external, non-XML data, which we'll see at the end of this chapter. In general, the ENTITY attribute type is a useful one if you declare your own entities; for example, you might want to declare entities named SIGNATURE_HOME, SIGNATURE_WORK, and so on that refer to files holding your name and home address, work address, and so on. If you then declare an attribute named, say, SIGNATURE of the ENTITY type, you can assign the SIGNATURE_HOME or SIGNATURE_WORK entity to the SIGNATURE attribute in the document.
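
Here is one way the SIGNATURE idea might look, as a sketch only. It uses the NDATA keyword and a notation declaration, both of which are covered later in this chapter. The TXT notation, the text/plain MIME type, the two file names, and the trimmed-down <CUSTOMER> content are all assumptions made for illustration:

```xml
<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!-- Hypothetical notation for plain-text signature files -->
<!NOTATION TXT SYSTEM "text/plain">
<!-- Hypothetical external files holding each signature -->
<!ENTITY SIGNATURE_HOME SYSTEM "signature_home.txt" NDATA TXT>
<!ENTITY SIGNATURE_WORK SYSTEM "signature_work.txt" NDATA TXT>
<!ATTLIST CUSTOMER
    SIGNATURE ENTITY #IMPLIED>
]>
<DOCUMENT>
    <CUSTOMER SIGNATURE = "SIGNATURE_HOME">Sam Smith</CUSTOMER>
    <CUSTOMER SIGNATURE = "SIGNATURE_WORK">Polly Jones</CUSTOMER>
</DOCUMENT>
```

Note that the attribute value is the bare entity name, not an entity reference such as &SIGNATURE_HOME;.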

ENTITIES

As with the NMTOKEN attribute type, which has a plural type, NMTOKENS, the ENTITY attribute type also has a plural type, ENTITIES. Attributes of this type can hold lists of entity names, separated by whitespace.

Here's an example; in this case, I'm declaring two entities, SNAPSHOT1 and SNAPSHOT2, and an attribute named IMAGES that you can assign both SNAPSHOT1 and SNAPSHOT2 to at the same time:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST CUSTOMER
    IMAGES ENTITIES #IMPLIED>
<!ENTITY SNAPSHOT1 SYSTEM "image.gif">
<!ENTITY SNAPSHOT2 SYSTEM "image2.gif">
]>
<DOCUMENT>
    <CUSTOMER IMAGES="SNAPSHOT1 SNAPSHOT2">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

As with the NMTOKENS attribute type, you use the plural ENTITIES type when you want to assign a number of entities to the same attribute. For example, you may have multiple entities defined that represent a customer's usernames and want to assign all of them to an attribute named USERNAMES. Because each entity can refer to an external resource of any size or complexity, this is one way to associate detailed data with a document simply using attributes.

NOTATION

The final attribute type is NOTATION. When you declare an attribute of this type, you can assign it values that have been declared as notations.

A notation specifies the format of non-XML data, and you use it to describe external entities. One popular kind of notation is the Multipurpose Internet Mail Extensions (MIME) type, such as image/gif, application/xml, text/html, and so on. (You can get a list of the registered MIME types at ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/media-types).

Here's an example; in this case, I'll declare two notations, GIF and JPG, that stand for the MIME types image/gif and image/jpeg. Then I'll set up an attribute that may be assigned either of these values.

To declare a notation, you use a <!NOTATION> declaration in a DTD like this:

<!NOTATION NAME SYSTEM "EXTERNAL_ID">

Here, NAME is the name of the notation, and EXTERNAL_ID is the external ID that you want to use for the notation, often a MIME type.

You can also use the PUBLIC keyword for public notations if you supply a formal public identifier (FPI; see the rules for constructing FPIs in the previous chapter), like this:

<!NOTATION NAME PUBLIC "FPI" "EXTERNAL_ID">
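
As a minimal sketch of both forms side by side, the following document declares one system notation and one public notation. The FPI shown is illustrative only, and the single-element document body is an assumption made to keep the sketch self-contained. Note that the FPI, like the external ID, is written in quotation marks:

```xml
<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (#PCDATA)>
<!-- System notation: the external ID here is a MIME type -->
<!NOTATION JPG SYSTEM "image/jpeg">
<!-- Public notation: the FPI below is illustrative only -->
<!NOTATION GIF PUBLIC "-//CompuServe//NOTATION Graphics Interchange Format 89a//EN" "image/gif">
]>
<DOCUMENT>Notation declarations only.</DOCUMENT>
```

An XML processor does not interpret either identifier itself; it passes them on so that the application can decide how to handle data in that format.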

Here's how I create the GIF and JPG notations:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!NOTATION JPG SYSTEM "image/jpeg">
    .
    .
    .

Now I'm free to create an attribute named, say, IMAGE_TYPE, of type NOTATION, to which you can assign either the GIF or the JPG notation:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!NOTATION JPG SYSTEM "image/jpeg">
<!ATTLIST CUSTOMER
    IMAGE NMTOKEN #IMPLIED
    IMAGE_TYPE NOTATION (GIF | JPG) #IMPLIED>
]>
    .
    .
    .

At this point, I'm free to use the IMAGE_TYPE attribute:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!NOTATION JPG SYSTEM "image/jpeg">
<!ATTLIST CUSTOMER
    IMAGE NMTOKEN #IMPLIED
    IMAGE_TYPE NOTATION (GIF | JPG) #IMPLIED>
]>
<DOCUMENT>
    <CUSTOMER IMAGE="image.gif" IMAGE_TYPE="GIF">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>$4.98</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>$2.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

This example brings up an interesting point: here, I've just set the value of an attribute, IMAGE, to the name of an image file, image.gif. But how do you actually make an unparsed entity like an image part of a document? There's a way of doing that explicitly, and now that we know about notations, we're ready to use it in the next section.

This completes our coverage of creating attributes, but don't forget that there are also two attributes that are in some sense predefined in XML, and we've already covered those: xml:space, which you can use to preserve the whitespace in an element, and xml:lang, which you can use to specify the language used in an element and its attributes. They're not really predefined because you must declare them if you want to use them, but you shouldn't use these attribute names for anything other than their intended use.
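
As a reminder of what declaring these two attributes looks like, here is a minimal sketch. The <NOTE> element and its content are hypothetical; the xml:space declaration uses an enumerated type whose only permitted values are default and preserve:

```xml
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (NOTE)*>
<!ELEMENT NOTE (#PCDATA)>
<!ATTLIST NOTE
    xml:space (default | preserve) "preserve"
    xml:lang CDATA #IMPLIED>
]>
<DOCUMENT>
    <!-- xml:space takes its declared default value, preserve -->
    <NOTE xml:lang = "en">Line one
    Line two, with its leading spaces kept</NOTE>
    <!-- Both attributes set explicitly -->
    <NOTE xml:lang = "fr" xml:space = "default">Bonjour</NOTE>
</DOCUMENT>
```

Whether the whitespace is actually treated differently is up to the application; the attribute only signals the author's intent.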

Embedding Non-XML Data in a Document

In the previous example, I associated an image, image.gif, with a document, but only by setting an attribute to the text "image.gif". What if I wanted to make image.gif a real part of the document? I can do that by treating image.gif as an external unparsed entity. The creators of XML realized that XML was not ideal for storing data that is not text, so they added the idea of unparsed entities as a way of associating non-XML data, such as non-XML text, or binary data, with XML documents.

To declare an external unparsed entity, use an <!ENTITY> declaration. Note the keyword NDATA, which indicates that I'm referring to an unparsed entity:

<!ENTITY NAME SYSTEM "VALUE" NDATA TYPE>

Here, NAME is the name of the external unparsed entity, VALUE is the value of the entity, such as the name of an external file (for example, image.gif), and TYPE is a declared notation. You can also use public external unparsed entities if you use the PUBLIC keyword with a formal public identifier:

<!ENTITY NAME PUBLIC "FPI" "VALUE" NDATA TYPE>
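
Filled in with concrete values, the two forms might look like the following sketch. The LOGO entity, its FPI, and its URL are hypothetical, and the single-element document body is an assumption made to keep the sketch self-contained. Both the FPI and the file location are written in quotation marks, and the notation named after NDATA must itself be declared:

```xml
<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!-- System form: file location plus notation -->
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!-- Public form: the FPI and URL below are hypothetical -->
<!ENTITY LOGO PUBLIC "-//Example Corp//NONSGML Company Logo//EN"
    "http://www.example.com/logo.gif" NDATA GIF>
]>
<DOCUMENT>Entity declarations only.</DOCUMENT>
```

Unlike a public notation, a public entity always needs the system identifier after the FPI, so that a processor that can't resolve the FPI still knows where to find the data.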

Here's an example; in this case, I start by declaring a notation named GIF that stands for the image/gif MIME type:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
    .
    .
    .

Now I create an external unparsed entity named SNAPSHOT1 to refer to the external image file, image.gif:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
    .
    .
    .

After you've declared an external unparsed entity like SNAPSHOT1, you can't just embed it in an XML document directly. Instead, you create a new attribute of the ENTITY type that you can assign the entity to. I'll call this new attribute IMAGE:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!ATTLIST CUSTOMER
    IMAGE ENTITY #IMPLIED>
]>
    .
    .
    .

Now, finally, I'm able to assign the IMAGE attribute the value SNAPSHOT1 like this, making image.gif an official part of the document:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!ATTLIST CUSTOMER
    IMAGE ENTITY #IMPLIED>
]>
<DOCUMENT>
    <CUSTOMER IMAGE="SNAPSHOT1">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

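If you use external unparsed entities like this, validating XML processors won't try to read and parse them; they pass the entity's name, location, and notation on to the application, and they'll often check to make sure that the files are there. So, be sure that every file the document refers to is actually available.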

What if I wanted to embed multiple unparsed entities? Take a look at the next topic.

Embedding Multiple Unparsed Entities in a Document

Embedding multiple unparsed entities is no problem; just create an attribute of the ENTITIES type and assign multiple entities to it, like this:

<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT,NUMBER,PRICE)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!ATTLIST CUSTOMER
    IMAGES ENTITIES #IMPLIED>
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!ENTITY SNAPSHOT2 SYSTEM "image2.gif" NDATA GIF>
<!ENTITY SNAPSHOT3 SYSTEM "image3.gif" NDATA GIF>
]>
<DOCUMENT>
    <CUSTOMER IMAGES="SNAPSHOT1 SNAPSHOT2 SNAPSHOT3">
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2001</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>$1.25</PRICE>
            </ITEM>
            .
            .
            .
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>$11.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

And that's all it takes.
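
The entities in an ENTITIES list don't all have to share one notation. As a sketch, the following trimmed-down document mixes a GIF and a JPG entity in the same attribute; the PHOTO1 entity, the photo.jpg file name, and the simplified <CUSTOMER> content are assumptions made for illustration:

```xml
<?xml version = "1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (#PCDATA)>
<!NOTATION GIF SYSTEM "image/gif">
<!NOTATION JPG SYSTEM "image/jpeg">
<!ENTITY SNAPSHOT1 SYSTEM "image.gif" NDATA GIF>
<!-- photo.jpg is a hypothetical file name -->
<!ENTITY PHOTO1 SYSTEM "photo.jpg" NDATA JPG>
<!ATTLIST CUSTOMER
    IMAGES ENTITIES #IMPLIED>
]>
<DOCUMENT>
    <!-- Two unparsed entities with different notations -->
    <CUSTOMER IMAGES = "SNAPSHOT1 PHOTO1">Sam Smith</CUSTOMER>
    <!-- A single entity name is also a valid ENTITIES value -->
    <CUSTOMER IMAGES = "PHOTO1">Polly Jones</CUSTOMER>
</DOCUMENT>
```

Because each entity carries its own notation, the application can tell from the declarations how to handle each file in the list.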

And that's it for our coverage of constructing and using DTDs as well. In this chapter and the previous chapter, we've seen what goes into a DTD and how to handle elements, attributes, entities, and notations. In the next chapter, we'll take a look at the proposed alternate way of declaring those items in XML documents: XML schemas.


Inside XML
Real World XML (2nd Edition)
ISBN: 0735712867
EAN: 2147483647
Year: 2005
Pages: 23
Authors: Steve Holzner
