Internal Entities


Let's begin by looking at internal entities. An entity that is going to be used in only one DTD can be an internal entity. If you intend to use the entity in multiple DTDs, it should be an external entity. In this section, you'll learn how to declare internal entities, where to insert them, and how to reference them.

Internal General Entities

Internal general entities are the simplest among the five types of entities. They are defined in the DTD section of the XML document. First let's look at how to declare an internal general entity.

Declaring an internal general entity

The syntax for the declaration of an internal general entity is shown here:

<!ENTITY name "string_of_characters">

NOTE
As you can see from the syntax line above, characters such as angle brackets (< >) and quotation marks (" ") are used specifically for marking up the XML document; they cannot be used as content directly. So to include such a character as part of your content, you must use one of XML's five predefined entities. The references to these predefined entities are &amp;, &lt;, &gt;, &quot;, and &apos;. Their replacement text is &, <, >, ", and ', respectively.

You can create your own general entities. General entities are useful for associating names with foreign language characters, such as ü or ß, or escape characters, such as <, >, and &. You can use Unicode character values in your XML documents as replacements for any character defined in the Unicode standard. These are called character references.

To use a Unicode representation in your XML document, you must precede the Unicode character value with &# and follow it with a semicolon (;). You can use either the character's hexadecimal value, prefixed with x, or its decimal value. For example, in Unicode, ü is represented as xFC and ß is represented as xDF. These two characters' decimal values are 252 and 223. Thus, in your DTD you could create general entities for the preceding two characters as follows:

 <!ENTITY u_um "&#xFC;">
 <!ENTITY s_sh "&#xDF;">

The two entities could also be declared like this:

 <!ENTITY u_um "&#252;">
 <!ENTITY s_sh "&#223;">
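To convince yourself that the hexadecimal and decimal forms refer to the same characters, you can ask a parser to resolve both. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the element name t is made up for the illustration, and the references are written with the terminating semicolon that XML requires.

```python
import xml.etree.ElementTree as ET

# Hexadecimal references are written "&#x...;", decimal ones "&#...;".
hex_form = ET.fromstring("<t>&#xFC;&#xDF;</t>").text
dec_form = ET.fromstring("<t>&#252;&#223;</t>").text

# Both spellings resolve to the same two characters (u-umlaut, sharp s).
print(hex_form == dec_form == "\u00fc\u00df")  # True
```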

Using internal general entities

To reference a general entity in the XML document, you must precede the entity with an ampersand (&) and follow it with a semicolon (;). For example, the following XML statement references the two general entities we declared in the previous section:

 <title>Gr&u_um;&s_sh;</title> 

When the replacement text is inserted by the parser, it will look like this:

 <title>Grüß</title> 
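The substitution can be observed with any parser that reads the DTD's internal subset. The sketch below assumes Python's standard xml.etree.ElementTree module and places the two entity declarations in an internal subset, using title as the document element; it illustrates the behavior and is not the only way to test it.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares the two entities; the body references them.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE title [
  <!ENTITY u_um "&#xFC;">
  <!ENTITY s_sh "&#xDF;">
]>
<title>Gr&u_um;&s_sh;</title>"""

title = ET.fromstring(DOCUMENT)

# The application never sees the references, only the replacement text.
print(title.text == "Gr\u00fc\u00df")  # True
```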

Internal general entities can be used in three places: in the XML document as content for an element; within the DTD in an attribute declaration, either as a #FIXED value or as the default value for the attribute; and within other general entities inside the DTD. We used the first location in the preceding example (<title>Gr&u_um;&s_sh;</title>).

The second place you can use an internal general entity is within the DTD in an attribute declaration, either as a #FIXED value or as the default value for the attribute. For example, you can use the following general entities in your DTD declaration to create entities for several colors:

 <!ENTITY Cy "Cyan">
 <!ENTITY Lm "Lime">
 <!ENTITY Bk "Black">
 <!ENTITY Wh "White">
 <!ENTITY Ma "Maroon">

Then if you want the value of the bgcolor attribute for tr elements to be White for all XML documents that use the DTD, you could include the following line in the previous DTD declaration:

 <!ATTLIST tr
     align   (Left | Right | Center)  'Center'
     valign  (Top | Middle | Bottom)  'Middle'
     bgcolor CDATA                    #FIXED "&Wh;">

The internal general entities must be defined before they can be used in an attribute default value since the DTD is read through once from beginning to end. In this case, internal general entities for several colors have been created. The bgcolor attribute is declared with the keyword #FIXED, which means that its value cannot be changed by the user—the value will always be White. The color general entities could also be used as content for the elements in the body section of the XML document.

You can use the internal general entity as a default value—for example, bgcolor CDATA "&Wh;". In this case, if no value is given, &Wh; is substituted for bgcolor when the XML attribute is needed in the document body, and that reference will be converted to White.
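The following sketch shows this defaulting behavior as reported by one particular parser. It assumes Python's standard xml.etree.ElementTree module, which is built on the non-validating Expat parser and supplies attribute defaults declared in the internal subset. The table wrapper element and the second row's Lime value are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# The entity is declared before the attribute default that references it.
DTD = """<!DOCTYPE table [
  <!ENTITY Wh "White">
  <!ATTLIST tr bgcolor CDATA "&Wh;">
]>"""

# The first row omits bgcolor; the second row supplies its own value.
table = ET.fromstring(DTD + '<table><tr/><tr bgcolor="Lime"/></table>')

# Collect the value the application sees for each row.
colors = [tr.get("bgcolor") for tr in table.findall("tr")]
print(colors)
```

The first row picks up the default after the &Wh; reference has been resolved, and the second row keeps the value given in the document.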

NOTE
You can use an internal general entity in a DTD for a #FIXED attribute, but the attribute value will be assigned in the XML document's body only when the attribute is referenced. You cannot use an internal general entity in an enumerated type attribute declaration because the general entity would have to be interpreted in the DTD, which cannot happen.

The third place you can use internal general entities is within other general entities inside the DTD. For example, we could use the preceding special character entities as follows:

 <!ENTITY u_um "&#252;">
 <!ENTITY s_sh "&#223;">
 <!ENTITY greeting "Gr&u_um;&s_sh;">

At this point, it's not clear whether greeting will be replaced with Gr&u_um;&s_sh; in the XML document's body and then converted to Grüß or whether greeting will be replaced directly with Grüß when the entity is parsed. The order of replacement will be discussed in the section "Processing Order" later in this chapter.

CAUTION
When you include general entities within other general entities, circular references are not allowed. For example, the following construction is not correct:

 <!ENTITY greeting "&hello;! Gr&u_um;&s_sh;">
 <!ENTITY hello "Hello &greeting;">

In this case, greeting is referencing hello, and hello is referencing greeting, making a circular reference.
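A parser reports the problem when the circular entity is actually referenced in the document. The sketch below assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper written for this illustration, and the entity text is shortened from the example above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True

# greeting refers to hello, and hello refers back to greeting.
CIRCULAR = """<!DOCTYPE title [
  <!ENTITY greeting "&hello;!">
  <!ENTITY hello "Hello &greeting;">
]>
<title>&greeting;</title>"""

# The same idea without the back-reference.
ACYCLIC = """<!DOCTYPE title [
  <!ENTITY hello "Hello">
  <!ENTITY greeting "&hello;!">
]>
<title>&greeting;</title>"""

print(is_well_formed(CIRCULAR))  # False
print(is_well_formed(ACYCLIC))   # True
```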

Internal Parameter Entities

Internal parameter entities are interpreted and replaced within the DTD and can be used only within the DTD. While you need to use an ampersand (&) when referencing general entities, you need to use a percent sign (%) when referencing parameter entities.

NOTE
If you need to use a quotation mark, percent sign, or ampersand in your parameter or general entity strings, you must use character or general entity references: for example, &#x22;, &#x25;, and &#x26;, or &quot; and &amp;. (There is no predefined entity for the percent sign, but you could create a general or parameter entity for it.)

Declaring an internal parameter entity

The syntax for declaring an internal parameter entity is shown here:

<!ENTITY % name "string_of_characters">

As you can see, the syntax for declaring an internal parameter entity is only slightly different from that used for declaring internal general entities—a percent sign is used in front of the entity name. (The percent sign must be preceded and followed by a white space character.)

In Chapter 4, we created a sample DTD for a static HTML page. If you want to create a dynamic page, you will probably want to add forms and other objects to your DTD. There is a standard set of events associated with all of these objects, but instead of listing the events for every declaration of every object, you could use the following parameter entity in your DTD:

 <!ENTITY % events
     "onclick     CDATA #IMPLIED
      ondblclick  CDATA #IMPLIED
      onmousedown CDATA #IMPLIED
      onmouseup   CDATA #IMPLIED
      onmouseover CDATA #IMPLIED
      onmousemove CDATA #IMPLIED
      onmouseout  CDATA #IMPLIED
      onkeypress  CDATA #IMPLIED
      onkeydown   CDATA #IMPLIED
      onkeyup     CDATA #IMPLIED" >

This code declares a parameter entity named events that can be used in the attribute-list declaration of every object that supports these event attributes.

NOTE
You could have also declared a parameter entity named Script, and then used it within the events parameter entity declaration, as shown here:

 <!ENTITY % Script "CDATA">
 <!ENTITY % events
     "onclick    %Script; #IMPLIED
      ondblclick %Script; #IMPLIED">

The Script parameter entity allows you to use data type names that are more readable than just using CDATA. Although this code is more readable, some XML tools (such as XML Authority) cannot accept parameter entities used in this way. Be aware of this limitation if you use this technique.

Using internal parameter entities

The events parameter entity will be used in the attribute declaration of the form objects and in other elements, such as body. To reference a parameter entity, you must precede the entity with a percent sign and follow it with a semicolon. For example, you could now make this declaration:

 <!ATTLIST body
     alink    CDATA #IMPLIED
     text     CDATA #IMPLIED
     bgcolor  CDATA #IMPLIED
     link     CDATA #IMPLIED
     vlink    CDATA #IMPLIED
     %events;
     onload   CDATA #IMPLIED
     onunload CDATA #IMPLIED >

In this case, the internal parameter entity %events; has been added to the body element's attribute declaration. The parameter entity events could be used in any declaration in which these events are allowed.

The XHTML Standard and Internal Parameter Entities

Now would be a good time to introduce a new standard that is being created for HTML. This new standard is called XHTML; it is also represented in a new version of HTML (version 4.01). The World Wide Web Consortium (W3C) standards committee is currently working out the last details of the standard, which is all about doing what we've done in the last few chapters: XMLizing HTML. You can find information about this standard by visiting http://www.w3.org.

Basically, the XHTML standard introduces two content models: inline and block. The inline elements affect individual text elements, whereas the block elements affect entire blocks of text. These two content models are then used to define the allowed child elements of other elements.

Inline entities and elements

The XHTML standard provides the following declarations for defining a series of internal parameter entities to be used to define the inline elements:

 <!ENTITY % special "br | span | img">
 <!ENTITY % fontstyle "tt | i | b | big | small">
 <!ENTITY % phrase "em | strong | q | sub | sup">
 <!ENTITY % inline.forms "input | select | textarea | label | button">
 <!ENTITY % inline "a | %special; | %fontstyle; | %phrase; | %inline.forms;">
 <!-- Entities that can occur at block or inline level. -->
 <!ENTITY % misc "script | noscript">
 <!ENTITY % Inline " (#PCDATA | %inline; | %misc; )*">

This declaration fragment builds the final Inline parameter entity in small pieces. Notice that the Inline entity definition contains the inline and misc entities and uses the technique described in Chapter 4 for including an unlimited number of child elements in any order—in this example, using (#PCDATA | %inline; | %misc; )*.

In the example DTD created in Chapters 3 and 4, the p element was used to organize the content within a cell. Although that usage makes sense, the purpose of the p element is to make text that is not included in a block element (such as text within an h element) word-wrap properly. Therefore, putting the h element or any of the block elements within a p element is not necessary because text within a block element is already word-wrapped. On the other hand, if any of the inline elements are used outside of a block element, they should be placed inside a p element so that the text element wraps properly. Therefore, you could rewrite the definition for the p element as follows:

 <!ELEMENT p %Inline;> 

This shows exactly the way the definition for the p element appears in the XHTML specification.

Block entities and elements

The XHTML standard also declares a set of internal parameter entities that can be used in the declarations of the block elements. These internal parameter entities appear as follows:

 <!ENTITY % heading "h1 | h2 | h3 | h4 | h5 | h6">
 <!ENTITY % lists "ul | ol">
 <!ENTITY % blocktext "hr | blockquote">
 <!ENTITY % block "p | %heading; | div | %lists; | %blocktext; | fieldset | table">
 <!ENTITY % Block " (%block; | form | %misc; )*">

Notice that the Block entity contains the block entity, the misc entity, and the form element and also includes an unlimited number of these child elements in any order. Using the Block parameter entity, the declaration for the body element becomes the following:

 <!ELEMENT body %Block;> 

As you can see, using the parameter entities, you can give your document a clear structure.

Using parameter entities in attributes

The XHTML standard also uses parameter entities in attributes, as we saw earlier with the events entity. You could use this events entity and two additional entities to create an internal parameter entity for attributes shared among many elements, as shown here:

 <!-- Internationalization attributes
      lang      Language code (backward-compatible)
      xml:lang  Language code (per XML 1.0 spec)
      dir       Direction for weak/neutral text
 -->
 <!ENTITY % i18n
     " lang     NMTOKEN      #IMPLIED
       xml:lang NMTOKEN      #IMPLIED
       dir      (ltr | rtl ) #IMPLIED">
 <!ENTITY % coreattrs
     " id    ID    #IMPLIED
       class CDATA #IMPLIED
       style CDATA #IMPLIED
       title CDATA #IMPLIED">
 <!ENTITY % attrs " %coreattrs; %i18n; %events;">

The language entity i18n can be understood by XML and non-XML compliant browsers and is used to mark elements as belonging to a particular language.

NOTE
For more information about language codes, visit the Web site http://www.oasis-open.org/cover/iso639a.html.

The attrs parameter entity can be used for the most common attributes associated with the HTML elements in the DTD. For example, the body element's attribute can now be written as follows:

 <!ATTLIST body %attrs; onload CDATA #IMPLIED onunload CDATA #IMPLIED> 

Rewriting the sample DTD using parameter entities

Ideally, you want your XML Web documents to be compatible with the new XHTML standard. Using entities, along with a few other changes, the DTD example from Chapter 4 can be rewritten as follows:

 <!-- Entities that can occur at block or inline level. ====-->
 <!ENTITY % misc " script | noscript">
 <!-- Entities for inline elements ================-->
 <!ENTITY % special "br | span | img">
 <!ENTITY % fontstyle "tt | i | b | big | small">
 <!ENTITY % phrase "em | strong | q | sub | sup">
 <!ENTITY % inline.forms "input | select | textarea | label | button">
 <!ENTITY % inline "a | %special; | %fontstyle; | %phrase; | %inline.forms;">
 <!ENTITY % Inline "(#PCDATA | %inline; | %misc;)*">
 <!-- Entities used for block elements ============-->
 <!ENTITY % heading "h1 | h2 | h3 | h4 | h5 | h6">
 <!ENTITY % lists "ul | ol">
 <!ENTITY % blocktext "hr | blockquote">
 <!ENTITY % block "p | %heading; | div | %lists; | %blocktext; | fieldset | table">
 <!ENTITY % Block " (%block; | form | %misc; )*">
 <!-- Mixed block and inline ========================-->
 <!-- %Flow; mixes block and inline and is used for list
      items and so on.
 -->
 <!ENTITY % Flow " (#PCDATA | %block; | form | %inline; | %misc; )*">
 <!ENTITY % form.content
     "(#PCDATA | p | %lists; | %blocktext; | a | %special; | %fontstyle; |
       %phrase; | %inline.forms; | table | %heading; | div | fieldset |
       %misc;)*">
 <!ENTITY % events
     " onclick     CDATA  #IMPLIED
       ondblclick  CDATA  #IMPLIED
       onmousedown CDATA  #IMPLIED
       onmouseup   CDATA  #IMPLIED
       onmouseover CDATA  #IMPLIED
       onmousemove CDATA  #IMPLIED
       onmouseout  CDATA  #IMPLIED
       onkeypress  CDATA  #IMPLIED
       onkeydown   CDATA  #IMPLIED
       onkeyup     CDATA  #IMPLIED">
 <!ENTITY % i18n
     " lang     NMTOKEN       #IMPLIED
       xml:lang NMTOKEN       #IMPLIED
       dir      (ltr | rtl )  #IMPLIED">
 <!-- Core attributes common to most elements
      id       Document-wide unique ID
      class    Space-separated list of classes
      style    Associated style info
      title    Advisory title/amplification
 -->
 <!-- Style sheet data -->
 <!ENTITY % StyleSheet "CDATA">
 <!ENTITY % coreattrs
     " id    ID     #IMPLIED
       class CDATA  #IMPLIED
       style CDATA  #IMPLIED">
 <!ENTITY % attrs " %coreattrs; %i18n; %events;">
 <!-- End Entity Declarations  ====================-->
 <!ENTITY % URI "CDATA">
 <!--a Uniform Resource Identifier, see [RFC2396]-->
 <!ELEMENT html  (head, body)>
 <!ATTLIST html
     %i18n;
     xmlns CDATA  #FIXED 'http://www.w3.org/1999/xhtml'>
 <!ELEMENT head  (title, base?)>
 <!ATTLIST head
     %i18n;
     profile CDATA  #IMPLIED>
 <!ELEMENT title  (#PCDATA )>
 <!ATTLIST title  %i18n; >
 <!ELEMENT base EMPTY>
 <!ATTLIST base  target CDATA  #REQUIRED >
 <!ELEMENT body  (basefont? ,  (p )? , table )>
 <!ATTLIST body
     alink   CDATA  #IMPLIED
     text    CDATA  #IMPLIED
     bgcolor CDATA  #IMPLIED
     link    CDATA  #IMPLIED
     vlink   CDATA  #IMPLIED >
 <!ELEMENT basefont EMPTY>
 <!ATTLIST basefont  size CDATA  #REQUIRED >
 <!-- generic language/style container ==============-->
 <!ELEMENT a  (#PCDATA )>
 <!ATTLIST a
     %attrs;
     href   CDATA  #IMPLIED
     name   CDATA  #IMPLIED
     target CDATA  #IMPLIED >
 <!ELEMENT table  (tr )+>
 <!ATTLIST table
     %attrs;
     width       CDATA  #IMPLIED
     rules       CDATA  #IMPLIED
     frame       CDATA  #IMPLIED
     align       CDATA  'Center'
     cellpadding CDATA  '0'
     border      CDATA  '0'
     cellspacing CDATA  '0' >
 <!ELEMENT tr  (td+ )>
 <!ATTLIST tr  %attrs; >
 <!ELEMENT td  (cellcontent )>
 <!ATTLIST td
     %attrs;
     bgcolor  (Cyan|Lime|Black|White|Maroon ) 'White'
     align   CDATA  'Center'
     rowspan CDATA  #IMPLIED
     colspan CDATA  #IMPLIED >
 <!ELEMENT cellcontent  (%Block; | p?)+>
 <!ATTLIST cellcontent  cellname CDATA  #REQUIRED >
 <!ELEMENT h1 %Inline;>
 <!ATTLIST h1  align CDATA  #IMPLIED  %attrs; >
 <!ELEMENT h2 %Inline;>
 <!ATTLIST h2  align CDATA  #IMPLIED  %attrs; >
 <!ELEMENT h3 %Inline;>
 <!ATTLIST h3  align CDATA  #IMPLIED  %attrs; >
 <!ELEMENT h4 %Inline;>
 <!ATTLIST h4  align CDATA  #IMPLIED  %attrs; >
 <!ELEMENT h5 %Inline;>
 <!ATTLIST h5  align CDATA  #IMPLIED  %attrs; >
 <!ELEMENT h6 %Inline;>
 <!ATTLIST h6  align CDATA  #IMPLIED  %attrs; >
 <!ELEMENT p %Inline;>
 <!ATTLIST p  %attrs; >
 <!-- Inline Element Declarations =================-->
 <!-- Forced line break -->
 <!ELEMENT br EMPTY>
 <!ATTLIST br
     %coreattrs;
     clear     CDATA  #REQUIRED >
 <!-- Emphasis -->
 <!ELEMENT em %Inline;>
 <!ATTLIST em  %attrs; >
 <!-- Strong emphasis -->
 <!ELEMENT strong %Inline;>
 <!ATTLIST strong  %attrs; >
 <!-- Inlined quote -->
 <!ELEMENT q %Inline;>
 <!ATTLIST q
     %attrs;
     cite  CDATA  #IMPLIED >
 <!-- Subscript -->
 <!ELEMENT sub %Inline;>
 <!ATTLIST sub  %attrs; >
 <!-- Superscript -->
 <!ELEMENT sup %Inline;>
 <!ATTLIST sup  %attrs; >
 <!-- Fixed-pitch font -->
 <!ELEMENT tt %Inline;>
 <!ATTLIST tt  %attrs; >
 <!-- Italic font -->
 <!ELEMENT i %Inline;>
 <!ATTLIST i  %attrs; >
 <!-- Bold font -->
 <!ELEMENT b %Inline;>
 <!ATTLIST b  %attrs; >
 <!-- Bigger font -->
 <!ELEMENT big %Inline;>
 <!ATTLIST big  %attrs; >
 <!-- Smaller font -->
 <!ELEMENT small %Inline;>
 <!ATTLIST small  %attrs; >
 <!-- hspace, border, align, and vspace are not in the strict
      XHTML standard for img. -->
 <!ELEMENT img EMPTY>
 <!ATTLIST img
     %attrs;
     align  CDATA  #IMPLIED
     border CDATA  #IMPLIED
     width  CDATA  #IMPLIED
     height CDATA  #IMPLIED
     hspace CDATA  #IMPLIED
     vspace CDATA  #IMPLIED
     src    CDATA  #REQUIRED >
 <!ELEMENT ul  (font? , li+ )>
 <!ATTLIST ul
     %attrs;
     type  CDATA  'text' >
 <!ELEMENT ol  (font? , li+ )>
 <!ATTLIST ol
     type  CDATA  'text'
     start CDATA  #IMPLIED
     %attrs; >
 <!ELEMENT li  %Flow; >
 <!ATTLIST li  %attrs; >
 <!--================= Form Elements===============-->
 <!--Each label must not contain more than one field.
     Label elements shouldn't be nested.
 -->
 <!ELEMENT label %Inline;>
 <!ATTLIST label
     %attrs;
     for   IDREF  #IMPLIED >
 <!ENTITY % InputType
     "(text | password | checkbox |
       radio | submit | reset |
       file | hidden | image | button)">
 <!-- The name attribute is required for all elements but
      the submit and reset elements. -->
 <!ELEMENT input EMPTY>
 <!ATTLIST input  %attrs; >
 <!ELEMENT select  (optgroup | option )+>
 <!ATTLIST select %attrs;>
 <!-- Option selector -->
 <!ATTLIST select name     CDATA  #IMPLIED>
 <!ATTLIST select size     CDATA  #IMPLIED>
 <!ATTLIST select multiple  (multiple)  #IMPLIED>
 <!ATTLIST select disabled  (disabled)  #IMPLIED>
 <!ATTLIST select tabindex CDATA  #IMPLIED>
 <!ATTLIST select onfocus  CDATA  #IMPLIED>
 <!ATTLIST select onblur   CDATA  #IMPLIED>
 <!ATTLIST select onchange CDATA  #IMPLIED>
 <!ELEMENT optgroup  (option )+>
 <!ATTLIST optgroup
     %attrs;
     disabled  (disabled )  #IMPLIED
     label    CDATA  #REQUIRED>
 <!ELEMENT option  (#PCDATA )>
 <!ATTLIST option
     %attrs;
     selected  (selected )  #IMPLIED
     disabled  (disabled )  #IMPLIED
     label    CDATA  #IMPLIED
     value    CDATA  #IMPLIED >
 <!-- Multiple-line text field -->
 <!ELEMENT textarea  (#PCDATA )>
 <!ATTLIST textarea  %attrs; >
 <!ELEMENT legend %Inline;>
 <!ATTLIST legend  %attrs; >
 <!--=================== Horizontal Rule ============-->
 <!ELEMENT hr EMPTY>
 <!ATTLIST hr  %attrs; >
 <!--=================== Block-like Quotes ==========-->
 <!ELEMENT blockquote %Block;>
 <!ATTLIST blockquote
     %attrs;
     cite  CDATA  #IMPLIED >
 <!-- The fieldset element is used to group form fields.
      Only one legend element should occur in the content,
      and if present it should be preceded only by white space.
 -->
 <!ELEMENT fieldset
     (#PCDATA | legend | %block; | form | %inline; | %misc; )*>
 <!ATTLIST fieldset  %attrs; >
 <!ELEMENT script  (#PCDATA )>
 <!ATTLIST script
     charset   CDATA  #IMPLIED
     type      CDATA  #REQUIRED
     src       CDATA  #IMPLIED
     defer     CDATA  #IMPLIED
     xml:space CDATA  #FIXED 'preserve' >
 <!-- Alternative content container for non-script-based
      rendering -->
 <!ELEMENT noscript %Block;>
 <!ATTLIST noscript %attrs; >
 <!ELEMENT button
     (#PCDATA | p | %heading; | div | %lists; |
      %blocktext; | table | %special; | %fontstyle; |
      %phrase; | %misc; )*>
 <!ATTLIST button
     %attrs;
     name      CDATA  #IMPLIED
     value     CDATA  #IMPLIED
     type      (button | submit | reset )  'submit'
     disabled  (disabled )  #IMPLIED
     tabindex  CDATA  #IMPLIED
     accesskey CDATA  #IMPLIED
     onfocus   CDATA  #IMPLIED
     onblur    CDATA  #IMPLIED >
 <!ELEMENT span %Inline;>
 <!ATTLIST span  %attrs; >
 <!--The font element is not included in the XHTML standard. -->
 <!ELEMENT font  (b )>
 <!ATTLIST font
     color CDATA  #REQUIRED
     face  CDATA  #REQUIRED
     size  CDATA  #REQUIRED >
 <!ELEMENT form %form.content;>
 <!ELEMENT div %Flow;>
 <!ATTLIST div %attrs; >

This might look like a completely different DTD, but it is essentially the same as the DTD we created in Chapter 4. Only one structural change has occurred: the block elements, such as the h1 element, have been moved out of the p element and now are child elements of the body element. Several elements have been added, including the form element itself and its child elements (button, label, select, and so on) and the font formatting elements, including i and b. Numerous additions have been made to the attributes, including language, id, and the scripting events. This sample DTD is also available on the companion CD.

XML documents built using this new DTD will still use a table to format and contain all of the elements that will be displayed in the browser. However, in the new DTD, the declaration for the body element is different from that in our original DTD. In our original DTD, the a (anchor) element at the top of the page is a child element of the body element. However, this element is not a child element of the body element in the XHTML standard. As we have seen, the declaration for the body element in the XHTML standard is as follows:

 <!ELEMENT body %Block;> 

As we have discussed, the Block internal parameter entity is declared as follows:

 <!ENTITY % Block " (%block; | form | %misc;)*"> 

Replacing %block; and %misc; results in the following code:

 <!ENTITY % Block " (p | %heading; | div | %lists; | %blocktext; | fieldset | table | form | script | noscript)*"> 

Replacing %heading;, %lists;, and %blocktext; will give you the actual declaration for the body element, as shown here:

 <!ENTITY % Block " (p | h1 | h2 | h3 | h4 | h5 | h6 | div | ul | ol | hr | blockquote | fieldset | table | form | script | noscript)*"> 

NOTE
It would be worth your time to go through the DTD and replace the entities with their actual values. You may also find it interesting to download the latest version of the XHTML standard and do all of the replacements in that document, too.
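The replacement steps shown above are mechanical, so they can be sketched in a few lines of code. The following Python sketch is a simplified textual simulation, not a DTD parser: it assumes every parameter entity is declared with a double-quoted value, ignores comments, and the function and variable names are invented for the illustration.

```python
import re

# Matches declarations of the form <!ENTITY % name "value">.
ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.]+)\s+"([^"]*)"\s*>')
# Matches parameter entity references of the form %name;.
PE_REFERENCE = re.compile(r'%([\w.]+);')

def load_parameter_entities(dtd_text):
    """Collect parameter entities, expanding references as each is declared."""
    entities = {}
    for name, value in ENTITY_DECL.findall(dtd_text):
        # A reference to an undeclared entity raises KeyError here, which
        # mirrors the rule that entities must be declared before use.
        expanded = PE_REFERENCE.sub(lambda m: entities[m.group(1)], value)
        # If a name is declared twice, the first declaration wins.
        entities.setdefault(name, expanded)
    return entities

DTD = '''
<!ENTITY % misc "script | noscript">
<!ENTITY % heading "h1 | h2 | h3 | h4 | h5 | h6">
<!ENTITY % lists "ul | ol">
<!ENTITY % blocktext "hr | blockquote">
<!ENTITY % block "p | %heading; | div | %lists; | %blocktext; | fieldset | table">
<!ENTITY % Block "(%block; | form | %misc;)*">
'''

entities = load_parameter_entities(DTD)
# Strip the grouping characters and split the model into element names.
body_children = [n.strip() for n in entities["Block"].strip("()* ").split("|")]
print(len(body_children))  # 17
```

Here entities["Block"] holds the fully expanded content model, and body_children lists the 17 element names that may appear inside body.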

Creating this expanded declaration manually took some time, but any of the DTD tools could have done this work for you in just a few moments. For example, Figure 5-2 shows our sample XHTML DTD as it appears in XML Authority.


Figure 5-2. The Body element of the XHTML DTD displayed in XML Authority.

The child elements of the Body element are readily visible. (You can scroll down to see the complete list.)

NOTE
You do not have to include all of these child elements in your DTD to be compatible with the XHTML standard; instead, you can include only those elements that you need for your projects. If you want to be compliant with the standard, however, you cannot add elements to the body element that are not included in the standard.

Notice that the a element is not a child element of the XHTML body element; it is actually a child element of the p element. Therefore, you cannot use the declaration included in the original DTD we discussed in Chapter 4, shown here:

 <!ELEMENT body (basefont? , a? , table)> 

In this declaration, the a element is a child element of the body element, which does not comply with the standard. To solve this problem, you will need to use the p element, as shown here:

 <!ELEMENT body (basefont? , (p)? , table)> 

While this declaration makes the DTD conform to the XHTML standard, it also means that any of the inline elements, not just the a element, can be used in the body element as long as they are contained within a p element.

Many child elements that are included in the body element of the XHTML standard are not included in the example DTD. This is because you are using the table to hold most of the content and do not need most of these child elements. You can think of the XML documents defined by the example DTD as a subset of the XML documents defined by the more general XHTML DTD. The example DTD includes only the structure you need for your documents.

The XHTML standard declaration for the table cell element (td) is shown here:

 <!ELEMENT td %Flow;> 

If you replace the Flow parameter entity and all of the parameter entities contained within %Flow; as you did earlier for the body element, your final td declaration will look like this:

 <!ELEMENT td (#PCDATA | p | h1 | h2 | h3 | h4 | h5 | h6 | div | ul | ol | hr | blockquote | fieldset | table | form | a | br | span | img | tt | i | b | big | small | em | strong | q | sub | sup | input | select | textarea | label | button | script | noscript)*>

As you can see, the Flow entity includes virtually everything. You can use a td element as a container for all of the block and inline elements, which is exactly what you want to do.

In the example DTD, the following declaration is created for the td element and the cellcontent element:

 <!ELEMENT td (cellcontent)>
 <!ELEMENT cellcontent (%Block;)+>

This declaration doesn't comply with the XHTML standard. The cellcontent element does not belong to the standard; it was created for marking up the text. When you use custom elements, such as the cellcontent element in this example, you will need to remove them using Extensible Stylesheet Language (XSL). Using XSL, you can transform the preceding definitions to be:

 <!ELEMENT td (%Block;)+> 

This declaration will be compliant with the XHTML standard. We'll have a detailed discussion about XSL in Chapter 12.

The New HelpHTM.htm Document

Because of the changes in the DTD, you will have to make some minor changes to the sample HelpHTM.htm document we created in Chapter 4. You will now have to delete all the p elements that wrap block elements, because the block elements are no longer child elements of the p element. You will also have to add several p elements to wrap the a elements. Change the a element at the beginning of the document as shown here:

 <p><a name="Top"><!--Top tag--></a></p> 

Then wrap all the links in the lists using the p element. For example, you can wrap the first link in the HelpHTM.htm document as follows:

 <p> <a href="FirstTimeVisitorInfo.html" target=""> First-Time Visitor Information</a> </p> 

If you do this and then reference the new DTD, the document is valid.

NOTE
The new version of the HelpHTM.htm file is included on the companion CD.

Possible Problems with Parameter Entities

The parameter entities have made the overall DTD more compact, but have they made it more readable? In general, grouping items into parameter entities can make the document more readable, but keep in mind that if you go too far and create too many parameter entities, it might be nearly impossible for a human to read your DTD. For example, most developers would consider the basic form objects (button, label, textarea, and so on) to be the primary child elements of a form element. However, you will need to dig through many layers of the XHTML DTD to discover that these elements are actually child elements of the form element.

In the XHTML DTD, the form objects are defined in an internal parameter entity named inline.forms, which is included in the inline parameter entity. The inline entity is used in the Inline parameter entity, which in turn is used in the p element's declaration. The p element is included in the block parameter entity's declaration, and the block entity is included in the form.content parameter entity. Finally, the form.content entity is included in the form element's declaration, as shown here:

 <!ENTITY % inline.forms "input | select | textarea | label | button">
 <!ENTITY % inline "a | %special; | %fontstyle; | %phrase; | %inline.forms;">
 <!ENTITY % Inline "(%inline;| %misc;)*">
 <!ELEMENT p %Inline;>
 <!ENTITY % block "p | %heading; | div | %lists; | %blocktext; | fieldset | table">
 <!ENTITY % form.content "(%block; | %inline; | %misc;)*">
 <!ELEMENT form %form.content;>

To use a form object such as select, you will need to include the following statement in your XML document:

 <form><p><select/></p></form> 

There is another path to the form objects. Notice that the block entity declaration includes a fieldset element. The fieldset element also contains the inline element, just as the p element did, as shown here:

 <!ENTITY % inline.forms "input | select | textarea | label | button">
 <!ENTITY % inline "a | %special; | %fontstyle; | %phrase; | %inline.forms;">
 <!ELEMENT fieldset (#PCDATA | legend | %block; | form | %inline; | %misc;)*>
 <!ENTITY % block "p | %heading; | div | %lists; | %blocktext; | fieldset | table">
 <!ENTITY % form.content "(%block; | %misc;)*">
 <!ELEMENT form %form.content;>

To use a form object such as select in this case, you would include the following statement in your XML document:

 <form><fieldset><select/></fieldset></form> 
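If no DTD viewer is at hand, the two routes just described can also be traced with a short search over the expanded content models. The sketch below is only an illustration: CONTENT_MODELS is a hand-written, heavily simplified stand-in for the expanded declarations (it is not generated from the XHTML DTD), and find_paths is a hypothetical helper.

```python
from collections import deque

# Simplified, hypothetical content models: each element maps to the children
# it may contain once all parameter entities have been expanded.
CONTENT_MODELS = {
    "form": ["p", "fieldset", "table"],
    "p": ["a", "input", "select", "textarea", "label", "button"],
    "fieldset": ["legend", "p", "input", "select", "textarea", "label", "button"],
}

def find_paths(models, start, target, max_depth=3):
    """Return every parent-to-child path from start to target.

    Paths are limited to max_depth elements so the search terminates even
    when content models refer to each other recursively.
    """
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            paths.append(path)
            continue
        if len(path) < max_depth:
            for child in models.get(path[-1], []):
                queue.append(path + [child])
    return paths

paths = ["/".join(p) for p in find_paths(CONTENT_MODELS, "form", "select")]
print(paths)  # ['form/p/select', 'form/fieldset/select']
```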

You can use an XML tool to view this relationship. An excellent tool for viewing the structure of an XML DTD is Near and Far, available at http://www.microstar.com. Without an XML tool, the parameter entities make the DTD nearly impossible to read. Try to strike a balance by using enough parameter entities to create reusable groups that make your DTD neater but not so many parameter entities that your DTD is unreadable.

You must also be careful that the document is still valid and well formed once the parameter entity has been substituted. For example, consider the following declaration:

 <!ENTITY % Inline " (#PCDATA | %inline; | %misc;*">

As you can see, this declaration is missing the closing parenthesis. When the Inline parameter entity is substituted, it will create an invalid declaration. Be sure that all your components are properly nested, opened, and closed after the entities are substituted.
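A simple sanity check of this kind can be automated. The following Python sketch tests only whether parentheses are balanced in an already expanded content model; it is an assumption-laden helper (it ignores quoted strings and comments, and the sample models are shortened for the illustration), not a replacement for a validating tool.

```python
def parentheses_balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ')' appeared before its matching '('.
                return False
    return depth == 0

# Expanded content models, shortened for the illustration.
good = "(#PCDATA | a | br | script | noscript)*"
bad = "(#PCDATA | a | br | script | noscript*"

print(parentheses_balanced(good))  # True
print(parentheses_balanced(bad))   # False
```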

A common problem when working with XML is finding errors in your XML documents and your DTDs. Often XML tools display cryptic error messages that leave you with no idea as to the real source of a problem. XML Notepad, which was used to write the code in this book, can be used for writing and debugging XML documents that have no external DTDs. XML Authority works well with DTDs and usually provides clear error messages that help you locate errors in your DTD. If you are working with an XML document that references an external DTD, Web Writer usually provides helpful error messages. All of these products provide trial versions. Try them all, and then choose the tools that best meet your needs. Be aware that sometimes a small error in a DTD could take a long time to track down (for example, using Block instead of block in the preceding DTD will cause an error that might take several hours to track down).



Developing XML Solutions
Developing XML Solutions (DV-MPS General)
ISBN: 0735607966
EAN: 2147483647
Year: 2000
Pages: 115
Authors: Jake Sturm
