Appendix F: Reading a Document Type Definition



This appendix presents the Document Type Definitions (DTDs) for XHTML 1.0. Traditional HTML "dialects" are defined using SGML (Standard Generalized Markup Language), a complex language with many nuances. Modern XHTML dialects are defined in XML (eXtensible Markup Language), which is a subset of SGML and somewhat easier to work with. This appendix presents the small amount of SGML and XML knowledge needed to read the various DTDs directly.

Element Type Declarations

Two common types of declarations should be familiar to Web developers: element type declarations and attribute list declarations. Beyond these, the less familiar declarations for general and parameter entities are not very complicated.

An element type declaration defines three characteristics:

  • The element type's name, also known as its generic identifier

  • Whether start and end tags are required, forbidden (end tags on empty elements), or may be omitted; only SGML-based DTDs declare this

  • The element type's content model, or what content it can enclose

All element type declarations begin with the keyword ELEMENT and have the following form:

  <!ELEMENT name content_model >  

The declaration for the XHTML br element gives a simple example:

  <!ELEMENT br EMPTY>  

This declaration says that the br element contains no content at all; it is empty, as shown by the keyword EMPTY.
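As a rough illustration of how little structure the XML form has, the following Python sketch splits a declaration into its name and content model. The regular expression and the function name parse_xml_element_decl are assumptions made for this example, not part of any DTD toolkit, and the pattern deliberately ignores comments and parameter entity references.

```python
import re

# Hypothetical pattern for the XML form: <!ELEMENT name content_model>
_XML_ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+(?P<name>\S+)\s+(?P<model>.+?)\s*>",
    re.DOTALL,
)


def parse_xml_element_decl(decl):
    """Return (name, content_model) for a simple XML element type declaration."""
    match = _XML_ELEMENT_DECL.fullmatch(decl.strip())
    if match is None:
        raise ValueError(f"not an element type declaration: {decl!r}")
    return match.group("name"), match.group("model")


if __name__ == "__main__":
    # The br declaration from the XHTML DTD, as quoted in the text.
    name, model = parse_xml_element_decl("<!ELEMENT br EMPTY>")
    print(name, model)
```

A real DTD parser has to handle far more than this, but the two captured fields correspond directly to the name and content_model slots in the form shown above.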

Traditional HTML, which is defined using SGML, uses a slightly different syntax that also declares tag minimization:

  <!ELEMENT name minimization content_model >  

In the traditional HTML 4.0 DTD we see

  <!ELEMENT BR - O EMPTY>  

Here, tag minimization is declared by two parameters that apply to the start tag and the end tag, in that order. Each parameter takes one of two values: a hyphen indicates the tag is required, and an uppercase "O" indicates it may be omitted. The combination of "O" for the end tag and the content model EMPTY means the end tag is forbidden. Thus, under traditional HTML the br element requires a start tag and forbids an end tag. Because the br element does not contain content, its content model is declared with the keyword EMPTY, just as it is in the XHTML DTD.
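The minimization rules just described can be sketched in a few lines of Python. The function name describe_minimization, the regular expression, and the returned dictionary keys are all assumptions made for illustration; the sketch accepts only the uppercase "O" mentioned in the text and ignores SGML comments inside the declaration.

```python
import re

# Hypothetical pattern for the SGML form:
#   <!ELEMENT name start_flag end_flag content_model>
_SGML_ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+(?P<name>\S+)\s+(?P<start>[-O])\s+(?P<end>[-O])"
    r"\s+(?P<model>.+?)\s*>",
    re.DOTALL,
)


def describe_minimization(decl):
    """Interpret the two minimization flags of an SGML element declaration."""
    match = _SGML_ELEMENT_DECL.fullmatch(decl.strip())
    if match is None:
        raise ValueError(f"not an SGML element declaration: {decl!r}")
    model = match.group("model")
    # A hyphen means the tag is required; "O" means it may be omitted.
    start = "required" if match.group("start") == "-" else "optional"
    if match.group("end") == "-":
        end = "required"
    elif model == "EMPTY":
        # "O" combined with EMPTY means the end tag is forbidden.
        end = "forbidden"
    else:
        end = "optional"
    return {
        "name": match.group("name"),
        "start_tag": start,
        "end_tag": end,
        "content_model": model,
    }


if __name__ == "__main__":
    print(describe_minimization("<!ELEMENT BR - O EMPTY>"))
```

For the BR declaration this reports a required start tag and a forbidden end tag, which matches the reading given above.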

Most HTML and XHTML elements enclose content. When a content model other than a keyword such as EMPTY is declared, it is enclosed within parentheses and is known as a model group. The HTML 4.0 declaration for a selection list option gives an example:

  <!ELEMENT OPTION - O (#PCDATA)*>  

The XHTML equivalent is almost identical, apart from the lowercase element name and the absence of minimization information:

  <!ELEMENT option (#PCDATA)>  

Note that in both cases the model group contains the keyword #PCDATA. This stands for parsed character data: character content that contains no element markup but that may contain entity references for special characters. Keywords such as #PCDATA and CDATA are discussed in the section "SGML and XML Keywords."
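To make "parsed character data" concrete, the following Python sketch parses two option elements with the standard library's xml.etree.ElementTree module. The helper is_pcdata_only is a hypothetical name introduced here; ElementTree does not validate against a DTD, so the check is written by hand and covers only the rule that no child elements are allowed.

```python
import xml.etree.ElementTree as ET


def is_pcdata_only(element):
    """Return True if the element has character content but no child elements."""
    return len(element) == 0


# An entity reference is allowed in #PCDATA; the parser resolves it to "&".
valid = ET.fromstring("<option>Fish &amp; chips</option>")

# Element markup inside option does not fit the (#PCDATA) content model.
invalid = ET.fromstring("<option>Fish <b>and</b> chips</option>")

if __name__ == "__main__":
    print(valid.text)
    print(is_pcdata_only(valid), is_pcdata_only(invalid))
```

The first element satisfies the content model because the entity reference is parsed into ordinary character data, while the second contains a child element and so would be rejected by a validating parser.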





HTML & XHTML: The Complete Reference (Osborne Complete Reference Series)
ISBN: 007222942X
Year: 2003
Pages: 252
Authors: Thomas Powell
