XML borrows the concept of Document Type Definitions (DTDs) from SGML. A DTD is a formal description of a particular class of XML documents. It specifies the markup used within that class, that is, the elements that may appear and how they may be combined, and so gives both ends of an exchange a shared basis for interpreting such a document. This markup is declared using a dedicated declaration syntax. A DTD thus spells out which structures are permissible within an XML document.
Every element that can appear within an XML document needs to be declared within that document's DTD with an element declaration statement. The basic structure of an element declaration statement looks like:
<!ELEMENT element_name (content_description) [? | * | +]>
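where ?, *, and + are wild-card references (occurrence indicators), with ? indicating that the preceding item is optional (zero or one occurrence), * that it may occur zero or more times, and + that it must occur one or more times.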
Thus, a simple DTD for an XML document containing contact information, as in the example used in the previous sections, may look like:
<!ELEMENT person (name, company*)>
<!ELEMENT name (salutation?, first_name, middle_name*, last_name)>
<!ELEMENT salutation (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT middle_name (#PCDATA)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
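The occurrence indicators behave much like regular-expression quantifiers, which suggests a rough way to check a content model by hand. The Python sketch below is an illustration under that assumption, not a validating parser. The helper names `model_to_regex` and `children_match` are hypothetical, the sample names are made up, and only flat comma-separated sequences such as the `name` declaration above are handled.

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model):
    """Translate a flat DTD sequence such as 'a?, b, c*' into a regex.

    Only comma-separated element names with an optional ?, * or +
    suffix are handled; choices (|) and nested groups are not.
    """
    parts = []
    for item in model.split(","):
        item = item.strip()
        quantifier = item[-1] if item[-1] in "?*+" else ""
        name = item.rstrip("?*+")
        # Wrap each name in <...> so adjacent names cannot run together.
        parts.append("(?:<" + re.escape(name) + ">)" + quantifier)
    return re.compile("".join(parts))


def children_match(element, model):
    """Return True if the element's children follow the content model."""
    sequence = "".join("<" + child.tag + ">" for child in element)
    return model_to_regex(model).fullmatch(sequence) is not None


# Content model copied from the <!ELEMENT name ...> declaration.
NAME_MODEL = "salutation?, first_name, middle_name*, last_name"

# Made-up sample data, for illustration only.
good = ET.fromstring(
    "<name><first_name>Jane</first_name><last_name>Doe</last_name></name>"
)
bad = ET.fromstring(
    "<name><last_name>Doe</last_name><first_name>Jane</first_name></name>"
)

print(children_match(good, NAME_MODEL))  # True
print(children_match(bad, NAME_MODEL))   # False
```

The second sample fails because a comma-separated content model fixes the order of the children: last_name may not precede first_name.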
A DTD would typically be stored as a file (with a .dtd suffix), separate from the XML documents it describes. It could also be included within the XML document it describes. The location of the DTD that describes a given XML document is specified within that document via a Document Type Declaration. A typical Document Type Declaration would look like:
<!DOCTYPE contact_info SYSTEM "http://www.wownh.com/dtds/contactinfo.dtd">
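The <!DOCTYPE> declaration usually follows the XML declaration and precedes the root element. When the DTD is included within the XML document itself, the declarations appear inside the <!DOCTYPE> declaration, as in: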
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE person [
  <!ELEMENT person (name, company*)>
  <!ELEMENT name (salutation?, first_name, middle_name*, last_name)>
  .
]>
<person>
  <name>
    <salutation>Mr.</salutation>
    <first_name>Anura</first_name>
    ..
    ..
</person>
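To see what a parser makes of a Document Type Declaration, the following Python sketch reads the declarations from an internal DTD using the standard library's expat binding. It is a minimal illustration, not a validator: expat is a non-validating parser, so it reports the declarations but does not check the document against them. The function name `read_doctype` is hypothetical, the elided parts of the example above are filled with the element declarations from the contact DTD, and the last name is a made-up placeholder.

```python
import xml.parsers.expat

# Complete version of the internal-DTD example; the last_name value
# is a placeholder, not taken from the original document.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE person [
  <!ELEMENT person (name, company*)>
  <!ELEMENT name (salutation?, first_name, middle_name*, last_name)>
  <!ELEMENT salutation (#PCDATA)>
  <!ELEMENT first_name (#PCDATA)>
  <!ELEMENT middle_name (#PCDATA)>
  <!ELEMENT last_name (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
]>
<person>
  <name>
    <salutation>Mr.</salutation>
    <first_name>Anura</first_name>
    <last_name>Placeholder</last_name>
  </name>
</person>"""


def read_doctype(document):
    """Collect the DOCTYPE name, system ID and declared element names."""
    info = {"root": None, "system_id": None, "elements": []}

    def start_doctype(name, system_id, public_id, has_internal_subset):
        info["root"] = name
        info["system_id"] = system_id

    def element_decl(name, model):
        # 'model' describes the content model; only the name is kept here.
        info["elements"].append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = start_doctype
    parser.ElementDeclHandler = element_decl
    parser.Parse(document, True)
    return info


info = read_doctype(DOCUMENT)
print(info["root"])       # person
print(info["elements"])   # the seven declared element names, in order
```

For a document that points at an external DTD with SYSTEM, the same handler receives the location as `system_id`; this sketch does not fetch the file.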
The DTD for the XML document describing a Wordsworth poem, as shown in Figure 2.2, provided by Rutgers, The State University of New Jersey, would look like:
<!ELEMENT POEM (TITLE, AUTHOR, STANZA*)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (FIRSTNAME, LASTNAME)>
<!ELEMENT FIRSTNAME (#PCDATA)>
<!ELEMENT LASTNAME (#PCDATA)>
<!ELEMENT STANZA (LINE*)>
<!ELEMENT LINE (#PCDATA)>
<!ATTLIST LINE N CDATA #REQUIRED>
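The last declaration in this DTD makes the N attribute mandatory on every LINE element. The Python sketch below shows one way such a rule could be checked by hand with the standard library's non-validating expat binding: it records the #REQUIRED attributes declared in an internal DTD and then reports elements that omit them. The function name `missing_required_attributes` is hypothetical, and the poem text is placeholder content, not the document from Figure 2.2.

```python
import xml.parsers.expat

# Placeholder instance; the title and line text are made up.
POEM = """<!DOCTYPE POEM [
  <!ELEMENT POEM (TITLE, AUTHOR, STANZA*)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT AUTHOR (FIRSTNAME, LASTNAME)>
  <!ELEMENT FIRSTNAME (#PCDATA)>
  <!ELEMENT LASTNAME (#PCDATA)>
  <!ELEMENT STANZA (LINE*)>
  <!ELEMENT LINE (#PCDATA)>
  <!ATTLIST LINE N CDATA #REQUIRED>
]>
<POEM>
  <TITLE>Placeholder title</TITLE>
  <AUTHOR><FIRSTNAME>William</FIRSTNAME><LASTNAME>Wordsworth</LASTNAME></AUTHOR>
  <STANZA>
    <LINE N="1">first placeholder line</LINE>
    <LINE>second placeholder line</LINE>
  </STANZA>
</POEM>"""


def missing_required_attributes(document):
    """Return (element, attribute) pairs where a #REQUIRED attribute is absent."""
    required = {}   # element name -> set of required attribute names
    problems = []

    def attlist_decl(elname, attname, type_, default, is_required):
        if is_required:
            required.setdefault(elname, set()).add(attname)

    def start_element(name, attrs):
        for attname in sorted(required.get(name, ())):
            if attname not in attrs:
                problems.append((name, attname))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = attlist_decl
    parser.StartElementHandler = start_element
    parser.Parse(document, True)
    return problems


print(missing_required_attributes(POEM))  # [('LINE', 'N')]
```

A validating parser would reject the second LINE outright; the sketch only reports it.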
As repeatedly mentioned in this chapter, the success of XML depends entirely on both ends of a transaction sharing a common understanding of what an XML document represents. Without that mutual understanding, XML is merely unnecessary overhead. DTDs are one of the ways to achieve this common understanding. XML schema is the other. Given how imperative this mutual understanding is if XML is to be successful, industry- and application-specific DTDs are already available, such as the following one from the HR-XML Consortium:
<!-- Copyright 2000 The HR-XML Consortium (TM) -->
<!-- version 1.0 October 17 2000 -->
<!-- 11/05/2000
Changed all elements to UpperCamelCase
-->
<!ELEMENT PersonName (FormattedName* , GivenName* , PreferredGivenName? , MiddleName? , FamilyName* , Affix*)>
<!ELEMENT FormattedName (#PCDATA)>
<!ATTLIST FormattedName type (presentation | legal | sortOrder) 'presentation' >
<!ELEMENT GivenName (#PCDATA)>
<!ELEMENT PreferredGivenName (#PCDATA)>
<!ELEMENT MiddleName (#PCDATA)>
<!ELEMENT FamilyName (#PCDATA)>
<!ATTLIST FamilyName primary (true | false | undefined) 'undefined' >
<!ELEMENT Affix (#PCDATA)>
<!ATTLIST Affix type (academicGrade |
                      aristocraticPrefix |
                      aristocraticTitle |
                      familyNamePrefix |
                      familyNameSuffix |
                      formOfAddress |
                      generation) #REQUIRED >
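The attribute declarations in this DTD combine two ideas: an enumerated list of permitted values, and either a default value or a #REQUIRED marker. The Python sketch below mimics, by hand, what a validating processor would do with them. It is an illustration only: the function name `effective_value` and the sample element content are hypothetical, and the permitted values are copied from the FormattedName and Affix declarations.

```python
import xml.etree.ElementTree as ET

# Permitted values copied from the ATTLIST declarations.
FORMATTED_NAME_TYPES = ("presentation", "legal", "sortOrder")
AFFIX_TYPES = (
    "academicGrade", "aristocraticPrefix", "aristocraticTitle",
    "familyNamePrefix", "familyNameSuffix", "formOfAddress", "generation",
)


def effective_value(element, attribute, allowed, default=None):
    """Apply the default, then check the value against the enumeration.

    default=None stands for #REQUIRED: the attribute must be present.
    """
    value = element.get(attribute, default)
    if value is None:
        raise ValueError(
            f"{element.tag}: required attribute {attribute!r} is missing"
        )
    if value not in allowed:
        raise ValueError(
            f"{element.tag}: {attribute}={value!r} is not a permitted value"
        )
    return value


# Made-up sample elements, for illustration only.
formatted = ET.fromstring("<FormattedName>Jane Doe</FormattedName>")
affix = ET.fromstring('<Affix type="generation">Jr.</Affix>')

print(effective_value(formatted, "type", FORMATTED_NAME_TYPES, "presentation"))
print(effective_value(affix, "type", AFFIX_TYPES))
```

Omitting type on FormattedName yields the declared default, whereas omitting it on Affix raises an error because that attribute is #REQUIRED.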
Since the roots of DTDs go back to SGML, their forte is describing conventional text documents. Consequently, DTDs just specify the structure of an XML document, in terms of the elements that make up that document. DTDs, however, do not have a mechanism for describing the data types of the content held within those elements.
XML schema are written in standard XML.
As with DTDs, industry- and application-specific XML schema are already available and many new ones are in the process of being defined. There are vendor-independent initiatives sponsored by the likes of OASIS, such as ebXML (electronic business XML) and tpaML (Trading Partner Agreement Markup Language). Industry-specific XML dialects are also being promoted by the likes of RosettaNet.org, a self-funding, non-profit consortium.