Document Type Definitions


There are two types of XML parsers: validating and nonvalidating. A nonvalidating parser checks an XML document to be sure that it's well formed and returns it to you as a tree of objects. A validating parser, on the other hand, ensures the document is well formed, then checks it against its DTD or schema to determine whether it's valid. In this section, we'll discuss the first of these validation methods: the DTD.
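To make the nonvalidating case concrete, here's a minimal sketch in Python, which is not part of the original discussion. It assumes the standard library's xml.etree.ElementTree module, a nonvalidating parser; the helper name is_well_formed and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<Recipe><Name>Salsa</Name></Recipe>"
bad = "<Recipe><Name>Salsa</Recipe>"        # Name is never closed
odd = "<Recipe><Bogus /></Recipe>"          # well formed, but no DTD would allow Bogus

# A nonvalidating parser hands back a tree of objects for any well-formed input.
tree = ET.fromstring(good)
child_tags = [child.tag for child in tree]

results = (is_well_formed(good), is_well_formed(bad), is_well_formed(odd))
```

Note that the third sample is accepted even though it contains an element no recipe DTD would declare. Catching that kind of problem is the job of a validating parser.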

A DTD is a somewhat antiquated, although still widely used, method of validating documents. DTDs have a peculiar and rather limited syntax, but are still found in lots of XML implementations. Over time, it's likely that XML schemas will become the tool of choice for setting up data validation. That said, there's still plenty of DTD code out there (and there are a few things that DTDs can do that XML schemas can't), so DTDs are still worth knowing about.

A DTD can formalize and codify the tags used in a particular type of document. Because XML itself allows you to use virtually any tags you want as long as the document itself is well formed, a facility is needed to bring structure to documents, to ensure that they make sense. DTDs were the first attempt at doing this. And because DTDs define which tags can and cannot be used in a document, as well as certain characteristics of those tags, they're also used to define new XML dialects, formalized subsets of XML tags and validation rules. Originally, DTDs put the X in XML: They were the means by which new applications of XML were designed.

Let's have a look at a DTD for our earlier Recipe example. Here's what it might look like (Listing 12-4):

Listing 12-4 A DTD for our recipe data.
 <!-- Recipe.DTD, an example DTD for Recipe.XML -->

 <!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?, Step?)>

 <!ELEMENT Name (#PCDATA)>
 <!ELEMENT Description (#PCDATA)>
 <!ELEMENT Ingredients (Ingredient)*>
 <!ELEMENT Ingredient (Qty, Item)>
 <!ELEMENT Qty (#PCDATA)>
 <!ATTLIST Qty unit CDATA #REQUIRED>
 <!ELEMENT Item (#PCDATA)>
 <!ATTLIST Item optional CDATA "0">
 <!ELEMENT Instructions (Step)+>
 <!ELEMENT Step (#PCDATA)>

This DTD defines several characteristics of the document that are worth discussing. First, note the topmost noncomment line in the file, the declaration of the Recipe element. It lists the child elements that a Recipe element may contain, in the order in which they must appear. A question mark after an element name indicates that the element is optional.

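Second, notice the #PCDATA flags. They indicate that the element can contain parsed character data and nothing else (that is, no child elements). Attributes that hold plain text are declared with CDATA instead, as the ATTLIST lines show.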

Third, take note of the #REQUIRED flag. This indicates that the unit attribute of the Qty element is required. Documents that use this DTD may not omit it.

Fourth, note the default value supplied for the Item element's optional attribute. Rather than being required, this attribute can be omitted, as its name suggests. Moreover, for elements that omit the attribute, it defaults to "0."
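The two attribute rules just described can be imitated by hand. The following Python sketch is not a DTD validator; it simply mimics what a validating parser would do for the two ATTLIST declarations in Listing 12-4, using the standard library's xml.etree.ElementTree module. The function name check_recipe_attributes and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def check_recipe_attributes(xml_text):
    """Hand-rolled imitation of two ATTLIST rules (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    errors = []
    # <!ATTLIST Qty unit CDATA #REQUIRED>: every Qty must carry a unit.
    for qty in root.iter("Qty"):
        if "unit" not in qty.attrib:
            errors.append("Qty is missing the required unit attribute")
    # <!ATTLIST Item optional CDATA "0">: supply the default when omitted.
    for item in root.iter("Item"):
        item.attrib.setdefault("optional", "0")
    return root, errors

SAMPLE = """<Recipe>
  <Name>Test</Name>
  <Ingredients>
    <Ingredient><Qty unit="each">6</Qty><Item>Habanero peppers</Item></Ingredient>
    <Ingredient><Qty>1</Qty><Item optional="1">Tequila</Item></Ingredient>
  </Ingredients>
</Recipe>"""

root, errors = check_recipe_attributes(SAMPLE)
optional_flags = [item.get("optional") for item in root.iter("Item")]
```

In this sample, the second Qty element omits unit, so one error is recorded, and the first Item element picks up the default value of "0" for its optional attribute.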

From Listing 12-4, you can see that DTD syntax is not an XML dialect, nor is it terribly intuitive. That's why people are increasingly using schemas instead. We'll discuss XML schemas shortly.

You link a DTD and a document together using a document type declaration at the top of the document (immediately after the <?xml ?> declaration). The document type declaration can contain either an inline copy of the DTD or a reference to its filename using a URI (Uniform Resource Identifier). The one for recipe.xml looks like this:

 <!DOCTYPE Recipe SYSTEM "recipe.dtd"> 
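The inline form mentioned above places the declarations between square brackets inside the document type declaration. Here's a minimal sketch of that form, wrapped in Python so it can be run. The trimmed two-element DTD and the Surprise element are hypothetical, and the parser is the standard library's xml.etree.ElementTree module, which is nonvalidating.

```python
import xml.etree.ElementTree as ET

# A document carrying an inline copy of a (trimmed, hypothetical) DTD.
INLINE_DOC = """<?xml version="1.0"?>
<!DOCTYPE Recipe [
  <!ELEMENT Recipe (Name)>
  <!ELEMENT Name (#PCDATA)>
]>
<Recipe>
  <Name>Salsa</Name>
  <Surprise>not declared in the DTD</Surprise>
</Recipe>"""

# The parser reads the inline DTD but does not enforce it, so the
# undeclared Surprise element is accepted without complaint.
root = ET.fromstring(INLINE_DOC)
tags = [child.tag for child in root]
```

Because this parser only checks well-formedness, the document loads even though it violates its own DTD. A validating parser would reject it.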

Here's the document again with the DTD line included (Listing 12-5):

Listing 12-5 The recipe XML document with the DTD reference included.
 <?xml version="1.0" ?>
 <!DOCTYPE Recipe SYSTEM "recipe.dtd">
 <Recipe>
       <Name>Henderson&apos;s Hotter-than-Hell Habanero Sauce</Name>
       <Description>Homegrown from stuff in my garden (you don&apos;t want to know exactly what).</Description>
       <Ingredients>
             <Ingredient>
                   <Qty unit="each">6</Qty>
                   <Item>Habanero peppers</Item>
             </Ingredient>
             <Ingredient>
                   <Qty unit="each">12</Qty>
                   <Item>Cowhorn peppers</Item>
             </Ingredient>
             <Ingredient>
                   <Qty unit="each">12</Qty>
                   <Item>Jalapeno peppers</Item>
             </Ingredient>
             <Ingredient>
                   <Qty unit="dash" />
                   <Item optional="1">Tequila</Item>
             </Ingredient>
       </Ingredients>
       <Instructions>
             <Step>Chop up peppers, removing their stems, then grind to a liquid.</Step>
             <!-- and so forth... -->
       </Instructions>
 </Recipe>

Validating the data against the DTD can be done through a number of means. If you're using Internet Explorer 5.0 or later, you can use Microsoft's built-in DTD validator simply by loading an XML document into the browser, right-clicking it, and selecting Validate. A number of GUI and command-line tools exist to do the same thing. Several of them are listed on the World Wide Web Consortium site (http://www.w3c.org).



The Guru's Guide to SQL Server Stored Procedures, XML, and HTML
ISBN: 201700468
EAN: N/A
Year: 2005
Pages: 223
