Well-Formed XML


Writing simple XML documents is fairly easy. Suppose that you need to define a document with markup elements that represent a fast-food restaurant's combination meals, each of which contains a burger, a drink, and fries. You might do this because the information will be sent to your suppliers, because you expect to receive electronic orders from customers via e-mail in this format, or simply because it is a convenient way to store your restaurant's data. Whatever the reason, the question is how to do this in XML. You would simply create a file such as burger.xml that contains the following markup:

 <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
 <combomeal>
    <burger>
       <name>Tasty Burger</name>
       <bun bread="white">
          <meat />
          <cheese />
          <meat />
       </bun>
    </burger>
    <fries size="large" />
    <drink size="large">
       Cola
    </drink>
 </combomeal>

A rendering of this example under Internet Explorer is shown in Figure 18-1.

Figure 18-1: Well-formed XML under Internet Explorer
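
The structural view that the browser displays can be approximated in a few lines of code. The following is a minimal sketch in Python, assuming the standard-library `xml.etree.ElementTree` parser; the `outline` helper and the `BURGER_XML` constant are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# The burger.xml markup from above, held as bytes so that the
# encoding named in the XML declaration is honored by the parser.
BURGER_XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<combomeal>
   <burger>
      <name>Tasty Burger</name>
      <bun bread="white">
         <meat />
         <cheese />
         <meat />
      </bun>
   </burger>
   <fries size="large" />
   <drink size="large">
      Cola
   </drink>
</combomeal>
"""


def outline(element, depth=0):
    """Return one indented line per element (hypothetical helper)."""
    attrs = "".join(f' {key}="{value}"' for key, value in element.attrib.items())
    text = (element.text or "").strip()
    line = "  " * depth + f"<{element.tag}{attrs}>"
    if text:
        line += f" {text}"
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(BURGER_XML)

if __name__ == "__main__":
    print("\n".join(outline(root)))
```

Like the browser, this shows only the element tree. Nothing in it knows what a burger or a drink is; the names carry meaning for the author, not for the parser.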

Notice that the browser shows a structural representation of the markup, not a screen representation. You'll see how to make this file actually look like something later in the chapter. First, take a look at the document syntax. In many ways, this example "Combo Meal Markup Language" (or CMML, if you like) looks similar to HTML, but how do you know to name the element <combomeal> instead of <mealdeal> or <lunchspecial>? You don't need to know, because the decision is completely up to you. Simply choose element and attribute names that meaningfully represent the domain that you want to model. Does this mean that XML has no rules? It has rules, but they are few, simple, and relate only to syntax:

  • The document should start with the appropriate XML declaration (XML 1.0 recommends the declaration but does not strictly require it), like so:

     <?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 

    or, more simply, just

     <?xml version="1.0" ?> 
  • A root element must enclose the entire document. In the previous example, notice how the <combomeal> element encloses all other elements. A document has exactly one root element, and every element inside it must also be closed properly.

  • All elements must be closed. The following

     <burger>Tasty 

    is not allowed under XML, but

     <burger>Tasty</burger> 

    would be allowed. Even elements that have no content must be closed properly, as discussed in the next rule, for the document to be well-formed.

  • All elements with empty content must be marked as empty by ending in "/>", just as in XHTML. (A start tag followed immediately by its end tag, such as <br></br>, is also well-formed.) An empty element is one such as the HTML <br>, <hr>, or <img src="test.gif"> tags. In XML and XHTML, these would be represented, respectively, as <br />, <hr />, and <img src="test.gif" />.

  • Just like well-written HTML and XHTML, all elements must be properly nested. For example,

     <outer><inner>ground zero</inner></outer> 

    is correct, whereas this isn't:

     <outer><inner>ground zero</outer></inner> 
  • All attribute values must be quoted. In traditional HTML, quoting is good authoring practice, but it is required only for values that contain characters other than letters (A-Z, a-z), numbers (0-9), hyphens (-), periods (.), underscores (_), or colons (:). XHTML, like XML, requires quoting in every case. For example,

     <blastoff count="10" ></blastoff> 

    is correct, whereas this isn't:

     <blastoff count=10></blastoff> 
  • All elements must be cased consistently. If you start a new element such as <BURGER> , you must close it as </BURGER> , not </burger> . Later in the document, if the element is in lowercase, you actually are referring to a new element known as <burger> . Attribute names also are case sensitive.

  • A well-formed XML file may not contain certain characters with reserved meanings as literal text. These include &, which indicates the beginning of an entity reference such as &amp;, and <, which indicates the start of a tag such as <sunny>. These characters must be coded as &amp; and &lt;, respectively, or must occur in a section marked off as character data (a CDATA section). In fact, in a basic stand-alone XML document this rule is quite restrictive: the only named entities available are the five predefined ones, &amp;, &lt;, &gt;, &apos;, and &quot;. Numeric character references such as &#169; are also allowed.
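
The last rule is the one most easily broken when markup is generated from arbitrary text. The sketch below, in Python, assumes the standard-library helpers `xml.sax.saxutils.escape` and `quoteattr`; the sample strings and the `parses` helper are hypothetical. It shows the same text being accepted when escaped or wrapped in a CDATA section, and rejected when inserted as-is.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw_name = "Burger & Fries <large>"  # hypothetical element content
raw_bread = "white & wheat"          # hypothetical attribute value

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
escaped_name = escape(raw_name)

# quoteattr() escapes the value and also wraps it in quotes.
quoted_bread = quoteattr(raw_bread)

escaped_doc = f"<name bread={quoted_bread}>{escaped_name}</name>"
cdata_doc = f"<name><![CDATA[{raw_name}]]></name>"
unescaped_doc = f"<name>{raw_name}</name>"


def parses(markup):
    """Return True if the parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for label, doc in [("escaped", escaped_doc),
                       ("CDATA", cdata_doc),
                       ("unescaped", unescaped_doc)]:
        print(f"{label:10} {parses(doc)}  {doc}")
```

After parsing, both the escaped and the CDATA forms yield the original text back, so the choice between them is a matter of convenience rather than meaning.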

A document constructed according to the previous simple rules is known as a well-formed document. Figure 18-2 shows what happens to a document that doesn't follow the well-formedness rules presented here.

Figure 18-2: Documents that aren't well-formed won't render
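
The same failures can be reproduced outside a browser. The following is a rough sketch in Python, assuming the standard-library `xml.etree.ElementTree` parser, which rejects anything that is not well-formed by raising `ParseError`. The `SAMPLES` table and the `is_well_formed` helper are hypothetical names; the markup fragments are the ones used in the rules above, plus two small ones added for the root-element and reserved-character rules.

```python
import xml.etree.ElementTree as ET

# One fragment per rule; only the first is well-formed.
SAMPLES = {
    "well-formed": "<burger>Tasty</burger>",
    "unclosed element": "<burger>Tasty",
    "improper nesting": "<outer><inner>ground zero</outer></inner>",
    "unquoted attribute": "<blastoff count=10></blastoff>",
    "inconsistent case": "<BURGER>Tasty</burger>",
    "two root elements": "<burger /><fries />",
    "bare ampersand": "<name>Burger & Fries</name>",
}


def is_well_formed(markup):
    """Return True if the parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for label, markup in SAMPLES.items():
        verdict = "ok" if is_well_formed(markup) else "rejected"
        print(f"{label:20} {verdict}")
```

Note that the parser stops at the first error it meets; unlike a forgiving HTML parser, it makes no attempt to recover and render the rest.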

Markup purists might find the notion of well-formedness somewhat troubling. Traditional SGML has no notion of well-formed documents; instead, it uses the notion of valid documents, that is, documents that adhere to a formally defined document type definition (DTD). For anything beyond casual applications, defining a DTD and validating documents against that definition are real benefits. XML supports both well-formed and valid documents. The well-formed model, which enforces only the basic syntax, should encourage those not schooled in the intricacies of language design and syntax to begin authoring XML documents, thus making XML as accessible as traditional HTML has been. However, the valid model is available for applications in which a document's logical structure needs to be verified. This can be very important when we want to bring meaning to a document.
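
To see why well-formedness alone may not be enough, consider a combo meal with two drinks and no burger. It parses without complaint, yet it is not a meal the restaurant sells. The sketch below is a hand-written structural check in Python and is only a stand-in for real DTD validation: the standard-library parser used here checks well-formedness only, and the rule encoded in `REQUIRED_CHILDREN` (exactly one burger, fries, and drink) is an assumption made for the example, as is the `structure_problems` helper.

```python
import xml.etree.ElementTree as ET

# Assumed rule for illustration: a combo meal holds exactly one of each.
REQUIRED_CHILDREN = ("burger", "fries", "drink")


def structure_problems(markup):
    """List the ways a well-formed document breaks the assumed rule."""
    root = ET.fromstring(markup)  # raises ParseError if not well-formed
    problems = []
    if root.tag != "combomeal":
        problems.append(f"root element is <{root.tag}>, expected <combomeal>")
    for name in REQUIRED_CHILDREN:
        count = len(root.findall(name))
        if count != 1:
            problems.append(f"expected exactly one <{name}>, found {count}")
    return problems


good_meal = "<combomeal><burger /><fries /><drink /></combomeal>"
odd_meal = "<combomeal><drink /><drink /></combomeal>"

if __name__ == "__main__":
    print(structure_problems(good_meal))
    print(structure_problems(odd_meal))
```

A DTD expresses this kind of constraint declaratively and lets a validating parser enforce it, which is the benefit the valid model offers over ad hoc checks like this one.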





HTML & XHTML: The Complete Reference (Osborne Complete Reference Series)
ISBN: 007222942X
Year: 2003
Pages: 252
Authors: Thomas Powell
