In the preceding sections, I've mentioned some of the rules for creating XML documents. In this section, we look at these rules in more detail. Documents that meet the requirements are said to be well formed.
XHTML provides us with a standard set of predefined tags. We have to use the <ul> <li> </li> </ul> tags when we want to create a list. Because there are no predefined tags in XML documents, it's important that the rules for creating documents are strict. You can create any tags you like, provided that you stick to these rules.
Well-formed documents meet the following criteria:
The document contains one or more elements.
The document contains a single root element, which may contain other nested elements.
Each element closes properly.
Start and end tags have matching case.
Elements nest correctly.
Attribute values are contained in quotes.
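One way to try these criteria out is to hand a document to an XML parser and see whether it is accepted. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module; the helper name is_well_formed is my own and not part of any library:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML.

    is_well_formed is a hypothetical helper name used for illustration.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        '<phoneBook><contact id="1"/></phoneBook>',  # meets all of the rules
        '',                                          # contains no elements
        '<contact/><contact/>',                      # more than one root element
        '<name>Sas Jacobs',                          # element never closes
        '<name>Sas Jacobs</Name>',                   # start and end tag case differs
        '<contact><name>Sas</contact></name>',       # elements nest incorrectly
        '<contact id=1/>',                           # attribute value not quoted
    ]
    for sample in samples:
        print(repr(sample), is_well_formed(sample))
```

Only the first sample should be accepted. A parser check like this reports well-formedness only; it says nothing about whether the tags you chose make sense for your data.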
I'll look at each of these rules in a little more detail.
An XML document must have at least one element: the document root. It doesn't have to have any other content, although in most cases it will.
The following XML document is well formed as it contains a single element <phoneBook>:
<?xml version="1.0"?>
<phoneBook/>
Of course, this document doesn't contain any information, so it's not likely to be very useful.
It's more likely that you'll create an XML document where the root element contains other elements. The following listing shows an example of this structure:
<?xml version="1.0"?>
<phoneBook>
  <contact id="1">
    <name>Sas Jacobs</name>
    <address>123 Some Street, Some City, Some Country</address>
    <phone>123 456</phone>
  </contact>
</phoneBook>
As long as all of the elements are contained inside a single root element, the document is well formed.
This listing shows a document without a root element. This document is not well formed.
<?xml version="1.0"?>
<contact id="1">
  <name>Sas Jacobs</name>
</contact>
<contact id="2">
  <name>John Smith</name>
</contact>
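The usual fix for a document like this is to wrap the sibling elements in a single root. The sketch below shows the idea in Python, assuming the standard library's xml.etree.ElementTree module. The helper names parses and wrap_in_root are hypothetical, and the XML declaration is left out to keep the strings short:

```python
import xml.etree.ElementTree as ET

# The two contacts from the listing above, with no root element around them.
FRAGMENTS = (
    '<contact id="1"><name>Sas Jacobs</name></contact>'
    '<contact id="2"><name>John Smith</name></contact>'
)


def parses(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def wrap_in_root(fragments: str, root_name: str = "phoneBook") -> str:
    """Enclose sibling elements in a single root element (illustrative helper)."""
    return f"<{root_name}>{fragments}</{root_name}>"


phone_book = ET.fromstring(wrap_in_root(FRAGMENTS))
names = [contact.findtext("name") for contact in phone_book]

if __name__ == "__main__":
    print(parses(FRAGMENTS))                # rejected: two top-level elements
    print(parses(wrap_in_root(FRAGMENTS)))  # accepted: one root element
    print(names)
```

Once the contacts sit inside <phoneBook>, the parser can walk them as children of one root.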
You must close all elements correctly. The way you do this depends on whether or not the element is empty, i.e., whether it contains text or other elements.
You can close empty elements by adding a forward slash to the opening tag:
<name/>
In the case of a nonempty element, you have to add a closing tag, which must appear after the opening tag:
<name>Sas Jacobs</name>
You can also write empty elements with a closing tag:
<name></name>
As XML is case sensitive, start and end tag names must match exactly. The following examples are incorrect:
<name>Sas Jacobs</Name>
<Name>Sas Jacobs</name>
You would rewrite them as
<name>Sas Jacobs</name>
<Name>Sas Jacobs</Name>
The following example is also incorrect:
<name>Sas Jacobs
<name>John Smith
The elements have an opening tag but no corresponding closing tag. This rule also applies to XHTML. In XHTML, you can't use the following code, which was acceptable in HTML:
<p>A paragraph of information.
<p>Another paragraph.
I'll talk about the differences between XML, HTML, and XHTML a little later in this chapter.
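To see what case sensitivity means in practice, the following sketch parses a few snippets in Python, assuming the standard library's xml.etree.ElementTree module. The accepts helper is a hypothetical name of my own. Notice that <name> and <Name> can appear side by side, because the parser treats them as two unrelated elements:

```python
import xml.etree.ElementTree as ET


def accepts(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# <name> and <Name> are two different element names to an XML parser.
contact = ET.fromstring(
    "<contact><name>Sas Jacobs</name><Name>John Smith</Name></contact>"
)
lower = contact.findtext("name")  # text of the <name> element
upper = contact.findtext("Name")  # text of the <Name> element

# End tag differs in case from the start tag.
mixed_case = accepts("<name>Sas Jacobs</Name>")
# HTML-style paragraphs with no closing </p> tags.
unclosed = accepts("<p>A paragraph of information.<p>Another paragraph.")
# Matching case and a proper closing tag.
corrected = accepts("<name>Sas Jacobs</name>")

if __name__ == "__main__":
    print(lower, "|", upper)
    print(mixed_case, unclosed, corrected)
```

Only the corrected snippet is accepted; the mixed-case and unclosed versions are both rejected by the parser.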
You must close elements in the correct order. In other words, child elements must close before their parent elements.
This line is incorrect:
<contact><name>Sas Jacobs</contact></name>
and should be rewritten as
<contact><name>Sas Jacobs</name></contact>
All attribute values must be contained in quotes. You can use either single or double quotes; these two lines are equivalent:
<contact id="1">
<contact id='1'>
If your attribute value contains a single quote, you have to use double quotes, and vice versa:
<contact name="O'Malley"/>
<contact nickname='John "Bo bo" Smith'/>
You can also replace the quote characters inside an attribute value with the predefined entities &apos; and &quot;:
<contact name="O&apos;Malley"/>
<contact nickname='John &quot;Bo bo&quot; Smith'/>
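The quoting options can be compared by parsing each form and looking at the attribute value the parser reports. This is a sketch in Python, assuming the standard library's xml.etree.ElementTree module and the quoteattr helper from xml.sax.saxutils; neither is mentioned in the text above, and the variable names are my own:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Single and double quotes around a value are interchangeable.
double_quoted = ET.fromstring('<contact id="1"/>').get("id")
single_quoted = ET.fromstring("<contact id='1'/>").get("id")

# A literal apostrophe and the &apos; entity give the same attribute value.
literal = ET.fromstring('<contact name="O\'Malley"/>').get("name")
entity = ET.fromstring('<contact name="O&apos;Malley"/>').get("name")

# The same holds for double quotes and &quot;.
nickname = ET.fromstring(
    "<contact nickname='John &quot;Bo bo&quot; Smith'/>"
).get("nickname")

# quoteattr wraps a value in whichever quote style does not clash with it.
quoted_apostrophe = quoteattr("O'Malley")
quoted_nickname = quoteattr('John "Bo bo" Smith')

if __name__ == "__main__":
    print(double_quoted == single_quoted)
    print(literal == entity)
    print(nickname)
    print(quoted_apostrophe, quoted_nickname)
```

When you generate XML from code, a helper such as quoteattr saves you from choosing the quote style by hand.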
If you try to view an XML document that is not well formed, you'll see an error. For example, opening a document that isn't well formed in a web browser will cause an error message similar to the one shown in Figure 2-4. This is quite different from HTML documents; most web browsers will ignore any HTML errors such as missing </p> tags.
An XML editor such as XMLSpy often provides more detailed information about the error. You can see the same XML document displayed in XMLSpy in Figure 2-5.
The error message shows that the closing element name </phoneBook> was expected.
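You can get similar information from a parser in code. The sketch below uses Python's standard xml.etree.ElementTree module to report where parsing failed. The BROKEN document is a made-up example with a missing </phoneBook> tag, not the file shown in the figures, and describe_error is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the closing </phoneBook> tag is missing.
BROKEN = """<?xml version="1.0"?>
<phoneBook>
  <contact id="1">
    <name>Sas Jacobs</name>
  </contact>
"""


def describe_error(text: str):
    """Return ((line, column), message) for the first error, or None if well formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        # ParseError.position holds the (line, column) where parsing stopped.
        return exc.position, str(exc)
    return None


if __name__ == "__main__":
    print(describe_error(BROKEN))
    print(describe_error(BROKEN + "</phoneBook>"))
```

The wording of the message depends on the parser, so it is unlikely to match a browser or XMLSpy exactly, but the line and column should point you to the same place.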
You can identify your XHTML documents as XML, which must be well formed, by adding an XML declaration at the top of the file before the DOCTYPE declaration. The DOCTYPE declaration should contain a reference to the appropriate XHTML DTD. The DTD can specify strict or transitional conformance by including one of the following two declarations:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Strictly conforming documents must meet the mandatory requirements in the XHTML specification. If you are declaring a strictly conforming document, you should include a namespace in the <html> tag. The following listing shows the W3C recommendation for well-formed XHTML documents:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
You can see this W3C recommendation at www.w3.org/TR/xhtml11/conformance.html.
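Because a well-formed XHTML page is also XML, a general-purpose XML parser can read it. The following sketch uses Python's standard xml.etree.ElementTree module on a small page of my own invention that starts with the declarations shown earlier. It checks well-formedness only and does not validate the page against the DTD:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace bound to the xml: prefix

# Hypothetical sample page; the head and body content are made up for illustration.
PAGE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head><title>Sample page</title></head>
  <body><p>A paragraph of information.</p></body>
</html>
"""

html = ET.fromstring(PAGE)

# ElementTree writes namespaced names as {namespace-uri}local-name.
root_tag = html.tag
language = html.get(f"{{{XML_NS}}}lang")
paragraphs = [p.text for p in html.iter(f"{{{XHTML_NS}}}p")]

if __name__ == "__main__":
    print(root_tag)
    print(language)
    print(paragraphs)
```

The namespace declared in the <html> tag shows up in every element name, which is why the search for paragraphs has to include it.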