The name of the game in XML is data storage, and that’s what XML excels at. XML got to be so popular because it’s a text-based way of storing your data, and the Internet is based on text transfer. So XML became the Internet’s way of slinging data around, as you already know now that you’re an Ajax developer.
When you create an XML document, you also create the tags that go into that document. Unlike HTML, XML has no set element tag names that you have to work with; you’re free to create your own tags and structure your data as you like. However, there are a number of rules to creating XML, and you’ll see the most important ones here.
Tip | For the full XML story, take a look at the XML specification published by the World Wide Web Consortium, the people responsible for the XML specs, at www.w3.org/TR/REC-xml. |
To start an XML document, you need to use an XML declaration, which indicates which version of XML you’re using. Currently, only two versions are possible: version 1.0, the most common, and version 1.1. Version 1.1 is different from version 1.0 largely in the number of Unicode characters that are legal, but that doesn’t concern this discussion very much.
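Here’s what an XML declaration looks like (bear in mind that the declaration, when present, must be the very first thing in the document, with no space between <? and xml; XML 1.0 says documents should begin with one, and XML 1.1 requires it):
<?xml version="1.0"?>
. . .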
In addition, every XML document must contain a document element, that is, a single element that contains all the other elements in the XML document:
<?xml version="1.0"?>
<party>
    . . .
</party>
Every other element inside the document must be enclosed inside the document element:
<?xml version="1.0"?>
<party>
    <frank>
        . . .
    </frank>
    <mary>
        . . .
    </mary>
    <tom>
        . . .
    </tom>
</party>
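The single-document-element rule can be checked with any XML parser. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module; the placeholder text "guest" and the helper name is_well_formed are assumptions made for illustration, not part of the original example.

```python
import xml.etree.ElementTree as ET

# The party document from the text; "guest" is placeholder content.
PARTY = (
    '<?xml version="1.0"?>'
    "<party>"
    "<frank>guest</frank>"
    "<mary>guest</mary>"
    "<tom>guest</tom>"
    "</party>"
)


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(PARTY)
root_tag = root.tag                           # name of the document element
child_tags = [child.tag for child in root]    # elements enclosed by it

# A second top-level element breaks the single-document-element rule.
two_roots_ok = is_well_formed(
    '<?xml version="1.0"?><party></party><party></party>'
)
```

Here root_tag holds the name of the document element, and two_roots_ok comes out False because the parser stops at the second top-level element.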
Each XML element starts with a start tag and ends with an end tag:
<tom> . . . </tom>
and can contain other elements:
<tom>
    <address>
        . . .
    </address>
    <phone>
        . . .
    </phone>
</tom>
and/or text:
<tom> Tom is a pretty good guy. </tom>
unless the element is an empty element, in which case it can’t contain any content, whether other elements or text. In XML, the shortcut for an empty element is this:
<tom />
Note | See the /> at the end of the element? You use this to avoid having to use both a start tag and an end tag for empty elements. |
You create your own tag names in XML, but there are some rules about what tag names are legal. Tag names can’t start with a number, can’t contain spaces, and can’t contain a few other illegal characters, such as quotation marks. Here are some illegal tag names:
<12steps> <big dog> <"ok">
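One quick way to see these naming rules in action is to hand each candidate name to a parser. This is only a sketch, again assuming Python's standard xml.etree.ElementTree module; the helper is_well_formed and the legal name "steps" are illustrative additions.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One legal name for comparison, followed by the three illegal names above.
candidates = {
    "steps": "<steps></steps>",
    "12steps": "<12steps></12steps>",   # starts with a number
    "big dog": "<big dog></big dog>",   # contains a space
    '"ok"': '<"ok"></"ok">',            # contains quotation marks
}

results = {name: is_well_formed(doc) for name, doc in candidates.items()}
```

Only the entry for "steps" is accepted; the parser rejects the other three.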
Elements can also contain attributes, just as in HTML, which you use to give more information about the element. You give the attributes in the start tag of an element. Here are some examples:
<wife name = "Sally Rogers">
<movie title = "Withnail and I">
<dinner type = "turkey" side = "potatoes">
Empty elements can also have attributes:
<tom phone = "555-1212"/>
Note the syntax here. If you use an attribute, you always have to assign a value to that attribute (unlike in HTML), and that value has to be quoted (also unlike HTML, where browsers are very tolerant of unquoted values). So in XML, attributes have to be attribute name/value pairs, used like this:
type = "Swedish"
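The attribute rules can be sketched the same way. The example below assumes Python's standard xml.etree.ElementTree module and reuses the dinner element from the earlier examples; the helper parse_attributes is a hypothetical name introduced here.

```python
import xml.etree.ElementTree as ET


def parse_attributes(text):
    """Return the element's attributes as a dict, or None if not well-formed."""
    try:
        return dict(ET.fromstring(text).attrib)
    except ET.ParseError:
        return None


# Quoted name/value pairs are accepted, with or without spaces around "=".
quoted = parse_attributes('<dinner type="turkey" side="potatoes"/>')
spaced = parse_attributes('<dinner type = "Swedish"/>')

# Unquoted values and value-less attributes, both tolerated in HTML,
# are rejected by an XML parser.
unquoted = parse_attributes("<dinner type=turkey/>")
no_value = parse_attributes("<dinner type/>")
```

The first two calls return dictionaries of attribute names and values, while the last two return None because the parser refuses the element.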
In addition, each XML element has to be nested properly. For example, you can’t mix start and end tags for different elements, like this:
<?xml version="1.0"?>
<party>
    <frank>
        . . .
    <mary>
    </frank>
        . . .
    </mary>
    <tom>
        . . .
    </tom>
</party>
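A parser reports this kind of overlap as an error and can tell you where it happened. The sketch below assumes Python's standard xml.etree.ElementTree module, whose ParseError carries a position attribute holding a (line, column) pair; the text "guest" and the helper name find_nesting_error are placeholders added for illustration.

```python
import xml.etree.ElementTree as ET

# The mis-nested party document from the text, with placeholder text.
BAD_NESTING = """<?xml version="1.0"?>
<party>
  <frank>guest <mary> </frank> guest </mary>
  <tom>guest</tom>
</party>"""

# The same data with each element closed before the next one opens.
GOOD_NESTING = """<?xml version="1.0"?>
<party>
  <frank>guest</frank>
  <mary>guest</mary>
  <tom>guest</tom>
</party>"""


def find_nesting_error(text):
    """Return (line, column) of the first parse error, or None if well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position
    return None


bad_position = find_nesting_error(BAD_NESTING)     # a (line, column) tuple
good_position = find_nesting_error(GOOD_NESTING)   # None: nothing to report
```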
Here’s an example, event.xml, of a well-nested XML document:
<?xml version="1.0"?>
<events>
    <event type="fundraising">
        <event_title>National Awards</event_title>
        <event_number>3</event_number>
        <subject>Pet Awards</subject>
        <date>5/5/2007</date>
        <people>
            <person attendance="present">
                <first_name>June</first_name>
                <last_name>Allyson</last_name>
            </person>
            <person attendance="absent">
                <first_name>Virginia</first_name>
                <last_name>Mayo</last_name>
            </person>
            <person attendance="present">
                <first_name>Jimmy</first_name>
                <last_name>Stewart</last_name>
            </person>
        </people>
    </event>
</events>
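Because event.xml is well nested, a program can walk its structure directly. As a rough sketch, assuming Python's standard xml.etree.ElementTree module and the document held in a string rather than read from a file, this pulls out the event title and the people marked as present:

```python
import xml.etree.ElementTree as ET

# Contents of event.xml, embedded as a string to keep the sketch self-contained.
EVENT_XML = """<?xml version="1.0"?>
<events>
    <event type="fundraising">
        <event_title>National Awards</event_title>
        <event_number>3</event_number>
        <subject>Pet Awards</subject>
        <date>5/5/2007</date>
        <people>
            <person attendance="present">
                <first_name>June</first_name>
                <last_name>Allyson</last_name>
            </person>
            <person attendance="absent">
                <first_name>Virginia</first_name>
                <last_name>Mayo</last_name>
            </person>
            <person attendance="present">
                <first_name>Jimmy</first_name>
                <last_name>Stewart</last_name>
            </person>
        </people>
    </event>
</events>"""

root = ET.fromstring(EVENT_XML)      # the <events> document element
event = root.find("event")           # first <event> child
title = event.findtext("event_title")
event_type = event.get("type")       # attribute on the start tag

# Collect the full names of everyone whose attendance attribute is "present".
present = [
    f'{person.findtext("first_name")} {person.findtext("last_name")}'
    for person in event.iter("person")
    if person.get("attendance") == "present"
]
```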
There are two primary correctness criteria for XML documents: well-formedness and validity. The XML specification contains the rules for well-formedness, and the primary one is that elements must be nested properly.
When you create an XML document, you can specify its syntax rules (for example, which elements are allowed to nest inside which others, and in which sequence), and a document that adheres to the syntax rules you specify for it is called valid. There are two ways of specifying the syntax rules of a document: you can use a document type definition or an XML schema.
On the Web | For more information on DTD, see www.w3.org/TR/REC-xml. For more information on XML schema, see www.w3.org/XML/Schema. |
How you create DTDs and schemas is beyond the scope of this book, but some browsers, like Internet Explorer, let you validate XML if you supply a DTD or a schema. Following is an example that illustrates what a DTD would look like for event.xml:
<?xml version="1.0"?>
<!DOCTYPE events [
<!ELEMENT events (event*)>
<!ELEMENT event (event_title, event_number, subject, date, people*)>
<!ELEMENT event_title (#PCDATA)>
<!ELEMENT event_number (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT people (person*)>
<!ELEMENT person (first_name,last_name)>
<!ATTLIST event type CDATA #IMPLIED>
<!ATTLIST person attendance CDATA #IMPLIED>
]>
<events>
    <event type="fundraising">
        <event_title>National Awards</event_title>
        <event_number>3</event_number>
        <subject>Pet Awards</subject>
        <date>5/5/2007</date>
        <people>
            <person attendance="present">
                <first_name>June</first_name>
                <last_name>Allyson</last_name>
            </person>
            <person attendance="absent">
                <first_name>Virginia</first_name>
                <last_name>Mayo</last_name>
            </person>
            <person attendance="present">
                <first_name>Jimmy</first_name>
                <last_name>Stewart</last_name>
            </person>
        </people>
    </event>
</events>
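The difference between well-formed and valid shows up as soon as you use a parser that does not validate. Python's standard xml.etree.ElementTree module is built on the non-validating Expat parser, so in the sketch below a document that breaks its own DTD still parses without complaint. The cut-down DTD and the surprise element are assumptions invented for the illustration; checking validity would need a validating parser, which this sketch does not cover.

```python
import xml.etree.ElementTree as ET

# A shortened DTD that allows only <event> inside <events>, followed by a
# document that breaks that rule with a hypothetical <surprise/> element.
INVALID_BUT_WELL_FORMED = """<?xml version="1.0"?>
<!DOCTYPE events [
<!ELEMENT events (event*)>
<!ELEMENT event (event_title)>
<!ELEMENT event_title (#PCDATA)>
]>
<events>
    <surprise/>
</events>"""

# The tags nest properly, so a non-validating parser accepts the document
# even though the DTD does not declare <surprise>.
root = ET.fromstring(INVALID_BUT_WELL_FORMED)
child_tags = [child.tag for child in root]
```

The parse succeeds and child_tags contains the undeclared element, which is why well-formedness alone says nothing about whether a document follows its DTD.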