Tags and Elements

You give structure to an XML document by using markup, which consists of elements. An XML element, in turn, consists of a start tag and an end tag, except for elements that are defined to be empty, which consist of a single tag.

A start tag (also called an opening tag) starts with < and ends with >. End tags (also called closing tags) begin with </ and end with >.

Tag Names

The XML specification is strict about tag names: a tag name must start with a letter, an underscore, or a colon. The characters that follow may be letters, digits, underscores, hyphens, periods, and colons (but no whitespace).

Avoid Colons in Tag Names

Although the XML 1.0 recommendation does not say so, you should definitely avoid using colons in tag names because you use a colon when specifying namespaces in XML. I'll discuss this later in the chapter.
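To see why colons cause trouble in practice, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which is namespace-aware. The tag name my:tag and the URI http://example.com/ns are made up for illustration; they do not come from this chapter.

```python
import xml.etree.ElementTree as ET

# A namespace-aware parser reads the part before the colon as a namespace
# prefix. With no matching xmlns declaration, the document is rejected.
try:
    ET.fromstring("<my:tag>text</my:tag>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# Once the (hypothetical) prefix is declared, the same name parses, and the
# tag is reported as {namespace-URI}local-name rather than "my:tag".
declared = ET.fromstring(
    '<my:tag xmlns:my="http://example.com/ns">text</my:tag>'
)

print(undeclared_ok)  # False
print(declared.tag)   # {http://example.com/ns}tag
```

In other words, a colon in a tag name is not treated as just another character by this kind of parser, which is the practical reason to leave it out of ordinary names.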

Here are some allowed XML tags:

 <DOCUMENT>
 <document>
 <_Record>
 <customer>
 <PRODUCT>

Note that because XML processors are case sensitive, the <DOCUMENT> tag is not the same as a <document> tag. (In fact, you can even have <DOCUMENT> and <document> and even <DoCuMeNt> tags as different tags in the same document, but I strongly recommend against it.)

Here are the corresponding closing tags:

 </DOCUMENT>
 </document>
 </_Record>
 </customer>
 </PRODUCT>
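Case sensitivity is easy to check for yourself. The following sketch, which assumes Python's standard xml.etree.ElementTree parser (the helper name is_well_formed and the <root> wrapper element are my own, not part of the chapter), shows that a closing tag must match its opening tag exactly and that differently cased names count as different elements.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The closing tag must repeat the opening tag's name, including its case.
matched = is_well_formed("<DOCUMENT>Hello</DOCUMENT>")
mismatched = is_well_formed("<DOCUMENT>Hello</document>")

# Names that differ only in case are distinct elements.
root = ET.fromstring("<root><DOCUMENT/><document/><DoCuMeNt/></root>")
child_tags = [child.tag for child in root]

print(matched)     # True
print(mismatched)  # False
print(child_tags)  # ['DOCUMENT', 'document', 'DoCuMeNt']
```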

Here are some tags that XML considers illegal:

 <2003DOCUMENT>
 <.document>
 <Record Number>
 <customer*name>
 <PRODUCT(ID)>
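The naming rules can be approximated with a regular expression. This is only a sketch: it covers ASCII characters, whereas XML 1.0 also permits many non-ASCII letters in names, and the function name is_valid_tag_name is my own invention rather than part of any XML library.

```python
import re

# ASCII-only approximation of the rules described above:
# first character: a letter, an underscore, or a colon;
# later characters: letters, digits, underscores, hyphens, periods, colons.
NAME_PATTERN = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:\-]*")

def is_valid_tag_name(name):
    """Return True if the whole name matches the ASCII approximation."""
    return NAME_PATTERN.fullmatch(name) is not None

allowed = ["DOCUMENT", "document", "_Record", "customer", "PRODUCT"]
illegal = ["2003DOCUMENT", ".document", "Record Number",
           "customer*name", "PRODUCT(ID)"]

allowed_results = [is_valid_tag_name(name) for name in allowed]
illegal_results = [is_valid_tag_name(name) for name in illegal]

print(allowed_results)  # [True, True, True, True, True]
print(illegal_results)  # [False, False, False, False, False]
```

Each illegal name fails for a different reason: a leading digit, a leading period, embedded whitespace, an asterisk, and parentheses.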

Using start and end tags, you can create elements. The following example has three elements: <DOCUMENT>, <GREETING>, and <MESSAGE>. The <DOCUMENT> element contains the <GREETING> and <MESSAGE> elements:

 <?xml version = "1.0" standalone="yes"?>
 <DOCUMENT>
     <GREETING>
         Hello From XML
     </GREETING>
     <MESSAGE>
         Welcome to the wild and woolly world of XML.
     </MESSAGE>
 </DOCUMENT>
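To confirm how a parser sees this containment, here is a short sketch that feeds the same document to Python's standard xml.etree.ElementTree module. The choice of parser is my own; any XML processor should report the same structure.

```python
import xml.etree.ElementTree as ET

doc = """<?xml version = "1.0" standalone="yes"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>"""

root = ET.fromstring(doc)

# The root element is <DOCUMENT>; its children are the contained elements.
child_tags = [child.tag for child in root]

# Element text includes the surrounding whitespace, so strip it for display.
greeting_text = root.find("GREETING").text.strip()

print(root.tag)       # DOCUMENT
print(child_tags)     # ['GREETING', 'MESSAGE']
print(greeting_text)  # Hello From XML
```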

You can also create elements without using end tags if the elements are explicitly declared to be empty.

Empty Elements

Empty elements have only one tag, not a start and end tag. You may be familiar with empty elements from HTML; for example, the HTML <IMG>, <HR>, and <BR> elements are empty, which is to say that they do not enclose any content (either character data or markup).

Empty elements are represented with only one tag (in HTML, there are no closing </IMG>, </HR>, and </BR> tags). In XML, you can declare elements to be empty in the document's DTD, as we'll see in the next chapter.

In XML, you close an empty element with />. For example, if the <GREETING> element is empty, it might look like this in an XML document:

 <?xml version = "1.0" standalone="yes"?>
 <DOCUMENT>
     <GREETING TEXT = "Hello From XML" />
 </DOCUMENT>

This usage might seem a little strange at first, but this is XML's way of making sure that an XML processor isn't left searching for a nonexistent closing tag. In fact, in XHTML, which is a reformulation of HTML in XML, the <IMG>, <HR>, and <BR> tags are actually written as <IMG />, <HR />, and <BR /> (except that XHTML tags actually use lowercase). The additional / doesn't seem to give the major browsers any trouble. We'll see how to declare empty elements in the next chapter.
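Here is a small sketch of what the /> form buys you, again assuming Python's standard xml.etree.ElementTree parser. It compares the single-tag form with an explicit start and end tag pair, and then shows what happens when the element is left unclosed, as HTML would allow. The variable names are my own.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring(
    '<DOCUMENT><GREETING TEXT = "Hello From XML" /></DOCUMENT>'
)
long_form = ET.fromstring(
    '<DOCUMENT><GREETING TEXT = "Hello From XML"></GREETING></DOCUMENT>'
)

# Collect the attribute value and the number of child elements for each form.
results = []
for root in (short_form, long_form):
    greeting = root.find("GREETING")
    results.append((greeting.get("TEXT"), len(greeting)))

# Leaving the element unclosed is not well-formed XML.
try:
    ET.fromstring('<DOCUMENT><GREETING TEXT = "Hello From XML"></DOCUMENT>')
    unclosed_ok = True
except ET.ParseError:
    unclosed_ok = False

print(results)      # [('Hello From XML', 0), ('Hello From XML', 0)]
print(unclosed_ok)  # False
```

Both well-formed variants yield the same element with the same attribute and no children; only the unclosed version is rejected.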



Real World XML
Real World XML (2nd Edition)
ISBN: 0735712867
Year: 2005
Pages: 440
Authors: Steve Holzner
