XML Document Structure


Listing 14.1 is an XML document that describes multiple companies and their employees.

Listing 14.1. CompaniesAndEmployees.xml: An XML Document

<?xml version="1.0" ?>
<companies>
  <company name="ABC MegaCorp, Inc." location="NY">
    <comments>A very big company that does many different things.</comments>
    <employee ssn="123-45-6789">
      <first-name>Ed</first-name>
      <last-name>Johnson</last-name>
      <department>Human Resources</department>
      <children>
        <child name="Sean" />
        <child name="Polly" />
      </children>
    </employee>
    <employee ssn="541-29-8376">
      <first-name>Maria</first-name>
      <last-name>Smith</last-name>
      <department>Accounting</department>
    </employee>
  </company>
  <company name="Baker &amp; Associates" location="CA">
    <comments>A midsized investment firm.</comments>
    <employee ssn="568-73-1924">
      <first-name>Eric</first-name>
      <last-name>Masters</last-name>
      <department>Accounting</department>
    </employee>
    <employee ssn="714-52-6938">
      <first-name>Tanya</first-name>
      <last-name>Peeples</last-name>
      <department>Public Relations</department>
      <children>
        <child name="Sandra" />
      </children>
    </employee>
  </company>
</companies>

Let's break down the composition of this document.

The first line is called an XML declaration. It tells an XML parser that the ensuing content is XML and that it conforms to version 1.0 of the XML specification. Though this declaration is not required in XML 1.0 documents, it's a good habit to always put it at the top of the document. When it is present, it must be the very first thing in the document.

The second line is the opening tag for this document's root element. Every XML document must have one and only one element at the root of its hierarchy, and in this case the root element is companies.

Between the opening and closing tags of the companies element, there are company elements, one for each of the companies in this collection. Each company element contains a comments element that describes the company, and one or more employee elements, one for each of the company's employees. Each employee element contains first-name, last-name, and department elements, as well as an optional children element to describe the employee's children.
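To make the hierarchy concrete, here is a minimal sketch that walks an abbreviated copy of Listing 14.1. It uses Python's standard-library xml.etree.ElementTree module purely as an illustration; it is not ColdFusion code, and the summarize function and XML_TEXT variable are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of Listing 14.1 (first company only).
XML_TEXT = """<?xml version="1.0" ?>
<companies>
  <company name="ABC MegaCorp, Inc." location="NY">
    <comments>A very big company that does many different things.</comments>
    <employee ssn="123-45-6789">
      <first-name>Ed</first-name>
      <last-name>Johnson</last-name>
      <department>Human Resources</department>
      <children>
        <child name="Sean" />
        <child name="Polly" />
      </children>
    </employee>
    <employee ssn="541-29-8376">
      <first-name>Maria</first-name>
      <last-name>Smith</last-name>
      <department>Accounting</department>
    </employee>
  </company>
</companies>"""


def summarize(root):
    """Return [(company name, [(employee full name, [child names])])]."""
    result = []
    for company in root.findall("company"):
        employees = []
        for employee in company.findall("employee"):
            first = employee.findtext("first-name")
            last = employee.findtext("last-name")
            # The children element is optional, so this may match nothing.
            kids = [c.get("name") for c in employee.findall("children/child")]
            employees.append((f"{first} {last}", kids))
        result.append((company.get("name"), employees))
    return result


root = ET.fromstring(XML_TEXT)
summary = summarize(root)
print(root.tag)   # the single root element
print(summary)
```

Note how the optional children element needs no special handling here: an employee without one simply yields an empty list.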

The structure of an XML document is more formal than that of an HTML document. XML is case-sensitive (so <a> and <A> are two different tags), whereas HTML is not (so <a> and <A> are equivalent). XML also requires that all attribute values be quoted, using either double or single quotes. In HTML, we could write location=CA, but XML requires that we write location="CA" (or location='CA') instead.
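A quick way to see these rules enforced is to hand a parser a few small fragments. The sketch below again assumes Python's standard-library ElementTree parser rather than ColdFusion; is_well_formed is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    # Opening and closing tags match exactly.
    "matching case": is_well_formed('<company location="CA"></company>'),
    # <company> and <Company> are different tags in XML.
    "mismatched case": is_well_formed('<company location="CA"></Company>'),
    # HTML-style unquoted attribute values are not allowed.
    "unquoted value": is_well_formed("<company location=CA></company>"),
}
print(checks)
```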

Probably the biggest problem people have with XML concerns the requirement for closing tags. In an XML document, every tag must be closed. Even if the tag has no content (such as the child elements in Listing 14.1), you must close the tag; if you don't, the XML parser that processes the document will throw an error. In the case of the <child> tags in Listing 14.1, we used a special shorthand closing syntax: <child name="Sandra"></child> is equivalent to <child name="Sandra" />.
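The following sketch illustrates both points: the two ways of closing an empty element parse to the same thing, and a missing closing tag is rejected. It assumes Python's standard-library ElementTree parser as a stand-in for whatever parser you use; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The explicit closing tag and the shorthand describe the same empty element.
long_form = ET.fromstring('<child name="Sandra"></child>')
short_form = ET.fromstring('<child name="Sandra" />')

same_element = (
    long_form.tag == short_form.tag
    and long_form.attrib == short_form.attrib
    and long_form.text == short_form.text      # both have no text content
    and len(long_form) == len(short_form)      # both have no child elements
)

# Leaving the child tag unclosed makes the document malformed.
try:
    ET.fromstring('<children><child name="Sandra"></children>')
    unclosed_error = None
except ET.ParseError as exc:
    unclosed_error = str(exc)

print(same_element)
print(unclosed_error)
```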

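One last note: the following characters have special meaning in XML markup and should be escaped in attribute values and element content using their equivalent entity escape codes. The ampersand and the less-than sign are always illegal when unescaped; the greater-than sign and the double quote are legal in some positions, but escaping them everywhere is the safe habit: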

  • Ampersand (&): &amp;

  • Greater-than sign (>): &gt;

  • Less-than sign (<): &lt;

  • Double quote ("): &quot;

Entity references will be discussed in more detail later in the chapter.
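As a rough illustration of escaping in practice, the sketch below builds a small fragment from raw strings and checks that the original text survives a round trip through a parser. It assumes Python's standard-library xml.sax.saxutils helpers (escape and quoteattr) and ElementTree, not ColdFusion; the sample comment text is made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

company_name = "Baker & Associates"          # contains an ampersand
comment = 'Assets < $1B, motto: "Buy low"'   # made-up text with < and quotes

# escape() replaces &, < and > so the text is safe as element content.
safe_comment = escape(comment)
# quoteattr() escapes the value and adds the surrounding quotes.
safe_name = quoteattr(company_name)

document = f"<company name={safe_name}><comments>{safe_comment}</comments></company>"
parsed = ET.fromstring(document)

# The parser turns the entity references back into the original characters.
round_trip_ok = (
    parsed.get("name") == company_name
    and parsed.findtext("comments") == comment
)

# By contrast, an unescaped ampersand is rejected outright.
try:
    ET.fromstring('<company name="Baker & Associates" />')
    raw_ampersand_ok = True
except ET.ParseError:
    raw_ampersand_ok = False

print(safe_name)
print(round_trip_ok, raw_ampersand_ok)
```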

Following the example and guidelines outlined in this section ensures that an XML document is well formed, meaning that it follows the standard rules of XML document structure. Any XML parser should be able to read a well-formed document.

Elements and Their Attributes

There is an ongoing discussion in the XML world about when to use elements to store data and when to use attributes. For instance, this portion of Listing 14.1:

<employee ssn="568-73-1924">
  <first-name>Eric</first-name>
  <last-name>Masters</last-name>
  <department>Accounting</department>
</employee>

could also have been represented like this:

 <employee ssn="568-73-1924" first-name="Eric" last-name="Masters" department="Accounting" /> 

Some people like using elements; others prefer attributes. Different groups claim that one method is inherently superior to the other, but this is not really true. The decision as to whether to use an attribute or a child element in any given circumstance is one that should be made according to the developer's opinion of what will work for that situation.
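To show that the two representations carry the same information, here is a minimal sketch that reads both forms into the same structure. It assumes Python's standard-library ElementTree module rather than ColdFusion, and the FIELDS tuple and the two reader functions are hypothetical names for this example.

```python
import xml.etree.ElementTree as ET

FIELDS = ("first-name", "last-name", "department")

as_elements = ET.fromstring(
    '<employee ssn="568-73-1924">'
    "<first-name>Eric</first-name>"
    "<last-name>Masters</last-name>"
    "<department>Accounting</department>"
    "</employee>"
)
as_attributes = ET.fromstring(
    '<employee ssn="568-73-1924" first-name="Eric" '
    'last-name="Masters" department="Accounting" />'
)


def read_from_elements(employee):
    """Read each field from the text of a child element."""
    return {field: employee.findtext(field) for field in FIELDS}


def read_from_attributes(employee):
    """Read each field from an attribute of the employee element."""
    return {field: employee.get(field) for field in FIELDS}


element_data = read_from_elements(as_elements)
attribute_data = read_from_attributes(as_attributes)
print(element_data == attribute_data)
```

Either form yields the same three values; the choice affects only how the data is laid out and retrieved.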

In making your decision, here are a few rules to keep in mind:

  • All attributes of a single element must have unique names.

  • Attributes cannot contain embedded tags. If something has a substructure of its own, it must be represented as an element.

  • Child elements are often more difficult to handle in code than are attributes. This is because accessing an element's content means using a property, whereas accessing an attribute can be done directly.
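The first two rules are enforced by the parser itself, as the sketch below suggests. It assumes Python's standard-library ElementTree parser as a stand-in for any conforming XML parser; accepts is a hypothetical helper, and the fragments are invented for this example.

```python
import xml.etree.ElementTree as ET


def accepts(text):
    """Return True if the parser accepts text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


results = {
    # Two attributes with the same name on one element are not allowed.
    "duplicate attribute": accepts('<employee ssn="1" ssn="2" />'),
    # A tag cannot be embedded in an attribute value ("<" is illegal there).
    "tag inside attribute": accepts(
        "<employee children=\"<child name='Sean' />\" />"
    ),
    # Substructure has to be expressed with child elements instead.
    "nested elements": accepts(
        '<employee><children><child name="Sean" /></children></employee>'
    ),
}
print(results)
```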

In Listing 14.1, the only reason we chose to use elements rather than attributes was to be consistent with earlier listings. That listing could easily have been written like this:

<companies>
  <company name="ABC MegaCorp, Inc." location="NY"
           comments="A very big company that does many different things.">
    <employee ssn="123-45-6789" first-name="Ed" last-name="Johnson"
              department="Human Resources">
      <children>
        <child name="Sean" />
        <child name="Polly" />
      </children>
    </employee>
    <employee ssn="541-29-8376" first-name="Maria" last-name="Smith"
              department="Accounting" />
  </company>
  <company name="Baker &amp; Associates" location="CA"
           comments="A midsized investment firm.">
    <employee ssn="568-73-1924" first-name="Eric" last-name="Masters"
              department="Accounting" />
    <employee ssn="714-52-6938" first-name="Tanya" last-name="Peeples"
              department="Public Relations">
      <children>
        <child name="Sandra" />
      </children>
    </employee>
  </company>
</companies>

Using attributes or elements is not an all-or-nothing deal. It is up to you and your development team to find the proper mix of data in attributes versus data in elements.

Naming Conventions

There seem to be as many "standard" XML naming conventions as there are XML developers. Some people would name my first-name element from Listing 14.1 FirstName; others would name it first_name; yet others would name it fn. All of these are valid names, but first-name is still the simplest and easiest for most of us to understand.

There is no one standard naming convention. Rather, there is the naming convention that is most comfortable for you and the other developers on your team. The conventions presented here are merely guidelines that have helped me cut down on confusion in the past.

  • Make all element and attribute names lowercase, because XML is case-sensitive. On most platforms, if you have a first-name element and you look for an element named First-Name, the application will not find the element you're looking for. Because different people have different capitalization rules, it's best to eliminate the issue entirely and use lowercase for all names.

  • Use hyphens to separate multiple words in an element or attribute name. Some teams use underscores, but I avoid them because it's often difficult to see an underscore when code is underlined or outlined in an IDE. Hyphens are always easy to see.

  • Don't abbreviate element names unless you absolutely must. XML is known for being a very verbose format. Because of this, developers often abbreviate the names of elements, using fn, for instance, instead of the more verbose first-name. However, fn does not describe what the element does, whereas first-name describes it perfectly. Remember that XML was invented to make data understandable by both machines and humans.
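The capitalization pitfall from the first guideline is easy to reproduce. The sketch below assumes Python's standard-library ElementTree module, where element lookups are case-sensitive; other platforms may behave differently, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

employee = ET.fromstring(
    "<employee><first-name>Ed</first-name><last-name>Johnson</last-name></employee>"
)

# The lookup succeeds only when the capitalization matches exactly.
exact_match = employee.findtext("first-name")
wrong_case = employee.find("First-Name")   # no such element, so this is None

print(exact_match)
print(wrong_case)
```

With all names in lowercase, there is only one spelling to remember, so this kind of silent miss is much less likely.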



Advanced Macromedia ColdFusion MX 7 Application Development
ISBN: 0321292693
EAN: 2147483647
Year: 2006
Pages: 240
Authors: Ben Forta, et al
