Listing 14.1 is an XML document that describes multiple companies and their employees.

Listing 14.1. CompaniesAndEmployees.xml: An XML Document

<?xml version="1.0" ?>
<companies>
  <company name="ABC MegaCorp, Inc." location="NY">
    <comments>A very big company that does many different things.</comments>
    <employee ssn="123-45-6789">
      <first-name>Ed</first-name>
      <last-name>Johnson</last-name>
      <department>Human Resources</department>
      <children>
        <child name="Sean" />
        <child name="Polly" />
      </children>
    </employee>
    <employee ssn="541-29-8376">
      <first-name>Maria</first-name>
      <last-name>Smith</last-name>
      <department>Accounting</department>
    </employee>
  </company>
  <company name="Baker &amp; Associates" location="CA">
    <comments>A midsized investment firm.</comments>
    <employee ssn="568-73-1924">
      <first-name>Eric</first-name>
      <last-name>Masters</last-name>
      <department>Accounting</department>
    </employee>
    <employee ssn="714-52-6938">
      <first-name>Tanya</first-name>
      <last-name>Peeples</last-name>
      <department>Public Relations</department>
      <children>
        <child name="Sandra" />
      </children>
    </employee>
  </company>
</companies>

Let's break down the composition of this document. The first line is called an XML declaration. It tells an XML parser that the content that follows is XML and that it conforms to version 1.0 of the XML specification. Though this declaration is not required in all XML documents, it's a good habit to always put it at the top of the document. The second line is the opening tag for this document's root element. Every XML document must have one and only one element at the root of its hierarchy, and in this case the root element is companies. Between the opening and closing tags of the companies element, there are company elements, one for each of the companies in this collection. Each company element contains a comments element that describes the company, and one employee element for each of the company's employees.
The employee element contains first-name, last-name, and department elements, as well as an optional children element that describes the employee's children.

The structure of an XML document is more formal than that of an HTML document. XML is case-sensitive (so <a> and <A> are two different tags), whereas HTML is not (so <a> and <A> are equivalent). XML also requires that all attribute values be enclosed in quotes (either single or double). In HTML, we could write location=CA, but XML requires that we write location="CA" instead.

Probably the biggest problem people have with XML concerns the requirement for closing tags. In an XML document, every tag must be closed. Even if the tag has no content (such as the child elements in Listing 14.1), you must close the tag; if you don't, the XML parser that processes the document will report an error. In the case of the <child> tags in Listing 14.1, we used a special shorthand closing syntax: <child name="Sandra" /> is equivalent to <child name="Sandra"></child>.

One last note: the following characters have special meaning in XML markup. When they appear in attribute values or element content, escape them using their equivalent entity references:

Character    Entity reference
<            &lt;
>            &gt;
&            &amp;
"            &quot;
'            &apos;
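The rules in this section can be checked against a real parser. The following Python sketch is an illustration under stated assumptions: it relies on the standard library's xml.etree.ElementTree and xml.sax.saxutils modules, and the helper name is_well_formed and the sample strings are invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def is_well_formed(text):
    """Return True if the parser accepts text, False if it rejects it.

    Hypothetical helper written for this illustration.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Tag names are case-sensitive, so </A> does not close <a>.
mismatched_case = "<a>text</A>"

# Every tag must be closed; this <child> tag never is.
unclosed = '<children><child name="Sandra"></children>'

# Attribute values must be quoted; single or double quotes both work.
unquoted = "<company location=CA />"
single_quoted = "<company location='CA' />"

# A raw ampersand is rejected; the escaped form is accepted.
raw_ampersand = '<company name="Baker & Associates" />'
escaped_ampersand = '<company name="Baker &amp; Associates" />'

for sample in (mismatched_case, unclosed, unquoted,
               single_quoted, raw_ampersand, escaped_ampersand):
    print(is_well_formed(sample), sample)

# The shorthand and the explicit closing tag yield the same empty element.
short_form = ET.fromstring('<child name="Sandra" />')
long_form = ET.fromstring('<child name="Sandra"></child>')
same_element = (
    (short_form.tag, short_form.attrib, short_form.text, len(short_form))
    == (long_form.tag, long_form.attrib, long_form.text, len(long_form))
)
print(same_element)  # True

# escape() handles element content; quoteattr() also adds the quotes.
print(escape("Baker & Associates"))     # Baker &amp; Associates
print(quoteattr("Baker & Associates"))  # "Baker &amp; Associates"
```

Only the single-quoted and escaped samples are accepted by the parser. When generating XML from program data, calling escape or quoteattr (or building the document through an XML library) is generally safer than concatenating strings by hand.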
Entity references will be discussed in more detail later in the chapter.

Following the example and guidelines outlined in this section ensures that an XML document is well formed, meaning that it follows the standard rules of XML document structure. Any XML parser should be able to read a well-formed document.

Elements and Their Attributes

There is an ongoing discussion in the XML world about when to use elements to store data and when to use attributes. For instance, this portion of Listing 14.1:

<employee ssn="568-73-1924">
  <first-name>Eric</first-name>
  <last-name>Masters</last-name>
  <department>Accounting</department>
</employee>

could also have been represented like this:

<employee ssn="568-73-1924" first-name="Eric" last-name="Masters" department="Accounting" />

Some people like using elements; others prefer attributes. Different groups claim that one method is inherently superior to the other, but this is not really true. The decision as to whether to use an attribute or a child element in any given circumstance should be made according to the developer's judgment of what will work for that situation. In making your decision, here are a few rules to keep in mind:
In Listing 14.1, the only reason we chose to use elements rather than attributes was to be consistent with earlier listings. That listing could easily have been written like this:
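The move from child elements to attributes can also be done mechanically. The following Python sketch is an illustration, not the chapter's own listing: it uses the standard library's xml.etree.ElementTree, and the function name children_to_attributes and the rule "fold only text-only children into attributes" are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The element-based employee fragment from Listing 14.1.
ELEMENT_FORM = """
<employee ssn="568-73-1924">
  <first-name>Eric</first-name>
  <last-name>Masters</last-name>
  <department>Accounting</department>
</employee>
"""


def children_to_attributes(element):
    """Return a new element whose text-only children become attributes.

    Hypothetical helper: children that have their own attributes or
    nested elements (such as <children>) are kept as child elements.
    """
    converted = ET.Element(element.tag, dict(element.attrib))
    for child in element:
        if len(child) == 0 and not child.attrib:
            # Simple text-only child: fold it into an attribute.
            converted.set(child.tag, (child.text or "").strip())
        else:
            # Structured child: an attribute cannot hold it, so keep it.
            converted.append(child)
    return converted


employee = ET.fromstring(ELEMENT_FORM)
flat = children_to_attributes(employee)

# Serialize the attribute-based form for inspection.
print(ET.tostring(flat, encoding="unicode"))
```

The sketch also shows one practical limit of attributes: a value with internal structure, such as the list of child elements, has no natural attribute form, which is why the helper leaves such children in place.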
Using attributes or elements is not an all-or-nothing deal. It is up to you and your development team to find the proper mix of data in attributes versus data in elements.

Naming Conventions

There seem to be as many "standard" XML naming conventions as there are XML developers. Some people would name the first-name element from Listing 14.1 FirstName; others would name it first_name; yet others would name it fn. All of these are valid names, but first-name is still the simplest and easiest for most of us to understand. There is no one standard naming convention. Rather, there is the naming convention that is most comfortable for you and the other developers on your team. The conventions presented here are merely guidelines that have helped me cut down on confusion in the past.
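The claim that all of these spellings are valid XML names can be checked with a parser. The following Python sketch uses the standard library's xml.etree.ElementTree; the helper name parses_as_element_name is invented for this example, and the name 1st-name is an extra counterexample added here for contrast (XML names may not begin with a digit).

```python
import xml.etree.ElementTree as ET

# The four spellings mentioned in the text, plus one invalid name
# (an assumption added for contrast: names cannot start with a digit).
candidate_names = ["first-name", "FirstName", "first_name", "fn", "1st-name"]


def parses_as_element_name(name):
    """Return True if the parser accepts name as an element name."""
    try:
        ET.fromstring(f"<{name}>Eric</{name}>")
        return True
    except ET.ParseError:
        return False


results = {name: parses_as_element_name(name) for name in candidate_names}
for name, accepted in results.items():
    print(f"{name}: {accepted}")
```

The parser accepts every spelling from the text and rejects only the digit-initial name, so the choice among the valid ones is a matter of team convention rather than correctness.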