A Few Different Document Approaches | Using XML with Legacy Business Applications

I have long maintained that you have to understand the data before you can really understand the system. A similar truth holds for schemas; the best way to understand a particular schema is to first understand an instance document that conforms to the schema. A few major variations in how instance documents look have predictable impacts on the schemas.

Naming conventions : You are going to see a wide variety of styles, lengths, and abbreviations. You'll also see some variety in capitalization and word separators, though upper camel case (e.g., UpperCamelCase) seems to be the preferred style. Just hope that whoever wrote the schema you are using picked one style and used it consistently. You'll probably feel like shopping for rope if they didn't.
Elements and Attributes : Many schema developers use Elements exclusively to convey business data. This is especially true of the early adopters, who were somewhat constrained by the limits of DTDs and the first parsers. Others use a mixture of Elements and Attributes. Most who do the latter have a semantic basis for deciding which should be which, but unfortunately not all do. You will also see a style that uses Attributes strictly to convey metadata (data about data) or control information, rather than data produced or consumed by a business application. For a while Microsoft promoted a style that used Attributes exclusively. This saved a few bytes, since empty Elements don't need end tags. This style seems to have passed.
Specific names versus qualifier/value pairs : Phone numbers offer a good example of this contrast. The specific name approach uses a construct like the following.
```
 <BusinessPhoneNumber>972-555-1212</BusinessPhoneNumber> 
```
The qualifier/value pair approach shown below uses a generic Element name with a qualifier to convey more specific meaning. In this example the qualifier is an Attribute, but both the qualifier and value could just as easily be sibling Elements under the same parent.
```
 <PhoneNumber phoneNumberType="BIZ">972-555-1212</PhoneNumber> 
```
Again, in this chapter I'm trying not to comment too much on the merits. Just be aware that you're probably going to run into both styles.
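If you do run into both styles, the application code that consumes them usually has to reduce them to a common form. The following is a minimal sketch of that idea in Python, using the standard library's `xml.etree.ElementTree`. The `read_phone` function and the `SPECIFIC_NAMES` mapping are hypothetical names introduced only for illustration; the Element names, the `phoneNumberType` Attribute, and the `BIZ` code come from the two examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from specific Element names to qualifier codes.
# Only the one pairing shown in the examples above is assumed here.
SPECIFIC_NAMES = {"BusinessPhoneNumber": "BIZ"}


def read_phone(xml_text):
    """Return a (qualifier, number) tuple for either phone number style."""
    elem = ET.fromstring(xml_text)
    number = (elem.text or "").strip()
    if elem.tag in SPECIFIC_NAMES:
        # Specific name style: the Element name itself carries the meaning.
        return SPECIFIC_NAMES[elem.tag], number
    if elem.tag == "PhoneNumber":
        # Qualifier/value style: the Attribute carries the meaning.
        return elem.get("phoneNumberType"), number
    raise ValueError("unrecognized phone Element: " + elem.tag)


specific = "<BusinessPhoneNumber>972-555-1212</BusinessPhoneNumber>"
qualified = '<PhoneNumber phoneNumberType="BIZ">972-555-1212</PhoneNumber>'

print(read_phone(specific))
print(read_phone(qualified))
```

Both calls yield the same tuple, which is the point: the two styles carry the same information and differ only in where the qualifier lives. Note that the specific name style needs a new entry in the mapping for every new kind of phone number, while the qualifier/value style needs only a new code value.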
Structure : This topic is a bit problematic. XML lends itself very naturally to expressing the relationships between different data items by putting them under the same parent in a hierarchy. However, in some cases this leads to duplicated or repeated data. A good example of this is a purchase order with "ship to" locations at the line item level, with several lines going to the same location. There are proposals in some organizations to use a more relational model. In this example the "ship to" locations might be listed in a header area, each with an ID number. Then, each line item would reference the "ship to" ID rather than duplicating the complete address. I have yet to see any widely used schemas that take this type of approach, but don't be surprised if you see it.
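To make the relational variant concrete, here is a rough sketch of what such a document and the code that reads it might look like. Every Element name, Attribute name, and data value in it (`PurchaseOrder`, `Header`, `ShipTo`, `LineItem`, `shipToID`, and so on) is hypothetical and not drawn from any particular schema, and the address is reduced to a single `City` Element to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order: "ship to" locations are listed once in the
# header, and each line item refers to one of them by ID.
ORDER = """
<PurchaseOrder>
  <Header>
    <ShipTo id="1"><City>Dallas</City></ShipTo>
    <ShipTo id="2"><City>Austin</City></ShipTo>
  </Header>
  <LineItem shipToID="1"><ItemID>A100</ItemID></LineItem>
  <LineItem shipToID="2"><ItemID>B200</ItemID></LineItem>
  <LineItem shipToID="1"><ItemID>C300</ItemID></LineItem>
</PurchaseOrder>
"""


def resolve_ship_to(order_xml):
    """Pair each line item's ItemID with the city its shipToID refers to."""
    root = ET.fromstring(order_xml)
    # Build the lookup table from the header area.
    locations = {s.get("id"): s.findtext("City") for s in root.iter("ShipTo")}
    # Follow each line item's reference instead of reading a nested address.
    return [
        (item.findtext("ItemID"), locations[item.get("shipToID")])
        for item in root.iter("LineItem")
    ]


for item_id, city in resolve_ship_to(ORDER):
    print(item_id, city)
```

The trade-off shows up in the code: the document no longer repeats the first location for the two line items that share it, but the consumer has to build a lookup table and resolve each reference rather than simply reading the address from under the line item.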