Schema Examples


As we discussed in Chapter 6, schemas are optional in our approach. We can successfully run the utilities without them. However, if we want to be sure that certain business constraints are enforced, such as making specific fields mandatory, schemas are required. Schema validation is the mechanism we have chosen to enforce those constraints. In Chapter 6 I talked about an approach for designing these schemas. The basic recommendation is to take an instance document of the desired type, such as an invoice or purchase order, load it into a tool like XMLSPY, and have the tool create the schema. Here we'll review some schemas for our sample documents, the invoice and purchase order.

In this book I'm trying, for the most part, to avoid discussing specific products and how to use them. I'm not going to go through detailed instructions of how to use XMLSPY or TurboXML to generate a schema. Both products have good online help; use it. However, in this section I do show you the schemas for the invoice and purchase order as I modified them after using XMLSPY to generate them. Then I briefly discuss the kinds of corrections and cleanup I had to perform. First, here are the corrected schemas.

Invoice Schema (CSVInvoice.xsd)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="unqualified">
  <xs:element name="Invoice">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="InvoiceLine" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="CustomerNumber" type="xs:token"/>
              <xs:element name="InvoiceNumber" type="xs:token"/>
              <xs:element name="InvoiceDate" type="xs:date"/>
              <xs:element name="PONumber" type="xs:token"/>
              <xs:element name="DueDate" type="xs:date"/>
              <xs:element name="ShipToName" type="xs:token"/>
              <xs:element name="ShipToStreet1" type="xs:token"/>
              <xs:element name="ShipToStreet2" type="xs:token"
                  minOccurs="0"/>
              <xs:element name="ShipToCity" type="xs:token"/>
              <xs:element name="ShipToStateOrProvince"
                  type="xs:token"/>
              <xs:element name="ShipToPostalCode"
                  type="xs:token"/>
              <xs:element name="ShipToCountry" type="xs:token"
                  minOccurs="0"/>
              <xs:element name="ItemID" type="xs:token"/>
              <xs:element name="ItemQuantity"
                  type="xs:positiveInteger"/>
              <xs:element name="UnitPrice" type="xs:decimal"/>
              <xs:element name="ItemDescription"
                  type="xs:token"/>
              <xs:element name="ExtendedPrice"
                  type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Purchase Order Schema (CSVPurchaseOrder.xsd)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="unqualified">
  <xs:element name="PurchaseOrder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="POLine" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="CustomerNumber" type="xs:token"/>
              <xs:element name="PONumber" type="xs:token"/>
              <xs:element name="PODate" type="xs:date"/>
              <xs:element name="RequestedDeliveryDate"
                  type="xs:date"/>
              <xs:element name="ShipToName" type="xs:token"/>
              <xs:element name="ShipToStreet1" type="xs:token"/>
              <xs:element name="ShipToStreet2" type="xs:token"
                  minOccurs="0"/>
              <xs:element name="ShipToCity" type="xs:token"/>
              <xs:element name="ShipToStateOrProvince"
                  type="xs:token"/>
              <xs:element name="ShipToPostalCode"
                  type="xs:token"/>
              <xs:element name="ShipToCountry" type="xs:token"
                  minOccurs="0"/>
              <xs:element name="ItemID">
                <xs:simpleType>
                  <xs:restriction base="xs:token">
                    <xs:enumeration value="HCDUC"/>
                    <xs:enumeration value="HCMIN"/>
                    <xs:enumeration value="HCMOC"/>
                    <xs:enumeration value="HCVAN"/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:element>
              <xs:element name="OrderedQty"
                  type="xs:positiveInteger"/>
              <xs:element name="UnitPrice" type="xs:decimal"
                  minOccurs="0"/>
              <xs:element name="ItemDescription" type="xs:token"
                  minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
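
To make the structure concrete, here is a minimal sketch of an instance document that the purchase order schema above should accept. Every value is invented for illustration except the ItemID, which is one of the enumerated values from the schema. The two optional ship-to elements and the optional UnitPrice and ItemDescription are left out to show that a line is still valid without them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample data; only the element names, their order,
     and the ItemID value come from CSVPurchaseOrder.xsd -->
<PurchaseOrder>
  <POLine>
    <CustomerNumber>C1001</CustomerNumber>
    <PONumber>PO-0001</PONumber>
    <!-- xs:date requires the ISO form YYYY-MM-DD -->
    <PODate>2003-03-01</PODate>
    <RequestedDeliveryDate>2003-03-15</RequestedDeliveryDate>
    <ShipToName>Example Coffee Shop</ShipToName>
    <ShipToStreet1>100 Main Street</ShipToStreet1>
    <!-- ShipToStreet2 omitted (minOccurs="0") -->
    <ShipToCity>Dallas</ShipToCity>
    <ShipToStateOrProvince>TX</ShipToStateOrProvince>
    <ShipToPostalCode>75001</ShipToPostalCode>
    <!-- ShipToCountry omitted (minOccurs="0") -->
    <ItemID>HCMIN</ItemID>
    <!-- xs:positiveInteger rejects 0 and negative quantities -->
    <OrderedQty>12</OrderedQty>
    <!-- UnitPrice and ItemDescription omitted (minOccurs="0") -->
  </POLine>
</PurchaseOrder>
```

Because the children are declared in an xs:sequence, they must appear in exactly this order. Changing the ItemID to a value outside the four enumerations, or the quantity to 0, should cause validation to fail.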

The exact schema generated and the corrections you must make depend on the tool you use and the particular options you select. Here are some of the changes I had to make to the invoice schema that XMLSPY 4.3 generated for me using my default options.

  • A value of "qualified" was assigned to the root Element's elementFormDefault Attribute. I changed it to "unqualified" since we don't want to enforce namespace prefixes.

  • InvoiceNumber was enumerated and assigned a type of int. I changed the type to token and removed the enumeration.

  • A data type of string was assigned to ShipToName, ShipToStreet1, ShipToStreet2, ShipToCity, ShipToStateOrProvince, ShipToPostalCode, ShipToCountry, and ItemDescription. I wanted the schema validation to enforce a business constraint that there must be at least one nonspace character in these fields. The schema language string data type is insufficient for this purpose. At least two alternatives will enforce this constraint. I discussed one at the end of Chapter 6 (restricting the string data type with a pattern facet). For this example I used a somewhat easier (and lazier) approach of just using the token data type. We can justify this for our fictional scenario by saying we know that our data for these columns never contains two consecutive spaces. If the data in your real-life situation doesn't conform to this contrivance, you'll need to use the pattern facet.

  • Enumerations were generated for CustomerNumber, PONumber, ShipToCity, and ShipToStateOrProvince, restricting the string data type that XMLSPY had assigned them. I removed the enumerations.

  • ItemID was assigned a data type of string, further restricted by the enumerated IDs. I changed the data type to token and removed the enumeration for ItemID. Since the source of the data is our invoicing system, we can assume we're using valid item IDs and we don't need our schema to validate them. We have a somewhat different situation for the purchase order schema. We can't have the same confidence that our customers will send us the correct IDs for the items they order. Let's assume that we've decided for our application architecture that we would like the schemas to enforce validation of item IDs. (Several considerations for schema validation are discussed in Chapter 12.) In the standards world there has been an ongoing debate about what constitutes a "code" that might generally be schema validated and what constitutes a "unique identifier" that might not generally be schema validated. From a strict semantic perspective, our item IDs (especially when expressed as UPC codes, as we'll see in Chapters 9 and 10) are unique identifiers rather than codes. However, we'll bypass these theoretical arguments with a more pragmatic consideration. Big Daddy's Gourmet Cocoa sells only about a dozen different items, and it is very easy and practical to enumerate their IDs in a schema. So, we will.

  • XMLSPY also created enumerations for ItemQuantity, UnitPrice, and ExtendedPrice, which I removed. ItemQuantity was assigned a data type of short, which I changed to positiveInteger.
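
The "at least one nonspace character" constraint discussed in the list above can be sketched with explicit facets. The fragment below is an illustration under stated assumptions, not the Chapter 6 listing: the type names NonBlankString and NonEmptyToken and the particular pattern are hypothetical. One caveat worth checking against your own validator: token collapses whitespace, but the empty string is still a valid token, so a blank field may pass a bare xs:token declaration unless a minLength or pattern facet is added.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="unqualified">

  <!-- Hypothetical type: a string that must contain at least one
       non-whitespace character. Whitespace is preserved as entered. -->
  <xs:simpleType name="NonBlankString">
    <xs:restriction base="xs:string">
      <xs:pattern value=".*\S.*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical type: a token that must be at least one character
       long after whitespace has been collapsed. -->
  <xs:simpleType name="NonEmptyToken">
    <xs:restriction base="xs:token">
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Example use on one of the ship-to columns -->
  <xs:element name="ShipToName" type="NonEmptyToken"/>

</xs:schema>
```

The pattern version leaves the data untouched and simply rejects blank values. The token version lets the schema processor normalize runs of spaces before the length check, which may or may not be what a downstream application expects.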

Other than these corrections, XMLSPY did a pretty good job. It set up the root and row Elements, Invoice and InvoiceLine, as anonymous types. This does not particularly foster reusability, but that's not really an issue for this application. It correctly determined that ShipToStreet2 and ShipToCountry are optional, assigning a minOccurs value of "0" to each, and that all the other columns are mandatory. It gave us the default minOccurs value of "1" on InvoiceLine and a maxOccurs value of "unbounded."
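
For comparison, if reuse did matter, the anonymous types could be pulled out and given names. The following is only a sketch of that alternative: the type names InvoiceType and InvoiceLineType are assumptions, and the column list is abbreviated to the first two elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="unqualified">

  <!-- The root Element now refers to a named type -->
  <xs:element name="Invoice" type="InvoiceType"/>

  <!-- Hypothetical named type for the root -->
  <xs:complexType name="InvoiceType">
    <xs:sequence>
      <!-- minOccurs defaults to 1, so at least one line is required -->
      <xs:element name="InvoiceLine" type="InvoiceLineType"
          maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical named type for the row; it could be referenced
       from other schemas or extended -->
  <xs:complexType name="InvoiceLineType">
    <xs:sequence>
      <xs:element name="CustomerNumber" type="xs:token"/>
      <xs:element name="InvoiceNumber" type="xs:token"/>
      <!-- remaining column Elements as in CSVInvoice.xsd -->
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```

Instance documents look the same under either style. The difference is only in how the schema itself can be referenced and maintained.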

So, the tools can do most, but not all, of the schema creation work for you. As simple as these schemas are, you might be able to code them by hand. However, I would not recommend that approach for more complex cases, particularly for those we'll see in the next two chapters.



Using XML with Legacy Business Applications
ISBN: 0321154940
Year: 2003
Pages: 181