XML Schemas


NOTE

As I write this (January 2001), the XML Schema syntax is still in the process of being approved. The W3C is currently considering input from several sources and will likely publish a standard later this year. It's likely that the final XML Schema syntax will differ semantically from XML Data-Reduced (XDR) schemas, the schema syntax currently supported by Microsoft's XML-enabled products (including SQL Server). Microsoft has announced that it will support whatever the final syntax is, so keep an eye out for changes in the technology.


I mentioned earlier that DTDs were somewhat old-fashioned. That's because there's a newer, better technology for validating XML documents: XML Schema. Unlike DTDs, XML Schema documents are themselves written in XML. They consist of elements and attributes, just like the XML documents they validate. They have a number of other advantages over DTDs, including the following:

  • DTDs cannot control the kind of information a given element or attribute can contain. Merely being able to specify that an element stores text is not precise enough for most business needs. We may want to specify the format that text should have, or whether the text is a date or a number. XML Schema has extensive support for data domain control.

  • DTDs feature only ten stock data types. XML Schema features more than 44 base data types, plus you can create your own.

  • All declarations in a DTD are global. This means that you can't define multiple elements with the same name, even if they exist in completely different contexts.

  • Because DTD syntax is not XML, it requires special handling. It cannot be processed by an XML parser. This adds complexity to documents with associated DTDs and potentially slows down their processing.
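To make the first and third points concrete, here is a minimal sketch of a schema that is not part of the recipe example. The Order, Customer, Product, Sku, and Shipped names are invented for illustration, and the namespace URI is the draft one used in this chapter, so a processor built for a later version of the specification may expect a different URI. The sketch declares two unrelated Name elements locally and constrains the format of a text value:

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
   <xsd:element name="Order">
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element name="Customer">
               <xsd:complexType>
                  <xsd:sequence>
                     <!-- Local declaration: this Name is free-form text -->
                     <xsd:element name="Name" type="xsd:string"/>
                  </xsd:sequence>
               </xsd:complexType>
            </xsd:element>
            <xsd:element name="Product">
               <xsd:complexType>
                  <xsd:sequence>
                     <!-- A different Name in a different context, limited to 40 characters -->
                     <xsd:element name="Name">
                        <xsd:simpleType>
                           <xsd:restriction base="xsd:string">
                              <xsd:maxLength value="40"/>
                           </xsd:restriction>
                        </xsd:simpleType>
                     </xsd:element>
                     <!-- Text that must be three capital letters, a hyphen, and four digits -->
                     <xsd:element name="Sku">
                        <xsd:simpleType>
                           <xsd:restriction base="xsd:string">
                              <xsd:pattern value="[A-Z]{3}-[0-9]{4}"/>
                           </xsd:restriction>
                        </xsd:simpleType>
                     </xsd:element>
                     <!-- A built-in type that accepts only calendar dates -->
                     <xsd:element name="Shipped" type="xsd:date"/>
                  </xsd:sequence>
               </xsd:complexType>
            </xsd:element>
         </xsd:sequence>
      </xsd:complexType>
   </xsd:element>
</xsd:schema>
```

A DTD could declare Name only once, and could say no more about Sku or Shipped than that they hold text.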

A complete discussion of XML Schema is outside the scope of this book, but we should still touch on a few of the high points. Let's have a look at a validation schema for the recipe.xml document we built earlier. Here's what it might look like (Listing 12-6):

Listing 12-6 An XML schema for our recipe document.
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" elementFormDefault="qualified">
   <xsd:element name="Recipe">
      <xsd:complexType>
         <xsd:sequence>
            <xsd:element name="Name" type="xsd:string"/>
            <xsd:element name="Description" type="xsd:string"/>
            <xsd:element name="Ingredients">
               <xsd:complexType>
                  <xsd:sequence>
                     <xsd:element name="Ingredient" maxOccurs="unbounded">
                        <xsd:complexType>
                           <xsd:sequence>
                              <xsd:element name="Qty">
                                 <xsd:complexType>
                                    <xsd:simpleContent>
                                       <xsd:restriction base="xsd:byte">
                                          <xsd:attribute name="unit" use="required">
                                             <xsd:simpleType>
                                                <xsd:restriction base="xsd:NMTOKEN">
                                                   <xsd:enumeration value="dash"/>
                                                   <xsd:enumeration value="each"/>
                                                   <xsd:enumeration value="dozen"/>
                                                   <xsd:enumeration value="cups"/>
                                                   <xsd:enumeration value="teasp"/>
                                                   <xsd:enumeration value="tbls"/>
                                                </xsd:restriction>
                                             </xsd:simpleType>
                                          </xsd:attribute>
                                       </xsd:restriction>
                                    </xsd:simpleContent>
                                 </xsd:complexType>
                              </xsd:element>
                              <xsd:element name="Item">
                                 <xsd:complexType>
                                    <xsd:simpleContent>
                                       <xsd:restriction base="xsd:string">
                                          <xsd:attribute name="optional" type="xsd:boolean"/>
                                       </xsd:restriction>
                                    </xsd:simpleContent>
                                 </xsd:complexType>
                              </xsd:element>
                           </xsd:sequence>
                        </xsd:complexType>
                     </xsd:element>
                  </xsd:sequence>
               </xsd:complexType>
            </xsd:element>
            <xsd:element name="Instructions">
               <xsd:complexType>
                  <xsd:sequence>
                     <xsd:element name="Step" type="xsd:string"/>
                  </xsd:sequence>
               </xsd:complexType>
            </xsd:element>
         </xsd:sequence>
      </xsd:complexType>
   </xsd:element>
</xsd:schema>

Look a little daunting? It's a bit longer than the DTD we looked at earlier, isn't it?!

It's not as bad as it might seem. Most of the document consists of opening and closing tags. The schema itself is not that complex.

The first thing you should notice is that each of the elements and attributes in the XML document is assigned a data type. When the document is validated with this schema, each piece of data in the document is checked to see whether it's valid for its assigned data type. If it isn't, the document fails the validation test.
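As a rough illustration of that type checking, the fragment below shows three Ingredient elements as they might appear in a recipe document validated against Listing 12-6. The ingredient names and quantities are made up; the comments describe how a schema-aware validator would be expected to treat each one:

```xml
<Ingredients>
   <!-- Passes: 2 is a valid xsd:byte and "true" is a valid xsd:boolean -->
   <Ingredient>
      <Qty unit="cups">2</Qty>
      <Item optional="true">Flour</Item>
   </Ingredient>
   <!-- Fails: 300 is outside the xsd:byte range of -128 to 127 -->
   <Ingredient>
      <Qty unit="each">300</Qty>
      <Item>Eggs</Item>
   </Ingredient>
   <!-- Fails: "maybe" is not an xsd:boolean value -->
   <Ingredient>
      <Qty unit="dash">1</Qty>
      <Item optional="maybe">Salt</Item>
   </Ingredient>
</Ingredients>
```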

Next, take a look at the maxOccurs attribute on the Ingredient element. Via a schema, you can specify a number of ancillary properties for elements, including how many (or how few) times an element can appear at a given point in a document. The default for both minOccurs and maxOccurs is 1, so an element that specifies neither must appear exactly once. You can make an element optional by setting its minOccurs attribute to 0.
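The combinations can be sketched as follows. This is a simplified variation on the recipe schema rather than a replacement for Listing 12-6; the Note element and the limit of ten steps are assumptions added only to show the attributes at work:

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
   <xsd:element name="Recipe">
      <xsd:complexType>
         <xsd:sequence>
            <!-- Neither attribute given: exactly one Name is required -->
            <xsd:element name="Name" type="xsd:string"/>
            <!-- minOccurs="0": Description may be left out -->
            <xsd:element name="Description" type="xsd:string" minOccurs="0"/>
            <!-- minOccurs defaults to 1: between one and ten Step elements -->
            <xsd:element name="Step" type="xsd:string" maxOccurs="10"/>
            <!-- Zero or more Note elements -->
            <xsd:element name="Note" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
         </xsd:sequence>
      </xsd:complexType>
   </xsd:element>
</xsd:schema>
```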

Next, notice the xsd:enumeration elements under the unit attribute. In an XML schema, you can specify a list of valid values for an element or attribute. If an element or attribute attempts to store a value not in the list, the document fails validation.
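For instance, given the six values enumerated for the unit attribute in Listing 12-6, a validator would be expected to accept the first Qty element below and reject the second. The ingredients themselves are invented for the example:

```xml
<Ingredients>
   <Ingredient>
      <!-- Passes: "tbls" is one of the enumerated values -->
      <Qty unit="tbls">3</Qty>
      <Item>Butter</Item>
   </Ingredient>
   <Ingredient>
      <!-- Fails: "pinch" is not in the enumeration -->
      <Qty unit="pinch">1</Qty>
      <Item>Nutmeg</Item>
   </Ingredient>
</Ingredients>
```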

Last, notice the new data type for the Item element's optional attribute. I've changed it from an integer to a Boolean value, one of the stock data types supported by XML Schema. I point this out because I want you to realize how rich a set of data types XML Schema offers. Not only that, but you can create new types by extending the existing ones. Furthermore, you can create complex types: elements that contain other elements and attributes. In the schema listed earlier, the Qty data type is a complex data type, as are the Ingredients and Instructions types. Any schema element that contains other elements or attributes is, by definition, a complex data type.
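Here is a small sketch of both ideas: a new simple type derived from a built-in one (by restriction, which is one of the available derivation mechanisms) and a named complex type that uses it. The ServingCount, YieldType, Yield, Servings, and Notes names and the 1 to 100 range are hypothetical and do not appear in the recipe schema:

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
   <!-- A new simple type: a whole number from 1 to 100 -->
   <xsd:simpleType name="ServingCount">
      <xsd:restriction base="xsd:integer">
         <xsd:minInclusive value="1"/>
         <xsd:maxInclusive value="100"/>
      </xsd:restriction>
   </xsd:simpleType>
   <!-- A new complex type: child elements plus an attribute -->
   <xsd:complexType name="YieldType">
      <xsd:sequence>
         <xsd:element name="Servings" type="ServingCount"/>
         <xsd:element name="Notes" type="xsd:string" minOccurs="0"/>
      </xsd:sequence>
      <xsd:attribute name="approximate" type="xsd:boolean"/>
   </xsd:complexType>
   <!-- An element that uses the named complex type -->
   <xsd:element name="Yield" type="YieldType"/>
</xsd:schema>
```

Naming a type this way, instead of nesting it inside the element declaration as Listing 12-6 does, lets several elements share one definition.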

You may be wondering how you associate a schema with an XML document. You do so by adding a couple of attributes to the document's root element. For example, the root element in our recipe.xml document now looks like this:

 <Recipe xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:noNamespaceSchemaLocation="C:\_data\ggssp\Ch12\code\recipe.xsd">

The first attribute makes the names in the xsi (XML Schema Instance) namespace available to the document. A namespace is a collection of names identified by a URI reference. You can define your own, or you can do as we've done here and refer to a namespace defined on the W3C Web site. As in many programming disciplines, an XML namespace provides name scoping to an application so that names from different sources do not collide with one another. Unlike traditional namespaces, the names within an XML namespace do not have to be unique. Without veering off into why this is, for now, just understand that a namespace gives scope to the names you use in XML. In this particular case, it provides access to the names in the xsi namespace, which is where the XML Schema Instance attributes reside. By referring to the namespace in this way, we can use those attributes in the document by prefixing them with xsi:.
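To see name scoping at work, consider the sketch below, which mixes two vocabularies in one document. The urn:example:recipes and urn:example:chefs URIs, the r and c prefixes, and the Chef element are all invented for this illustration:

```xml
<?xml version="1.0" ?>
<r:Recipe xmlns:r="urn:example:recipes" xmlns:c="urn:example:chefs">
   <!-- Name from the recipes vocabulary -->
   <r:Name>Pancakes</r:Name>
   <c:Chef>
      <!-- A different Name, scoped to the chefs vocabulary -->
      <c:Name>Pat</c:Name>
   </c:Chef>
</r:Recipe>
```

Both elements have the local name Name, but because each prefix maps to a different URI, a namespace-aware parser treats them as two distinct names.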

The second attribute describes the location of the XML Schema document. This is the document listed earlier. It contains the schema info for our document.

Once these attributes are in place, "XML Schema-aware" tools will validate the document using the schema identified by the attribute.
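Putting the pieces together, a complete document that should satisfy Listing 12-6 might look like the sketch below. The recipe content is made up, and the relative recipe.xsd location assumes the schema file sits in the same folder as the document:

```xml
<?xml version="1.0" ?>
<Recipe xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="recipe.xsd">
   <Name>Pancakes</Name>
   <Description>Plain breakfast pancakes</Description>
   <Ingredients>
      <Ingredient>
         <Qty unit="cups">2</Qty>
         <Item>Flour</Item>
      </Ingredient>
      <Ingredient>
         <Qty unit="dash">1</Qty>
         <Item optional="true">Cinnamon</Item>
      </Ingredient>
   </Ingredients>
   <Instructions>
      <!-- Listing 12-6 leaves maxOccurs at its default of 1, so only one Step is allowed -->
      <Step>Mix the ingredients and cook on a hot griddle.</Step>
   </Instructions>
</Recipe>
```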



The Guru[ap]s Guide to SQL Server[tm] Stored Procedures, XML, and HTML
ISBN: 201700468
EAN: N/A
Year: 2005
Pages: 223
