Creating a Schema | XML and ASP.NET

When creating a schema, you need to decide which namespace to use to reference the document. As you saw before, namespaces are not necessarily URLs to documents on the web, although they can be. Web URLs are typically used because they are unique across the Internet, but you can use a URN or any other unique identifier. In fact, the only requirements are that the namespace used in the instance document matches the target namespace of the XSD document, and that names follow the recommendation for naming elements and attributes. URLs are used in this example for clarity.

XML Schemas contain a top-level element named schema and use the namespace http://www.w3.org/2001/XMLSchema (hereafter referred to as the xsd namespace or the XML Schema namespace) so that processors know how to process XSD schemas. A schema element contains type declarations using simple and complex types, as well as element and attribute declarations, as its children. Begin by creating the top-level schema element and declaring the namespaces that are used in the document. The convention for working with XML Schemas is to use the prefix xsd when referring to the XML Schema elements.

Should I Use the Default Namespace or a Namespace Prefix?

You can use the XML Schema namespace as the default namespace so that the xsd prefix is not required for all elements and attributes, requiring less typing. I use the xsd prefix on all elements and attributes because this is consistent with most available documentation and tools.
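As a minimal sketch of the alternative, the schema below makes the XML Schema namespace the default, so the schema elements and built-in types need no prefix. The po prefix and the NOTE, SHORTNOTE, and shortNoteType names are assumptions made up for this illustration; they do not come from the listings in this chapter.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- The XML Schema namespace is the default, so schema, element, and so on are unprefixed -->
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:po="http://www.xmlandasp.net"
        targetNamespace="http://www.xmlandasp.net">
  <!-- Built-in types are unprefixed too, because they live in the default namespace -->
  <element name="NOTE" type="string"/>
  <simpleType name="shortNoteType">
    <restriction base="string">
      <maxLength value="80"/>
    </restriction>
  </simpleType>
  <!-- A user-defined type now needs the target-namespace prefix when it is referenced -->
  <element name="SHORTNOTE" type="po:shortNoteType"/>
</schema>
```

The trade-off is visible in the last declaration: the typing saved on schema elements is paid back as a prefix on every reference to your own types.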

Listing 2.2 shows a sample schema for the XML document in Listing 2.1. Notice that a default namespace is used for this document by declaring the namespace http://www.xmlandasp.net with no namespace prefix.

Listing 2.2 A Sample Purchase Order Schema

 <xsd:schema id="PURCHASEORDER"
      xmlns="http://www.xmlandasp.net"
      targetNamespace="http://www.xmlandasp.net/schema/Sales.xsd"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      attributeFormDefault="unqualified"
      elementFormDefault="qualified">
   <xsd:simpleType name="emailType">
     <xsd:restriction base="xsd:string">
       <xsd:pattern value="\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*"/>
     </xsd:restriction>
   </xsd:simpleType>
   <xsd:complexType name="customerType">
     <xsd:sequence>
       <xsd:element name="NAME" type="xsd:string" />
       <xsd:element name="PHONE" type="xsd:string" />
       <xsd:element name="EMAIL" type="emailType" />
     </xsd:sequence>
   </xsd:complexType>
   <xsd:element name="PURCHASEORDER">
     <xsd:complexType>
       <xsd:choice maxOccurs="unbounded">
         <xsd:element name="CUSTOMER" type="customerType"/>
         <xsd:element name="ORDER">
           <xsd:complexType>
             <xsd:sequence>
               <xsd:element name="ITEM" minOccurs="1" maxOccurs="unbounded">
                 <xsd:complexType>
                   <xsd:sequence>
                     <xsd:element name="ITEMNAME" type="xsd:string" />
                     <xsd:element name="DESCRIPTION" type="xsd:string" />
                     <xsd:element name="SIZE" type="xsd:integer" />
                     <xsd:element name="PRICE" type="xsd:string" />
                   </xsd:sequence>
                 </xsd:complexType>
               </xsd:element>
             </xsd:sequence>
           </xsd:complexType>
         </xsd:element>
       </xsd:choice>
     </xsd:complexType>
   </xsd:element>
 </xsd:schema>

The attributes of the xsd:schema element in Listing 2.2 show the various namespaces in use. The default namespace used in this XML document is http://www.xmlandasp.net. You can also see the XML Schema namespace declared and aliased with the xsd prefix. The attributeFormDefault and elementFormDefault attributes signify whether locally declared attributes and elements must be namespace-qualified in instance documents. (This is discussed in more detail in the section, "Creating Attributes.") The targetNamespace attribute signifies to the processor the namespace that is used when referencing this schema.

After the namespace declarations, the schema declares the elements that are valid, the order in which they must appear, and the data types of the values they contain.

Creating Elements

Creating an element in an XML Schema document is fairly straightforward. You use the xsd:element tag to declare an element. Suppose that you want to create a document with a single root node, REPORTS, that contains character data. The following schema declares it:

 <xsd:schema id="REPORTSCHEMA"
      xmlns="http://www.xmlandasp.net"
      targetNamespace="http://www.xmlandasp.net/schema/sales.xsd"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <xsd:element name="REPORTS" type="xsd:string"/>
 </xsd:schema>

Elements can be declared with a type attribute to signify the character data that they contain. The PURCHASEORDER element in Listing 2.2 is not defined with a type attribute because it does not hold character data; it is simply the top-level element of the document that contains other elements. You can also define elements within hierarchies or define complex types of your own that are element hierarchies, as discussed in the section, "Complex Types."

Creating Attributes

Creating attributes is just as simple as creating elements. An attribute cannot exist on its own; it must be contained within an element. In XML Schemas, you use the xsd:attribute tag to signify the presence of an attribute. However, the xsd:attribute tag cannot be a direct child of the xsd:element tag; it must be a child of a complex type tag (for more information on complex types, see the section, "Complex Types"). Therefore, you need to use the xsd:attribute tag as a child of an xsd:complexType tag to create an attribute. Add a LOCATION attribute to your REPORTSCHEMA example:

 <xsd:schema id="REPORTSCHEMA"
      xmlns="http://www.xmlandasp.net"
      targetNamespace="http://www.xmlandasp.net/schema/sales.xsd"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <!-- No type attribute here: a declaration cannot carry both a type attribute
        and an inline complexType -->
   <xsd:element name="REPORTS">
     <xsd:complexType>
       <xsd:attribute name="LOCATION" type="xsd:string"/>
     </xsd:complexType>
   </xsd:element>
 </xsd:schema>

In the PURCHASEORDER schema, you specified attributeFormDefault="unqualified", but left it out in this example. If the attribute is left out, its default value is unqualified. This means that you do not have to explicitly use the namespace prefix when referencing an attribute. If this had been specified as qualified, the following instance document would be schema-valid:

 <xmlandasp:REPORTS xmlns:xmlandasp="http://www.xmlandasp.net"  xmlandasp:LOCATION="blah"/>

Notice that the REPORTS node is prefixed with the namespace prefix xmlandasp , and the namespace prefix is associated with the proper namespace definition using the xmlns attribute.

With attributeFormDefault set to qualified, the following would be schema-invalid. The default namespace declaration qualifies the REPORTS element, but default namespaces never apply to attributes, so the LOCATION attribute is left unqualified:

 <REPORTS xmlns="http://www.xmlandasp.net" LOCATION="blah"/>

Each local element and attribute declaration supports the form attribute, which overrides the schema-wide default for that one item:

 <xsd:attribute name="LOCATION" type="xsd:string" form="qualified"/>

To avoid confusion with namespaces and qualified local names, the unqualified form of attributes is used in the remaining examples.

Elements and attributes are the easiest part of schemas. The more challenging and artistic aspect of schemas is the ability to define your own types and structures for reuse throughout your document.

Declaring Types

One of the most important aspects of XML Schemas is the ability to define and validate types within an XML document. In the PURCHASEORDER example, you specified that the PHONE element is of type xsd:string. More specifically, it declares that an element named PHONE conforms to the definition of string contained in the xsd namespace. The xsd namespace contains a set of predefined simple types that you can use to define your own types. Types need not be constrained to data types, however. A type can be defined as a set of elements and attributes, as well as other complex types.

Using the Built-In Types

As previously mentioned, the xsd namespace contains built-in data types. Included in this namespace are definitions for string, integer, date, double, and many other data types. The EMAIL element in the following example uses a built-in data type:

 <xsd:element name="EMAIL" type="xsd:string" />

You can easily see that the type is string . A host of data types are built into the XML Schema namespace. Figure 2.1 shows the data types and their hierarchy.

Figure 2.1. The type hierarchy for built-in types.

Note that anyType is at the root of this type hierarchy. All built-in types are simple types based (directly or indirectly) on anySimpleType. May 2, 2001, World Wide Web Consortium (Massachusetts Institute of Technology, Institut National de Recherche en Informatique et en Automatique, Keio University). All Rights Reserved. www.w3.org/Consortium/Legal/.
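To make the built-in types a little more concrete, here is a small sketch that declares one element for each of several of them. The element names are hypothetical and exist only for this illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Calendar date in the form YYYY-MM-DD, for example 2001-08-27 -->
  <xsd:element name="ORDERDATE" type="xsd:date"/>
  <!-- Whole number of any size -->
  <xsd:element name="QUANTITY" type="xsd:integer"/>
  <!-- Double-precision floating-point number -->
  <xsd:element name="WEIGHT" type="xsd:double"/>
  <!-- true, false, 1, or 0 -->
  <xsd:element name="RUSH" type="xsd:boolean"/>
</xsd:schema>
```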

Creating Simple Types

Schemas provide a mechanism not only to use the built-in types, but also to declare your own types. A simple type is exactly what it sounds like: It is the simplest form of a type, describing a single value that is not compound. The easiest way to think of a simple type is as a data type in a programming language. C++, C#, Visual Basic, Java, Pascal, and most other languages support the concept of an integer or a string. Simple types can be thought of in the same manner: They describe a single value. A good representation of a simple type might be a type declaration for a numeric zip code or a U.S. Social Security number stored as a string with special formatting. The EMAIL element in the following example uses a simple type that you define yourself:

 <xsd:element name="EMAIL" type="emailType" />

We declared the type as emailType, which was declared elsewhere in the schema document. Creating your own types in schemas is similar to declaring a type in a programming language: You define the new type based on other types and give it a name. With XML Schemas, you can also specify different content for the type, including a restriction, list, or union.

A content type of union provides a union between two or more simple types. A content type of list provides a way to give a list of space-separated values. A content type of restriction allows you to provide constraints on a simple type. I focus on using the content type of restriction to provide constraints for a simple type.
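Because the rest of the chapter concentrates on restriction, the following sketch shows roughly what the other two content types look like. The type and element names (sizeListType, sizeOrDateType, SIZES, SIZEORDATE) are assumptions invented for the example.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- list: a space-separated series of integers, such as "8 10 12" -->
  <xsd:simpleType name="sizeListType">
    <xsd:list itemType="xsd:integer"/>
  </xsd:simpleType>
  <!-- union: a value is valid if it matches any one of the member types -->
  <xsd:simpleType name="sizeOrDateType">
    <xsd:union memberTypes="xsd:integer xsd:date"/>
  </xsd:simpleType>
  <xsd:element name="SIZES" type="sizeListType"/>
  <xsd:element name="SIZEORDATE" type="sizeOrDateType"/>
</xsd:schema>
```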

You can have multiple constraints for a type, defined by the use of facets. The xsd:pattern facet is specified in the following code, which allows you to specify a regular expression to validate the email address:

 <xsd:simpleType name="emailType">
   <xsd:restriction base="xsd:string">
     <xsd:pattern value="\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*"/>
   </xsd:restriction>
 </xsd:simpleType>

What Are Regular Expressions?

Regular expressions (also called regex or regexp) are sets of symbols and elements used to match patterns in text. They are a powerful method of defining valid XML text element and attribute values.

Besides the data type and facets, you can also use attributes of the declaration to further describe an element or attribute. For example, you can specify the minOccurs and maxOccurs attributes to control how many times an element can occur within a sequence of elements. The following declaration shows that the EMAIL element is optional, but can appear more than once:

 <xsd:element name="EMAIL" type="xsd:string" minOccurs="0"  maxOccurs="unbounded"/>

You can use the default attribute to specify a default value for an attribute. If you were defining an attribute, you might use the use attribute to specify whether the attribute is required, optional, or prohibited .
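A short sketch of both attributes on attribute declarations follows. The Format and Notes attributes and the value html are hypothetical, chosen only to show the syntax.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="REPORT">
    <xsd:complexType>
      <!-- The instance document must supply Name -->
      <xsd:attribute name="Name" type="xsd:string" use="required"/>
      <!-- If Format is omitted, the processor treats it as "html" -->
      <xsd:attribute name="Format" type="xsd:string" default="html"/>
      <!-- optional is what you get when use is left out; it is spelled out here -->
      <xsd:attribute name="Notes" type="xsd:string" use="optional"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Note that a default value only makes sense on an attribute that may be omitted, so default is not combined with use="required".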

Attributes and Occurrence Constraints

By definition, an attribute can only occur once within a single element. Therefore, the minOccurs and maxOccurs attributes are not valid when declaring a type that will be used within an attribute node.

Furthering the example, you can specify the formatting rules more explicitly by adding validation rules, or facets, to the type definition. The following shows how a regular expression is used to validate the email address:

 <xsd:simpleType name="emailType">
   <xsd:restriction base="xsd:string">
     <xsd:pattern value="\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*"/>
   </xsd:restriction>
 </xsd:simpleType>

After the type is properly defined in the schema, you can declare an element or attribute to use the simple type:

 <xsd:element name="EMAIL" type="emailType" minOccurs="0"  maxOccurs="unbounded"/>

By breaking the definition of the simple type apart from the rest of the document, you can reuse the definition of an email address throughout your document.

Simple Type Facets

In the previous email example, a regular expression was used to validate what an email address should look like. To do this, a facet was declared to describe an aspect or characteristic of the data type. A facet is a validation rule that expresses the validation of a given type in more detail than the built-in simple types allow. There are two types of facets: fundamental and constraining. Constraining facets are the focus of this section. A constraining facet validates a value against the given set of rules. For example, suppose that you define a Quantity property and determine that the minimum quantity value allowed is 1:

 1 <= Quantity

To express this in XML Schema, use the minInclusive facet. You can now write the expression as an XML Schema simple type:

 <xsd:simpleType name="creditScoreType">
   <xsd:restriction base="xsd:integer">
     <xsd:minInclusive value="1"/>
   </xsd:restriction>
 </xsd:simpleType>

Similarly, the minExclusive facet denotes that an attribute or element of this type must be greater than the specified value. The maxInclusive and maxExclusive facets work in the same manner for the upper bound. Suppose that we want to express that the maximum value for the Quantity element is up to, but not including, 10:

 1 <= Quantity < 10

To express this in XML Schemas, you can combine several facets to form the expression, as follows:

 <xsd:simpleType name="creditScoreType">
   <xsd:restriction base="xsd:integer">
     <xsd:minInclusive value="1"/>
     <xsd:maxExclusive value="10"/>
   </xsd:restriction>
 </xsd:simpleType>

Not all facets are applicable to all types, however. For example, the minLength facet does not apply to a numeric type, and the minInclusive facet does not apply to a string. Figure 2.2 shows the different facets and how they apply.

Figure 2.2. Constraining facets and applicable data types.

Steven Holzner. Inside XML. New Riders Publishing, 2001 (Indianapolis, IN). ISBN 0-7357-1020-1.
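The point about facet applicability can be sketched with two small types: length facets on a string and range facets on a number. The type names itemNameType and percentType, and the limits chosen, are assumptions for illustration only.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Length facets apply to string-like types -->
  <xsd:simpleType name="itemNameType">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="40"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Range facets apply to ordered types such as numbers and dates -->
  <xsd:simpleType name="percentType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

Swapping the facets between the two types (minLength on the integer, minInclusive on the string) would make the schema itself invalid.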

So far, the focus has been on some basics, such as declaring types, creating types, and performing validation. You have been using elements and attributes all along, but they have not been explained. Refer to Chapter 3, "XML Presentation," for more information on elements and attributes. The following sections show you how to define elements and attributes within an XML Schema.

Complex Types

Suppose that you want to describe an address element that contains child elements describing each part of an address, such as street, city, state, and zip code. You can simply use the following code line:

 <xsd:element name="Address" type="xsd:string"/>

For some uses where the presence and format of an address is not at all critical, this can be an acceptable representation of an address. For other applications that need to differentiate between different components of an address, this would not be acceptable. Using the hypothetical purchase order XML instance document, you might want to require not only a full address, but also that the zip code and state fields within the address conform to a certain format. You can approach this by breaking out the different components of an address into different elements or attributes:

 <xsd:element name="Name" type="xsd:string" />
 <xsd:element name="Street" type="xsd:string" />
 <xsd:element name="City" type="xsd:string" />
 <xsd:element name="State" type="xsd:string" />
 <xsd:element name="Zip" type="xsd:string" />

This works for defining a single address for a single use within your document. However, what if the same definition of an address is used in multiple places throughout your document? You cannot reuse the definition if you have more than one address defined in your schema unless you redefine the concept of an address multiple times in your document. For times when a simple type will not suffice, XML Schemas provide the concept of complex types to group data into manageable sections or logical chunks . Simple types cannot contain elements or attributes; they only define a single data type. Complex types, however, can contain both elements and attributes, as well as convey order of precedence for element hierarchies.

Instead of using the xsd:string type for the Address element, you can create your own addressType data type by using the xsd:complexType element.

Begin by defining what a State element would look like. A State is a simple type, so you can create a simple type called stateType. Specify its base type as xsd:string. Also, you can use a facet to specify that the state can only contain two letters by specifying a regular expression in the pattern facet. Facets are covered in more detail in the section, "Simple Type Facets."

Here's a definition of what the simple type representing a state might look like:

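 <xsd:simpleType name="stateType">
   <xsd:restriction base="xsd:string">
     <!-- XML Schema patterns always match the whole value, so the ^ and $ anchors
          are dropped; inside a pattern facet they would be literal characters -->
     <xsd:pattern value="\w{2}" />
   </xsd:restriction>
 </xsd:simpleType>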

Similar to the State type, you also want to create a type to describe a zip code. To do this, you can use a simple integer data type. You might want to develop a regular expression to handle an optional four-digit extension for a zip code with a hyphen as a separator. For our purposes, base your new simple type on the built-in integer data type. Also, use a facet to specify that the value can only be a five-digit numerical value, as follows:

 <xsd:simpleType name="zipType">
   <xsd:restriction base="xsd:integer">
     <!-- No ^ or $ anchors: a pattern facet always matches the whole value -->
     <xsd:pattern value="\d{5}" />
   </xsd:restriction>
 </xsd:simpleType>

Declaring XML Schemas can be more of an art than a science. For example, you could have specified the base type for the preceding xsd:restriction element as xsd:string, because the regular expression already restricts the value to five numerical digits. Basing the type on a numeric type instead of a string, however, makes it clearer to the reader that the value holds a five-digit number. The section, "Simple Type Facets," looks at different types of facets.
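The optional four-digit extension mentioned earlier could be handled along the following lines. This is only a sketch: the name zipPlusFourType is hypothetical, and the type has to be based on xsd:string because a hyphen is not part of a valid integer.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="zipPlusFourType">
    <xsd:restriction base="xsd:string">
      <!-- Five digits, optionally followed by a hyphen and four more digits -->
      <xsd:pattern value="\d{5}(-\d{4})?"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

Under this pattern, 30301 and 30301-1234 would be accepted, while 3030 and 30301-12 would be rejected.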

You can finish your address example by defining what a complete address looks like. The State and Zip elements are now based on your new simple types:

 <xsd:complexType name="addressType">
   <xsd:sequence>
     <xsd:element name="Name" type="xsd:string" minOccurs="1" maxOccurs="1" />
     <xsd:element name="Street1" type="xsd:string" minOccurs="1" maxOccurs="1" />
     <xsd:element name="Street2" type="xsd:string" minOccurs="0" maxOccurs="1"/>
     <xsd:element name="City" type="xsd:string" minOccurs="1" maxOccurs="1" />
     <xsd:element name="State" type="stateType" minOccurs="1" maxOccurs="1" />
     <xsd:element name="Zip" type="zipType" minOccurs="1" maxOccurs="1" />
   </xsd:sequence>
 </xsd:complexType>

Notice the use of the minOccurs and maxOccurs attributes. These specify the minimum and maximum number of times an element can occur within a given sequence. (Sequences are discussed in the section, "Sequence Groups.") A value of 1 for both attributes means that the element is required and can appear only once. A value of 0 for minOccurs means that the element is optional. Because you declared this as a type, it can also be reused throughout your schema document:

 <xsd:element name="Billing" type="addressType"/>  <xsd:element name="Mailing" type="addressType"/>
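For orientation, an instance fragment that a Billing element of type addressType is meant to accept might look like the sketch below. The address values are invented, and namespace declarations are left out to keep the fragment short.

```xml
<?xml version="1.0" encoding="utf-8"?>
<Billing>
  <Name>Pat Example</Name>
  <Street1>123 Main Street</Street1>
  <!-- Street2 is omitted, which is allowed because its minOccurs is 0 -->
  <City>Atlanta</City>
  <State>GA</State>
  <Zip>30301</Zip>
</Billing>
```

Because addressType uses a sequence, moving City ahead of Street1 or leaving out Zip would make the fragment invalid.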

The xsd:element element cannot be directly contained as a child of the xsd:complexType element. Instead, you must declare a particle for the complex type. A particle is the content within a complex type, and it consists of one of several types of groups: sequence groups, choice groups, and all groups. Attribute groups, covered next, are declared alongside the particle rather than inside it.

Attribute Groups

Throughout your document, you might find that a set of attributes is common for different element types. You can define them in one place and reference them by using the xsd:attributeGroup element so that you don't have to retype them throughout your document. This helps minimize the size of the document, which decreases the time required to parse it. Let's stop working with purchase orders and reports, and move on to something more fun. Let's work with pizzas. Listing 2.3 shows an example of specifying an attribute group to reuse some complex attribute definitions.

Listing 2.3 A Schema Using an Attribute Group

 <xsd:schema id="PIZZASCHEMA"
      xmlns="http://www.xmlandasp.net/pizza"
      targetNamespace="http://www.xmlandasp.net/schema/pizza.xsd"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <xsd:attributeGroup name="pizzaType">
     <xsd:attribute name="size">
       <xsd:simpleType>
         <!-- Enumeration facets must sit inside an xsd:restriction -->
         <xsd:restriction base="xsd:string">
           <xsd:enumeration value="small"/>
           <xsd:enumeration value="medium"/>
           <xsd:enumeration value="large"/>
         </xsd:restriction>
       </xsd:simpleType>
     </xsd:attribute>
     <xsd:attribute name="method">
       <xsd:simpleType>
         <xsd:restriction base="xsd:string">
           <xsd:enumeration value="delivery"/>
           <xsd:enumeration value="take out"/>
         </xsd:restriction>
       </xsd:simpleType>
     </xsd:attribute>
   </xsd:attributeGroup>
   <xsd:element name="PIZZAVENDORS">
     <xsd:complexType>
       <xsd:choice>
         <xsd:element name="Dominos">
           <xsd:complexType>
             <xsd:attributeGroup ref="pizzaType"/>
           </xsd:complexType>
         </xsd:element>
         <xsd:element name="PapaJohns">
           <xsd:complexType>
             <xsd:attributeGroup ref="pizzaType"/>
           </xsd:complexType>
         </xsd:element>
         <xsd:element name="MellowMushroom">
           <xsd:complexType>
             <xsd:attributeGroup ref="pizzaType"/>
           </xsd:complexType>
         </xsd:element>
       </xsd:choice>
     </xsd:complexType>
   </xsd:element>
 </xsd:schema>

You can see that using attribute groups can significantly reduce the amount of code within a document. Without them, you would need to replace each line <xsd:attributeGroup ref="pizzaType"/> with the actual attributeGroup 's definition, which gives you a bloated document.
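As a rough illustration of what the attribute group buys you, the instance fragment below uses both attributes on one vendor element. Namespace declarations are left out to keep the sketch short; a real instance would have to place PIZZAVENDORS in the schema's target namespace.

```xml
<?xml version="1.0" encoding="utf-8"?>
<PIZZAVENDORS>
  <!-- The choice group in the schema admits one vendor element -->
  <!-- size and method both come from the shared pizzaType attribute group -->
  <Dominos size="large" method="delivery"/>
</PIZZAVENDORS>
```

A value outside the enumerations, such as size="huge", is what the schema is intended to reject for every vendor element at once.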

Sequence, Choice, and All Groups

The brief REPORTS example used earlier when explaining elements and attributes was basic. In fact, it's rare for an XML document to exist that only contains a single element and a single attribute. Instead, you create hierarchies of information. These hierarchies can be validated as well.

A big advantage to using XML Schemas is that it's easy to define cardinality and frequency. Suppose that you want to create an XML Schema to validate a REPORTS document that is similar to the code in Listing 2.4.

Listing 2.4 Sample XML Document Requiring Validation

 <REPORTS>
   <REPORT Name="Customer Orders By Date">
     <WEBREPORT StyleSheet="CustOrdersByDate.xsl" StoredProc="spCustOrdersDate"/>
     <PARAMETERS>
       <PARAMETER Name="@CustomerID" Description="Customer ID" DataType="int"/>
       <PARAMETER Name="@LowDate" Description="Beginning Date Range"/>
       <PARAMETER Name="@HighDate" Description="Ending Date Range"/>
     </PARAMETERS>
   </REPORT>
   <REPORT Name="Customer Orders">
     <CUSTOMREPORT View="vwCustOrders">
       <TABLES>
         <TABLE Name="Customer">
           <COLUMN>CustomerID</COLUMN>
           <COLUMN>CustomerName</COLUMN>
         </TABLE>
         <TABLE Name="Orders">
           <COLUMN>OrderID</COLUMN>
           <COLUMN>OrderDate</COLUMN>
         </TABLE>
       </TABLES>
     </CUSTOMREPORT>
     <PARAMETERS>
       <PARAMETER Name="@CustomerID" Description="Customer ID" DataType="int"/>
       <PARAMETER Name="@LowDate" Description="Beginning Date Range"/>
       <PARAMETER Name="@HighDate" Description="Ending Date Range"/>
     </PARAMETERS>
   </REPORT>
 </REPORTS>

You can easily imagine how you might use such a document to create a generic application to display reports. Two different types of reports exist: custom reports and web reports . A web report requires an associated stylesheet and is based on a stored procedure. A custom report provides a database view and details what tables and columns the user can choose to filter.

The new schema needs to validate much information that you normally might just assume. For example, the Name attribute value of each REPORT node is required for the consuming application to work correctly. Web reports require a stylesheet, whereas custom reports require a set of tables and columns that can be presented to the user. A parameter element is optional because a report might not require any parameters; however, if a parameter element is defined, it must contain Name and Description attributes. It is also implied that the DataType attribute of each PARAMETER element defaults to a string data type unless it's explicitly stated otherwise. (A schema can express this with the default attribute on the attribute declaration.)

To perform this type of validation, XML Schemas provide the complexType element to define hierarchical relationships between elements or attributes of an element. As previously mentioned, complexType elements have particles that define what the complexType should validate. These particles contain groups that define how to validate the particle of the complexType .

Sequence Groups

A sequence group enables developers to define a sequence of elements that appear as a child of another element. Elements must appear in the order that they are defined in the schema. In the report schema example, the root node, REPORTS , contains child nodes called REPORT . To validate this hierarchy, create a sequence group by using the xsd:sequence element:

 <xsd:schema id="REPORTSCHEMA"
      xmlns="http://www.xmlandasp.net"
      targetNamespace="http://www.xmlandasp.net/schema/reports.xsd"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <xsd:element name="REPORTS">
     <xsd:complexType>
       <xsd:sequence minOccurs="1">
         <xsd:element name="REPORT"/>
       </xsd:sequence>
     </xsd:complexType>
   </xsd:element>
 </xsd:schema>

The sequence group is probably the most widely applicable grouping structure because it ensures that the document conforms to the structure stated explicitly.
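The sample document in Listing 2.4 contains more than one REPORT node, which a bare element declaration inside a sequence does not allow (maxOccurs defaults to 1). One possible adjustment, offered as a sketch rather than as the chapter's own schema, is to add maxOccurs="unbounded" and declare the Name attribute as required. The target namespace is omitted here to keep the example self-contained.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="REPORTS">
    <xsd:complexType>
      <xsd:sequence>
        <!-- At least one REPORT (minOccurs defaults to 1), with no upper limit -->
        <xsd:element name="REPORT" maxOccurs="unbounded">
          <xsd:complexType>
            <!-- The consuming application relies on every report having a name -->
            <xsd:attribute name="Name" type="xsd:string" use="required"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```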

Attribute Order Cannot Be Controlled

Schemas can specify the presence and data type of an attribute but not the order of attributes. If the order in which attributes can occur is significant, consider revising your schema to use nested elements instead of attributes.

Choice Groups

The difference between a CUSTOMREPORT and WEBREPORT in the updated REPORTS example is that a custom report requires table information for the user to be able to filter on. You can build the query on the fly from a view. A web report, on the other hand, is a static report for which the user enters parameters and is drawn from a stored procedure in the database. A report can be one or the other; it cannot be both. A choice group allows a choice between one element and another. To validate that the child of a REPORT element is either a CUSTOMREPORT node or a WEBREPORT node, use a choice group, as follows:

 <xsd:schema id="REPORTSCHEMA"
      xmlns="http://www.xmlandasp.net"
      targetNamespace="http://www.xmlandasp.net/schema/reports.xsd"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <xsd:element name="REPORTS">
     <xsd:complexType>
       <xsd:sequence minOccurs="1">
         <xsd:element name="REPORT">
           <xsd:complexType>
             <xsd:choice>
               <xsd:element name="WEBREPORT">
                 <xsd:complexType>
                   <xsd:attribute name="StyleSheet" type="xsd:string"/>
                   <xsd:attribute name="StoredProc" type="xsd:string"/>
                 </xsd:complexType>
               </xsd:element>
               <xsd:element name="CUSTOMREPORT">
                 <xsd:complexType>
                   <xsd:attribute name="View" type="xsd:string"/>
                 </xsd:complexType>
               </xsd:element>
             </xsd:choice>
           </xsd:complexType>
         </xsd:element>
       </xsd:sequence>
     </xsd:complexType>
   </xsd:element>
 </xsd:schema>

All Groups

An all group lets its child elements appear in any order, each at most once; a child declared with minOccurs="0" can also be left out. You use this group to validate the PARAMETERS node, because it might or might not contain a child PARAMETER node:

 <xsd:schema id="REPORTSCHEMA"
      xmlns="http://www.xmlandasp.net"
      targetNamespace="http://www.xmlandasp.net/schema/reports.xsd"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <xsd:complexType name="parameterType">
     <xsd:all>
       <!-- minOccurs="0" makes the PARAMETER child optional -->
       <xsd:element name="PARAMETER" minOccurs="0">
         <xsd:complexType>
           <xsd:attribute name="Name" type="xsd:string"/>
           <xsd:attribute name="Description" type="xsd:string"/>
           <xsd:attribute name="DataType" type="xsd:string" default="varchar"/>
         </xsd:complexType>
       </xsd:element>
     </xsd:all>
   </xsd:complexType>
   <xsd:element name="REPORTS">
     <xsd:complexType>
       <xsd:sequence minOccurs="1">
         <xsd:element name="REPORT">
           <xsd:complexType>
             <!-- An element declaration must sit inside a group, so a sequence wraps PARAMETERS -->
             <xsd:sequence>
               <xsd:element name="PARAMETERS" type="parameterType" minOccurs="1" maxOccurs="1"/>
             </xsd:sequence>
           </xsd:complexType>
         </xsd:element>
       </xsd:sequence>
     </xsd:complexType>
   </xsd:element>
 </xsd:schema>

You can see that a complex type named parameterType was declared that uses the xsd:all group element. We also enforced that the PARAMETERS element must appear exactly once as a child of each REPORT element by using the minOccurs and maxOccurs attributes. A document validated against this schema would require the PARAMETERS element, but might or might not contain a PARAMETER element. Because an all group admits each child at most once, a PARAMETERS node with several PARAMETER children, as in Listing 2.4, would need a sequence group with maxOccurs="unbounded" on the PARAMETER element instead.

The following is schema-valid:

 <REPORTS>
   <REPORT>
     <PARAMETERS/>
   </REPORT>
 </REPORTS>

But the following is not schema-valid:

 <REPORTS/>
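An all group is most useful when several differently named children may appear in any order. The sketch below assumes a hypothetical pageSettingsType with TITLE, FOOTER, and ORIENTATION children; none of these names appear in the REPORTS example.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="pageSettingsType">
    <xsd:all>
      <!-- Required, but may come before or after the other children -->
      <xsd:element name="TITLE" type="xsd:string"/>
      <!-- Optional children: each may appear once or not at all -->
      <xsd:element name="FOOTER" type="xsd:string" minOccurs="0"/>
      <xsd:element name="ORIENTATION" type="xsd:string" minOccurs="0"/>
    </xsd:all>
  </xsd:complexType>
  <xsd:element name="PAGESETTINGS" type="pageSettingsType"/>
</xsd:schema>
```

With a sequence group, the same three children would have to appear in exactly the declared order.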

Declaring Mixed Content

Two terms you might have heard when working with XML are data-centric and document-centric modeling. The examples used here have been data-centric. They break the document into fine-grained bits of data. A document-centric view of XML would look much more like an email or a form with fields to fill in, as shown in the following code:

    <Reminder>
        <To>Pi Kappa Alpha, Epsilon Nu Mailing List</To>
        <From>Kirk Allen Evans</From>
        <Message>Just a reminder... The <Event>Atlanta Area Alumni Charity Golf Outing</Event> event is scheduled for <ScheduleDate>2001-08-27</ScheduleDate>. For information, call <Contact>Keth Bunn</Contact> at <ContactInfo>kefbum@hotmail.com</ContactInfo>.
        </Message>
    </Reminder>

As you can see, the Message node contains mixed content. This is an example of a well-formed XML document, but how would you express this in terms of an XML Schema? The answer is to use the mixed attribute to signify that a section of data contains mixed content, as follows:

    <?xml version="1.0" encoding="utf-8" ?>
    <xsd:schema id="memo" targetNamespace="http://tempuri.org/memo.xsd"
        elementFormDefault="qualified" xmlns="http://tempuri.org/memo.xsd"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <xsd:element name="Reminder">
            <xsd:complexType>
                <xsd:sequence>
                    <xsd:element name="To" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
                    <xsd:element name="CC" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
                    <xsd:element name="From" type="xsd:string" minOccurs="1" maxOccurs="1"/>
                    <xsd:element name="Message">
                        <xsd:complexType mixed="true">
                            <xsd:sequence>
                                <xsd:element name="Event" type="xsd:string" />
                                <xsd:element name="ScheduleDate" type="xsd:date" />
                                <xsd:element name="Contact" type="xsd:string" />
                                <xsd:element name="ContactInfo" type="xsd:string" />
                            </xsd:sequence>
                        </xsd:complexType>
                    </xsd:element>
                </xsd:sequence>
            </xsd:complexType>
        </xsd:element>
    </xsd:schema>

The default value for the mixed attribute is false, which means that you cannot intermix character data with child elements unless the type explicitly declares mixed="true".
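
For contrast, the following sketch shows element-only content for the same Message element. Assuming the same four child declarations as in the memo schema, content like this would be acceptable even if the mixed attribute were omitted, because only whitespace appears between the child elements. Namespace declarations are omitted for brevity.

```xml
<!-- Element-only content: no character data between the children, -->
<!-- so the content model does not depend on mixed="true".          -->
<Message>
    <Event>Atlanta Area Alumni Charity Golf Outing</Event>
    <ScheduleDate>2001-08-27</ScheduleDate>
    <Contact>Keth Bunn</Contact>
    <ContactInfo>kefbum@hotmail.com</ContactInfo>
</Message>
```

As soon as text such as "Just a reminder..." is placed between those children, the declaration needs mixed="true" for the instance to remain valid.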

Clearing Confusion in Specifying Mixed Content Models

Earlier versions of the XML Schema working draft stated that mixed content models were to be marked up with the content="mixed" attribute value, and most available online documentation still reflects this. The recommendation specifies the markup as mixed="true".

Named and Anonymous Types

Throughout this chapter, both anonymous types and named types have been used without a real explanation of the difference. The reason is that the distinction is fairly intuitive when you are reading a schema, but it can be confusing when you sit down for the first time to code a schema by hand.

A named type is declared with the name attribute and can be referenced using its name. Listing 2.5 shows the revised purchase order schema document from Listing 2.2.

Listing 2.5 Revised Sample Purchase Order Schema

    <xsd:schema id="PURCHASEORDER"
        xmlns="http://www.xmlandasp.net"
        targetNamespace="http://www.xmlandasp.net"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        attributeFormDefault="qualified"
        elementFormDefault="qualified">
        <xsd:simpleType name="emailType">
            <xsd:restriction base="xsd:string">
                <xsd:pattern value=""/>
            </xsd:restriction>
        </xsd:simpleType>
        <xsd:complexType name="customerType">
            <xsd:sequence>
                <xsd:element name="NAME" type="xsd:string" />
                <xsd:element name="PHONE" type="xsd:string" />
                <xsd:element name="EMAIL" type="emailType" />
            </xsd:sequence>
        </xsd:complexType>
        <xsd:element name="PURCHASEORDER">
            <xsd:complexType>
                <xsd:choice maxOccurs="unbounded">
                    <xsd:element name="CUSTOMER" type="customerType"/>
                    <xsd:element name="ORDER">
                        <xsd:complexType>
                            <xsd:sequence>
                                <xsd:element name="ITEM" minOccurs="1" maxOccurs="unbounded">
                                    <xsd:complexType>
                                        <xsd:sequence>
                                            <xsd:element name="ITEMNAME" type="xsd:string" />
                                            <xsd:element name="DESCRIPTION" type="xsd:string" />
                                            <xsd:element name="SIZE" type="xsd:integer" />
                                            <xsd:element name="PRICE" type="xsd:string" />
                                        </xsd:sequence>
                                    </xsd:complexType>
                                </xsd:element>
                            </xsd:sequence>
                        </xsd:complexType>
                    </xsd:element>
                </xsd:choice>
            </xsd:complexType>
        </xsd:element>
    </xsd:schema>

The types emailType and customerType are examples of named types. They are referenced by using the type attribute of an xsd:element or xsd:attribute tag.

An anonymous type, on the other hand, is a simple or complex type declared as a child of an xsd:element or xsd:attribute tag without naming the type. For example, the child of the PURCHASEORDER root element in the previous code is a complexType containing both a CUSTOMER and an ORDER node. You don't have to explicitly declare this type and reference it by name because it is only used once in the document. If it were going to be used more than once, you'd want to create a named type to reuse the structure.
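
To make the trade-off concrete, here is a minimal sketch of how the anonymous type of the ITEM element could be promoted to a named type. The name itemType is hypothetical and does not appear in Listing 2.5; the fragment assumes it is placed in the same schema, where the default namespace matches the target namespace.

```xml
<!-- Hypothetical named type declared as a child of xsd:schema -->
<xsd:complexType name="itemType">
    <xsd:sequence>
        <xsd:element name="ITEMNAME" type="xsd:string" />
        <xsd:element name="DESCRIPTION" type="xsd:string" />
        <xsd:element name="SIZE" type="xsd:integer" />
        <xsd:element name="PRICE" type="xsd:string" />
    </xsd:sequence>
</xsd:complexType>

<!-- Inside the ORDER sequence, ITEM now references the type by name -->
<xsd:element name="ITEM" type="itemType" minOccurs="1" maxOccurs="unbounded" />
```

Any other element that needs the same structure could then reuse itemType through its type attribute instead of repeating the declaration.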

Using Annotations

A great feature of XML Schemas is that you can comment them in such a way that they document themselves. Instead of embedding comments into the document to be parsed, you can use the three annotation elements provided with schemas: xsd:annotation, xsd:appinfo, and xsd:documentation. The xsd:appinfo element can contain any well-formed XML content, so you can define your own markup to describe the schema, as shown here:

    <xsd:annotation>
        <xsd:appinfo>
            <DocumentVersion value="1.6.7" />
            <Author>Kirk Allen Evans</Author>
            <DateCreated>2001-08-15</DateCreated>
        </xsd:appinfo>
        <xsd:documentation>
            Validates a submission memo.
        </xsd:documentation>
    </xsd:annotation>
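
Annotations are not limited to the schema as a whole. As a rough sketch, an xsd:annotation can also be the first child of an individual declaration, such as the Reminder element from the memo schema. The documentation text below is made up for illustration, and the content model is elided.

```xml
<xsd:element name="Reminder">
    <!-- The annotation comes first, before the type definition -->
    <xsd:annotation>
        <xsd:documentation>
            Root element of a reminder memo.
        </xsd:documentation>
    </xsd:annotation>
    <xsd:complexType>
        <!-- content model as declared in the memo schema -->
    </xsd:complexType>
</xsd:element>
```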

Child Elements

The XML Schemas documentation in Beta 2 states that any well-formed XML content is valid as a child of the appinfo or documentation element. However, at the time of this writing, the XML Schema Designer only validates elements without child elements.

Because the schema is an XML document, it can be parsed with an XML parser to determine information about the schema. You could also apply a stylesheet to the schema itself to provide automatically generated HTML help for working with your schema using the annotations to provide meaningful content.
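
One way this might look is sketched below: a small XSLT 1.0 stylesheet that reads a schema and emits an HTML page listing its top-level documentation and global element names. It is an illustrative sketch rather than a complete help generator, and the choice of what to display (the id attribute, the schema-level xsd:documentation, and the names of global elements) is an assumption.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xsd">
    <xsl:output method="html"/>

    <!-- Match the root xsd:schema element of the schema document -->
    <xsl:template match="/xsd:schema">
        <html>
            <body>
                <h1>Schema: <xsl:value-of select="@id"/></h1>
                <!-- Schema-level documentation, if an annotation is present -->
                <p>
                    <xsl:value-of
                        select="normalize-space(xsd:annotation/xsd:documentation)"/>
                </p>
                <h2>Global elements</h2>
                <ul>
                    <!-- One list item per element declared directly under xsd:schema -->
                    <xsl:for-each select="xsd:element">
                        <li><xsl:value-of select="@name"/></li>
                    </xsl:for-each>
                </ul>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>
```

Applied to the memo schema with the annotation shown earlier placed under xsd:schema, a transform along these lines would list Reminder as the only global element.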
