Structuring DataSets with Schema | Visual C# .NET 2003 Unleashed

As mentioned in the introduction, this chapter isn't just about XML schemas and XSD documents. It is about how XSD documents relate to DataSets, and about the additional features you can enable by combining XML schemas with the flexibility of DataSets.

Defining Tables and Columns Using XML Schema

The first thing we need to do is figure out how to create XSD documents that will describe the data structure of a DataSet. When you describe XML data, you can very easily describe semi-structured, hierarchical data that might or might not have a consistent format. However, with a DataSet, you are talking about a consistent format of tables, rows, columns, relationships, keys and constraints.

We'll start with tables and columns. To describe a DataSet, the first element your XSD needs is one that represents the DataSet itself. As you'll see, there are some Microsoft-supplied extensions to XSD that control the behavior of XSD-structured DataSets. The following piece of XSD is the bare definition of an empty DataSet:

    <xs:schema xmlns=""
               xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
      <xs:element name="MyDataSet" msdata:IsDataSet="true">
      </xs:element>
    </xs:schema>

It should look somewhat familiar. You'll notice a new namespace declaration: msdata. The DataSet class recognizes the attributes that belong to that namespace and shapes the DataSet according to the msdata instructions in the XSD document.

When a DataSet is obtaining its structure from an XSD document, only the top-level complex types are used to generate tables. Items below that are used to generate the columns within those tables. Simple types such as integers and strings cannot generate tables.

TIP

Just because an XSD is valid doesn't mean that it will generate a valid DataSet structure. DataSets are inherently relational data, and require a structure that can be interpreted as a list of tables, with columns that are constrained or related to each other in some way. If that structure cannot be extrapolated, the XSD will not be able to generate a valid DataSet.

Given this information, the following XSD should represent a DataSet that has two tables (Book and Author), each with two columns:

    <xs:schema xmlns=""
               xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
      <xs:element name="MyDataSet" msdata:IsDataSet="true">
        <xs:complexType>
          <xs:choice maxOccurs="unbounded">
            <xs:element name="Book">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="Title" type="xs:string" minOccurs="0" />
                  <xs:element name="ISBN" type="xs:string" minOccurs="0" />
                </xs:sequence>
              </xs:complexType>
            </xs:element>
            <xs:element name="Author">
              <xs:complexType>
                <!-- Attributes are declared directly under complexType;
                     they are not allowed inside xs:sequence -->
                <xs:attribute name="Name" type="xs:string" />
                <xs:attribute name="Age" type="xs:positiveInteger" />
              </xs:complexType>
            </xs:element>
          </xs:choice>
        </xs:complexType>
      </xs:element>
    </xs:schema>
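To make the mapping from schema to data more concrete, here is a sketch of an instance document that the two-table structure above is meant to describe. The titles, ISBNs, name, and age are made-up placeholder values, not data from the book. Each Book element corresponds to a row whose columns are child elements, and each Author element corresponds to a row whose columns are attributes.

```xml
<MyDataSet>
  <!-- Two rows in the Book table; Title and ISBN are element columns -->
  <Book>
    <Title>Sample Title One</Title>
    <ISBN>0000000001</ISBN>
  </Book>
  <Book>
    <Title>Sample Title Two</Title>
    <ISBN>0000000002</ISBN>
  </Book>
  <!-- One row in the Author table; Name and Age are attribute columns -->
  <Author Name="Jane Example" Age="42" />
</MyDataSet>
```

Because the tables are declared inside an xs:choice with maxOccurs="unbounded", Book and Author rows can appear in any order and any number.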

Defining DataSet Keys and Constraints with XML Schema

Now that you've seen the basics of defining DataSet structure with XSD documents that can create tables and columns, let's move on to something slightly more advanced.

Tables and columns are great, but most relational databases and relational data containers also let you set keys and constraints. A key is an indicator of uniqueness: it is the list of columns whose combined values distinguish one row from the next in any given table.

A constraint is a restriction on a column within the table. When you define a key on a table, a unique constraint is automatically implied. You can also place a unique constraint on a column that is not part of a key, and you can have other kinds of constraints on other columns.

Key Constraints

The following XSD will create a key on a given column in a table (element within a top-level complex type):

    <xs:key msdata:PrimaryKey="true"
            msdata:ConstraintName="KeyConstraintISBN"
            name="KeyISBN">
      <xs:selector xpath=".//Books" />
      <xs:field xpath="ISBN" />
    </xs:key>

This key element creates a key on the ISBN column within the Books table. The <xs:selector> and <xs:field> elements use XPath notation because columns can be defined as either elements or attributes, and they can appear anywhere in the document. To allow for this flexibility, the key element uses XPath to locate the elements to which the key applies. The msdata attributes control how the DataSet key is created: msdata:ConstraintName names the resulting unique constraint (KeyConstraintISBN here), and msdata:PrimaryKey="true" makes that key the table's primary key.
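Because the key shown earlier is only a fragment, it can help to see where it sits in a complete schema. The following is a minimal sketch, not a listing from the book: it assumes a Books table with an ISBN column plus a hypothetical Title column, and places the xs:key as a child of the DataSet element, after its complexType.

```xml
<xs:schema xmlns=""
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="MyDataSet" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <!-- Assumed table: Books, with ISBN and an illustrative Title column -->
        <xs:element name="Books">
          <xs:complexType>
            <xs:sequence>
              <!-- No minOccurs="0" here: a key field must be present in every row -->
              <xs:element name="ISBN" type="xs:string" />
              <xs:element name="Title" type="xs:string" minOccurs="0" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
    <!-- Identity constraints follow the complexType of the DataSet element -->
    <xs:key name="KeyISBN"
            msdata:PrimaryKey="true"
            msdata:ConstraintName="KeyConstraintISBN">
      <xs:selector xpath=".//Books" />
      <xs:field xpath="ISBN" />
    </xs:key>
  </xs:element>
</xs:schema>
```

If ISBN were declared as an attribute rather than an element, the field path would be written as @ISBN instead of ISBN.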

Unique Constraints

Unique constraints work very much like key constraints. You define one against an attribute or element that will be interpreted as a column within the XSD as follows:

    <xs:unique msdata:ConstraintName="UniqueAuthorName"
               name="UniqueConstraintAuthorName">
      <xs:selector xpath=".//Books" />
      <xs:field xpath="Author" />
    </xs:unique>
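To illustrate what this unique constraint rules out, here is a hypothetical instance document. It assumes the Books table has ISBN and Author columns and that the rows sit under a MyDataSet root element; all values are made up. The second row repeats the Author value of the first, so it violates the constraint declared above.

```xml
<MyDataSet>
  <Books>
    <ISBN>0000000001</ISBN>
    <Author>Jane Example</Author>
  </Books>
  <!-- Same Author value as the row above: violates the unique constraint on Author -->
  <Books>
    <ISBN>0000000002</ISBN>
    <Author>Jane Example</Author>
  </Books>
</MyDataSet>
```

Changing the second Author value to anything different from the first would satisfy the constraint.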

Relationships

You can't call it relational data unless you can relate one set of data to another. That's where relationships come in. XSD doesn't call them relationships; it calls them references. There are, however, some msdata attributes that you can apply to a reference to create a proper DataSet relationship between rows in tables.

Keyref Constraints

As you might expect, a keyref is an element that refers to a key. These references from one key to another create relationships between tables. Probably the most well-known (and overused) example of a parent-child relationship between tables is the Order and OrderDetail tables. The Order table contains header or summary information, and the OrderDetail table contains individual line items. Rather than continuing that tradition, I'll continue with the books and author data I've been using so far.

Consider the following schema:

    <xs:schema xmlns=""
               xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
      <xs:element name="MyDataSet" msdata:IsDataSet="true">
        <xs:complexType>
          <xs:choice maxOccurs="unbounded">
            <xs:element name="Book">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="ISBN" type="xs:string" />
                  <xs:element name="Title" type="xs:string" />
                  <xs:element name="Author" type="xs:string" />
                </xs:sequence>
              </xs:complexType>
            </xs:element>
            <xs:element name="Author">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="Name" type="xs:string" />
                  <xs:element name="Age" type="xs:integer" />
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:choice>
        </xs:complexType>
        <xs:key name="AuthorKey">
          <xs:selector xpath=".//Author" />
          <xs:field xpath="Name" />
        </xs:key>
        <xs:keyref name="AuthorBooksKeyRef" refer="AuthorKey">
          <xs:selector xpath=".//Book" />
          <xs:field xpath="Author" />
        </xs:keyref>
      </xs:element>
    </xs:schema>

You should recognize most of the preceding schema. It creates a DataSet with two tables: Author and Book. In addition, there is a key called AuthorKey on the Name field in the Author table. Keys are vitally important to relationships between tables because you cannot create a relationship without a keyref, and you cannot create a keyref without a key.

When the schema declares the keyref, it creates a relationship between two tables. In the preceding case, the keyref refers to AuthorKey, indicating that the table carrying AuthorKey (Author) is the parent. The child is indicated by the keyref's own <xs:selector> and <xs:field> elements, which here point at the Author column of the Book table.

A DataSet whose structure was created with the preceding XSD will have a parent-child relationship between the Author table and the Book table, as well as a nonprimary, unique key on the Author table. By virtue of the relationship, a foreign key will be created on the Book table.
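As an illustration of the parent-child relationship, here is a sketch of an instance document for the preceding schema. The names, ages, titles, and ISBNs are placeholder values chosen for this example. Each Book row's Author value matches the Name of an Author row, which is what the keyref requires.

```xml
<MyDataSet>
  <!-- Parent rows: Name is the field covered by AuthorKey -->
  <Author>
    <Name>Jane Example</Name>
    <Age>42</Age>
  </Author>
  <Author>
    <Name>John Sample</Name>
    <Age>35</Age>
  </Author>
  <!-- Child rows: Author must match an existing Author/Name value -->
  <Book>
    <ISBN>0000000001</ISBN>
    <Title>Sample Title One</Title>
    <Author>Jane Example</Author>
  </Book>
  <Book>
    <ISBN>0000000002</ISBN>
    <Title>Sample Title Two</Title>
    <Author>Jane Example</Author>
  </Book>
</MyDataSet>
```

One author can have many books, but a Book row whose Author value matched no Author row would violate the keyref, and two Author rows with the same Name would violate the key.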