Using XML-Data for Schema Definitions | Designing for Scalability with Microsoft Windows DNA (DV-MPS Designing)

Comparing XML-Data with DTD, we find that the advantages of DTD are the disadvantages of XML-Data, just as the disadvantages of DTD are the advantages of XML-Data. The biggest advantage of XML-Data is probably that an XML-Data schema is an XML document that can be loaded in an XML processor. The XML-Data schema's language is purely XML, so there's no need to learn a separate language, as you must for DTD, to be able to create a schema. Another advantage of XML-Data is its ability to represent a multitude of data types, far exceeding those available with DTD. You'll see examples of these many data types in Chapter 21.

The one big disadvantage of XML-Data, at least right now and for some time to come, is that it's not yet an established Web standard. Its first W3C specification is just a note; you can find it at www.w3.org/TR/1998/NOTE-XML-data. The designation note means that the World Wide Web Consortium has made the document available as a basis for discussion, but the consortium doesn't guarantee that it will assign any resources to standardizing the content or parts of it.

Microsoft is one of the companies behind the effort to make XML-Data a Web standard. Some of the document's authors come from Microsoft, but others come from such diverse entities as ArborText, DataChannel, Inso Corporation, and the University of Edinburgh, which shows that XML-Data is a broad initiative, not just something that comes from Microsoft.

Before we go on, we must warn you not to expect that the XML-Data presented in the previously mentioned document mirrors what Microsoft has implemented today. The best source for information about XML-Data, or XML-Schema as it's being called more and more, is Microsoft's Web site.

An XML-Data Schema Is an XML Document

We've already said it, and we'll say it again: An XML-Data schema is an XML document. Therefore, it has a root element; that root element is always a Schema element, as the following sample shows:

    <Schema name="Horserace" xmlns="urn:schemas-microsoft-com:xml-data">
      <ElementType name="Track" content="textOnly"/>
      <ElementType name="Date" content="textOnly"/>
      <ElementType name="Raceno" content="textOnly"/>
      <ElementType name="Distance" content="textOnly"/>
      <ElementType name="Horserace" content="eltOnly">
        <element type="Track"/>
        <element type="Date"/>
        <element type="Raceno"/>
        <element type="Distance"/>
      </ElementType>
    </Schema>

In contrast with DTD, an XML-Data schema must first define the contained elements. In the preceding example, the four elements that define a horse race are first defined as "text only" elements. The containing Horserace element, being an "element only" element, isn't defined until all the contained elements are already defined. This is the order you must follow; you can't include an element in another if that included element isn't already defined in the schema.
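
Because the schema is itself an XML document, any XML parser can read it and inspect its declarations. The following is a minimal sketch, not the processor discussed in this chapter: it uses Python's standard xml.etree.ElementTree module to load the Horserace schema and to check the ordering rule described above. The helper name undeclared_references is our own invention for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace name that identifies an XML-Data schema.
XDR_NS = "urn:schemas-microsoft-com:xml-data"

SCHEMA = """\
<Schema name="Horserace" xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="Track" content="textOnly"/>
  <ElementType name="Date" content="textOnly"/>
  <ElementType name="Raceno" content="textOnly"/>
  <ElementType name="Distance" content="textOnly"/>
  <ElementType name="Horserace" content="eltOnly">
    <element type="Track"/>
    <element type="Date"/>
    <element type="Raceno"/>
    <element type="Distance"/>
  </ElementType>
</Schema>
"""


def undeclared_references(schema_text):
    """Return (container, referenced) pairs where a type is used before it is defined.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(schema_text)
    defined = set()
    problems = []
    # Walk the ElementType declarations in document order.
    for element_type in root.findall(f"{{{XDR_NS}}}ElementType"):
        for ref in element_type.findall(f"{{{XDR_NS}}}element"):
            if ref.get("type") not in defined:
                problems.append((element_type.get("name"), ref.get("type")))
        defined.add(element_type.get("name"))
    return problems


root = ET.fromstring(SCHEMA)
print(root.tag)                       # the Schema root, qualified by its namespace
print(undeclared_references(SCHEMA))  # an empty list means the order is respected
```

If the Horserace element type were moved to the top of the schema, the helper would report each of its four references as used before definition.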

If you set the content attribute of an element type to textOnly, that type can only contain text. eltOnly means that it can't contain any text, only other elements. If you want an element type to be able to hold both text and other elements, which is entirely possible, you set its content attribute to mixed.
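
To make the three content values concrete, the following sketch classifies an element by what it actually contains. It is only an illustration written with Python's standard library; the function name content_kind and the Note and Horse elements in the usage lines are hypothetical and don't appear in the Horserace schema.

```python
import xml.etree.ElementTree as ET


def content_kind(element):
    """Name the content value that matches an element's actual content.

    Hypothetical helper for illustration only.
    """
    has_children = len(element) > 0
    # Text can sit before the first child (element.text) or after any child (child.tail).
    chunks = [element.text] + [child.tail for child in element]
    has_text = any(chunk and chunk.strip() for chunk in chunks)
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "eltOnly"
    # No child elements: the element holds text only (possibly none at all).
    return "textOnly"


print(content_kind(ET.fromstring("<Track>Täby Galopp</Track>")))
print(content_kind(ET.fromstring("<Horserace><Track>Täby Galopp</Track></Horserace>")))
print(content_kind(ET.fromstring("<Note>Won by <Horse>Example</Horse> easily</Note>")))
```

The three calls report textOnly, eltOnly, and mixed, in that order. Whitespace between child elements isn't counted as text, so an indented eltOnly element is still classified as eltOnly.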

Again in contrast with DTD, XML-Data doesn't let you store the schema and the XML document in the same file. An attempt to load the following document into an XML processor fails:

    <?xml version="1.0" encoding="windows-1252" ?>
    <Schema name="Horserace" xmlns="urn:schemas-microsoft-com:xml-data">
      <ElementType name="Track" content="textOnly"/>
      <ElementType name="Date" content="textOnly"/>
      <ElementType name="Raceno" content="textOnly"/>
      <ElementType name="Distance" content="textOnly"/>
      <ElementType name="Horserace" content="eltOnly">
        <element type="Track"/>
        <element type="Date"/>
        <element type="Raceno"/>
        <element type="Distance"/>
      </ElementType>
    </Schema>
    <Horserace>
      <Track>Täby Galopp</Track>
      <Date>1999-10-15</Date>
      <Raceno>3</Raceno>
      <Distance>1600</Distance>
    </Horserace>

It's easy to see why it fails; it contains two root elements, so it breaks one of the basic XML rules. The first root element is the Schema element; the second root element is the Horserace element. Having two root elements disqualifies an XML document from being well formed, and no decent XML processor is going to load it. You must separate the two root elements from each other, saving each root in its own file. The schema document can be saved without any changes, as follows:

    <?xml version="1.0" encoding="windows-1252" ?>
    <Schema name="Horserace" xmlns="urn:schemas-microsoft-com:xml-data">
      <ElementType name="Track" content="textOnly"/>
      <ElementType name="Date" content="textOnly"/>
      <ElementType name="Raceno" content="textOnly"/>
      <ElementType name="Distance" content="textOnly"/>
      <ElementType name="Horserace" content="eltOnly">
        <element type="Track"/>
        <element type="Date"/>
        <element type="Raceno"/>
        <element type="Distance"/>
      </ElementType>
    </Schema>

The schema now contains one root element only. As Figure 20-6 shows, you can load it in Internet Explorer without any problems.

Figure 20-6. An XML-Data schema is an XML document and can be loaded in Internet Explorer.
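
The two-root failure isn't specific to Internet Explorer; any conforming XML parser rejects a document with more than one root element. As a minimal sketch, assuming Python's standard xml.etree.ElementTree parser in place of the processor used in this chapter, the following code shows the combined text failing while each part loads on its own. The schema is abbreviated to one element type, and the helper name is_well_formed is our own.

```python
import xml.etree.ElementTree as ET

# Abbreviated schema: one element type is enough to show the effect.
SCHEMA = """<Schema name="Horserace" xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="Track" content="textOnly"/>
</Schema>"""

DATA = """<Horserace>
  <Track>Täby Galopp</Track>
</Horserace>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as one well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(SCHEMA + "\n" + DATA))  # False: two root elements
print(is_well_formed(SCHEMA))                # True
print(is_well_formed(DATA))                  # True
```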

The second document that comes out of the partitioning of the combined document can't be left alone; you must make it refer to the file that contains the schema. To set up this reference, you declare an XML namespace, a declaration that also defines an alias for the schema:

    <?xml version="1.0" encoding="windows-1252" ?>
    <hr:Horserace xmlns:hr="x-schema:HorseraceSchemaXMLData.xml">
      <hr:Track>Täby Galopp</hr:Track>
      <hr:Date>1999-10-15</hr:Date>
      <hr:Raceno>3</hr:Raceno>
      <hr:Distance>1600</hr:Distance>
    </hr:Horserace>

As you can see in the preceding code snippet, the alias of the namespace is hr. After declaring this alias, the code uses it as a prefix to every element name. Prefixing an element name with an alias, as in the example, makes it possible to relate every element to a specific schema. In the example, only one schema or namespace is used, but you can easily imagine an XML document in which different element types refer to different schemas and thus also to different namespaces.
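
The prefix itself carries no meaning; what matters is the namespace name it stands for. The following sketch, again using Python's standard library rather than the processor discussed here, shows how a namespace-aware parser expands the hr prefix. Note that Python treats the x-schema: value as an opaque namespace name and neither fetches nor applies the schema file. The race prefix in the lookup is an arbitrary local choice.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<hr:Horserace xmlns:hr="x-schema:HorseraceSchemaXMLData.xml">
  <hr:Track>Täby Galopp</hr:Track>
  <hr:Date>1999-10-15</hr:Date>
  <hr:Raceno>3</hr:Raceno>
  <hr:Distance>1600</hr:Distance>
</hr:Horserace>"""

root = ET.fromstring(DOCUMENT)

# ElementTree replaces each prefix with the namespace name it stands for.
print(root.tag)  # {x-schema:HorseraceSchemaXMLData.xml}Horserace

# A lookup can use any local prefix, as long as it maps to the same namespace name.
ns = {"race": "x-schema:HorseraceSchemaXMLData.xml"}
track = root.find("race:Track", ns)
print(track.text)  # Täby Galopp
```

A document that drew on two schemas would simply declare two prefixes, and each element's expanded name would show which schema it belongs to.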

Figure 20-7 shows the preceding document opened in Internet Explorer 5.0.

Figure 20-7. This XML document is opened in Internet Explorer as a file that refers to an XML-Data schema.

Invalid XML Code

Let's experiment by making the Horserace XML document invalid. All we have to do is add a Class element, just as we did in the section about DTD. In that section, we then used HTML and JavaScript code to load the document in the XML DOM, an action that didn't succeed. This time, let's just open the file in Internet Explorer 5.0 to see what happens. Figure 20-8 shows that Internet Explorer displayed the document, including the Class element, even though the document isn't valid.

Figure 20-8. Internet Explorer displays an XML document, even though the XML-Data schema it refers to proves that the document isn't valid.

Recall that an earlier experiment in this chapter showed that the XML DOM didn't accept an invalid document. The result of that experiment obviously stands in sharp contrast with the result of this one. What's the difference? One difference between the two invalid documents is that this one refers to an XML-Data schema and the previous one used a DTD. Is this difference the reason for the different behaviors? No, it isn't! There's another difference between the two experiments. For the experiment shown in Figure 20-8, we loaded the XML document in the browser by just opening a file, using the File menu of Internet Explorer. For the previous experiment, we used JavaScript to load the XML document in the XML DOM.

At least for the time being, Internet Explorer doesn't validate an XML document that is loaded in the browser by opening a file, regardless of whether the document uses a DTD or an XML-Data schema. If you modify the HTML page we used in the first experiment and make it load this document, which refers to an XML-Data schema, the XML DOM rejects the document, just as it rejected the invalid document in the first test. (See Figure 20-9.)

Figure 20-9. The XML DOM didn't accept this invalid document, checked by an XML-Data schema.
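
Python's standard library can't validate against an XML-Data schema, so we can't reproduce the XML DOM's behavior exactly outside Internet Explorer. The following toy check only approximates one aspect of validation: it reports element names used in the document that have no ElementType declaration in the schema. A real validating processor checks far more than this. The function name undeclared_elements is our own, and the position and value of the added Class element are assumptions.

```python
import xml.etree.ElementTree as ET

XDR_NS = "urn:schemas-microsoft-com:xml-data"
DOC_NS = "x-schema:HorseraceSchemaXMLData.xml"

SCHEMA = """\
<Schema name="Horserace" xmlns="urn:schemas-microsoft-com:xml-data">
  <ElementType name="Track" content="textOnly"/>
  <ElementType name="Date" content="textOnly"/>
  <ElementType name="Raceno" content="textOnly"/>
  <ElementType name="Distance" content="textOnly"/>
  <ElementType name="Horserace" content="eltOnly">
    <element type="Track"/>
    <element type="Date"/>
    <element type="Raceno"/>
    <element type="Distance"/>
  </ElementType>
</Schema>
"""

# The Class element's position and value are hypothetical; the schema doesn't declare Class.
INVALID_DOCUMENT = """<hr:Horserace xmlns:hr="x-schema:HorseraceSchemaXMLData.xml">
  <hr:Track>Täby Galopp</hr:Track>
  <hr:Date>1999-10-15</hr:Date>
  <hr:Raceno>3</hr:Raceno>
  <hr:Distance>1600</hr:Distance>
  <hr:Class>A</hr:Class>
</hr:Horserace>"""


def undeclared_elements(schema_text, document_text, document_ns):
    """List element names used in the document that have no ElementType in the schema.

    Hypothetical helper; a toy check, not a full XML-Data validator.
    """
    schema = ET.fromstring(schema_text)
    declared = {decl.get("name") for decl in schema.findall(f"{{{XDR_NS}}}ElementType")}
    prefix = f"{{{document_ns}}}"
    missing = []
    for element in ET.fromstring(document_text).iter():
        # Only elements in the schema's namespace are checked against it.
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            if name not in declared:
                missing.append(name)
    return missing


print(undeclared_elements(SCHEMA, INVALID_DOCUMENT, DOC_NS))  # ['Class']
```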

We could say a lot more about XML-Data schemas, but we must move on to other XML-related issues. Chapter 21 includes more XML-Data examples in the section about ADO 2.5. If you want to know more about XML-Data schemas in general, we particularly recommend Microsoft's Web site. Microsoft has also published a very interesting CD named Microsoft Windows DNA XML Resource Kit, which you'll find in the \DNAXMLRK folder of the companion CD-ROM.