4.1. Creating and using schemas
The real power of XML lies in using a vocabulary that describes the meaning of the document, not its outward appearance. For example, Example 4-1 shows Doug's article marked up using a custom schema called article.
Example 4-1. An article using the article schema (article.xml)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<article xmlns="http://xmlinoffice.com/article"
         type="sales" id="A123">
  <title>Sales Update</title>
  <author>Doug Jones</author>
  <date>February 3, 2004</date>
  <body>
    <section>
      <header>A great month!</header>
      <para>This month's figures are a <em>huge</em>
      improvement over this month last year. We sold 1,342 widgets for a
      total revenue of ,327.</para>
    </section>
    <section>
      <header>More work to do</header>
      <para>Let's not rest on our past success. Let's get out there
      and sell, sell, sell!</para>
    </section>
  </body>
</article>
This XML document identifies the meaning of the data elements, not just their location in the document. The author is identified by an <author> tag, and the title is identified by a <title> tag. The document uses a namespace, http://xmlinoffice.com/article, to identify the vocabulary used.
4.1.1 Vocabularies and schemas
A schema defines an XML vocabulary for documents of a particular type (such as article documents). The vocabulary includes the element-type names, such as author and title. The schema also constrains which elements and attributes can appear and the order in which child elements must occur.
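As a hedged illustration of the ordering constraint, the hypothetical document below swaps the title and author elements. Assuming the article schema presented in Example 4-2, which declares title before author in a sequence, a validating processor should reject it. The id value and the text content are invented for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: expected to be INVALID against the article schema,
     because author appears before title -->
<article xmlns="http://xmlinoffice.com/article"
         type="sales" id="A124">
  <author>Doug Jones</author>
  <title>Sales Update</title>
  <date>February 3, 2004</date>
  <body>
    <section>
      <header>Out of order</header>
      <para>The content is fine; only the element order is wrong.</para>
    </section>
  </body>
</article>
```

Swapping the type and id attributes, by contrast, would change nothing: attribute order is never significant in XML.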
Industry organizations and standards committees have defined XML vocabularies for subjects as varied as computer graphics and accounting statements, many with one or more schemas that employ the vocabularies.
Any of those schemas – or any other – can be used with Word; there is no specific set of "supported schemas." However, a definition of the schema must be available in the W3C XML Schema definition language (XSDL), as no other schema language is supported.
Alternatively, you can define your own vocabulary by writing your own schema. Microsoft Office does not provide a GUI editor for defining a schema; you will have to create yours with a text editor or an available schema editing tool. Chapter 22, "XML Schema (XSDL)", on page 466 explains schemas in more detail and gives some guidelines for writing your own schemas.
Word uses schemas both to validate documents and to provide hints about the structure of a document while it is being edited.
4.1.2 The article schema
The article schema definition is shown in Example 4-2.
Example 4-2. Schema for article documents (article.xsd)
<?xml version="1.0"?>
<xs:schema targetNamespace="http://xmlinoffice.com/article"
           xmlns="http://xmlinoffice.com/article"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
  <xs:element name="article" type="ArticleType"/>
  <xs:complexType name="ArticleType">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="author" type="xs:string"/>
      <xs:element name="date" type="xs:string"/>
      <xs:element name="body" type="BodyType"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID"/>
    <xs:attribute name="type" type="xs:string"/>
  </xs:complexType>
  <xs:complexType name="BodyType">
    <xs:sequence>
      <xs:element name="section" type="SectionType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="SectionType">
    <xs:sequence>
      <xs:element name="header" type="xs:string"/>
      <xs:element name="para" type="ParaType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ParaType" mixed="true">
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="em" type="xs:string"/>
      <xs:element name="cite" type="xs:string"/>
      <xs:element name="url" type="xs:string"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
The schema definition uses xs:element elements to declare the element types allowed in an article document. For example:
<xs:element name="title" type="xs:string"/>
declares that title elements contain data characters ("strings").
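Built-in types other than xs:string constrain the characters an element may contain. The sketch below is a hypothetical variant, not part of the article schema: it declares a date element as xs:date in a made-up namespace (http://example.com/article-variant) to show how the choice of type changes what counts as valid content.

```xml
<?xml version="1.0"?>
<!-- Hypothetical variant for illustration only; the namespace is made up
     and Example 4-2 declares date as xs:string -->
<xs:schema targetNamespace="http://example.com/article-variant"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
  <!-- xs:date expects the form YYYY-MM-DD (optionally followed by a
       time zone), e.g. 2004-02-03; "February 3, 2004" would be invalid -->
  <xs:element name="date" type="xs:date"/>
</xs:schema>
```

Declaring date as xs:string, as the article schema does, accepts free-form text such as "February 3, 2004" at the cost of not checking that the value is a real date.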
Some of the element types, such as article and section, are complex, which means that their elements can have child elements and/or attributes.
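To make complex content concrete, here is a hypothetical para element that uses the em, cite and url children permitted by ParaType in Example 4-2. The wording, the cited title, and the example.com address are invented for this sketch.

```xml
<!-- Hypothetical fragment: a para mixing text with em, cite and url children -->
<para xmlns="http://xmlinoffice.com/article">Sales were <em>strong</em>,
as reported in <cite>Monthly Widget Report</cite>
(see <url>http://example.com/reports</url>).</para>
```

Because ParaType is declared with mixed="true", text may appear between the child elements. The title and author elements, by contrast, have the simple type xs:string and can hold only text.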
4.1.3 Adding a schema to the library
When Word opens an XML document, it checks whether the document is associated with any of the schemas in its Schema Library. It does so by comparing the namespace of the document (i.e., of its root element) with the target namespace of each schema in the Schema Library until it finds a match.
For example, the schema in Example 4-2 has a target namespace, specified by the targetNamespace attribute, of:
http://xmlinoffice.com/article
This is the same namespace that is declared for Doug's article document in Example 4-1. By adding the schema to Word's Schema Library, we can ensure that it will be associated with Doug's article or any other article document that is opened in Word.
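The comparison concerns the namespace name, not the prefix. As a sketch, the hypothetical document below binds the same namespace to a prefix, a, instead of declaring it as the default. Under the Namespaces in XML rules it is in the same namespace as Example 4-1, so it would be expected to match the same schema; that Word treats it identically is an assumption, not something stated above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical: same namespace as Example 4-1, bound to the prefix "a" -->
<a:article xmlns:a="http://xmlinoffice.com/article"
           type="sales" id="A125">
  <a:title>Sales Update</a:title>
  <a:author>Doug Jones</a:author>
  <a:date>February 3, 2004</a:date>
  <a:body>
    <a:section>
      <a:header>Same vocabulary</a:header>
      <a:para>Only the prefix differs.</a:para>
    </a:section>
  </a:body>
</a:article>
```

Namespace names are compared character by character, so a root element that declared http://xmlinoffice.com/article/ (with a trailing slash) would be in a different namespace and would not match the article schema.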
To add a schema:
1. On the Tools menu, click Templates and Add-Ins.
2. Click the XML Schema tab, shown in Figure 4-1.
3. Click Add Schema.
4. Select the schema file, in this case article.xsd, and click Open. This will bring up the Schema Settings dialog shown in Figure 4-2.
5. Type the word article in the Alias box. This will serve as a nickname for the article namespace. It is good practice to use the document type (i.e., the root element-type name) as the alias.
If you attempt to add an invalid schema, you will be advised that the schema is invalid and prevented from adding it. Once you have added the schema, it will appear in the Available XML schemas list, as shown in Figure 4-3. When it is selected, the pane will show the namespace URI and the path to the schema definition file.
These settings will be saved in your Word configuration. From now on, every time you open an XML document whose root element is in the http://xmlinoffice.com/article namespace, the article.xsd schema is automatically used for that document. It is not possible to add more than one schema for a given namespace.
4.1.4 Using the Schema Library
The Schema Library, shown in Figure 4-4, allows you to add and delete schemas, as well as give them mnemonic aliases (usually the document-type name). Word uses the alias as the name of the schema; without one it will use the entire namespace URI.
To access the Schema Library, click Schema Library on the XML Schema tab of the Templates and Add-Ins dialog.
The Schema Library also allows you to associate solutions with schemas. These are XSLT stylesheets that can be used to transform XML documents when Word opens or saves them, as we will see in 5.3.2.1, "Associating stylesheets with schemas", on page 108.
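For readers who have not yet seen XSLT, the following is a minimal sketch of a stylesheet written against the article vocabulary. It is independent of Word and is not one of the solutions discussed later: the art prefix and the choice of plain-text output are assumptions made to keep the example short.

```xml
<?xml version="1.0"?>
<!-- Hypothetical stylesheet: prints the title and author of an article -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:art="http://xmlinoffice.com/article">
  <xsl:output method="text"/>
  <!-- Match the root article element in the article namespace -->
  <xsl:template match="/art:article">
    <xsl:value-of select="art:title"/>
    <xsl:text> by </xsl:text>
    <xsl:value-of select="art:author"/>
  </xsl:template>
</xsl:stylesheet>
```

Note that the stylesheet must declare the article namespace itself; an unprefixed match such as /article would select nothing, because the elements in Example 4-1 are namespace-qualified.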