Section 22.3. Structure of a schema definition




A schema is defined by one or more schema documents. Their root element is a schema element. Its attributes can define applicable namespaces, and its content components include elements like those we have been discussing, plus annotation elements.

22.3.1 Namespaces

The schema element must have a declaration for the XML Schema namespace, http://www.w3.org/2001/XMLSchema. It can either assign a prefix (xsd and xs are two popular ones) or make XML Schema the default namespace. The prefix (if any) is used both for schema component elements and in references to built-in datatypes.
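
As a minimal sketch of the prefixed form, the schema below binds the XML Schema namespace to the xsd prefix. The title element is only an illustrative assumption; the comment describes how the default-namespace form would differ.

```xml
<?xml version="1.0"?>
<!-- Prefixed form: the xsd prefix appears on component elements
     (xsd:schema, xsd:element) and on built-in datatypes (xsd:string).
     Default-namespace form: declare xmlns="http://www.w3.org/2001/XMLSchema"
     instead, then write schema, element and string without any prefix. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="title" type="xsd:string"/>
</xsd:schema>
```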

To validate documents that use namespaces, you can specify a targetNamespace for the schema, as shown in Example 22-2. Components that are children of the schema element are called global schema components. They declare and define items in the schema's target namespace.

The example also declares the poem prefix for the schema's target namespace. It is used within the schema definition to refer to the elements, attributes and types declared (or defined) by global schema components.

The instance document in Example 22-1 utilizes the same namespace, http://www.poetry.net/poetns. However, as it is declared as the default namespace, no prefix is declared or used.

Note that the value of the name attribute of a component, such as a type definition or element declaration, does not have a namespace prefix. A component name always belongs to the target namespace. It is only when the declared or defined objects are referenced that the prefix may be used.
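
The naming rules above can be sketched as follows. This is not Example 22-2 itself, only a cut-down illustration that reuses the namespace URI and poem prefix from the text; the content of the poem element is an assumption.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:poem="http://www.poetry.net/poetns"
            targetNamespace="http://www.poetry.net/poetns">

  <!-- Global components: the name attribute carries no prefix,
       yet each name belongs to the target namespace. -->
  <xsd:element name="title" type="xsd:string"/>

  <xsd:element name="poem">
    <xsd:complexType>
      <xsd:sequence>
        <!-- A reference to a global component uses the poem prefix. -->
        <xsd:element ref="poem:title"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- A matching instance could declare the same URI as its default
       namespace and use no prefix at all:
       <poem xmlns="http://www.poetry.net/poetns"><title>...</title></poem> -->
</xsd:schema>
```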

Within a namespace, different kinds of components can normally have the same name. The exception is simple and complex types, which share a single set of names because there are many places where they are treated interchangeably. Elements, however, are not types, so an element declaration component may use the same name as a complex or simple type. There is no more relationship between them than between a guy named Bob at your office and the guy named Bob on your favorite television show (unless you work in Hollywood!).
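
A small sketch of an element and a type sharing a name. The namespace URI bound to myns is a hypothetical placeholder, as is the city child element.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:myns="http://www.example.com/myns"
            targetNamespace="http://www.example.com/myns">

  <!-- A complex type named "address" ... -->
  <xsd:complexType name="address">
    <xsd:sequence>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- ... and an element also named "address". The two names do not
       clash because elements and types are different kinds of component. -->
  <xsd:element name="address" type="myns:address"/>
</xsd:schema>
```

A second type named address, whether simple or complex, would clash with the first.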

22.3.2 Schema components

The XSDL elements we have been discussing, such as element and simpleType, occur in the content of a schema element and are collectively known as schema components. Those, like element, that correspond to DTD declarations, are also called (surprise!) declaration components.

As XSDL schemas are themselves defined in XML documents, it was possible to provide techniques to make them self-documenting and capable of being processed by applications other than schema processors. These include unique identifiers, extension attributes, and annotation elements.

22.3.2.1 Unique identifiers

All schema components are defined with an optional id attribute. You can therefore assign unique identifiers to make the components easier to refer to using XPath expressions. Each value assigned to an id attribute must be different from any other assigned anywhere in the schema document.
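
For illustration, here is a sketch with id attributes on three components. The identifier values are made up; with them in place, an XPath expression such as //*[@id='title-decl'] would select the title declaration.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            id="poem-schema">

  <xsd:element name="title" type="xsd:string" id="title-decl"/>

  <!-- Every id value must be unique within this schema document. -->
  <xsd:simpleType name="pubyear" id="pubyear-type">
    <xsd:restriction base="xsd:gYear"/>
  </xsd:simpleType>
</xsd:schema>
```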

22.3.2.2 Extension attributes

Schema components may be extended with arbitrary attributes in any namespace other than the XML Schema namespace. For instance you could add attributes from the XLink namespace or from the RDF namespace.

If you had software that helped you to visualize the schema, extension attributes could be used to store the graphical coordinates of the various elements. If you used software that converted XML schemas to a relational database schema, you might use the extension attributes to guide that process.
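
The two scenarios above might look something like this. Both extension namespaces and all of their attribute names (viz:x, viz:y, db:column) are hypothetical; a real tool would define its own.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:viz="http://www.example.com/schema-visualizer"
            xmlns:db="http://www.example.com/schema-to-sql">

  <!-- viz:x and viz:y hold diagram coordinates for a visualizer;
       db:column names the relational column for a converter.
       A schema processor ignores all three. -->
  <xsd:element name="title" type="xsd:string"
               viz:x="120" viz:y="40"
               db:column="TITLE"/>
</xsd:schema>
```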

22.3.2.3 The `annotation` element

Any XSDL component may have an annotation element as its first sub-element. The schema element, however, goes above and beyond the call of duty! It may have as many annotation sub-elements as you like. It is good practice to have at least one annotation at the beginning as an introduction to the document type.

An annotation element may have zero or more documentation and appinfo child elements.

The documentation element is used to add user-readable information to the schema. Any elements are permitted; they needn't be defined in the schema. The benefit of using annotation elements rather than XML comments is that it is much easier to use rich markup such as XHTML or DocBook. Application software can extract this documentation and use it for online help or other purposes.

An appinfo element adds some information specific to a particular application. These are extension elements; they work like the extension attributes we discussed earlier. You may use them for the same sorts of tasks, but the elements can have an internal structure while attributes can only contain data characters. Your extension elements should be in a namespace that will enable your applications to recognize them.
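
A sketch of both kinds of annotation content follows. The XHTML markup inside documentation is only an example of rich markup, and the help namespace with its topic element is a hypothetical application vocabulary.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:help="http://www.example.com/help">

  <!-- Introductory annotation for the whole schema. -->
  <xsd:annotation>
    <xsd:documentation xml:lang="en">
      <p xmlns="http://www.w3.org/1999/xhtml">Schema for
        <em>poem</em> documents.</p>
    </xsd:documentation>
  </xsd:annotation>

  <xsd:element name="title" type="xsd:string">
    <!-- The annotation is the first sub-element of the component. -->
    <xsd:annotation>
      <xsd:documentation>The title of the poem.</xsd:documentation>
      <xsd:appinfo>
        <!-- Extension element for a hypothetical online-help tool. -->
        <help:topic page="title.html"/>
      </xsd:appinfo>
    </xsd:annotation>
  </xsd:element>
</xsd:schema>
```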

22.3.3 Complex types

Example 22-3 shows the definition of an address type. It also shows two element declarations that utilize it.

Example 22-3. Elements built on an address type

 <xsd:complexType name="address">
   <xsd:sequence>
     <xsd:element ref="myns:line1"/>
     <xsd:element ref="myns:line2"/>
     <xsd:element ref="myns:city"/>
     <xsd:element ref="myns:state"/>
     <xsd:element ref="myns:zip"/>
   </xsd:sequence>
   <xsd:attribute name="id" type="xsd:ID"/>
 </xsd:complexType>
 <xsd:element name="billingAddress" type="myns:address"/>
 <xsd:element name="shippingAddress" type="myns:address"/>

In XSDL, types are definable independently of elements and may be associated with more than one element-type name. In the example, the address type is used by both billingAddress and shippingAddress.

This example shows some of the power of complex types: we can create structural definitions as reusable units that make element declaration and maintenance easier. Types are similar to the virtual or abstract classes used in object-oriented programming.

Types do not themselves define elements that will be used directly. Example 22-3 would not permit an address element in a valid document. Instead the type is a set of reusable constraints that can be used as a building block in element declarations and other type definitions.

XSDL does not require you to give every type a name. If you only intend to use a type once, you could put the definition for it right in an element declaration, as in Example 22-4.

Example 22-4. Inline type definition

 <xsd:element name="address">
   <xsd:complexType>
     <xsd:sequence>
       <xsd:element ref="myns:line1"/>
       <xsd:element ref="myns:line2"/>
       <xsd:element ref="myns:city"/>
       <xsd:element ref="myns:state"/>
       <xsd:element ref="myns:zip"/>
     </xsd:sequence>
     <xsd:attribute name="id" type="xsd:ID"/>
   </xsd:complexType>
 </xsd:element>

Example 22-4 was created from Example 22-3 by wrapping the complexType in an element and moving the name attribute. You could do the same with a simpleType. Note that the element declaration has no type attribute. You need to choose whether to refer to a named type or embed an unnamed type definition.

To create a type that allows character data in addition to whatever is specified in its content model, you may add a mixed="true" attribute value to the complexType element.
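
As a rough sketch of mixed content, the schema below lets character data appear around optional emph elements. The verse and emph declarations here are assumptions made for illustration and use no target namespace.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="emph" type="xsd:string"/>

  <xsd:element name="verse">
    <!-- mixed="true" allows text between the child elements. -->
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element ref="emph" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- An instance this permits:
       <verse>Roses are <emph>red</emph>, violets are blue</verse> -->
</xsd:schema>
```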

To declare the element-type empty, we could have left out the sequence element.
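
Applied to Example 22-4, leaving out the sequence might give something like the sketch below. It is shown without a target namespace to keep it self-contained.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="address">
    <!-- No sequence, choice or all: the element has empty content.
         It may still carry attributes. -->
    <xsd:complexType>
      <xsd:attribute name="id" type="xsd:ID"/>
    </xsd:complexType>
  </xsd:element>

  <!-- A valid instance: <address id="a1"/> -->
</xsd:schema>
```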

22.3.4 Content models

Content models allow us to describe what content is allowed within an element.

22.3.4.1 Sequences

A sequence is specified in Example 22-5. It indicates that there must be an A element followed by a B element followed by a C element.

Example 22-5. `sequence` element

 <xsd:sequence>
   <xsd:element ref="myns:A"/>
   <xsd:element ref="myns:B"/>
   <xsd:element ref="myns:C"/>
 </xsd:sequence>

An element element can either declare an element directly, using a name attribute, or refer to an existing global element declaration, using a ref attribute. The element elements in the example do the latter. Note that an element reference must be prefixed if the element lives in a namespace (which it will if declared in a schema with a targetNamespace).

22.3.4.2 Choices

Example 22-6 shows the XSDL code that defines a choice of element types. It means the element must contain exactly one of A, B or C.

Example 22-6. `choice` element

 <xsd:choice>
   <xsd:element ref="myns:A"/>
   <xsd:element ref="myns:B"/>
   <xsd:element ref="myns:C"/>
 </xsd:choice>

22.3.4.3 Nested model groups

For more complex content models, model groups can be nested. For example, we can specify a choice element within a sequence element, as shown in Example 22-7.

Example 22-7. Sequence with nested choice

 <xsd:sequence>
   <xsd:element ref="poem:title"/>
   <xsd:element ref="poem:picture"/>
   <xsd:element ref="poem:verse" maxOccurs="unbounded"/>
   <xsd:choice>
     <xsd:element ref="poem:footnotes"/>
     <xsd:element ref="poem:bibliography"/>
   </xsd:choice>
 </xsd:sequence>

The declaration for verse uses the maxOccurs attribute to state that it may occur multiple times. There is a corresponding minOccurs attribute. Both default to "1", so an element with neither attribute must occur exactly once.
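
The occurrence attributes can be combined as in the sketch below. Making picture optional and giving every element the xsd:string type are assumptions made to keep the example short; they are not taken from Example 22-2.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:poem="http://www.poetry.net/poetns"
            targetNamespace="http://www.poetry.net/poetns">

  <xsd:element name="title" type="xsd:string"/>
  <xsd:element name="picture" type="xsd:string"/>
  <xsd:element name="verse" type="xsd:string"/>

  <xsd:element name="poem">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Neither attribute given: exactly one title. -->
        <xsd:element ref="poem:title"/>
        <!-- minOccurs="0": picture is optional (zero or one). -->
        <xsd:element ref="poem:picture" minOccurs="0"/>
        <!-- maxOccurs="unbounded": one or more verses. -->
        <xsd:element ref="poem:verse" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```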

Inside sequences and choices it is also possible to use any and group elements. An any element is a wildcard: it means that any element is allowed at that point. It has various bells and whistles to allow you to narrow down what you mean by "any". Most document types do not require this feature, so we will not go into any detail.

The group element allows you to refer to a named "model group definition". You can use these model group definitions to reuse parts of content models by referencing them.
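
A sketch of a model group definition and a reference to it follows. The group name backMatter, the type name poemBody and the xsd:string element types are hypothetical; the element names come from Example 22-7.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:poem="http://www.poetry.net/poetns"
            targetNamespace="http://www.poetry.net/poetns">

  <xsd:element name="verse" type="xsd:string"/>
  <xsd:element name="footnotes" type="xsd:string"/>
  <xsd:element name="bibliography" type="xsd:string"/>

  <!-- Named model group definition: a reusable piece of content model. -->
  <xsd:group name="backMatter">
    <xsd:choice>
      <xsd:element ref="poem:footnotes"/>
      <xsd:element ref="poem:bibliography"/>
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="poemBody">
    <xsd:sequence>
      <xsd:element ref="poem:verse" maxOccurs="unbounded"/>
      <!-- Reference the group instead of repeating the choice. -->
      <xsd:group ref="poem:backMatter"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```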

22.3.4.4 `all` elements

The all element specifies that all of the contained elements must be present, but their order is irrelevant. So you could enter "A B C", "A C B", "B A C" or any of the other orderings of the three. Example 22-8 demonstrates.

Example 22-8. `all` element

 <xsd:complexType name="testAll">
   <xsd:all>
     <xsd:element ref="myns:A"/>
     <xsd:element ref="myns:B"/>
     <xsd:element ref="myns:C"/>
   </xsd:all>
 </xsd:complexType>

An all element may only be used at the top level of a complex type definition. It is also unique in that it may only contain element elements, not sequences, choices, groups, etc.

22.3.5 Attributes

The poem and picture element declarations in Example 22-2 both contain attribute declarations. Example 22-9 shows the declarations for the poem element's optional publisher and pubyear attributes.

Example 22-9. Attribute declarations

 <xsd:attribute name="publisher" type="xsd:string"/>
 <xsd:attribute name="pubyear" type="xsd:NMTOKEN"/>

They are optional because there is no use attribute in their declarations, and use defaults to "optional". You can make an attribute required with use="required".

Example 22-10 shows two attribute declarations. One uses a built-in datatype and the other a user-defined simple type.

Example 22-10. Built-in and user-defined types

 <xsd:attribute name="href" use="required" type="xsd:anyURI"/>
 <xsd:attribute name="pubdate" type="myns:pubyear"/>

Attribute declarations can also occur within a named attributeGroup element, which allows them to be reused in complex type definitions and in other attribute groups.
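
As a sketch, the two attributes from Example 22-9 could be collected into an attribute group and referenced from a complex type. The group name publicationInfo, the myns namespace URI and the content of poem are hypothetical.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:myns="http://www.example.com/myns"
            targetNamespace="http://www.example.com/myns">

  <!-- Reusable bundle of attribute declarations. -->
  <xsd:attributeGroup name="publicationInfo">
    <xsd:attribute name="publisher" type="xsd:string"/>
    <xsd:attribute name="pubyear" type="xsd:NMTOKEN"/>
  </xsd:attributeGroup>

  <xsd:element name="poem">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="title" type="xsd:string"/>
      </xsd:sequence>
      <!-- Attribute declarations and group references follow
           the content model. -->
      <xsd:attributeGroup ref="myns:publicationInfo"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```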

An attribute element has a default attribute that allows you to specify a default value for an optional attribute. To supply a value that cannot be overridden, use the fixed attribute rather than the default attribute.
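
The difference between the two can be sketched like this. The attribute names language and version, and their values, are invented for the example.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="poem">
    <xsd:complexType>
      <!-- default: used when the attribute is omitted;
           an instance may supply a different value. -->
      <xsd:attribute name="language" type="xsd:language" default="en"/>
      <!-- fixed: if the attribute appears at all,
           its value must be "1.0". -->
      <xsd:attribute name="version" type="xsd:string" fixed="1.0"/>
    </xsd:complexType>
  </xsd:element>

  <!-- Valid:   <poem/>  and  <poem language="fr" version="1.0"/>
       Invalid: <poem version="2.0"/> -->
</xsd:schema>
```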

