XML Schema Modeling | The Official XMLSPY Handbook

In Chapter 4, I showed you how to create global elements and global complex type definitions that could be either referenced or declared repeatedly throughout an XML Schema. These global constructs, in turn, served to greatly improve the reusability and modularity of an XML Schema over a Document Type Definition. XML Schema provides facilities to further improve the development, testing, and maintainability of your XML Schema through the use of compositors and groups. A compositor enables you to define sequences, choices, or any ordering of elements. A group is a clustering of elements (that is, a section of an XML Schema’s content model) that is given a distinct name. You’ll see that compositors and groups can provide greater flexibility and modularity in terms of content model design, in comparison to parameter entities in Document Type Definitions.

Compositor models

An XML Schema can be thought of as functional groupings of XML elements and other XML components whose relationships are expressed through the use of compositors. Up until now, I have used only the sequence compositor. The Order element, for example, uses a sequence compositor to express the fact that an Order element was defined as a sequence of ShippingAddress, BillingAddress, Line-Items, and Note complex elements, as shown in Figure 5-5.

Figure 5-5: A sequence compositor is used to define the Order element.

The sequence compositor defines a strict structural hierarchy that specifies the order in which child elements must occur in an instance document—it is the most commonly used compositor because it defines a straightforward relationship that is easy for both processors and content authors to understand. As the XML Schema designer, you have three distinct compositor models to choose from when specifying the relationships between schema components: sequence, choice, and all. The following sections talk about the two types we haven’t discussed yet: choice and all.

Choice Compositors

The choice compositor allows an instance document author to make a choice of any one option from an enumeration or listing of several allowable options. Take, for example, the following XML Schema code fragment that defines a complex type called Dinner, which may contain either Hamburger or Pizza, but not both.

<xsd:complexType name="Dinner">
   <xsd:choice>
      <xsd:element name="Hamburger" type="xsd:string"/>
      <xsd:element name="Pizza" type="xsd:string"/>
   </xsd:choice>
</xsd:complexType>
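To make the either/or rule concrete, here is a rough sketch of instance fragments. It assumes a hypothetical element named Meal declared with the Dinner type; Meal is not part of the original schema and is used only for illustration.

```xml
<!-- Assumed declaration (hypothetical): <xsd:element name="Meal" type="Dinner"/> -->

<!-- Valid: exactly one of the two alternatives is present -->
<Meal>
   <Pizza>Margherita</Pizza>
</Meal>

<!-- Invalid: both alternatives appear, but the choice permits only one -->
<Meal>
   <Hamburger>Cheeseburger</Hamburger>
   <Pizza>Margherita</Pizza>
</Meal>
```

Because minOccurs defaults to 1 on the choice, an empty Meal element would also be rejected.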

The xsd:choice element accepts minOccurs and maxOccurs attributes, which enable you to build more flexible constructs. Take, for example, the Note element of the Purchase Order Schema, which was meant to convey any additional customer remarks. Note defines three child elements: Emphasis, Underline, and br (an empty element representing a line break). It was meant to allow simple text content such as

<Note>Please use the cheapest ground shipping method available</Note>

but also to be able to handle more complex mixed content such as

<Note>Dear Customer Service,<br/> the Last order arrived <Emphasis> two weeks late</Emphasis> and was <Emphasis>on Fire</Emphasis>!<br/> Please <Underline>expedite</Underline> this order.<br/>Thank You.<br/></Note>

The content just shown contains mixed element and textual content, with child elements appearing anywhere and any number of times. To achieve this result, the Note element was specified using a choice compositor. The Note element definition is shown in the following code:

<xsd:element name="Note">
   <xsd:complexType mixed="true">
      <xsd:choice minOccurs="0" maxOccurs="unbounded">
         <xsd:element name="Underline" type="xsd:string"/>
         <xsd:element name="Emphasis" type="xsd:string"/>
         <xsd:element name="br">
            <xsd:complexType/>
         </xsd:element>
      </xsd:choice>
   </xsd:complexType>
</xsd:element>

First, the Note element is specified to have a mixed content model, which allows it to contain plain text content alongside the previously mentioned child elements. A choice compositor is required so that any one of the listed child elements can appear at each occurrence. Next, by setting the minimum occurrence (minOccurs) of the choice compositor to zero, I allow for the possibility of having no child elements and only text content (because the Note element has mixed content). By setting the maximum occurrence (maxOccurs) of the choice compositor to unbounded, I allow for the possibility of using multiple child elements, appearing in any order, which is the desired result. Figure 5-6 shows the Note element as it is graphically represented in Schema Design view.

Figure 5-6: A choice compositor.

Inserting a choice compositor into your XML Schema is done the same way that you would insert a sequence compositor:

In Schema Design view, select a schema component to edit.
Expand it so that it is displayed graphically in the Schema Editing page.
Select the element or component to which you want to add the choice compositor (such as the Note element). Right-click and select Add Child → Choice.

The choice compositor is visually represented as a switch, implying that only one of several possible choices can be selected. The choice compositor's minimum and maximum occurrence constraints are indicated graphically beneath the compositor. A broken line indicates an optional component, and an infinity symbol indicates that the maximum occurrence is unbounded. If you don't explicitly specify values for minOccurs and maxOccurs on the choice compositor, both default to 1, and the content of the Note element is then limited to optional plain text plus exactly one of the three possible child elements: Underline, Emphasis, or br.
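The effect of leaving the occurrence attributes off can be sketched as follows. This is an illustrative variant of the Note declaration, not a listing from the companion files, and the instance fragments ignore namespace declarations for brevity.

```xml
<!-- Variant of Note: no occurrence attributes on the choice, so both default to 1 -->
<xsd:element name="Note">
   <xsd:complexType mixed="true">
      <xsd:choice>
         <xsd:element name="Underline" type="xsd:string"/>
         <xsd:element name="Emphasis" type="xsd:string"/>
         <xsd:element name="br">
            <xsd:complexType/>
         </xsd:element>
      </xsd:choice>
   </xsd:complexType>
</xsd:element>

<!-- Valid against the variant: text plus exactly one child element -->
<Note>Please <Underline>expedite</Underline> this order.</Note>

<!-- Invalid against the variant: two child elements -->
<Note>Please <Underline>expedite</Underline> this order.<br/></Note>

<!-- Invalid against the variant: no child element at all -->
<Note>Please use the cheapest ground shipping method available</Note>
```

The last fragment shows why minOccurs="0" matters for Note: without it, a plain-text remark would fail validation.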

All Compositors

A compositor of type all is a loosely defined construct that requires all of its child elements to appear in an instance document, in any order. By explicitly specifying a minOccurs value of zero on any of the child elements, you can make them optional, relaxing the default requirement that all of them must appear. For example, the Purchase Order Schema could include an EmergencyContact complex element, which might consist of preferred methods of reaching the customer, ranked in descending order of precedence. Possible options could include HomePhone, WorkPhone, MobilePhone, or Email. Of course, it is unreasonable to expect that every customer possesses home, work, and mobile numbers, as well as an e-mail address. Therefore, I make each child element of the all compositor optional so that customers can include whatever contact methods they want and omit the others. Here is the XML Schema source listing for the EmergencyContact element:

<xsd:element name="EmergencyContact">
   <xsd:complexType>
      <xsd:all>
         <xsd:element name="HomePhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="WorkPhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="MobilePhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="Email" type="xsd:string" minOccurs="0"/>
      </xsd:all>
   </xsd:complexType>
</xsd:element>

By specifying a minimum occurrence of zero (minOccurs="0") on each child element, I allow a valid XML instance document to contain any or all of the listed elements, in any order, at most once per element. The model for an EmergencyContact complex element is shown in Figure 5-7.
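A few instance fragments may help illustrate what the EmergencyContact declaration accepts and rejects. The phone numbers and e-mail address are made-up sample values, and namespace declarations are left out for brevity.

```xml
<!-- Valid: two of the four optional children, in a different order from the schema -->
<EmergencyContact>
   <Email>jane@example.com</Email>
   <HomePhone>555-0100</HomePhone>
</EmergencyContact>

<!-- Valid: every child is optional, so an empty element is allowed -->
<EmergencyContact/>

<!-- Invalid: a child of an all compositor may appear at most once -->
<EmergencyContact>
   <HomePhone>555-0100</HomePhone>
   <HomePhone>555-0101</HomePhone>
</EmergencyContact>
```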

Figure 5-7: An all compositor graphically represented in Schema Design view.

An all compositor is graphically represented in Figure 5-7 as a connection with an equal number of lines going in as there are going out. This graphical representation suggests a relationship in which all nodes participate, and indeed all elements listed in the all compositor must appear in an instance document by default. In Figure 5-7, however, the child elements of the all compositor are depicted using broken lines because I explicitly specified them to be optional. The complete code listing for this all compositor example is in Order_5-07.xsd. Please note that an all compositor must appear as the sole child at the top of a content model; in other words, the following is illegal because the presence of illegalExtraElement means the all compositor is no longer the sole child:

<xsd:element name="EmergencyContact">
   <xsd:complexType>
      <xsd:all>
         <xsd:element name="HomePhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="WorkPhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="MobilePhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="Email" type="xsd:string" minOccurs="0"/>
      </xsd:all>
      <xsd:element name="illegalExtraElement" type="xsd:string"/>
   </xsd:complexType>
</xsd:element>
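One legal way to accommodate an additional element is to declare it inside the all compositor rather than beside it. The sketch below assumes a hypothetical element named PreferredTime, which does not appear in the original schema.

```xml
<xsd:element name="EmergencyContact">
   <xsd:complexType>
      <xsd:all>
         <xsd:element name="HomePhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="WorkPhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="MobilePhone" type="xsd:string" minOccurs="0"/>
         <xsd:element name="Email" type="xsd:string" minOccurs="0"/>
         <!-- Hypothetical extra element, now a child of the all compositor -->
         <xsd:element name="PreferredTime" type="xsd:string"/>
      </xsd:all>
   </xsd:complexType>
</xsd:element>
```

The trade-off is that PreferredTime may then appear anywhere among its siblings. If a fixed position is required, a sequence compositor is the usual alternative.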

Changing Compositor Models

To change a compositor model in Schema Design view, click on any compositor displayed in a Schema Editing page and choose Change Model. The underlying XML Schema syntax will be changed according to your selection, as shown in Figure 5-8.

Figure 5-8: Changing a compositor model in XMLSPY.

Groups

Groups are reusable, named fragments of a content model. By using groups, you can create smaller, more granular assemblages of elements (or attributes, as discussed in the next section) that you then use as building blocks when constructing complex elements. Any section of XML Schema code defined inside the Order, AddressType, Note, or ProductType complex types is a possible candidate for replacement by a group construct. Take, for example, the AddressType complex type definition in the Purchase Order example, which defines a required element Street1 and an optional element Street2, both of type xsd:string. These two elements always appear together and could potentially appear together in other complex type definitions, such as a new credit card account or a driver's license registration. I could, therefore, combine Street1 and Street2 into a single group called Street and then reference that group construct from within the AddressType complex type definition. The schema of Order_5-08.xsd shows how the original purchase order would appear after I make this modification; as usual, I have included the important code fragments in the following listing. You can find the complete source code listing for Order_5-08.xsd on the companion CD.

<!-- Order_5-08.xsd – Using Groups to construct complex types -->
<xsd:schema targetNamespace="http://www.company.com/examples/purchaseorder"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://www.company.com/examples/purchaseorder"
            elementFormDefault="qualified" attributeFormDefault="unqualified">
   <xsd:element name="Order">
      <xsd:complexType>
         <!-- Omitted for Brevity -->
      </xsd:complexType>
   </xsd:element>
   <xsd:complexType name="AddressType">
      <xsd:sequence>
         <xsd:group ref="Street"/>
         <xsd:element name="City" type="xsd:string"/>
         <xsd:element name="State">
            <!-- Omitted for Brevity -->
         </xsd:element>
      </xsd:sequence>
   </xsd:complexType>
   <xsd:complexType name="ProductType">
      <!-- Omitted for Brevity -->
   </xsd:complexType>
   <xsd:element name="Note">
      <!-- Omitted for Brevity -->
   </xsd:element>
   <xsd:group name="Street">
      <xsd:sequence>
         <xsd:element name="Street1" type="xsd:string"/>
         <xsd:element name="Street2" type="xsd:string" minOccurs="0"/>
      </xsd:sequence>
   </xsd:group>
</xsd:schema>

In the Purchase Order Schema of Order_5-08.xsd just shown, I introduced a named xsd:group construct with a name of Street; this named group definition is located at the root level (that is, it is a child of the xsd:schema element) and is, therefore, a global schema component. The named Street group consists of a sequence of two elements, Street1 (required) and Street2 (optional), both of type xsd:string. These two elements must occur in an instance document in the same order as defined in the xsd:group construct. The AddressType definition, in turn, declares an unnamed reference to the group construct, using the same xsd:group syntax. The XML Schema xsd:group element is therefore used both for the definition of a group construct and for subsequent referencing of any named group. An xsd:group element that defines a group structure is called a named group, whereas an xsd:group element that references an existing named group is called an unnamed group. The consequence of this is that a group element cannot contain both a name and a reference. You can use a model group to define a set of elements that can be repeated throughout the document. The Purchase Order Schema of Order_5-08.xsd is functionally equivalent to the Purchase Order Schema of Order_5-01.xsd and will validate instance documents in the same fashion.
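One way to picture the group reference is to write out what it stands for. For validation purposes, the AddressType definition behaves roughly as if the members of the Street group were declared in place, as in this sketch (the omitted portions remain omitted, as in the listing above):

```xml
<xsd:complexType name="AddressType">
   <xsd:sequence>
      <!-- These two declarations are what <xsd:group ref="Street"/> stands for -->
      <xsd:element name="Street1" type="xsd:string"/>
      <xsd:element name="Street2" type="xsd:string" minOccurs="0"/>
      <xsd:element name="City" type="xsd:string"/>
      <!-- Remaining declarations omitted for brevity -->
   </xsd:sequence>
</xsd:complexType>
```

The group itself never shows up in an instance document; only the Street1 and Street2 elements do.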

Creating a group using Schema Design view is very similar to creating a global element or global complex type definition. From the Schema Overview page, you click the Add New Schema Component button (the second button from the left in the top-left corner) and select Group as shown in Figure 5-9.

Figure 5-9: Adding a group component to an XML Schema.

Note

All top-level or global schema components such as globally defined elements, complex types, simple types, and groups (including attribute groups discussed in the next section) can be added to an XML Schema from the Schema Overview page.

After you have added the new group to the Schema Overview page, type in Street as the name of the component; then expand the component (by clicking on the tree-button adjacent to the component name) and continue editing the component in the Schema Editing page. Groups are represented in the XML Schema Editing page as an octagon-like shape with the name of the group inside of the octagon. To finish defining the Street named group component, follow these steps:

Expand and select the Street named group component, right-click, and choose Add Child → Sequence.
Select the sequence compositor, right-click, and choose Add Child → Element.
Name the new element Street1 and assign it to be of type xsd:string from the Details window.
Repeat this for the Street2 element, but also make it an optional element by specifying minOccurs (minimum occurrence) equal to zero.

The completed Street named group component should resemble the diagram shown in Figure 5-10.

Figure 5-10: Editing a named group construct using the Schema Design view.

Next, you need to modify the AddressType definition so that it references the newly created named Street group instead of declaring the two elements Street1 and Street2 directly. To modify the definition, follow these steps:

Expand the AddressType complex type definition and delete the old Street1 and Street2 elements.
Right-click the sequence compositor directly beneath the AddressType node and choose Add Child → Group.
Double-click the group octagon and choose Street from the drop-down list box.

The modified AddressType definition, which references the newly defined Street named group, should appear as shown in Figure 5-11.

Figure 5-11: Developing complex type definitions using groups as building blocks.

To a certain extent, named groups mimic the functionality of parameter entities in DTDs, which allow for macro-like textual substitution when defining a DTD. In XML Schemas, however, groups go one step further: they allow you to specify cardinality constraints on the number of times the elements belonging to a group may appear within an instance document. You do this by providing appropriate values for the minOccurs and maxOccurs attributes on the group reference. For example, consider the following unnamed group reference:

<xsd:group ref="mygroup" minOccurs="1" maxOccurs="5"/>

This code requires the hypothetical grouping of elements named mygroup to occur in an instance document at least once and up to a maximum of five times. Until now, you have seen how a named group can specify how a sequence of elements should appear within a complex type definition. Keep in mind that any group constructs may include any of the compositor models: sequence, choice, and all, covered earlier in this section. Thus, you can control or specify patterns of different element orderings and configurations. The next two sections present an example of how to use the choice and all compositors in a group.
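As a rough illustration of a repeated group reference, the sketch below fills in a possible definition for mygroup and an element that uses it. The names Key, Value, and Settings are hypothetical, and the fragment assumes that unprefixed references resolve to the schema's own namespace.

```xml
<!-- Hypothetical named group: a Key element followed by a Value element -->
<xsd:group name="mygroup">
   <xsd:sequence>
      <xsd:element name="Key" type="xsd:string"/>
      <xsd:element name="Value" type="xsd:string"/>
   </xsd:sequence>
</xsd:group>

<!-- Hypothetical element whose content repeats the group one to five times -->
<xsd:element name="Settings">
   <xsd:complexType>
      <xsd:sequence>
         <xsd:group ref="mygroup" minOccurs="1" maxOccurs="5"/>
      </xsd:sequence>
   </xsd:complexType>
</xsd:element>

<!-- Valid instance fragment: the Key/Value pair occurs twice -->
<Settings>
   <Key>color</Key>
   <Value>blue</Value>
   <Key>size</Key>
   <Value>large</Value>
</Settings>
```

Each repetition must contain the whole group in order, so a Key without its Value would be rejected.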

Choice Group

A choice group is simply a named group that contains a choice compositor. The choice compositor, discussed earlier in this section, allows only one of its children to appear in an instance document for each occurrence of the compositor. In the preceding discussion of compositors, I used a choice compositor in the definition of Note (Order_5-07.xsd). This choice compositor can be made into a named choice group so that other element constructs requiring the same structure can reference the group. The following listing shows the code for a named group called Paragraph that contains a choice of child elements: Emphasis, Underline, and br. XML Schema does not permit minOccurs or maxOccurs on the compositor that sits directly inside a named group definition, so the listing leaves the occurrence constraints to the group reference:

<xsd:group name="Paragraph">
   <xsd:choice>
      <xsd:element name="Emphasis" type="xsd:string"/>
      <xsd:element name="Underline" type="xsd:string"/>
      <xsd:element name="br">
         <xsd:complexType/>
      </xsd:element>
   </xsd:choice>
</xsd:group>

The complete XML Schema for this example is located in Order_5-08.xsd. The Paragraph group can subsequently be referenced by the Note element. Specifying a minOccurs of zero and a maxOccurs of unbounded on the reference allows the listed child elements to appear any number of times and in any order, as shown in the following code:

<xsd:element name="Note">
   <xsd:complexType mixed="true">
      <xsd:group ref="Paragraph" minOccurs="0" maxOccurs="unbounded"/>
   </xsd:complexType>
</xsd:element>

Figure 5-12 shows the global element Note, which is defined to be a complex type with mixed content. The Note definition contains a single child node (the octagon), which is a visual representation for an unnamed group that references the newly defined Paragraph group.

Figure 5-12: A Note element that references a choice group.
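The benefit of the named group shows when a second element needs the same inline markup. The sketch below assumes a hypothetical element named Remark, which is not part of the Purchase Order Schema, and places the occurrence constraints on the group reference.

```xml
<!-- Hypothetical second element that reuses the Paragraph group -->
<xsd:element name="Remark">
   <xsd:complexType mixed="true">
      <xsd:group ref="Paragraph" minOccurs="0" maxOccurs="unbounded"/>
   </xsd:complexType>
</xsd:element>

<!-- Instance fragment that the Remark declaration is intended to accept -->
<Remark>Deliver <Emphasis>before noon</Emphasis>.<br/></Remark>
```

Any later change to the Paragraph group, such as adding another inline element, would then apply to both Note and Remark.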

All Group

An all group is simply a group that uses an all compositor to constrain its contents. As previously explained, an all compositor requires by default that every child element defined in the group appear exactly once (unless you specify minOccurs="0"), in any order. The all compositor must be the top-level compositor of the group, and its children must all be individual elements (no nested groups permitted). Here is the EmergencyContact global element expressed as a named group instead:

<xsd:group name="EmergencyContact">
   <xsd:all>
      <xsd:element name="HomePhone" type="xsd:string" minOccurs="0"/>
      <xsd:element name="MobilePhone" type="xsd:string" minOccurs="0"/>
      <!-- Omitted -->
   </xsd:all>
</xsd:group>

The complete code listing for the EmergencyContact group can be found in the Order_5-08.xsd file. The EmergencyContact named all group is visually represented as a group octagon having an all compositor, as shown in Figure 5-13:

Figure 5-13: An all group, represented in XML Schema Design view.
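A reference to the all group might look like the following sketch. It assumes the EmergencyContact element is redefined to point at the group; elements and groups are named independently, so reusing the name is permitted. Under XML Schema 1.0, the reference has to be the only particle in the complex type's content model.

```xml
<xsd:element name="EmergencyContact">
   <xsd:complexType>
      <!-- The all group reference is the sole content of the type -->
      <xsd:group ref="EmergencyContact"/>
   </xsd:complexType>
</xsd:element>
```

Wrapping this reference in a sequence or placing other elements beside it would run into the same restriction described for the all compositor earlier.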

Attribute groups

You can use an attribute group to cluster common attributes, which may potentially appear in numerous complex elements. Attribute groups in XML Schemas function in a similar manner to parameter entities in DTDs—they improve readability and maintainability. In the Purchase Order Schema, the ProductType definition contains two attributes: id and department. You can combine these as a named attribute group as shown in the following code:

<xsd:attributeGroup name="CommonAttributes">
   <xsd:attribute name="id" type="xsd:integer" use="required"/>
   <xsd:attribute name="department" type="xsd:string" use="optional"/>
</xsd:attributeGroup>

You can then reference the attribute group from within any complex type as follows:

<xsd:complexType name="ProductType">
   <xsd:sequence>
      <xsd:element name="Description" type="xsd:string"/>
      <!-- Omitted for Brevity -->
   </xsd:sequence>
   <xsd:attributeGroup ref="CommonAttributes"/>
</xsd:complexType>
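To suggest how the attribute group pays off, the sketch below reuses CommonAttributes in a second type and shows an instance fragment. The ServiceType name, the Product element, and the sample values are assumptions made for illustration; none of them come from the Purchase Order Schema.

```xml
<!-- Hypothetical second type that picks up id and department from the same group -->
<xsd:complexType name="ServiceType">
   <xsd:sequence>
      <xsd:element name="Description" type="xsd:string"/>
   </xsd:sequence>
   <xsd:attributeGroup ref="CommonAttributes"/>
</xsd:complexType>

<!-- Instance fragment, assuming an element Product of type ProductType -->
<Product id="1001" department="Hardware">
   <Description>Cordless drill</Description>
   <!-- Remaining children omitted -->
</Product>
```

Because id is declared with use="required", omitting it from Product would be a validation error, whereas department may be left out.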

You can edit attribute groups from the Schema Overview page. Click the Add New Global Schema Component button located at the top-left of Figure 5-14 and choose Attribute Group. The definition of an attribute group is done from the Attribute window shown at the bottom of Figure 5-14.

Figure 5-14: Editing attribute groups in the Schema Overview page.