Section 21.2.  Defining user-derived datatypes

The XML Schema definition language (XSDL) provides a facility for defining user-derived datatypes, using the simpleType definition element.

The element must contain a child element for the desired form of derivation: list, union, or restriction.

derivation by list

Derivation by list is quite straightforward. You merely take an existing datatype (like gYear) and create a new datatype that accepts a list of them (we could call it gYears).

derivation by union

This sort of derivation merges two or more datatypes into one that supports values of any of them. For instance, a datatype representing months might allow either gMonth values (such as --05) or names (such as May) by deriving by union from Name and gMonth.

derivation by restriction

This form of derivation narrows down the values allowed by an existing datatype. For instance you could take the datatype gYear and restrict it to years in a particular century.
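The three forms might be sketched as follows. This is only an illustration: the datatype names gYears, month, and year1900s are our own, and the century chosen for the restriction is arbitrary. A gMonth value is written like --05, which cannot be mistaken for a Name such as May.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Derivation by list: whitespace-separated gYear values -->
  <xsd:simpleType name="gYears">
    <xsd:list itemType="xsd:gYear"/>
  </xsd:simpleType>

  <!-- Derivation by union: either a gMonth value or a Name -->
  <xsd:simpleType name="month">
    <xsd:union memberTypes="xsd:gMonth xsd:Name"/>
  </xsd:simpleType>

  <!-- Derivation by restriction: gYear limited to 1900 through 1999
       (an arbitrary choice of century) -->
  <xsd:simpleType name="year1900s">
    <xsd:restriction base="xsd:gYear">
      <xsd:minInclusive value="1900"/>
      <xsd:maxInclusive value="1999"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```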

The child element might optionally be preceded by an annotation element, as shown in Example 21-1.

Note

Annotation elements can hold more than documentation, and XSDL elements in general have other properties that support self-documentation and application processing of schema definitions. As these facilities are not germane to deriving datatypes for general use, they are discussed in 22.3.2, "Schema components", on page 472.




Example 21-1. A simpleType definition
 <xsd:simpleType name="mynewType">
     <xsd:annotation>
         <xsd:documentation>Important new type</xsd:documentation>
     </xsd:annotation>
     <xsd:restriction base="someOtherType">
         ...
     </xsd:restriction>
 </xsd:simpleType>

21.2.1 Derivation by list

You derive by list when you want to allow occurrences of the datatype in the document to contain multiple values rather than just one. For instance to allow multiple dates you could derive a dates datatype[4] from the primitive date datatype, as in Example 21-2.

[4] XSDL would call this a simple type, rather than a datatype, but in this chapter it is clearer to call it a datatype.

Caution

Items in lists are always separated by whitespace. If a datatype allows embedded whitespace (such as string), deriving a list from it may produce unexpected results.
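To see the kind of surprise this caution has in mind, consider the following sketch. The names cityNames and cities are hypothetical and chosen only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- A list whose items are strings; strings may themselves contain spaces -->
  <xsd:simpleType name="cityNames">
    <xsd:list itemType="xsd:string"/>
  </xsd:simpleType>

  <!-- Hypothetical element that uses the list datatype -->
  <xsd:element name="cities" type="cityNames"/>

</xsd:schema>
```

With this definition, the content of <cities>New York Boston</cities> is a list of three items (New, York, and Boston), not the two city names the author of the document probably intended.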




Example 21-2. Deriving a list from a named datatype
 <xsd:simpleType name="dates">
   <xsd:annotation>
     <xsd:documentation>Multiple dates</xsd:documentation>
   </xsd:annotation>
   <xsd:list itemType="xsd:date"/>
 </xsd:simpleType>

In Example 21-2, the list was derived from an existing named datatype. It is also possible to derive a list from a newly-defined anonymous datatype. Rather than putting an itemType attribute on the list, you would merely put a simpleType element in the list element content.

The embedded simpleType will be used just as if you had defined it elsewhere, given it a name and referred to it with the itemType attribute. Example 21-3 demonstrates.

Example 21-3. Deriving a list from an anonymous datatype
 <xsd:simpleType name="pubDates">
   <xsd:list>
     <xsd:simpleType>
         ... derivation of pubDate ...
     </xsd:simpleType>
   </xsd:list>
 </xsd:simpleType>
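For a complete instance of the same technique, here is a separate sketch with the anonymous datatype written out in full. It does not attempt to reproduce the pubDate derivation elided above; the name recentDates and the cutoff date are assumptions made purely for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:simpleType name="recentDates">
    <xsd:list>
      <!-- Anonymous item datatype: a date no earlier than 2000-01-01
           (the bound is illustrative only) -->
      <xsd:simpleType>
        <xsd:restriction base="xsd:date">
          <xsd:minInclusive value="2000-01-01"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:list>
  </xsd:simpleType>

</xsd:schema>
```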

21.2.2 Derivation by union

Derivation by union is a way of combining existing datatypes into one datatype. For instance you might want to allow dates to be specified according to any one of:

  • the built-in Gregorian date datatype,

  • a notation based on the ancient Aztec calendar, or

  • the Hebrew calendar.

If we define our datatypes so that they are syntactically distinct then we can easily create a union of them. One way to easily distinguish Aztec and Hebrew dates is to start them with the letters A and H, respectively, because we know that built-in dates start with a number. Example 21-4 demonstrates such a union datatype.

Example 21-4. Union datatype
 <xsd:simpleType name="AnyKindaDate">
   <!-- we could put an annotation here -->
   <xsd:union memberTypes="myns:AztecDate myns:HebrewDate xsd:date">
       <!-- we could put another annotation here -->
   </xsd:union>
 </xsd:simpleType>

If we carelessly defined our notations so that some Aztec dates could also be recognized as Hebrew or Gregorian dates, then the Aztec interpretation would win over the other ones because it is listed first in the memberTypes attribute.
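One way the two member datatypes might be kept syntactically distinct is with a pattern on each. The notations themselves are not defined in this chapter, so the patterns below are placeholders that capture only the leading letter, and the namespace URI is hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:myns="http://example.com/myns"
            targetNamespace="http://example.com/myns">

  <!-- Placeholder: an A followed by one or more non-whitespace characters -->
  <xsd:simpleType name="AztecDate">
    <xsd:restriction base="xsd:token">
      <xsd:pattern value="A\S+"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Placeholder: an H followed by one or more non-whitespace characters -->
  <xsd:simpleType name="HebrewDate">
    <xsd:restriction base="xsd:token">
      <xsd:pattern value="H\S+"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```

Because no value can start with both A and H, and no built-in date starts with either letter, the order of the member datatypes in the union would not affect how any value is interpreted.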

Instead of, or in addition to, a memberTypes attribute, you could also put simpleType elements within the union to define anonymous in-line datatypes, just as we did for lists. Unlike a list, however, which is derived from a single datatype, a union can contain as many anonymous datatypes as we like. Each of them contributes to the union just as if it had been defined externally and referenced.
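A sketch of a union that mixes both styles follows. The name dateOrPlaceholder, the keyword values, and the year bound are all hypothetical. Member datatypes named in memberTypes are tried first, followed by the anonymous ones in the order they appear.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:simpleType name="dateOrPlaceholder">
    <xsd:union memberTypes="xsd:date">
      <!-- First anonymous member: a fixed set of keywords -->
      <xsd:simpleType>
        <xsd:restriction base="xsd:token">
          <xsd:enumeration value="unknown"/>
          <xsd:enumeration value="undated"/>
        </xsd:restriction>
      </xsd:simpleType>
      <!-- Second anonymous member: a bare year, 1900 or later -->
      <xsd:simpleType>
        <xsd:restriction base="xsd:gYear">
          <xsd:minInclusive value="1900"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:union>
  </xsd:simpleType>

</xsd:schema>
```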

If we wanted to, we could define a union of two different kinds of lists. Perhaps we would allow a list of Hebrew dates or a list of Aztec dates. Conversely, we could define a list of a union datatype. For instance, a list derived from AnyKindaDate would be a list of a union.
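Both combinations might look like the sketch below. The stand-in definitions of AztecDate and HebrewDate use placeholder patterns, and the namespace URI and the names AztecOrHebrewDates and AnyKindaDates are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:myns="http://example.com/myns"
            targetNamespace="http://example.com/myns">

  <!-- Stand-in member datatypes; the patterns are placeholders -->
  <xsd:simpleType name="AztecDate">
    <xsd:restriction base="xsd:token"><xsd:pattern value="A\S+"/></xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="HebrewDate">
    <xsd:restriction base="xsd:token"><xsd:pattern value="H\S+"/></xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="AnyKindaDate">
    <xsd:union memberTypes="myns:AztecDate myns:HebrewDate xsd:date"/>
  </xsd:simpleType>

  <!-- A union of two lists: all Aztec dates or all Hebrew dates -->
  <xsd:simpleType name="AztecOrHebrewDates">
    <xsd:union>
      <xsd:simpleType><xsd:list itemType="myns:AztecDate"/></xsd:simpleType>
      <xsd:simpleType><xsd:list itemType="myns:HebrewDate"/></xsd:simpleType>
    </xsd:union>
  </xsd:simpleType>

  <!-- A list of a union: each item may be Aztec, Hebrew, or Gregorian -->
  <xsd:simpleType name="AnyKindaDates">
    <xsd:list itemType="myns:AnyKindaDate"/>
  </xsd:simpleType>

</xsd:schema>
```

The difference shows up with mixed content. Under these placeholder patterns, the value "A1 H2 2003-05-01" conforms to AnyKindaDates, because each item is checked against the union separately. It does not conform to AztecOrHebrewDates, because the whole value must be a list of one kind or the other.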

Many complex combinations of different kinds of derivation are possible.

21.2.3 Deriving datatypes by restriction

Another way to derive a datatype is by restriction. In this case, you add constraints to an existing datatype, either built-in or user-derived.

By definition, a datatype derived by restriction is more constraining than its base datatype. Any value that conforms to the new derived datatype would also have to conform to the original base datatype.

Restrictions are created with the restriction element. You may have a single restriction as a child of the simpleType element. You can refer to a named base datatype with the base attribute, or define an anonymous base datatype by including a simpleType element in the content of the restriction element.
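The two ways of identifying the base datatype can be sketched as follows. The names smallCount and shortYearList and the limits are hypothetical; the maxInclusive and maxLength elements are examples of the constraining facets described next.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Named base datatype, referenced with the base attribute -->
  <xsd:simpleType name="smallCount">
    <xsd:restriction base="xsd:nonNegativeInteger">
      <xsd:maxInclusive value="99"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Anonymous base datatype, defined in the content of the restriction.
       For a list, maxLength limits the number of items. -->
  <xsd:simpleType name="shortYearList">
    <xsd:restriction>
      <xsd:simpleType>
        <xsd:list itemType="xsd:gYear"/>
      </xsd:simpleType>
      <xsd:maxLength value="3"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```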

There is a fixed list of ways that you can constrain a given datatype. These are called constraining facets. There are twelve of them, each represented by a specific element type. They occur as sub-elements of the restriction element. The allowable ones depend on the base datatype.

They have several points in common:

  • They all allow a single optional annotation sub-element and no other.

  • Each has a required value attribute that specifies the constraining value. The exact meaning and allowable values of the attribute depend on the specific facet and the base datatype.

  • They work together.

The last item requires some explanation.

If you define a datatype with a minimum value and then refine it by restriction to make a new datatype with a maximum value, the new datatype has both the minimum value constraint from the base datatype and the maximum value constraint from the derived datatype. If you defined yet a third datatype by restriction based on the new datatype, it would add still other restrictions.
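As a sketch of this accumulation, with hypothetical names and limits, three datatypes might be chained like this:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- First level: a minimum value only -->
  <xsd:simpleType name="atLeast100">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Second level: adds a maximum; the minimum of 100 still applies -->
  <xsd:simpleType name="from100To500">
    <xsd:restriction base="atLeast100">
      <xsd:maxInclusive value="500"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Third level: adds a pattern; both bounds still apply -->
  <xsd:simpleType name="tensFrom100To500">
    <xsd:restriction base="from100To500">
      <!-- Only digits, ending in 0 -->
      <xsd:pattern value="\d*0"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```

A value such as 250 conforms to all three datatypes, 255 only to the first two, and 750 only to the first.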

This principle applies even when the same facet is applied at two different levels. So you could define a datatype that represents the set of numbers greater than 10,000 by setting a minimum value constraint. In a datatype derived from that you could set another minimum value constraint raising the minimum value to 20,000.

Values must now be both greater than 10,000 and greater than 20,000, which is the same as saying just that they must be greater than 20,000. In one sense both restrictions apply, but really the more constraining derived datatype overrides the base datatype.[5]

[5] There is a way to prevent such overrides, but as it is rarely needed, we don't cover it here.
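The two-level minimum described above could be written as in the following sketch. The names over10000 and over20000 are our own, and minExclusive is used because the text speaks of values greater than the limit.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Base datatype: integers greater than 10,000 -->
  <xsd:simpleType name="over10000">
    <xsd:restriction base="xsd:integer">
      <xsd:minExclusive value="10000"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Derived datatype: the same facet again, raising the minimum to 20,000 -->
  <xsd:simpleType name="over20000">
    <xsd:restriction base="over10000">
      <xsd:minExclusive value="20000"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```

The derived value must be at least as constraining as the one it replaces. Trying to lower the minimum to 5,000 in the derived datatype would not be a restriction and would be rejected as a schema error.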

XML in Office 2003: Information Sharing with Desktop XML
ISBN: 013142193X
Year: 2003
Pages: 176
