Types in XQuery

Up to now, we have not spent much time discussing types, but the type system of XQuery is one of the most eclectic, unusual, and useful aspects of the language. XML documents contain a wide range of type information, from very loosely typed information without even a DTD, to rigidly structured data corresponding to relational data or objects. A language designed for processing XML must be able to deal with this fact gracefully; it must avoid imposing assumptions on what is allowed that conflict with what is actually found in the data, allow data to be managed without forcing the programmer to cast values frequently, and allow the programmer to focus on the documents being processed and the task to be performed rather than the quirks of the type system.

Consider the range of XML documents that XQuery must be able to process gracefully:

  • XML may be strongly typed and governed by a W3C XML Schema, and a strongly typed query language with static typing can prevent many errors for this kind of data.

  • XML may be governed by another schema language, such as DTDs or RELAX-NG.

  • XML may have an ad hoc structure and no schema, and the whole reason for performing a query may be to discover the structure found in a document. For this kind of data, the query language should be able to process whatever data exists, with no preconceived notions of what should be there.

  • XML may be used as a view of another system, such as a relational database. These systems are typically strongly typed, but do not use W3C XML Schema as the basis for their type system. Fortunately, standard mappings are emerging for some systems, such as SQL's mappings from relational schemas to W3C XML Schema. These are defined in the SQL/XML proposal, which provides standard XML extensions to SQL [SQLXML].

  • XML data sources may have very complex structure, and expressions in XQuery must be well defined in terms of all the structures to which the operands can evaluate.

To meet these requirements, XQuery allows programmers to write queries that rely on very little type information, that take advantage of type information at run-time, or that take advantage of type information to detect potential errors before a query is ever executed. Chapter 4 provides a tutorial-like look at the topic of static typing in XQuery. Chapter 2, "Influences on the Design of XQuery," looks at the intricacies of some of the typing-related issues that members of the Working Group had to resolve.

Introduction to XQuery Types

The type system of XQuery is based on [SCHEMA]. There are two sets of types in XQuery: the built-in types that are available in any query, and types imported into a query from a specific schema. We will illustrate this with a series of functions that use increasing amounts of type information. XQuery specifies a conformance level called Basic XQuery, which is required for all implementations and allows two extensions: the schema import feature allows a W3C XML Schema to be imported in order to make its definitions available to the query, and the static typing feature allows a query to be compared to the imported schemas in order to catch errors without needing to access data. We will start with uses of types that are compatible with Basic XQuery. As we explore functions that require more type information, we will note where schema import and static typing are needed.

The first function returns a sequence of items in reverse order. The function definition does not specify the type of the parameter or the return type, which means that they may be any sequence of items:

 define function reverse($items)
 {
   let $count := count($items)
   for $i in 0 to $count - 1
   return $items[$count - $i]
 }

 reverse(1 to 5)

This function uses the to operator, which generates sequences of integers. For instance, the expression 1 to 5 generates the sequence 1, 2, 3, 4, 5. The reverse function takes this sequence and returns the sequence 5, 4, 3, 2, 1. Because this function does not specify a particular type for its parameter or return, it could also be used to return a sequence of some other type, such as a sequence of elements. Specifying more type information would make this function less useful.

Some functions take advantage of the known structures in XML or the built-in types of W3C XML Schema but need no advance knowledge of any particular schema. The following function tests an element to see if it is the top-level element found in a document. If it is, then its parent node will be the document node, and the expression $e/.. instance of document-node() will be true when evaluated for that node. The parameter type is element(), since this test is only defined for elements, and the return type is xs:boolean, which is a predefined type in XQuery and is the type of Boolean values:

 define function is-document-element($e as element())
   as xs:boolean
 {
   if ($e/.. instance of document-node())
     then true()
     else false()
 }

All the built-in XML Schema types are predefined in XQuery, and these can be used to write function signatures similar to those found in conventional programming languages. For instance, the query in Listing 1.21 defines a function that computes the nth Fibonacci number and calls that function to create the first ten values of the Fibonacci sequence.

Listing 1.21 Query to Create the First Ten Fibonacci Numbers
 define function fibo($n as xs:integer)
 {
   if ($n = 0)
   then 0
   else if ($n = 1)
   then 1
   else (fibo($n - 1) + fibo($n - 2))
 }

 let $seq := 1 to 10
 for $n in $seq
 return <fibo n="{$n}">{ fibo($n) }</fibo>

Listing 1.22 shows the output of that query.

Listing 1.22 Results of the Query in Listing 1.21
 <fibo n="1">1</fibo>
 <fibo n="2">1</fibo>
 <fibo n="3">2</fibo>
 <fibo n="4">3</fibo>
 <fibo n="5">5</fibo>
 <fibo n="6">8</fibo>
 <fibo n="7">13</fibo>
 <fibo n="8">21</fibo>
 <fibo n="9">34</fibo>
 <fibo n="10">55</fibo>

Schemas and Types

On several occasions, we have mentioned that XQuery can work with untyped data, strongly typed data, or mixtures of the two. If a document is governed by a DTD or has no schema at all, then documents contain very little type information, and queries rely on a set of rules to infer an appropriate type when they encounter values at run-time. For instance, the following query computes the average price of a book in our bibliography data:

 avg( doc("books.xml")/bib/book/price ) 

Since the bibliography does not have a schema, each price element is untyped. The avg() function requires a numeric argument, so it converts each price to a double and then computes the average. The conversion rules are discussed in detail in a later section. The implicit conversion is useful when dealing with untyped data, but prices are generally best represented as decimals rather than floating-point numbers. Later in this chapter we will present a schema for the bibliography in order to add appropriate type information. The schema declares price to be a decimal, so the average would be computed using decimal numbers.
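
As a minimal sketch of the difference, the conversion can also be written out by hand. The following variant is not part of the original example; it assumes the same books.xml file and applies the xs:decimal constructor to each untyped price, so that the average is computed with decimals rather than doubles:

```xquery
(: Convert each untyped price to xs:decimal explicitly, then average. :)
avg(
  for $p in doc("books.xml")/bib/book/price
  return xs:decimal($p)
)
```

If any price is not a valid lexical form for a decimal, the constructor raises an error instead of producing a value.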

Queries do not need to import schemas to be able to use built-in types found in data: if a document contains built-in types, the data model preserves type information and allows queries to access it. If we use the same query we used before to compute the average price, it will now compute the average as a decimal. This means that even Basic XQuery implementations, which are not able to import a schema, are able to use simple types found in the data. However, if a query uses logic that is related to the meaning of a schema, it is generally best to import the schema. This can only be done if an implementation supports the schema import feature. Consider the following function, which is similar to one discussed earlier:

 define function books-by-author($author)
 {
   for $b in doc("books.xml")/bib/book
   where some $ba in $b/author satisfies
          ($ba/last=$author/last and $ba/first=$author/first)
   order by $b/title
   return $b/title
 }

Because this function does not specify what kind of element the parameter should be, it can be called with any element at all. For instance, a book element could be passed to this function. Worse yet, the query would not return an error, but would simply search for books containing an author element that exactly matches the book. Since such a match never occurs, this function always returns the empty sequence if called with a book element.

If an XQuery implementation supports the schema import feature, we can ensure that an attempt to call this function with anything but an author element would raise a type error. Let's assume that the namespace of this schema is "urn:examples:xmp:bib". We can import this schema into a query and then use the element and attribute declarations and type definitions of the schema in our query, as shown in Listing 1.23.

Listing 1.23 Schema Import and Type Checking
 import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
 default element namespace = "urn:examples:xmp:bib"

 define function books-by-author($a as element(author))
   as element(title)*
 {
   for $b in doc("books.xml")/bib/book
   where some $ba in $b/author satisfies
          ($ba/last=$a/last and $ba/first=$a/first)
   order by $b/title
   return $b/title
 }

In XQuery, a type error is raised when the type of an expression does not match the type required by the context in which it appears. For instance, given the previous function definition, the function call in the following expression raises a type error, since an element named book can never be a valid author element:

 for $b in doc("books.xml")/bib/book
 return books-by-author($b)

All XQuery implementations are required to detect type errors, but some implementations detect them before a query is executed, and others detect them at run-time when query expressions are evaluated. The process of analyzing a query for type errors before a query is executed is called static typing, and it can be done using only the imported schema information and the query itself; no data is needed for static typing. In XQuery, static typing is an optional feature, but an implementation that supports static typing must always detect type errors statically, before a query is executed.

The previous example sets the default namespace for elements to the namespace defined by the schema. This allows the function to be written without namespace prefixes for the names in the paths. Another way to write this query is to assign a namespace prefix as part of the import and use it explicitly for element names. The query in Listing 1.24 is equivalent to the previous one.

Listing 1.24 Assigning a Namespace Prefix in Schema Imports
 import schema namespace b = "urn:examples:xmp:bib"
   at "c:/dev/schemas/eg/bib.xsd"

 define function books-by-author($a as element(b:author))
   as element(b:title)*
 {
   for $b in doc("books.xml")/b:bib/b:book
   where some $ba in $b/b:author satisfies
          ($ba/b:last=$a/b:last and $ba/b:first=$a/b:first)
   order by $b/b:title
   return $b/b:title
 }

When an element is created, it is immediately validated if there is a schema definition for its name. For instance, the following query raises an error because the schema definition says that a book must have a price:

 import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
 default element namespace = "urn:examples:xmp:bib"

 <book year="1994">
   <title>Catamaran Racing from Start to Finish</title>
   <author><last>Berman</last><first>Phil</first></author>
   <publisher>W.W. Norton &amp; Company</publisher>
 </book>

The schema import feature reduces errors by allowing queries to specify type information, but these errors are not caught until data with the wrong type information is actually encountered when executing a query. A query processor that implements the static typing feature can detect some kinds of errors by comparing a query to the imported schemas, which means that no data is required to find these errors. Let's modify our query somewhat and introduce a spelling error: $a/first is misspelled as $a/firt in Listing 1.25.

Listing 1.25 Query with a Spelling Error
 import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
 default element namespace = "urn:examples:xmp:bib"

 define function books-by-author($a as element(author))
   as element(title)*
 {
   for $b in doc("books.xml")/bib/book
   where some $ba in $b/author satisfies
          ($ba/last=$a/last and $ba/first=$a/firt)
   order by $b/title
   return $b/title
 }

An XQuery implementation that supports static typing can detect this error, because it has the definition for an author element, the function parameter is identified as such, and the schema says that an author element does not have a firt element. In an implementation that has schema import but not static typing, a query would actually have to call the function before the error would be raised.

However, in the following path expression, only the names of elements are stated:

 doc("books.xml")/bib/book 

XQuery allows element tests and attribute tests, node tests that are similar to the type declaration used for function parameters. In a path expression, the node test element(book) finds only elements with the same type as the globally declared book element, which must be found in the schemas that have been imported into the query. By using this instead of the name test book in the path expression, we can tell the query processor the element definition that will be associated with $b, which means that the static type system can guarantee that $b will contain title elements; see Listing 1.26.

Listing 1.26 Type Tests in Path Expressions
 import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
 default element namespace = "urn:examples:xmp:bib"

 define function books-by-author($a as element(author))
   as element(title)*
 {
   for $b in doc("books.xml")/bib/element(book)
   where some $ba in $b/author satisfies
          ($ba/last=$a/last and $ba/first=$a/first)
   order by $b/title
   return $b/title
 }

Sequence Types

The preceding examples include several queries in which the names of types use a notation that can describe the types that arise in XML documents. Now we need to learn that syntax in some detail. Values in XQuery, in general, are sequences, so the types used to describe them are called sequence types. Some types are built in and may be used in any query without importing a schema into the query. Other types are defined in W3C XML Schemas and must be imported into a query before they can be used.

Built-in Types

If a query has not imported a W3C XML Schema, it still understands the structure of XML documents, including types like document, element, attribute, node, text node, processing instruction, comment, ID, IDREF, IDREFS, etc. In addition to these, it understands the built-in W3C XML Schema simple types.

Table 1.4 lists the built-in types that can be used as sequence types.

In the notation for sequence types, occurrence indicators may be used to indicate the number of items in a sequence. The character ? indicates zero or one items, * indicates zero or more items, and + indicates one or more items. Here are some examples of sequence types with occurrence indicators:

 element()+            One or more elements
 xs:integer?           Zero or one integers
 document-node()*      Zero or more document nodes

Table 1.4. Built-in Types That Can Be Used as Sequence Types

 Sequence Type Declaration                   What It Matches
 element()                                   Any element node
 attribute()                                 Any attribute node
 document-node()                             Any document node
 node()                                      Any node
 text()                                      Any text node
 processing-instruction()                    Any processing instruction node
 processing-instruction("xml-stylesheet")    Any processing instruction node whose target is xml-stylesheet
 comment()                                   Any comment node
 empty()                                     An empty sequence
 item()                                      Any node or atomic value
 QName                                       An instance of a specific XML Schema built-in type, identified by the name of the type; e.g., xs:string, xs:boolean, xs:decimal, xs:float, xs:double, xs:anyType, xs:anySimpleType

When mapping XML documents to the XQuery data model, any element that is not explicitly given a simple or complex type by schema validation has the type xs:anyType. Any attribute that is not explicitly given a simple type by schema validation has the type xdt:untypedAtomic. If a document uses simple or complex types assigned by W3C XML Schema, these are preserved in the data model.

Types from Imported Schemas

Importing a schema makes its types available to the query, including the definitions of elements and attributes and the declarations of complex types and simple types. We now present a schema for bibliographies, defining types that can be leveraged in the queries we use in the rest of this chapter. To support some of the examples, we have added an attribute that contains the ISBN number for each book, and have moved the publication year to an element. Listing 1.27 shows this schema; its relevant portions are explained later in this section.

Listing 1.27 An Imported Schema for Bibliographies
 <?xml version="1.0"?>
 <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
     xmlns:bib="urn:examples:xmp:bib"
     targetNamespace="urn:examples:xmp:bib"
     elementFormDefault="qualified">

   <xs:element name="bib">
     <xs:complexType>
       <xs:sequence>
         <xs:element ref="bib:book" minOccurs="0"
                     maxOccurs="unbounded"/>
       </xs:sequence>
     </xs:complexType>
   </xs:element>

   <xs:element name="book">
     <xs:complexType>
       <xs:sequence>
         <xs:element name="title" type="xs:string"/>
         <xs:element ref="bib:creator" minOccurs="1"
                     maxOccurs="unbounded"/>
         <xs:element name="publisher" type="xs:string"/>
         <xs:element name="price" type="bib:currency"/>
         <xs:element name="year" type="xs:gYear"/>
       </xs:sequence>
       <xs:attribute name="isbn" type="bib:isbn"/>
     </xs:complexType>
   </xs:element>

   <xs:element name="creator" type="bib:person" abstract="true"/>
   <xs:element name="author" type="bib:person"
               substitutionGroup="bib:creator"/>
   <xs:element name="editor" type="bib:personWithAffiliation"
               substitutionGroup="bib:creator"/>

   <xs:complexType name="person">
     <xs:sequence>
       <xs:element name="last" type="xs:string"/>
       <xs:element name="first" type="xs:string"/>
     </xs:sequence>
   </xs:complexType>

   <xs:complexType name="personWithAffiliation">
     <xs:complexContent>
       <xs:extension base="bib:person">
         <xs:sequence>
           <xs:element name="affiliation" type="xs:string"/>
         </xs:sequence>
       </xs:extension>
     </xs:complexContent>
   </xs:complexType>

   <xs:simpleType name="isbn">
     <xs:restriction base="xs:string">
       <xs:pattern value="[0-9]{9}[0-9X]"/>
     </xs:restriction>
   </xs:simpleType>

   <xs:simpleType name="currency">
     <xs:restriction base="xs:decimal">
       <xs:pattern value="\d+\.\d{2}"/>
     </xs:restriction>
   </xs:simpleType>

 </xs:schema>

Here is an example of a bibliography element that conforms to this new definition:

 <bib xmlns="urn:examples:xmp:bib">
   <book isbn="0201563177">
     <title>Advanced Programming in the Unix Environment</title>
     <author><last>Stevens</last><first>W.</first></author>
     <publisher>Addison-Wesley</publisher>
     <price>65.95</price>
     <year>1992</year>
   </book>
 </bib>

We do not teach the basics of XML Schema here; those who do not know XML Schema should look at the XML Schema primer [SCHEMA]. However, to understand how XQuery leverages the type information found in a schema, we need to know what the schema says. Here are some aspects of the previous schema that affect the behavior of examples used in the rest of this chapter:

  • All elements and types in this schema are in the namespace urn:examples:xmp:bib (for local elements, this was accomplished by using the elementFormDefault attribute at the top level of the schema). All attributes are in the null namespace.

  • The following declaration says that the isbn type is a user-defined type derived from the string type by restriction and consists of nine digits followed either by a digit or by the character X:

 <xs:simpleType name="isbn">
   <xs:restriction base="xs:string">
     <xs:pattern value="[0-9]{9}[0-9X]"/>
   </xs:restriction>
 </xs:simpleType>

  • The following declaration says that the currency type is derived from the decimal type by restriction, and must contain two places past the decimal point:

 <xs:simpleType name="currency">
   <xs:restriction base="xs:decimal">
     <xs:pattern value="\d+\.\d{2}"/>
   </xs:restriction>
 </xs:simpleType>

  • The following declarations say that creator is an abstract element that can never actually be created, and the author and editor elements are in the substitution group of creator:

 <xs:element name="creator" type="bib:person" abstract="true"/>
 <xs:element name="author" type="bib:person"
             substitutionGroup="bib:creator"/>
 <xs:element name="editor" type="bib:personWithAffiliation"
             substitutionGroup="bib:creator"/>

  • The content model for a book specifies a creator, but since creator is an abstract element, it can never be created; it will always match an author or an editor (see Listing 1.28).

Listing 1.28 Content Model for the Book Element
 <xs:element name="book">
   <xs:complexType>
     <xs:sequence>
       <xs:element name="title" type="xs:string"/>
       <xs:element ref="bib:creator" minOccurs="1"
                   maxOccurs="unbounded"/>
       <xs:element name="publisher" type="xs:string"/>
       <xs:element name="price" type="bib:currency"/>
       <xs:element name="year" type="xs:gYear"/>
     </xs:sequence>
     <xs:attribute name="isbn" type="bib:isbn"/>
   </xs:complexType>
 </xs:element>

  • The following elements are globally declared: bib, book, creator, author, editor. The type of the bib and book elements is "anonymous," which means that the schema does not give these types explicit names.

  • All of the named types in this schema are global; in fact, in XML Schema, all named types are global.

Now let us explore the sequence type notation used to refer to constructs imported from the above schema. The basic form of an element test has two parameters: the name of the element and the name of the type:

 element(creator, person) 

To match an element, both the name and the type must match. The name will match if the element's name is creator or in the substitution group of creator; thus, in the above schema, the names author and editor would also match. The type will match if it is person or any other type derived from person by extension or restriction; thus, in the above schema, personWithAffiliation would also match. The second parameter can be omitted; if it is, the type is taken from the schema definition. Because the schema declares the type of creator to be person, the following declaration matches the same elements as the previous declaration:
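
A sketch of how such a test might be used in a query, written in the draft sequence type notation of this chapter. It assumes that books.xml has been validated against the schema in Listing 1.27 and that the schema is stored at the location used in the earlier listings; the variable name $c is illustrative:

```xquery
import schema namespace bib = "urn:examples:xmp:bib"
  at "c:/dev/schemas/eg/bib.xsd"

(: Keep only those children of a book that match the creator element test. :)
for $c in doc("books.xml")//bib:book/*
where $c instance of element(bib:creator, bib:person)
return $c
```

Under the matching rules just described, this would select author and editor elements and skip title, publisher, price, and year.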

 element(creator) 

In XML Schema, element and attribute definitions may be local, available only within a specific element or type. A context path may be used to identify a locally declared element or attribute. For instance, the following declaration matches the locally declared price element, which is found in the globally declared book element:

 element(book/price) 

Although this form is generally used to match locally declared elements, it will match any element whose name is price and which has the same type as the price element found in the globally declared book element. A similar form is used to match elements or attributes in globally defined types:

 element(type(person)/last) 

The same forms can be used for attributes, except that (1) attributes never have substitution groups in XML Schema; (2) attributes are not nillable in XML Schema; and (3) the attribute name is preceded by the @ symbol in the XQuery syntax. For instance, the following declaration matches attributes named price of type currency:

 attribute(@price, currency) 

The following declaration matches attributes named isbn of the type found for the corresponding attribute in the globally declared book element:

 attribute(book/@isbn) 

Table 1.5 summarizes the declarations made available by importing the schema shown in Listing 1.27.

A sequence type declaration containing a name that does not match either a built-in type or a type imported from a schema is illegal and always raises an error.

There are no nillable elements in the sample schema. To indicate that an element test will also match a nilled element, the type should be declared nillable:

 element(n, person nillable)

The above declaration would match either an n element of type person or an n element that is nilled, such as this one, which uses xsi:nil:

 <n xsi:nil="true" />

Table 1.5. The Effect of Importing the XML Schema in Listing 1.27

 Sequence Type Declaration      What It Matches
 element(creator, person)       An element named creator of type person.
 element(creator)               Any element named creator of type person, the type declared for creator in the schema.
 element(*, person)             Any element of type person.
 element(book/price)            An element named price of type currency, the type declared for price elements inside a book element.
 element(type(person)/last)     An element named last of type xs:string, the type declared for last elements inside the person type.
 attribute(@price, currency)    An attribute named price of type currency.
 attribute(book/@isbn)          An attribute named isbn of type isbn, the type declared for isbn attributes in a book element.
 attribute(@*, currency)        Any attribute of type currency.
 bib:currency                   A value of the user-defined type currency.

Working with Types

This section introduces various language features that are closely related to types, including function signatures, casting functions, typed variables, the instance of operator, typeswitch, and treat as.

Function Signatures

Parameters in a function signature may be declared with a sequence type, and the return type of a function may also be declared. For instance, the following function returns the discounted price of a book:

 import schema namespace bib="urn:examples:xmp:bib"

 define function discount-price($b as element(bib:book))
   as xs:decimal
 {
   0.80 * $b//bib:price
 }

It might be called in a query as follows:

 for $b in doc("books.xml")//bib:book
 where $b/bib:title = "Data on the Web"
 return
   <result>
     {
       $b/bib:title,
       <price>{ discount-price($b) }</price>
     }
   </result>

In the preceding query, the book element passed to the function exactly matches the declared type of the parameter. XQuery also defines some conversion rules that are applied if the argument does not exactly match the type of the parameter. If the type of the argument does not match and cannot be converted, a type error is raised. One important conversion rule is that the value of an element can be extracted if the expected type is an atomic type and an element is encountered. This is known as atomization. For instance, consider the query in Listing 1.29.

Listing 1.29 Atomization
 import schema namespace bib="urn:examples:xmp:bib"

 define function discount-price($p as xs:decimal)
   as xs:decimal
 {
   0.80 * $p
 }

 for $b in doc("books.xml")//bib:book
 where $b/bib:title = "Data on the Web"
 return
   <result>
     {
       $b/bib:title,
       <price>{ discount-price($b/bib:price) }</price>
     }
   </result>

When the typed value of the price element is extracted, its type is bib:currency. The function parameter expects a value of type xs:decimal, but the schema imported into the query says that the currency type is derived from xs:decimal, so it is accepted as a decimal.

In general, the typed value of an element is a sequence. If any value in the argument sequence is untyped, XQuery attempts to convert it to the required type and raises a type error if it fails. For instance, we can call the revised discount-price() function as follows:

 let $w := <foo>12.34</foo>
 return discount-price($w)

In this example, the foo element is not validated, and contains no type information. When this element is passed to the function, which expects a decimal, the function first extracts the value, which is untyped. It then attempts to cast 12.34 to a decimal; because 12.34 is a legitimate lexical representation for a decimal, this cast succeeds. The last conversion rule for function parameters involves type promotion: if the parameter type is xs:double, an argument whose type is xs:float or xs:decimal will automatically be cast to the parameter type; if the parameter type is xs:float, an argument whose type is xs:decimal will automatically be cast to the parameter type.
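
Type promotion can be illustrated with a small sketch. The function half below is hypothetical and is written in the draft function-declaration syntax used throughout this chapter; it is meant only to show a decimal argument being accepted where a double is required:

```xquery
(: half is a hypothetical function used only for illustration. :)
define function half($x as xs:double)
  as xs:double
{
  $x div 2
}

(: 1.5 is an xs:decimal literal; it is promoted to xs:double for the call. :)
half(1.5)
```

Promotion works in one direction only: passing an xs:double to a parameter declared as xs:decimal would be a type error rather than an implicit conversion.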

The parameter type or the return type may be any sequence type declaration. For instance, we can rewrite our function to take a price element, which is a locally declared element, by using a context path in the sequence type declaration:

 import schema namespace bib="urn:examples:xmp:bib"

 define function discount-price($p as element(bib:book/bib:price))
   as xs:decimal
 {
   0.80 * $p
 }

If the price element had an anonymous type, this would be the only way to indicate a price element of that type. Since our schema says a price element has the type bib:currency, the preceding function is equivalent to this one:

 import schema namespace bib="urn:examples:xmp:bib"

 define function discount-price($p as element(bib:price, bib:currency))
   as xs:decimal
 {
   0.80 * $p
 }

The same conversion rules that are applied to function arguments are also applied to function return values. Consider the following function:

 define function decimate($p as element(bib:price, bib:currency))
   as xs:decimal
 {
   $p
 }

In this function, $p is an element named bib:price of type bib:currency. When it is returned, the function applies the function conversion rules, extracting the value, which is an atomic value of type bib:currency, then returning it as a valid instance of xs:decimal, from which its type is derived.

Casting and Typed Value Construction

Casting and typed value construction are closely related in XQuery. Constructor functions can be used to do both. In XQuery, any built-in type is associated with a constructor function that is found in the XML Schema namespace and has the same name as the type it constructs. This is the only way to create some types, including most date types. Here is a constructor for a date:

 xs:date("2000-01-01") 

Constructor functions check a value to make sure that the argument is a legal value for the given type and raise an error if it is not. For instance, if the month had been 13, the constructor would have raised an error.
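
When an error is not wanted, the same check can be made without raising one. The sketch below uses the castable as expression of XQuery 1.0; whether the draft syntax followed in this chapter supports it is an assumption:

```xquery
(: The first test is true; the second is false because there is no month 13. :)
("2000-01-01" castable as xs:date,
 "2000-13-01" castable as xs:date)
```

A query can use such a test in a conditional expression to decide whether to call the constructor at all.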

Constructor functions are also used to cast values from one type to another. For instance, the following query converts an integer to a string:

 xs:string( 12345 ) 

Some types can be cast to each other, others cannot. The set of casts that will succeed can be found in [XQ-FO]. Constructor functions are also created for imported simple types; this is discussed in the section on imported schemas.

When a schema is imported and that schema contains definitions for simple types, constructor functions are automatically created for these types. Like the built-in constructor functions, these functions have the same name as the type that is constructed . For instance, the currency type in our bibliography schema limits values to two digits past the decimal, and the isbn type restricts ISBN numbers to nine digits followed by either another digit or the letter X . Importing this schema creates constructor functions for these two types. The following expression creates an atomic value of type isbn :

 import schema namespace bib="urn:examples:xmp:bib" bib:isbn("012345678X") 

The constructor functions for types check all the facets for those types. For instance, the following query raises an error because the pattern in the type declaration says that an ISBN number may not end with the character Y :

 import schema namespace bib="urn:examples:xmp:bib"
 bib:isbn("012345678Y")

Typed Variables

Whenever a variable is bound in XQuery, it can be given a type by using an as clause directly after the name of the variable. If a value that is bound to a typed variable does not match the declared type, a type error is raised. For instance, in the query shown in Listing 1.30, the let clause states that $authors must contain one or more author elements.

Listing 1.30 Declaring the Type of a Variable
 import schema namespace bib="urn:examples:xmp:bib"
 for $b in doc("books.xml")//bib:book
 let $authors as element(bib:author)+ := $b//bib:author
 return
   <result>
     {
       $b/bib:title,
       $authors
     }
   </result>

Since the schema for a bibliography allows a book to have editors but no authors, this query will raise an error if such a book is encountered. If a programmer simply assumed all books have authors, using a typed variable might identify an error in a query.
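The same mechanism works for atomic values and needs no schema. The following is a small sketch with hypothetical variable names ($count and $label), written so that it should run on a processor with no imported schema:

```xquery
(: Each let clause declares the type its variable must have. :)
let $count as xs:integer := 3
let $label as xs:string := "copies"
return concat(string($count), " ", $label)
```

This returns the string "3 copies". If the first clause instead bound the string "3" to $count, a type error would be raised, because a string is not implicitly converted to xs:integer when it is bound to a typed variable.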

The instance of Operator

The instance of operator tests an item for a given type. For instance, the following expression tests the variable $a to see if it is an element node:

 $a instance of element() 

As you recall, literals in XQuery have types. The following expressions each return true:

 <foo/> instance of element()
 3.14 instance of xs:decimal
 "foo" instance of xs:string
 (1, 2, 3) instance of xs:integer*
 () instance of xs:integer?
 (1, 2, 3) instance of xs:integer+

The following expressions each return false:

 3.14 instance of xdt:untypedAtomic
 "3.14" instance of xs:decimal
 3.14 instance of xs:integer
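In practice, instance of is most often used inside a conditional to decide how to handle a value. The following sketch classifies each item of an illustrative sequence; the sequence and the result strings are made up for this example.

```xquery
(: Classify each item by its dynamic type. :)
for $item in (1, "two", 3.5)
return
  (: xs:integer is tested first, because an integer is also a decimal :)
  if ($item instance of xs:integer) then "integer"
  else if ($item instance of xs:decimal) then "decimal"
  else if ($item instance of xs:string) then "string"
  else "other"
```

The query returns the sequence "integer", "string", "decimal".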

Type comparisons take type hierarchies into account. For instance, the isbn type in our bibliography schema is derived from xs:string. The following query returns true:

 import schema namespace bib="urn:examples:xmp:bib"
 bib:isbn("012345678X") instance of xs:string

The typeswitch Expression

The typeswitch expression chooses an expression to evaluate based on the dynamic type of an input value. It is similar to the CASE statement found in several programming languages, but it branches based on the argument's type, not on its value. For instance, suppose we want to write a function that creates a simple wrapper element around a value, using xsi:type to preserve the type of the wrapped value, as shown in Listing 1.31.

Listing 1.31 Function Using the typeswitch Expression
 define function wrapper($x as xs:anySimpleType)
   as element()
 {
   typeswitch ($x)
     case $i as xs:integer
       return <wrap xsi:type="xs:integer">{ $i }</wrap>
     case $d as xs:decimal
       return <wrap xsi:type="xs:decimal">{ $d }</wrap>
     default
       return error("unknown type!")
 }
 wrapper( 1 )

Each case clause tests whether $x has a certain type; if it does, the clause binds its variable to the value, with that type, and evaluates the associated return clause. The error function is a standard XQuery function that raises an error and aborts execution of the query. Here is the output of the query in Listing 1.31:

 <wrap xsi:type="xs:integer">1</wrap> 

The case clauses are tested in order, and only the return clause of the first case whose type matches $x is evaluated. In this example, 1 is both an integer and a decimal, since xs:integer is derived from xs:decimal in XML Schema, so the first matching clause, the xs:integer case, is evaluated.
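The effect of clause order can be seen in a standalone sketch that deliberately lists the more general type first. It is written as a bare expression rather than a function, so it does not depend on the function declaration syntax; the result strings are illustrative.

```xquery
(: The xs:decimal case comes first here, so it shadows the xs:integer case. :)
typeswitch (1)
  case $d as xs:decimal return "decimal"
  case $i as xs:integer return "integer"
  default return "other"
```

This returns "decimal", even though 1 is an xs:integer, because the first matching clause wins. When the types in a typeswitch are related by derivation, the most specific type should be listed first, as Listing 1.31 does.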

The typeswitch expression can be used to implement a primitive form of polymorphism. For instance, suppose authors and editors are paid different percentages of the total price of a book. We could write the function shown in Listing 1.32, which invokes the appropriate function to calculate the payment based on the substitution group hierarchy.

Listing 1.32 Using typeswitch to Implement Simple Polymorphism
 import schema namespace bib="urn:examples:xmp:bib"
 define function pay-creator(
     $c as element(bib:creator),
     $p as xs:decimal)
 {
   typeswitch ($c)
     case $a as element(bib:author)
       return pay-author($a, $p)
     case $e as element(bib:editor)
       return pay-editor($e, $p)
     default
       return error("unknown creator element!")
 }

The treat as Expression

The treat as expression asserts that a value has a particular type, and raises an error if it does not. It is similar to a cast, except that it does not change the type of its argument; it merely examines it. The treat as and instance of expressions could be used together to write the function shown in Listing 1.33, which has the same functionality as the function in Listing 1.32.

Listing 1.33 Using treat as and instance of to Implement Simple Polymorphism
 import schema namespace bib="urn:examples:xmp:bib"
 define function pay-creator(
     $c as element(bib:creator),
     $p as xs:decimal)
 {
   if ($c instance of element(bib:author))
   then pay-author($c treat as element(bib:author), $p)
   else if ($c instance of element(bib:editor))
   then pay-editor($c treat as element(bib:editor), $p)
   else error("unknown creator element!")
 }

In general, typeswitch is preferable for this kind of code, and it also provides better type information for processors that do static typing.
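The difference between treat as and a cast is easiest to see with atomic values. The following sketch uses a hypothetical variable, $value, that is declared only as item(), so a processor that does static typing cannot assume it is a number:

```xquery
(: treat as asserts the type at run-time; the value itself is unchanged. :)
let $value as item() := 5
return ($value treat as xs:integer) + 1
```

This returns 6. If $value were bound to the string "5" instead, the treat as expression would raise an error, because a string is not an instance of xs:integer, whereas "5" cast as xs:integer would convert the string and return the integer 5.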

Implicit Validation and Element Constructors

We have already discussed the fact that validation of the elements constructed in a query is automatic if the declaration of an element is global and is found in a schema that has been imported into the query. Elements that do not correspond to a global element definition are not validated. In other words, element construction uses XML Schema's lax validation mode. The query in Listing 1.34 creates a fully validated book element, with all the associated type information.

Listing 1.34 Query That Creates a Fully Validated Book Element
 import schema namespace bib="urn:examples:xmp:bib"
 <bib:book isbn="0201633469">
   <bib:title>TCP/IP Illustrated</bib:title>
   <bib:author>
     <bib:last>Stevens</bib:last>
     <bib:first>W.</bib:first>
   </bib:author>
   <bib:publisher>Addison-Wesley</bib:publisher>
   <bib:price>65.95</bib:price>
   <bib:year>1994</bib:year>
 </bib:book>

Because element constructors validate implicitly, errors are caught early, and the types of elements may be used appropriately throughout the expressions of a query. If the element constructor in Listing 1.34 had omitted a required element or misspelled the name of an element, an error would be raised.

Relational programmers are used to writing queries that return tables with only some columns from the original tables that were queried. These tables often have the same names as the original tables, but a different structure. Thus, a relational programmer is likely to write a query like the following:

 import schema namespace bib="urn:examples:xmp:bib"
 for $b in doc("books.xml")//bib:book
 return
   <bib:book>
     {
       $b/bib:title,
       $b//element(bib:creator)
     }
   </bib:book>

This query raises an error, because the bib:book element that is returned has a structure that does not correspond to the schema definition. Validation can be turned off using a validate expression, as shown in Listing 1.35, which uses skip.

Listing 1.35 Using validate to Disable Validation
 import schema namespace bib="urn:examples:xmp:bib"
 for $b in doc("books.xml")//bib:book
 return
   validate skip
   {
     <bib:book>
       {
         $b/bib:title,
         $b//element(bib:creator)
       }
     </bib:book>
   }

The validate expression can also be used to specify a validation context for locally declared elements or attributes. For instance, the price element is locally declared:

 import schema namespace bib="urn:examples:xmp:bib"
 validate context bib:book
 {
   <bib:price>49.99</bib:price>
 }

If an element's name is not recognized, it is treated as an untyped element unless xsi:type is specified. For instance, the following query returns a well-formed element with untyped content, because the bib:mug element is not defined in the schema:

 import schema namespace bib="urn:examples:xmp:bib"
 <bib:mug>49.99</bib:mug>

A query can specify the type of an element using the xsi:type attribute; in this case, the element is validated using the specified type:

 import schema namespace bib="urn:examples:xmp:bib"
 <bib:mug xsi:type="xs:decimal">49.99</bib:mug>

If a locally declared element is not wrapped in a validate expression that specifies the context, it will generally be treated as a well-formed element with untyped content, as in the following query:

 import schema namespace bib="urn:examples:xmp:bib"
 <bib:price>49.99</bib:price>

To prevent errors like this, you can set the default validation mode to strict, which means that all elements must be defined in an imported schema, or an error is raised. This is done in the prolog. The following query raises an error because the bib:price element is not recognized in the global context:

 import schema namespace bib="urn:examples:xmp:bib"
 validation strict
 <bib:price>49.99</bib:price>

The validation mode may be set to lax, which is the default behavior; to strict, as shown above; or to skip, if no validation is to be performed in the query.



XQuery from the Experts(c) A Guide to the W3C XML Query Language