Operators

The queries we have shown up to now all contain operators, which we have not yet covered. Like most languages, XQuery has arithmetic operators and comparison operators, and because sequences of nodes are a fundamental datatype in XQuery, it is not surprising that XQuery also has node sequence operators. This section describes these operators in some detail. In particular, it describes how XQuery treats some of the cases that arise quite easily when processing XML. For instance, consider the expression 1 * $b. How is this interpreted if $b is an empty sequence, untyped character data, an element, or a sequence of five nodes? Given the flexible structure of XML, it is imperative that cases like this be well defined in the language. (Chapter 2, "Influences on the Design of XQuery," provides additional background on the technical complexities that the working group had to deal with to resolve these and similar issues.)

Two basic operations are central to the use of operators and functions in XQuery. The first is called typed value extraction. We have already used typed value extraction in many of our queries, without commenting on it. For instance, we have seen this query:

 doc("books.xml")/bib/book/author[last='Stevens'] 

Consider the expression last='Stevens'. If last is an element and 'Stevens' is a string, how can an element and a string be equal? The answer is that the = operator extracts the typed value of the element, resulting in a string value that is then compared to the string Stevens. If the document is governed by a W3C XML Schema, then the element may be associated with a simple type, such as xs:integer. If so, the typed value will have whatever type has been assigned to the node by the schema. XQuery has a function called data() that extracts the typed value of a node. Assuming the following element has been validated by a schema processor, the result of this query is the integer 4:

 data( <e xsi:type="xs:integer">4</e> ) 

A query may import a schema. We will discuss schema imports later, but schema imports have one effect that should be understood now. If typed value extraction is applied to an element, and the query has imported a schema definition for that element specifying that the element may have other elements as children, then typed value extraction raises an error.

Typed value extraction is defined for a single item. The more general form of typed value extraction is called atomization , which defines how typed value extraction is done for any sequence of items. For instance, atomization would be performed for the following query:

 avg( (1, <e>2</e>, <e xsi:type="xs:integer">3</e>) ) 

Atomization simply returns the typed value of every item in the sequence. The preceding query returns 2, which is the average of 1, 2, and 3. In XQuery, atomization is used for the operands of arithmetic expressions and comparison expressions. It is also used for the parameters and return values of functions and for cast expressions, which are discussed in other sections.
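The following Python sketch models atomization as described above. It is only an illustration, not how an XQuery processor is implemented: the Node and Untyped classes and the atomize and avg functions are hypothetical names, and the sketch assumes that untyped values are treated as doubles when averaged, as the text describes for arithmetic.

```python
from dataclasses import dataclass


class Untyped(str):
    """Hypothetical marker for untyped character data."""


@dataclass
class Node:
    """Hypothetical stand-in for an element and the typed value it carries."""
    typed_value: object


def atomize(items):
    """Replace every node by its typed value; atomic values pass through unchanged."""
    return [item.typed_value if isinstance(item, Node) else item for item in items]


def avg(items):
    """Atomize the argument, treat untyped values as doubles, then average."""
    values = [float(v) if isinstance(v, Untyped) else v for v in atomize(items)]
    return sum(values) / len(values) if values else None


# Models a sequence of the atomic value 1, an untyped element containing 2,
# and an element typed as xs:integer containing 3.
sequence = [1, Node(Untyped("2")), Node(3)]
atoms = atomize(sequence)
mean = avg(sequence)
```

Here atoms holds one atomic value per item of the input sequence, and mean is the average of 1, 2, and 3.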

Arithmetic Operators

XQuery supports the arithmetic operators + , - , * , div , idiv , and mod . The div operator performs division on any numeric type. The idiv operator requires integer arguments, and returns an integer as a result, rounding toward 0. All other arithmetic operators have their conventional meanings. If an operand of an arithmetic operator is a node, atomization is applied. For instance, the following query returns the integer 4:

 2 + <int>{ 2 }</int> 

If an operand is an empty sequence, the result of an arithmetic operator is an empty sequence. Empty sequences in XQuery frequently operate like nulls in SQL. The result of the following query is an empty sequence:

 2 + () 

If an operand is untyped data, it is cast to a double, raising an error if the cast fails. This implicit cast is important, because a great deal of XML data is found in documents that do not use W3C XML Schema and therefore do not have simple or complex types. Many of these documents, however, contain data that is to be interpreted as numeric. The prices in our sample document are one example of this. The following query adds the first and second prices, returning the result as a double:

 let $p := doc("books.xml")//price return $p[1] + $p[2] 
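The three arithmetic rules above (atomized operands, empty sequences yielding empty sequences, and untyped data cast to double) can be modeled in a few lines of Python. This is a sketch under simplifying assumptions: a sequence is a Python list of already-atomized values, an untyped value is a str, and the add function and the two price values are hypothetical.

```python
def add(left, right):
    """Model of the + operator on two atomized operands (Python lists)."""
    if not left or not right:
        return []  # an empty operand gives an empty result, much like a SQL null
    if len(left) != 1 or len(right) != 1:
        raise TypeError("each operand must be a single value")
    # Untyped data (str here) is cast to double; float() raises ValueError
    # if the cast fails.
    a, b = (float(v) if isinstance(v, str) else v for v in (left[0], right[0]))
    return [a + b]


int_sum = add([2], [2])             # both operands are integers
empty_sum = add([2], [])            # models 2 + ()
price_sum = add(["1.5"], ["2.25"])  # two hypothetical untyped prices
```

The last call shows why the implicit cast matters: both operands start out as untyped text, yet the sum is computed numerically.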

Comparison Operators

XQuery has several sets of comparison operators, including value comparisons, general comparisons, node comparisons, and order comparisons. Value comparisons and general comparisons are closely related; in fact, each general comparison operator combines an existential quantifier with a corresponding value comparison operator. Table 1.3 shows the value comparison operator to which each general comparison operator corresponds.

The value comparisons compare two atomic values. If either operand is a node, atomization is used to convert it to an atomic value. For the comparison, if either operand is untyped, it is treated as a string. Here is a query that uses the eq operator:

 for $b in doc("books.xml")//book where $b/title eq "Data on the Web" return $b/price 

Table 1.3. Value Comparison Operators vs. General Comparison Operators

 Value Comparison Operator    General Comparison Operator
 eq                           =
 ne                           !=
 lt                           <
 le                           <=
 gt                           >
 ge                           >=

Using value comparisons, strings can only be compared to other strings, which means that value comparisons are fairly strict about typing. If our data is governed by a DTD, then it does not use the W3C XML Schema simple types, so the price is untyped. Therefore, the following query must cast price to a decimal before comparing it:

 for $b in doc("books.xml")//book where xs:decimal($b/price) gt 100.00 return $b/title 

If the data were governed by a W3C XML Schema that declared price to be a decimal, this cast would not have been necessary. In general, if the data you are querying is meant to be interpreted as typed data, but there are no types in the XML, value comparisons force your query to cast when doing comparisons; general comparisons are more loosely typed and do not require such casts. This problem does not arise if the data is meant to be interpreted as string data, or if it contains the appropriate types.

Like arithmetic operators, value comparisons treat empty sequences much like SQL nulls. If either operand is an empty sequence, a value comparison evaluates to the empty sequence. If an operand contains more than one item, then a value comparison raises an error. Here is an example of a query that raises an error:

 for $b in doc("books.xml")//book where $b/author/last eq "Stevens" return $b/title 

The reason for the error is that many books have multiple authors, so the expression $b/author/last returns multiple nodes. The following query uses = , the general comparison that corresponds to eq , to return books for which any author's last name is equal to Stevens:

 for $b in doc("books.xml")//book where $b/author/last = "Stevens" return $b/title 

There are two significant differences between value comparisons and general comparisons. The first is illustrated in the previous query. Like value comparisons, general comparisons apply atomization to both operands, but instead of requiring each operand to be a single atomic value, the result of this atomization may be a sequence of atomic values. The general comparison returns true if any value on the left matches any value on the right, using the appropriate comparison.
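The difference in cardinality handling can be sketched in Python. The functions value_eq and general_eq are hypothetical models of eq and =, operating on lists of already-atomized values; they ignore typing, which is the subject of the second difference.

```python
def value_eq(left, right):
    """Model of eq: each operand must be empty or a single atomic value."""
    if not left or not right:
        return []  # an empty operand gives the empty sequence
    if len(left) > 1 or len(right) > 1:
        raise TypeError("eq requires single atomic values")
    return left[0] == right[0]


def general_eq(left, right):
    """Model of =: true if any value on the left matches any value on the right."""
    return any(a == b for a in left for b in right)


# Atomized last names of a book with three authors.
last_names = ["Abiteboul", "Buneman", "Suciu"]

any_match = general_eq(last_names, ["Suciu"])
no_match = general_eq(last_names, ["Stevens"])
single_match = value_eq(["Stevens"], ["Stevens"])
```

Calling value_eq(last_names, ["Suciu"]) raises an error in this model, mirroring the error raised by the eq query shown earlier.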

The second difference involves the treatment of untyped data: general comparisons try to cast to an appropriate "required type" to make the comparison work. This is illustrated by the following query:

 for $b in doc("books.xml")//book where $b/price = 100.00 return $b/title 

In this query, 100.00 is a decimal, and the = operator casts the untyped price to a numeric type so that the two values can be compared. When a general comparison tests a pair of atomic values and one of these values is untyped, it examines the other atomic value to determine the required type to which it casts the untyped operand:

  • If the other atomic value has a numeric type, the required type is xs:double .

  • If the other atomic value is also untyped, the required type is xs:string .

  • Otherwise, the required type is the dynamic type of the other atomic value. If the cast to the required type fails, a dynamic error is raised.

These conversion rules mean that comparisons done with general comparisons rarely need to cast when working with data that does not contain W3C XML Schema simple types. On the other hand, when working with strongly typed data, value comparisons offer greater type safety.
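The required-type rules listed above can be modeled as follows. This is a rough Python sketch, not a processor implementation: Untyped, cast_untyped, and general_eq are hypothetical names, Python's float stands in for xs:double, and the "otherwise" branch only handles simple Python types such as str rather than the full XML Schema casting rules.

```python
from numbers import Number


class Untyped(str):
    """Hypothetical marker for an untyped atomic value."""


def cast_untyped(value, other):
    """Cast value to the required type implied by other, if value is untyped."""
    if not isinstance(value, Untyped):
        return value               # typed values are left alone
    if isinstance(other, Untyped):
        return str(value)          # both untyped: compare as strings
    if isinstance(other, Number):
        return float(value)        # numeric: cast to double, ValueError on failure
    return type(other)(value)      # otherwise: the other value's own type


def general_eq(left, right):
    """Model of =: cast each pair as needed, true if any pair matches."""
    return any(cast_untyped(a, b) == cast_untyped(b, a)
               for a in left for b in right)


prices = [Untyped("39.95"), Untyped("100.00")]  # hypothetical untyped prices

numeric_match = general_eq(prices, [100.00])
string_match = general_eq([Untyped("Stevens")], ["Stevens"])
untyped_pair = general_eq([Untyped("100.00")], [Untyped("100")])
```

The last comparison is false because two untyped values are compared as strings, so "100.00" and "100" differ even though they denote the same number.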

You should be careful when using the = operator when an operand has more than one step, because it can lead to confusing results. Consider the following query:

 for $b in doc("books.xml")//book
 where $b/author/first = "Serge"
   and $b/author/last = "Suciu"
 return $b

The result of this query may be somewhat surprising, as Listing 1.17 shows.

Listing 1.17 Surprising Results
 <book year = "2000">
   <title>Data on the Web</title>
   <author>
     <last>Abiteboul</last>
     <first>Serge</first>
   </author>
   <author>
     <last>Buneman</last>
     <first>Peter</first>
   </author>
   <author>
     <last>Suciu</last>
     <first>Dan</first>
   </author>
   <publisher>Morgan Kaufmann Publishers</publisher>
   <price>39.95</price>
 </book>

Since this book does have an author whose first name is "Serge" and an author whose last name is "Suciu," the result of the query is correct, but it is surprising. The following query expresses what the author of the previous query probably intended:

 for $b in doc("books.xml")//book,
     $a in $b/author
 where $a/first="Serge"
   and $a/last="Suciu"
 return $b
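The difference between the two queries can be seen by evaluating both conditions against the authors in Listing 1.17. The Python sketch below is only a model: each author is a (last, first) tuple, and the variable names are hypothetical.

```python
# Authors of "Data on the Web" from Listing 1.17, as (last, first) pairs.
authors = [("Abiteboul", "Serge"), ("Buneman", "Peter"), ("Suciu", "Dan")]

# Models: $b/author/first = "Serge" and $b/author/last = "Suciu"
# Each comparison ranges over all authors independently.
independent = (any(first == "Serge" for _, first in authors)
               and any(last == "Suciu" for last, _ in authors))

# Models: $a in $b/author where $a/first = "Serge" and $a/last = "Suciu"
# Both conditions must hold for the same author.
same_author = any(first == "Serge" and last == "Suciu"
                  for last, first in authors)
```

The first condition is satisfied by two different authors, which is why the book is returned by the first query but not by the second.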

Comparisons using the = operator are not transitive. Consider the following query:

 let $a := ( <first>Jonathan</first>, <last>Robie</last> ),
     $b := ( <first>Jonathan</first>, <last>Marsh</last> ),
     $c := ( <first>Rodney</first>, <last>Marsh</last> )
 return
   <out>
     <equals>{ $a = $b }</equals>
     <equals>{ $b = $c }</equals>
     <equals>{ $a = $c }</equals>
   </out>

Remember that = returns true if there is a value on the left that matches a value on the right. The output of this query is as follows:

 <out>
   <equals>true</equals>
   <equals>true</equals>
   <equals>false</equals>
 </out>
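To see where each result comes from, it can help to list the pairs of values that match. The following Python sketch models the three comparisons on the atomized values; matching_pairs is a hypothetical helper, not an XQuery function.

```python
def matching_pairs(left, right):
    """Return every (left value, right value) pair that is equal."""
    return [(x, y) for x in left for y in right if x == y]


# Atomized values of the three sequences in the query above.
a = ["Jonathan", "Robie"]
b = ["Jonathan", "Marsh"]
c = ["Rodney", "Marsh"]

a_b = matching_pairs(a, b)  # the shared first name
b_c = matching_pairs(b, c)  # the shared last name
a_c = matching_pairs(a, c)  # nothing in common

# The = operator is true exactly when at least one pair matches.
results = [bool(a_b), bool(b_c), bool(a_c)]
```

The first two comparisons succeed on different values, so nothing connects $a to $c.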

Node comparisons determine whether two expressions evaluate to the same node. There are two node comparisons in XQuery, is and is not . The following query tests whether the most expensive book is also the book with the greatest number of authors and editors:

 let $b1 := for $b in doc("books.xml")//book
            order by count($b/author) + count($b/editor)
            return $b
 let $b2 := for $b in doc("books.xml")//book
            (: price is untyped, so cast it to sort numerically :)
            order by xs:decimal($b/price)
            return $b
 return $b1[last()] is $b2[last()]

This query also illustrates the last() function, which returns the number of items in the sequence being filtered; as a result, $b1[last()] returns the last node in $b1.
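The logic of this query can be modeled in Python, where the is operator also tests identity rather than equality of content. The book records below are hypothetical placeholders, not the contents of books.xml, and the field names are assumptions made for the sketch.

```python
# Hypothetical books; "contributors" stands for count(author) + count(editor).
books = [
    {"title": "A", "contributors": 1, "price": 65.95},
    {"title": "B", "contributors": 3, "price": 39.95},
    {"title": "C", "contributors": 1, "price": 129.95},
]

# Sort the same objects two ways, as the two let clauses do.
by_contributors = sorted(books, key=lambda book: book["contributors"])
by_price = sorted(books, key=lambda book: book["price"])

# Index -1 plays the role of [last()]; "is" asks whether both sequences
# end in the very same object, not merely in objects with equal content.
same_book = by_contributors[-1] is by_price[-1]
```

With this data the book with the most contributors and the most expensive book are different objects, so the comparison is false.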

XQuery provides two operators that can be used to determine whether one node comes before or after another node in document order. These operators are generally most useful for data in which the order of elements is meaningful, as it is in many documents or tables. The operator $a << $b returns true if $a precedes $b in document order; $a >> $b returns true if $a follows $b in document order. For instance, the following query returns books where Abiteboul is an author, but is not listed as the first author:

 for $b in doc("books.xml")//book
 let $a := ($b/author)[1],
     $sa := ($b/author)[last="Abiteboul"]
 where $a << $sa
 return $b

In our sample data, there are no such books.
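The following Python sketch models the document-order test used in this query. It assumes each node carries a hypothetical position index reflecting document order; Node, precedes, and listed_but_not_first are illustrative names. The sketch uses any() where the query expects at most one matching author.

```python
from dataclasses import dataclass


@dataclass
class Node:
    position: int  # hypothetical document-order index
    last: str      # the author's last name


def precedes(a, b):
    """Model of $a << $b: a comes before b in document order."""
    return a.position < b.position


def listed_but_not_first(authors, last):
    """True if an author with this last name follows the first author."""
    first = authors[0]
    matches = [author for author in authors if author.last == last]
    return any(precedes(first, match) for match in matches)


# Authors of "Data on the Web" in document order.
authors = [Node(1, "Abiteboul"), Node(2, "Buneman"), Node(3, "Suciu")]

abiteboul_later = listed_but_not_first(authors, "Abiteboul")
suciu_later = listed_but_not_first(authors, "Suciu")
stevens_later = listed_but_not_first(authors, "Stevens")
```

Abiteboul is the first author, and a node never precedes itself, so the test fails for this book; it also fails when no author has the requested name.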

Sequence Operators

XQuery provides the union, intersect, and except operators for combining sequences of nodes. Each of these operators combines two sequences, returning a result sequence in document order. As discussed earlier, a sequence of nodes in document order never contains the same node twice. If an operand contains an item that is not a node, an error is raised.

The union operator takes two node sequences and returns a sequence with all nodes found in the two input sequences. This operator has two lexical forms: | and union. Here is a query that uses the | operator to return a sorted list of last names for all authors or editors:

 for $l in distinct-values(doc("books.xml")//(author | editor)/last)
 order by $l
 return <last>{ $l }</last>

Here is the result of the above query:

 <last>Abiteboul</last>
 <last>Buneman</last>
 <last>Gerbarg</last>
 <last>Stevens</last>
 <last>Suciu</last>

The fact that the union operator always returns nodes in document order is sometimes quite useful. For instance, the following query sorts books based on the name of the first author or editor listed for the book:

 for $b in doc("books.xml")//book
 let $a1 := ($b/author union $b/editor)[1]
 order by $a1/last, $a1/first
 return $b

The intersect operator takes two node sequences as operands and returns a sequence containing all the nodes that occur in both operands. The except operator takes two node sequences as operands and returns a sequence containing all the nodes that occur in the first operand but not in the second operand. For instance, the following query returns a book with all of its children except for the price:

 for $b in doc("books.xml")//book
 where $b/title = "TCP/IP Illustrated"
 return
    <book>
     { $b/@* }
     { $b/* except $b/price }
    </book>

The result of this query contains all attributes of the original book and all elements, in document order, except for the price element, which is omitted:

 <book year = "1994">
   <title>TCP/IP Illustrated</title>
   <author>
     <last>Stevens</last>
     <first>W.</first>
   </author>
   <publisher>Addison-Wesley</publisher>
 </book>
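The behavior shared by the three sequence operators, removing duplicate nodes and returning the result in document order, can be sketched in Python. This is a model only: Node, its position index, and the function names (except_ has a trailing underscore because except is a Python keyword) are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    position: int  # hypothetical document-order index, unique per node
    name: str


def in_document_order(nodes):
    """Drop duplicate nodes and sort the rest by document order."""
    return sorted(set(nodes), key=lambda node: node.position)


def union(a, b):
    return in_document_order(a + b)


def intersect(a, b):
    return in_document_order(node for node in a if node in b)


def except_(a, b):
    return in_document_order(node for node in a if node not in b)


# Child elements of a book, in document order.
title, author, publisher, price = (
    Node(i, name)
    for i, name in enumerate(["title", "author", "publisher", "price"], start=1)
)
children = [title, author, publisher, price]

kept = [n.name for n in except_(children, [price])]
merged = [n.name for n in union([publisher, title], [title, price])]
common = [n.name for n in intersect(children, [price, title])]
```

Note that merged comes back in document order and contains title only once, regardless of the order and repetition in the operands.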


XQuery from the Experts(c) A Guide to the W3C XML Query Language