Hack 24 Execute an XQuery with Saxon | XML Hacks: 100 Industrial-Strength Tips and Tools

So you know how to write an XQuery? Great! But can you execute an XQuery? This hack shows you how.

When executing XQuery you have a wide range of options. Nearly every vendor, from the well-known old guard (IBM, Oracle, BEA, and Microsoft) to the plucky upstarts (Mark Logic, X-Hive/DB, and Qizx/open) to the open source projects led by individuals (Saxon and Qexo), has expressed support for XQuery and has an XQuery implementation to offer. The implementations vary widely in purpose as well as in performance and scalability.

XQuery implementations tend to fall into one of three camps. First, there's the streaming transformation model. In this fairly simple application, XQuery defines the mapping from one file format to another. XQuery as a language has certain optional features (such as reverse axes) that aren't needed by implementations doing simple streaming transformations, where you can forget nodes right after you read them. In this use XQuery is similar to XSLT, but XQuery adds data typing and static type analysis, letting you statically verify that your XQuery code will always generate a properly constructed document conformant to a specific schema. A good example of the streaming engine is BEA's WebLogic Integration product (http://www.bea.com/framework.jsp?CNT=index.htm&FP=/content/products/integrate).
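To make "forget nodes right after you read them" concrete, here is a minimal sketch of the streaming style. It is plain Java using the JDK's StAX reader, not XQuery and not any vendor's engine, and the SPEAKER element name is an assumption based on the Shakespeare markup used elsewhere in this hack. The loop only ever looks at the current event, which is why a streaming engine has no use for reverse axes.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StreamingSketch {
    // Maps every <SPEAKER> element (assumed name) to one line of output.
    // Only the current parse event is held; earlier nodes are never revisited.
    static String speakers(String xml) throws XMLStreamException {
        XMLStreamReader in = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder out = new StringBuilder();
        while (in.hasNext()) {
            if (in.next() == XMLStreamConstants.START_ELEMENT
                    && in.getLocalName().equals("SPEAKER")) {
                // getElementText() consumes the text up to the matching end tag
                out.append(in.getElementText()).append('\n');
            }
        }
        in.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>To be</LINE></SPEECH>";
        System.out.print(speakers(xml));
    }
}
```

A query that needed to look back at an earlier sibling or an ancestor could not be answered this way without buffering, which is the trade-off the streaming camp accepts.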

Second, XQuery can be used as a meta query language executing against one or more relational databases. In this scenario, XQuery accesses the relational stores as if their tables were XML documents, pushing query predicates to the database as SQL for optimized execution and then merging and manipulating the results within the XQuery code as XML. Theoretically any data store, not just relational data, can be made accessible to the XQuery environment. Here XQuery becomes a lingua franca query language, the X standing more for "plug your data format in here" than for XML. XQuery can also be used to query columns containing XML-typed data. Good examples of the relational approach are BEA's Liquid Data for WebLogic product (http://www.bea.com/framework.jsp?CNT=index.htm&FP=/content/products/liquid_data) and Microsoft's upcoming version of SQL Server, code-named Yukon (http://www.microsoft.com/sql/yukon/productinfo/).

Third, there's the pure-play XQuery implementation where, instead of being mapped to another query language, XQuery is used directly against a content database designed from the ground up for XQuery. This approach works well for managing data that hasn't yet been put into a database or that doesn't fit neatly into the rectangular boxes imposed by relational databases; for example, medical records, textbook content, office documents, and web pages. In this model, you store the documents directly in the XQuery database, possibly after a conversion to XML, but without any complicated shredding into a relational format. Then you query the documents to extract the bits and pieces deemed important. Personally, I'm most interested in the pure-play approach because it's the most likely to change the world, as they say. A good example of this is Mark Logic's Content Interaction Server (http://www.marklogic.com/prod.html). Most open source products are also following this model, but without the real database backing.

The mechanism to execute an XQuery depends on which of these camps your product falls into. Streaming XQuery engines may be triggered by web service requests (a convenient XML input) and typically generate web service responses. Relational XQuery engines may include XQuery in their SQL calls or in lieu of SQL calls. And pure-play implementations can execute from files, via Java interfaces, or over networks.

2.15.1 Executing XQuery from a File Using Saxon

Executing XQuery from a file can be the easiest way to get started. Many engines allow file-based execution. One of the best open source engines for this is Saxon, written by Michael Kay and available at http://saxon.sourceforge.net. In its current 8.0 release, it exposes the following command-line interface (with saxon8.jar explicitly in the classpath):

java -cp saxon8.jar net.sf.saxon.Query [options] queryfile [params...]

It uses the files in the filesystem as its backend data store. (The data is naturally unindexed so the scalability isn't comparable to an XQuery engine running as a database, but it works great for learning.) By placing the Shakespeare files from [Hack #23] (available at http://www.oasis-open.org/cover/bosakShakespeare200.html) in the current directory and using the file speakers.xqy, we can run the example against Saxon like this:

java -cp saxon8.jar net.sf.saxon.Query speakers.xqy

2.15.2 Piping Queries to Saxon

A neat trick is to pipe XQuery expressions to Saxon via the command line using a hyphen as the query filename to indicate reading from standard input.
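The same trick works from a program. The sketch below starts Saxon with the command line shown in this hack, using the hyphen as the query filename, and writes the query to the child process's standard input. The class and method names are hypothetical, and it assumes that `java` is on the PATH and that saxon8.jar and hamlet.xml sit in the current directory.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class PipeToSaxon {
    // Same command line as in the text; "-" tells Saxon to read the query from stdin.
    static List<String> command(String saxonJar) {
        return List.of("java", "-cp", saxonJar, "net.sf.saxon.Query", "-");
    }

    // Starts Saxon, writes the query to its standard input, and returns the exit code.
    // Saxon's own output and errors go straight to this process's console.
    static int run(String saxonJar, String query) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(saxonJar))
                .redirectOutput(ProcessBuilder.Redirect.INHERIT)
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        // Closing the stream signals end of input, just as the end of a shell pipe does.
        try (OutputStream stdin = process.getOutputStream()) {
            stdin.write(query.getBytes(StandardCharsets.UTF_8));
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Assumes saxon8.jar and hamlet.xml are in the current directory.
        int exit = run("saxon8.jar", "doc('hamlet.xml')/PLAY/TITLE/text()");
        System.out.println("Saxon exited with " + exit);
    }
}
```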

Of course, you can also pipe the results of your XQuery to another program, here counting the characters in the title:

echo "doc('hamlet.xml')/PLAY/TITLE/text()" | java -cp saxon8.jar net.sf.saxon.Query - | wc -c

Let your mind work on that one for a minute. The possibilities are interesting.

2.15.3 Executing XQuery from Java Using XQJ

XQuery vendors often provide custom Java interfaces to their products. Right now every vendor has a different API, but there's an effort underway in the Java Community Process to standardize the Java interface to an XQuery engine. The effort is called XQJ (XQuery API for Java) and is led by Oracle and IBM under JSR 225 (http://www.jcp.org/en/jsr/detail?id=225). It's expected that XQJ will be to XQuery what JDBC is to SQL. Example 2-16 demonstrates how to use XQJ to execute a simple XQuery expression from Java (this is not a complete Java program).

Example 2-16. Using XQJ to portably access an XQuery engine

...

XQConnection conn = null;
XQExpression expr = null;
XQResultSequence result = null;

try {
  // Use JNDI to get an initial XQDataSource to build Connections
  InitialContext initCtx = new InitialContext();
  XQDataSource source =
    (XQDataSource) initCtx.lookup("java:comp/env/xqj/primary");

  // XQDataSource to XQConnection to XQExpression to XQResultSequence
  conn = source.getConnection();
  expr = conn.createExpression();

  // The function lives in the predefined local: namespace, and the
  // $i > 1 guard keeps 1 from being reported as a prime
  String query =
    "declare function local:prime($i as xs:integer) as xs:boolean {" +
    "  $i > 1 and not(some $denom in (2 to $i - 1) satisfies $i mod $denom = 0)" +
    "}; " +
    "for $i in (1 to 100) " +
    "return" +
    "  if (local:prime($i)) then $i else ()";

  result = expr.executeQuery(query);

  // Iterate over the result sequence pulling answers one at a time
  while (result.next()) {
    int prime = result.getInt();
    System.out.println("Prime: " + prime);
  }
}
catch (XQException e) {
  e.printStackTrace();
}
// Free the resources whether or not there was an exception
finally {
  if (conn != null) {
    try { conn.close(); } catch (XQException ignored) { }
  }
}

As demonstrated in the first line of the try block, JNDI provides the standard mechanism to get an initial XQDataSource. The XQDataSource heads the chain of objects that goes from a connection to an expression to an executed query. The query here manually calculates prime numbers between 1 and 100. Java gets the result as an XQResultSequence that can be iterated over, fetching each returned value in the result.

Note that XQJ, like XQuery, is still under active development. The code shown here is based on the May 2004 early draft release. Details are highly likely to change before final release.

2.15.4 Executing XQuery on the Web

One final and interesting way to execute XQuery is by placing it directly on the Web. Some vendors, such as Mark Logic and Qexo, let you place a query script file directly under an HTTP server document root. When a client requests the query file with its special extension (such as http://example.com/request.xqy), the server executes the query file content and returns the result. It's basically CGI for XQuery. And because XQuery so easily constructs dynamic XHTML output, it's an amazingly quick development and deployment model. There's no need to use Java classes in processing the result.

The recipe:

Write an XQuery program that outputs XHTML.
Save it under the server's document root with a special file extension.
Have clients request the file, causing the query to run.

For vendors without built-in support, it's possible to write a servlet that handles the *.xqy file pattern and uses XQJ to execute the file's contents, streaming the results back to the client.
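The shape of such a handler can be sketched as follows. To stay within the JDK, this sketch swaps the servlet for the built-in com.sun.net.httpserver server and swaps the XQJ call for a pluggable function, so the class name, the `docroot` directory, port 8080, and the stand-in engine are all assumptions made for illustration. A real deployment would register a servlet for the *.xqy pattern and pass the file's contents to an XQuery engine.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

public class XqyServer {
    // True for request paths this handler should claim, such as /request.xqy.
    // Paths containing ".." are rejected so requests cannot escape the document root.
    static boolean isQueryPath(String path) {
        return path.startsWith("/") && path.endsWith(".xqy") && !path.contains("..");
    }

    // Reads the query file under docRoot and hands its text to the engine.
    static String render(Path docRoot, String requestPath, Function<String, String> engine)
            throws IOException {
        String query = Files.readString(docRoot.resolve(requestPath.substring(1)));
        return engine.apply(query);
    }

    // Answers *.xqy requests with the query result and everything else with a 404.
    static void handle(HttpExchange exchange, Path docRoot, Function<String, String> engine)
            throws IOException {
        String path = exchange.getRequestURI().getPath();
        int status = 404;
        byte[] body = "Not found".getBytes(StandardCharsets.UTF_8);
        if (isQueryPath(path) && Files.isRegularFile(docRoot.resolve(path.substring(1)))) {
            status = 200;
            body = render(docRoot, path, engine).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders()
                    .set("Content-Type", "application/xhtml+xml; charset=utf-8");
        }
        exchange.sendResponseHeaders(status, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }

    public static void main(String[] args) throws IOException {
        Path docRoot = Path.of("docroot");  // hypothetical document root
        // Hypothetical stand-in engine; a real one would execute the query, for example via XQJ.
        Function<String, String> engine =
                query -> "<p>Received a query of " + query.length() + " characters</p>";
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", exchange -> handle(exchange, docRoot, engine));
        server.start();
    }
}
```

Keeping the engine behind a plain function means the HTTP plumbing stays the same whichever vendor API ends up executing the query.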

2.15.5 See Also

X-Hive/DB, a native XML database with XQuery support: http://www.x-hive.com/products/db/
Qizx/open, an open source Java implementation of XQuery: http://www.xfra.net/qizxopen/
Qexo, the GNU Kawa implementation of XQuery: http://www.gnu.org/software/qexo/

Jason Hunter


XML Hacks: 100 Industrial-Strength Tips and Tools

ISBN: 0596007116

Year: 2006
Pages: 156

Authors: Michael Fitzgerald
