Why Was a New Query Language Needed?

Historically, the work that led to the development of XQuery started long before XSLT 1.0 and XPath 1.0 were published. At first there was little contact between the two groups. During 1998 and 1999 there was some cross-influence between XQL, one of the precursors of XQuery, and the emerging XPath language (probably each language influenced the other, though this is hard to verify). But neither group would have seen the other language as being directly relevant to the requirements they were addressing; the degree of overlap only became apparent later.

The differences between XSLT and XQuery are of two kinds. First, they have different requirements, and therefore a design decision that was appropriate for XSLT would not necessarily be right for XQuery, and vice versa. The second kind of difference results from their being designed by different people from different communities and computing traditions, with different beliefs about what constitutes good design, and different experiences as to what works well and what doesn't.

Differing Requirements

As we have seen, XSLT was produced as a spin-off from the XSL (eXtensible Stylesheet Language) activity, whose primary focus was the rendition (for human consumption) of information contained in XML documents. Although the concept of transformation was seen as having much more general applicability, and the language was clearly designed to be capable of performing a wide variety of transformation tasks, styling of XML remained the primary use case. The fact that the working group chose to concentrate on this requirement is evident in a statement right at the start of the XSLT 1.0 specification: "XSLT is not intended as a completely general-purpose transformation language. Rather, it is designed primarily for the kinds of transformation that are needed when XSLT is used as part of XSL" [XSLT, p. 1].

I was not a member of the working group at the time, but I find it easy to imagine the members agreeing on this statement as a matter of policy, and then using it to reject the inclusion of features that were considered outside this scope; for example, the inclusion of advanced mathematical or text-manipulation operators. But it is also easy to imagine that some members of the group knew in their hearts that a general-purpose transformation language was needed and were determined that XSLT should be capable of fulfilling this role; indeed, if there had not been people who believed this, it is hard to see why the policy statement reproduced above would have been inserted.

The concept of a transformation language implies certain assumptions about the processing environment. Transformation essentially takes one document (or a few documents) as input, and produces one document (or a few documents) as output. The documents, although processed as trees, will typically be parsed from serial files immediately before they are transformed; they will not have been preloaded into a database providing specialized indexing or access methods. The source document is not modified by the transformation process, and it generally fits in main memory. [3]

[3] Actually, this assumption is not always true, but XSLT is essentially designed to handle those cases where it is true. There is no widely used implementation of XSLT that avoids the need to build the source tree in memory.

The fact that the working group focused on the transformations that occur during document styling added further assumptions. Document-oriented XML would be encountered more often than data-oriented XML. The source documents might or might not be valid according to a DTD. Stylesheets would typically be written to process a variety of source documents with differences in structure. The processing would most often be serial in nature: The order of elements in the result tree would usually be the same as the order of corresponding elements in the source. The language should probably be permissive in its error handling: Errors in the stylesheet should result in as much of the source document as possible being displayed, rather than causing a run-time error message that would mean nothing to the end user.

The intended role and purpose of the language also created expectations about the user who would be writing the transformations. The user would probably be authoring XML documents as well as stylesheets, so use of common XML editing tools, as well as the ability to copy and paste chunks of XML into a stylesheet, would be convenient. Stylesheets would typically exist as free-standing documents, accessible by URL, and compiled on demand; sometimes they might be embedded in the source documents themselves.

The scenario addressed by XQuery was very different. As a database query language, XQuery was concerned with the extraction of information from large collections of documents (or large individual documents), which would normally be held on disk, in databases with physical storage structures, such as indexes, designed to enable rapid retrieval. Such collections of documents would often be subject to some kind of central design control, which means they would usually have a uniform schema, and they would typically be validated against this schema before being loaded into the database. Indeed, some vendors see XQuery being used essentially to query an XML view of a conventional relational database.

This different scenario leads to different requirements, or at least to differing emphasis among the requirements. Documents were more likely to be data-oriented than document-oriented, although the query language was supposed to be able to handle both. Optimization of queries was essential if performance was to be acceptable, and this optimization would involve an analysis of the query against the schema of the target database, if only to discover what indexes might be available. Because the documents would often be data-oriented, preserving order was less important, and would in many cases be unnecessary. Error handling would probably need to be strict: If a query was incorrect, it would be better to produce an error message as early as possible, rather than to execute a perhaps lengthy query and produce results to a question that the user did not mean to ask.

The expected usage scenario for XQuery would be similar to that for other database query languages like SQL. Occasionally, expert users might use the query language directly from a terminal; but much more frequently, queries would be embedded in programs written in a host language such as Java or C#, delivering their results into host language variables for further processing by the application. Some people even see XQuery being embedded within SQL, as a sublanguage for querying XML held within relational databases. Serializing the query results as an XML document might be one option for delivering the results, but by no means the only option.

Thus, although there is a considerable overlap in what XSLT and XQuery actually need to do (they both select data from input XML documents and construct new XML documents from this data), substantial differences exist in the usage scenarios where the two languages were primarily targeted, and these have led to some genuine differences in the optimum design parameters for the two languages.

Differing Cultures

At the beginning of this section I described two reasons for the differences between XSLT and XQuery. We have looked at the differences in technical requirements for the two languages; now let's examine the differences resulting from differing cultures. These are no less valid: Just as an architect designing a building in Tokyo has to take into account the fact that the lifestyle is not the same as that in Los Angeles, so the designers of a computer language have to work within a tradition that sets implicit criteria for what is good design and what is not acceptable. The design of software, as with music or physical architecture, is essentially a creative intellectual activity, and the outcome depends a great deal on the experiences and creative preferences of the people doing the design and the peer group who provide them with feedback.

The designers of XSLT came predominantly from an SGML (Standard Generalized Markup Language) background. They were familiar with document processing and with the abstractions of the formal model underlying SGML and its stylesheet language DSSSL (Document Style Semantics and Specification Language), itself heavily based on functional programming languages such as Scheme. They understood the complexities of pagination, word-wrap, and hyphenation algorithms, and the way in which these varied depending on the natural language of the text and local typographical traditions. But few of them had any background in database technology. They were not experts in the optimization of relational algebra, nor were they immersed in the traditions of database report writers or the calculations involved in data visualization.

By contrast, the designers of XQuery came solidly from the database world. Several of the leading figures in the XQuery working group (including several authors represented in this book) have also played a significant role in the development of SQL and of object database languages such as OQL. These people brought with them the knowledge gained from thirty years of progress in database technology: progress primarily in the design of query languages and the associated optimization strategies, together with the gradual evolution of data models to handle richer structures than the traditional "punched card" model of the 1970s relational database. Few of these people, however, had much previous exposure to the SGML or XML culture, with its very different way of thinking about structural constraints and validation, or to the kind of structural manipulations required to handle the trees that result from markup of a linear text.

There's another difference in the culture behind the two languages that is worth mentioning. The group that developed XSLT 1.0 was much smaller, in terms of active participants, than the XQuery group, and one individual, James Clark, had the unofficial role of chief designer, with the rest of the group essentially acting as a steering group and review body. The XQuery group never had a single individual who could be identified as the chief architect in the same kind of way. It had (and has still) a much broader base of talented individuals, each of them highly capable, who do not always share the same vision. The result is that while the group is less exposed to the mistakes that can be made by one individual, it is much harder for the team to maintain a consistency of approach across the whole language, to ensure that different decisions in different areas are made on the same criteria, and above all to keep the language small and simple. In short, XQuery is a language designed by committee in a way that XSLT is not.



XQuery from the Experts: A Guide to the W3C XML Query Language
