In their book The Definition of Standard ML , Robin Milner, Mads Tofte, and Robert Harper have the following to say about the value of formally describing a programming language:
The XML Query working group has taken the unusual step of writing both a prose and a formal description of XQuery. [1] These two descriptions benefit both language designers and implementers. For language designers, two descriptions offer twice as many opportunities to expose inconsistency, errors, or bad design. History has shown that it is useful to formalize a language's semantics before the language goes into widespread use and becomes hard to change. After the Java programming language was released, several formal semantics of aspects of the language were written. Some of these semantics revealed errors in the type system, which in turn could lead to security holes in browsers that run Java programs. In some cases, revisions were then made to the Java release to plug the holes that were discovered .
XQuery has already benefited from its formal semantics. During the design of XQuery, the working group chose to make the type system depend more closely upon the type names assigned to elements and attributes by XML Schema validation. This change was written in prose in the XQuery language document and later written in the formal semantics. The process of formalization revealed ten issues that had not been addressed by the prose description. For language implementers, a formal description provides more details than a prose description alone, and those details are necessary to develop complete, correct, and conforming implementations. The formal semantics contains the only complete description of the XQuery static type system, so implementers of that feature depend heavily on the formal semantics. Several early implementations of XQuery are actually modeled after the formal semantics. Our own implementation of XQuery, Galax [GALAX], corresponds so closely to the formal semantics that the formal-semantics rules that define an expression and the code that implements those rules often resemble each other. One reason that XQuery is such an interesting language to us is that implementers are pursuing many alternative implementation strategies. For example, some relational database vendors implement XQuery within relational query engines extended with XML-aware operators, and some data-integration vendors focus on the difficult problem of decomposing XQuery queries into queries that can be implemented by legacy database or business-process systems. We refer you to Chapters 6 “8, which investigate the relationship between XQuery and various evaluation environments. These alternatives make the formal semantics even more valuable to implementers, as it codifies XQuery's semantics independently of any particular implementation strategy. This chapter is divided into two main sections: "Getting Started with the Formal Semantics" and "Learning More about XQuery." The first section introduces the key techniques used in the formal semantics and describes the formal rules and notations used by each technique. We illustrate the techniques on a tiny subset of XQuery ”the focus here is on learning how to read and use the formal rules rather than learning the details of the language. The second section applies the techniques introduced in the first section to XQuery itself ”the focus here is on specific XQuery expressions and how the formal rules clarify the expression's semantics and enhance the prose description. |