Validation Mode

XML is known as a semi-structured data format because it accommodates both valid and well-formed data. Valid data corresponds to a given schema (or DTD). Well-formed data is syntactically correct (all open tags have matching end tags), but is otherwise unconstrained.

XML Schema provides three modes of validation, to allow mixing of valid and well-formed data.

strict requires that a declaration be available for the element, and the element must validate with respect to that declaration. The element is labeled with the declared type, as are all elements and attributes within the element.
skip has no constraints; the element must simply be well formed. The element and all elements within it are labeled with type xs:anyType, and all attributes within it are labeled with type xs:anySimpleType.
lax behaves as strict if there is a declaration available; otherwise, it behaves as skip.
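
As a minimal sketch of how the three modes differ, consider the same constructor wrapped in each mode. The fragment uses the draft validate syntax of the listings in this chapter and relies on the fact that the schema declares no bargain element. The three expressions are alternatives, not one query, and the element content is made up for illustration.

```xquery
(: Alternative 1: strict. No declaration exists for bargain,
   so this expression raises an error. :)
validate strict { <bargain>Half price</bargain> }

(: Alternative 2: skip. The element only needs to be well formed;
   it is labeled xs:anyType. :)
validate skip { <bargain>Half price</bargain> }

(: Alternative 3: lax. No declaration is available, so this behaves
   as skip. If the schema did declare bargain, it would behave as strict. :)
validate lax { <bargain>Half price</bargain> }
```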

In addition to keeping track of the types of variables and the validation context, the static environment also contains a validation mode, which is one of strict, skip, or lax. The initial validation mode may be specified by including a default declaration in the prolog. Our running example in the corrected query of Listing 4.6 includes the following declaration (see line [2a]), so all element constructors in the query perform strict validation:

 default validation strict

As with the validation context, static types could not be determined if the mode used to validate an element constructor were not known statically. The validation mode is therefore part of the static environment: it is determined by lexical scope in the query, not by the mode the schema would apply at that point when validating the final document. When the two differ, the mismatch can be fixed by adding a mode to a validate expression. For example, recall that the description element defined in Listing 4.2 contains attribute and element wildcards processed with skip validation. Previously, we wrote a function bargain that takes an article that has failed to sell and yields a new article with new dates and a reduced reserve price. Listing 4.13 shows a modified function that also inserts a bargain element inside the description element.

Listing 4.13 Validation Mode Used to Skip Validate Element Content

 define function bargain($a as element(article)) as element(article) {
   <article>{
     $a/name,
     $a/seller,
     <start_date>{ fn:current-date() }</start_date>,
     <end_date>{ fn:current-date() + ($a/end_date - $a/start_date) }</end_date>,
     <reserve_price>{ $a/reserve_price * 0.80 }</reserve_price>,
     <description>{
       (: copy the existing attributes, child elements, and text :)
       $a/description/(@* | * | text()),
       (: bargain is not declared in the schema, so validate it in skip mode :)
       validate skip { <bargain>Marked down from { $a/reserve_price }</bargain> }
     }</description>
   }</article>
 }

Here a validate expression sets skip validation mode for the bargain element. Without it, the bargain constructor would raise an error. At static-analysis time, it would fail because strict mode requires the element to be declared, and the given schema does not define a bargain element. If static analysis were turned off, it would fail at evaluation time for the same reason.

Strict validation mode is not appropriate for literal XML that contains content the schema does not validate strictly. It is possible to include skip- or lax-validated data by wrapping the element constructor in two validate expressions. For example, Listing 4.14 shows a call to the function of Listing 4.13 that takes literal data as its argument:

Listing 4.14 Call to the Function of Listing 4.13 That Takes Literal Data as Its Argument

 bargain(
   validate { validate skip {
     <article id="1001">
       <name>Red Bicycle</name>
       <seller idref="U01"/>
       <start_date>1999-01-05</start_date>
       <end_date>1999-01-20</end_date>
       <reserve_price>40</reserve_price>
       <description>
         A <brand>Schwinn</brand> <make>String-ray</make> with
         <accessory>banana seat</accessory>.
       </description>
     </article>
   } }
 )

Here, the inner validate expression sets the validation mode to skip. This is necessary because the validation performed by the construction of the brand, make, and accessory elements would fail in a strict validation mode. The outer validate revalidates the XML literal data in the strict mode specified by the static environment. This is necessary because the article element must be labeled as correctly validated in order to match the element type specified by the function signature. Because validation proceeds top-down at evaluation time, the outer validation above uses skip mode for the contents of the description element, as specified by the schema, and so the brand, make, and accessory elements cause no problem.
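
To see why both validate expressions are needed, it may help to look at a variant that keeps only the inner one. The fragment below is a sketch in the same draft syntax as Listing 4.14, with a shortened description. Under the rules described above, we would expect it to be rejected: skip validation labels the article element with xs:anyType, which does not match the element(article) type required by the signature of bargain.

```xquery
(: Sketch only: the outer validate of Listing 4.14 has been removed. :)
bargain(
  validate skip {
    <article id="1001">
      <name>Red Bicycle</name>
      <seller idref="U01"/>
      <start_date>1999-01-05</start_date>
      <end_date>1999-01-20</end_date>
      <reserve_price>40</reserve_price>
      <description>
        A <brand>Schwinn</brand> with <accessory>banana seat</accessory>.
      </description>
    </article>
  }
)
(: Expected result: a type error, because the argument is labeled
   xs:anyType rather than with the declared type of article. :)
```

Dropping the inner validate instead fails for the other reason given above: in strict mode, constructing the undeclared brand and accessory elements raises an error.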

These examples show that when the validation mode of the query matches the validation mode of a constructed element, which we expect to be the common case, no extra work is needed to avoid static type errors. When the two modes do not match, some care is required. This extra work is a useful "red flag" that reminds users they are mixing valid and well-formed data.