Validating the Result Document | NetBeans™ IDE Field Guide: Developing Desktop, Web, Enterprise, and Mobile Applications (2nd Edition)

You can also request validation of the output of the transformation. For example, if you have written the stylesheet to generate XHTML, you can ask for it to be validated by writing your first template rule as follows:

  <xsl:template match="/">
    <xsl:result-document validation="strict">
      <xsl:apply-templates/>
    </xsl:result-document>
  </xsl:template>

In this example, nothing says what the expected type of the output is. What «validation="strict"» means is that the outermost element of the result document (for example, <xhtml:html>) must correspond to an element declaration that's present in some schema known to the system, and the system is then required to check that the contents of the element conform to the rules defined in that element declaration.

You could argue that validating the output from within the stylesheet is no different from running the transformation and then putting the output through a schema processor to check that it's valid. However, once you try developing a stylesheet this way, you will find that the experience is very different. If you put the output file through a free-standing schema processor once the transformation is complete, the schema processor will give you error messages in terms of a position within the result document. You will then have to open the result document in a text editor, find out what's wrong with it, find the instruction in the stylesheet that generated the incorrect output, and then correct the stylesheet. Working with a schema processor that's integrated into your XSLT processor is much more direct: In most cases the error message will tell you directly which instruction in the stylesheet needs to be changed. This makes for a much more rapid development cycle.

In principle there is another advantage: in many cases it should be possible for a schema-aware XSLT processor to tell you that the output will be invalid before you even try running the stylesheet against a source document. That is, it should be able to report some of your errors at compile time. This gives you an even quicker turnaround in fixing errors, and more importantly, it means that the ability to detect bugs in your code is less dependent on the completeness of your test suite. Stylesheet programming is often done without much regard to the traditional disciplines of software engineering, and testing tends to be less than thorough. So anything that reduces the risk of failures once the stylesheet is in live use is to be welcomed.

At the time of writing this chapter, the only schema-aware XSLT processor available is Saxon 8.0. It doesn't do any compile-time checking of the stylesheet against the schema, other than checking that the type names used in the stylesheet are actually declared in the schema. Fuller compile-time checking is bound to appear as the technology matures.

Validation of a result document can be controlled using either the validation attribute or the type attribute of the <xsl:result-document> element. You can use only one of these: they can't be mixed. The validation attribute allows four values, whose meanings are explained in the table below.

Attribute value	Meaning
strict	The result document is subjected to strict validation. This means that there must be an element declaration for the outermost element of the result document in some schema, and the structure of the result document must conform to that element declaration
lax	The result document is subjected to lax validation. This means that the outermost element is validated against a schema if a declaration for that element name can be located; if not, the system assumes the existence of an element declaration that allows any content for that element. The children of the element are also subjected to lax validation, and so on recursively. So any elements in the tree that are declared in a schema must conform to their declaration, but for other elements, there are no constraints
preserve	This option means that no validation is applied at the document level, but if any elements or attributes within the result tree have been constructed using node-level validation (as described in the next section), then the type annotations resulting from that node-level validation will be preserved in the result tree. These node annotations are only relevant, of course, if the result tree is passed to another process that understands them. If the result tree is simply serialized, it makes no difference whether type annotations are preserved or not
strip	This option means that no validation is applied at the document level, and moreover, if any elements or attributes within the result tree have been constructed using node-level validation (as described in the next section), then the type annotations resulting from that node-level validation will be removed from the result tree. Instead, all elements will be given a type annotation of xdt:untyped, and attributes will have the type annotation xdt:untypedAtomic
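
As an illustration of the table above, here is a minimal sketch of a stylesheet that requests lax validation. The output file name report.xml and the <report> element are hypothetical, chosen only for the example. Because no schema is imported, no declaration for <report> can be located, so under the rules for lax validation the element is accepted with any content.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- href and the report element are hypothetical names for this sketch -->
    <xsl:result-document href="report.xml" validation="lax">
      <report>
        <!-- Any descendant elements that ARE declared in an imported
             schema would still have to conform to their declarations -->
        <xsl:apply-templates/>
      </report>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

Changing the attribute to «validation="strict"» in this sketch would be expected to cause a validation failure, since there is no element declaration for <report> in any schema known to the stylesheet.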

The other way of requesting validation of the result tree is through the type attribute. If the type attribute is specified, its value must be a QName, which must match the name of a global type definition in an imported schema. In practice this will almost invariably be a complex type definition. The rules to pass validation are as follows:

The result tree must be a well-formed document: that is, it must contain exactly one element node, and no text nodes, among the children of the document node. (In the absence of validation, this rule can be relaxed. For example, it is possible to have a temporary tree in which the document node has three element nodes as its children.)
The document element (that is, the single element node child of the document node) must validate against the schema definition of the specified type, according to the rules defined in XML Schema.
The document must satisfy document-level integrity constraints defined in the schema. This means:
- Elements and attributes of type xs:ID must have unique values within the document.
- Elements and attributes of type xs:IDREF or xs:IDREFS must contain valid references to xs:ID values within the document.
- Any constraints defined by <xs:unique>, <xs:key>, and <xs:keyref> declarations in the schema must be satisfied.

Document-level validation rules must also be satisfied when validation is requested using the option «validation="strict"» or «validation="lax"».
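
A request for validation against a named type might look like the following sketch. The namespace http://example.com/ns/po, the schema file po.xsd, and the complex type po:OrderType are all assumptions made for the example; they stand for whatever global type definition your own imported schema provides.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:po="http://example.com/ns/po">

  <!-- Hypothetical schema, assumed to define a global complex type
       named OrderType in the po namespace -->
  <xsl:import-schema namespace="http://example.com/ns/po"
                     schema-location="po.xsd"/>

  <xsl:template match="/">
    <!-- type and validation cannot both be specified -->
    <xsl:result-document type="po:OrderType">
      <!-- The single document element; its content is checked
           against po:OrderType -->
      <po:order>
        <xsl:apply-templates/>
      </po:order>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

Note that with the type attribute it is the type, not the element name, that drives validation, so the document element in this sketch does not itself need a global element declaration in the schema.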

The language of the XSLT specification that describes these rules is somewhat tortuous. This is because XSLT tries to define what validation means in terms of the rules in the XML Schema specification, which means that there is a need to establish a precise correspondence between the terminologies of the two specifications. This is made more difficult by the fact that XML Schema is not defined in terms of the XSLT/XPath data model, but rather in terms of the XML Infoset and the Post Schema Validation Infoset (PSVI), which is defined in the XML Schema specification itself. To make this work, the XSLT specification says that when a document is validated, it is first serialized, and then re-parsed to create an Infoset. The Infoset is then validated, as defined in XML Schema, to create a PSVI. Finally, this PSVI is converted to a document in the XPath data model using rules defined in the XPath data model specification. However, you should regard this description of the process as a purely formal device to ensure that no ambiguities are introduced between the different specifications. In practice, an XSLT processor is likely to have a fairly intimate interface with a schema processor, and both are likely to share the same internal data structures.

If you request validation by specifying «validation="strict"» or «validation="lax"», this raises the question of where the XSLT processor should look to find a schema that contains a suitable element declaration. The specification leaves this slightly open. The first place a processor will look is among the schemas that were imported into the stylesheet using <xsl:import-schema> declarations: see Importing Schemas on page 168. The processor is also allowed (but not required) to use any xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes that are present in the result document itself as hints indicating where to locate a suitable schema. My advice, however, would be to make sure that the required schemas are explicitly imported.
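
Following that advice, a stylesheet generating XHTML might import the schema explicitly, as in the sketch below. The file name xhtml1-strict.xsd is an assumption: it stands for a local copy of a schema for the XHTML namespace, which you would need to supply yourself.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- schema-location is a hypothetical local file name -->
  <xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
                     schema-location="xhtml1-strict.xsd"/>

  <xsl:template match="/">
    <xsl:result-document validation="strict">
      <html>
        <head>
          <title>Report</title>
        </head>
        <body>
          <!-- The template rules invoked here are assumed to produce
               content that the schema allows inside body -->
          <xsl:apply-templates/>
        </body>
      </html>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

With the schema imported in this way, the processor has no need to rely on xsi:schemaLocation hints in the result document to find the declaration of the <html> element.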