Validating the Source Document


Validation is the process of taking a raw XML document and processing it using an XML Schema. The most obvious output of this process is a success or failure rating: the document is either valid or invalid against the schema. But this is not the only output. Validation also annotates the document, marking each element and attribute node with a label indicating its type. For example, if validation checks that a <shippingAddress> element is valid according to the «us-postal-address» type in the schema, then this element will be annotated as having the type «us-postal-address». There are various ways the stylesheet can then use this information:

  • Many operations on nodes extract the typed value of the nodes. This process is called atomization, and it is sensitive to the type annotations on the nodes. For example, when you compare two attributes using an expression such as «@discount gt $customer/@max-discount», the values that are compared are the typed values of @discount and @max-discount respectively. If the schema defines these values to be numbers (for example, using the type xs:decimal), then they will be compared numerically, so the value «10.00» will be considered greater than the value «2.50». If the same values were compared as strings, «10.00» would be less than «2.50». Adding type annotations to nodes, through the process of schema validation, enables operations on the values of the nodes to be performed more intelligently.

  • There are many operations that only make sense when applied to a particular kind of data. At the top level, the stylesheet as a whole might be designed to process purchase orders, and will produce garbage if you make the mistake of feeding it with input that's actually a delivery note. At a more fine-grained level, you might have a stylesheet function or template rule that's designed to process US postal addresses, and that won't work properly if you give it a phone number instead. XSLT 2.0 allows you to define the type of data that you expect your functions and template rules to process, and to define the type of result that they produce as their output. A schema-aware processor will then automatically check, whenever the function or template is actually called, that the data is of the right type, and will report an error if it isn't.

    At times these errors can become frustrating. But remember, every time you get one of these error messages, it tells you about a programming mistake that might otherwise have been much harder to track down. With XSLT 1.0, most programming mistakes don't give you an error message, they simply give you wrong output, and it can be a tortuous process debugging the stylesheet to find out where you went wrong. With XSLT 2.0, if you choose to define data types for your stylesheet functions and templates, you can get error messages that make it much clearer where the mistake lies.
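
Both points can be illustrated without a schema, using only the built-in types. The following is a minimal sketch rather than a definitive implementation: the function name f:exceeds-limit, its namespace, and the template name main are invented for this illustration, and the stylesheet is assumed to be run with main as the initial template.

  <xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      xmlns:f="http://example.com/functions">

    <xsl:output method="text"/>

    <!-- Hypothetical function: the "as" attributes declare the types
         expected for the parameters and for the result -->
    <xsl:function name="f:exceeds-limit" as="xs:boolean">
      <xsl:param name="discount" as="xs:decimal"/>
      <xsl:param name="max-discount" as="xs:decimal"/>
      <xsl:sequence select="$discount gt $max-discount"/>
    </xsl:function>

    <xsl:template name="main">
      <!-- Compared as strings: false, because "1" sorts before "2" -->
      <xsl:value-of select="'10.00' gt '2.50'"/>
      <xsl:text>&#10;</xsl:text>
      <!-- Compared as decimals: true -->
      <xsl:value-of select="xs:decimal('10.00') gt xs:decimal('2.50')"/>
      <xsl:text>&#10;</xsl:text>
      <!-- The arguments are decimal literals, so they match the declared type: true -->
      <xsl:value-of select="f:exceeds-limit(10.00, 2.50)"/>
      <!-- f:exceeds-limit('10.00', '2.50') would be rejected with a type error,
           because an xs:string is not accepted where xs:decimal is required -->
    </xsl:template>

  </xsl:stylesheet>

If the function were called as f:exceeds-limit(@discount, $customer/@max-discount), attributes annotated as xs:decimal by validation would be accepted as they stand, and untyped attributes would be converted to xs:decimal when the function is called. Attributes that the schema declares as strings would cause a type error.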

You don't request validation of the input document from within the stylesheet. It's assumed that you will request this as part of the way you invoke the transformation, and details of how you do this will vary from one XSLT processor to another. (With Saxon, for example, you can use the -val option on the command line.) What you can do is test in your stylesheet whether the input has actually been validated. For example, you can write the first template rule in the stylesheet as follows:

  <xsl:template match="/">
    <xsl:if test="not(* instance of schema-element(purchase-order))">
      <xsl:message terminate="yes">
        Source document is not a validated purchase order
      </xsl:message>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>

The effect of writing the template rule this way is that if the stylesheet is presented with a document that is not a validated purchase order, it will immediately fail and display an error message, rather than trying to process it and producing garbage output.

Note the carefully chosen phrase "a validated purchase order". It's not enough to supply an XML document that would be deemed valid if you tried to validate it. To pass this test, the document must already have been through a schema processor, and must have passed validation.
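
To make the distinction visible, the test can be split in two. This is only a sketch: it assumes a schema-aware processor, and it assumes that a schema declaring the purchase-order element is available to the stylesheet. The test «element(purchase-order)» matches on the element name alone, whatever the type annotation, while «schema-element(purchase-order)» also requires the annotation that validation produces.

  <xsl:template match="/">
    <xsl:choose>
      <!-- Right name, and annotated by validation against the schema -->
      <xsl:when test="* instance of schema-element(purchase-order)">
        <xsl:apply-templates/>
      </xsl:when>
      <!-- Right name, but no suitable type annotation -->
      <xsl:when test="* instance of element(purchase-order)">
        <xsl:message terminate="yes">
          Source document is a purchase order, but it has not been validated
        </xsl:message>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">
          Source document is not a purchase order
        </xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>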

If you prefer, you could code the stylesheet to invoke the validation explicitly, by writing the following:

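  <xsl:template match="/">
    <xsl:variable name="input">
      <!-- Copy the document element, validating it against the named type -->
      <xsl:copy-of select="*" type="purchase-order-type"/>
    </xsl:variable>
    <!-- Process the copied element; selecting the document node $input itself
         would match this same rule again -->
    <xsl:apply-templates select="$input/*"/>
  </xsl:template>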

This defines a variable to hold a copy of the input document. The «type» attribute on the <xsl:copy-of> instruction asks the XSLT processor to invoke schema validation on the document, and if this succeeds, the element and attribute nodes in the new copy will have type annotations reflecting the result of this process. There is no explicit logic here to test whether validation has succeeded. It isn't needed, because a validation failure will always cause the transformation to be aborted with an error message.

However, I wouldn't normally recommend this approach. Creating a copy of the input document is likely to be expensive. It's better to do the validation on the fly while the input document is being parsed in the first place.

The value of the «type» attribute in this example, like the type named in the «instance of» expression in the previous example, is a type that's defined in a schema. We'll see later how the XSLT processor locates a schema containing this type definition.
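
As a preview, one common arrangement is for the stylesheet to import the schema itself using an <xsl:import-schema> declaration. The sketch below rests on assumptions that are not part of the example above: the file name purchase-order.xsd is hypothetical, the schema is assumed to have no target namespace, and a schema-aware processor is required.

  <xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Hypothetical location; this schema is assumed to define purchase-order-type -->
    <xsl:import-schema schema-location="purchase-order.xsd"/>

    <xsl:template match="/">
      <xsl:variable name="input">
        <xsl:copy-of select="*" type="purchase-order-type"/>
      </xsl:variable>
      <!-- Process the validated copy of the document element -->
      <xsl:apply-templates select="$input/*"/>
    </xsl:template>

  </xsl:stylesheet>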




XSLT 2.0 Programmer's Reference