Validating Individual Elements | NetBeans™ IDE Field Guide: Developing Desktop, Web, Enterprise, and Mobile Applications (2nd Edition)

Rather than applying validation at the document level, it is possible to invoke validation of specific elements as they are constructed. This can be useful in a number of circumstances:

Sometimes you do not have a schema definition for the result document as a whole, but you do have schema definitions for individual elements within it. For example, you may be creating a data file in which the contents of some elements are expected to be valid XHTML.
If you are running a transformation whose purpose is to extract parts of the source document, then you may actually know that the result document as a whole is deliberately invalid: the schema for source documents may require the presence of elements or attributes that you want to exclude from the result document, perhaps for privacy or security reasons. The fact that the result document as a whole has no schema should not stop you from validating those parts that do have one.
You may be creating elements in temporary working storage (that is, in variables) that are to be copied or processed before incorporating them into a final result document. It can be useful to validate these elements in their own right, to make sure that they have the type annotations that enable them to be used as input to functions and templates that will only work on elements of a particular type.

The usual way of creating a new element node in XSLT is either by using a literal result element, or by using the <xsl:element> instruction.

The <xsl:element> instruction has attributes validation and type, which work in a very similar way to the corresponding attributes of <xsl:result-document> and <xsl:document>; however, in this case it is only element-level validation that is invoked, not document-level validation.

The same facilities are available with literal result elements. In this case, however, the attributes are named xsl:validation and xsl:type. This is to avoid any possible conflict with attributes that you want copied to the result document as attributes of the element you are creating.

For example, suppose you want to validate an address. If there is a global element declaration with the name address, you might write:

  <address xsl:validation="strict">
    <number>39</number>
    <street>Lombard Street</street>
    <city>London</city>
    <postcode>EC1 3CX</postcode>
  </address>

If this matches the schema definition of the element declaration for address, this will succeed, and the resulting element will be annotated as an address, or more strictly, as an instance of the type associated with the address element, which might be either a named type in the schema or an anonymous type. In addition, the child elements will also have type annotations based on the way they are defined in the schema; for example, the <number> element might (perhaps) be annotated as type xs:integer. If validation fails, the whole transformation is aborted.
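To place the fragment above in context, here is a minimal sketch of a complete stylesheet. It assumes a schema-aware XSLT 2.0 processor and a hypothetical no-namespace schema file, address.xsd, containing a global element declaration for address. The file name and the choice of «match="/"» as the entry point are illustrative assumptions, not taken from the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: address.xsd is a hypothetical schema with no target
       namespace that declares <address> as a global element. -->
  <xsl:import-schema schema-location="address.xsd"/>

  <xsl:template match="/">
    <!-- Element-level strict validation against the global declaration. -->
    <address xsl:validation="strict">
      <number>39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </address>
  </xsl:template>

</xsl:stylesheet>
```

Without the <xsl:import-schema> declaration the processor would have no element declaration to validate against, so strict validation could not succeed.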

What if there is no global element declaration for the <address> element (typically because it is defined in the schema as a local element declaration within some larger element)? You can still request validation if the element is defined in the schema to have a named type. For example, if the element is declared as:

  <xs:element name="address" type="address-type"/>

then you can cause it to be validated by writing:

  <address xsl:type="address-type">
    <number>39</number>
    <street>Lombard Street</street>
    <city>London</city>
    <postcode>EC1 3CX</postcode>
  </address>
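A schema of the shape described above might look like the following sketch. Only the name address-type and the local declaration of address come from the text; the enclosing customer element and the types chosen for the four children are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- A named, top-level type: this is what xsl:type="address-type" refers to. -->
  <xs:complexType name="address-type">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="postcode" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical enclosing element. The declaration of address is local
       to it, so there is no global element declaration named address. -->
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="address" type="address-type"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With a schema like this, «xsl:validation="strict"» on <address> has no global declaration to work from, whereas «xsl:type="address-type"» can still name the type directly.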

If neither a top-level element declaration nor a top-level type definition is available, you can't invoke validation at this level. The only thing you can do is either (a) change the schema so that some of the elements and/or types are promoted to be globally defined, or (b) invoke validation at a higher level, where a global element declaration or type definition is available.

You don't need to invoke validation at more than one level, and it may be inefficient to do so. Asking for validation of <address> in the above example will automatically invoke validation of its child elements. If you also invoked validation of the child elements by writing, say:

  <address xsl:type="address-type">
    <number xsl:type="xs:integer">39</number>
    ...

then it's possible that the system would do the validation twice over. If you're lucky the optimizer will spot that this is unnecessary, but you could be incurring extra costs for no good reason.

If you ask for validation of a child element, but don't validate its parent element, then the child element will be checked for correctness, but the type annotation will probably not survive the process of tree construction. For example, suppose you write the following:

  <address>
    <number xsl:type="xs:integer">39</number>
    <street>Lombard Street</street>
    <city>London</city>
    <postcode>EC1 3CX</postcode>
  </address>

Specifying the xsl:type attribute on the <number> element causes the system to check that the value of the element is a valid integer, and to construct an element that is annotated as an integer. The result of evaluating the sequence constructor contained in the <address> element is thus a sequence of four elements, of which the first has a type annotation of xs:integer. Evaluating the literal result element <address> creates a new <address> element, and forms children of this element from the result of evaluating the contained sequence constructor. The formal model is that the elements in this sequence are copied to form these children.

The xsl:validation attribute on the <address> element determines what happens to the type annotations on these child elements. If the value is strict or lax, the type annotations on the child elements are ignored, and the type annotation in the final tree depends only on the result of validation of <address> against the schema. If the value is preserve, the type annotation on the child element is preserved, and if the value is strip, then the type annotation on the child element is replaced by xs:untyped.
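As a sketch of the preserve case just described, the stylesheet below validates only the <number> child and asks the parent to keep whatever annotations its children already carry. It assumes a schema-aware XSLT 2.0 processor; the «match="/"» entry point and the use of exclude-result-prefixes are illustrative choices rather than anything prescribed by the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template match="/">
    <!-- preserve: annotations on the copied children are kept as they are;
         the address element itself is not validated. -->
    <address xsl:validation="preserve">
      <!-- Checked against xs:integer and annotated accordingly. -->
      <number xsl:type="xs:integer">39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </address>
  </xsl:template>

</xsl:stylesheet>
```

Changing the attribute to «xsl:validation="strip"», or omitting it when the default is strip, would still check that 39 is a valid integer but would discard the resulting annotation when <number> is copied into <address>.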

The type of an element never depends on the types of the items used to form its children. For example, suppose that the variable $i holds an integer value. Then you might suppose that the construct:

  <xsl:element name="x">
    <xsl:sequence select="$i"/>
  </xsl:element>

would create an element whose type annotation is xs:integer. It doesn't: the type annotation will be xs:untyped. Atomic values in the sequence produced by evaluating the sequence constructor are always (at least conceptually) converted to strings, and any type annotation in the new element is obtained by validating the resulting string values against the desired type.

This might not seem a very satisfactory design: why discard the type information? The working groups agonized over this question for months. The problem is that there are some cases, like this one, where retaining the type annotation obviously makes sense; there are many other cases, such as a sequence involving mixed content, where it obviously doesn't make sense; and there are further cases, such as a sequence containing a mixture of integers and dates, where it could make sense, but the definition would be very difficult. Because the working group found it difficult to devise a clear rule that separated the simple cases from the difficult or impossible ones, they eventually decided on this rather blunt rule: everything is reduced to a string before constructing the new node and validating it.
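The rule can be seen by constructing the same element with and without a type attribute. The following is a minimal sketch that assumes a schema-aware XSLT 2.0 processor and no default-validation setting; the variable names untyped-x and typed-x and the <report> wrapper are hypothetical. The variables are declared with «as="element()"» so that each holds the constructed element itself rather than a copy attached to a temporary document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template match="/">
    <xsl:variable name="i" as="xs:integer" select="39"/>

    <!-- No type attribute: the integer is reduced to the string "39"
         and the new element is not annotated as xs:integer. -->
    <xsl:variable name="untyped-x" as="element()">
      <xsl:element name="x">
        <xsl:sequence select="$i"/>
      </xsl:element>
    </xsl:variable>

    <!-- type attribute: the string "39" is validated against xs:integer,
         and the annotation comes from that validation. -->
    <xsl:variable name="typed-x" as="element()">
      <xsl:element name="x" type="xs:integer">
        <xsl:sequence select="$i"/>
      </xsl:element>
    </xsl:variable>

    <report
        without-type="{$untyped-x instance of element(*, xs:integer)}"
        with-type="{$typed-x instance of element(*, xs:integer)}"/>
  </xsl:template>

</xsl:stylesheet>
```

Under the rule described above, the first test should be false and the second true: the annotation on the typed element results from validating the string, not from the type of $i.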

When there is no xsl:validation or xsl:type attribute on a literal result element, the default value is taken from the default-validation attribute on the containing <xsl:stylesheet> element; and if this attribute isn't specified either, the default is taken as strip. So in this example, assuming there is no default-validation attribute, all the elements in the resulting tree will be annotated as xs:untyped.
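A minimal sketch of changing that default follows. It assumes a schema-aware XSLT 2.0 processor; the surrounding template is illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- default-validation applies to every element-constructing instruction
     and literal result element in this module that does not carry its own
     validation or type attribute. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    default-validation="preserve">

  <xsl:template match="/">
    <!-- No xsl:validation here, so the module default (preserve) is used
         and the annotation on <number> is kept. -->
    <address>
      <number xsl:type="xs:integer">39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </address>
  </xsl:template>

</xsl:stylesheet>
```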

Note that when you use the xsl:type attribute to validate an element, the actual element name can be anything you like. There is no requirement that it should be an element name declared in the schema. It can even be an element name that is declared in the schema, but with a different type (though I can't see any justification for doing something quite so confusing, unless the types are closely related).

All the same considerations apply when creating a new element using the <xsl:element> or <xsl:copy> instruction rather than a literal result element. The only difference is that the attributes are now called validation and type instead of xsl:validation and xsl:type.
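For comparison, here is a sketch of the earlier address example written with <xsl:element>. It assumes a schema-aware XSLT 2.0 processor and the same hypothetical address.xsd file, which is taken to define a top-level type named address-type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: hypothetical schema defining the type address-type. -->
  <xsl:import-schema schema-location="address.xsd"/>

  <xsl:template match="/">
    <!-- The attribute is "type", not "xsl:type", on an XSLT instruction. -->
    <xsl:element name="address" type="address-type">
      <number>39</number>
      <street>Lombard Street</street>
      <city>London</city>
      <postcode>EC1 3CX</postcode>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```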

The value of the type or xsl:type attribute is always a QName, and this must always be the name of a top-level complex type or simple type defined in an imported schema. This isn't the same as the «as» attribute used in declaring the type of variables or functions. Note the following differences:

If the «as» attribute is a QName, the QName must identify an atomic type. The «type» attribute is always a QName, and this may be any type defined in an imported schema: complex types are allowed, as are all three kinds of simple type (list types, union types, and atomic types).
The «as» attribute can include an occurrence indicator («?», «*», or «+»). The «type» attribute never includes an occurrence indicator.
The «as» attribute may define node kinds, for example «node()», «element()», or «comment()». Such constructs are never used in the «type» attribute.

This means that to create an element holding a sequence of IDREF values, you write:

  <xsl:element name="ref" type="xs:IDREFS">
    <xsl:sequence select="'id001 id002 id003'"/>
  </xsl:element>

whereas to declare a variable holding the same sequence, you write:

  <xsl:variable name="ref" as="xs:IDREF*"
      select="xs:IDREF('id001'), xs:IDREF('id002'), xs:IDREF('id003')"/>

In the case of <xsl:copy>, note that the option «validation="preserve"» applies to the children (and attributes) of the copied element, but not to the copied element itself. This instruction does a shallow copy, so in general the content of the new element will be completely different from the content of the old one. It doesn't make sense to keep the type annotation intact if the content is changing, because this could result in the type annotation becoming inconsistent with the actual content.
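The following sketch shows a shallow copy whose content differs from the original. It assumes a schema-aware XSLT 2.0 processor and a validated source document containing <address> elements; dropping <postcode> is a hypothetical example of deliberately changing the content.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="address">
    <!-- Shallow copy: a new address element is built, and preserve governs
         the annotations of the children and attributes added to it. -->
    <xsl:copy validation="preserve">
      <xsl:copy-of select="@*" validation="preserve"/>
      <!-- Deep-copy every child except postcode, keeping annotations. -->
      <xsl:copy-of select="* except postcode" validation="preserve"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the new <address> no longer has a <postcode>, it may not conform to the original type, which is why the annotation of the copied element itself is not carried over.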

By contrast, the <xsl:copy-of> instruction does a deep copy. Since the content remains unchanged, it's safe to keep the type annotation unchanged, and the option «validation="preserve"» is useful in achieving this.

When you request validation at the element level, the system does not perform any document-level integrity checks. That is, it does not check that ID values are unique, or that IDREF values don't point into thin air, and it does not check constraints defined by <xs:unique>, <xs:key>, and <xs:keyref> definitions in the schema. To invoke this level of validation, you have to do it at the document level.
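A minimal sketch of requesting validation at the document level follows, using the validation attribute of <xsl:result-document> mentioned earlier. It assumes a schema-aware XSLT 2.0 processor; the schema file address.xsd (with a global declaration of address) and the output file name are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: hypothetical schema with a global declaration of address. -->
  <xsl:import-schema schema-location="address.xsd"/>

  <xsl:template match="/">
    <!-- Validation is requested on the document node, so document-level
         integrity checks are applied as well as element validation. -->
    <xsl:result-document href="address-out.xml" validation="strict">
      <address>
        <number>39</number>
        <street>Lombard Street</street>
        <city>London</city>
        <postcode>EC1 3CX</postcode>
      </address>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```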