xsl:element

The <xsl:element> instruction is used to create an element node and write it to the result sequence. It provides an alternative to using a literal result element, and is useful especially when the element name or namespace is to be calculated at runtime.

Changes in 2.0

Two new attributes, validation and type, are available to control whether and how the new element is validated against a schema.

Format

  <xsl:element
    name = { qname }
    namespace? = { uri-reference }
    use-attribute-sets? = qnames
    validation? = "strict" | "lax" | "preserve" | "strip"
    type? = qname>
    <!-- Content: sequence-constructor -->
  </xsl:element>

Position

<xsl:element> is used as an instruction within a sequence constructor.

Attributes

Name	Value	Meaning
name (mandatory)	Attribute value template returning a lexical QName	The name of the element to be generated
namespace (optional)	Attribute value template returning a URI	The namespace URI of the generated element
use-attribute-sets (optional)	Whitespace-separated list of lexical QNames	List of named attribute sets containing attributes to be added to this output element
validation (optional)	«strict», «lax», «preserve», or «strip»	Indicates whether and how the element should be subjected to schema validation, or whether existing type annotations on attributes and child elements should be retained or removed
type (optional)	Lexical QName	Identifies a type declaration (either a built-in type, or a user-defined type imported from a schema) against which the new element is to be validated

The type and validation attributes are mutually exclusive: If one is present, the other must be absent.

Content

A sequence constructor.

Effect

The effect of this instruction is to create a new element node, and to return this node as the result of the instruction. In some error cases, the results may be different: These situations are described below.

The name of the generated element node is determined using the name and namespace attributes. The way in which these attributes are used is described below in the section The Name of the Element.

The sequence constructor contained in the <xsl:element> instruction, together with the use-attribute-sets attribute, is used to form the content of the new element: that is, its namespaces, attributes, and child nodes. The way this works is described in the section The Content of the Element.

When a schema-aware XSLT processor is used, the new element (and its contained elements and attributes) may be validated to ensure that they conform to a type defined in a schema. This process results in the new element node having a type annotation. The type annotation affects the behavior of subsequent operations on this element node even though it is not visible when the result tree is serialized as raw XML. The validation and annotation of the new element node are controlled using the type and validation attributes. This is described in the section Validating and Annotating the Element.

The XSLT specification is written in terms of writing nodes to a result sequence. Sometimes it is convenient to think in terms of the start tag of the <xsl:element> element producing a start tag in the output XML file and the end tag of the <xsl:element> element producing the corresponding end tag, with the intervening sequence constructor producing the contents of the output element. However, this is a dangerous simplification, because writing the start tag and writing the end tag are not separate operations that can be individually controlled; they are simply two things that happen together as a consequence of the <xsl:element> instruction being evaluated. This is explained in more detail in the section Literal Result Elements in Chapter 3, page 106.

The Name of the Element

The name of an element node has two parts: the local name and the namespace URI. These are controlled using the name and the namespace attributes.

Both the name and the namespace attributes may be given as attribute value templates; that is, they may contain expressions nested within curly braces. One of the main reasons for using the <xsl:element> instruction in preference to a literal result element (described in the section Literal Result Elements in Chapter 3, page 106) is that <xsl:element> allows the name of the element to be decided at runtime, and this is achieved by using attribute value templates in these two attributes.

The result of expanding the name attribute value template must be a lexical QName; that is, a valid XML name with an optional namespace prefix, for example, «table» or «fo:block». If there is a prefix, it must correspond to a namespace declaration that is in scope at this point in the stylesheet, unless there is also a namespace attribute, in which case it is taken as referring to that namespace.

If the name is not a valid QName, the XSLT processor is required either to report the error, or to leave this element node out of the generated tree, while still including its children (but not its attributes). Different processors may thus handle this error differently.

The local part of the name of the created element node will always be the same as the local part of the QName supplied as the value of the name attribute.

If the <xsl:element> instruction has a namespace attribute, it is evaluated (expanding the attribute value template if necessary) to determine the namespace URI part of the name of the created element node:

If the value is a zero-length string, the element will have a null namespace URI.
Otherwise, the value should be a URI identifying a namespace. This namespace does not need to be in scope at this point in the stylesheet; in fact, it usually won't be. The system does not check that the value conforms to any particular URI syntax, so in effect any string can be used.

If there is no namespace attribute:

If the supplied QName includes a prefix, the prefix must be a namespace prefix that is in scope at this point in the stylesheet. In other words, there must be an «xmlns:prefix="uri"» attribute either on the <xsl:element> instruction itself or on some containing element. The namespace URI in the output will be that of the namespace associated with this prefix in the stylesheet.
Otherwise, the default namespace is used. This is the namespace declared, in some containing element in the stylesheet, with an «xmlns="uri"» declaration. If there is no default namespace declaration in scope, then the element will have a null namespace URI. Note that this is one of the few places in XSLT where the default namespace is used to expand a QName having no prefix; in nearly all other cases, a null namespace URI is used. The reason is to ensure that the behavior is consistent with that of an element name used in the start tag of a literal result element.
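
The rules above can be illustrated with a minimal sketch. The namespace URIs under example.com and the parameters $local and $uri are hypothetical, chosen only to show each case; the «fo» namespace is simply a convenient declared prefix.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns="http://example.com/default">

  <!-- Hypothetical parameters: the caller decides the name at runtime -->
  <xsl:param name="local" select="'item'"/>
  <xsl:param name="uri" select="'http://example.com/computed'"/>

  <xsl:template match="/">
    <!-- No prefix, no namespace attribute: the stylesheet's default
         namespace (http://example.com/default) is used -->
    <xsl:element name="wrapper">
      <!-- Prefix, no namespace attribute: resolved against the
           xmlns:fo declaration in scope in the stylesheet -->
      <xsl:element name="fo:block"/>
      <!-- Both parts of the name computed at runtime -->
      <xsl:element name="{$local}" namespace="{$uri}"/>
      <!-- Zero-length namespace attribute: null namespace URI -->
      <xsl:element name="plain" namespace=""/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The prefixes that appear in the serialized output for the last two elements are decided by namespace fixup, so they may vary between processors.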

Element nodes, according to the formal data model, do not contain a namespace prefix, only a namespace URI and a local name. There are two situations in which a prefix needs to be generated for the element node: firstly, when the name() function is called, and secondly, when the tree is serialized into textual XML. The XSLT specification leaves the choice of a prefix to some degree up to the implementation; however, the choice is constrained by the process known as namespace fixup, described below on page 265.

The Content of the Element

The attributes, namespaces, and child nodes of the new element node are constructed in what is conceptually a three-stage process, though in practice most implementations are likely to collapse the three stages into one.

The first stage is to evaluate the sequence constructor contained in the <xsl:element> instruction. The sequence constructor is a sequence of instructions, and as its name implies, the result of evaluating these instructions is a sequence of items. Usually these values will all be newly constructed nodes but the sequence might also contain atomic values and/or references to existing nodes.

The way that the instructions in the sequence constructor are evaluated is described in the rules for each instruction; the items produced by each instruction are concatenated together (in the order in which the instructions appear in the stylesheet) to produce the final result sequence.

The instructions in a sequence constructor can be evaluated in any order, or in parallel, but their results must be assembled in the correct order on completion.

If the use-attribute-sets attribute is present it must be a whitespace-separated list of lexical QNames that identify named <xsl:attribute-set> declarations in the stylesheet. The <xsl:attribute> instructions within these named attribute sets are evaluated, and the resulting sequence of attribute nodes is added to the start of the result sequence. For more details, see <xsl:attribute-set> on page 214.
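
As a rough sketch of how this behaves, the fragment below (to be placed inside an <xsl:stylesheet> element) uses a hypothetical attribute set named cell-defaults and a hypothetical source element named cell. Because the attributes from the attribute set are placed at the start of the result sequence, an <xsl:attribute> instruction in the sequence constructor with the same name comes later and therefore takes precedence.

```xml
<!-- Hypothetical attribute set supplying default attributes -->
<xsl:attribute-set name="cell-defaults">
  <xsl:attribute name="align">left</xsl:attribute>
  <xsl:attribute name="valign">top</xsl:attribute>
</xsl:attribute-set>

<xsl:template match="cell">
  <xsl:element name="td" use-attribute-sets="cell-defaults">
    <!-- Comes after the attribute-set attributes in the result
         sequence, so it replaces align="left" -->
    <xsl:attribute name="align">right</xsl:attribute>
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>
```

Under these assumptions each generated <td> element should carry align="right" and valign="top".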

The second stage of the process is to use the result sequence delivered by evaluating the sequence constructor (and the use-attribute-sets attribute if present) to create the content of the new element node. This process works as follows:

If there are any atomic values in the sequence, they are converted to strings using the XPath casting rules. Errors may arise if the value has a data type that cannot always be cast to a string, specifically xs:NOTATION and xs:QName.
Any sequence of adjacent strings is converted to a single text node, using a single space as a separator between adjacent strings. This allows list-valued content to be constructed, for example where the schema for the result document requires the content of an element to be a sequence of integers.
If there is a document node in the sequence, then it is replaced in the sequence by its children (document nodes in the data model are not constrained to represent well-formed XML documents, so this may produce an arbitrary sequence of elements, text nodes, comments, and processing instructions).
Adjacent text nodes within the sequence are combined into a single text node, without any space separator, and zero-length text nodes are removed.
Duplicate attribute nodes are removed. If several attributes in the sequence have the same name, all but the last are discarded.
Duplicate namespace nodes are removed. If several namespace nodes in the sequence have the same name and string-value (that is, they bind the same namespace prefix to the same namespace URI), then all but one of them are discarded. It makes no difference which one is kept.
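
The treatment of atomic values and text nodes can be seen in a small sketch. The element name «values» is arbitrary.

```xml
<xsl:element name="values">
  <!-- Three adjacent atomic values: joined with single spaces -->
  <xsl:sequence select="1 to 3"/>
  <!-- A text node: merged with its neighbours without any separator -->
  <xsl:text>;</xsl:text>
  <!-- Two adjacent strings: again joined with a single space -->
  <xsl:sequence select="'a', 'b'"/>
</xsl:element>
```

Following the rules above, the integers become the text «1 2 3», the strings become «a b», and the three resulting text nodes are then combined without separators, so the element should contain the single text node «1 2 3;a b».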

It is an error if the resulting sequence contains an attribute or namespace that is preceded by a node that is not an attribute or namespace node. The processor has the choice of reporting this error, or ignoring the relevant attribute or namespace node.

The reason for this rule is to allow the implementation the flexibility to generate the output as an XML file, without having to build the result tree in memory first. If attributes could be added at any time, the whole result tree would need to be kept in memory.
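
For illustration, the following fragment breaks the rule. The element and attribute names are arbitrary.

```xml
<xsl:element name="para">
  <!-- Fine: no child node has been produced yet -->
  <xsl:attribute name="id">p1</xsl:attribute>
  <xsl:text>Some text</xsl:text>
  <!-- Error: this attribute node follows a text node in the sequence -->
  <xsl:attribute name="class">note</xsl:attribute>
</xsl:element>
```

Moving the second <xsl:attribute> instruction above the <xsl:text> instruction removes the problem.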

Another error that can arise is that the sequence contains two conflicting namespace nodes, that is, two namespace nodes that bind the same prefix to different namespace URIs. Again, the processor can either report this error or ignore all but the last of the duplicates.

Finally, the attribute nodes in the sequence are attached to the new element as its attributes, the namespace nodes are attached as its namespaces, and the other nodes are attached as its children. Officially, this involves making a deep copy of each node: This is because nodes in the data model are immutable, so you cannot change the parent of an existing node. In practice, making a copy at this stage is very rarely necessary, because in most cases the node being attached has only just been created and will never be used independently of its new parent. The only case where it is necessary is where the result sequence contains references to existing nodes, which can be produced using the <xsl:sequence> instruction:

  <xsl:element name="digest">
    <xsl:sequence select="//email[@date=current-date()]"/>
  </xsl:element>

In this situation, the result is exactly the same as if <xsl:copy-of> had been used instead of <xsl:sequence>.

The third stage of the process is called namespace fixup. Conceptually this is done after all the nodes produced by the sequence constructor have been added to the new element. In practice all the information needed to do namespace fixup is available once all the attributes and namespaces have been added, and a processor that serializes the result tree "on the fly" is likely to perform this operation at that stage, so that the start tag of the serialized element can be output as early as possible. Namespace fixup is described in the next section.

Namespace Fixup

Namespace fixup is applied to any element node as soon as its content has been constructed, whether the node is created using the <xsl:element> instruction, or using another mechanism such as a literal result element, <xsl:copy>, or <xsl:copy-of>. The process ensures that the new element node will automatically contain all the namespace nodes it needs to bind unique namespace prefixes to the namespaces used in the element name itself and on the names of all its attributes.

Although namespace fixup is described in terms of creating namespace nodes for an element, another way of thinking about it is that it is the process that allocates namespace prefixes for the namespace URIs used in the element name and in the names of its attributes. It just happens that in the formal data model, namespace nodes are where these prefixes are held.

In principle, XSLT processors can choose any prefix they like when generating namespace nodes during the namespace fixup process. (Choosing a prefix in this discussion also includes the option of using the empty prefix, which is how the data model represents the default namespace.) In practice, processors will usually be able to make a sensible choice, resulting in prefixes that are recognizable to users, rather than random alphanumeric noise. For example, if the QName used as the value of the name attribute of <xsl:element> includes a prefix, then most processors will choose this prefix during the namespace fixup process. There are only really two reasons why a processor might choose a different prefix:

There might be another prefix available that is just as good. For example, if the parent of the new element is in the same namespace, then the processor might decide to use the same prefix that was used for the parent element, which will reduce the number of namespace declarations required when the result tree is serialized.
The prefix might already be in use to refer to a different namespace URI. This is more of a theoretical possibility than something that happens often in practice; it is most likely to occur in the case of the empty prefix (the prefix that identifies the default namespace). When this happens, the system may be forced to invent an arbitrary prefix such as «ns0001».

The main importance of namespace nodes is that they determine the namespace declarations that will be output when a result tree is serialized. Namespace nodes are not the same as namespace declarations, but they contain essentially the same information. Usually namespace nodes are handled automatically behind the scenes, and users don't need to worry about them. The only time they really become important is if your XML document uses QNames in the content of elements and attributes. In this situation, the XSLT processor doesn't necessarily know that your content is dependent on particular namespaces being in scope, and so it can't automate the process of generating the right namespace nodes in the same way as namespace fixup does for element and attribute names. You can use an <xsl:namespace> instruction to create a namespace node explicitly, in the rare cases where it isn't generated automatically through namespace fixup.

Namespace fixup also ensures that every element has a namespace node that maps the prefix «xml» to the namespace URI http://www.w3.org/XML/1998/namespace. At any rate, this is what the specification says. In practice, implementations probably won't store a real node for this namespace; instead they will simply behave as if it always existed.

It's worth observing one thing that namespace fixup doesn't do. When you create an element as a child of <a> , namespace fixup does not try to give the element a copy of every namespace node that is present for the <a> element. This reflects a new freedom that comes with the XML Namespaces 1.1 specification, which introduces the ability to undeclare namespaces. It was always possible under XML Namespaces 1.0 to write:

  <a xmlns="one.uri">
    <b xmlns=""/>
  </a>

which has the effect that the «one.uri» namespace is in scope for <a> but not for <b>. This is represented in the data model by the fact that the <a> element has a namespace node that maps the empty prefix to the namespace URI «one.uri», while the <b> element has no such namespace node. With XML Namespaces 1.1 it becomes possible to do the same thing with a nondefault namespace. You can now write:

  <a xmlns:one="one.uri">
    <b xmlns:one=""/>
  </a>

Again, this is represented in the data model by the fact that the <a> element has a namespace node that maps the prefix «one» to the namespace URI «one.uri», while the <b> element has no such namespace node. And if the above code appears in your stylesheet rather than your source document, it will have the same effect, as you would expect.

This also means that if you have a source document, which is simply:

  <b/>

and you then write in your stylesheet

  <a xmlns:one="one.uri">
    <xsl:copy-of select="/b"/>
  </a>

then you will create the same data model. There is nothing in the rules that says the <b> element will acquire a namespace node for the «one.uri» namespace.

The only way to serialize this result tree into an XML document in such a way that you get exactly the same data model back when it is parsed (a property known as round-tripping) is as follows:

  <?xml version="1.1"?>
  <a xmlns:one="one.uri">
    <b xmlns:one=""/>
  </a>

Unfortunately an XML 1.0 parser will throw this out, so it's not a very practical proposition as long as 99% of the XML users in the world are still using version 1.0. Therefore, to get this serialized output, you have to request both «version="1.1"» and «undeclare-prefixes="yes"» in your <xsl:output> declaration. Unless you do this, the serialized output will be:

  <?xml version="1.1"?>
  <a xmlns:one="one.uri">
    <b/>
  </a>

This won't round-trip accurately, because the <b> element will acquire an extra namespace node in the process, but few people are likely to notice, and in any case, the behavior is consistent with what happened with XSLT 1.0.

For more details of serialization options, see <xsl:output> on page 375.
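
As a sketch, a declaration requesting namespace undeclarations in the serialized output might look like the following. The parameter name undeclare-prefixes is the one defined in the XSLT 2.0 Recommendation; it is an assumption here that your processor follows that naming and supports XML 1.1 output.

```xml
<!-- Top-level declaration, placed inside xsl:stylesheet -->
<xsl:output method="xml" version="1.1" undeclare-prefixes="yes"/>
```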

One final observation about namespace fixup: it doesn't create namespace nodes in respect of namespaces that are referenced in the content of the element, or the content of its attributes. For example, if you create the attribute «xsi:type="mf:part-number"», namespace fixup won't automatically create a namespace node for the «mf» namespace, even if you use a schema-aware processor that knows that the type of the «xsi:type» attribute is xs:QName. The problem is that namespace fixup has to be done before validation, because validation would reject an element on which the relevant namespaces haven't been declared; and until validation is done, there is no way of knowing that the attribute in question has type xs:QName. To find out how to generate an attribute whose value is an xs:QName, see page 210.
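
One way to handle this case is to create the namespace node yourself with <xsl:namespace>. The sketch below is illustrative only: the element name «part» and the URI http://example.com/mf bound to the «mf» prefix are hypothetical.

```xml
<xsl:element name="part">
  <!-- Explicit namespace node for the prefix used inside the
       attribute value; fixup would not create it -->
  <xsl:namespace name="mf" select="'http://example.com/mf'"/>
  <!-- The xsi namespace is used in an attribute name, so fixup
       takes care of declaring it -->
  <xsl:attribute name="xsi:type"
      namespace="http://www.w3.org/2001/XMLSchema-instance">mf:part-number</xsl:attribute>
</xsl:element>
```

The <xsl:namespace> instruction has to appear before any instruction that produces a child node, for the same reason that attributes do.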

Validating and Annotating the Element

This section is relevant only if you are using a schema-aware XSLT processor. With a non-schema-aware processor, you cannot use the type and validation attributes, and the type annotation on the new element will always be xs:untyped, which you can effectively ignore because it imposes no constraints.

With a schema-aware processor, you can validate the new element to ensure that it conforms with relevant definitions in a schema. If validation fails, a fatal error is reported. If it succeeds, the new element will have a type annotation that reflects the validation that was performed. This type annotation will not affect the way the element node is serialized, but if you want to do further processing on the element, the type annotation may affect the way this works. For example, if you sort a sequence of elements annotated with type xs:integer, you will get different results than if they are annotated as xs:string.

If you use the type attribute, the value of the attribute must be a lexical QName that identifies a known type definition. Generally this means that it must either be a built-in type such as xs:string or xs:dateTime, or it must be the name of a global simple or complex type defined in a schema that has been imported using an <xsl:import-schema> declaration in the stylesheet. (That is, the local part of the QName must match the name attribute of a top-level <xs:simpleType> or <xs:complexType> element in the schema whose target namespace matches the namespace URI part of the QName.)

The XSLT specification allows the implementation to provide other ways of accessing type definitions, perhaps through an API or a configuration file, and it also allows the type definition to originate from a source other than an XML Schema, but since it provides no details of how this might work, we won't explore the possibility further here.

The processor validates that the constructed element conforms to the named type definition. If it does, the element is annotated with the name of this type. If it doesn't, processing fails.

Validating an element is a recursive process, which also involves validating all its attributes and child elements. So these contained elements and attributes may also acquire a different type annotation.

In general, it's likely that some of the contained elements and attributes will be validated against anonymous type definitions in the schema, that is, types defined inline as part of another type definition (or element or attribute declaration), rather than named global types. In this case, the XSLT processor invents a name for each such type definition, and uses this invented name as the type annotation. The invented name is not visible to the application, though it might appear in diagnostics, but it is used during subsequent processing whenever there is a need to check that the element or attribute conforms to a particular type. (In practice, of course, the invented "name" might not really be a name at all, but a pointer to some data structure containing the type definition.)

There is potentially a lot of redundant processing if you validate every element that you add to the result tree, because elements at the bottom level of the tree will be validated repeatedly each time an ancestor element is validated. It's up to the XSLT processor to handle this sensibly; one approach that it might use is to mark the element as needing validation, but to defer the actual validation until it really needs to be done.

Validating the element may also have other effects; in particular, it may cause default values for elements and attributes within the element's content to be expanded. Default values can be defined in the schema using <xs:element default="ABC"> or <xs:attribute default="XYZ">. So the element after validation may contain element and attribute values that were not put there explicitly by the stylesheet. The XSLT specification does not attempt to describe exactly how validation works; instead it simply points to the XML Schema specification, which describes the process in minute detail.

Validating an element using the type attribute places no constraints on the name of the element. It does not need to be an element name that is defined in any schema. The validation is concerned with the content of the element (including, of course, the names of its attributes and children) and not with its name.
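
A minimal sketch of validation by type, assuming a schema-aware processor. The element name «published» is arbitrary and is not declared in any schema; only its content is checked against the built-in type xs:date.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:template match="/">
    <!-- Content must be a valid xs:date; on success the element
         is annotated as xs:date -->
    <xsl:element name="published" type="xs:date">2004-07-01</xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Replacing the content with a string that is not a valid date, such as «July 2004», would cause validation to fail.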

In contrast, validation using the validation attribute is driven by the element's name.

There are two options for the validation attribute that cause schema validation to happen, and two options that cause it not to happen. Let's take the last two first:

«validation="preserve"» means that the new element will have a type annotation of xs:anyType, and the attributes and elements in its content will have their original type annotation. During the (formal) process of copying nodes from the sequence produced by evaluating the sequence constructor, the nodes are copied with their type annotations intact.
«validation="strip"» means that the new element will have a type annotation of xs:untyped, and in this case the attributes and elements in its content (at any depth) will have their type annotation changed to xs:untypedAtomic or xs:untyped respectively. (The type annotations are changed in the course of copying the nodes; the original nodes are, of course, unchanged.)

The difference between xs:anyType and xs:untyped is rather subtle, and most applications won't notice the difference. However, the XSLT processor knows when it sees an xs:untyped element that all its descendants will also be xs:untyped, and this makes certain optimizations possible.

The other two options are «strict» and «lax»:

«validation="strict"» causes the processor to look in the schema for an element declaration that matches the name of the element. That is, it looks for a top-level <xs:element> whose name attribute matches the local name of the element being validated, in a schema whose target namespace matches the namespace URI of the element being validated. If it can't find such a definition, a fatal error is reported. Otherwise, the content of the element is validated against the schema-defined rules implied by this element declaration.

If the element declaration in the schema refers to a named type definition, then on successful validation, the element is annotated with this type name. If the element declaration contains an inline (and therefore unnamed) type definition, the XSLT processor invents a name for this implicit type, and uses this invented name as the type annotation, just as in the case described earlier for the type attribute.

If the element declaration requires it, then strict validation of an element proceeds recursively through the content of the element. It is possible, however, that the element declaration is liberal. It may, for example, define the permitted contents of the element using <xs:any>, with processContents set to «lax» or «skip». In this case, validation follows the schema rules. If «skip» is specified, for example, the relevant subtree is not validated. All nodes in such a subtree will be annotated as if «validation="strip"» were specified.
«validation="lax"» behaves in the same way as «validation="strict"», except that no failure occurs if the processor cannot locate a top-level schema definition for the element. Instead of reporting an error, the element is annotated as xs:anyType, and validation continues recursively (again, in lax mode) with its attributes and child elements. Once an element is found that does have a schema definition, however, it is validated strictly against that definition, and if validation fails, a fatal error is reported.

I said that the processor looks in the schema for an appropriate element declaration, but where does it find the schema? It knows the namespace of the element name at this stage, so if a schema for this target namespace has been imported using <xsl:import-schema> (see page 324) then there is no problem. Otherwise, the specification leaves things rather open. It recognizes that some processors are likely to have some kind of catalog or repository that enables the schema for a given namespace to be found without difficulty, and it allows this to happen where the implementation supports it. You can also create an xsl:schemaLocation attribute node on the element being validated, to provide guidance on where a schema document might be found. In other cases, the implementation is allowed to report an error.

If neither the type nor validation attribute is present, then the system behaves as if the validation attribute were present, and had the value given by the default-validation attribute of the containing <xsl:stylesheet> element. If no default is specified at that level, the effect is the same as «validation="strip"».
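
The options can be compared side by side in a sketch like the one below, which assumes a schema-aware processor. The namespace http://example.com/po, the schema document po.xsd, and the element names are all hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:po="http://example.com/po"
    default-validation="strip">

  <!-- Hypothetical schema whose target namespace is http://example.com/po -->
  <xsl:import-schema namespace="http://example.com/po"
                     schema-location="po.xsd"/>

  <xsl:template match="order">
    <!-- strict: fails unless the schema contains a top-level
         declaration of po:order and the content conforms to it -->
    <xsl:element name="po:order" validation="strict">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="note">
    <!-- lax: validated only if a declaration for this name is found -->
    <xsl:element name="note" validation="lax">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="item">
    <!-- Neither attribute present: default-validation="strip" applies -->
    <xsl:element name="po:item">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Because validating po:order implicitly validates its whole subtree, the po:item children do not need a validation attribute of their own to be checked.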

XSLT does not provide any way to request validation of an element against a local element or type definition in a schema. The way around this is to request validation only when you create an element for which there is a top-level definition in the schema. This will then implicitly validate the whole subtree contained by that element, including elements that have local definitions in the schema. Alternatively, many locally-declared elements make use of a globally-defined type, and you can then use the type attribute to validate against the type definition.

Usage and Examples

In most cases, elements in a result tree can be generated either using literal result elements in the stylesheet, or by copying a node from the source document using <xsl:copy> .

The only situations where <xsl:element> is absolutely needed are therefore where the element name in the result document is not fixed, and is not the same as an element in the source document.

Using <xsl:element> rather than a literal result element can also be useful where different namespaces are in use. It allows the namespace URI of the generated element to be specified explicitly, rather than being referenced via a prefix. This means the namespace does not have to be present in the stylesheet itself, thus giving greater control over exactly which elements the namespace declarations are attached to.

Converting Attributes to Child Elements

This example illustrates how <xsl:element> can be used to create element nodes whose names and content are taken from the names and values of attributes in the source document.

Source

The source document book.xml contains a single <book> element with several attributes:

  <?xml version="1.0"?>
  <book title="Object-oriented Languages"
        author="Michel Beaudouin-Lafon"
        translator="Jack Howlett"
        publisher="Chapman &amp; Hall"
        isbn="0 412 55800 9"
        date="1994"/>

Stylesheet

The stylesheet atts-to-elements.xsl handles the book element by processing each of the attributes in turn (the expression «@*» selects all the attribute nodes). For each one, it outputs an element whose name is the same as the attribute name and whose content is the same as the attribute value.

The stylesheet is as follows:

  <xsl:transform
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">
    <xsl:output indent="yes"/>
    <xsl:template match="book">
      <book>
        <xsl:for-each select="@*">
          <xsl:element name="{name()}">
            <xsl:value-of select="."/>
          </xsl:element>
        </xsl:for-each>
      </book>
    </xsl:template>
  </xsl:transform>

The attribute value template «{name()}» in the name attribute is what allows each element name to be computed at runtime from the attribute currently being processed.

Output

The XML output (on my system) is shown below. Actually, this stylesheet isn't guaranteed to produce exactly this output. This is because the order of attributes is undefined. This means that the <xsl:for-each> loop might process the attributes in any order, so the order of child elements in the output is also unpredictable. With Saxon, it actually depends on which XML parser you are using.

  <book>
    <author>Michel Beaudouin-Lafon</author>
    <date>1994</date>
    <isbn>0 412 55800 9</isbn>
    <publisher>Chapman &amp; Hall</publisher>
    <title>Object-oriented Languages</title>
    <translator>Jack Howlett</translator>
  </book>
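
If a predictable order matters, one possible variation (a sketch, not part of the original stylesheet) is to sort the attributes by name before generating the elements:

```xml
<xsl:template match="book">
  <book>
    <xsl:for-each select="@*">
      <!-- Sort by attribute name so that the order of the generated
           elements no longer depends on the XML parser -->
      <xsl:sort select="name()"/>
      <xsl:element name="{name()}">
        <xsl:value-of select="."/>
      </xsl:element>
    </xsl:for-each>
  </book>
</xsl:template>
```

With this change the child elements should appear in alphabetical order of name, whichever processor and parser are used.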