XML Namespaces | Developing Enterprise Web Services: An Architect's Guide

Namespaces in object-oriented programming languages allow developers to name classes unambiguously. Given that different organizations (should) use different namespaces for their software components, even where two third-party components contain a class with exactly the same name, the fact that those classes are in different namespaces means that they are easily distinguished from one another.

Unambiguous naming is a requirement that also permeates the XML world. For example, it may be the case that several versions of a document with a root element dvd may exist, but the structure of each is different. The way we distinguish the document that we want from a number of available dvd documents is by its XML namespace.

Unlike popular programming languages, where specific scope resolution operators are used to build namespaces (e.g., MyPackage.MyClass in Java and MyNamespace::MyClass in C++), the convention in XML is to use a URI (Uniform Resource Identifier) as the namespace identifier.

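In fact, XML namespaces use URIs by convention only. Strictly speaking, an XML namespace is just a string, and processors compare namespace names character by character. The value of using URIs is that an organization can base them on a domain name it controls, which makes accidental collisions far less likely than with arbitrary strings.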
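The point that a namespace is compared as a plain string can be illustrated with a minimal sketch using Python's standard xml.etree.ElementTree module. The two documents below are hypothetical; their namespace names differ only in letter case and a trailing slash, which a Web browser would probably treat as the same address.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents whose namespace names look equivalent as URLs.
doc_a = '<dvd xmlns="http://dvd.example.com"/>'
doc_b = '<dvd xmlns="http://DVD.example.com/"/>'

# ElementTree reports each element name together with its namespace string.
tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

print(tag_a)           # {http://dvd.example.com}dvd
print(tag_b)           # {http://DVD.example.com/}dvd
print(tag_a == tag_b)  # False: the namespace strings are not identical
```

Because no URL normalization takes place, it is safest to copy a namespace name exactly as its owner publishes it.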

The URI is the union of the familiar URL and the not-so-familiar URN (Uniform Resource Name) schemes as shown in Figure 2-3 and Figure 2-4.

Figure 2-3. Some familiar URI schemes.

    ftp://src.doc.ic.ac.uk
    gopher://gopher.dna.affrc.go.jp
    http://www.arjuna.com
    mailto:some.one@somewhere.com
    news:uk.jobs.offered
    telnet://foo.bar.com/

The general scheme for the construction of a URI is <scheme>:<scheme-specific-part>. An absolute URI contains the name of the scheme in use followed by a colon (e.g., news:), followed by a string that is interpreted according to the semantics of that scheme (e.g., uk.jobs.offered identifies a particular Usenet newsgroup).

While the URI scheme doesn't mandate the meaning of the <scheme-specific-part>, many individual schemes share the same form, which most Web users will have experienced with URLs (Uniform Resource Locators), where the syntax consists of a sequence of four parts: <scheme>://<authority><path>?<query> (for example, http://search.sun.com/search/suncom/?qt=java). Depending on the scheme in use, not all of these parts are necessary, but any valid URI can be constructed by following those rules.
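As a rough illustration of the four-part structure, the following sketch uses urlsplit from Python's standard urllib.parse module to take apart the example URL above, and then a news: URI, which has no authority part. The variable names are ours and purely illustrative.

```python
from urllib.parse import urlsplit

# The four-part form: <scheme>://<authority><path>?<query>
parts = urlsplit("http://search.sun.com/search/suncom/?qt=java")
print(parts.scheme)  # http
print(parts.netloc)  # search.sun.com  (the authority)
print(parts.path)    # /search/suncom/
print(parts.query)   # qt=java

# A scheme that uses only <scheme>:<scheme-specific-part>
news = urlsplit("news:uk.jobs.offered")
print(news.scheme)   # news
print(news.path)     # uk.jobs.offered
```

For the news: URI the authority and query come back as empty strings, which reflects the point that not every scheme needs every part.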

Another good convention to adopt for namespaces is that the URI chosen should have some meaning. For instance, if a document has a namespace which is an HTTP URL, then dereferencing that URL should retrieve the schema which constrains that document.

A URN is intended to be a persistent, location-independent resource identifier, and is typically used wherever a name must remain stable over time. The caveat is that once a URN has been affiliated with a particular entity (protocol message, Web service, and so on), it must not be reused to reference another resource. The URNs in Figure 2-4 are typical of the kinds of identifiers we find in Web services applications (taken from OASIS BTP, see Chapter 7):

Figure 2-4. An example of the URN scheme.

    urn:oasis:names:tc:BTP:1.0:core
    urn:oasis:names:tc:BTP:1.0:qualifiers

XML namespaces affiliate the elements and attributes of an XML document with namespaces identified by URIs. This process is called qualification and the names of the elements and attributes given a namespace scope are called qualified names, or simply QNames.

Now that we understand we can qualify our documents with a namespace, we can extend the example in Figure 2-2 to include namespace affiliation. Given that it is likely there will be other DVD cataloging systems and those systems will also use elements with names like dvd (which will likely have a different structure and content from our own version), the addition of a namespace into our XML document confers the advantage that it cannot be mixed up with any other similar-looking dvd documents from outside of our namespace. Our newly namespaced document is shown in Figure 2-5.

Figure 2-5. A simple namespaced XML document with attributes and comments.

    <?xml version="1.0" encoding="utf-8"?>
    <!-- This is the European release of the DVD -->
    <d:dvd xmlns:d="http://dvd.example.com" region="2">
        <d:title>The Phantom Menace</d:title>
        <d:year>2001</d:year>
    </d:dvd>

In Figure 2-5 we have introduced an association between a prefix and a URI (in this case a URL), using the xmlns attribute from the XML Namespaces specification. We then used that prefix throughout the document to associate our elements with that namespace. Any XML processing infrastructure that reads our document does not see the elements as simply their element names; it resolves each prefix to its URI to arrive at the form {URI}:<local name> (e.g., {http://dvd.example.com}:dvd), which is unambiguous, unlike the element name alone (i.e., just dvd). The URI is not retrieved during this step; it serves purely as an identifier. It is important to remember that the syntax {URI}:<local name> is not XML syntax and cannot appear in a document; it is only a convention used when describing qualified elements.
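To see what a processor makes of Figure 2-5, here is a minimal sketch using Python's standard xml.etree.ElementTree module. Note that ElementTree's API writes qualified names in a similar brace notation but without the colon ({URI}local name). The x prefix in the final lookup is an arbitrary choice of ours, which shows that only the URI matters and not the prefix used in the document.

```python
import xml.etree.ElementTree as ET

DVD_NS = "http://dvd.example.com"

# The document from Figure 2-5, encoded as bytes for the parser.
document = """<?xml version="1.0" encoding="utf-8"?>
<!-- This is the European release of the DVD -->
<d:dvd xmlns:d="http://dvd.example.com" region="2">
  <d:title>The Phantom Menace</d:title>
  <d:year>2001</d:year>
</d:dvd>""".encode("utf-8")

root = ET.fromstring(document)

# The prefix "d" has been replaced by the namespace URI it was bound to.
print(root.tag)  # {http://dvd.example.com}dvd

# Look up a child by its qualified name written out in full...
title = root.find("{" + DVD_NS + "}title")
print(title.text)  # The Phantom Menace

# ...or through any prefix we choose to bind to the same URI.
year = root.find("x:year", {"x": DVD_NS})
print(year.text)  # 2001
```

Searching for plain title without a namespace would find nothing here, since no element in the document has that unqualified name.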

Although any element can contain a namespace declaration, the style convention in XML is to declare all namespaces that a document uses in its root element. Although this can make the opening tag of the root element quite large, it does improve overall document readability since we do not then pepper the document with namespace declarations.

Explicit and Default Namespaces

XML permits two distinct kinds of namespace declarations. The first of these as we have seen is the explicit form, whereby a prefix is given a namespace association (e.g., xmlns:d="http://dvd.example.com"), and then elements and attributes which belong to that namespace are explicitly adorned with the chosen prefix. The second of these is the default namespace declared as xmlns=<uri> that provides a default namespace affiliation which applies to any elements without a prefix.

The default namespace can be used to improve the readability of an XML document. In documents where one namespace predominates (like the WSDL or SOAP documents in Chapter 3), declaring it as the default namespace alleviates the need to pepper the document with the same prefix all over. Using this strategy, only those elements outside of the default namespace will need to be prefixed, which can make documents significantly easier to understand.

We present a modified version of the XML from Figure 2-5 in Figure 2-6, where the default namespace declaration implicitly scopes all following elements within the http://dvd.example.com namespace, like this:

Figure 2-6. Using default namespaces.

    <?xml version="1.0" encoding="utf-8"?>
    <!-- This is the European release of the DVD -->
    <dvd xmlns="http://dvd.example.com" region="2">
        <title>The Phantom Menace</title>
        <year>2001</year>
    </dvd>
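A quick way to convince ourselves that Figure 2-5 and Figure 2-6 describe the same qualified elements is to parse both and compare the names a processor reports. The sketch below uses Python's standard xml.etree.ElementTree module; the helper qualified_names is a hypothetical function written for this illustration.

```python
import xml.etree.ElementTree as ET

# Figure 2-5: explicit prefix.
prefixed = """<d:dvd xmlns:d="http://dvd.example.com" region="2">
  <d:title>The Phantom Menace</d:title>
  <d:year>2001</d:year>
</d:dvd>"""

# Figure 2-6: default namespace.
defaulted = """<dvd xmlns="http://dvd.example.com" region="2">
  <title>The Phantom Menace</title>
  <year>2001</year>
</dvd>"""


def qualified_names(xml_text):
    """Return the namespace-qualified name of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


print(qualified_names(defaulted))
# ['{http://dvd.example.com}dvd', '{http://dvd.example.com}title',
#  '{http://dvd.example.com}year']
print(qualified_names(prefixed) == qualified_names(defaulted))  # True
```

The choice between the two styles is therefore purely one of readability; namespace-aware software sees the same qualified names either way.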

Adding a namespace affiliation to an XML document is analogous to placing a Java class into a specific package. Where the Java equivalent of the document in Figure 2-2 (which has no namespace affiliation) might have been referenced by a declaration such as DVD myDVD, the equivalent type of reference for the document in Figure 2-5 or Figure 2-6 would be com.example.dvd.DVD myDVD, which when reduced to Java terms is clearly unambiguous since only the owner of the dvd.example.com domain should be using that namespace (and by inference should be the only party using that namespace to name XML documents).

Inheriting Namespaces

Once a default or explicit namespace has been declared, it is "in scope" for the element where it was declared and for all of that element's descendants. The default namespace is therefore propagated implicitly to every unprefixed descendant element, unless a nested declaration overrides it or the element carries an explicit prefix.

This arrangement is common in WSDL files (Chapter 3) where the WSDL namespace is the default namespace for an interface, but where the binding elements use their own explicit namespace.

The rule of thumb for choosing a default or explicit namespace is that if you can't see at a glance yourself which namespace an element belongs to, then no one else will be able to and, therefore, explicit namespaces should be used. If, however, it is obvious which namespace an element belongs to and there are lots of such elements in the same namespace, then readability may be improved with the addition of a default namespace.

And Not Inheriting Namespaces

Of course, a child may not necessarily want to inherit the default namespace of its parent and may wish to set it to something else or remove the default namespace entirely. This is not a problem with explicit namespaces because the child element can just be prefixed with a different explicit namespace than its parent, as shown in Figure 2-7, where the genre element has a different namespace affiliation than the rest of the document (which uses the default namespace).

Figure 2-7. Mixing explicit and default namespaces within a document.

    <?xml version="1.0" encoding="utf-8"?>
    <!-- This is the European release of the DVD -->
    <dvd xmlns="http://dvd.example.com" region="2">
        <title>The Phantom Menace</title>
        <year>2001</year>
        <g:genre xmlns:g="http://film-genre.example.com">
            sci-fi
        </g:genre>
    </dvd>

It is important to realize that any children of the genre element in Figure 2-7 that use the default namespace will be using the default namespace of the dvd element since the genre element only declares an explicit namespace for its scope. Similarly, with default namespaces, any element is at liberty to define a namespace for itself and any of its children irrespective of the namespace affiliations of any of its parent elements. This is shown below in Figure 2-8:

Figure 2-8. Mixing default namespaces within a document.

    <?xml version="1.0" encoding="utf-8"?>
    <!-- This is the European release of the DVD -->
    <dvd xmlns="http://dvd.example.com" region="2">
        <title>The Phantom Menace</title>
        <year>2001</year>
        <genre xmlns="http://film-genre.example.com">
            sci-fi
        </genre>
    </dvd>

The genre element from Figure 2-8 declares http://film-genre.example.com as the default namespace for itself and its children (if any). This differs from the example shown in Figure 2-7 since, in the absence of any explicit namespace, children of the genre element belong to the http://film-genre.example.com namespace and not to the http://dvd.example.com namespace as the outer elements do.

Of course it may be the case that an element does not require a default namespace and that the parent default namespace is inappropriate. In such cases, we can remove any default namespace completely, by setting it to the empty string xmlns="".

For default namespaces, remember that the scoping rules are based on the familiar concept of "most local" where the declaration nearest to the use has the highest precedence.
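The three situations described in this section (an explicit prefix on a child, a new default namespace on a child, and removal of the default namespace with xmlns="") can be compared side by side in a small sketch using Python's standard xml.etree.ElementTree module. The label, note, and text elements are hypothetical additions of ours, included only so that each case has a child element to inspect.

```python
import xml.etree.ElementTree as ET

document = """<dvd xmlns="http://dvd.example.com" region="2">
  <title>The Phantom Menace</title>
  <g:genre xmlns:g="http://film-genre.example.com">
    <label>sci-fi</label>
  </g:genre>
  <genre xmlns="http://film-genre.example.com">
    <label>sci-fi</label>
  </genre>
  <note xmlns="">
    <text>European release</text>
  </note>
</dvd>"""

# Collect the qualified name of every element in document order.
tags = [element.tag for element in ET.fromstring(document).iter()]
for tag in tags:
    print(tag)

# {http://dvd.example.com}dvd
# {http://dvd.example.com}title
# {http://film-genre.example.com}genre   (explicit prefix, as in Figure 2-7)
# {http://dvd.example.com}label          (still uses the dvd default)
# {http://film-genre.example.com}genre   (new default, as in Figure 2-8)
# {http://film-genre.example.com}label   (inherits the nearest default)
# note                                   (default removed with xmlns="")
# text                                   (no namespace at all)
```

In each case the unprefixed child takes whatever default namespace declaration is nearest to it, which is the "most local" rule in action.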

Attributes and Namespaces

So far all of our attention has been focused on the interplay between namespaces and elements. However, it is equally valid for attributes to be qualified with namespaces through the same prefix syntax. When a default namespace is in scope, however, different rules apply to attributes than to elements. Attributes are not affiliated with any default namespace, so if an attribute is to be namespace qualified, it must be qualified explicitly: any attribute without a prefix is not considered namespace qualified, even if it is declared within the scope of a valid default namespace.
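The difference between elements and attributes can be checked with a short sketch using Python's standard xml.etree.ElementTree module. The s prefix, the http://shop.example.com namespace, and the price attribute are hypothetical and exist only to contrast a prefixed attribute with the unprefixed region attribute.

```python
import xml.etree.ElementTree as ET

document = """<dvd xmlns="http://dvd.example.com"
     xmlns:s="http://shop.example.com"
     region="2" s:price="9.99"/>"""

root = ET.fromstring(document)

# The element picks up the default namespace...
print(root.tag)  # {http://dvd.example.com}dvd

# ...but the unprefixed attribute does not.
print(root.get("region"))                           # 2
print(root.get("{http://dvd.example.com}region"))   # None

# Only the explicitly prefixed attribute is namespace qualified.
print(root.get("{http://shop.example.com}price"))   # 9.99
```

Code that looks up region must therefore use the bare attribute name, even though the element that carries it is in the http://dvd.example.com namespace.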

The convention in XML is to associate elements with namespaces, but to leave attributes unqualified since they reside within elements with qualified names.

At this point we understand both basic XML document structure and some more advanced features like namespaces. Together these set the scene for the higher-level XML-based technologies (including Web services), which we continue to explore by looking at XML Schema.