
2.6 Namespaces


It is often necessary to combine markup from two different XML applications in a single document. If the same element name is used in both XML applications, the interpretation of that element name becomes ambiguous.

For example, both MathML and SVG define an element called set. If you include a MathML equation and an SVG graphic in an XHTML document, an XML processor reading that document has no way of knowing whether a given set element is a MathML element or an SVG element. This can lead to problems in validating the document and interpreting its meaning.

XML uses namespaces to distinguish elements that have the same name but belong to different XML formats. Each namespace associates a collection of element and attribute names with a specific URL, which serves purely as an identifier for the namespace. So, for example, all MathML elements are placed in the MathML namespace, and all SVG elements in the SVG namespace. Since each namespace URL is a distinct string, two elements that have the same name but belong to different XML applications can always be distinguished.

There are two ways to specify the namespace for a particular element. The first is to specify the namespace explicitly on each element, using a namespace prefix. To do this, you use an attribute declaration of the form xmlns:prefix-name="url" to associate a prefix name with a specific namespace URL. This prefix declaration must occur either on the outermost element belonging to that namespace or on one of that element's ancestors. You then replace the name of each element belonging to that namespace with a qualified name of the form prefix-name:element-name. Here is an example:

<m:math xmlns:m="http://www.w3.org/1998/Math/MathML">
  <m:set>
    <m:ci>b</m:ci>
    <m:ci>a</m:ci>
  </m:set>
</m:math>

The URL http://www.w3.org/1998/Math/MathML is a unique identifier for the MathML namespace. If you associate the prefix m with this URL using the xmlns:m attribute, all element and attribute names of the form m:name are interpreted as names defined by MathML.
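To see how a namespace-aware processor interprets the prefixed names, here is a minimal sketch that parses the example above. It assumes Python's standard xml.etree.ElementTree module, which is our choice for illustration and not something this chapter depends on. The variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The prefixed MathML example from the text, written as one string.
prefixed_doc = (
    '<m:math xmlns:m="http://www.w3.org/1998/Math/MathML">'
    "<m:set><m:ci>b</m:ci><m:ci>a</m:ci></m:set>"
    "</m:math>"
)

root = ET.fromstring(prefixed_doc)

# ElementTree reports each element name as {namespace-URL}local-name,
# so the prefix m is replaced by the URL it was bound to.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The prefix itself carries no meaning. Only the URL it is bound to identifies the namespace, which is why the parser keeps the URL and discards the prefix.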

The second way is to specify a default namespace using an xmlns attribute. This provides an alternative to using a namespace prefix for each element. For example:

<math xmlns="http://www.w3.org/1998/Math/MathML">
  <set>
    <ci>b</ci>
    <ci>a</ci>
  </set>
</math>

Here, the math element contains an xmlns attribute whose value is set to the URL that identifies the MathML namespace. By default, the math element itself and all unprefixed element names that appear inside it (such as set and ci) are then assumed to lie within the namespace identified by that URL. This distinguishes them from any other set elements in the document belonging to another XML application.
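The following sketch illustrates how the two set elements stay distinguishable when MathML and SVG appear in the same document. It again assumes Python's xml.etree.ElementTree. The wrapper element doc, the prefix s, the SVG attribute values, and the helper namespace_of are all hypothetical choices made for this illustration.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SVG_NS = "http://www.w3.org/2000/svg"

# One MathML fragment using a default namespace and one SVG fragment
# using the prefix s, both inside a hypothetical wrapper element.
mixed_doc = (
    "<doc>"
    f'<math xmlns="{MATHML_NS}"><set><ci>b</ci><ci>a</ci></set></math>'
    f'<s:svg xmlns:s="{SVG_NS}"><s:set attributeName="fill" to="red"/></s:svg>'
    "</doc>"
)

root = ET.fromstring(mixed_doc)


def namespace_of(element):
    """Return the namespace URL of an element, or None if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


# Collect the namespace of every element whose local name is "set".
set_namespaces = [
    namespace_of(element)
    for element in root.iter()
    if element.tag.rsplit("}", 1)[-1] == "set"
]
print(set_namespaces)
```

Both elements have the local name set, but each is paired with a different namespace URL, so a processor can tell them apart regardless of whether a prefix or a default namespace was used.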





2.7 XML and Unicode

An XML document can contain any Unicode text. Unicode is an international standard for representing multilingual text. It defines a very large character set that includes characters from most of the world's languages as well as many mathematical and technical symbols.

A character set defines a mapping between a set of characters and a set of numbers, which are called code points. For example, in Unicode, the Greek letter α is represented by the code point 945 (in decimal notation), or 3B1 (in hexadecimal notation).
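The mapping between a character and its code point can be checked directly in most programming languages. Here is a short sketch in Python, chosen only for illustration, using the built-in ord and chr functions:

```python
alpha = "\u03b1"  # the Greek letter alpha

code_point = ord(alpha)          # character -> code point
hex_form = f"{code_point:X}"     # the same number in hexadecimal

print(code_point)
print(hex_form)
print(chr(code_point) == alpha)  # code point -> character
```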

Unicode is a superset of American Standard Code for Information Interchange (ASCII), a widely used character set that includes all the letters and common punctuation marks used in English. ASCII consists of 128 characters with code points from 0 to 127. The first 128 characters of Unicode are identical to ASCII. For example, the letter A has the code point 65 in both ASCII and Unicode. However, Unicode goes well beyond ASCII by including many more characters. Unicode 3.2, the version current at the time of writing, defines code points for approximately 95,000 characters, and later versions have added many more.
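As a quick check of the relationship between ASCII and Unicode, the sketch below (in Python, with hypothetical variable names) decodes all 128 ASCII values and compares them with the first 128 Unicode code points. It also shows that α lies outside the ASCII range.

```python
# The letter A has the same code point in ASCII and Unicode.
letter_code = ord("A")

# Decode every ASCII value 0..127 and compare the result with the
# characters at Unicode code points 0..127.
ascii_text = bytes(range(128)).decode("ascii")
unicode_text = "".join(chr(n) for n in range(128))
same_first_128 = ascii_text == unicode_text

# The Greek letter alpha has no ASCII representation.
try:
    "\u03b1".encode("ascii")
    alpha_in_ascii = True
except UnicodeEncodeError:
    alpha_in_ascii = False

print(letter_code, same_first_128, alpha_in_ascii)
```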

In an XML document, the names of elements and attributes, as well as the character data contained in an element, can all be written in Unicode. The advantage of using Unicode is that it allows you to use a single character set for text containing multiple languages and many different types of symbols. This avoids the problems caused by conflicting character sets, in which a single code point might be assigned to more than one character or a single character might have more than one code point, depending on the type of computer being used. Many software applications and operating systems now support Unicode. Unicode thus provides a standard way of encoding multilingual text so it can be exchanged and interpreted reliably across a wide variety of computer systems.

You can include a Unicode character in an XML document in the form of a numeric character reference. For example, to include the Greek character α, you would type &#x3B1;. If the document includes a document type declaration that refers to a DTD with entity names defined for specific characters, you can also insert the character using a named entity reference. For example, suppose you include a reference to the MathML DTD, as shown here:

<!DOCTYPE math SYSTEM "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">

You can then insert the α character in this document by using the named entity reference &alpha;, because the MathML DTD includes an entity declaration that associates the entity name alpha with the corresponding Unicode code point. We shall learn more about named characters in MathML in Section 3.4.
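The sketch below compares the two kinds of reference using Python's xml.etree.ElementTree, which is our assumption for illustration. Because that parser does not fetch external DTDs, the entity alpha is declared here in an internal DTD subset instead of being taken from the MathML DTD. That declaration is a stand-in written for this example, not a copy of the MathML DTD.

```python
import xml.etree.ElementTree as ET

# A numeric character reference needs no DTD at all.
numeric = ET.fromstring("<ci>&#x3B1;</ci>")

# A named entity reference works only if the entity is declared.
# Here alpha is declared in an internal DTD subset (an assumption made
# for this sketch, standing in for the external MathML DTD).
named_doc = (
    '<!DOCTYPE ci [<!ENTITY alpha "&#x3B1;">]>'
    "<ci>&alpha;</ci>"
)
named = ET.fromstring(named_doc)

# Both references resolve to the same Unicode character.
print(numeric.text == named.text)
```

In both cases the application receives the character α itself. The reference is only a way of typing the character in the source document.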


