D.2 What Is RELAX NG?


The W3C XML Schema process began with high ambitions to be a more powerful alternative to DTDs, but many people found XSD to be more trouble than it was worth. XSD is difficult for many people to create and difficult to process, it has areas (notably block and final) that are fairly contentious, and not everyone wants to define their documents in terms of object inheritance anyway. While XSD has done well in some fields of XML work, and Microsoft has implemented it throughout its product line, there was a plain need for an alternative.

RELAX NG, which has developed largely from work done by XML pioneers Murata Makoto and James Clark, has mathematical foundations rather than the ad hoc object structures used by XSD. Fortunately, you don't need to know the math to use the schemas, but these foundations make it a lot simpler to both use and process RELAX NG. RELAX NG comes in both an XML syntax and a compact syntax, but we'll focus on the compact syntax here because it's generally quite approachable.

RELAX NG is being developed at the Organization for the Advancement of Structured Information Standards (OASIS), a different specification development organization from the W3C, and standardized through the International Organization for Standardization (ISO) as part of the Document Schema Definition Languages (DSDL) effort. For more on OASIS development of RELAX NG, see http://www.oasis-open.org/committees/relax-ng/. For more on the DSDL work, see http://dsdl.org.

D.2.1 A Basic RELAX NG Schema

For our first RELAX NG schema, we'll start with Example D-3, which is the same document shown in Example D-1 except without the DOCTYPE declaration.

Example D-3. A sample XML document
<?xml version="1.0" encoding="us-ascii"?>
<authors>
    <person abbrev="edd">
        <name>Edd Dumbill</name>
        <nationality>British</nationality>
    </person>
    <person abbrev="simonstl">
        <name>Simon St.Laurent</name>
        <nationality>American</nationality>
    </person>
    <person abbrev="vdv">
        <name>Eric van der Vlist</name>
        <nationality>French</nationality>
    </person>
</authors>

Described in RELAX NG compact syntax, the schema for this document can resemble the schemas shown in Examples D-4 and D-5. Example D-4 uses a nested syntax.

Example D-4. A nested RELAX NG schema
element authors {
   element person {
      attribute abbrev {text},
      element name {text},
      element nationality {text}
   }*
}

The curly braces work much like those in C structs, defining the contents of named components. This schema defines an authors element, which contains zero or more person elements. (The zero or more comes from the asterisk after the closing brace for person.) The person elements have mandatory abbrev attributes and name and nationality elements, all of which store their contents as text. If you prefer a more declarative approach, RELAX NG also supports that option. Example D-5 uses a more DTD-like declaration approach.

Example D-5. A declarative RELAX NG schema
start=authors

authors = element authors { person* }
person = element person { abbrev, name, nationality }
abbrev = attribute abbrev {text}
name = element name {text}
nationality = element nationality {text}

This approach reads differently, but describes the same structure. Instead of just starting with the authors element, it explicitly lists possible root elements in the start declaration. Each declaration describes the contents of one element or attribute. The difference between attribute and element declarations is much smaller in RELAX NG than in XSD or in DTDs, and the abbrev attribute is attached to the person element just like the name and nationality elements. Elements and attributes that contain text just list text as their content.
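The declarative style also makes small variations easy to express. The following sketch is not one of the numbered examples; it is a hypothetical variant of Example D-5 that assumes you want to allow a lone person element as a document root, require at least one person inside authors, and make nationality optional. It relies on three pieces of compact syntax: the | choice operator in the start declaration, the + indicator (one or more), and the ? indicator (optional).

```rnc
# Either element may be the document root.
start = authors | person

# At least one person is required (+ instead of *).
authors = element authors { person+ }

# nationality is optional (?); abbrev and name stay mandatory.
person = element person { abbrev, name, nationality? }

abbrev = attribute abbrev { text }
name = element name { text }
nationality = element nationality { text }
```

Example D-3 remains valid against this variant, since it has an authors root, three person elements, and a nationality for each person.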

To validate documents against these schemas, you can use James Clark's Jing tool, which is included with Trang, the tool we'll be using later in this appendix to convert RELAX NG types into XSD. Go to the directory where you've unzipped Trang, and you can run the validator by typing the following:

java -jar jing.jar -c appD-4.rnc appD-3.xml

If there aren't any errors in the document, Jing does its work and doesn't report anything. Otherwise, it reports errors like:

C:\trang>java -jar jing.jar -c appD-4.rnc appD-3broken.xml
C:\trang\appD-3broken.xml:5: error: attribute "country" not allowed at this point; ignored
C:\trang\appD-3broken.xml:9: error: unknown element "address"

This can be a useful diagnostic, but in work with Office you'll probably convert your RELAX NG to XSD.

D.2.2 Advanced Features: Namespaces and Datatypes

RELAX NG goes well beyond the capabilities of DTDs and into the features that XSD provides. RELAX NG provides simple support for namespaces, so adding a namespace to the schema shown in Example D-5 requires adding only one line, as shown in Example D-6.

Example D-6. A declarative RELAX NG schema with namespaces
default namespace = "http://example.com/authors/"

start=authors

authors = element authors { person* }
person = element person { abbrev, name, nationality }
abbrev = attribute abbrev {text}
name = element name {text}
nationality = element nationality {text}

Now all of the elements without prefixes (authors, person, name, and nationality) are in the http://example.com/authors/ namespace. Applying this schema to the non-namespaced Example D-3 produces an error:

C:\trang>java -jar jing.jar -c appD-6.rnc appD-3.xml
C:\trang\appD-3.xml:2: error: unknown element "authors"

Adding a default namespace declaration to the root element clears things up:

<authors xmlns="http://example.com/authors/">...

Jing no longer reports any errors. You can also define namespaces for prefixed elements and attributes, using slightly different syntax:

namespace auth = "http://www.example.com/authors/"
start=auth:authors
auth:authors=element auth:authors {auth:person * }
...

These namespace declarations are most commonly made at the top of the schema, and they apply to all the declarations that follow them.

RELAX NG doesn't provide its own set of datatypes, preferring to let developers choose their own set. For the most part (and conveniently compatible with Office's expectations), RELAX NG developers use the datatypes defined by XML Schema. This requires an extra declaration, and then you can use XSD types. For example, to define the text contents of the name and nationality elements as xsd:string and the abbrev attribute's contents as xsd:token, we'll change the RELAX NG schema to use datatypes, as in Example D-7.

Example D-7. A declarative RELAX NG schema using datatypes
default namespace = "http://example.com/authors/"
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

start=authors

authors = element authors { person* }
person = element person { abbrev, name, nationality }
abbrev = attribute abbrev {xsd:token}
name = element name {xsd:string}
nationality = element nationality {xsd:string}

You can use any of the XML Schema datatypes and constrain their facets, if needed.
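In the compact syntax, facets go in a second pair of curly braces after the datatype name, written as name = "value" parameters. The sketch below adapts Example D-7. The specific limits (a maximum of eight characters for abbrev, and at least one character for name and nationality) are assumptions chosen for illustration, not requirements taken from the original example.

```rnc
default namespace = "http://example.com/authors/"
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

start = authors

authors = element authors { person* }
person = element person { abbrev, name, nationality }

# Illustrative limit: abbreviations of at most eight characters.
abbrev = attribute abbrev { xsd:token { maxLength = "8" } }

# Illustrative limit: name and nationality may not be empty.
name = element name { xsd:string { minLength = "1" } }
nationality = element nationality { xsd:string { minLength = "1" } }
```

The longest abbreviation in the sample document, simonstl, is exactly eight characters long, so the namespaced version of Example D-3 should still pass under these assumed limits.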

For a more thorough introduction to RELAX NG Compact syntax, see Michael Fitzgerald's tutorial at http://www.xml.com/pub/a/2002/06/19/rng-compact.html. The specification for RELAX NG compact syntax is available at http://www.oasis-open.org/committees/relax-ng/compact-20020607.html.




Office 2003 XML
ISBN: 0596005385
Year: 2003
Pages: 135
