Hack 75 Create a RELAX NG Schema from an Instance

   


Trang and Relaxer can create RELAX NG schemas on the fly: Trang in either XML or compact syntax, Relaxer in XML syntax.

Trang (http://www.thaiopensource.com/relaxng/trang.html) can translate XML documents into RELAX NG schemas, in either XML or compact syntax. Likewise, Relaxer (http://www.relaxer.org) can produce RELAX NG schemas in XML syntax. This means that you can develop an XML document and then instantly produce a RELAX NG schema for it. The schemas that Trang or Relaxer produce may not be exactly what you want, but they will give you a good start: a schema that you can edit for your own purposes. By the way, Trang can also produce XML Schema documents and DTDs. This hack walks you through the steps to automatically produce RELAX NG schemas, in either XML or compact syntax, from an XML document.

We'll translate the document newhire.xml. This document is based on specifications from the HR-XML Consortium (http://www.hr-xml.org/channels/home.htm), which develops XML vocabularies for human resource applications. The file newhire.xml contains some personal information about an employee, Floyd Filigree, who lives in New York:

<?xml version="1.0" encoding="UTF-8"?>
<Employee xmlns="http://ns.hr-xml.org">
 <PersonName>
  <GivenName>Floyd</GivenName>
  <FamilyName>Filigree</FamilyName>
  <Affix type="formOfAddress">Mr</Affix>
 </PersonName>
 <PostalAddress>
  <CountryCode>US</CountryCode>
  <PostalCode>10001</PostalCode>
  <Region>NY</Region>
  <Municipality>New York</Municipality>
  <DeliveryAddress>
   <PostOfficeBox>0000</PostOfficeBox>
  </DeliveryAddress>
 </PostalAddress>
</Employee>

5.9.1 Trang (XML Syntax)

To translate newhire.xml into RELAX NG in XML syntax with Trang, type this command:

java -jar trang.jar newhire.xml newhire.rng

You can use the Trang JAR that is in the file archive that came with the book, or you can download the latest version of Trang from http://www.thaiopensource.com/download, if there is a version later than 20030619.

When you give Trang just two arguments, it infers the input and output formats from the file suffixes (e.g., .xml for XML, .rng for RELAX NG). You can also name the formats explicitly with the -I and -O switches, for the same results:

java -jar trang.jar -I xml -O rng newhire.xml newhire.rng

Either of these commands will produce newhire.rng, shown here:

<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="http://ns.hr-xml.org" xmlns="http://relaxng.org/ns/structure/1.0"
  datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="Employee">
      <element name="PersonName">
        <element name="GivenName">
          <data type="NCName"/>
        </element>
        <element name="FamilyName">
          <data type="NCName"/>
        </element>
        <element name="Affix">
          <attribute name="type">
            <data type="NCName"/>
          </attribute>
          <data type="NCName"/>
        </element>
      </element>
      <element name="PostalAddress">
        <element name="CountryCode">
          <data type="NCName"/>
        </element>
        <element name="PostalCode">
          <data type="integer"/>
        </element>
        <element name="Region">
          <data type="NCName"/>
        </element>
        <element name="Municipality">
          <text/>
        </element>
        <element name="DeliveryAddress">
          <element name="PostOfficeBox">
            <data type="integer"/>
          </element>
        </element>
      </element>
    </element>
  </start>
</grammar>

Trang decides how to lay out the schema, and which datatypes to use, based on the content of the instance document; you can then edit its results by hand. For example, you may not want the NCName datatype (lines 8, 11, 15, 17, 22, and 28), so you could edit newhire.rng and replace all occurrences of NCName with string.
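The hand edit described above can also be scripted. The following is a minimal sketch in Python, not part of Trang: the function name replace_datatype and the inline sample fragment are assumptions made for illustration, and the pattern only matches data elements written exactly as Trang emits them in the listing above.

```python
import re


def replace_datatype(schema_text: str, old: str = "NCName", new: str = "string") -> str:
    """Swap one datatype name for another in <data type="..."/> patterns.

    Hypothetical helper: it is a plain text substitution, so it assumes
    the attribute is written as type="..." with double quotes.
    """
    pattern = re.compile(r'(<data\s+type=")' + re.escape(old) + r'(")')
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), schema_text)


# A fragment in the style of the Trang output shown above.
sample = """<element name="GivenName">
  <data type="NCName"/>
</element>
<element name="PostalCode">
  <data type="integer"/>
</element>"""

edited = replace_datatype(sample)
print(edited)
```

To apply this to the real file, you would read newhire.rng into a string, pass it through replace_datatype, and write the result back. Only NCName is touched; the integer patterns are left alone.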

5.9.2 Relaxer (XML Syntax)

Another option for translating an XML document into a RELAX NG schema in XML syntax is Relaxer. Assuming that you have downloaded and installed Relaxer [Hack #37], translate newhire.xml into a RELAX NG schema in XML syntax with the command:

relaxer -dir:out -rng newhire.xml

which will produce newhire.rng in the out subdirectory:

<?xml version="1.0" encoding="UTF-8" ?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
         xmlns:java="http://www.relaxer.org/xmlns/relaxer/java"
         xmlns:relaxer="http://www.relaxer.org/xmlns/relaxer"
         xmlns:sql="http://www.relaxer.org/xmlns/relaxer/sql"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
         ns="http://ns.hr-xml.org"
         >
  <start>
    <ref name="Employee"/>
  </start>
  <define name="Employee">
    <element name="Employee">
      <ref name="PersonName"/>
      <ref name="PostalAddress"/>
    </element>
  </define>
  <define name="PersonName">
    <element name="PersonName">
      <element name="GivenName">
        <data type="token"/>
      </element>
      <element name="FamilyName">
        <data type="token"/>
      </element>
      <ref name="Affix"/>
    </element>
  </define>
  <define name="Affix">
    <element name="Affix">
      <attribute name="type">
        <data type="token"/>
      </attribute>
      <data type="token"/>
    </element>
  </define>
  <define name="PostalAddress">
    <element name="PostalAddress">
      <element name="CountryCode">
        <data type="token"/>
      </element>
      <element name="PostalCode">
        <data type="int"/>
      </element>
      <element name="Region">
        <data type="token"/>
      </element>
      <element name="Municipality">
        <data type="token"/>
      </element>
      <ref name="DeliveryAddress"/>
    </element>
  </define>
  <define name="DeliveryAddress">
    <element name="DeliveryAddress">
      <element name="PostOfficeBox">
        <data type="int"/>
      </element>
    </element>
  </define>
</grammar>

The namespace declaration on line 3 is for the RELAX NG DTD Compatibility spec (http://www.oasis-open.org/committees/relax-ng/compatibility-20011203.html). This spec integrates default attribute values, IDs, and documentation into RELAX NG processing, but it is a separate spec with different conformance requirements than straight RELAX NG 1.0. Though Relaxer declares this namespace, it does not use it. Relaxer adds several other namespace declarations for its own purposes (lines 4, 5, and 6), which the document does not use.

Unlike Trang, Relaxer organizes the element patterns into named definitions (lines 13, 19, 30, 38, and 55). In addition, Relaxer uses the token datatype, which is one of two built-in RELAX NG datatypes (lines 22, 25, 33, 35, 41, 47, and 50). The other is string (XML Schema has a string datatype, too). Relaxer uses the int XML Schema datatype (http://www.w3.org/TR/xmlschema-2/#int) on lines 44 and 58 for PostalCode and PostOfficeBox, where Trang uses the larger integer instead (http://www.w3.org/TR/xmlschema-2/#integer).
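To make the datatype differences concrete, here is a small Python sketch. It is an illustration under stated assumptions, not how either tool or a RELAX NG validator is implemented. It relies on two facts from the XML Schema datatypes spec: int is limited to the 32-bit range -2147483648 to 2147483647 while integer is unbounded, and token collapses whitespace where string does not. All function names are hypothetical.

```python
import re

XSD_INT_MIN = -2**31       # lower bound of xsd:int
XSD_INT_MAX = 2**31 - 1    # upper bound of xsd:int

INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def is_xsd_integer(lexical: str) -> bool:
    """True if the text is an optionally signed run of digits (unbounded)."""
    return INTEGER_LEXICAL.fullmatch(lexical) is not None


def is_xsd_int(lexical: str) -> bool:
    """True if the text is an integer that also fits the 32-bit int range."""
    return is_xsd_integer(lexical) and XSD_INT_MIN <= int(lexical) <= XSD_INT_MAX


def normalize_token(value: str) -> str:
    """Collapse runs of XML whitespace and trim the ends, as token does."""
    return XML_WHITESPACE.sub(" ", value).strip(" ")


def equal_as_token(a: str, b: str) -> bool:
    """Compare two values after whitespace normalization."""
    return normalize_token(a) == normalize_token(b)


def equal_as_string(a: str, b: str) -> bool:
    """Compare two values exactly, with no normalization."""
    return a == b


print(is_xsd_int("10001"), is_xsd_integer("10001"))
print(is_xsd_int("2147483648"), is_xsd_integer("2147483648"))
print(equal_as_token(" New  York ", "New York"))
print(equal_as_string(" New  York ", "New York"))
```

Both PostalCode (10001) and PostOfficeBox (0000) in newhire.xml fit either type, so the choice between int and integer only matters if later documents carry values outside the 32-bit range.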

5.9.3 Trang (Compact Syntax)

To translate newhire.xml into RELAX NG's compact syntax, type this command:

java -jar trang.jar newhire.xml newhire.rnc

You could also type the command using the switches -I and -O to produce the same results:

java -jar trang.jar -I xml -O rnc newhire.xml newhire.rnc

Either of these commands will produce newhire.rnc:

default namespace = "http://ns.hr-xml.org"

start =
  element Employee {
    element PersonName {
      element GivenName { xsd:NCName },
      element FamilyName { xsd:NCName },
      element Affix {
        attribute type { xsd:NCName },
        xsd:NCName
      }
    },
    element PostalAddress {
      element CountryCode { xsd:NCName },
      element PostalCode { xsd:integer },
      element Region { xsd:NCName },
      element Municipality { text },
      element DeliveryAddress {
        element PostOfficeBox { xsd:integer }
      }
    }
  }
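The datatypes in Trang's output follow from the sample values in newhire.xml: all-digit content becomes integer, single words become NCName, and New York, which contains a space, falls back to text. The Python sketch below mimics that outcome with a deliberately simple heuristic. It is an assumption made for illustration, not Trang's actual inference code; the function guess_type and the ASCII-only NCName pattern are hypothetical simplifications.

```python
import re

INTEGER = re.compile(r"[+-]?[0-9]+")
# ASCII-only approximation of an NCName: starts with a letter or
# underscore, contains no colon and no whitespace.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def guess_type(value: str) -> str:
    """Guess a compact-syntax pattern for one sample value (rough heuristic)."""
    if INTEGER.fullmatch(value):
        return "xsd:integer"
    if NCNAME.fullmatch(value):
        return "xsd:NCName"
    return "text"


# Element content taken from newhire.xml.
samples = {
    "GivenName": "Floyd",
    "PostalCode": "10001",
    "Region": "NY",
    "Municipality": "New York",
    "PostOfficeBox": "0000",
}

for name, value in samples.items():
    print(f"element {name} {{ {guess_type(value)} }}")
```

Because the schema is inferred from a single instance, a type such as NCName for GivenName reflects only the sample data (a name like "Mary Ann" would not match it), which is why editing the generated schema by hand is usually necessary.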



XML Hacks: 100 Industrial-Strength Tips and Tools
ISBN: 0596007116