Hack 71 Check the Integrity of a W3C Schema

   

Use the xni.XMLGrammarBuilder class from Xerces to do some extra checking on your schemas.

Whether you create a W3C XSD Schema with the same tools you use to create other XML documents or use a specialized schema-generation tool to create one, parsing it against the schema in the Schema for Schemas appendix of the W3C Schema recommendation (http://www.w3.org/TR/xmlschema-1/#normative-schemaSchema) may alert you to some problems, such as whether you mistyped the name of a schema definition element or put one schema definition element inside of another where it doesn't belong. There are other potential errors that this won't catch, though; for example, what if your maxOccurs value for one element is less than the minOccurs value for the same element?

The xni.XMLGrammarBuilder class, one of the samples in the Xerces Java distribution, is designed for validating many documents against one schema: it creates a compiled version of the schema in memory and then reuses that compiled version for each instance document passed to it. Like any compiler, it makes various integrity checks as it compiles. If you're developing an XSD schema, this round of checks can help you before you've created your first document that conforms to that schema.
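The compile-once, reuse-many-times idea can be sketched with the JDK's own javax.xml.validation API. This is not how xni.XMLGrammarBuilder itself is implemented (it works with Xerces grammar pools); it is only a minimal illustration of the same pattern, and the class and method names (CompiledSchemaReuse, compile, validateAll) are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Xerces: compile a schema once, reuse it.
class CompiledSchemaReuse {

    /** Compiles the schema; a problem in the schema itself surfaces here. */
    static Schema compile(String schemaText) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            return factory.newSchema(new StreamSource(new StringReader(schemaText)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("Schema did not compile: " + e.getMessage(), e);
        }
    }

    /** Validates every document against the same compiled schema. */
    static List<Boolean> validateAll(Schema schema, List<String> documents) {
        List<Boolean> results = new ArrayList<>();
        for (String doc : documents) {
            try {
                // A Validator is cheap to create; the Schema is the costly, shared part.
                schema.newValidator().validate(new StreamSource(new StringReader(doc)));
                results.add(true);
            } catch (SAXException e) {
                results.add(false);   // document is malformed or not valid
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String schemaText =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='itemNum'><xs:simpleType>"
          + "<xs:restriction base='xs:string'><xs:pattern value='\\d{3}-\\d{4}'/>"
          + "</xs:restriction></xs:simpleType></xs:element></xs:schema>";
        Schema schema = compile(schemaText);
        System.out.println(validateAll(schema, Arrays.asList(
            "<itemNum>123-4567</itemNum>", "<itemNum>oops</itemNum>")));
    }
}
```

Because the schema is compiled before any document is read, a broken schema is rejected in compile() even if the list of documents is empty.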

Imagine that you just drafted a schema, badschema.xsd, which has the following problems:

  • In the content model for the order element, the itemNum element has a maxOccurs value of 1 and a minOccurs value of 4. If the number of occurrences must be greater than or equal to 4 and less than or equal to 1, that doesn't leave any valid values!

  • It declares the itemNum element to be of type itemTypist. While the schema does declare a type called itemType, it has no type called itemTypist, and this certainly isn't one of the primitive or derived datatypes listed in the XML Schema datatypes recommendation (http://www.w3.org/TR/xmlschema-2/).

  • It declares the monthType type as being an integer with values between 1 and 12, inclusive. The orderMonthType is declared as a restriction of monthType, but with a greater range of values allowed (0 to 12). This is illegal because a restricted type definition must narrow the allowable values, not expand them.
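The first of these problems is a plain numeric contradiction between two attributes, which is exactly the kind of thing a grammar for schema documents can't express. As a rough illustration only, a hand-rolled check for it might look like the following sketch. The OccursRangeCheck class and its emptyRanges method are hypothetical names, the code uses the JDK's DOM parser rather than Xerces-specific classes, and it looks only at xs:element declarations.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: finds xs:element declarations whose minOccurs exceeds maxOccurs.
class OccursRangeCheck {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    static List<String> emptyRanges(String schemaText) {
        List<String> found = new ArrayList<>();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);   // needed to match on the schema namespace
            NodeList elements = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(schemaText)))
                .getElementsByTagNameNS(XSD_NS, "element");
            for (int i = 0; i < elements.getLength(); i++) {
                Element e = (Element) elements.item(i);
                // Both attributes default to 1 when absent.
                String min = e.hasAttribute("minOccurs") ? e.getAttribute("minOccurs").trim() : "1";
                String max = e.hasAttribute("maxOccurs") ? e.getAttribute("maxOccurs").trim() : "1";
                if (max.equals("unbounded")) {
                    continue;              // no upper limit, so no contradiction
                }
                if (Integer.parseInt(min) > Integer.parseInt(max)) {
                    String name = e.hasAttribute("name") ? e.getAttribute("name") : e.getAttribute("ref");
                    found.add(name + ": minOccurs=" + min + " > maxOccurs=" + max);
                }
            }
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not read schema: " + ex.getMessage(), ex);
        }
        return found;
    }

    public static void main(String[] args) {
        String schemaText =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='order'><xs:complexType><xs:sequence>"
          + "<xs:element name='itemNum' type='xs:string' maxOccurs='1' minOccurs='4'/>"
          + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
        emptyRanges(schemaText).forEach(System.out::println);
    }
}
```

A schema compiler makes this check and many others for you, so a sketch like this is useful mainly for seeing what the check amounts to.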

Here is the schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

 <xs:element name="orders">
  <xs:complexType>
   <xs:sequence>
    <xs:element ref="order" maxOccurs="unbounded"/>
   </xs:sequence>
  </xs:complexType>
 </xs:element>

 <xs:element name="order">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="itemNum" type="itemTypist"
                maxOccurs="1" minOccurs="4"/>  <!-- line 16 -->
    <xs:element name="orderMonth" type="orderMonthType"
                maxOccurs="1"/>
   </xs:sequence>
  </xs:complexType>
 </xs:element>

  <xs:simpleType name="itemType">
    <xs:restriction base="xs:string">
      <xs:pattern value="\d{3}-\d{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="monthType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="orderMonthType">
    <xs:restriction base="monthType">  <!-- line 37 -->
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>

Before trying the following command, make sure that your classpath includes both the xercesImpl.jar and the xercesSamples.jar files that come with the Java Xerces distribution (Version 2.6.2 or later). You can download the Xerces distribution from http://xml.apache.org/xerces2-j/download.cgi. While in the working directory, enter this command:

java -cp xercesImpl.jar;xercesSamples.jar xni.XMLGrammarBuilder  -a badschema.xsd

Use a colon (:) between JAR filenames if you are working in a Unix environment. The xni.XMLGrammarBuilder's -a switch names the schema to parse. The error messages it outputs list the problems with the schema:

[Error] badschema.xsd:37:38: FacetValueFromBase: Value '0' of facet 'minInclusive' must be from the value space of the base type.
[Error] badschema.xsd:16:46: p-props-correct.2.1: {min occurs} = '4' must not be greater than {max occurs} = '1' for 'element'.
[Error] badschema.xsd:16:46: src-resolve: Cannot resolve the name 'itemTypist' to a(n) type definition component.
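If you'd rather run a similar check from your own code than from the command line, the following is a minimal sketch that uses the JDK's javax.xml.validation API instead of xni.XMLGrammarBuilder. The class name SchemaIntegrityCheck and its check method are invented for this example, and the number, order, and wording of the messages you get depend on the parser version bundled with your JDK, so don't expect them to match the output above exactly.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: compiles a schema and collects every problem reported.
class SchemaIntegrityCheck {

    static List<String> check(String schemaText) {
        List<String> problems = new ArrayList<>();
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // Record problems instead of stopping at the first one.
        factory.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e)    { problems.add(format("Warning", e)); }
            public void error(SAXParseException e)      { problems.add(format("Error", e)); }
            public void fatalError(SAXParseException e) { problems.add(format("Fatal Error", e)); }
        });
        try {
            // Compiling the schema is what triggers the integrity checks.
            factory.newSchema(new StreamSource(new StringReader(schemaText)));
        } catch (SAXException e) {
            // Fatal problems (such as malformed XML) still end the compile;
            // avoid recording them twice if the handler already saw them.
            if (problems.isEmpty()) {
                problems.add("[Fatal Error] " + e.getMessage());
            }
        }
        return problems;
    }

    private static String format(String level, SAXParseException e) {
        return "[" + level + "] " + e.getLineNumber() + ":" + e.getColumnNumber()
            + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java SchemaIntegrityCheck badschema.xsd
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String problem : check(text)) {
            System.out.println(problem);
        }
    }
}
```

An empty list means the schema compiled cleanly. As with xni.XMLGrammarBuilder, no instance document is involved at any point.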

I know of no other utility that lets you check XML 1.0 DTDs for correctness in this way. To check a DTD I'd created, I used to throw together a simple document that conformed to it and then validate that document to see whether a parser would find any problems with the DTD itself on the way to parsing the document. Having done this many times, I particularly appreciate xni.XMLGrammarBuilder's ability to check schema integrity with no need for any sample documents.

Bob DuCharme



XML Hacks: 100 Industrial-Strength Tips and Tools
ISBN: 0596007116