Talking to SAX Programs


JDOM works very well with SAX parsers. SAX is an almost-ideal event model for building a JDOM tree; and when the tree is complete, JDOM makes it easy to walk the tree, firing off SAX events as you go. Fast and memory efficient, SAX doesn't add a lot of extra overhead to JDOM programs.

Configuring SAXBuilder

When reading a file or stream through a SAX parser, you can set various properties on the parser, including the ErrorHandler, EntityResolver, DTDHandler, and any custom features or properties that are supported by the underlying SAX XMLReader. SAXBuilder includes several methods that delegate these configurations to the underlying XMLReader:

 public void setErrorHandler(ErrorHandler errorHandler)
 public void setEntityResolver(EntityResolver entityResolver)
 public void setDTDHandler(DTDHandler dtdHandler)
 public void setIgnoringElementContentWhitespace(boolean ignoreWhitespace)
 public void setFeature(String name, boolean value)
 public void setProperty(String name, Object value)
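To make one of these hooks concrete, here is a minimal sketch of an EntityResolver of the kind you might pass to setEntityResolver(). The class name OfflineResolver and its helper method are my own inventions for illustration, not part of JDOM or SAX. The demo drives the resolver with the JDK's bundled JAXP parser rather than SAXBuilder, purely so that it compiles without JDOM on the class path.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical resolver: answers every external entity request with an
// empty stream, so the parser never touches the network or file system.
class OfflineResolver implements EntityResolver {

  int requests = 0; // how many external entities the parser asked for

  @Override
  public InputSource resolveEntity(String publicId, String systemId) {
    requests++;
    return new InputSource(new StringReader(""));
  }

  // Demo helper (an assumption for this sketch): parses the XML text with
  // the JDK parser and reports how many entity requests were intercepted.
  static int countExternalEntities(String xml) {
    try {
      XMLReader reader
       = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
      OfflineResolver resolver = new OfflineResolver();
      reader.setEntityResolver(resolver);
      reader.setContentHandler(new DefaultHandler());
      reader.parse(new InputSource(new StringReader(xml)));
      return resolver.requests;
    }
    catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String xml = "<!DOCTYPE SONG SYSTEM 'http://example.invalid/song.dtd'>"
     + "<SONG/>";
    System.out.println(countExternalEntities(xml) + " request(s) intercepted");
  }
}
```

With JDOM, the equivalent wiring would presumably be just builder.setEntityResolver(new OfflineResolver()) before calling build(). Keep in mind that swallowing the DTD this way also discards any entity declarations and attribute defaults it contains.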

For example, suppose you want to schema validate documents before using them. This requires three additional steps beyond the norm:

  1. Explicitly pick a parser class that is known to be able to schema validate, such as org.apache.xerces.parsers.SAXParser. (Most parsers can't schema validate.)

  2. Install a SAX ErrorHandler that reports validity errors.

  3. Set the SAX feature that turns on schema validation to true. Which feature this is depends on the parser you picked in step 1. In Xerces, it's http://apache.org/xml/features/validation/schema, and you also need to turn on validation using the standard SAX feature http://xml.org/sax/features/validation.

Example 14.11 is a simple JDOM program that uses Xerces to schema validate a URL named on the command line. This is similar to the earlier JDOMValidator in Example 14.8. Here, because the installed ErrorHandler (BestSAXChecker from Example 7.8) merely prints validity error messages on System.out and does not throw an exception, validity errors do not terminate the parse. The Document object is still built as long as it's well-formed, whether or not it's valid. You could of course change this behavior by using a more draconian ErrorHandler that did throw exceptions for validity errors.

Example 14.11 A JDOM Program That Schema Validates Documents
 import org.jdom.JDOMException;
 import org.jdom.input.SAXBuilder;
 import java.io.IOException;

 public class JDOMSchemaValidator {

   public static void main(String[] args) {

     if (args.length == 0) {
       System.out.println("Usage: java JDOMSchemaValidator URL");
       return;
     }

     SAXBuilder builder = new SAXBuilder(
      "org.apache.xerces.parsers.SAXParser");
     builder.setValidation(true);
     builder.setErrorHandler(new BestSAXChecker());
                              // ^^^^^^^^^^^^^^
                              // From Chapter 7
     // turn on schema support
     builder.setFeature(
       "http://apache.org/xml/features/validation/schema", true);

     // command line should offer URIs or file names
     try {
       builder.build(args[0]);
     }
     // indicates a well-formedness error
     catch (JDOMException e) {
       System.out.println(args[0] + " is not well-formed.");
       System.out.println(e.getMessage());
     }
     catch (IOException e) {
       System.out.println("Could not check " + args[0]);
       System.out.println(" because " + e.getMessage());
     }

   }

 }

Here is the result when I used this program to check a mildly invalid document. One error was reported.

 % java JDOMSchemaValidator original_hotcop.xml
 Error: cvc-type.3.1.3: The value '6:20' of element 'LENGTH' is
  not valid.
  at line 10, column 24
  in entity file:///D:/books/XMLJAVA/examples/14/original_hotcop.xml
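The more draconian alternative mentioned earlier might look something like the following sketch. DraconianErrorHandler is a hypothetical name of my own, and it is not the BestSAXChecker class from Chapter 7. To keep the demo free of JDOM and Xerces dependencies, it is exercised here with the JDK's bundled parser and plain DTD validation rather than schema validation; the handler itself does not care which kind of validity error it receives.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical handler: warnings are merely printed, but every validity
// error and every fatal error is rethrown, which aborts the parse.
class DraconianErrorHandler implements ErrorHandler {

  @Override
  public void warning(SAXParseException e) {
    System.out.println("Warning: " + e.getMessage());
  }

  @Override
  public void error(SAXParseException e) throws SAXParseException {
    throw e; // validity errors arrive here
  }

  @Override
  public void fatalError(SAXParseException e) throws SAXParseException {
    throw e; // well-formedness errors arrive here
  }

  // Demo helper (an assumption for this sketch): DTD-validates the XML
  // text and returns false as soon as the handler rejects anything.
  static boolean parsesCleanly(String xml) {
    try {
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setValidating(true);
      XMLReader reader = factory.newSAXParser().getXMLReader();
      reader.setErrorHandler(new DraconianErrorHandler());
      reader.parse(new InputSource(new StringReader(xml)));
      return true;
    }
    catch (Exception e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String dtd = "<!DOCTYPE SONG [<!ELEMENT SONG (#PCDATA)>]>";
    System.out.println(parsesCleanly(dtd + "<SONG>Hot Cop</SONG>"));
    System.out.println(
     parsesCleanly(dtd + "<SONG><LENGTH>6:20</LENGTH></SONG>"));
  }
}
```

If you installed a handler like this with builder.setErrorHandler(), I would expect build() to report the first validity error as a JDOMException, so the catch block in Example 14.11 would no longer mean only "not well-formed." That is worth checking against the JDOM version you actually use.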

Caution

You should only use setFeature() and setProperty() for nonstandard features and properties like http://apache.org/xml/features/validation/schema. SAXBuilder requires certain settings of the standard features such as http://xml.org/sax/features/namespace-prefixes and standard properties such as http://xml.org/sax/properties/lexical-handler in order to work properly. If you change these, then the document may not be built correctly.


Another interesting possibility is to set a SAX filter that is applied to the document as it's read:

 public void setXMLFilter(XMLFilter filter)

If you use this, the JDOM Document will include only the filtered content.
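As a rough illustration of what such a filter could look like, the sketch below extends the standard org.xml.sax.helpers.XMLFilterImpl to drop every element with a given name, along with its content. ElementStripper and its helper method are hypothetical names introduced only for this example, and the demo again uses the JDK's bundled parser so that it runs without JDOM.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: suppresses each element with the unwanted qualified
// name, plus the elements and text nested inside it. Comments and
// processing instructions are not handled in this sketch.
class ElementStripper extends XMLFilterImpl {

  private final String unwanted;
  private int depth = 0; // greater than 0 while inside an unwanted element

  ElementStripper(String unwanted) {
    this.unwanted = unwanted;
  }

  @Override
  public void startElement(String uri, String localName, String qName,
   Attributes atts) throws SAXException {
    if (depth > 0 || qName.equals(unwanted)) {
      depth++;
      return;
    }
    super.startElement(uri, localName, qName, atts);
  }

  @Override
  public void endElement(String uri, String localName, String qName)
   throws SAXException {
    if (depth > 0) {
      depth--;
      return;
    }
    super.endElement(uri, localName, qName);
  }

  @Override
  public void characters(char[] ch, int start, int length)
   throws SAXException {
    if (depth == 0) super.characters(ch, start, length);
  }

  // Demo helper (an assumption for this sketch): lists the element names
  // that make it through the filter, in document order.
  static List<String> survivingElements(String xml, String unwanted) {
    List<String> names = new ArrayList<>();
    try {
      ElementStripper filter = new ElementStripper(unwanted);
      filter.setParent(
       SAXParserFactory.newInstance().newSAXParser().getXMLReader());
      filter.setContentHandler(new DefaultHandler() {
        @Override
        public void startElement(String uri, String localName,
         String qName, Attributes atts) {
          names.add(qName);
        }
      });
      filter.parse(new InputSource(new StringReader(xml)));
    }
    catch (Exception e) {
      throw new IllegalStateException(e);
    }
    return names;
  }

  public static void main(String[] args) {
    String xml = "<SONG><TITLE>Hot Cop</TITLE><LENGTH>6:20</LENGTH></SONG>";
    System.out.println(survivingElements(xml, "LENGTH"));
  }
}
```

Passed to builder.setXMLFilter(new ElementStripper("LENGTH")), a filter along these lines should yield a JDOM Document that simply has no LENGTH elements in it.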

SAXOutputter

In addition to reading a file or stream through a SAX parser, you can also feed a JDOM document into a SAX ContentHandler using the org.jdom.output.SAXOutputter class. This class is initially configured with a ContentHandler and optionally an ErrorHandler, DTDHandler, EntityResolver, and/or LexicalHandler. The output() method walks the tree, firing off events to these handlers as it does so.

For example, suppose you've built a document in memory that happens to contain some XInclude elements, and you'd like to resolve them. JDOM does not have built-in support for XInclude. To JDOM, an XInclude element is just an element that happens to have the local name include and the namespace URI http://www.w3.org/2001/XInclude. However, GNU JAXP does include a filter that can resolve XIncludes. Unfortunately, it's a SAX filter rather than a JDOM filter. Not to worry: it's straightforward to feed a JDOM document into the GNU JAXP gnu.xml.pipeline.XIncludeFilter using a SAXOutputter, as shown in Example 14.12.

Example 14.12 A JDOM Program That Passes Documents to a SAX ContentHandler
 import org.jdom.*;
 import org.jdom.input.SAXBuilder;
 import org.jdom.output.SAXOutputter;
 import java.io.IOException;
 import gnu.xml.pipeline.*;
 import org.xml.sax.SAXException;

 public class XIncluder {

   public static void main(String[] args) {

     if (args.length == 0) {
       System.out.println("Usage: java XIncluder URL");
       return;
     }

     SAXBuilder builder = new SAXBuilder(
      "gnu.xml.aelfred2.XmlReader");

     // command line should offer URIs or file names
     try {
       Document doc = builder.build(args[0]);
       XIncludeFilter filter = new XIncludeFilter(
         new TextConsumer(System.out)
       );
       SAXOutputter outputter = new SAXOutputter(filter);
       outputter.setContentHandler(filter);
       outputter.setDTDHandler(filter);
       outputter.setLexicalHandler(filter);
       outputter.output(doc);
     }
     // indicates a well-formedness error
     catch (JDOMException e) {
       System.out.println(args[0] + " is not well-formed.");
       System.out.println(e.getMessage());
     }
     catch (SAXException e) {
       System.out.println(e.getMessage());
     }
     catch (IOException e) {
       System.out.println("Could not merge " + args[0]);
       System.out.println(" because " + e.getMessage());
     }

   }

 }

Here the XIncludeFilter is itself hooked up to another GNU JAXP class, TextConsumer, which merely prints the document on a specified OutputStream.
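Nothing about this technique is specific to GNU JAXP. Any ContentHandler should be able to sit on the receiving end of a SAXOutputter. As a sketch, here is a small handler that tallies elements and text characters. TreeStatistics and its measure() helper are hypothetical names of my own, and the demo feeds the handler from the JDK's bundled parser instead of from a JDOM tree, only so that it runs without JDOM installed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts the elements and the characters of text
// in whatever stream of SAX events it is given.
class TreeStatistics extends DefaultHandler {

  int elements = 0;
  int textLength = 0;

  @Override
  public void startElement(String uri, String localName, String qName,
   Attributes atts) {
    elements++;
  }

  @Override
  public void characters(char[] ch, int start, int length) {
    // text may arrive in several chunks, so accumulate rather than assign
    textLength += length;
  }

  // Demo helper (an assumption for this sketch): generates the events
  // with the JDK parser instead of a SAXOutputter.
  static TreeStatistics measure(String xml) {
    try {
      XMLReader reader
       = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
      TreeStatistics stats = new TreeStatistics();
      reader.setContentHandler(stats);
      reader.parse(new InputSource(new StringReader(xml)));
      return stats;
    }
    catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    TreeStatistics stats = measure(
     "<SONG><TITLE>Hot Cop</TITLE><LENGTH>6:20</LENGTH></SONG>");
    System.out.println(stats.elements + " elements, "
     + stats.textLength + " characters of text");
  }
}
```

With a JDOM Document named doc in hand, the same handler could be driven by new SAXOutputter(stats).output(doc), using the SAXOutputter constructor that takes a ContentHandler as in Example 14.12.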



Processing XML with Java. A Guide to SAX, DOM, JDOM, JAXP, and TrAX
ISBN: 0201771861
Year: 2001
Pages: 191
