Parsing XML with SAX


    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLReaderFactory;

    XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");
    parser.setContentHandler(new MyXMLHandler());
    parser.parse("document.xml");



The SAX API scans an XML document from start to finish and invokes callbacks for the events it encounters along the way. These events include the start and end of the document, the start and end of each element, and runs of character data. Attributes do not generate events of their own; they are delivered together with the start-of-element event.

In this phrase, we create an XMLReader instance backed by the Xerces SAXParser class. We then register a content handler using the setContentHandler() method. The content handler is a class that defines the callback methods the SAX parser invokes as it parses the document. Here we create an instance of MyXMLHandler, a class we must implement ourselves, to serve as our handler. Finally, we call the parse() method, passing the system identifier of an XML document (here, a file name), and SAX processing begins.
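To make the event sequence concrete, here is a minimal sketch that records each callback in the order it arrives. It is an illustration rather than part of the phrase: the EventLogHandler and EventOrderDemo names are hypothetical, the document is parsed from an in-memory string, and the parser comes from the JDK's javax.xml.parsers.SAXParserFactory instead of Xerces so that the example runs without extra libraries.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each callback in the order the parser makes it.
class EventLogHandler extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        // Attributes arrive with the element's start event, not as separate events.
        events.add("start:" + qName + " attrs=" + attributes.getLength());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }
}

class EventOrderDemo {
    // Parses an in-memory document and returns the recorded event sequence.
    static List<String> eventsFor(String xml) throws Exception {
        EventLogHandler handler = new EventLogHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : eventsFor("<book id=\"1\"><title>SAX</title></book>")) {
            System.out.println(event);
        }
    }
}
```

For the sample document, the recorded sequence is startDocument, start:book attrs=1, start:title attrs=0, end:title, end:book, endDocument. The id attribute shows up only as part of the start event for book.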

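Here we show an example implementation of the MyXMLHandler class. The DefaultHandler class that we extend is a convenience base class for SAX event handlers; it provides do-nothing implementations of every callback, so we override only the ones we need.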

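    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    class MyXMLHandler extends DefaultHandler {

        // Called when the parser reaches an element's start tag.
        @Override
        public void startElement(String uri,
                                 String localName,
                                 String qname,
                                 Attributes attributes) {
            // process start of element
        }

        // Called when the parser reaches an element's end tag.
        @Override
        public void endElement(String uri,
                               String localName,
                               String qname) {
            // process end of element
        }

        // Called with a chunk of character data found inside an element.
        @Override
        public void characters(char[] ch,
                               int start,
                               int length) {
            // process characters
        }

        public MyXMLHandler()
                throws org.xml.sax.SAXException {
            super();
        }
    }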


In this example implementation, we implement only three methods: startElement(), endElement(), and characters(). The SAX parser calls startElement() when it encounters the start of an element in the XML document. Likewise, it calls endElement() when it encounters the end of an element. The characters() method is called to report character data inside an element; the parser may deliver a single run of text in several calls, so a handler should not assume that one call contains the element's complete text.
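As an illustration of how the three callbacks cooperate, the following sketch collects the text of every title element. The TitleHandler and TitleDemo names and the title element are assumptions made for this example, and it assumes title elements are never nested inside one another. The parser again comes from the JDK's SAXParserFactory so that the code runs without Xerces on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text content of every <title> element.
class TitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<>();
    private StringBuilder text;   // non-null only while inside a <title>

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        if ("title".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than assign.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString().trim());
            text = null;
        }
    }
}

class TitleDemo {
    // Parses an in-memory document and returns the collected titles.
    static List<String> titlesIn(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>Java &amp; XML</title></book>"
                   + "<book><title> SAX </title></book></books>";
        System.out.println(titlesIn(xml));
    }
}
```

Buffering in a StringBuilder between startElement() and endElement() is the usual way to cope with text that arrives in pieces, which commonly happens around entity references such as &amp;amp;.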

See the DefaultHandler JavaDoc for a complete description of all the methods that can be overridden in the SAX handler: http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/helpers/DefaultHandler.html

In this phrase, the underlying SAX parser is the Xerces parser. We select it in the method call shown below:

    XMLReader parser = XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser");


JAXP is designed to support pluggable parser implementations, so if you prefer a parser other than Xerces, you can use it with the code in this phrase by passing that parser's class name to createXMLReader(). Whichever parser implementation you choose must be on your classpath.
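If you would rather not hard-code a parser class name at all, one option is to let JAXP choose the implementation. The sketch below is an alternative to the phrase, not a replacement for it: it obtains an XMLReader through javax.xml.parsers.SAXParserFactory, which consults the javax.xml.parsers.SAXParserFactory system property and installed providers before falling back to the parser bundled with the JDK. The ReaderLookupDemo class name is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ReaderLookupDemo {
    // Returns an XMLReader from whichever SAX implementation JAXP selects.
    static XMLReader defaultReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = defaultReader();
        // The concrete class depends on the JDK and classpath in use.
        System.out.println("Using parser: " + reader.getClass().getName());

        // From here on, the reader is used exactly as in the phrase.
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader("<doc/>")));
    }
}
```

The rest of the code is unchanged by this choice: setContentHandler() and parse() behave the same whichever implementation sits behind the XMLReader interface.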

SAX is generally more memory efficient than DOM because a SAX parser never holds the entire XML document in memory at once; it hands each event to the handler and moves on. The DOM API, by contrast, reads the whole document into an in-memory tree, which is then processed.
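The difference can be seen in a small side-by-side sketch that counts the elements in a document both ways. The ElementCountDemo class and its method names are hypothetical, and the input is a tiny in-memory string; the memory difference only matters in practice for large documents. The SAX version keeps nothing but a counter, while the DOM version builds the full tree before it can answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ElementCountDemo {
    // SAX: only a single counter is retained while the document streams past.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        count[0]++;
                    }
                });
        return count[0];
    }

    // DOM: the whole document is loaded into a tree, then queried.
    static int countWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><b/><c/></a>";
        System.out.println("SAX count: " + countWithSax(xml));
        System.out.println("DOM count: " + countWithDom(xml));
    }
}
```

Both methods report the same count for the same input. The trade-off is that DOM lets you revisit and navigate the tree afterwards, whereas SAX sees each event exactly once.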




Java™ Phrasebook: Essential Code and Commands
ISBN: 0672329077
Year: 2004
Pages: 166
