SAX, the Simple API for XML, is a straightforward, event-based API used to parse XML documents. David Megginson, SAX's original author, placed SAX in the public domain. SAX is bundled with all parsers that implement the API, including Xerces, MSXML, Crimson, the Oracle XML Parser for Java, and Ælfred. However, you can also get it and the full source code from http://sax.sourceforge.net/.
SAX was originally defined as a Java API and is intended primarily for parsers written in Java, so this chapter focuses on its Java implementation. However, ports to other object-oriented languages, such as C++, Python, Perl, and Eiffel, are common and usually quite similar.
The org.xml.sax package contains the core interfaces and classes that comprise the Simple API for XML.
The Attributes Interface |
An object that implements the Attributes interface represents a list of attributes on a start-tag. The order of attributes in the list is not guaranteed to match the order in the document itself. Attributes objects are passed as arguments to the startElement( ) method of ContentHandler. You can access particular attributes in three ways:
By number
By namespace URI and local name
By qualified (prefixed) name
This list does not include namespace declaration attributes (xmlns and xmlns:prefix) unless the http://xml.org/sax/features/namespace-prefixes feature is true. It is false by default.
If the namespace-prefixes feature is false, qualified name access may not be available; if the http://xml.org/sax/features/namespaces feature is false, local names and namespace URIs may not be available:
package org.xml.sax;

public interface Attributes {
    public int getLength( );
    public String getURI(int index);
    public String getLocalName(int index);
    public String getQName(int index);
    public int getIndex(String uri, String localName);
    public int getIndex(String qualifiedName);
    public String getType(int index);
    public String getType(String uri, String localName);
    public String getType(String qualifiedName);
    public String getValue(String uri, String localName);
    public String getValue(String qualifiedName);
    public String getValue(int index);
}
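The three lookup styles can be compared side by side. The following is a minimal sketch, not a definitive implementation: the class name AttributeAccessDemo, the sample document, and the example.com namespace URI are all hypothetical, and the parser is obtained through JAXP's SAXParserFactory, which this chapter does not cover.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class AttributeAccessDemo {

    // Hypothetical sample document; the names and URI are illustrative only.
    static final String XML =
        "<item xmlns:x='http://example.com/x' id='42' x:lang='en'/>";

    /** Parses XML and reports what each of the three lookup styles returns. */
    static String describe() {
        StringBuilder out = new StringBuilder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP factories default to false
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    // 1. By number. Order is not guaranteed, so sort by name.
                    Map<String, String> byIndex = new TreeMap<>();
                    for (int i = 0; i < atts.getLength(); i++) {
                        byIndex.put(atts.getQName(i), atts.getValue(i));
                    }
                    out.append(byIndex);
                    // 2. By namespace URI and local name.
                    out.append(" | ")
                       .append(atts.getValue("http://example.com/x", "lang"));
                    // 3. By qualified (prefixed) name.
                    out.append(" | ").append(atts.getValue("x:lang"));
                }
            });
            reader.parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Because the namespace-prefixes feature is left at its default of false, the xmlns:x declaration does not show up in the list; only id and x:lang do.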
The ContentHandler Interface |
ContentHandler is the key piece of SAX. Almost every SAX program needs to use this interface. ContentHandler is a callback interface. An instance of this interface is passed to the parser via the setContentHandler( ) method of XMLReader. As the parser reads the document, it invokes the methods in its ContentHandler to tell the program what's in the document:
package org.xml.sax;

public interface ContentHandler {
    public void setDocumentLocator(Locator locator);
    public void startDocument( ) throws SAXException;
    public void endDocument( ) throws SAXException;
    public void startPrefixMapping(String prefix, String uri)
        throws SAXException;
    public void endPrefixMapping(String prefix) throws SAXException;
    public void startElement(String namespaceURI, String localName,
        String qualifiedName, Attributes atts) throws SAXException;
    public void endElement(String namespaceURI, String localName,
        String qualifiedName) throws SAXException;
    public void characters(char[] text, int start, int length)
        throws SAXException;
    public void ignorableWhitespace(char[] text, int start, int length)
        throws SAXException;
    public void processingInstruction(String target, String data)
        throws SAXException;
    public void skippedEntity(String name) throws SAXException;
}
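To illustrate the callback flow, here is a small sketch of a handler that turns the event stream into a nested outline. The class name OutlineHandler and the sample input are hypothetical. For brevity it extends DefaultHandler rather than implementing all eleven ContentHandler methods, and it obtains its parser from JAXP's SAXParserFactory, which is an assumption outside this chapter.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class OutlineHandler extends DefaultHandler {

    private final StringBuilder outline = new StringBuilder();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        outline.append(localName).append('(');
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // The parser may split one run of text across several calls,
        // so each chunk is appended rather than assumed to be complete.
        outline.append(text, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        outline.append(')');
    }

    /** Parses the given document and returns its outline. */
    static String outlineOf(String xml) {
        OutlineHandler handler = new OutlineHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for local names
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.outline.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            outlineOf("<doc><title>SAX</title><p>Simple API</p></doc>"));
    }
}
```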
The DTDHandler Interface |
By passing an instance of the DTDHandler interface to the setDTDHandler( ) method of XMLReader, you can receive notification of notation and unparsed entity declarations in the DTD. You can store this information and use it later to retrieve information about the unparsed entities you encounter while reading the document:
package org.xml.sax;

public interface DTDHandler {
    public void notationDecl(String name, String publicID, String systemID)
        throws SAXException;
    public void unparsedEntityDecl(String name, String publicID,
        String systemID, String notationName) throws SAXException;
}
The EntityResolver Interface |
By passing an instance of the EntityResolver interface to the setEntityResolver( ) method of XMLReader, you can intercept parser requests for external entities, such as the external DTD subset or external parameter entities, and redirect those requests in order to substitute different entities. For example, you could replace a reference to a remote copy of a standard DTD with a local one or find the sources for particular public IDs in a catalog. The interface is also useful for applications that use URI types other than URLs:
package org.xml.sax;

public interface EntityResolver {
    public InputSource resolveEntity(String publicID, String systemID)
        throws SAXException, IOException;
}
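The DTD substitution described above might look like the following sketch. Everything specific in it is an assumption made for illustration: the class name LocalDtdResolver, the public ID, the unreachable example.invalid system ID, and the one-line DTD held in a string. The parser comes from JAXP's SAXParserFactory, and the sketch relies on the parser reading the external DTD subset, which the JDK's bundled parser does by default.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocalDtdResolver implements EntityResolver {

    // Hypothetical public ID and local replacement DTD.
    static final String PUBLIC_ID = "-//EXAMPLE//DTD Greeting//EN";
    static final String LOCAL_DTD = "<!ENTITY greeting 'hello'>";

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (PUBLIC_ID.equals(publicId)) {
            // Serve the local copy instead of fetching the remote one.
            return new InputSource(new StringReader(LOCAL_DTD));
        }
        return null; // null means: let the parser resolve it as usual
    }

    /** Parses a document whose DTD is supplied by the resolver. */
    static String parseWithLocalDtd() {
        String xml = "<!DOCTYPE doc PUBLIC '" + PUBLIC_ID
            + "' 'http://example.invalid/doc.dtd'><doc>&greeting;</doc>";
        StringBuilder text = new StringBuilder();
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver(new LocalDtdResolver());
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(parseWithLocalDtd());
    }
}
```

Returning null for every other request keeps the resolver narrow: only the one public ID it knows about is redirected.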
The ErrorHandler Interface |
By passing an instance of the ErrorHandler interface to the setErrorHandler( ) method of XMLReader, you can provide custom handling for particular classes of errors detected by the parser. For example, you can choose whether to stop parsing when a validity error is detected. The SAXParseException passed to each of the three methods in this interface provides details about the specific cause and location of the error:
package org.xml.sax;

public interface ErrorHandler {
    public void warning(SAXParseException exception) throws SAXException;
    public void error(SAXParseException exception) throws SAXException;
    public void fatalError(SAXParseException exception) throws SAXException;
}
Warnings represent possible problems noticed by the parser that are not technically violations of XML's well-formedness or validity rules. For instance, a parser might issue a warning if an xml:lang attribute's value was not a legal ISO-639 language code. The most common kind of error is a validity problem. The parser should report it, but it should also continue processing. A fatal error violates well-formedness. The parser should not continue parsing after reporting such an error.
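One possible policy is to record warnings and validity errors, keep parsing, and stop only on fatal errors. The sketch below assumes that policy. The class name CollectingErrorHandler and the deliberately invalid sample document are hypothetical, and validation is switched on through JAXP's SAXParserFactory rather than through a SAX feature.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

class CollectingErrorHandler implements ErrorHandler {

    // Hypothetical document: well-formed, but <extra/> violates the DTD.
    static final String INVALID_XML =
        "<!DOCTYPE doc [<!ELEMENT doc EMPTY>]><doc><extra/></doc>";

    final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        problems.add("warning at line " + e.getLineNumber()
            + ": " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        // Validity error: record it and let the parser continue.
        problems.add("error at line " + e.getLineNumber()
            + ": " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        throw e; // well-formedness error: give up
    }

    /** Validates the document and returns every problem reported. */
    static List<String> validate(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.problems;
    }

    public static void main(String[] args) {
        validate(INVALID_XML).forEach(System.out::println);
    }
}
```

The exact wording and number of messages depend on the parser, so the sketch prints whatever it collects instead of expecting particular text.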
The Locator Interface |
Unlike most other interfaces in the org.xml.sax package, the Locator interface does not have to be implemented. Instead, the parser has the option to provide an implementation. If it does so, it passes its implementation to the setDocumentLocator( ) method in your ContentHandler instance before it calls startDocument( ). You can save a reference to this object in a field in your ContentHandler class, like this:
private Locator locator;

public void setDocumentLocator(Locator locator) {
    this.locator = locator;
}
Once you've saved the locator, you can use it inside any other ContentHandler method, such as startElement( ) or characters( ), to determine in exactly which document and at which line and column the event took place. For instance, the locator allows you to determine that a particular start-tag was reported at the third column of the document's seventeenth line at the URL http://www.slashdot.org/slashdot.xml:
package org.xml.sax;

public interface Locator {
    public String getPublicId( );
    public String getSystemId( );
    public int getLineNumber( );
    public int getColumnNumber( );
}
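Putting the pieces together, a handler can save the locator and consult it on every start-tag. This is a minimal sketch: the class name LineTracker and the three-line sample document are hypothetical, and the parser is obtained through JAXP's SAXParserFactory. Since a parser is not obliged to supply a locator, the sketch falls back to -1 when none was provided.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LineTracker extends DefaultHandler {

    private Locator locator; // stays null if the parser supplies none
    private final StringBuilder seen = new StringBuilder();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        int line = (locator == null) ? -1 : locator.getLineNumber();
        if (seen.length() > 0) {
            seen.append(' ');
        }
        seen.append(qName).append('@').append(line);
    }

    /** Returns each element name paired with the line it was reported on. */
    static String trace(String xml) {
        LineTracker tracker = new LineTracker();
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(tracker);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return tracker.seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace("<a>\n  <b/>\n</a>"));
    }
}
```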
The XMLFilter Interface |
An XMLFilter is an XMLReader that obtains its events from another parent XMLReader, rather than reading them from a text source such as an InputStream. Filters can sit between the original source XML and the application and modify data in the original source before passing it to the application. Implementing this interface directly is unusual. It is almost always much easier to use the more complete org.xml.sax.helpers.XMLFilterImpl class instead.
package org.xml.sax;

public interface XMLFilter extends XMLReader {
    public void setParent(XMLReader parent);
    public XMLReader getParent( );
}
The XMLReader Interface |
The XMLReader interface represents the XML parser that reads XML documents. You generally do not implement this interface yourself. Instead, use the org.xml.sax.helpers.XMLReaderFactory class to build a parser-specific implementation. Then use this parser's various setter methods to configure the parsing process. Finally, invoke the parse( ) method to read the document, while calling back to methods in your own implementations of ContentHandler, ErrorHandler, EntityResolver, and DTDHandler as the document is read:
package org.xml.sax;

public interface XMLReader {
    public boolean getFeature(String name)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public void setFeature(String name, boolean value)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public Object getProperty(String name)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public void setProperty(String name, Object value)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public void setEntityResolver(EntityResolver resolver);
    public EntityResolver getEntityResolver( );
    public void setDTDHandler(DTDHandler handler);
    public DTDHandler getDTDHandler( );
    public void setContentHandler(ContentHandler handler);
    public ContentHandler getContentHandler( );
    public void setErrorHandler(ErrorHandler handler);
    public ErrorHandler getErrorHandler( );
    public void parse(InputSource input) throws IOException, SAXException;
    public void parse(String systemID) throws IOException, SAXException;
}
The InputSource Class |
The InputSource class is an abstraction of a data source from which the raw bytes of an XML document are read. It can wrap a system ID, a public ID, an InputStream, or a Reader. When given an InputSource, the parser tries to read from the Reader. If the InputSource does not have a Reader, the parser will try to read from the InputStream using the specified encoding. If no encoding is specified, then it will try to autodetect the encoding by reading the XML declaration. Finally, if neither a Reader nor an InputStream has been set, then the parser will open a connection to the URL given by the system ID.
package org.xml.sax;

public class InputSource {
    public InputSource( );
    public InputSource(String systemID);
    public InputSource(InputStream byteStream);
    public InputSource(Reader reader);
    public void setPublicId(String publicID);
    public String getPublicId( );
    public void setSystemId(String systemID);
    public String getSystemId( );
    public void setByteStream(InputStream byteStream);
    public InputStream getByteStream( );
    public void setEncoding(String encoding);
    public String getEncoding( );
    public void setCharacterStream(Reader reader);
    public Reader getCharacterStream( );
}
The SAXException Class |
Most exceptions thrown by SAX methods are instances of the SAXException class or one of its subclasses. The single exception to this rule is the parse( ) method of XMLReader, which may throw a raw IOException if a purely I/O-related error occurs, for example, if a socket is broken before the parser finishes reading the document from the network.
Besides the usual exception methods, such as getMessage( ) and printStackTrace( ), that SAXException inherits from or overrides in its superclasses, SAXException adds a getException( ) method to return the nested exception that caused the SAXException to be thrown in the first place:
package org.xml.sax;

public class SAXException extends Exception {
    public SAXException(String message);
    public SAXException(Exception ex);
    public SAXException(String message, Exception ex);
    public String getMessage( );
    public Exception getException( );
    public String toString( );
}
SAXParseException |
If the parser detects a well-formedness error while reading a document, it throws a SAXParseException, a subclass of SAXException. SAXParseExceptions are also passed as arguments to the methods of the ErrorHandler interface, where you can decide whether you want to throw them.
Besides the methods it inherits from its superclasses, this class adds methods to get the line number, column number, system ID, and public ID of the document where the error was detected:
package org.xml.sax;

public class SAXParseException extends SAXException {
    public SAXParseException(String message, Locator locator);
    public SAXParseException(String message, Locator locator, Exception e);
    public SAXParseException(String message, String publicID,
        String systemID, int lineNumber, int columnNumber);
    public SAXParseException(String message, String publicID,
        String systemID, int lineNumber, int columnNumber, Exception e);
    public String getPublicId( );
    public String getSystemId( );
    public int getLineNumber( );
    public int getColumnNumber( );
}
SAXNotRecognizedException |
A SAXNotRecognizedException is thrown if you attempt to set a property or feature the parser does not recognize. Besides the constructors, all its methods are inherited from superclasses:
package org.xml.sax;

public class SAXNotRecognizedException extends SAXException {
    public SAXNotRecognizedException( );
    public SAXNotRecognizedException(String message);
}
SAXNotSupportedException |
A SAXNotSupportedException is thrown if you attempt to set a property or feature that the parser recognizes but either cannot get or set at the moment or cannot set to the value you requested. Besides the constructors, all of its methods are inherited from superclasses:
package org.xml.sax;

public class SAXNotSupportedException extends SAXException {
    public SAXNotSupportedException( );
    public SAXNotSupportedException(String message);
}
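The two exceptions can be told apart by catching them separately. The sketch below queries one core feature and one made-up feature. The class name FeatureProbe and the example.com feature URI are hypothetical, and the parser is assumed to come from JAXP's SAXParserFactory.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class FeatureProbe {

    // A feature URI no parser should know about (illustrative only).
    static final String BOGUS =
        "http://example.com/features/no-such-feature";
    static final String NAMESPACES =
        "http://xml.org/sax/features/namespaces";

    /** Reports the feature's value, or why it could not be read. */
    static String probe(XMLReader reader, String featureUri) {
        try {
            return "value=" + reader.getFeature(featureUri);
        } catch (SAXNotRecognizedException e) {
            return "not recognized";   // the parser has never heard of it
        } catch (SAXNotSupportedException e) {
            return "not supported";    // known, but unavailable right now
        }
    }

    /** Builds a reader; wraps checked exceptions to keep callers short. */
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance()
                .newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("no parser available", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        System.out.println(probe(reader, NAMESPACES));
        System.out.println(probe(reader, BOGUS));
    }
}
```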
The org.xml.sax.helpers package contains support classes for the core SAX classes. These classes include factory classes used to build instances of particular org.xml.sax interfaces and default implementations of those interfaces.
The AttributesImpl Class |
AttributesImpl is a default implementation of the Attributes interface that SAX parsers and filters may use. Besides the methods of the Attributes interface, this class offers manipulator methods so the list of attributes can be modified or reused. These methods allow you to take a persistent snapshot of an Attributes object in startElement( ) and construct or modify an Attributes object in a SAX driver or filter:
package org.xml.sax.helpers;

public class AttributesImpl implements Attributes {
    public AttributesImpl( );
    public AttributesImpl(Attributes atts);
    public int getLength( );
    public String getURI(int index);
    public String getLocalName(int index);
    public String getQName(int index);
    public String getType(int index);
    public String getValue(int index);
    public int getIndex(String uri, String localName);
    public int getIndex(String qualifiedName);
    public String getType(String uri, String localName);
    public String getType(String qualifiedName);
    public String getValue(String uri, String localName);
    public String getValue(String qualifiedName);
    public void clear( );
    public void setAttributes(Attributes atts);
    public void addAttribute(String uri, String localName,
        String qualifiedName, String type, String value);
    public void setAttribute(int index, String uri, String localName,
        String qualifiedName, String type, String value);
    public void removeAttribute(int index);
    public void setURI(int index, String uri);
    public void setLocalName(int index, String localName);
    public void setQName(int index, String qualifiedName);
    public void setType(int index, String type);
    public void setValue(int index, String value);
}
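The snapshot-and-modify pattern needs no parser to demonstrate. In this sketch the class name AttributeSnapshot and the attribute names and values are hypothetical. A second AttributesImpl stands in for the parser's own Attributes object, and clearing it simulates the parser reusing that object for the next element.

```java
import org.xml.sax.helpers.AttributesImpl;

class AttributeSnapshot {

    /** Copies a list, empties the source, then edits the copy. */
    static String demo() {
        // Stand-in for the Attributes object a parser passes in.
        AttributesImpl live = new AttributesImpl();
        live.addAttribute("", "id", "id", "CDATA", "42");

        // The copy constructor takes a persistent snapshot.
        AttributesImpl saved = new AttributesImpl(live);

        // Simulate the parser recycling its object.
        live.clear();

        // The snapshot is independent and can be modified freely.
        saved.setValue(0, "43");
        saved.addAttribute("", "lang", "lang", "CDATA", "en");

        return live.getLength() + " " + saved.getLength() + " "
            + saved.getValue("id") + " " + saved.getValue("", "lang");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```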
The DefaultHandler Class |
DefaultHandler is a convenience class that implements the EntityResolver, DTDHandler, ContentHandler, and ErrorHandler interfaces with do-nothing methods. You can subclass DefaultHandler and override methods for events to which you actually want to respond. You never have to use this class. You can always implement the interfaces directly instead. The pattern is similar to the adapter classes in the AWT, such as MouseAdapter and WindowAdapter:
package org.xml.sax.helpers;

public class DefaultHandler
    implements EntityResolver, DTDHandler, ContentHandler, ErrorHandler {

    // Default implementation of the EntityResolver interface.
    public InputSource resolveEntity(String publicID, String systemID)
        throws SAXException { return null; }

    // Default implementation of the DTDHandler interface.
    public void notationDecl(String name, String publicID, String systemID)
        throws SAXException {}
    public void unparsedEntityDecl(String name, String publicID,
        String systemID, String notationName) throws SAXException {}

    // Default implementation of the ContentHandler interface.
    public void setDocumentLocator(Locator locator) {}
    public void startDocument( ) throws SAXException {}
    public void endDocument( ) throws SAXException {}
    public void startPrefixMapping(String prefix, String uri)
        throws SAXException {}
    public void endPrefixMapping(String prefix) throws SAXException {}
    public void startElement(String uri, String localName,
        String qualifiedName, Attributes attributes) throws SAXException {}
    public void endElement(String uri, String localName,
        String qualifiedName) throws SAXException {}
    public void characters(char[] text, int start, int length)
        throws SAXException {}
    public void ignorableWhitespace(char[] whitespace, int start, int length)
        throws SAXException {}
    public void processingInstruction(String target, String data)
        throws SAXException {}
    public void skippedEntity(String name) throws SAXException {}

    // Default implementation of the ErrorHandler interface.
    public void warning(SAXParseException ex) throws SAXException {}
    public void error(SAXParseException ex) throws SAXException {}
    public void fatalError(SAXParseException ex) throws SAXException {
        throw ex;
    }
}
The LocatorImpl Class |
LocatorImpl is a default implementation of the Locator interface for the convenience of parser writers. You probably won't need to use it directly. Besides the constructors, it adds setter methods to set the public ID, system ID, line number, and column number returned by the getter methods declared in Locator:
package org.xml.sax.helpers;

public class LocatorImpl implements Locator {
    public LocatorImpl( );
    public LocatorImpl(Locator locator);
    public String getPublicId( );
    public String getSystemId( );
    public int getLineNumber( );
    public int getColumnNumber( );
    public void setPublicId(String publicID);
    public void setSystemId(String systemID);
    public void setLineNumber(int lineNumber);
    public void setColumnNumber(int columnNumber);
}
The NamespaceSupport Class |
NamespaceSupport provides a stack that can track the namespaces in scope at various points in the document. To use it, push a new context at the beginning of each element's namespace mappings, and pop it at the end of each element. Each startPrefixMapping( ) invocation should call declarePrefix( ) to add a new mapping to the NamespaceSupport object. Then at any point where you need to figure out to which URI a prefix is bound, you can call getURI( ). The empty string indicates the default namespace. The getter methods can tell you the prefix that is mapped to any URI or the URI that is mapped to any prefix at each point in the document. If you reuse the same NamespaceSupport object for multiple documents, be sure to call reset( ) in between documents.
package org.xml.sax.helpers;

public class NamespaceSupport {
    public final static String XMLNS="http://www.w3.org/XML/1998/namespace";
    public NamespaceSupport( );
    public void reset( );
    public void pushContext( );
    public void popContext( );
    public boolean declarePrefix(String prefix, String uri);
    public String[] processName(String qualifiedName, String[] parts,
        boolean isAttribute);
    public String getURI(String prefix);
    public Enumeration getPrefixes( );
    public String getPrefix(String uri);
    public Enumeration getPrefixes(String uri);
    public Enumeration getDeclaredPrefixes( );
}
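The context stack can be exercised by hand, without a parser. The sketch below pushes an outer and an inner context and looks up the same prefix at each stage. The class name NamespaceScopes and the example.com URIs are hypothetical. In a real handler the declarePrefix( ) calls would come from startPrefixMapping( ).

```java
import org.xml.sax.helpers.NamespaceSupport;

class NamespaceScopes {

    /** Looks up the prefix "x" in four different scopes. */
    static String demo() {
        NamespaceSupport ns = new NamespaceSupport();

        // Outer element: declares a default namespace and the prefix x.
        ns.pushContext();
        ns.declarePrefix("", "http://example.com/default");
        ns.declarePrefix("x", "http://example.com/x");
        String outer = ns.getURI("x");

        // Inner element: rebinds x for the duration of the element.
        ns.pushContext();
        ns.declarePrefix("x", "http://example.com/inner");
        String inner = ns.getURI("x");
        ns.popContext();

        // Back in the outer element, the original binding is restored.
        String restored = ns.getURI("x");
        ns.popContext();

        // Outside both elements, x is no longer bound.
        String outside = ns.getURI("x");

        return outer + " " + inner + " " + restored + " " + outside;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```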
The ParserAdapter Class |
The ParserAdapter class uses the adapter design pattern to convert a SAX1 org.xml.sax.Parser object into a SAX2 org.xml.sax.XMLReader object. As more parsers support SAX2, this class becomes less necessary. Note that some SAX2 features are not available through an adapted SAX1 parser. For instance, a parser created with this adapter does not report skipped entities and does not support most features and properties, not even the core features and properties:
package org.xml.sax.helpers;

public class ParserAdapter implements XMLReader, DocumentHandler {
    public ParserAdapter( ) throws SAXException;
    public ParserAdapter(Parser parser);

    // Implementation of org.xml.sax.XMLReader.
    public void setFeature(String name, boolean state)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public boolean getFeature(String name)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public void setProperty(String name, Object value)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public Object getProperty(String name)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public void setEntityResolver(EntityResolver resolver);
    public EntityResolver getEntityResolver( );
    public void setDTDHandler(DTDHandler handler);
    public DTDHandler getDTDHandler( );
    public void setContentHandler(ContentHandler handler);
    public ContentHandler getContentHandler( );
    public void setErrorHandler(ErrorHandler handler);
    public ErrorHandler getErrorHandler( );
    public void parse(String systemID) throws IOException, SAXException;
    public void parse(InputSource input) throws IOException, SAXException;

    // Implementation of org.xml.sax.DocumentHandler.
    public void setDocumentLocator(Locator locator);
    public void startDocument( ) throws SAXException;
    public void endDocument( ) throws SAXException;
    public void startElement(String qualifiedName,
        AttributeList qualifiedAttributes) throws SAXException;
    public void endElement(String qualifiedName) throws SAXException;
    public void characters(char[] text, int start, int length)
        throws SAXException;
    public void ignorableWhitespace(char[] text, int start, int length)
        throws SAXException;
    public void processingInstruction(String target, String data)
        throws SAXException;
}
The XMLFilterImpl Class |
XMLFilterImpl is invaluable for implementing XML filters correctly. An instance of this class sits between an XMLReader and the client application's event handlers. It receives messages from the reader and passes them to the application unchanged, and vice versa. However, by subclassing this class and overriding particular methods, you can change the events that are sent before the application gets to see them. You chain a filter to an XMLReader by passing the reader as an argument to the filter's constructor. When parsing, you invoke the filter's parse( ) method, not the reader's parse( ) method.
package org.xml.sax.helpers;

public class XMLFilterImpl implements XMLFilter, EntityResolver,
    DTDHandler, ContentHandler, ErrorHandler {
    public XMLFilterImpl( );
    public XMLFilterImpl(XMLReader parent);

    // Implementation of org.xml.sax.XMLFilter
    public void setParent(XMLReader parent);
    public XMLReader getParent( );

    // Implementation of org.xml.sax.XMLReader
    public void setFeature(String name, boolean state)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public boolean getFeature(String name)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public void setProperty(String name, Object value)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public Object getProperty(String name)
        throws SAXNotRecognizedException, SAXNotSupportedException;
    public void setEntityResolver(EntityResolver resolver);
    public EntityResolver getEntityResolver( );
    public void setDTDHandler(DTDHandler handler);
    public DTDHandler getDTDHandler( );
    public void setContentHandler(ContentHandler handler);
    public ContentHandler getContentHandler( );
    public void setErrorHandler(ErrorHandler handler);
    public ErrorHandler getErrorHandler( );
    public void parse(InputSource input) throws SAXException, IOException;
    public void parse(String systemID) throws SAXException, IOException;

    // Implementation of org.xml.sax.EntityResolver
    public InputSource resolveEntity(String publicID, String systemID)
        throws SAXException, IOException;

    // Implementation of org.xml.sax.DTDHandler
    public void notationDecl(String name, String publicID, String systemID)
        throws SAXException;
    public void unparsedEntityDecl(String name, String publicID,
        String systemID, String notationName) throws SAXException;

    // Implementation of org.xml.sax.ContentHandler
    public void setDocumentLocator(Locator locator);
    public void startDocument( ) throws SAXException;
    public void endDocument( ) throws SAXException;
    public void startPrefixMapping(String prefix, String uri)
        throws SAXException;
    public void endPrefixMapping(String prefix) throws SAXException;
    public void startElement(String namespaceURI, String localName,
        String qualifiedName, Attributes atts) throws SAXException;
    public void endElement(String namespaceURI, String localName,
        String qualifiedName) throws SAXException;
    public void characters(char[] text, int start, int length)
        throws SAXException;
    public void ignorableWhitespace(char[] text, int start, int length)
        throws SAXException;
    public void processingInstruction(String target, String data)
        throws SAXException;
    public void skippedEntity(String name) throws SAXException;

    // Implementation of org.xml.sax.ErrorHandler
    public void warning(SAXParseException ex) throws SAXException;
    public void error(SAXParseException ex) throws SAXException;
    public void fatalError(SAXParseException ex) throws SAXException;
}
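As an illustration of subclassing, the following sketch overrides two methods so that element names reach the application in uppercase, while everything else passes through untouched. The class name UpperCaseFilter and the sample document are hypothetical, and the underlying parser is assumed to come from JAXP's SAXParserFactory.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class UpperCaseFilter extends XMLFilterImpl {

    UpperCaseFilter(XMLReader parent) {
        super(parent); // chain the filter to the real parser
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Rewrite the names, then hand the event on to the application.
        super.startElement(uri, localName.toUpperCase(Locale.ROOT),
            qName.toUpperCase(Locale.ROOT), atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        super.endElement(uri, localName.toUpperCase(Locale.ROOT),
            qName.toUpperCase(Locale.ROOT));
    }

    /** Parses through the filter and returns an outline of what arrived. */
    static String filteredOutline(String xml) {
        StringBuilder out = new StringBuilder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader parser = factory.newSAXParser().getXMLReader();
            XMLReader filter = new UpperCaseFilter(parser);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    out.append(qName).append('(');
                }
                @Override
                public void characters(char[] ch, int start, int length) {
                    out.append(ch, start, length);
                }
                @Override
                public void endElement(String uri, String localName,
                                       String qName) {
                    out.append(')');
                }
            });
            // Call parse( ) on the filter, not on the underlying parser.
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(filteredOutline("<doc><p>hi</p></doc>"));
    }
}
```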
The XMLReaderAdapter Class |
XMLReaderAdapter is the reverse of ParserAdapter; it uses the Adapter design pattern to adapt a SAX2 XMLReader to a SAX1 Parser. This lets you use SAX2 parsers for legacy programs written to a SAX1 interface:
package org.xml.sax.helpers;

public class XMLReaderAdapter implements Parser, ContentHandler {
    public XMLReaderAdapter( ) throws SAXException;
    public XMLReaderAdapter(XMLReader reader);

    // Implementation of org.xml.sax.Parser.
    public void setLocale(Locale locale) throws SAXException;
    public void setEntityResolver(EntityResolver resolver);
    public void setDTDHandler(DTDHandler handler);
    public void setDocumentHandler(DocumentHandler handler);
    public void setErrorHandler(ErrorHandler handler);
    public void parse(String systemID) throws IOException, SAXException;
    public void parse(InputSource input) throws IOException, SAXException;

    // Implementation of org.xml.sax.ContentHandler.
    public void setDocumentLocator(Locator locator);
    public void startDocument( ) throws SAXException;
    public void endDocument( ) throws SAXException;
    public void startPrefixMapping(String prefix, String uri)
        throws SAXException;
    public void endPrefixMapping(String prefix) throws SAXException;
    public void startElement(String namespaceURI, String localName,
        String qualifiedName, Attributes atts) throws SAXException;
    public void endElement(String namespaceURI, String localName,
        String qualifiedName) throws SAXException;
    public void characters(char[] text, int start, int length)
        throws SAXException;
    public void ignorableWhitespace(char[] text, int start, int length)
        throws SAXException;
    public void processingInstruction(String target, String data)
        throws SAXException;
    public void skippedEntity(String name) throws SAXException;
}
The XMLReaderFactory Class |
XMLReaderFactory creates XMLReader instances in a parser-independent manner. The no-argument createXMLReader( ) method instantiates the class named by the org.xml.sax.driver system property. The other createXMLReader( ) method instantiates the class named by its argument. This argument should be a fully package-qualified class name, such as org.apache.xerces.parsers.SAXParser:
package org.xml.sax.helpers;

public final class XMLReaderFactory {
    public static XMLReader createXMLReader( ) throws SAXException;
    public static XMLReader createXMLReader(String className)
        throws SAXException;
}
Absolute URIs are used to name a SAX parser's properties and features. Features have a boolean value; that is, for each parser, a recognized feature is either true or false. Properties have object values. SAX defines six core features and two core properties that parsers should recognize. In addition, parsers can add features and properties to this list, and most do.
SAX Core Features |
All SAX parsers should recognize six core features. Of these six, two (http://xml.org/sax/features/namespaces and http://xml.org/sax/features/namespace-prefixes) must be implemented by all conformant processors. The other four are optional and may not be implemented by all parsers:
http://xml.org/sax/features/namespaces
When true, this feature indicates that the startElement( ) and endElement( ) methods provide namespace URIs and local names for elements and attributes. When false, the parser provides prefixed element and attribute names to the startElement( ) and endElement( ) methods. If a parser does not provide something it is not required to provide, then that value will be set to the empty string. However, most parsers provide all three (URI, local name, and prefixed name) regardless of the value of this feature. This feature is true by default.
http://xml.org/sax/features/namespace-prefixes
When true, this feature indicates that xmlns and xmlns:prefix attributes will be included in the attributes list passed to startElement( ). When false, these attributes are omitted. Furthermore, if this feature is true, then the parser will provide the prefixed names for elements and attributes. The default is false unless http://xml.org/sax/features/namespaces is false, in which case this feature defaults to true. You can set both http://xml.org/sax/features/namespaces and http://xml.org/sax/features/namespace-prefixes to true to guarantee that local names, namespace URIs, and prefixed names are all available.
http://xml.org/sax/features/string-interning
When this feature is true, all element names, prefixes, attribute names, namespace URIs, and local names are interned using the intern( ) method of java.lang.String; that is, equal names compare as equal when using ==.
http://xml.org/sax/features/validation
When true, the parser validates. When false, it doesn't. The default is false for most parsers. If you turn on this feature, you'll probably also want to register an ErrorHandler with the XMLReader to receive notice of any validity errors.
http://xml.org/sax/features/external-general-entities
When true, the parser resolves external parsed general entities. When false, it doesn't. The default is true for most parsers that can resolve external entities. Turning on validation automatically activates this feature because validation requires resolving external entities.
http://xml.org/sax/features/external-parameter-entities
When true, the parser resolves external parameter entities. When false, it doesn't. Turning on validation automatically activates this feature because validation requires resolving external entities.
SAX Core Properties |
SAX defines two core properties, though implementations are not required to support them:
http://xml.org/sax/properties/dom-node
This property's value is an org.w3c.dom.Node object that represents the current node the parser is visiting.
http://xml.org/sax/properties/xml-string
This property's value is a java.lang.String object containing the characters that were the source for the current event. As of mid-2001, no Java parsers are known to implement this property.
The org.xml.sax.ext package provides optional interfaces that parsers may use to provide further functionality. Not all parsers support these interfaces, though most major ones do.
The DeclHandler Interface |
DeclHandler is a callback interface that provides information about the ELEMENT, ATTLIST, and parsed ENTITY declarations in the document's DTD. To configure an XMLReader with a DeclHandler, pass the name http://xml.org/sax/properties/declaration-handler and an instance of your handler to the reader's setProperty( ) method:
try {
    parser.setProperty(
        "http://xml.org/sax/properties/declaration-handler",
        new YourDeclHandlerImplementationClass( ));
}
catch (SAXException e) {
    System.out.println("This parser does not provide declarations.");
}
If the parser does not provide declaration events, it throws a SAXNotRecognizedException. If the parser cannot install a DeclHandler at this moment (generally because it's in the middle of parsing a document), then it throws a SAXNotSupportedException. If it doesn't throw one of these exceptions, it will call back to the methods in your DeclHandler as it parses the DTD:
package org.xml.sax.ext;

public interface DeclHandler {
    public void elementDecl(String name, String model) throws SAXException;
    public void attributeDecl(String elementName, String attributeName,
        String type, String defaultValue, String value) throws SAXException;
    public void internalEntityDecl(String name, String value)
        throws SAXException;
    public void externalEntityDecl(String name, String publicID,
        String systemID) throws SAXException;
}
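A complete handler might simply list what the DTD declares. The sketch below is one way to do that under a few assumptions: the class name DeclarationLister and the sample document are hypothetical, and the parser comes from JAXP's SAXParserFactory. It registers the handler under http://xml.org/sax/properties/declaration-handler, the property identifier used by the final SAX 2.0 release and recognized by the JDK's bundled parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;

class DeclarationLister implements DeclHandler {

    // Hypothetical document with an internal DTD subset.
    static final String XML =
        "<!DOCTYPE doc [<!ELEMENT doc (#PCDATA)>"
        + "<!ATTLIST doc id CDATA #IMPLIED>"
        + "<!ENTITY who 'world'>]><doc>hi</doc>";

    final List<String> declarations = new ArrayList<>();

    @Override
    public void elementDecl(String name, String model) {
        declarations.add("element " + name);
    }

    @Override
    public void attributeDecl(String elementName, String attributeName,
                              String type, String mode, String value) {
        declarations.add("attribute " + elementName + "/" + attributeName);
    }

    @Override
    public void internalEntityDecl(String name, String value) {
        declarations.add("entity " + name);
    }

    @Override
    public void externalEntityDecl(String name, String publicId,
                                   String systemId) {
        declarations.add("external entity " + name);
    }

    /** Parses the document and returns the declarations reported. */
    static List<String> list(String xml) {
        DeclarationLister lister = new DeclarationLister();
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setProperty(
                "http://xml.org/sax/properties/declaration-handler", lister);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return lister.declarations;
    }

    public static void main(String[] args) {
        list(XML).forEach(System.out::println);
    }
}
```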
The LexicalHandler Interface |
LexicalHandler is a callback interface that provides information about aspects of the document that are not normally relevant, specifically:
CDATA sections
Entity boundaries
DTD boundaries
Comments
Without a LexicalHandler, the parser simply ignores comments and expands entity references and CDATA sections. By using the LexicalHandler interface, however, you can read the comments and learn which text came from regular character data, which came from a CDATA section, and which came from which entity reference.
To configure an XMLReader with a LexicalHandler, pass an instance of your handler to the reader's setProperty( ) method with the name http://xml.org/sax/properties/lexical-handler:
try {
    parser.setProperty(
        "http://xml.org/sax/properties/lexical-handler",
        new YourLexicalHandlerClass( ));
}
catch (SAXException e) {
    System.out.println("This parser does not provide lexical events.");
}
If the parser does not provide lexical events, it throws a SAXNotRecognizedException. If the parser cannot install a LexicalHandler at this moment (generally because it's in the middle of parsing a document), then it throws a SAXNotSupportedException. If it doesn't throw one of these exceptions, it calls back to the methods in your LexicalHandler as it encounters entity references, comments, and CDATA sections. The basic content of the resolved entities and CDATA sections is still reported through the ContentHandler interface, as normal:
package org.xml.sax.ext;

public interface LexicalHandler {
    public void startDTD(String name, String publicID, String systemID)
        throws SAXException;
    public void endDTD( ) throws SAXException;
    public void startEntity(String name) throws SAXException;
    public void endEntity(String name) throws SAXException;
    public void startCDATA( ) throws SAXException;
    public void endCDATA( ) throws SAXException;
    public void comment(char[] text, int start, int length)
        throws SAXException;
}
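For example, a handler can log comments and CDATA boundaries that a plain ContentHandler never sees. This is a minimal sketch: the class name LexicalEventLogger and the sample document are hypothetical, and the parser is assumed to come from JAXP's SAXParserFactory. The handler is registered under http://xml.org/sax/properties/lexical-handler, the property identifier used by the final SAX 2.0 release and recognized by the JDK's bundled parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;

class LexicalEventLogger implements LexicalHandler {

    // Hypothetical document with one comment and one CDATA section.
    static final String XML =
        "<doc><!-- note --><![CDATA[a<b]]></doc>";

    final List<String> events = new ArrayList<>();

    @Override public void startDTD(String name, String publicId,
                                   String systemId) {
        events.add("startDTD");
    }
    @Override public void endDTD() { events.add("endDTD"); }
    @Override public void startEntity(String name) {
        events.add("startEntity(" + name + ")");
    }
    @Override public void endEntity(String name) {
        events.add("endEntity(" + name + ")");
    }
    @Override public void startCDATA() { events.add("startCDATA"); }
    @Override public void endCDATA() { events.add("endCDATA"); }
    @Override public void comment(char[] text, int start, int length) {
        events.add("comment("
            + new String(text, start, length).trim() + ")");
    }

    /** Parses the document and returns the lexical events in order. */
    static String log(String xml) {
        LexicalEventLogger logger = new LexicalEventLogger();
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", logger);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return String.join(" ", logger.events);
    }

    public static void main(String[] args) {
        System.out.println(log(XML));
    }
}
```

The text inside the CDATA section is not logged here; it still arrives through characters( ) on whatever ContentHandler is installed.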