Working with SAX

This first example shows how to work with SAX. In this case, I'll use SAX to count the number of <CUSTOMER> elements in the same example we saw in the previous chapter. Here's that file, renamed ch12_01.xml for this chapter:

Listing ch12_01.xml
<?xml version = "1.0" standalone="yes"?>
<DOCUMENT>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Smith</LAST_NAME>
            <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2003</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Tomatoes</PRODUCT>
                <NUMBER>8</NUMBER>
                <PRICE>.25</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Oranges</PRODUCT>
                <NUMBER>24</NUMBER>
                <PRICE>.98</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Jones</LAST_NAME>
            <FIRST_NAME>Polly</FIRST_NAME>
        </NAME>
        <DATE>October 20, 2003</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Bread</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Apples</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
    <CUSTOMER>
        <NAME>
            <LAST_NAME>Weber</LAST_NAME>
            <FIRST_NAME>Bill</FIRST_NAME>
        </NAME>
        <DATE>October 25, 2003</DATE>
        <ORDERS>
            <ITEM>
                <PRODUCT>Asparagus</PRODUCT>
                <NUMBER>12</NUMBER>
                <PRICE>.95</PRICE>
            </ITEM>
            <ITEM>
                <PRODUCT>Lettuce</PRODUCT>
                <NUMBER>6</NUMBER>
                <PRICE>.50</PRICE>
            </ITEM>
        </ORDERS>
    </CUSTOMER>
</DOCUMENT>

Here, I'll base the new program on a new class named ch12_02:

public class ch12_02 extends DefaultHandler
{
        .
        .
        .
}

Note the keywords extends DefaultHandler here. This means that our class, ch12_02, is based on the SAX DefaultHandler class (org.xml.sax.helpers.DefaultHandler). The DefaultHandler class already has a number of methods predefined for you that the SAX parser will call, including these callback methods:

  • startDocument: Called when the start of the document is encountered

  • endDocument: Called when the end of the document is encountered

  • startElement: Called when the opening tag of an element is encountered

  • endElement: Called when the closing tag of an element is encountered

  • characters: Called when the XML parser sees text characters
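
To see the order in which these callbacks arrive, here's a minimal sketch that records every event the parser reports. The class name EventTrace and the helper method trace are made up for illustration, and the sample XML is parsed from a string rather than from a file. Note that a parser is allowed to deliver one run of text in several characters calls, so treat the text events as approximate.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class: logs each callback the SAX parser makes.
class EventTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        events.add("startElement " + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement " + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) of the buffer belongs to this event
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("characters " + text);
        }
    }

    // Parses an XML string and returns the recorded events in order
    static List<String> trace(String xml) {
        EventTrace handler = new EventTrace();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        trace("<NAME><LAST_NAME>Smith</LAST_NAME></NAME>")
            .forEach(System.out::println);
    }
}
```

The list always begins with startDocument and ends with endDocument, and each element's start event comes before the events of its children.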

All the required callback methods are already implemented in the DefaultHandler class, but they don't do anything. That means you have to implement only the methods you want to use, such as startDocument to catch the beginning of the document, or endDocument to catch the end of the document, as we'll see. You can see all the methods of the DefaultHandler class in Table 12-1.
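
As a small illustration of overriding only what you need, the following sketch implements a single callback, characters, and leaves every other method with its do-nothing default. The class name TextCollector and the method textOf are assumptions of mine, not part of the chapter's example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class: gathers all character data in a document.
class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    // The only callback we override; the inherited ones stay empty
    @Override
    public void characters(char[] ch, int start, int length) {
        // Append just this event's slice; text may arrive in several pieces
        text.append(ch, start, length);
    }

    // Parses an XML string and returns its concatenated text content
    static String textOf(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
        return handler.text.toString();
    }

    public static void main(String[] args) {
        System.out.println(textOf(
            "<NAME><FIRST_NAME>Sam</FIRST_NAME> <LAST_NAME>Smith</LAST_NAME></NAME>"));
    }
}
```

Appending to a StringBuilder, rather than assuming one call per text run, keeps the result correct even if the parser splits the text across several characters calls.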

Table 12-1. Methods of the DefaultHandler Class

Method | Does This
DefaultHandler() | The class constructor
void characters(char[] ch, int start, int length) | Handles character data inside an element
void endDocument() | Handles the end of the document
void endElement(String uri, String localName, String qName) | Handles the end of an element
void endPrefixMapping(String prefix) | Handles the end of a namespace mapping
void error(SAXParseException e) | Handles a recoverable parser error
void fatalError(SAXParseException e) | Reports a fatal parsing error
void ignorableWhitespace(char[] ch, int start, int length) | Handles ignorable whitespace (such as that used to indent a document) in element content
void notationDecl(String name, String publicId, String systemId) | Handles a notation declaration
void processingInstruction(String target, String data) | Handles an XML processing instruction (such as an <?xml-stylesheet?> instruction)
InputSource resolveEntity(String publicId, String systemId) | Resolves an external entity
void setDocumentLocator(Locator locator) | Sets a Locator object for document events
void skippedEntity(String name) | Handles a skipped XML entity
void startDocument() | Handles the beginning of the document
void startElement(String uri, String localName, String qName, Attributes attributes) | Handles the start of an element
void startPrefixMapping(String prefix, String uri) | Handles the start of a namespace mapping
void unparsedEntityDecl(String name, String publicId, String systemId, String notationName) | Handles an unparsed entity declaration
void warning(SAXParseException e) | Handles a parser warning
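
The three error-related methods in Table 12-1 deserve a quick look, because the inherited versions of warning and error do nothing, while the inherited fatalError rethrows the exception. Here's a sketch, under my own naming (ReportingHandler and check are hypothetical), that records each problem along with its line number. A document that is not well formed, such as one with mismatched tags, is reported through fatalError.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class: collects parser warnings and errors.
class ReportingHandler extends DefaultHandler {
    final List<String> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        problems.add("warning: " + describe(e));
    }

    @Override
    public void error(SAXParseException e) {
        problems.add("error: " + describe(e));
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        problems.add("fatal: " + describe(e));
        throw e; // stop parsing, as the inherited method does
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    // Parses an XML string and returns the problems the parser reported
    static List<String> check(String xml) {
        ReportingHandler handler = new ReportingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // In this sketch, fatalError has already recorded the problem
        } catch (Exception e) {
            throw new IllegalStateException("Could not run the parser", e);
        }
        return handler.problems;
    }

    public static void main(String[] args) {
        // <b> is never closed, so the document is not well formed
        check("<a><b></a>").forEach(System.out::println);
    }
}
```

The exact wording of the message comes from the parser implementation, so it can differ from one Java installation to another.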

You have to pass the SAX parser an object based on a handler class such as DefaultHandler so that it can call the methods you see in Table 12-1. Our main class, ch12_02, is based on the DefaultHandler class, so we can pass an object of the ch12_02 class to the SAX parser. I begin by creating an object of the ch12_02 class named obj and then calling that object's displayDocument method, which creates the SAX parser. (The reason I create a new object from the current class instead of just calling displayDocument directly is that we have to pass an object to the SAX parser so that the parser can call that object's callback methods; we'll use obj itself for that purpose.)

public static void main(String args[])
{
    ch12_02 obj = new ch12_02();
    obj.displayDocument(args[0]);
}

In the displayDocument method, I'll create the SAX parser. To do that, you can use the Java SAXParserFactory class to create an object of the SAXParser class. (As with the DocumentBuilderFactory class in Chapter 11, "Java and the XML DOM," the Java SAXParserFactory class is called a factory because you can use it to create parsers using Java classes from different parser vendors, not just the default Java XML SAX parser that we'll use here.)

The actual parsing is done by the SAXParser object's parse method. You pass it the object whose methods it is supposed to call when the parser sees the beginning of an XML element, the end of an element, and so on. In this case, that's the current object (that is, obj, whose displayDocument method we're inside right now). In Java, you can refer to the present object with the this keyword. That means we can parse the file the user is asking us to parse like this:

public static void main(String args[])
{
    ch12_02 obj = new ch12_02();
    obj.displayDocument(args[0]);
}

public void displayDocument(String uri)
{
    // Pass this object to the parser so that it can call our callback methods
    DefaultHandler handler = this;
    SAXParserFactory factory = SAXParserFactory.newInstance();
    try {
        SAXParser saxParser = factory.newSAXParser();
        saxParser.parse(new File(uri), handler);
    } catch (Throwable t) {
        // Report parsing problems instead of silently discarding them
        t.printStackTrace();
    }
}

You'll find the methods of the SAXParserFactory class in Table 12-2 and the methods of the SAXParser class in Table 12-3.

Table 12-2. Methods of the javax.xml.parsers.SAXParserFactory Class

Method | Does This
protected SAXParserFactory() | The default constructor
abstract boolean getFeature(String name) | Returns the state of the particular feature requested
boolean isNamespaceAware() | True if the factory will produce parsers that use XML namespaces
boolean isValidating() | True if the factory will produce parsers that validate the XML content
static SAXParserFactory newInstance() | Gets a new SAXParserFactory object
abstract SAXParser newSAXParser() | Creates a new SAXParser object
abstract void setFeature(String name, boolean value) | Sets the particular feature requested
void setNamespaceAware(boolean awareness) | Requires the parser produced to support XML namespaces
void setValidating(boolean validating) | Requires the parser produced to validate XML documents

Table 12-3. Methods of the SAXParser Class

Method | Does This
protected SAXParser() | The default constructor
abstract Parser getParser() | Returns the SAX parser
abstract Object getProperty(String name) | Returns the particular property requested
abstract XMLReader getXMLReader() | Returns the XMLReader object used
abstract boolean isNamespaceAware() | True if this parser is configured to understand namespaces
abstract boolean isValidating() | True if this parser is configured to validate XML documents
void parse(File f, DefaultHandler dh) | Parses the content of the file specified using the specified DefaultHandler object
void parse(File f, HandlerBase hb) | Parses the content of the file specified using the specified HandlerBase object
void parse(InputSource is, DefaultHandler dh) | Parses the content specified by InputSource using the specified DefaultHandler object
void parse(InputSource is, HandlerBase hb) | Parses the content specified by InputSource using the specified HandlerBase object
void parse(InputStream is, DefaultHandler dh) | Parses the content of the specified InputStream instance using the specified DefaultHandler object
void parse(InputStream is, DefaultHandler dh, String systemId) | Parses the content of the specified InputStream instance using the specified DefaultHandler object and system ID
void parse(InputStream is, HandlerBase hb) | Parses the content of the specified InputStream instance using the specified HandlerBase object
void parse(InputStream is, HandlerBase hb, String systemId) | Parses the content of the specified InputStream instance using the specified HandlerBase object and system ID
void parse(String uri, DefaultHandler dh) | Parses the content described by the given uniform resource identifier (URI) using the specified DefaultHandler object
void parse(String uri, HandlerBase hb) | Parses the content described by the specified URI using the specified HandlerBase object
abstract void setProperty(String name, Object value) | Sets a particular property in the XMLReader object
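
To show how the factory settings in Table 12-2 carry over to the parsers it creates, here's a short sketch that configures a factory and then queries the resulting SAXParser. The class name FactoryConfig and the method newParser are hypothetical names I've chosen for the illustration.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical illustration class: configures a factory, then inspects the parser.
class FactoryConfig {
    // Creates a parser with the requested namespace and validation settings
    static SAXParser newParser(boolean namespaceAware, boolean validating) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        factory.setValidating(validating);
        try {
            return factory.newSAXParser();
        } catch (Exception e) {
            throw new IllegalStateException("Could not create a SAX parser", e);
        }
    }

    public static void main(String[] args) {
        SAXParser parser = newParser(true, false);
        System.out.println("Namespace aware: " + parser.isNamespaceAware());
        System.out.println("Validating: " + parser.isValidating());
    }
}
```

You configure the factory before calling newSAXParser; the parser it returns keeps those settings, which you can confirm with isNamespaceAware and isValidating. A freshly created factory has both settings turned off.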

Now the SAX parser calls the various methods in the ch12_02 class when it encounters elements in the document we're parsing. In this case, the goal is to determine how many <CUSTOMER> elements the document has, so I implement the startElement method like this:

import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.*;
import java.io.*;

public class ch12_02 extends DefaultHandler
{
    public void startElement(String uri, String localName, String qualifiedName,
        Attributes attributes)
    {
        .
        .
        .
    }
}

The startElement method is called each time the SAX parser sees the start of an element, and the endElement method is called when the SAX parser sees the end of an element.
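
Because every startElement call is eventually matched by an endElement call, you can use the pair to keep track of where you are in the document. The following sketch, with the made-up names DepthHandler and maxDepthOf, uses that pairing to find how deeply a document's elements are nested.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class: measures element nesting depth.
class DepthHandler extends DefaultHandler {
    private int depth = 0;    // how many elements are currently open
    private int maxDepth = 0; // the deepest nesting seen so far

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    // Parses an XML string and returns its maximum nesting depth
    static int maxDepthOf(String xml) {
        DepthHandler handler = new DepthHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
        return handler.maxDepth;
    }

    public static void main(String[] args) {
        System.out.println(maxDepthOf(
            "<DOCUMENT><CUSTOMER><NAME><LAST_NAME>Smith</LAST_NAME></NAME></CUSTOMER></DOCUMENT>"));
    }
}
```

SAX does not keep a tree in memory, so any context you need, such as the current depth, has to be stored in fields of the handler like this.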

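Note that two element names are passed to the startElement method: localName and qualifiedName. You use the localName argument with namespace processing; this argument holds the name of the element without any namespace prefix. The qualifiedName argument holds the full, qualified name of the element, including any namespace prefix. A SAXParserFactory object is not namespace-aware by default, so unless you call its setNamespaceAware method with a value of true, localName may be empty and you should rely on qualifiedName, as this example does.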
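
Here's a sketch that makes the difference between the two names visible by parsing an element that has a namespace prefix. The class name NameDemo, the method namesIn, and the namespace URI urn:example:customers are all invented for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class: records the names passed to startElement.
class NameDemo extends DefaultHandler {
    // One {uri, localName, qualifiedName} triple per opening tag
    final List<String[]> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qualifiedName,
            Attributes attributes) {
        names.add(new String[] {uri, localName, qualifiedName});
    }

    // Parses an XML string with namespace processing switched on or off
    static List<String[]> namesIn(String xml, boolean namespaceAware) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        NameDemo handler = new NameDemo();
        try {
            factory.newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
        return handler.names;
    }

    public static void main(String[] args) {
        // The namespace URI below is a made-up example
        String xml = "<c:CUSTOMER xmlns:c=\"urn:example:customers\"/>";
        for (boolean aware : new boolean[] {true, false}) {
            String[] n = namesIn(xml, aware).get(0);
            System.out.println("namespace aware = " + aware + ": uri=[" + n[0]
                + "] localName=[" + n[1] + "] qualifiedName=[" + n[2] + "]");
        }
    }
}
```

With namespace processing on, the handler receives the namespace URI and the local name CUSTOMER; with it off, the name to rely on is the qualified name c:CUSTOMER.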

We're just going to count the number of <CUSTOMER> elements, so I'll take a look at the element's qualifiedName argument. If that argument equals "CUSTOMER", I'll increment a variable named customerCount:

import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.*;
import java.io.*;

public class ch12_02 extends DefaultHandler
{
    int customerCount = 0;

    public void startElement(String uri, String localName, String qualifiedName,
        Attributes attributes)
    {
        if (qualifiedName.equals("CUSTOMER")) {
            customerCount++;
        }
    }
        .
        .
        .
}

How do you know when you've reached the end of the document and there are no more <CUSTOMER> elements to count? You use the endDocument method, which is called when the end of the document is reached. I'll display the number of tallied <CUSTOMER> elements in that method:

Listing ch12_02.java
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.*;
import java.io.*;

public class ch12_02 extends DefaultHandler
{
    int customerCount = 0;

    // Called for each opening tag; count the <CUSTOMER> elements
    public void startElement(String uri, String localName, String qualifiedName,
        Attributes attributes)
    {
        if (qualifiedName.equals("CUSTOMER")) {
            customerCount++;
        }
    }

    // Called once at the end of the document; report the total
    public void endDocument()
    {
        System.out.println("The document has " + customerCount + " <CUSTOMER> elements.");
    }

    public static void main(String args[])
    {
        ch12_02 obj = new ch12_02();
        obj.displayDocument(args[0]);
    }

    public void displayDocument(String uri)
    {
        DefaultHandler handler = this;
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new File(uri), handler);
        } catch (Throwable t) {
            // Report parsing problems instead of silently discarding them
            t.printStackTrace();
        }
    }
}

You can compile and then run this program like this:

%javac ch12_02.java
%java ch12_02 ch12_01.xml
The document has 3 <CUSTOMER> elements.
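
If you want to count every kind of element rather than just <CUSTOMER>, the same startElement technique extends naturally. The following is only a sketch under my own naming (ElementTally and tally are hypothetical); it keeps one counter per element name in a map instead of a single customerCount variable.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class: counts how often each element name occurs.
class ElementTally extends DefaultHandler {
    // A TreeMap keeps the element names in alphabetical order
    final Map<String, Integer> counts = new TreeMap<>();

    @Override
    public void startElement(String uri, String localName, String qualifiedName,
            Attributes attributes) {
        // Start at 1 for a new name, or add 1 to the existing count
        counts.merge(qualifiedName, 1, Integer::sum);
    }

    // Parses an XML string and returns the per-element counts
    static Map<String, Integer> tally(String xml) {
        ElementTally handler = new ElementTally();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
        return handler.counts;
    }

    public static void main(String[] args) {
        String xml = "<DOCUMENT><CUSTOMER><ITEM/><ITEM/></CUSTOMER><CUSTOMER/></DOCUMENT>";
        tally(xml).forEach((name, count) -> System.out.println(name + ": " + count));
    }
}
```

To run it against a file such as ch12_01.xml instead of a string, you could pass a File object to the parse method, as ch12_02 does.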

And that's all it takes to get started with SAX.



Real World XML (2nd Edition)
ISBN: 0735712867
Year: 2005
Pages: 440
Authors: Steve Holzner
