This first example shows how to work with SAX. In this case, I'll use SAX to count the number of <CUSTOMER> elements in the same example we saw in the previous chapter. Here's that file, renamed ch12_01.xml for this chapter:

Listing ch12_01.xml

    <?xml version = "1.0" standalone="yes"?>
    <DOCUMENT>
        <CUSTOMER>
            <NAME>
                <LAST_NAME>Smith</LAST_NAME>
                <FIRST_NAME>Sam</FIRST_NAME>
            </NAME>
            <DATE>October 15, 2003</DATE>
            <ORDERS>
                <ITEM>
                    <PRODUCT>Tomatoes</PRODUCT>
                    <NUMBER>8</NUMBER>
                    <PRICE>.25</PRICE>
                </ITEM>
                <ITEM>
                    <PRODUCT>Oranges</PRODUCT>
                    <NUMBER>24</NUMBER>
                    <PRICE>.98</PRICE>
                </ITEM>
            </ORDERS>
        </CUSTOMER>
        <CUSTOMER>
            <NAME>
                <LAST_NAME>Jones</LAST_NAME>
                <FIRST_NAME>Polly</FIRST_NAME>
            </NAME>
            <DATE>October 20, 2003</DATE>
            <ORDERS>
                <ITEM>
                    <PRODUCT>Bread</PRODUCT>
                    <NUMBER>12</NUMBER>
                    <PRICE>.95</PRICE>
                </ITEM>
                <ITEM>
                    <PRODUCT>Apples</PRODUCT>
                    <NUMBER>6</NUMBER>
                    <PRICE>.50</PRICE>
                </ITEM>
            </ORDERS>
        </CUSTOMER>
        <CUSTOMER>
            <NAME>
                <LAST_NAME>Weber</LAST_NAME>
                <FIRST_NAME>Bill</FIRST_NAME>
            </NAME>
            <DATE>October 25, 2003</DATE>
            <ORDERS>
                <ITEM>
                    <PRODUCT>Asparagus</PRODUCT>
                    <NUMBER>12</NUMBER>
                    <PRICE>.95</PRICE>
                </ITEM>
                <ITEM>
                    <PRODUCT>Lettuce</PRODUCT>
                    <NUMBER>6</NUMBER>
                    <PRICE>.50</PRICE>
                </ITEM>
            </ORDERS>
        </CUSTOMER>
    </DOCUMENT>

Here, I'll base the new program on a new class named ch12_02:

    public class ch12_02 extends DefaultHandler {
        .
        .
        .
    }

Note the keywords extends DefaultHandler here. This means that our class, ch12_02, is based on the Java DefaultHandler class. The DefaultHandler class already has a number of methods predefined for you that the SAX parser will call, including these callback methods:
All the required callback methods are already implemented in the DefaultHandler class, but they don't do anything. That means you have to implement only the methods you want to use, such as startDocument to catch the beginning of the document, or endDocument to catch the end of the document, as we'll see. You can see all the methods of the DefaultHandler class in Table 12-1.

Table 12-1. Methods of the DefaultHandler Class
You have to pass the SAX parser an object based on a handler class such as DefaultHandler so that it can call the methods you see in Table 12-1. Our main class, ch12_02, is based on the DefaultHandler class, so we can pass an object of the ch12_02 class to the SAX parser. I begin by creating an object of the ch12_02 class named obj and then calling that object's displayDocument method, which creates the SAX parser. (The reason I create a new object from the current class instead of just calling displayDocument directly is that we have to pass an object to the SAX parser so that the parser can call that object's callback methods; we'll use obj itself for that purpose.)

    public static void main(String args[]) {
        ch12_02 obj = new ch12_02();
        obj.displayDocument(args[0]);
    }

In the displayDocument method, I'll create the SAX parser. To do that, you can use the Java SAXParserFactory class to create an object of the SAXParser class. (As with the DocumentBuilderFactory class in Chapter 11, "Java and the XML DOM," the Java SAXParserFactory class is called a factory because you can use it to create parsers using Java classes from different parser vendors, not just the default Java XML SAX parser that we'll use here.) The actual parsing is done by the SAXParser object's parse method. You pass it the object whose methods it is supposed to call when the parser sees the beginning of an XML element, the end of an element, and so on. In this case, that's the current object (that is, obj, whose displayDocument method we're inside right now). In Java, you can refer to the present object with the this keyword.
That means we can parse the file the user is asking us to parse like this:

    public static void main(String args[]) {
        ch12_02 obj = new ch12_02();
        obj.displayDocument(args[0]);
    }

    public void displayDocument(String uri) {
        // The current object supplies the callback methods.
        DefaultHandler handler = this;
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(new File(uri), handler);
        } catch (Throwable t) {
            // Report parsing problems instead of silently ignoring them.
            t.printStackTrace();
        }
    }

You'll find the methods of the SAXParserFactory class in Table 12-2 and the methods of the SAXParser class in Table 12-3.

Table 12-2. Methods of the javax.xml.parsers.SAXParserFactory Class
Table 12-3. Methods of the SAXParser Class
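Before counting elements, it can help to see the order in which the parser makes its callbacks. The following is a minimal sketch rather than one of this chapter's listings: the class name CallbackTrace and its trace method are hypothetical, and it parses a short in-memory string instead of a file so that it can run on its own. It overrides only four of the DefaultHandler callbacks and records each call as it happens.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; not one of the chapter's listings.
public class CallbackTrace extends DefaultHandler {
    // Records each callback in the order the parser makes it.
    private final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName,
                             String qualifiedName, Attributes attributes) {
        events.add("startElement " + qualifiedName);
    }

    @Override
    public void endElement(String uri, String localName, String qualifiedName) {
        events.add("endElement " + qualifiedName);
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses an in-memory XML string and returns the recorded callbacks.
    public static List<String> trace(String xml) throws Exception {
        CallbackTrace handler = new CallbackTrace();
        SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
        saxParser.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<DOCUMENT><CUSTOMER/></DOCUMENT>")) {
            System.out.println(event);
        }
    }
}
```

Run on that two-element string, the sketch reports startDocument first, then a startElement and endElement pair for each element (nested in document order), and endDocument last. Callbacks it does not override, such as characters, fall through to the do-nothing versions in DefaultHandler.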
Now the SAX parser calls the various methods in the ch12_02 class when it encounters elements in the document we're parsing. In this case, the goal is to determine how many <CUSTOMER> elements the document has, so I implement the startElement method like this:

    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;
    import javax.xml.parsers.*;
    import java.io.*;

    public class ch12_02 extends DefaultHandler {
        public void startElement(String uri, String localName,
                String qualifiedName, Attributes attributes) {
            .
            .
            .
        }
    }

The startElement method is called each time the SAX parser sees the start of an element, and the endElement method is called when the SAX parser sees the end of an element. Note that two element names are passed to the startElement method: localName and qualifiedName. You use the localName argument with namespace processing; this argument holds the name of the element without any namespace prefix. The qualifiedName argument holds the full, qualified name of the element, including any namespace prefix. We're just going to count the number of <CUSTOMER> elements, so I'll take a look at the element's qualifiedName argument. If that argument equals "CUSTOMER", I'll increment a variable named customerCount:

    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;
    import javax.xml.parsers.*;
    import java.io.*;

    public class ch12_02 extends DefaultHandler {
        // Running total of <CUSTOMER> start tags seen so far.
        int customerCount = 0;

        public void startElement(String uri, String localName,
                String qualifiedName, Attributes attributes) {
            if (qualifiedName.equals("CUSTOMER")) {
                customerCount++;
            }
        }
    }

How do you know when you've reached the end of the document and there are no more <CUSTOMER> elements to count? You use the endDocument method, which is called when the end of the document is reached.
I'll display the number of tallied <CUSTOMER> elements in that method:

Listing ch12_02.java

    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;
    import javax.xml.parsers.*;
    import java.io.*;

    public class ch12_02 extends DefaultHandler {
        // Running total of <CUSTOMER> start tags seen so far.
        int customerCount = 0;

        // Called by the parser at the start of every element.
        public void startElement(String uri, String localName,
                String qualifiedName, Attributes attributes) {
            if (qualifiedName.equals("CUSTOMER")) {
                customerCount++;
            }
        }

        // Called once, after the whole document has been read.
        public void endDocument() {
            System.out.println("The document has "
                + customerCount + " <CUSTOMER> elements.");
        }

        public static void main(String args[]) {
            ch12_02 obj = new ch12_02();
            obj.displayDocument(args[0]);
        }

        public void displayDocument(String uri) {
            // The current object supplies the callback methods.
            DefaultHandler handler = this;
            SAXParserFactory factory = SAXParserFactory.newInstance();
            try {
                SAXParser saxParser = factory.newSAXParser();
                saxParser.parse(new File(uri), handler);
            } catch (Throwable t) {
                // Report parsing problems instead of silently ignoring them.
                t.printStackTrace();
            }
        }
    }

You can compile and then run this program like this:

    %javac ch12_02.java
    %java ch12_02 ch12_01.xml
    The document has 3 <CUSTOMER> elements.

And that's all it takes to get started with SAX.
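The listing above matches on qualifiedName, which works because ch12_01.xml uses no namespaces. For a document that does use them, the localName and namespace URI arguments described earlier become the more reliable things to test. The following is a minimal sketch under stated assumptions: the class name NamespaceCount, its count method, and the namespace URI urn:example:customers are all hypothetical and are not part of this chapter's example. Note that a parser created by SAXParserFactory is not namespace-aware unless you call setNamespaceAware(true) on the factory first; without that call, localName can come back empty.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class; a namespace-aware variation on the counting idea.
public class NamespaceCount extends DefaultHandler {
    // Illustrative namespace URI, not taken from the chapter's document.
    static final String CUSTOMER_NS = "urn:example:customers";

    int customerCount = 0;

    @Override
    public void startElement(String uri, String localName,
                             String qualifiedName, Attributes attributes) {
        // Match on namespace URI plus local name, so the prefix chosen
        // in the document does not matter.
        if (CUSTOMER_NS.equals(uri) && "CUSTOMER".equals(localName)) {
            customerCount++;
        }
    }

    // Parses an in-memory XML string and returns the number of matches.
    public static int count(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // namespace processing is off by default
        SAXParser saxParser = factory.newSAXParser();
        NamespaceCount handler = new NamespaceCount();
        saxParser.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            handler);
        return handler.customerCount;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<c:DOCUMENT xmlns:c=\"urn:example:customers\">"
                   + "<c:CUSTOMER/><c:CUSTOMER/></c:DOCUMENT>";
        System.out.println(count(xml));
    }
}
```

Because the test uses the namespace URI rather than the prefix, the same handler counts <c:CUSTOMER> elements and unprefixed <CUSTOMER> elements that sit in the same default namespace, while ignoring <CUSTOMER> elements that belong to no namespace at all.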