Handling Lexical Events


Java APIs for XML Kick Start
By Aoyon Chowdhury, Parag Choudhary

Chapter 4.  Advanced Use of SAX


Lexical information in an XML document is information about the text of the XML itself rather than its logical content. In an XML document, CDATA sections, comments, and references to parsed entities constitute the lexical information.

Normally, a CDATA section is used in an XML document when you want to store text that should not be parsed as markup. This happens when an element contains large sections of text that might include special characters, and it is inconvenient to replace each occurrence with an appropriate entity reference. The CDATA section is analogous to the <pre></pre> tags of HTML.
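A small sketch may make the equivalence concrete: it parses the same text once with hand-written entity references and once inside a CDATA section, and checks that the application receives identical character data. The class name CdataVersusEscaping, the helper textOf, and the element name note are illustrative assumptions, not part of the book's example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CdataVersusEscaping {

    /** Parses the given XML text and returns all character data it contains. */
    static String textOf(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler collector = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver text in several chunks, so append each one.
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), collector);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // The same text, escaped by hand and wrapped in a CDATA section.
        String escaped = "<note>&lt;&lt;&gt;&gt; SAMS Publishing is the &amp;best&amp;</note>";
        String cdata = "<note><![CDATA[<<>> SAMS Publishing is the &best&]]></note>";

        System.out.println(textOf(cdata));
        System.out.println("Same text either way: " + textOf(cdata).equals(textOf(escaped)));
    }
}
```

Both documents yield the same characters; the CDATA form simply spares the author the escaping.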

For example, to put the following text in an XML file as ordinary character data, you would need to escape every occurrence of < and & (and, by convention, >); otherwise, the parser will generate errors:

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>SAMS Publishing is the &best& <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>> 

Escaping each character by hand is tedious and error-prone. Wrapping the text in a CDATA section avoids the problem.

Similarly, a parser normally ignores comments in an XML file. However, when creating applications that read, filter, and output XML files, you might need to read and write out the comments in the XML source file.

Additionally, the XML source file might contain entity references such as &CompanyName;. In normal parsing, &CompanyName; would be replaced by the text it represents. However, a filtering application will need to transfer the entity reference as is without de-referencing it.

To handle these three cases, the SAX package provides the LexicalHandler interface. It contains the methods that provide the mechanisms to handle CDATA sections, comments, and parsed entities in an XML file.

To handle the lexical events in your application, you need to do the following:

  • Import the LexicalHandler interface

  • Implement the LexicalHandler interface

  • Configure the XMLReader to send the lexical events to the LexicalHandler

  • Implement the methods defined in the LexicalHandler interface
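The four steps above can be sketched in one small, self-contained program before they are applied to MyXMLHandler. This is only a minimal illustration: the class name LexicalEventLogger, the helper lexicalEventsIn, and the in-memory sample document are assumptions made for the sketch and do not appear in the book's example code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;        // Step 1: import the interface
import org.xml.sax.helpers.DefaultHandler;

// Step 2: declare that the handler implements LexicalHandler.
public class LexicalEventLogger extends DefaultHandler implements LexicalHandler {

    private final List<String> events = new ArrayList<>();

    // Step 4: implement the seven methods of the interface.
    @Override public void comment(char[] ch, int start, int length) {
        events.add("comment:" + new String(ch, start, length));
    }
    @Override public void startCDATA() { events.add("startCDATA"); }
    @Override public void endCDATA() { events.add("endCDATA"); }
    @Override public void startDTD(String name, String publicId, String systemId) {
        events.add("startDTD:" + name);
    }
    @Override public void endDTD() { events.add("endDTD"); }
    @Override public void startEntity(String name) { events.add("startEntity:" + name); }
    @Override public void endEntity(String name) { events.add("endEntity:" + name); }

    /** Parses the XML text and returns the lexical events in the order reported. */
    static List<String> lexicalEventsIn(String xml) {
        LexicalEventLogger handler = new LexicalEventLogger();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Step 3: route lexical events to the handler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        String xml = "<!-- parts list --><carparts><![CDATA[1 < 2]]></carparts>";
        lexicalEventsIn(xml).forEach(System.out::println);
    }
}
```

Note that the same handler object is registered both as the content handler and as the lexical handler, so all events arrive at one instance.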

To understand how the lexical handler works, you need to make some changes in the CarParts.xml file and the MyXMLHandler application.

Updating the CarParts.xml File

In the CarParts.xml file, make the following changes:

  1. Separate the DTD from the CarParts.xml file and name it CarParts.dtd.

  2. In the CarParts.xml, refer to the CarParts.dtd.

  3. Add a new element called forCDATA in the DTD and the CarParts.xml file.

  4. Change the entry in the supplier element from Engine 1 to the entity reference &companyname;.

To make these changes, CarParts.dtd should be as shown in Listing 4.3. The lines in bold are the entries that have been newly added.

Listing 4.3 Using LexicalHandler
<?xml version='1.0' encoding='us-ascii'?>
<!--  DTD for the XML file that describes car parts -->
<!ELEMENT carparts (supplier,engines,carbodies,wheels,carstereos,forCDATA)>
<!ELEMENT engines (engine+)>
<!ELEMENT carbodies (carbody+)>
<!ELEMENT wheels (wheel+)>
<!ELEMENT carstereos (carstereo+)>
<!ELEMENT forCDATA (#PCDATA)>
<!ELEMENT supplier (#PCDATA)>
<!ATTLIST supplier
            name CDATA #REQUIRED
            URL CDATA #REQUIRED
>
<!ELEMENT engine (#PCDATA)*>
<!ATTLIST engine
            id CDATA #REQUIRED
            type CDATA #REQUIRED
            capacity (1000 | 2000 | 2500 ) #REQUIRED
            price CDATA #IMPLIED
            text CDATA #IMPLIED
>
<!ELEMENT carbody (#PCDATA)*>
<!ATTLIST carbody
            id CDATA #REQUIRED
            type CDATA #REQUIRED
            color CDATA #REQUIRED
>
<!ELEMENT wheel (#PCDATA)*>
<!ATTLIST wheel
            id CDATA #REQUIRED
            type CDATA #REQUIRED
            price CDATA #IMPLIED
            size  (X | Y | Z) #IMPLIED
>
<!ELEMENT carstereo (#PCDATA)*>
<!ATTLIST carstereo
            id CDATA #REQUIRED
            manufacturer CDATA #REQUIRED
            model CDATA #REQUIRED
            Price CDATA #REQUIRED
>

Note that the element forCDATA, which will hold the CDATA section, has been added to the DTD. Also, the entity declarations have been retained in the internal subset of the XML file itself.

To update the CarParts.xml file with the changes, add the lines displayed in bold in Listing 4.4.

Listing 4.4 Output of MyXMLHandler with LexicalHandler
<?xml version='1.0' encoding='us-ascii'?>
<!--  XML file that describes car parts -->
<!DOCTYPE carparts SYSTEM "CarParts.dtd" [
<!ENTITY  companyname "Heaven Car Parts (TM)">
<!ENTITY  companyweb "http://carpartsheaven.com">
]>
<carparts>
    <?supplierformat format="X13" version="3.2"?>
    <supplier name="&companyname;" URL="&companyweb;">
    &companyname;
    </supplier>
    <engines>
        <engine  type="Alpha37" capacity="2500" price="3500">
            Engine 1
        </engine>
    </engines>
    <carbodies>
        <carbody  type="Tallboy" color="blue">
            Car Body 1
        </carbody>
    </carbodies>
    <wheels>
        <wheel  type="X3527" price="120">
            Wheel Set 1
        </wheel>
    </wheels>
    <carstereos>
        <carstereo  manufacturer="MagicSound" model="T76w" Price="500">
            Car Stereo 1
        </carstereo>
    </carstereos>
    <forCDATA><![CDATA[Special Text: <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>SAMS Publishing is the &best& <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>..]]>
    </forCDATA>
</carparts>

Next, you need to import the LexicalHandler interface into the MyXMLHandler application.

Importing the LexicalHandler Package

To import the LexicalHandler interface to the MyXMLHandler application, add the following line listed in bold:

import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import org.xml.sax.ext.LexicalHandler;

public class MyXMLHandler extends DefaultHandler {
.....
}

Next, implement the LexicalHandler interface in the MyXMLHandler application.

Implementing the LexicalHandler Interface

To implement the LexicalHandler interface within the MyXMLHandler, add the following line displayed in bold:

import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import org.xml.sax.ext.LexicalHandler;

public class MyXMLHandler extends DefaultHandler implements LexicalHandler {

Next, XMLReader needs to be configured to send the lexical events to the LexicalHandler.

Configuring XMLReader for Lexical Handling

Configuring the XMLReader for lexical handling involves setting the property http://xml.org/sax/properties/lexical-handler. This property is defined in the SAX standard, and can be set by using the setProperty() method of XMLReader.

To set the property, add the line listed in bold:

public static void main(String[] args) throws Exception {
    ......................
        /* XMLReader is the interface for reading an XML document using callbacks */
        XMLReader xmlReader = saxParser.getXMLReader();
        xmlReader.setProperty("http://xml.org/sax/properties/lexical-handler", new MyXMLHandler());
        /* set the error handler */
        xmlReader.setErrorHandler(new MyErrorHandler());
    .................................
}

You've now set the required property for lexical handling and configured XMLReader to pass the lexical events to your lexical handler, which in this case is an instance of the application class, MyXMLHandler.
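Because setProperty() is declared to throw SAXNotRecognizedException and SAXNotSupportedException, an application that must run on different parsers may prefer to check whether lexical events are available rather than let the exception escape from main(). The following is a sketch under that assumption; the class LexicalHandlerSetup and the helper attach are hypothetical names, and DefaultHandler2 (a SAX2 extension class with empty method bodies) is used only as a stand-in for MyXMLHandler.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;
import org.xml.sax.ext.LexicalHandler;

public class LexicalHandlerSetup {

    static final String LEXICAL_HANDLER_PROPERTY =
            "http://xml.org/sax/properties/lexical-handler";

    /**
     * Tries to register the handler for lexical events.
     * Returns false if the reader does not recognize or support the property.
     */
    static boolean attach(XMLReader reader, LexicalHandler handler) {
        try {
            reader.setProperty(LEXICAL_HANDLER_PROPERTY, handler);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Stand-in for MyXMLHandler. One instance is shared, so content events
        // and lexical events reach the same object.
        DefaultHandler2 handler = new DefaultHandler2();
        reader.setContentHandler(handler);

        System.out.println("Lexical events available: " + attach(reader, handler));
    }
}
```

Passing one shared instance, instead of a second new MyXMLHandler(), matters only if the handler keeps state that both kinds of callbacks need to see.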

Next, you need to implement the methods of the LexicalHandler interface.

Implementing Methods of the LexicalHandler Interface

The LexicalHandler interface declares seven methods that must be defined in the application implementing the interface.

These methods pertain to processing the comments, CDATA sections, and parsed entities. You'll need to implement the methods so that they display the lexical event name when it occurs. To do so, add the lines of code listed in bold in Listing 4.5.

Listing 4.5 Implementing Methods of LexicalHandler Interface
    public void printAllAttributes(Attributes elementAttributes)
    {
        System.out.println("\tTotal Number of Attributes: " + elementAttributes.getLength());
        for (int i = 0; i < elementAttributes.getLength(); i++)
        {
            System.out.println("\t\tAttribute: " + elementAttributes.getQName(i) + " = " + elementAttributes.getValue(i));
        }
    }

    //Lexical Handler Methods
    public void startCDATA() throws SAXException
    {
        System.out.println("\nStarting CDATA Section\n");
    }

    public void endCDATA() throws SAXException
    {
        System.out.println("\nEnding CDATA Section\n");
    }

    public void comment(char[] ch, int start, int length) throws SAXException
    {
        String commentText = new String(ch, start, length);
        System.out.println("Comment : " + commentText + "\n");
    }

    public void startDTD(java.lang.String name,
                         java.lang.String publicId,
                         java.lang.String systemId)
                  throws SAXException
    {
        System.out.println("Starting DTD :" + systemId);
    }

    public void endDTD() throws SAXException
    {
        System.out.println("Ending DTD");
    }

    public void startEntity(java.lang.String name) throws SAXException
    {
        System.out.println("Starting entity :" + name);
    }

    public void endEntity(java.lang.String name) throws SAXException
    {
        System.out.println("Ending entity :" + name);
    }
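When an application cares about only one or two lexical events, writing all seven methods by hand can be avoided. One option, not used in the book's example, is to extend org.xml.sax.ext.DefaultHandler2, which supplies empty implementations of the LexicalHandler methods, and override only what is needed. The sketch below collects comments; the class name CommentCollector and the helper commentsIn are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class CommentCollector extends DefaultHandler2 {

    private final List<String> comments = new ArrayList<>();

    // Only comment() is overridden; the other six LexicalHandler methods
    // keep the do-nothing bodies inherited from DefaultHandler2.
    @Override
    public void comment(char[] ch, int start, int length) {
        comments.add(new String(ch, start, length).trim());
    }

    /** Parses the XML text and returns its comments in document order. */
    static List<String> commentsIn(String xml) {
        CommentCollector collector = new CommentCollector();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(collector);
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", collector);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return collector.comments;
    }

    public static void main(String[] args) {
        String xml = "<!-- XML file that describes car parts -->"
                   + "<carparts><!-- engines go here --></carparts>";
        commentsIn(xml).forEach(System.out::println);
    }
}
```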

NOTE

The code discussed here is available in the example0402 folder. This folder also contains the sample CarParts.xml file.


You can now compile and run the program. The output should be similar to Listing 4.6. The output from the lexical handling methods is displayed in bold.

Listing 4.6 Output of MyXMLHandler with LexicalHandler Methods
Version 0402.0 of MyXMLHandler in example0402
Locator :file:///D:/sams_work/Java Api/Chapter 4- SAX APIs - Advanced Use/Example0402/CarParts.xml
Start Document: -----Reading the document CarParts.xml with MyXMLHandler------
Comment :   XML file that describes car parts
Starting DTD :CarParts.dtd
Starting entity :[dtd]
Comment :   DTD for the XML file that describes car parts
Ending entity :[dtd]
Ending DTD
Location of event at line number :8
Start Element-> carparts
    Total Number of Attributes: 0
Location of event at line number :10
Start Element-> supplier
    Total Number of Attributes: 2
        Attribute: name = Heaven Car Parts (TM)
        Attribute: URL = http://carpartsheaven.com
Characters:
Starting entity :companyname
Characters: Heaven Car Parts (TM)
Ending entity :companyname
Characters:
End Element-> supplier
Location of event at line number :13
Start Element-> engines
    Total Number of Attributes: 0
Location of event at line number :14
Start Element-> engine
    Price= 3500
Characters:             Engine 1
Characters:
End Element-> engine
End Element-> engines
Location of event at line number :18
Start Element-> carbodies
    Total Number of Attributes: 0
Location of event at line number :19
Start Element-> carbody
    Total Number of Attributes: 3
        Attribute: id = C32
        Attribute: type = Tallboy
        Attribute: color = blue
Characters:             Car Body 1
Characters:
End Element-> carbody
End Element-> carbodies
Location of event at line number :23
Start Element-> wheels
    Total Number of Attributes: 0
Location of event at line number :24
Start Element-> wheel
    Price= 120
Characters:             Wheel Set 1
Characters:
End Element-> wheel
End Element-> wheels
Location of event at line number :28
Start Element-> carstereos
    Total Number of Attributes: 0
Location of event at line number :29
Start Element-> carstereo
    Total Number of Attributes: 4
        Attribute: id = C2
        Attribute: manufacturer = MagicSound
        Attribute: model = T76w
        Attribute: Price = 500
Characters:             Car Stereo 1
Characters:
End Element-> carstereo
End Element-> carstereos
Location of event at line number :33
Start Element-> forCDATA
    Total Number of Attributes: 0

Starting CDATA Section

Characters: Special Text: <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SAMS Publishing is the &best& <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>..

Ending CDATA Section

End Element-> forCDATA
End Element-> carparts
End Document: ----------------Finished Reading the document---------------

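Note that the LexicalHandler treats the external DTD as an entity: its boundaries are reported through startEntity() and endEntity() under the name [dtd], between the startDTD() and endDTD() events.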
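These entity events are what a filtering application would use to pass an entity reference through without expanding it. The sketch below is one possible approach, not the book's code: it writes &name; when a general entity starts, suppresses the replacement text until the entity ends, and skips the [dtd] pseudo-entity and parameter entities (names beginning with %). The class name EntityPreservingText and the helper textWithReferences are hypothetical. Whether entity boundaries are reported at all depends on the parser, and references inside attribute values are not reported this way.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class EntityPreservingText extends DefaultHandler2 {

    private final StringBuilder out = new StringBuilder();
    private int entityDepth = 0;   // greater than 0 while inside a general entity

    /** True for ordinary general entities; false for "[dtd]" and "%param" names. */
    private static boolean isGeneralEntity(String name) {
        return !name.startsWith("[") && !name.startsWith("%");
    }

    @Override
    public void startEntity(String name) {
        if (!isGeneralEntity(name)) {
            return;
        }
        if (entityDepth == 0) {
            out.append('&').append(name).append(';');   // emit the reference itself
        }
        entityDepth++;
    }

    @Override
    public void endEntity(String name) {
        if (isGeneralEntity(name)) {
            entityDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (entityDepth == 0) {
            out.append(ch, start, length);   // text outside any entity is copied
        }
    }

    /** Returns the document's text with general entity references left unexpanded. */
    static String textWithReferences(String xml) {
        EntityPreservingText handler = new EntityPreservingText();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.out.toString();
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE supplier [<!ENTITY companyname 'Heaven Car Parts (TM)'>]>"
                   + "<supplier>Sold by &companyname;.</supplier>";
        System.out.println(textWithReferences(xml));
    }
}
```

The depth counter handles the case where one entity's replacement text refers to another entity: only the outermost reference is written.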


JAX: Java APIs for XML Kick Start
ISBN: 0672324342
Year: 2002
Pages: 133
