XML Parser APIs

WebLogic Server's support for XML consists of the following:

  • SAX and DOM API parsers

  • WebLogic XML Streaming API

  • JAXP

You will now take a quick look at each of these features in the following sections. To begin with, you'll look at the first way to parse data: using the SAX parser.

SAX API

SAX is short for the Simple API for XML. It was the first widely adopted API for XML in Java. A SAX parser provides an event-based API for applications to use for processing XML documents: as the parser reads the document, it reports events such as the start of an element, character data, and the end of an element to callback methods supplied by the application. The major advantage of this approach is speed, because the parser performs minimal processing and does not build an in-memory representation of the document. The major responsibility for processing the contents of the XML document lies with the application class that uses the SAX API, which must keep track of any state it needs between events. Hence, an application that uses the SAX API for processing an XML document can become quite complex.

Using a SAX parser, you can traverse an XML document only in the forward direction. Think of the SAX API as an interpreter that processes code step by step, in the forward direction, in a single pass. The SAX API does not provide a way to revisit a part of the XML document that has already been traversed and processed.

The event-based mechanism of the SAX API does have its advantages, the significant one being performance. The SAX API is more appropriate for use when processing large XML documents at the server side. Also, the SAX parser uses fewer resources because it is just a parser. The major part of the processing is handled by the application that uses the SAX parser.
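
To make the event-based model concrete, here is a minimal sketch of a SAX application. It uses the standard JAXP classes that ship with a current JDK rather than the WebLogic-specific parser classes, and the names SaxTitleReader and TitleHandler, as well as the sample catalog, are assumptions made purely for illustration. Notice that the handler has to remember whether it is currently inside a title element; this bookkeeping is the kind of complexity that falls on the application.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of WebLogic Server.
class SaxTitleReader {

    // Collects the text of every <title> element as the parser reports events.
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder text; // non-null only while inside a <title> element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("title".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one run of text, so append.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName)) {
                titles.add(text.toString().trim());
                text = null;
            }
        }
    }

    // Parses the XML in a single forward pass and returns the titles found.
    static List<String> readTitles(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            TitleHandler handler = new TitleHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML document", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder data, assumed for this sketch.
        String xml = "<catalog>"
                + "<book><title>First Book</title></book>"
                + "<book><title>Second Book</title></book>"
                + "</catalog>";
        System.out.println(readTitles(xml));
    }
}
```

The parser never hands the application a complete document. It only calls back as it moves forward, which is why the handler cannot go back to an earlier element once its events have passed.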

Here are some of the important APIs that are part of the SAX API:

  • SAXParserFactory The SAXParserFactory class provides an API to obtain an instance of the SAXParser class. An application that needs to instantiate a SAX parser will use the methods in the SAXParserFactory class.

  • SAXParser The SAXParser class encapsulates an implementation of the XMLReader interface. The SAXParser class contains the parse() methods that start the actual parsing of an XML document.

  • XMLReader The XMLReader interface contains the methods to perform the actual processing of an XML document. The vendor who provides the SAX parser must implement the XMLReader interface. Apart from this, the event handlers for XML document processing, such as ContentHandler, DTDHandler, EntityResolver, and ErrorHandler, can be registered using the methods in the XMLReader interface.

  • ContentHandler The ContentHandler interface contains the actual event-processing methods that must be implemented by your XML-document-processing application. The custom content-handler class that processes the events generated as a result of processing the XML document must be registered using the setContentHandler() method of the XMLReader object.

  • DTDHandler The methods in the DTDHandler should be implemented by your XML-document-processing event handler if you need to intercept the notation declaration and unparsed entity declaration events. Normally, you would not need to implement the DTDHandler interface.

  • EntityResolver The EntityResolver interface contains methods for events that are propagated when the SAX parser encounters embedded external references in the XML document, such as DTDs. If you need to trap these references before the SAX parser can include them for processing, you should provide implementations for the methods of the EntityResolver interface.

  • ErrorHandler The methods that an XML-processing Java application uses to perform error handling are defined in the ErrorHandler interface. The custom error-handling class that implements the ErrorHandler interface must register itself using the setErrorHandler() method of the XMLReader object.

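  • DefaultHandler The DefaultHandler class provides a default, do-nothing implementation of the callback methods of the ContentHandler, DTDHandler, EntityResolver, and ErrorHandler interfaces. Your handler class can extend DefaultHandler and override only the methods for the events that it needs.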
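
The way these pieces fit together can be sketched as follows. This is only an illustration that assumes the standard JAXP classes in a current JDK; SaxReaderSetup, ElementCounter, and StrictErrorHandler are hypothetical names. The sketch obtains an XMLReader from a SAXParser, registers a content handler and an error handler on it, and then starts the parse.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of WebLogic Server.
class SaxReaderSetup {

    // Content handler: counts start-element events.
    static class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            count++;
        }
    }

    // Error handler: ignores warnings and stops the parse on any error.
    static class StrictErrorHandler implements ErrorHandler {
        @Override
        public void warning(SAXParseException e) {
            // Warnings are ignored in this sketch.
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    // Returns the number of elements, or -1 if the parser reported an error.
    static int countElements(String xml) {
        ElementCounter counter = new ElementCounter();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(counter);
            reader.setErrorHandler(new StrictErrorHandler());
            reader.parse(new InputSource(new StringReader(xml)));
            return counter.count;
        } catch (SAXException e) {
            return -1;
        } catch (Exception e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><c/></a>"));
        System.out.println(countElements("<a><b></a>")); // not well formed
    }
}
```

Keeping the content handler and the error handler in separate classes is a design choice made for this sketch. A single class that extends DefaultHandler could be registered for both roles.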

WebLogic Server supports the SAX2 API using custom classes that wrap the Xerces parser from Apache. These custom classes of the SAX API can be found in the weblogic.apache.xerces.parsers package. The class that should be used is the weblogic.apache.xerces.parsers.SAXParser class, which implements the SAX1 and SAX2 APIs.

DOM API

The Document Object Model (DOM) is the other technique for processing XML documents. The DOM API uses the inherent tree structure of an XML document to parse and process it. DOM is defined by the W3C to represent the structure of the elements in a document, whether HTML or XML. DOM is essentially a tree structure beginning with the logical root of the tree. The elements within the XML document form the nodes of the document tree, and the tree branches hierarchically because elements can be nested within other elements. An XML parser that supports the DOM API parses an XML document and generates a tree of objects that represents the data within the elements of the document. This DOM tree can then be used for further processing by any application. The DOM parser handles the major part of parsing an XML document and provides an easy-to-use API for application developers to process the data in the document. You can see this in Figure 20.6.

Figure 20.6. A DOM parser processing an XML file.

From Figure 20.6, you can see the DOM tree generated after parsing the XML file containing the Sams book catalogue information. The contents of the XML file have been parsed and arranged in a hierarchical tree structure. Notice that the title, price, author, and publisher elements are located under the book element in the tree. The value of each element is placed in the tree beneath the element that it belongs to.
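
A tree like this can also be printed from code. The following sketch assumes the standard JAXP and org.w3c.dom classes in a current JDK; the class name DomOutline and the catalog contents are placeholders invented for illustration and are not the data shown in the figure. It walks the DOM tree recursively and indents each node by its depth.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of WebLogic Server.
class DomOutline {

    // Parses the XML and returns an indented outline of its elements and text.
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            append(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML document", e);
        }
    }

    // Writes one line for an element or non-blank text node, then visits the children.
    private static void append(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            indent(depth, out);
            out.append(node.getNodeName()).append('\n');
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                indent(depth, out);
                out.append('"').append(text).append("\"\n");
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            append(child, depth + 1, out);
        }
    }

    private static void indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    public static void main(String[] args) {
        // Placeholder values, assumed for this sketch.
        String xml = "<catalog><book>"
                + "<title>Example Title</title>"
                + "<price>0.00</price>"
                + "<author>Example Author</author>"
                + "<publisher>Example Publisher</publisher>"
                + "</book></catalog>";
        System.out.print(outline(xml));
    }
}
```

In the output, each value appears one level below its element because the DOM stores element text as a child text node.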

The DOM API does have its drawbacks. Because the entire XML document is loaded as a document tree in memory, an application that uses the DOM API to process large XML documents can run into resource problems. Moreover, if the application needs to process only part of the data in the document and leaves the rest to other applications, it will need to pass a potentially huge and resource-intensive tree object between applications. If these applications are distributed, it means serializing and deserializing the DOM tree object. A major resource overhead!

But, unlike a SAX parser, which performs step-by-step parsing and hands each event to the application as it goes, the DOM parser processes the entire XML document up front and keeps the result in memory. This is analogous to how a compiler processes source code files. Because the entire XML document has already been turned into a tree, any element or node in the DOM tree can be accessed at any time, and in any order, by an application that uses the DOM API.

Take a look at the important classes and interfaces of the DOM API, as follows:

  • DocumentBuilderFactory The DocumentBuilderFactory class provides the necessary methods to obtain a new instance of the DOM parser. The DOM parser is the DocumentBuilder class.

  • DocumentBuilder The DocumentBuilder class is the actual implementation of the DOM API parser. The methods required to perform the parsing of the XML document are defined in the DocumentBuilder class.

  • Document The Document interface contains methods to manipulate the contents of the DOM tree generated after parsing and processing an XML document. The Document object represents a processed XML document. The Document object is the logical root of the document tree generated after processing the XML document.
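
These three types are typically used in sequence, which can be sketched as shown here. The sketch assumes the standard JAXP classes in a current JDK; the class name DomRandomAccess, the method lastThenFirstTitle(), and the sample document are hypothetical. It deliberately reads the last title before the first one to show that nodes in the tree can be visited in any order.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of WebLogic Server.
class DomRandomAccess {

    // Returns the text of the last <title> element followed by the first one.
    // Assumes the document has at least one <title> with plain text content.
    static String[] lastThenFirstTitle(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            NodeList titles = doc.getElementsByTagName("title");
            // The text of an element is held in its child text node.
            String last = titles.item(titles.getLength() - 1).getFirstChild().getNodeValue();
            String first = titles.item(0).getFirstChild().getNodeValue();
            return new String[] { last, first };
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML document", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder data, assumed for this sketch.
        String xml = "<catalog>"
                + "<book><title>First Book</title></book>"
                + "<book><title>Second Book</title></book>"
                + "</catalog>";
        String[] result = lastThenFirstTitle(xml);
        System.out.println(result[0] + ", then " + result[1]);
    }
}
```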

WebLogic Server 7.0 supports the DOM Level 2 API using custom classes that wrap the Xerces parser from Apache. These custom classes of the DOM API can be found in the weblogic.apache.xerces.parsers package. The class that should be used is the weblogic.apache.xerces.parsers.DOMParser class, which implements the DOM Level 2 specification.

XML Streaming API

WebLogic Server 7.0 provides an improved parser API for parsing and processing XML documents. From your lessons on the SAX and DOM APIs, you know that even though a SAX parser performs processing in less time with minimal resource usage, the application that uses the SAX API becomes quite complex. On the other hand, the DOM API provides a more convenient way to work with XML documents as DOM trees, but building the tree is a resource-intensive operation that can slow down the processing of XML documents in your application.

The WebLogic Server provides the XML streaming API, which aims to establish a middle ground in terms of the advantages and drawbacks of the SAX and DOM APIs.

An application that uses the WebLogic XML streaming API with a SAX parser does not need to implement the event callback methods of the SAX API. Instead, the application uses the WebLogic XML streaming API to request the events that it is interested in and reads them from a stream attached to the XML document. As a result, there is none of the complex event-handling code that you encounter when using just the SAX API.

Similarly, the WebLogic XML streaming API can leverage the DOM API to build a DOM tree of objects from the XML document. The DOM tree (or the events of the SAX API) is converted into an XMLInputStream object. This XMLInputStream can then be used by any application to extract and process the parsed contents of the XML document.

The different classes and interfaces of the WebLogic XML streaming API are organized in the weblogic.xml.stream package.
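
The idea of an application asking for events, rather than being called back with them, can be illustrated with a short sketch. Be aware that this sketch does not use the weblogic.xml.stream classes. As a stand-in, it uses the standard javax.xml.stream (StAX) API that ships with current JDKs, whose class and method names differ from the WebLogic API. The class name PullStyleSketch is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class. Uses standard StAX, NOT weblogic.xml.stream.
class PullStyleSketch {

    // Returns the names of all elements in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // The application drives the loop and pulls the next event itself.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
                // Events the application is not interested in are simply skipped.
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read the XML stream", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<catalog><book><title>Example</title></book></catalog>"));
    }
}
```

Compared with a SAX handler, there are no callback methods to implement here. The processing logic sits in one ordinary loop that the application controls.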

JAXP

The Java API for XML Processing (JAXP) brings together the industry-standard SAX and DOM APIs in a single package that can be easily used by Java applications. JAXP enables Java applications to parse XML documents and convert them into objects that the applications can process. Apart from the SAX and DOM XML parser APIs, JAXP also includes support for Extensible Stylesheet Language Transformations (XSLT).

However, JAXP does not limit you to using the SAX and DOM parsers or the XSLT processor provided with the reference implementation. JAXP provides a pluggability layer whereby you can use any implementation of the SAX and DOM APIs, or any XSLT implementation, that you choose.
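
One way to see this pluggability at work is to ask each JAXP factory which implementation class it resolved to. The following is a minimal sketch that assumes a current JDK; the class name JaxpPluggability is hypothetical. The names it prints depend on the JDK or application server in use, so no particular output should be expected.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical example class; not part of WebLogic Server.
class JaxpPluggability {

    // Returns the implementation class names chosen for SAX, DOM, and XSLT.
    static String[] implementationNames() {
        return new String[] {
            SAXParserFactory.newInstance().getClass().getName(),
            DocumentBuilderFactory.newInstance().getClass().getName(),
            TransformerFactory.newInstance().getClass().getName()
        };
    }

    public static void main(String[] args) {
        // A different implementation can be selected without changing this code,
        // for example by setting the javax.xml.parsers.SAXParserFactory,
        // javax.xml.parsers.DocumentBuilderFactory, or
        // javax.xml.transform.TransformerFactory system property.
        for (String name : implementationNames()) {
            System.out.println(name);
        }
    }
}
```

Because the application code refers only to the abstract factory classes, swapping in another parser or transformer is a deployment decision rather than a code change.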

JAXP consists of two parts: the XML parsing API and the XML transforming API.

The XML Parsing API

The XML parsing APIs that JAXP supports are the SAX and DOM parser APIs. The packages related to this API are as follows:

  • javax.xml.parsers Provides the APIs to load and use different implementations of the SAX and DOM APIs.

  • org.xml.sax Contains the event-based SAX API used for processing XML documents.

  • org.w3c.dom Encapsulates the DOM API. Developers can use the DOM API to represent the contents of an XML document as a tree structure, and then to process and modify this tree of objects.
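
As an illustration of how these packages work together, the following sketch loads a parser through javax.xml.parsers and then modifies the resulting tree through org.w3c.dom. It assumes a current JDK; the class name DomModify, the method addPublisher(), and the element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; not part of WebLogic Server.
class DomModify {

    // Parses the XML and appends a <publisher> child to every <book> element.
    static Document addPublisher(String xml, String publisher) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList books = doc.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                // New nodes are created by the Document that will own them.
                Element pub = doc.createElement("publisher");
                pub.appendChild(doc.createTextNode(publisher));
                book.appendChild(pub);
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = addPublisher("<catalog><book/><book/></catalog>", "Example Publisher");
        System.out.println(doc.getElementsByTagName("publisher").getLength());
    }
}
```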

The XML Transforming API

The javax.xml.transform package contains the transformation APIs. These are used to apply XSLT stylesheets that convert an XML document into a different presentation format.
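
A transformation can be sketched in a few lines. The example below assumes the XSLT processor bundled with a current JDK; the class name XsltSketch, the inline stylesheet, and the catalog element names are all hypothetical. The stylesheet converts a catalog document into plain text, with each book title followed by a semicolon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; not part of WebLogic Server.
class XsltSketch {

    // A small stylesheet that emits each book title followed by a semicolon.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/catalog'>"
          + "<xsl:for-each select='book'>"
          + "<xsl:value-of select='title'/><xsl:text>;</xsl:text>"
          + "</xsl:for-each>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Applies STYLESHEET to the given XML and returns the text output.
    static String titles(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder data, assumed for this sketch.
        String xml = "<catalog>"
                + "<book><title>First Book</title></book>"
                + "<book><title>Second Book</title></book>"
                + "</catalog>";
        System.out.println(titles(xml));
    }
}
```

The same Transformer could write HTML or another XML vocabulary instead of text. Only the stylesheet would need to change.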



Sams Teach Yourself BEA WebLogic Server 7.0 in 21 Days
ISBN: 0672324334
Year: 2002
Pages: 339