XML::SAX

Chapter 7 - Great CPAN Modules
by Sam Tregar
Apress © 2002

The Perl XML[21] community has come a long way in recent years. In the beginning there was XML::Parser, created by Larry Wall and maintained by Clark Cooper. XML::Parser provides a thin wrapper around the Expat XML parsing library written in C.[22] At the same time, the Perl XML mailing list, perl-xml@listserv.ActiveState.com, was started as a place to discuss using XML with Perl.

XML::Parser did (and still does) an excellent job of parsing XML. But it suffers from a quirky interface that is difficult to learn to use effectively. As a result, many wrapper modules grew up around XML::Parser: XML::Twig, XML::Simple, and XML::TokeParser, to name a few. These modules have helped the situation a great deal and are basically the "state of the art" in Perl XML usage.

XML::SAX[23] may not be the most popular module at present, but it represents an evolutionary step forward in the development of Perl's XML capabilities. It provides an interface for XML parser usage in the same way that DBI provides an interface for SQL database usage. Individual XML parsers can be plugged into the back-end API provided by XML::SAX. Much like using MySQL through DBI means loading the DBD::mysql module, using Expat through XML::SAX loads XML::SAX::Expat. On the front end, modules can make use of a consistent and standardized[24] interface to whichever parser is in use.
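As a rough illustration of the plug-in arrangement, the sketch below asks XML::SAX::ParserFactory for a parser and hands it a handler object. The handler class name TextCollector and the sample XML are made up for this example. Setting $XML::SAX::ParserPackage is optional; it is used here only to pin the back end to XML::SAX::PurePerl, which ships with XML::SAX, so the sketch does not depend on which other parsers happen to be installed.

```perl
use strict;
use warnings;

# Hypothetical handler for this sketch: gathers all character data it receives.
package TextCollector;
use base qw(XML::SAX::Base);

sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
}

package main;
use XML::SAX;
use XML::SAX::ParserFactory;

# Optional: name a specific back end, much as a DBI DSN names a driver.
# Without this line the factory picks one of the installed parsers.
local $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';

my $handler = TextCollector->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

# The front-end code below is the same whichever parser was selected.
$parser->parse_string('<greeting>Hello, <b>SAX</b>!</greeting>');

print "Parser class: ", ref($parser), "\n";
print "Text seen:    ", $handler->{text}, "\n";
```

Changing the assignment to another installed parser package, such as XML::SAX::Expat, should leave the rest of the script untouched, which is the point of the comparison with DBI.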

XML::SAX is an object-oriented module, and it provides much of its functionality through inheritance. To start building a SAX parser or filter, the user creates a module that inherits from XML::SAX::Base and overrides methods as required to implement the desired functionality. It is interesting to note that although XML::SAX is tackling a similar problem to DBI, the choice of object-oriented methods differs: DBI chooses a complex mix of composition and inheritance, whereas XML::SAX chooses a pure inheritance model.
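The inheritance approach can be sketched with a small handler that counts elements. The class name ElementCounter and the sample document are assumptions made for illustration. Only the two events of interest are overridden; every other SAX event falls through to the default behavior inherited from XML::SAX::Base.

```perl
use strict;
use warnings;

# Hypothetical handler for this sketch: counts how often each element occurs.
package ElementCounter;
use base qw(XML::SAX::Base);

sub start_document {
    my ($self, $doc) = @_;
    $self->{counts} = {};                     # reset for each new document
    $self->SUPER::start_document($doc);       # keep the default behavior
}

sub start_element {
    my ($self, $element) = @_;
    $self->{counts}{ $element->{Name} }++;
    $self->SUPER::start_element($element);    # forward the event downstream
}

package main;
use XML::SAX::ParserFactory;

my $counter = ElementCounter->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $counter);
$parser->parse_string('<list><item/><item/><note/></list>');

for my $name (sort keys %{ $counter->{counts} }) {
    print "$name: $counter->{counts}{$name}\n";
}
```

Calling the SUPER:: method in each override is what lets the same class act as a filter later: if another handler is attached downstream, the events continue on to it.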

One of the most interesting front-end modules to make use of XML::SAX thus far is XML::SAX::Machines by Barrie Slaymaker. XML::SAX::Machines provides a layer over XML::SAX-based parsers and filters that allows the end user to easily construct XML processing pipelines. Using XML::SAX::Machines, XML::SAX components can be assembled into systems of almost limitless capability. Typically, input XML enters one end of the pipeline through a SAX parser, is transformed and processed by the configured SAX filters, and is output by a SAX writer at the end. Given the proliferation of SAX filters, I expect this to become a popular system for XML processing in Perl.
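To make the pipeline shape concrete, here is a minimal sketch that wires a parser, one filter, and a writer together by hand using only XML::SAX::Base. It deliberately does not use XML::SAX::Machines; it shows the kind of wiring that module is described as taking over. The class names UpcaseText and TinyWriter are hypothetical, and the writer is a toy that ignores attributes and does no escaping.

```perl
use strict;
use warnings;

# Hypothetical filter: upper-cases character data, passes everything else on.
package UpcaseText;
use base qw(XML::SAX::Base);

sub characters {
    my ($self, $chars) = @_;
    $self->SUPER::characters({ %$chars, Data => uc $chars->{Data} });
}

# Hypothetical toy writer: serializes elements and text into a string.
package TinyWriter;
use base qw(XML::SAX::Base);

sub start_document { my ($self) = @_;         $self->{out}  = '' }
sub start_element  { my ($self, $el) = @_;    $self->{out} .= "<$el->{Name}>" }
sub end_element    { my ($self, $el) = @_;    $self->{out} .= "</$el->{Name}>" }
sub characters     { my ($self, $chars) = @_; $self->{out} .= $chars->{Data} }

package main;
use XML::SAX::ParserFactory;

# Build the pipeline back to front: each stage is given the next as Handler.
my $writer = TinyWriter->new;
my $filter = UpcaseText->new(Handler => $writer);
my $parser = XML::SAX::ParserFactory->parser(Handler => $filter);

$parser->parse_string('<doc><p>hello</p><p>world</p></doc>');
print $writer->{out}, "\n";
```

Adding another processing step means inserting one more XML::SAX::Base subclass between the parser and the writer, which is why independently written filters compose so readily.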

The Perl XML community is interesting aside from any particular module. Using the Perl XML mailing list as a hub, the community is unusual for its cohesion. Module ideas are vigorously discussed and the community even shares a SourceForge project and Web space. It may be that this cohesion has come as a response to the largely uphill battle facing the Perl XML developers. As Matt Sergeant put it, "Right now our biggest battle seems to be for acceptance within the Perl community as a whole. People completely accept things like the DBI, Tk/Gtk, and LWP as a vital part of Perl, but are quick to dismiss XML as something they just don't need and really don't like. We're kind of the bastard child of the Perl community. But we're winning small battles quite often now."

[21]XML stands for the eXtensible Markup Language. See http://www.w3c.org/XML for details.

[22]Written by James Clark. See http://www.expat.sourceforge.net for details.

[23]Written by Matt Sergeant. See http://www.sergeant.org.

[24]XML::SAX implements the SAX standard, versions 1 and 2. See http://www.saxproject.org for details.



Writing Perl Modules for CPAN
ISBN: 159059018X
Year: 2002
Pages: 110
Authors: Sam Tregar
