Developing a Custom Event Handler


There may be situations when the standard event handler modules won't support what you need to do. For example, let's say that you want to combine standard event handling with regular expression matching, substitutions, or state management. These are tasks that aren't supported by the standard event handlers.

Depending on the requirements, the standard event handler may seem somewhat limited. To get around those limitations, we can create our own SAX handler. This handler will use the SAX interface to manipulate and generate XML data. It's very simple as long as the methods use the standard names for the different events (start_element(), characters(), and so on), because those are the names the driver calls. Each method can be customized to fit our needs on a specific project. Let's take a look at an example.

Custom Event Handler Package Example

In this example, we're going to create a customized event handler as a package. What is a package? A package is a block of Perl source code that is assigned its own namespace. Because the package is assigned to its own namespace, there can't be any conflicts with an application (for example, duplicate variable names).
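As a small illustration of the namespace point, the following sketch declares a variable called $name in two packages within one file. The variable name and its values are made up for this example; only the Custom_Handler package name comes from the listing in this section.

```perl
use strict;
use warnings;

# Each block below belongs to a different package, so each gets
# its own namespace.
{
    package Custom_Handler;
    our $name = 'handler';        # really $Custom_Handler::name
}

{
    package main;
    our $name = 'application';    # really $main::name
}

# Both variables coexist without clashing.
print "$Custom_Handler::name / $main::name\n";
```

The fully qualified names show that the two variables are distinct, even though both are written as $name inside their own packages.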

Note that packages are not the same as files. Multiple packages can reside in one Perl file, and although Perl lets a package be continued in another file, it's best to keep each package in one place. Typically, there is one package per file. If you want to load a package with use or require, the file should have the same name as the package and the extension .pm, which identifies a Perl module.

Let's take a look at Listing 7.7 and see how to create your own customized event handler package. Each of the handlers in the package is discussed after the listing.

Listing 7.7 Customized event handler package. (Filename: ch7_custom_handler.pl)
 1.   package Custom_Handler;
 2.
 3.   sub new {
 4.     my $class = ref $_[0] || $_[0];   # accept a class name or an object
 5.     return bless {}, $class;
 6.   }
 7.
 8.   # start_document handler
 9.   sub start_document {
 10.    my $code_handler = shift;
 11.    my $document = shift;
 12.  }
 13.
 14.  # end_document handler
 15.  sub end_document {
 16.    my $code_handler = shift;
 17.    my $document = shift;
 18.  }
 19.
 20.  # start_element handler
 21.  sub start_element {
 22.    my $code_handler = shift;
 23.    my $element = shift;
 24.    my $name = $element->{Name};
 25.    my $atts = $element->{Attributes};
 26.    print "<$name";
 27.
 28.    # Loop through all the attributes
 29.    foreach my $att (sort keys %$atts) {
 30.      print " $att='$atts->{$att}'";
 31.    }
 32.    print ">";
 33.  }
 34.
 35.  # characters handler
 36.  sub characters {
 37.    my $code_handler = shift;
 38.    my $character = shift;
 39.    print $character->{Data} if defined $character->{Data};
 40.  }
 41.
 42.  # end_element handler
 43.  sub end_element {
 44.    my $code_handler = shift;
 45.    my $element = shift;
 46.    my $name = $element->{Name};
 47.    print "</$name>";
 48.  }

1–6 Our Custom_Handler package needs a new() method to create the object that we later pass to the driver as the handler. The new() method is not part of the SAX2 standard, although it is the conventional constructor in the majority of Perl object-oriented (OO) modules. In that sense, the module goes beyond the standard: it provides a capability in addition to those the standard defines. All of the other methods must be included, even if they don't contain any code. If one of the required methods is missing, the interpreter will complain about a missing subroutine.

 1.   package Custom_Handler;
 2.
 3.   sub new {
 4.     my $class = ref $_[0] || $_[0];   # accept a class name or an object
 5.     return bless {}, $class;
 6.   }

8–18 The start_document() and end_document() methods are called once each, at the beginning and at the end of the XML document. In our case, we don't use them for any particular purpose, although we must still include them because the XML::SAXDriver::Excel and XML::SAXDriver::CSV modules will attempt to call them.

 8.   # start_document handler
 9.   sub start_document {
 10.    my $code_handler = shift;
 11.    my $document = shift;
 12.  }
 13.
 14.  # end_document handler
 15.  sub end_document {
 16.    my $code_handler = shift;
 17.    my $document = shift;
 18.  }

20–33 The start_element() method is called when the parser wants to output the opening XML tag. Two arguments are passed to this subroutine: the first is the handler instance, and the second is a reference to a hash containing the information needed to output a valid XML tag. Line 24 reads $element->{Name}, which contains the XML tag name. Line 25 reads $element->{Attributes}, which is a reference to another hash that holds the attribute names and values as key/value pairs. As you can see from the code, you must output the opening < and closing > characters of the tag yourself.

 20.  # start_element handler
 21.  sub start_element {
 22.    my $code_handler = shift;
 23.    my $element = shift;
 24.    my $name = $element->{Name};
 25.    my $atts = $element->{Attributes};
 26.    print "<$name";
 27.
 28.    # Loop through all the attributes
 29.    foreach my $att (sort keys %$atts) {
 30.      print " $att='$atts->{$att}'";
 31.    }
 32.    print ">";
 33.  }

35–40 The characters() subroutine is called when the parser encounters the actual data to be written between the opening and closing XML tags. The subroutine is passed the handler instance as well as a reference to a hash that holds the data under its {Data} key.

 35.  # characters handler
 36.  sub characters {
 37.    my $code_handler = shift;
 38.    my $character = shift;
 39.    print $character->{Data} if defined $character->{Data};
 40.  }

42–48 Now we define the end_element() subroutine, which is used to close the XML tag. It is passed the same arguments as the start_element() method, although you would not expect {Attributes} to contain data because the XML 1.0 specification does not allow attributes on closing tags.

 42.  # end_element handler
 43.  sub end_element {
 44.    my $code_handler = shift;
 45.    my $element = shift;
 46.    my $name = $element->{Name};
 47.    print "</$name>";
 48.  }
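To see the events described above in action without a driver module, the following sketch fires them by hand in the order a driver would. It includes a condensed copy of the Custom_Handler package so that it runs on its own. The sample element (customer, id, Ann) is made up for illustration, and the in-memory filehandle is only there to capture what the handler prints.

```perl
use strict;
use warnings;

# Condensed copy of the Custom_Handler package from Listing 7.7.
{
    package Custom_Handler;

    sub new { my $class = ref $_[0] || $_[0]; return bless {}, $class; }
    sub start_document { }
    sub end_document   { }

    sub start_element {
        my ($self, $element) = @_;
        my $atts = $element->{Attributes};
        print "<$element->{Name}";
        print " $_='$atts->{$_}'" for sort keys %$atts;
        print ">";
    }

    sub characters {
        my ($self, $chars) = @_;
        print $chars->{Data} if defined $chars->{Data};
    }

    sub end_element {
        my ($self, $element) = @_;
        print "</$element->{Name}>";
    }
}

# Capture everything the handler prints in $xml.
my $xml = '';
open my $buffer, '>', \$xml or die "Cannot open in-memory handle: $!";
my $previous = select $buffer;

# Fire the events in the order a driver would (hypothetical sample data).
my $test_handler = Custom_Handler->new;
$test_handler->start_document({});
$test_handler->start_element(
    { Name => 'customer', Attributes => { id => '42' } } );
$test_handler->characters( { Data => 'Ann' } );
$test_handler->end_element( { Name => 'customer' } );
$test_handler->end_document({});

# Restore the original output handle and show the result.
select $previous;
close $buffer;
print "$xml\n";
```

Calling the methods directly like this is also a convenient way to test a handler before connecting it to a real data source.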

Now that we've developed this custom handler, how do we use it? This handler can be used in the following manner:

 use XML::SAXDriver::CSV;

 my $handler    = Custom_Handler->new;
 my $driver_obj = XML::SAXDriver::CSV->new(
     Source               => { SystemId => "customer_signup.csv" },
     Handler              => $handler,
     Dynamic_Col_Headings => 1,
     File_Tag             => 'code',
 );
 $driver_obj->parse();

As you can see, we easily developed a Custom_Handler object to use the handler with either XML::SAXDriver::Excel or XML::SAXDriver::CSV modules. This is by no means a complete version of the handler, but it can definitely be used as a template.
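As one possible way to extend the template, the sketch below combines the same event methods with state management and a regular expression substitution, the kinds of tasks mentioned at the start of this section. The package name Stateful_Handler, the phone element, and the sample data are all hypothetical. The handler is driven by hand here rather than by a driver module.

```perl
use strict;
use warnings;

# Stateful_Handler is a hypothetical name used only for this sketch.
{
    package Stateful_Handler;

    sub new {
        my $class = ref $_[0] || $_[0];
        # State: a stack of open element names and the collected output.
        return bless { stack => [], output => '' }, $class;
    }

    sub start_document { }
    sub end_document   { }

    sub start_element {
        my ($self, $element) = @_;
        push @{ $self->{stack} }, $element->{Name};
        $self->{output} .= "<$element->{Name}>";
    }

    sub characters {
        my ($self, $chars) = @_;
        my $data = $chars->{Data};
        return unless defined $data;

        # Use the state to decide whether to substitute: strip every
        # non-digit character, but only inside a <phone> element.
        my $stack = $self->{stack};
        $data =~ s/\D//g if @$stack && $stack->[-1] eq 'phone';

        $self->{output} .= $data;
    }

    sub end_element {
        my ($self, $element) = @_;
        pop @{ $self->{stack} };
        $self->{output} .= "</$element->{Name}>";
    }
}

# Drive the handler by hand with made-up data.
my $stateful = Stateful_Handler->new;
$stateful->start_element( { Name => 'customer', Attributes => {} } );
$stateful->start_element( { Name => 'phone',    Attributes => {} } );
$stateful->characters( { Data => '(555) 123-4567' } );
$stateful->end_element( { Name => 'phone' } );
$stateful->end_element( { Name => 'customer' } );

print "$stateful->{output}\n";
```

Collecting the output in the object instead of printing it immediately is a design choice for this sketch; it makes the result easy to inspect or post-process once parsing is finished.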

One important part that we would also need to implement is an escaping subroutine, which converts characters that are illegal in XML data. (Tag names need separate handling, because entity references aren't allowed inside them.) We can do this by simply converting &, <, >, ", and ' to &amp;, &lt;, &gt;, &quot;, and &apos;, respectively. This is the simplest implementation of an escaping subroutine. There are further encoding issues that might need to be taken into consideration, but that is a different topic that I will not cover in this section.
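A minimal sketch of such an escaping subroutine follows. The name escape_xml() and the %entity_for lookup table are assumptions made for this example and are not part of any module discussed here.

```perl
use strict;
use warnings;

# Map each special character to its predefined XML entity.
my %entity_for = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

# escape_xml() is a hypothetical helper name, not part of any module.
sub escape_xml {
    my ($text) = @_;
    return '' unless defined $text;

    # A single pass means the & in an inserted entity is never
    # escaped a second time.
    $text =~ s/([&<>"'])/$entity_for{$1}/g;
    return $text;
}

print escape_xml(q{AT&T <"Ma Bell's">}), "\n";
```

In the handler, a subroutine like this could be applied to $character->{Data} in characters() and to each attribute value in start_element() before printing.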

My recommendation is to always try an existing handler module first, such as the XML::Handler::YAWriter Perl module. This module is well documented, stable, and widely used. Because XML::Handler::YAWriter is a popular Perl module, most (if not all) of its bugs have been encountered and eliminated. However, there may be times when the standard modules can't support your requirements. Now, with the option of a custom module, you should be able to handle just about any task.



XML and Perl
ISBN: 0735712891
Year: 2002
Pages: 145
