8.3 Custom Providers


By default, AxKit presumes that the documents it serves and processes are stored as plain files on the filesystem. While this is adequate for most cases, in some situations you may need to get data from another source. In AxKit, the mechanism for slurping in data for further processing is called a Provider.

Providers come in two flavors: ContentProviders and StyleProviders. As their names suggest, ContentProviders are responsible for fetching the source of the content being delivered, while StyleProviders handle getting the source of the stylesheets to be used to transform that content. The default class for both types, Apache::AxKit::Provider::File, reads data from XML sources on the filesystem. Alternate classes can be configured for both the ContentProvider and the StyleProvider for a given resource using the corresponding AxKit configuration directive:

    # Set each type of Provider explicitly
    AxContentProvider My::Provider
    AxStyleProvider My::Other::Provider

    # Or, configure both to use the same alternate class
    AxProvider My::Generic::Provider

Custom Providers can be used to fetch content from non-XML data, to get XML data from sources other than flat files, or a combination of the two. In some cases, you are looking to take advantage of Perl's capable XML tools and other data processing facilities to generate an XML instance based on another source of data. In others, you simply want to read in XML from a source other than a plain file on the disk. For example, some Providers may use a SAX generator class to dynamically generate an XML document from a directory listing or Excel spreadsheet, while others may be used to extract existing XML content from a zip archive or relational database.

But wait, as you saw in Chapter 7, AxKit offers several fine options for generating dynamic content, so why would you use a Provider instead of a taglib? There is no hard and fast rule, but, in general, defining the real source of the content decides the matter. For example, a shopping cart application that includes a list of products from a database is probably best implemented through a taglib, while a content management system that returns a complete document from the same database may best be integrated into AxKit as a custom ContentProvider. That is, in the shopping cart page, the product list is only one component that is included into the content, while the data returned from the CMS defines the content. This distinction may seem a bit arbitrary, and from a technical point of view, it is, given that either task could be achieved by either means. Spending a few minutes considering the best approach for the task at hand can save hours of development time in the long run.

8.3.1 Provider API

Generally, a Provider is expected to offer access to the sources of the content or stylesheets that are associated with the current request, as well as certain key pieces of metadata about those sources. The data for the given resource must be returned from one of the get_fh (get filehandle), get_strref (get string reference), or get_dom (get Document Object instance) methods. These methods are simply variations that allow the source to be passed to AxKit in different data formats. Only one must be implemented for the Provider to work. All custom Providers should be inherited subclasses of AxKit's base Provider class Apache::AxKit::Provider or one of its subclasses:

    package My::Provider;
    use strict;
    use Apache::AxKit::Provider;
    use vars qw( @ISA );
    @ISA = qw( Apache::AxKit::Provider );

    # Override some class methods and
    # add a few of your own.

    1;

init( )

Called before all other methods, the init method gives the Provider a chance to perform any initialization logic needed to prepare for further processing. It is most commonly used for things such as instantiating any objects that the Provider needs to handle the request, initializing instance variables, and so on. In classes that inherit from AxKit's base Provider class (and most should), it is passed the same arguments that were passed to the constructor for the current instance:

    sub init {
        my $self = shift;
        my %args = @_;

        $self->{content_application} = My::App->new( );
        # and so on . . .
    }

process( )

The process method is used to communicate whether or not the Provider provides content for the given resource. It is passed no arguments (apart from the instance reference passed as the first argument to all Perl methods) and is expected to return 1 (or any nonzero value) to indicate that all conditions are met for the Provider to handle the request, and 0 or undef otherwise. For cases in which the Provider cannot continue, it is strongly recommended that an appropriate exception be thrown, providing an explanation as to what may have gone wrong:

    sub process {
        my $self = shift;
        my $uri = $self->apache_request->uri( );

        # Get the data based on some URI-to-data mapping method
        # implemented elsewhere in your custom Provider
        my $data = $self->map_uri_to_data( $uri );

        if ( defined( $data ) ) {
            $self->{data} = $data;
            return 1;
        }
        else {
            throw Apache::AxKit::Exception::Error( -text => "No data associated with URI '$uri'." );
        }
    }

mtime( )

Used with AxKit's caching mechanism, the mtime method is expected to return the last modification time, in seconds since the epoch, for the current resource. If the document being provided will always be dynamic (based on user input, etc.), returning the result of Perl's built-in time( ) function ensures that the data is never cached unexpectedly:

    sub mtime {
        return time( ); # content is always fresh.
    }
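Content that is not purely dynamic needs a real timestamp instead. As a minimal sketch, and only one of several possible policies, the following assumes that a resource is assembled from several records that each carry their own last-updated time in epoch seconds, and it reports the most recent of them. The latest_mtime helper and the sample timestamps are hypothetical; they are not part of AxKit.

```perl
use strict;
use warnings;
use List::Util qw( max );

# Hypothetical helper: given the epoch timestamps of every source that
# contributes to a resource, report the most recent one. If no timestamps
# are known, or any of them is undefined, fall back to "now" so that
# nothing stale is served from the cache.
sub latest_mtime {
    my @timestamps = @_;
    return time() if !@timestamps || grep { !defined } @timestamps;
    return max( @timestamps );
}

# Illustrative values only: three rows updated at different times.
my @row_updates = ( 1_050_000_000, 1_060_000_000, 1_055_000_000 );
print latest_mtime( @row_updates ), "\n";    # 1060000000
```

A Provider's mtime method could then return the result of such a helper. Treating an unknown timestamp as "now" trades some cache hits for safety.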

Implementing mtime correctly for cases in which the data being provided is not a plain file on the disk but is an aggregate of data from more than one source can be tricky. For example, if the content is being built as the result of an SQL query that joins several tables that may have been updated at different times, how does one determine the true last modification time for that resource? The answer is always very application-specific, and I will avoid making dubious generalizations here. It is enough to say that being able to take advantage of AxKit's caching facilities wherever possible and appropriate is a huge performance gain. The time spent implementing mtime is usually worth the investment.

get_styles( )

Called only on ContentProvider classes, the get_styles method is responsible for returning the final list of processors to be applied to the given resource. It is expected to return a reference to a list of style definitions that AxKit uses to transform the content. Styles are applied in the order in which they appear: the first style is applied to the source content, the second to the result of the first, and so on. Each style definition takes the form of an anonymous hash reference containing two required keys: href, whose value contains the DocumentRoot-relative path to the stylesheet to be applied, and type, whose value declares the MIME type associated with the Language processor to be used to apply the stylesheet:

    my @styles = ( { type => "text/xsl",
                     href => "/styles/style1.xsl"
                   },
                   { type => "text/xsl",
                     href => "/styles/style2.xsl"
                   }
                 );
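The ordering rule is easy to see in isolation. The sketch below is not AxKit's processing loop. It replaces real stylesheets with hypothetical Perl code references (looked up by href) so that the chaining is visible: each step receives the output of the previous one.

```perl
use strict;
use warnings;

# Style definitions in the shape that get_styles is expected to return.
my @styles = (
    { type => 'text/xsl', href => '/styles/style1.xsl' },
    { type => 'text/xsl', href => '/styles/style2.xsl' },
);

# Hypothetical stand-ins for the real transformations.
my %transform_for = (
    '/styles/style1.xsl' => sub { 'style1(' . $_[0] . ')' },
    '/styles/style2.xsl' => sub { 'style2(' . $_[0] . ')' },
);

# Apply each style, in list order, to the result of the previous one.
sub apply_styles {
    my ( $content, $styles ) = @_;
    for my $style ( @{$styles} ) {
        $content = $transform_for{ $style->{href} }->( $content );
    }
    return $content;
}

print apply_styles( 'source', \@styles ), "\n";    # style2(style1(source))
```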

In the default ContentProviders, get_styles( ) is used to map the current preferred style and media to any xml-stylesheet processing instructions contained in the source XML. If no matching styles are found, the ConfigReader's GetMatchingProcessors method is called, the document's root element name and Document Type Definition are evaluated against all AxAdd*Processor configuration directives in the current scope, and any matching styles are used instead. In all, get_styles is a crucial method whose default implementation provides much of AxKit's expected behavior. It should be overridden only with caution and a clear purpose.

That said, some Providers, most notably those implementing a bridge between AxKit and a content creation application that needs to define one or more stylesheet transformations to create the "view" of a given set of data for a particular application state, may need explicit control over the list of styles to apply to the content. In these cases, get_styles offers the most direct, least ambiguous way to define the styles to be applied. Overriding the default implementation of this method does not mean abandoning the use of the preferred style and media properties that an upstream plug-in may have set; these values are passed in as arguments to get_styles. The following shows how an application-based Provider may conditionally override the current list of styles, while still falling back to any default styles defined via configuration directive or xml-stylesheet processing instruction:

    package My::Provider;
    use vars qw( @ISA );
    @ISA = qw( Apache::AxKit::Provider );

    . . .

    sub get_styles {
        my $self = shift;
        my ( $preferred_media_name, $preferred_style_name ) = @_;

        my $app = $self->{some_content_application};
        my @style_list = $app->get_axkit_styles( $preferred_media_name,
                                                 $preferred_style_name );

        # If your application returned styles, use those; otherwise, fall back to the
        # default implementation in your parent Provider class.
        if ( scalar( @style_list ) > 0 ) {
            return \@style_list; # you return a reference, not the list itself.
        }
        else {
            return $self->SUPER::get_styles( $preferred_media_name, $preferred_style_name );
        }
    }

Or, here's how an application-driven ContentProvider may alter the preferred media and style properties while letting the default Provider handle the low-level details:

    sub get_styles {
        my $self = shift;
        my ( $preferred_media_name, $preferred_style_name ) = @_;

        my $app = $self->{some_content_application};

        # Prefer the application's values; otherwise keep what was passed in
        my $new_preferred_style = $app->get_axkit_stylename( ) || $preferred_style_name;
        my $new_preferred_media = $app->get_axkit_medianame( ) || $preferred_media_name;

        return $self->SUPER::get_styles( $new_preferred_media, $new_preferred_style );
    }

get_strref( )

One of three methods available for returning content, get_strref (get string reference) offers the ability to return the XML content for the current resource as a reference to a scalar containing the entire document as a string. For example, the following shows how a custom ContentProvider built on XML::Generator::DBI (which generates SAX events from the result of a database query) may implement get_strref to return a generated XML instance:

    sub get_strref {
        my $self = shift;
        my $content = '';

        # Have the SAX writer serialize its output into $content
        my $writer = XML::SAX::Writer->new( Output => \$content );
        my $generator = XML::Generator::DBI->new(
                                   Handler => $writer,
                                   dbh => $self->{db_handle}
                                   );

        eval {
            $generator->execute( $self->{sql_statement} );
        };

        if ( my $error = $@ ) {
            throw Apache::AxKit::Exception::Error( -text => "Error generating XML: $error" );
        }

        if ( length( $content ) ) {
            # you return a reference, not the scalar itself
            return \$content;
        }
        else {
            throw Apache::AxKit::Exception::Error( -text => "No data was returned from SQL $self->{sql_statement}." );
        }
    }

get_fh( )

Similar to get_strref, the get_fh (get filehandle) method offers a way to return data as an open filehandle. In some circumstances too complex to detail here, a filehandle requires fewer system resources than a scalar variable that contains the same document as a plain string; get_fh offers a way to take advantage of that optimization:

    # As above, but return a filehandle instead
    sub get_fh {
        my $self = shift;

        # Use the Apache-friendly way to create a new filehandle
        my $handle = $self->apache_request->gensym( );

        my $writer = XML::SAX::Writer->new( Output => $handle );
        my $generator = XML::Generator::DBI->new(
                                   Handler => $writer,
                                   dbh => $self->{db_handle}
                                   );

        eval {
            $generator->execute( $self->{sql_statement} );
        };

        if ( my $error = $@ ) {
            throw Apache::AxKit::Exception::Error( -text => "Error generating XML: $error" );
        }

        return $handle;
    }

get_dom( )

The get_dom method offers a way to return the XML data for the current resource as an XML::LibXML::Document instance. It is used most often as a means to pass the content from application frameworks such as SAWA and CGI::XMLApplication without incurring the overhead of serializing that DOM object via its toString method and reparsing it once it is passed into AxKit.

    sub get_dom {
        my $self = shift;
        my $content = $self->{XML_APP}->getDom( );

        unless ( $content ) {
            throw Apache::AxKit::Exception::Error( -text => "Error generating XML, no document object returned" );
        }

        return $content;
    }

key( )

Called throughout AxKit, the Provider's key method should return a string that can be used as a persistent, unique identifier for the current resource. It is used extensively by AxKit's default caching mechanism (along with mtime) either to look up content that may be cached on the disk or, if caching is turned on and no entry exists yet, to create the ID for a new cache entry.

In the default file Provider, key simply returns the filename associated with the current request, which is sufficient in most cases. File-based alternate Providers are encouraged to do the same, or to inherit from Apache::AxKit::Provider::File and avoid implementing the key method altogether. In cases in which there is no one-to-one mapping between the current request URI and a file on the filesystem, a smarter key method is almost always required.
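To make the relationship between key and mtime concrete, here is a toy in-memory cache. It is a sketch of the general idea only, not AxKit's cache implementation; the cache_fetch function, the %cache hash, and the sample key are all hypothetical. An entry is reused only when it is stored under the same key and was stored no earlier than the resource's reported modification time.

```perl
use strict;
use warnings;

my %cache;    # key => { stored_at => epoch seconds, content => string }

# Return cached content if it is still fresh; otherwise rebuild and store it.
sub cache_fetch {
    my ( $key, $mtime, $now, $build ) = @_;
    my $entry = $cache{$key};
    if ( $entry && $entry->{stored_at} >= $mtime ) {
        return ( $entry->{content}, 'hit' );
    }
    my $content = $build->();
    $cache{$key} = { stored_at => $now, content => $content };
    return ( $content, 'miss' );
}

my $builds = 0;
my $build  = sub { $builds++; return "<doc version='$builds'/>" };

# First request: nothing is cached yet.
my ( undef, $first )  = cache_fetch( '/docs/a.xml', 100, 150, $build );
# Same key, and the resource is unchanged since it was cached.
my ( undef, $second ) = cache_fetch( '/docs/a.xml', 100, 160, $build );
# Same key, but the resource was modified after the cached copy was stored.
my ( undef, $third )  = cache_fetch( '/docs/a.xml', 170, 180, $build );

print "$first $second $third\n";    # miss hit miss
```

If two different documents ever returned the same key, the second would be served the first one's cached output, which is why uniqueness matters as much as stability.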

Suppose you use a content management system for part of your site that stores the source XML documents in a relational database. You are now faced with creating the public interface to that data. Let's go a step further and say that your CMS offers an internal hierarchical mapping that allows content objects to be selected using a path interface. Setting up the interface is easy. You only need to create a virtual URL with a <Location> directive and set your custom Provider as the ContentProvider for that URL. Then you can simply use any additional path information from the incoming request as the path passed to your application to retrieve the content. You must explicitly set the cache directory for this resource, since, by default, AxKit attempts to write the cache to the same directory as the requested content, which in this case is a directory that does not actually exist.

    # virtual URI for public side of your CMS
    <Location /cmsapp/content>
      AxContentProvider My::CMS::Provider
      # always set an explicit cache for virtual URIs
      AxCacheDir /.mycachedir
    </Location>

Given that all requests for content within this resource always have the same value for the request object's filename, you cannot just use the same strategy as the default Providers. You must use the full URI (including the additional path information) in the string returned to create a unique cache key for each document in the document store:

    sub key {
        my $self = shift;
        my $r = $self->apache_request( );

        return $r->uri( );
    }

You can achieve the same effect using a unique property from the content object itself:

    sub key {
        my $self = shift;
        return $self->{content_object}->id( );
    }

exists( )

This method is expected to return 1 (or any nonzero value) if the resource exists and is readable, and 0 or undef otherwise. Typically, a class member added to the current instance during init or process can be examined and the appropriate value returned:

    sub process {
        my $self = shift;
        my $uri = $self->apache_request->uri( );
        my $data = $self->map_uri_to_data( $uri );

        if ( defined( $data ) ) {
            $self->{data} = $data;
            $self->{exists} = 1;
            return 1;
        }
        else {
            throw Apache::AxKit::Exception::Error( -text => "No data associated with URI '$uri'." );
        }
    }

    sub exists {
        my $self = shift;
        if ( defined( $self->{exists} ) ) {
            return 1;
        }
        else {
            return 0;
        }
    }
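The methods described in this section fit together roughly as shown in the following self-contained sketch. Nothing here is AxKit code: Toy::Provider, its %STORE hash, and the serve function are hypothetical stand-ins, and the call order is a simplification inferred from the descriptions above, not a transcript of AxKit's internals. For brevity, the toy process method returns 0 on failure instead of throwing an exception.

```perl
use strict;
use warnings;

package Toy::Provider;

# Hypothetical document store standing in for a database or archive.
my %STORE = ( '/hello' => '<greeting>Hello</greeting>' );

sub new {
    my ( $class, %args ) = @_;
    my $self = bless {}, $class;
    $self->init( %args );    # init sees the constructor's arguments
    return $self;
}

sub init { my ( $self, %args ) = @_; $self->{uri} = $args{uri}; return; }

sub process {
    my $self = shift;
    $self->{data} = $STORE{ $self->{uri} };
    return defined $self->{data} ? 1 : 0;
}

sub exists     { my $self = shift; return defined $self->{data} ? 1 : 0; }
sub key        { my $self = shift; return $self->{uri}; }
sub mtime      { return time(); }
sub get_strref { my $self = shift; return \$self->{data}; }

package main;

# Hypothetical driver showing one plausible call order.
sub serve {
    my $uri      = shift;
    my $provider = Toy::Provider->new( uri => $uri );
    return 'declined' unless $provider->process() && $provider->exists();
    my $xml_ref = $provider->get_strref();
    return $provider->key() . ' => ' . ${$xml_ref};
}

print serve( '/hello' ),   "\n";    # /hello => <greeting>Hello</greeting>
print serve( '/missing' ), "\n";    # declined
```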

It is worth mentioning again: as subclasses of one of AxKit's default Providers, most custom Providers only ever need to implement a few of these methods. Often, implementing the process method to fetch and preprocess the data from the given source, plus one of the get_* methods to return that data to AxKit, is all that is required for a working Provider. To bring this all together, the following simple Provider allows you to transparently serve both content and stylesheets from zip archives. (See Example 8-2.)

Example 8-2. Provider::Zip
    package Provider::Zip;
    use strict;
    use vars qw($VERSION @ISA);
    use Apache::AxKit::Provider::File;
    use Archive::Zip qw(:ERROR_CODES);

    # Inherit from the default file Provider class
    @ISA = ('Apache::AxKit::Provider::File');

    # Route Archive::Zip's error messages to AxKit's debug log
    Archive::Zip::setErrorHandler(\&_error_handler);

    sub _error_handler {
        my $error = shift;
        AxKit::Debug(3, $error);
    }

    sub exists {
        my $self = shift;
        return defined( $self->{zip_member} );
    }

    sub mtime {
        my $self = shift;
        # lastModTime( ) reports the member's timestamp in Unix time format;
        # the lastModFileDateTime field holds the MS-DOS format instead
        return $self->{zip_member}->lastModTime( );
    }

    sub process {
        my $self = shift;
        my $zip = Archive::Zip->new( );

        if ($zip->read($self->{file}) != AZ_OK) {
            throw Apache::AxKit::Exception::IO (-text => "Couldn't read archive file '$self->{file}'");
        }

        my $r = $self->apache_request;
        my $member;
        my $path_info = $r->path_info;

        # Member names inside the archive have no leading slash
        $path_info =~ s|^/||;

        if ( $self->{zip_uri} ) {
            $member = $zip->memberNamed($self->{zip_uri});
        }
        else {
            if ($path_info) {
                $member = $zip->memberNamed($path_info);
            }
            else {
                # No path given; fall back to an index document
                $member = $zip->memberNamed('index.xml') || $zip->memberNamed('index.xsp');
            }
        }

        unless ( $member ) {
            throw Apache::AxKit::Exception::Declined(
                    -text => "Document could not be retrieved from $self->{file}"
                    );
        }

        $self->{zip_member} = $member;
        return 1;
    }

    sub get_strref {
        my $self = shift;
        my ($data, $status) = $self->{zip_member}->contents( );
        my $r = $self->apache_request( );

        if ($status != AZ_OK) {
            throw Apache::AxKit::Exception::Error(
                    -text => "Document could not be retrieved from $self->{file}: $status"
                    );
        }

        # Allow images to be served correctly
        if ( $r->path_info =~ /\.(png|gif|jpg)$/ ) {
            my $image_type = $1;
            $r->content_type( 'image/' . $image_type );
            $r->send_http_header( );
            $r->print( $data );
            throw Apache::AxKit::Exception::Declined(
                    -text => "Image detected, skipping further processing."
                    );
        }

        # get_strref returns a reference, not the scalar itself
        return \$data;
    }

    1;

Obviously, a production-worthy implementation would be a bit more complex, but the basic functionality exists. Once this custom Provider is installed, you only need to configure AxKit to process zip archives and then to set up special Alias and Location directives for each zip to make browsing the archived content seem transparent:

    # Set AxKit to process zip archives
    AddHandler axkit .zip

    # Add an Alias, so the zipped content appears
    # to be a native part of the site.
    Alias /help /www/sites/myaxkithost/zips/helpdocs.zip

    # And set AxKit to use Provider::Zip to fetch both
    # content and stylesheets for the zipped help docs.
    <Location /help>
      AxProvider Provider::Zip
    </Location>

With these directives in place, a request to http://localhost/help/index.xml causes AxKit to extract the file index.xml from the top level of the helpdocs.zip archive. In addition, any xml-stylesheet processing instruction found in that document whose href attribute points to a document at or below that same level in the archive also causes that stylesheet document to be extracted and applied at request time.
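The member-selection rule used by process in Example 8-2 can be exercised on its own. This sketch uses only core Perl; the member_name_for function and the $has_member callback are hypothetical helpers that mirror the path handling in the example (strip the leading slash, fall back to an index document) without touching a real archive.

```perl
use strict;
use warnings;

# Map the extra path information of a request to a name inside the
# archive. $has_member is a callback that reports whether a member exists.
sub member_name_for {
    my ( $path_info, $has_member ) = @_;
    $path_info = '' unless defined $path_info;
    $path_info =~ s{^/}{};
    return $path_info if length $path_info;

    # No path given: try the index documents in order.
    for my $default ( 'index.xml', 'index.xsp' ) {
        return $default if $has_member->( $default );
    }
    return;
}

# Illustrative archive contents only.
my %members    = map { $_ => 1 } ( 'index.xsp', 'guide/intro.xml' );
my $has_member = sub { exists $members{ $_[0] } };

print member_name_for( '/guide/intro.xml', $has_member ), "\n";    # guide/intro.xml
print member_name_for( '',                 $has_member ), "\n";    # index.xsp
```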

XML Publishing with AxKit
ISBN: 0596002165
EAN: 2147483647
Year: 2003
Pages: 109
Authors: Kip Hampton
