3.2 Example Client: Meerkat


Before moving to Chapter 4 and diving straight into the different Perl toolkits for XML-RPC, let's look at a simple example of a client [1] to give you a feel for the phases in the XML-RPC request/response lifecycle. The example is fairly rudimentary, so that it can be done without using any of the available toolkits. But even so, it's complex enough that it also serves as an incentive for you to read Chapter 4, which shows how much the toolkits can simplify an application.

[1] Illustrating server development will be left for the introduction of toolkits in Chapter 4 because such code would be long and unwieldy if written from scratch.

3.2.1 The Meerkat Service

Meerkat is an open wire service offered by O'Reilly & Associates, Inc. It offers application-level access to news stories in an array of channels and covers a large variety of topics. Meerkat demonstrates the early success of XML-RPC as an API layer.

Users register an account at http://meerkat.oreillynet.com and then customize the way the news content is presented. From the browser interface, it is possible to select not only the channels themselves but also to fine-tune the set of stories that are chosen for display by applying a search pattern (which can in fact be a regular expression) as a filter against the list. For example, a filter of "perl" against the stories from the "Scripting News" channel limits the results to just the stories that mention Perl.

Users may save their choice of channels and a search pattern as a profile. To get started, Meerkat offers a set of ready-to-use basic profiles for the more common and popular topics. Figure 3-1 shows a screenshot of Meerkat displayed with the Mozilla browser running under Linux. The profile used for the displayed contents gathers items from most of the Perl-related channels, plus a few others such as the popular Slashdot news portal. All stories are searched for the word "perl," so that only the ones that actually mention Perl directly are displayed.

Figure 3-1. A sample Meerkat page

In addition to being able to save preferences, users can tune the search parameters within the URL itself. The documentation for the service explains this in greater detail, but basically all the form elements on the page may instead appear in the URL. Furthermore, the format of the data returned can also be tuned to a certain degree, including the generation of XML rather than HTML. In fact, a suitable web services client can be built using just this form of the search interface.

However, Meerkat also provides an XML-RPC interface. All the data-retrieval and searching functionality available to the web browser is also available at the API level. The user-level personalization and customization aren't available to the programmatic interface, but a client program can easily manage its own layer of preferences control.

3.2.2 From Meerkat Query to HTML Sidebar

Our first example that uses XML-RPC is a simple utility that takes the output from a query and converts it to an HTML segment suitable for inclusion in a larger page. In essence, this is the sort of task one might do with an ordinary RSS feed, only in this case the application can use different queries to construct different feeds.

The example doesn't use any of the XML-RPC toolkits available for Perl just yet, which means the code is much more complex than it would otherwise be. This helps to draw a comparison to the code in Chapter 4, in which the tools are introduced and demonstrated. In place of a toolkit, the example uses the LWP module (a set of classes for HTTP clients, available from the Comprehensive Perl Archive Network) directly to communicate with the Meerkat server, and uses the XML::XPath module to parse and process the responses the server provides. When the application reading the response is this specialized, the XPath language provides a much easier solution than processing the entire document with a package like XML::Parser.

The code for the utility is given in Example 3-8. Remember, this is no small script, because we're doing everything by hand. Using a toolkit from Chapter 4 makes the program considerably smaller, as we'll see. In practice, you'd only do things at this low a level if the toolkits were not available. This is meant to demonstrate how helpful the toolkits are.

Example 3-8. The meer2html sample utility
#!/usr/bin/perl -w

use strict;
use vars qw($chan $cat $num $data $UA $request);

use LWP::UserAgent;
use HTTP::Request;
use XML::XPath;

use constant MEERKAT =>
    'http://www.oreillynet.com/meerkat/xml-rpc/server.php';
use constant XPATH_TO_STRUCTS =>
    '/methodResponse/params/param/value/array/data/value' .
    '/struct';

# Expect -channel or -category (abbreviated to -ch or -ca), the name
# or numeric ID, and an optional item count that defaults to 15
if ($ARGV[0] =~ /^-ch/) {
    $chan = $ARGV[1];
    $num  = $ARGV[2] || 15;
} elsif ($ARGV[0] =~ /^-ca/) {
    $cat = $ARGV[1];
    $num = $ARGV[2] || 15;
}

unless (($chan or $cat) and ($num =~ /^\d+$/)) {
    die "USAGE: $0 { -channel str | -category str } [ n ]";
}

# One user agent and one request object are reused for every call
$UA = LWP::UserAgent->new( );
$request = HTTP::Request->new(POST => MEERKAT);
$request->content_type('text/xml');

$data = $chan ? data_from_chan($chan, $num) :
                data_from_cat($cat, $num);
show_data($data);

exit;

# Resolve a channel name to its numeric ID if needed, then fetch
sub data_from_chan {
    my ($chan, $num) = @_;

    $chan = resolve_name($chan, 'Channels')
        unless ($chan =~ /^\d+$/);
    get_data(channel => $chan, $num);
}

# Resolve a category name to its numeric ID if needed, then fetch
sub data_from_cat {
    my ($cat, $num) = @_;

    $cat = resolve_name($cat, 'Categories')
        unless ($cat =~ /^\d+$/);
    get_data(category => $cat, $num);
}

# Extract title, link, and description from each struct in the
# response and print them as an HTML description list
sub show_data {
    my $data = shift;

    my $xp = XML::XPath->new(xml => $$data);
    my $nodes = $xp->find(XPATH_TO_STRUCTS);
    my @stories = ( );

    for my $struct ($nodes->get_nodelist) {
        my $tmp = {};
        for my $key (qw(title link description)) {
            my $node = $xp->find(qq(member[name="$key"]), $struct);
            $tmp->{$key} =
                $xp->find('value/string', $node->get_node(1))
                   ->string_value;
        }
        push(@stories, $tmp);
    }

    print STDOUT qq(<span class="meerkat">\n<dl>\n);
    for (@stories) {
        print STDOUT <<"END_HTML";
<dt class="title"><a href="$_->{link}">$_->{title}</a></dt>
<dd class="description">$_->{description}</dd>
END_HTML
    }
    print STDOUT qq(</dl>\n</span>\n);
}

# Look up the numeric ID for a channel or category name; $name is
# 'Channels' or 'Categories' and selects the remote procedure
sub resolve_name {
    my ($str, $name) = @_;

    $name = "meerkat.get${name}BySubstring";
    my $xml = <<"END_XML";
<?xml version="1.0"?>
<methodCall>
  <methodName>$name</methodName>
  <params>
    <param><value>$str</value></param>
  </params>
</methodCall>
END_XML

    $request->content($xml);
    my $resp = $UA->request($request);
    die "resolve_name: transport error: " . $resp->message
        if $resp->is_error;

    my $xp = XML::XPath->new(xml => $resp->content);
    my $nodeset = $xp->find(XPATH_TO_STRUCTS);
    die "resolve_name: $str returned no matches"
        unless ($nodeset->size);
    die "resolve_name: $str returned more than 1 match"
        if ($nodeset->size > 1);

    my $node = $nodeset->get_node(1);
    $node = $xp->find('member[name="id"]', $node);
    $xp->find('value/int', $node->get_node(1))
       ->string_value;
}

# Call meerkat.getItems and return a reference to the raw response
sub get_data {
    my ($key, $val, $num) = @_;

    my $xml = <<"END_XML";
<?xml version="1.0"?>
<methodCall>
  <methodName>meerkat.getItems</methodName>
  <params>
    <param><value>
      <struct>
        <member>
          <name>$key</name>
          <value><int>$val</int></value>
        </member>
        <member>
          <name>time_period</name>
          <value><string>7DAY</string></value>
        </member>
        <member>
          <name>num_items</name>
          <value><int>$num</int></value>
        </member>
        <member>
          <name>descriptions</name>
          <value><int>200</int></value>
        </member>
      </struct>
    </value></param>
  </params>
</methodCall>
END_XML

    $request->content($xml);
    my $resp = $UA->request($request);
    die "get_data: transport error: " . $resp->message
        if $resp->is_error;

    my $content = $resp->content;
    return \$content;
}

One of the first things to notice in this application is the amount of space occupied by the inline-coded XML blocks. In this tool, there are only two such blocks to build. The application itself is very difficult to retarget to a different kind of XML-RPC server, because it makes some specific assumptions about the layout of the responses to the queries.
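To make the layout assumption concrete, here is a minimal sketch that applies the same XPath expression as XPATH_TO_STRUCTS to a canned response. The response document is made up for illustration (it is not captured Meerkat output), and the snippet assumes the XML::XPath module is installed. It makes no network call.

```perl
use strict;
use warnings;
use XML::XPath;

# Made-up response in the layout the script expects:
# a single parameter holding an array of structs.
my $response = <<'END_XML';
<?xml version="1.0"?>
<methodResponse>
  <params><param><value><array><data>
    <value><struct>
      <member><name>title</name>
        <value><string>First story</string></value></member>
      <member><name>link</name>
        <value><string>http://www.example.com/1</string></value></member>
    </struct></value>
    <value><struct>
      <member><name>title</name>
        <value><string>Second story</string></value></member>
      <member><name>link</name>
        <value><string>http://www.example.com/2</string></value></member>
    </struct></value>
  </data></array></value></param></params>
</methodResponse>
END_XML

my $xp = XML::XPath->new(xml => $response);
my $structs = $xp->find(
    '/methodResponse/params/param/value/array/data/value/struct');

my @titles;
for my $struct ($structs->get_nodelist) {
    # Relative path, evaluated with the struct as the context node
    my $found = $xp->find('member[name="title"]/value/string', $struct);
    push @titles, $found->string_value;
}
print "$_\n" for @titles;
```

A server that returned a bare struct, or wrapped its array one level deeper, would yield an empty node set here, which is why the utility is hard to point at a different service.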

The application starts by defining the usual pragmas and loading some key libraries. The LWP::UserAgent and HTTP::Request are parts of the LWP package that were briefly touched on in Chapter 2. The XML::XPath module provides the implementation of the W3C's XPath query syntax. Finally, two constant values are defined (using the constant pragma) for the URL of the Meerkat service itself and for a particularly long XPath expression that is used in a few different places.

Processing the command-line arguments with this tool is straightforward. The user is required to specify either a channel or a category, using -channel or -category, respectively (which can be abbreviated to as little as -ch or -ca). The value that follows the flag is checked for validity later. Following this is an optional numeric argument to specify how many items to fetch. This defaults to 15 if not given explicitly. After the command line has been deemed valid, the application creates an LWP::UserAgent object and an HTTP::Request object. The request object is then set to have a Content-Type header of text/xml.

Getting the data is very direct. There are two data-fetching routines defined, one each for channels and categories. This allows the application to specify how to resolve the user-specified value if it isn't already numeric. Because Meerkat uses numeric identifiers, the application allows the user to specify the channel or category by name (by substring, in fact). This value is used to call either the getChannelsBySubstring or getCategoriesBySubstring routine. Luckily for the application, these are both virtually identical in syntax, except for the actual remote procedure name. That simplifies the resolve_name routine, which can plug the type into the XML string it builds, and parse the results the same way regardless of which is called. If the substring match returns more than one hit, the application stops.
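The way the lookup type is plugged into the remote procedure name can be sketched in isolation. The helper names build_lookup_call and xml_escape are hypothetical (neither appears in Example 3-8), and the escaping step and the explicit string tag are additions that the original script does not have. The snippet only builds the request body and does not contact the server.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML text.
sub xml_escape {
    my $text = shift;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: build the XML-RPC request body for a name lookup.
# $kind is 'Channels' or 'Categories', as passed to resolve_name.
sub build_lookup_call {
    my ($kind, $substring) = @_;
    die "unknown lookup kind: $kind\n"
        unless $kind eq 'Channels' or $kind eq 'Categories';

    my $method = "meerkat.get${kind}BySubstring";
    my $arg    = xml_escape($substring);

    return <<"END_XML";
<?xml version="1.0"?>
<methodCall>
  <methodName>$method</methodName>
  <params>
    <param><value><string>$arg</string></value></param>
  </params>
</methodCall>
END_XML
}

my $demo = build_lookup_call('Channels', 'perl');
print $demo;
```

Because only the method name changes between the two lookups, a single routine can serve both, which is the simplification the text describes.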

With a numeric channel or category ID, the get_data routine makes the ultimate call to the service to get the actual story data. As with resolve_name, the structure of the call is virtually identical in both cases, differing by only one segment of the XML string. By using the frontend routines data_from_chan and data_from_cat, the application avoids repeated tests of which type the user provided. The content returned by the call to Meerkat's getItems routine is then returned as a scalar reference (to avoid repeated copying of so large a string on the stack).
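As a rough illustration of how little differs between the two kinds of request, the following sketch builds the getItems body from its parts and hands it back by reference. The helper names member and build_get_items_call are hypothetical, the integer check is an addition, and the channel ID 5 is an arbitrary placeholder rather than a real Meerkat channel.

```perl
use strict;
use warnings;

# Hypothetical helper: render one <member> of an XML-RPC struct.
sub member {
    my ($name, $type, $value) = @_;
    return "      <member>\n"
         . "        <name>$name</name>\n"
         . "        <value><$type>$value</$type></value>\n"
         . "      </member>\n";
}

# Hypothetical helper: build the meerkat.getItems request body.
# $key is 'channel' or 'category'; $id and $num must be integers.
sub build_get_items_call {
    my ($key, $id, $num) = @_;
    for ($id, $num) { die "not an integer: $_\n" unless /^\d+$/ }

    my $members = member($key, 'int', $id)
                . member('time_period', 'string', '7DAY')
                . member('num_items', 'int', $num)
                . member('descriptions', 'int', 200);

    my $xml = qq(<?xml version="1.0"?>\n)
            . "<methodCall>\n"
            . "  <methodName>meerkat.getItems</methodName>\n"
            . "  <params>\n"
            . "    <param><value><struct>\n"
            . $members
            . "    </struct></value></param>\n"
            . "  </params>\n"
            . "</methodCall>\n";
    return \$xml;    # a reference, so the caller does not copy the string
}

my $call = build_get_items_call(channel => 5, 15);
print $$call;
```

The caller dereferences the result with $$call, just as show_data does with the response it receives.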

Processing the data is also simple. For the sake of the example, the application is designed to generate a block of HTML within a span element that contains a description list. The list uses the title value from a given data record for the dt tag, and the link value as the target of a hyperlink. The data in the description field of the structure is plugged into the dd element.
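The rendering step can be sketched on its own, separated from the parsing. The helper names stories_to_html and html_escape are hypothetical, and the escaping is an addition: Example 3-8 interpolates the values as they arrive, which may break the markup if a title contains characters such as an ampersand or angle bracket. The sample record is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: escape text for safe inclusion in HTML.
sub html_escape {
    my $text = shift;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: turn a list of story hashes into the sidebar markup.
sub stories_to_html {
    my @stories = @_;
    my $html = qq(<span class="meerkat">\n<dl>\n);
    for my $story (@stories) {
        my %s = map { $_ => html_escape($story->{$_}) }
                qw(title link description);
        $html .= qq(<dt class="title"><a href="$s{link}">$s{title}</a></dt>\n);
        $html .= qq(<dd class="description">$s{description}</dd>\n);
    }
    $html .= qq(</dl>\n</span>\n);
    return $html;
}

# Made-up sample record, for illustration only.
my $out = stories_to_html({
    title       => 'Perl & XML-RPC',
    link        => 'http://www.example.com/story?id=1',
    description => 'A short <description> of the story.',
});
print $out;
```

Returning the markup as a string rather than printing it directly also makes the routine easier to test or to embed in a larger page generator.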

The XPath expressions used here are simple, but because XPath will be set aside in favor of toolkits, there is no need to explain them in detail. The tool itself, while a very basic example, can be used as-is for generating segments of HTML. These segments can be created from a task scheduler such as the Unix cron command and included as server-side elements in HTML pages. Of course, if used in such a "production" environment, it would be better to make more of the settings (such as how far back in time to search) controllable by parameters.
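One way to expose those settings is sketched below with the core Getopt::Long module. The option names and the parse_options routine are hypothetical and are not part of Example 3-8. The defaults mirror the values hard-coded there, while 30DAY is only a placeholder; consult the Meerkat documentation for the time periods the service actually accepts.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option names; Example 3-8 hard-codes these settings.
sub parse_options {
    my @args = @_;
    my %opt = (
        period       => '7DAY',   # how far back in time to search
        num          => 15,       # number of items to fetch
        descriptions => 200,      # value sent as "descriptions"
    );
    GetOptionsFromArray(\@args, \%opt,
        'channel=s', 'category=s', 'period=s', 'num=i', 'descriptions=i',
    ) or die "bad command-line options\n";

    die "specify exactly one of --channel or --category\n"
        unless defined($opt{channel}) xor defined($opt{category});
    return \%opt;
}

my $opt = parse_options(qw(--channel perl --period 30DAY --num 5));
printf "%s=%s\n", $_, $opt->{$_} for sort keys %$opt;
```

The resulting hash could then supply the values that get_data currently writes into its XML template as literals.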

The real power of the interface is realized when the application is much more flexible in the messages it can send and the results it can process. Meerkat allows access to almost every search aspect present at the web level from the XML-RPC level. Not only can the searches be more detailed and refined, the nature and content of the results can also be tuned to meet the needs of any given application.



Programming Web Services with Perl
ISBN: 0596002068
Year: 2000
Pages: 123