Building a SOAP Server

The SOAP server we'll create is going to expose two functions for SOAP clients to access. One function returns a list of all the products in the database; this function expects nothing to be passed to it and returns an array of array references (also known as a 'list of lists'). The other function expects a part number to be passed to it and returns an XML document that we'll parse to create a page of detailed information about a product.

We'll begin by taking a look at the code that makes up the SOAP server. This is actually just a CGI program that has a few things added to get it to behave like a SOAP server.

01: #!/usr/bin/perl -w 02: # A simple, SOAP server 03: use strict; 04: use DBI; 05: use LWP::Simple (); 06: use HTML::LinkExtor; 07: use SOAP::Transport::HTTP;

Line 1 tells the system where to find Perl and turns warnings on.

Line 2 is simply a comment about this program.

Line 3 loads the strict module. This module causes Perl to be more picky about what it allows and makes the programmer give more thought to the programming process. Always use strict; it can prevent many common programming mistakes.

Line 4 loads the DBI module. We need to access a database for this program, so we use the DBI module, which is the DataBase Interface for Perl.

Line 5 loads the LWP::Simple module and the () makes sure none of the methods get loaded into our namespace. We explicitly call LWP::Simple::get in our program, so it does not have to be loaded at this point. Our program uses this module for fetching a document from a remote Web server.

Line 6 loads the HTML::LinkExtor module. This module extracts links from an HTML document. We'll be parsing a large HTML document inside of this program; this module makes parsing extremely easy for us.

Line 7 loads the SOAP::Transport::HTTP module. This is how we can easily create a SOAP server that uses the HTTP protocol for its communications.

08: package Catalog; 09: my $dbh = DBI->connect ("DBI:mysql:books:switch.perlguy.net",       "bookuser","") 10:     or die("Cannot connect: $DBI::errstr\nAborting."); 11: my @imgs;

Line 8 declares that we are now inside of the Catalog package. Packages are used to declare different namespaces in Perl.

Line 9 connects us to the database and returns a handle with the connection. When the handle is returned, we store it in the newly declared variable named $dbh (meaning DataBase Handle).

Notice that with this connection string, we provide the driver name (mysql), database (books), and server name (switch.perlguy.net). The database happens to be on a completely different machine than this program.

Line 10 is a continuation of line 9; the die function causes the program to die and to display an error message if there is a problem connecting to the database.

Line 11 declares a new array named @imgs. This newly declared array and the variable named $dbh on line 9 are available to all functions within the Catalog package (namespace), since they are declared outside of any of the subroutines.

12: SOAP::Transport::HTTP::CGI 13:   -> dispatch_to(‘Catalog') 14:   -> handle;

Lines 12-14 create a wrapper around the Catalog package.

The first part, line 12, tells the system that we are creating a SOAP server using the HTTP protocol and that this is going to be run like a CGI-based server. This is all part of the SOAP::Lite package.

The second part, line 13, tells this SOAP server that when someone tries to access a method, he or she must request it to be dispatched to the Catalog package.

The last part, line 14, simply passes control to the handle method so that the requests are handled properly.

These preceding three lines are what make this CGI program a SOAP server.

15: sub Get_Product_List { 16:     my $sql = qq(SELECT title, isbn FROM library); 17:     my $sth = $dbh->prepare($sql); 18:     $sth->execute;

Line 15 begins a subroutine named Get_Product_List. This subroutine is one of the two that we are making available as SOAP services. Get_Product_List takes no arguments; it simply queries the database and returns a list of products.

Line 16 creates a new variable named $sql and stores in it the SQL statement needed to get the data from the database.

Line 17 prepares the SQL statement and stores the prepared statement in the new variable named $sth. STH stands for StatemenT Handle.

Line 18 calls the execute method on the $sth handle. This causes the SQL statement to be run against the database, the results of which are also stored as a reference in the $sth handle.

19:     my @products; 20:     while(my $ptr = $sth->fetch){ 21:         push (@products, [ @$ptr ]);  22:     }

Line 19 declares a new array named @products.

Line 20 begins a while loop that loops as long as the $sth->fetch call returns data. Once the fetch call runs out of data to retrieve, $ptr is set to undef and the while loop terminates.

Line 21 pushes a reference to a new anonymous array holding the current values onto the @products array. Each time it goes through the loop, $ptr stores a reference to an array that contains the current values. To get the array values, you can prefix the reference with @. The brackets ([]) around @$ptr cause the current values of the array to be copied into a new anonymous array. A reference to the new anonymous array gets pushed onto the @products array. Each new anonymous array is its own array and occupies its own memory.

If we try to pass a reference to the array by using $ptr, we will overwrite the data each time through the loop, as we are storing it in the same array memory address each time. At the end of the loop, we then end up with a bunch of arrays that all point to the last record we have accessed-which is not what we want to do.

Line 22 ends the while loop, which begins on line 20.

23:     return(\@products); 24: }

Line 23 returns a reference to the @products array. And that's what any SOAP client calling this routine will get back: a list of lists.

Line 24 ends the Get_Product_List subroutine.

25: sub Get_Product_Data { 26:     my $rval; 27:     my ($class, $input) = @_; 28:     die "Bad ISBN Number."            unless($input =~ m/^[\dXx-]{10,13}$/); 29:     $Catalog::input     = $input;

Line 25 begins the Get_Product_Data subroutine. This subroutine is also accessible through the SOAP server. This subroutine expects to be passed a part number of the item that the user wants more details about. This part number is stored in the $input variable.

Line 26 declares a new scalar variable named $rval.

Line 27 declares two new scalar variables. $class is a reference to the package itself; this value is automatically passed. The $input variable is the value that the function call passes; it should contain the part number of the item we are looking for.

Line 28 calls the die function unless the value passed to the $input variable matches the regular expression. The regular expression checks to ensure that only digits, dashes, and the letter X appear in the $input variable. The regular expression also checks to ensure the value in $input is between 10 and 13 characters long. This allows us to validate that we got the type of input that we expected. Allowing any arbitrary data would be a security risk, so we want to avoid that.

Line 29 sets a package-wide variable named $Catalog::input to the value $input. The $input variable we declare on line 27 has a scope of this current subroutine only. But, by declaring the variable $Catalog::input, we effectively create a new variable that is accessible within the Catalog package.

30:    my $sql  = "SELECT * FROM library WHERE isbn = ‘$input'"; 31:    my $p       = $dbh->selectrow_hashref($sql); 32:    $p->{image} = _get_image($input);

Line 30 declares a new scalar variable named $sql and sets it equivalent to the SQL statement on the right side.

Line 31 creates a new scalar variable named $p, which we use as a pointer to the value the selectrow_hashref method call returns. selectrow_hashref is yet another way to get data from a database by using the DBI. Notice that we do not have to prepare and execute this statement. Since we are getting only one item back, and we know this ahead of time, we can use the selectrow_hashref method. This method is referenced directly from the database handle ($dbh), and the SQL statement is passed directly to it. Once the selectrow_hashref does its magic, the variable $p will contain a reference to a hash, which contains the data that has been returned.

Line 32 sets the hash item at key image to the value that a call to the _get_image subroutine returns.

33:     if($p){ 34:         $rval  = qq(<?xml version="1.0"?>\n); 35:         $rval .= qq( <catalog_item>\n);

Line 33 checks to see if the variable $p contains some sort of data. If it does, this block is entered.

Line 34 stores some text in the $rval variable. This text is the XML header for the data being returned.

Line 35 adds the XML tag <catalog_item> to the $rval variable.

36:         while(my($k,$v) = each %$p) { 37:             if($v) { 38:                 $v     =~ s/^ +//; 39:                 $v     =~ s/(\015?\012)$//; 40:                 $rval .= "  <$k>$v</$k>\n"; 41:             } 42:         }

Line 36 begins a while loop that traverses through each hash element in the hash that the $p variable references. For each hash element, the $k and $v variables are set to the key and value, respectively.

Line 37 uses an if condition to check if the current value of the hash ($v) contains any data. If there is data, the program continues into the block of code.

Line 38 uses a regular expression to remove any leading spaces in $v. The ^ means at the beginning of the string; then we have a space, which is what we are looking for, and finally a +, which means one or more. So this regular expression roughly translates to remove all spaces from the beginning of the string.

Line 39 removes the carriage returns and/or linefeeds from the end of the string. The \015 is the octal value for a carriage return. The ? means that there may or may not be a carriage return-if there is, remove it; if there is not, that is ok too. The \012 is the octal value for a linefeed. The $ at the end means remove them only if they are at the end of the string.

Line 40 creates a new XML element that contains the value of the current hash element. The XML element name is the key value, $k. So if the current key were name and the current value were Logan, the tag generated would be <name>Logan</name>.

Line 41 closes the if block, which begins on line 37.

Line 42 closes the while block, which begins on line 36.

43:         $rval .= qq( </catalog_item>\n); 44:     } 45:     $dbh->disconnect; 46:     return($rval); 47: }

Line 43 adds a closing tag to the string contained in $rval.

Line 44 closes the block of code that begins on line 33.

Line 45 disconnects the program from the database.

Line 46 returns $rval. $rval contains the XML document as a large string. The XML document in $rval is the data for the product information.

Line 47 ends the Get_Product_Data subroutine.

48: sub _get_image { 49:     my $input = $Catalog::input; 50:     my $url = "http://www.amazon.com/exec/obidos/ASIN/" .            $input;

Line 48 begins the _get_image subroutine. This subroutine is prefixed with an underscore, and the name is in lowercase because this subroutine is only used internally by other subroutines in this program. This is a common practice so that you can easily see which subroutines are "private" and which subroutines are "public."

Line 49 declares a new scalar variable named $input and sets it to the value that is in $Catalog::input.

Line 50 declares a new scalar variable named $url and creates a URL based upon a fixed string and the value that is in the $input variable. The value in the $input variable should be the part number.

51:     my $p = HTML::LinkExtor 52:              ->new(\&_callback) 53:              ->parse(LWP::Simple::get($url));

Line 51 creates a new scalar variable named $p and sets it to the value returned by the call to HTML::LinkExtor. This call to HTML::LinkExtor spans lines 51-53.

Line 52 calls the new method from HTML::LinkExtor and passes it a reference to the _callback function. The _callback function is called when we pass data to the $p reference.

Line 53 calls the parse method of HTML::LinkExtor. This method is passed the value that is returned from the call to LWP::Simple::get($url). LWP::Simple:: get($url) goes out on the Internet and fetches the HTML file that $url points to. The HTML page that this call returns is parsed by the LinkExtor module.

The _callback function, which we will cover shortly, determines what data to retrieve from the HTML page we have just fetched.

54:     return($imgs[0]); 55: } 

Line 54 returns the value of the first element in the @imgs array. This value should contain a link to the picture of the book on the Amazon Web site.

Line 55 ends the _get_image subroutine.

56: sub _callback { 57:     my($tag, %attr) = @_; 58:     return if $tag ne ‘img';

Line 56 begins the _callback subroutine. This subroutine is used to filter out the data we don't want from the HTML page that the _get_image subroutine has fetched.

The LinkExtor module parses all of the links out of the Web page. These links are then passed along to the _callback function, along with any attributes that are in the HTML tag.

Line 57 declares a scalar variable and a hash and stores the values passed to the subroutine. $tag contains the current tag, and %attr contains the attributes that are in that tag on the HTML page.

Line 58 returns from the subroutine if the current tag is not an image tag.

59:     my @foo = values %attr; 60:     return if($foo[0] !~ /$Catalog::input/i); 61:     push(@imgs, values %attr); 62: }

Line 59 declares a new array named @foo and stores the values from the %attr array in it.

The %attr hash would contain a key of src and a value of /images/foo.png if we had an image tag like this: <img src="/books/2/889/1/html/2//images/foo.png">.

Line 60 returns if the first element of the @foo array doesn't contain the data stored in $Catalog::input. The value in $Catalog::input is the part number (or ISBN in the case of this example).

Line 61 pushes the current values into the @imgs array.

Line 62 ends the _callback subroutine.

That is it! It takes 62 lines of code to make up a SOAP server that contains two useful commands.



Perl Database Programming
Perl Database Programming
ISBN: 0764549561
EAN: 2147483647
Year: 2001
Pages: 175

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net