Many corporations now offer functionality to the public via XML Web Services. One such company is Google, Inc., which offers the use of its search engine as a service. You can learn more about this at http://www.google.com/apis. This section demonstrates a small add-on to a web application that uses keywords to show "related links" to the user. We will generate the keywords manually and focus on using the Google APIs.

Setting Up to Use the Google APIs

Before you can begin using the Google search functionality from within web applications, you need to perform three steps, outlined at http://www.google.com/apis.

For the first step, download the developer kit from the Google site. This contains a number of files and classes for Java and Microsoft's .NET platform. We will not need any of these. Pay attention only to the .wsdl file included in the kit.

In the second step, you need to set up an account with Google. Because Google graciously offers this service free of charge, it places some restrictions on its use:
For our demonstration purposes, these terms are entirely reasonable and not at all restrictive.

In the final step, we unpack the developer kit .zip file and place the GoogleSearch.wsdl in a location where we can access it from our web application. For this sample, we placed it in the same directory as our scripts.

After you have created an account, Google will send you an e-mail with your license key, which must be passed along with queries sent to Google.

Learning More About the Service

The first thing we will do is learn more about the functionality offered by the Google APIs and the GoogleSearch.wsdl file included in the developer kit. Although we could look through the WSDL document and try to figure out what the methods are, we have another means at our disposal: the __getFunctions method on SoapClient. This enables us to verify that everything is working properly with SOAP and saves us looking through some potentially complicated XML.

To demonstrate, we write this simple script to list all of the methods available to us through the APIs:

    <?php
    try {
        //
        // first load the .wsdl file that Google provides with
        // its API download.
        //
        $sc = @new SoapClient('GoogleSearch.wsdl');

        //
        // next, we'll show a list of all the API functions that
        // this WSDL file contains:
        //
        $fns = @$sc->__getFunctions();
        foreach ($fns as $fn) {
            //
            // these first three lines just extract the appropriate
            // parts from the API string.
            //
            preg_match('/ [[:alnum:]]*\(/', $fn, $res);
            $api = substr($res[0], 0, strlen($res[0]) - 1);
            preg_match('/\(.*\)/', $fn, $res);
            echo "<b>$api</b>: $res[0]<br/><br/>\n";
        }
    } catch (SoapFault $sf) {
        echo "SOAP Error: <b>$sf->faultstring</b><br/>\n";
    } catch (Exception $e) {
        $msg = $e->getMessage();
        echo "Unknown Exception: <b>$msg</b><br/>\n";
    }
    ?>

(The preg_match calls exist strictly to help us extract portions of the function signature for formatted output. See whether you can figure out how they work.) The output of this script looks like this:

    doGetCachedPage: (string $key, string $url)
    doSpellingSuggestion: (string $key, string $phrase)
    doGoogleSearch: (string $key, string $q, int $start, int $maxResults, boolean $filter, string $restrict, boolean $safeSearch, string $lr, string $ie, string $oe)

We can see that the XML Web Service exposes three methods. We will concern ourselves with the doGoogleSearch method and leave learning about the others (at http://www.google.com/apis/reference.html) as an exercise for you.

How the Search Works

The doGoogleSearch web method has a reasonably large function signature, requiring 10 parameters, as listed in the following table.
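
Because all 10 arguments are positional, it is easy to put one in the wrong slot. The following is a minimal sketch of one way to guard against that, not part of the Google developer kit: a hypothetical helper named buildSearchArgs that accepts named options and returns the arguments in the order shown in the doGoogleSearch signature above. The default values simply mirror what this section's sample passes and are assumptions, not defaults documented by Google.

```php
<?php
// Hypothetical helper (not from the Google kit): turns named options into
// the positional argument list expected by doGoogleSearch. The option names
// and their order are taken from the __getFunctions() output.
function buildSearchArgs($key, $query, array $options = array())
{
    // Assumed defaults, chosen to match the values used in this sample.
    $defaults = array(
        'start'      => 0,
        'maxResults' => 10,
        'filter'     => false,
        'restrict'   => '',
        'safeSearch' => false,
        'lr'         => '',
        'ie'         => '',
        'oe'         => '',
    );

    // Reject misspelled option names early instead of silently ignoring them.
    $unknown = array_diff_key($options, $defaults);
    if (count($unknown) > 0) {
        throw new InvalidArgumentException(
            'Unknown option(s): ' . implode(', ', array_keys($unknown)));
    }

    // For string keys, array_merge overwrites values in place, so the
    // result keeps the key order of $defaults (the signature order).
    $merged = array_merge($defaults, $options);

    // Key and query come first, followed by the eight remaining values.
    return array_merge(array($key, trim($query)), array_values($merged));
}
```

The returned array could then be handed to SoapClient's __soapCall method, for example $sc->__soapCall('doGoogleSearch', $args), rather than spelling out each argument in the call.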
The function returns an object with the following structure:

    class stdClass
    {
        public $documentFiltering;           // true or false
        public $searchComments;              // comments from Google
        public $estimatedTotalResultsCount;  // total num. of results
        public $estimateIsExact;             // estimated or actual
        public $resultElements;              // array of result objs
        public $searchQuery;                 // the submitted query
        public $startIndex;                  // start index of results
        public $endIndex;                    // end index of results
        public $searchTips;                  // tips from Google
        public $directoryCategories;         // ODP category
        public $searchTime;                  // how long it took
    }

Most of the members are intuitive except for the $resultElements member (and $directoryCategories, which we will not use). The result elements are returned in an array of objects, each of which is as follows:

    class stdClass
    {
        public $summary;                     // summary from ODP dir
        public $URL;                         // URL of result
        public $snippet;                     // quick desc of result
        public $title;                       // title of the page
        public $cachedSize;                  // if not 0, cache avail.
        public $relatedInformationPresent;   // true means available
        public $hostName;                    // returned when filtering
        public $directoryCategory;           // ODP category
        public $directoryTitle;              // ODP category title
    }

In both of these objects, ODP refers to the Open Directory Project, an attempt to create a global directory of the Internet. Google uses this in its searches whenever possible.

With an idea of how to use the doGoogleSearch function and an idea of what it is going to return to us, we can write the main portion of our sample.

Searching for Keywords

To do our work, we will write a GoogleKeywords class, with a public static method called findAndPrintRelatedPages. This class and this first method are as follows:

    define('GOOGLE_LICENSE_KEY', 'secret');    // from Google
    define('RESULTS_PER_PAGE', 10);            // Google's limit

    class GoogleKeywords
    {
        //
        // this function takes a string containing keywords to
        // search for through Google and prints out the top
        // 10 results as returned by Google.
        //
        public static function findAndPrintRelatedPages($in_keywords)
        {
            try {
                // we need the .wsdl file to make this work!
                $sc = @new SoapClient('GoogleSearch.wsdl');

                // full documentation for this method can be found
                // at http://www.google.com/apis/reference.html
                $results = @$sc->doGoogleSearch(
                    GOOGLE_LICENSE_KEY,     // Google key
                    trim($in_keywords),     // query string
                    0,                      // starting index
                    RESULTS_PER_PAGE,       // max # results
                    FALSE,                  // filter output?
                    '',                     // pref. country
                    FALSE,                  // SafeSearch on?
                    '',                     // preferred lang
                    '',                     // ignored
                    ''                      // ignored
                );

                // start the page and summarize the results:
                self::emitSearchSummary($results);

                // now show the results:
                foreach ($results->resultElements as $resultObject)
                    self::emitSearchResult($resultObject);
            } catch (SoapFault $sf) {
                echo "SOAP Fault Occurred: {$sf->faultstring}<br/>\n";
            } catch (Exception $e) {
                echo "Exception Occurred: {$e->getMessage()}<br/>\n";
            }
        }
    }

This method calls two others, both private static methods that also belong inside the GoogleKeywords class: the emitSearchSummary function

    private static function emitSearchSummary($in_results)
    {
        echo <<<EOHEADER
    <br/>
    Google found approximately
    <em>$in_results->estimatedTotalResultsCount</em>
    pages related to this one.<br/><br/>
    Showing the first ten:<br/>
    EOHEADER;
    }

and the emitSearchResult function:

    private static function emitSearchResult($in_result)
    {
        echo <<<EORESULT
    <table width='70%' border='0' cellspacing='0' cellpadding='0'>
    <tr>
      <td width='100%' bgcolor='#ebecca'>
        <a href='$in_result->URL'>
          <b>$in_result->title</b>
        </a>
      </td>
    </tr>
    <tr>
      <td>
        $in_result->snippet<br/>
      </td>
    </tr>
    <tr>
      <td bgcolor='#fbfcda'>
        <a href='$in_result->URL'>$in_result->URL</a>
      </td>
    </tr>
    </table>
    <br/><br/>
    EORESULT;
    }

With all this ready to go, we just need to write the page to use it. We have written a small script called showarticle.php, which has three "dummy" articles including keywords.
It randomly selects one of these, prints the (single-sentence) article, and then tells the GoogleKeywords class to print the related pages:

    <?php
    ob_start();

    // this will let us show keywords for this article.
    include('google_keywords.inc');
    ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml"
          lang="en-US" xml:lang="en-US">
    <head>
      <title>Display Article</title>
      <meta http-equiv="content-type"
            content="text/html; charset=utf-8"/>
    </head>
    <body>
    <?php
    //
    // to keep this sample simple, we're going to use some fake
    // article placeholders here and just associate some keywords
    // with them. We will randomly select one of these articles
    // to display.
    //
    $articles = array(
        array('keywords' => 'Jose Maria Aznar biography',
              'article' => 'All about Jose Maria Aznar, former prime minister of Spain.'),
        array('keywords' => 'Egyptian Mau cats',
              'article' => 'Egyptian Mau cats are adorable, but quite expensive, and surprisingly annoying at 6.00 in the morning!'),
        array('keywords' => 'uralo altaic hypothesis',
              'article' => 'The Uralo-Altaic Hypothesis suggests that languages such as Turkish and Japanese are genetically related, but is losing favour.')
    );

    //
    // randomly select and display an article. rand() includes both
    // endpoints, so the upper bound is the last valid index.
    //
    $use = rand(0, count($articles) - 1);
    echo <<<EOT
    <h2>The Article</h2>
    <hr size='1'/>
    <p align='left'>
      {$articles[$use]['article']}
    </p>
    <hr size='1'/>
    <br/><br/>
    EOT;

    //
    // now display the related matches against their keywords.
    //
    GoogleKeywords::findAndPrintRelatedPages(
        $articles[$use]['keywords']);
    ?>
    </body>
    </html>
    <?php
    ob_flush();
    ?>

The output of this page might look something like that shown in Figure 27-3.

Figure 27-3. Running our keywords XML Web Service sample.

With this sample, you should have a good idea how easy it is to integrate XML Web Services into your applications and how powerful they can be.
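
The sample shows only the first ten results. Because doGoogleSearch accepts a starting index as its third argument, later pages could be requested by advancing that value. The sketch below is one possible way to work out the starting indexes. It assumes the index is zero-based (the sample passes 0 for the first page) and that each call returns at most RESULTS_PER_PAGE results. The function name pageStartIndexes is hypothetical.

```php
<?php
// Hypothetical helper: returns the zero-based starting indexes needed to
// fetch up to $wanted results in pages of $perPage results each.
function pageStartIndexes($wanted, $perPage = 10)
{
    // A page size below 1 would never advance, so treat it as "no pages".
    if ($perPage < 1) {
        return array();
    }

    $starts = array();
    for ($start = 0; $start < $wanted; $start += $perPage) {
        $starts[] = $start;
    }
    return $starts;
}
```

Each value returned would replace the 0 passed as the starting index in findAndPrintRelatedPages, with one doGoogleSearch call made per page.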