Case Study: Monitoring Keywords


With the increasing popularity of the Internet, and of Internet-driven sales for that matter, ranking highly for common keywords has become an integral part of business. An entire industry, SEO (search engine optimization), has evolved to offer complex solutions to customers in an attempt to increase search engine visibility. Whether or not you can afford these services, monitoring your placement within Google for keyword searches is likely of great interest and value to you.

Using the search function discussed earlier in this chapter, monitoring keywords could not be easier. Conceptually, the monitoring system is a set of nested loops. First, you loop through all of the query terms that should be checked. Inside that, you loop through API calls until you receive a result set that contains the result you are looking for (this is necessary because your site may not appear in the first 10 results provided by Google for a monitored keyword). Inside that, you loop through the current result set looking for the appropriate result. If that result is found, its placement is recorded, and the loop is broken to ensure duplicate matches are not recorded (often the same domain will offer several results on the same page). After each query term has been processed, a check is made to determine whether the Google ranking has risen or fallen further than the specified "allowance." If so, a message is generated to be sent to the appropriate party.
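The loop structure described above can be sketched in isolation before any database or API code is involved. This is only an illustration: fakeSearch() and findPlacement() are hypothetical names, the result lists are made-up sample data, and the stand-in search function simply slices an array where the real program would call the Google API.

```php
<?php
// Hypothetical stand-in for the API call: returns one "page" of up to
// 10 result URLs starting at offset $start.
function fakeSearch(array $allResults, int $start): array
{
    return array_slice($allResults, $start, 10);
}

// Returns the 1-based placement of the first result whose URL starts with
// $desiredURL, or null if it is not within the first $maxResults results.
function findPlacement(array $allResults, string $desiredURL, int $maxResults = 50): ?int
{
    $placement = 1;
    for ($start = 0; $start < $maxResults; $start += 10) {    // loop over pages
        foreach (fakeSearch($allResults, $start) as $url) {    // loop over results
            if (strncmp($url, $desiredURL, strlen($desiredURL)) === 0) {
                return $placement;   // stop at the first match
            }
            $placement++;
        }
    }
    return null;
}

// Outermost loop: one pass per monitored query term (sample data only).
$resultsByTerm = [
    'php books' => array_merge(
        array_fill(0, 12, 'http://other.example.net/page'),
        ['http://www.example.com/books', 'http://www.example.com/more']
    ),
    'soap apis' => array_fill(0, 60, 'http://other.example.net/page'),
];
$placements = [];
foreach ($resultsByTerm as $term => $results) {
    $placements[$term] = findPlacement($results, 'http://www.example.com');
}
print_r($placements);
```

Returning from the function on the first match plays the same role as the combination of a found flag and break in the full listing: the second matching result on the same page is never counted.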

Two database tables will be required: the first to contain the Google queries that will be monitored and the second for the results from those searches.

First, the table containing the queries to be monitored:

    CREATE TABLE `06_google_monitor` (
      `query` varchar(25) NOT NULL default '',
      `allowance` int(11) NOT NULL default '0'
    ) ENGINE=MyISAM;

The query field holds the terms that will be checked on a regular basis. The maximum length of 25 characters is an arbitrary choice, but reasonable given its use. The allowance field contains the maximum variance permitted between one check of the term and the next. It can be set to 0 to be informed of any change, or to a higher value to allow some flexibility between scans. Note that any nonzero allowance lets drastic changes accumulate unreported over time, because each check is compared only with the one before it.
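The effect of a nonzero allowance on gradual movement is easy to see with a few numbers. The following sketch uses made-up placements and compares only adjacent checks, as the monitor does; the variable names are illustrative.

```php
<?php
// Sample placements from five consecutive checks (made-up numbers).
$history = [5, 7, 9, 11, 13];
$allowance = 2;

$alerts = 0;
for ($i = 1; $i < count($history); $i++) {
    // Only adjacent checks are compared, as in the monitoring script.
    if (abs($history[$i] - $history[$i - 1]) > $allowance) {
        $alerts++;
    }
}
$totalDrift = end($history) - $history[0];
echo "alerts: $alerts, total drift: $totalDrift\n";
```

No single step exceeds the allowance, so no alert is raised even though the placement has slipped eight positions overall.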

Second, the table containing the placement of those queries:

    CREATE TABLE `06_google_monitor_results` (
      `query` varchar(25) NOT NULL default '',
      `placement` int(11) NOT NULL default '0',
      `timestamp` timestamp NOT NULL
    ) ENGINE=MyISAM;

The query field holds the term that was searched, placement holds the number of the search result, and timestamp holds the timestamp at which the search was performed.

The check to determine whether a given search result has risen or fallen more than the allowance indicated will be performed by the checkResults function.

    function checkResults($searchQuery, $allowance)
    {
      $query = "SELECT placement FROM 06_google_monitor_results
        WHERE `query` = '$searchQuery' ORDER BY timestamp DESC LIMIT 2";
      $recentResults = getAssoc($query, 2);
      $thisRun = $recentResults[0]['placement'];
      $lastRun = $recentResults[1]['placement'];

The query to be examined as well as the allowance is handed to the function. The query will select the two most recent keyword checks, and the getAssoc() function will hand back an array of the results. The $thisRun and $lastRun variables are created merely for clarity, and to ensure things fit on one line.

      if (($thisRun - $lastRun) > $allowance)
      {
        return "Ranking for $searchQuery has dropped from $lastRun to $thisRun\n";
      }
      else if (($lastRun - $thisRun) > $allowance)
      {
        return "Ranking for $searchQuery has increased from $lastRun to $thisRun\n";
      }
      else
      {
        return "";
      }
    }

Checks are done to determine whether there has been a large enough change in the ranking to warrant a message. If so, it is generated and returned.
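The comparison itself can be tried without a database. The sketch below is a hypothetical variant, not the listing above: describeChange() is an invented name, it receives the placements already fetched (most recent first), and it adds a guard for the first run of a term, when there is no earlier result to compare against.

```php
<?php
// Hypothetical helper: takes the placements already fetched
// (most recent first) instead of querying the database.
function describeChange(string $searchQuery, array $placements, int $allowance): string
{
    if (count($placements) < 2) {
        return "";   // first run for this term: nothing to compare against
    }
    [$thisRun, $lastRun] = $placements;
    // A larger placement number means a lower position in the results.
    if (($thisRun - $lastRun) > $allowance) {
        return "Ranking for $searchQuery has dropped from $lastRun to $thisRun\n";
    }
    if (($lastRun - $thisRun) > $allowance) {
        return "Ranking for $searchQuery has increased from $lastRun to $thisRun\n";
    }
    return "";
}

echo describeChange('php books', [13, 4], 2);
echo describeChange('php books', [4], 2);   // single result: prints nothing
```

Keeping the comparison separate from the query makes it possible to check the message logic with sample numbers before pointing it at live data.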

The rest of the program will follow. Note that the error checking presented in previous examples has been omitted for the sake of brevity; it is, however, present in the version you can download from the Wrox website.

    require("../common_db.php");
    require('../lib/nusoap.php');
    $client = new soapclient("http://example.org/googleapi/GoogleSearch.wsdl", true);

The required files are included, and the SOAP object is initialized.

    $desiredURL = "http://www.example.com";
    $length = strlen($desiredURL);
    $message = "";

The $desiredURL should point to your domain. This program will compare each result to this URL seeking to match the desired domain. The length of the URL is used later in combination with substr() to perform that check. Finally, $message will contain any messages relating to a change in placement in any of the result sets.
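To make the prefix comparison concrete, the sketch below applies it to a few sample URLs. matchesPrefix() wraps the substr() test described above; matchesHost() is a stricter alternative offered only as an assumption about what you might prefer, not as the book's method. Both function names and all URLs are illustrative.

```php
<?php
$desiredURL = "http://www.example.com";
$length = strlen($desiredURL);

// The prefix test described in the text.
function matchesPrefix(string $url, string $desiredURL, int $length): bool
{
    return substr($url, 0, $length) == $desiredURL;
}

// A stricter alternative (an assumption, not the book's method):
// compare the host names of the two URLs, ignoring case.
function matchesHost(string $url, string $desiredURL): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    $wanted = parse_url($desiredURL, PHP_URL_HOST);
    return is_string($host) && is_string($wanted) && strcasecmp($host, $wanted) === 0;
}

var_dump(matchesPrefix("http://www.example.com/books", $desiredURL, $length));
var_dump(matchesPrefix("http://www.example.com.mirror.example.net/", $desiredURL, $length));
var_dump(matchesHost("http://www.example.com.mirror.example.net/", $desiredURL));
```

The second call shows the main weakness of a pure prefix test: a longer host name that merely begins with your domain also matches. Ending $desiredURL with a trailing slash is another simple way to avoid this.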

    $query = "SELECT * FROM 06_google_monitor";
    $searchTerms = getAssoc($query);
    foreach($searchTerms as $term)
    {

All of the query terms are retrieved from the database and stored in the $searchTerms array to be iterated through. This allows you to easily add more terms later; just add them to the database.

      $placement = 1;
      $start = 0;
      $found = 0;
      $searchQuery = $term['query'];
      $allowance = $term['allowance'];

This code contains a lot of variable assignments. $placement will be incremented with each result, so the position of a result within the set can be accurately determined. This is needed because the Google API does not return the result's placement with the information. $start will be used as usual, to indicate the desired offset when requesting results from the API. $found will indicate whether the specified domain has been found, to break out of the upcoming while loop. $searchQuery and $allowance indicate the query and allowance for the current item; they exist only for clarity.

      while ($found == 0 && $start < 50)
      {
        // Objects are passed by handle in PHP 5 and later, so no call-time & is used.
        $result = runGoogleSearch($client, $searchQuery, $start);
        $queryResults = $result['resultElements'];
        foreach($queryResults as $item)
        {

This while loop is required in the event that the desired result isn't in the first 10 results. The second conditional ($start < 50) is used to avoid creating an endless loop looking for a search result that isn't there; 50 seems appropriate because if you aren't on the first five pages of search results, it's unlikely users will find you, so tracking seems meaningless. Next, the API call is made (note this call is made directly, rather than using the caching functions introduced earlier), and as usual the results to the query are identified and iterated through.

          if(substr($item['URL'], 0, $length) == $desiredURL)
          {
            $query = "INSERT INTO 06_google_monitor_results (`query`, `placement`, `timestamp`)
              VALUES ('$searchQuery', '$placement', null)";
            insertQuery($query);
            $found = 1;
            break;
          }
          $placement++;

Here a check is made to determine if this result matches the desired URL. When this occurs, the location of the result, as well as the time it was located, are saved to the database. $found is set to 1 to break from the while loop, and the break statement is used to exit the foreach loop. This allows the code to continue with the next query term, if applicable. If this wasn't the desired result, the $placement value is incremented.

Note 

I know the break; statement in there looks ugly and seems unnecessary, but it is needed. Without the break statement, the same check would be performed on the remaining results in this set. Should the desired URL show up again, it would be recorded again. Then, when the check is made to determine whether the ranking has changed, it would compare the two results from this scan and report that you moved up in ranking, every time. Structurally it is possible to write around that and remove the break, but this method seems the clearest.

        }
        $start = $start + 10;
      }
      if ($found == 0)
      {
        $query = "INSERT INTO 06_google_monitor_results (`query`, `placement`, `timestamp`)
          VALUES ('$searchQuery', '999', null)";
        // Save the placeholder row so checkResults() has a value for this run.
        insertQuery($query);
      }
      $message .= checkResults($searchQuery, $allowance);
    }

The start value is incremented so that the next API call deals with the next set of results. If, after scanning multiple sets of results, the desired URL was never found, a suitably high value (999) is recorded in the database as a placeholder. Finally, regardless of whether a result was found, checkResults() is called to report on the activity for the query in question.

This example hasn't included any code showing what to do with the $message variable; it is really up to you. If this code runs as part of a job whose output is mailed to an administrator (a scheduled task, for example), you could simply echo the messages. Alternatively (and more attractively), email them to someone in marketing who gets paid to worry about this sort of thing.
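One way to handle $message is sketched below. It is only a suggestion: buildAlert() is a hypothetical helper, the address and subject are placeholders, and the call to PHP's mail() function is left commented out because it needs a host with a working mail transport.

```php
<?php
// Hypothetical helper: returns the pieces needed for mail(), or null
// when there is nothing to report. Address and subject are placeholders.
function buildAlert(string $message, string $to = 'marketing@example.com'): ?array
{
    if (trim($message) === '') {
        return null;
    }
    return [
        'to'      => $to,
        'subject' => 'Google ranking changes',
        'body'    => $message,
    ];
}

$message = "Ranking for php books has dropped from 4 to 13\n";
$alert = buildAlert($message);
if ($alert !== null) {
    // Uncomment on a host with a configured mail transport:
    // mail($alert['to'], $alert['subject'], $alert['body']);
    echo "Would mail {$alert['to']}:\n{$alert['body']}";
}
```

Skipping the mail entirely when $message is empty keeps quiet days quiet, which matters if the script runs frequently.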

Because all results are saved over time, it would be easy to either generate a graph with PHP or export the data to another program to generate one for you. Showing the movement of key query terms over time can present interesting information.
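As a rough sketch of the export route, the following turns rows shaped like those in the results table into CSV text that a spreadsheet or charting tool could read. The rows are sample data and toCsv() is a hypothetical name; in practice the rows would come from a query against 06_google_monitor_results.

```php
<?php
// Rows shaped like those in 06_google_monitor_results (sample data only).
$rows = [
    ['query' => 'php books', 'placement' => 4,  'timestamp' => '2006-01-01 06:00:00'],
    ['query' => 'php books', 'placement' => 13, 'timestamp' => '2006-01-02 06:00:00'],
];

// Write the rows as CSV into an in-memory stream and return the text.
function toCsv(array $rows): string
{
    $fh = fopen('php://memory', 'w+');
    fputcsv($fh, ['query', 'placement', 'timestamp'], ',', '"', '\\');
    foreach ($rows as $row) {
        fputcsv($fh, [$row['query'], $row['placement'], $row['timestamp']], ',', '"', '\\');
    }
    rewind($fh);
    $csv = stream_get_contents($fh);
    fclose($fh);
    return $csv;
}

echo toCsv($rows);
```

Writing to a file instead only requires replacing php://memory with a path and dropping the rewind and read steps.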

Possible Changes to This Code

This code has been structured in such a way as to allow a lot of changes. Here are a couple of ideas on how to use it differently.

Monitor Page Placement, Rather Than Domain Placement

Add an additional field to the monitor table for the specific URL you want to monitor, and use it to populate the $desiredURL variable on each iteration through the query set. This has the added benefit of being able to track pages from multiple domains.
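A minimal sketch of that change follows, assuming the new column is called url (a hypothetical name) and that the rows arrive as associative arrays, as they do from getAssoc() in the main listing. The sample rows stand in for the database query.

```php
<?php
// Rows as they might come back from the monitor table once a `url`
// column (a hypothetical name) has been added. Sample data only.
$searchTerms = [
    ['query' => 'php books', 'allowance' => 2, 'url' => 'http://www.example.com/books'],
    ['query' => 'soap apis', 'allowance' => 0, 'url' => 'http://www.example.org/soap'],
];

$targets = [];
foreach ($searchTerms as $term) {
    // Set per term instead of once before the loop.
    $desiredURL = $term['url'];
    $length = strlen($desiredURL);
    $targets[$term['query']] = [$desiredURL, $length];
}
print_r($targets);
```

Because $length depends on $desiredURL, it has to be recalculated inside the loop along with it.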

Warn If Results Fall Below a Certain Threshold, Rather Than Track Movement

Treat the allowance field as a minimum acceptable ranking, and examine only the most recent result. If the placement is worse than the specified value (that is, a larger number), email the appropriate parties to inform them that the threshold has been crossed.
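The threshold variant needs only the most recent placement, so the check shrinks to a single comparison. The sketch below assumes a hypothetical checkThreshold() function that is handed the placement directly; in the real script it would first select the latest row for the query.

```php
<?php
// Hypothetical variant: $threshold is the worst acceptable placement.
function checkThreshold(string $searchQuery, int $placement, int $threshold): string
{
    // A larger placement number means a lower position in the results.
    if ($placement > $threshold) {
        return "Ranking for $searchQuery is $placement, below the threshold of $threshold\n";
    }
    return "";
}

echo checkThreshold('php books', 13, 10);
echo checkThreshold('soap apis', 3, 10);   // within the threshold: prints nothing
```

Note that the 999 placeholder recorded for results that were not found will always trip a check like this, which is usually what you want.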




Professional Web APIs with PHP. eBay, Google, PayPal, Amazon, FedEx, Plus Web Feeds
ISBN: 764589547
Year: 2006
Pages: 130
