Hack 83 Spidering GameStop.com Game Prices

Want to be notified when "Army Men: Quest for Some Semblance of Quality" goes on sale at $5.99? With this hack, you'll be able to keep an eye on your most desired (or most derided) video game titles.

All work and no play makes Jack a dull geek. Of course, having to hunt down game prices to figure out what he can afford to play on his PlayStation 2 makes Jack even duller. It's so much better to get a spider to do it for him.

We like GameStop.com (http://www.gamestop.com), a retail site for console and PC video games, so we came up with a simple spider that gathers up information about a certain platform of games. The way the script is written, it gathers information on Xbox games, but, as you'll see, it's easy to adapt the script to other uses.

The Code

Save the following code as gamestop.pl :

 #!/usr/bin/perl -w
 use strict;
 use HTML::TokeParser;
 use LWP::Simple;

 # the magical URL.
 my $url = "http://www.gamestop.com/search.asp?keyword=&platform=26".
           "&lookin=title&range=all&genre=0&searchtype=adv&sortby=title";

 # the magical data. get() returns undef on failure
 # and does not set $!, so report the URL instead.
 my $data = get($url) or die "Couldn't fetch $url\n";

 # the magical parser. it wants a reference to the
 # HTML; a plain string is treated as a filename.
 my $p = HTML::TokeParser->new(\$data);

 # now, find every table that's 510 wide and 75 high.
 while (my $token = $p->get_tag("table")) {
     next unless defined($token->[1]{height});
     next unless defined($token->[1]{width});
     next unless $token->[1]{height} == 75;
     next unless $token->[1]{width} == 510;

     # get our title.
     $p->get_tag("font"); $p->get_tag("a");
     my $title = $p->get_trimmed_text;

     # and our price, minus the dollar sign.
     $p->get_tag("font"); $p->get_tag("/b");
     my $ptoken = $p->get_token;
     my $price = $ptoken->[1];
     $price =~ s/\$//;

     # comma spliced.
     print "\"$title\",$price\n";
 }

Running the Hack

The hack is simple enough. It gathers information about Xbox games, sorted by title, and prints that information in comma-delimited form, as in the following output:

 % perl gamestop.pl
 "4x4 Evolution 2 - Preowned",16.99
 "Aggressive Inline - Preowned",16.99
 "Air Force Delta Storm - Preowned",27.99
 "Alias",49.99
 ...etc...
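
The print statement wraps each title in double quotes but does nothing about a title that itself contains a double quote, which would break the comma-delimited output. Here is a minimal sketch of one way around that, assuming the common CSV convention of doubling embedded quotes. The helper names csv_field and csv_line are hypothetical and not part of the original script, and the second title is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: quote one field, doubling any embedded quotes
# (the usual CSV convention; an assumption, not something the hack does).
sub csv_field {
    my ($value) = @_;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Hypothetical helper: build one line in the same shape the hack
# prints, i.e. a quoted title followed by a bare price.
sub csv_line {
    my ($title, $price) = @_;
    return csv_field($title) . ",$price\n";
}

print csv_line('Alias', '49.99');
print csv_line('Some "Quoted" Game', '9.99');
```

In the script itself, the print line inside the loop could then become print csv_line($title, $price);.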

It's very basic right now, but there's some fun stuff we can build in.

Hacking the Hack

Let's start by making the request keyword-based instead of platform-based; maybe you're interested in racing games and don't care about the platform.

GameStop by keyword

Add these two lines to the top of the script, after the use statements:

 # get our query, else die miserably.
 my $query = shift @ARGV;
 die unless $query;

Then, change your magical URL, like this:

 # the magical URL.
 my $url = "http://www.gamestop.com/search.asp?keyword=$query&platform=".
           "&lookin=title&range=all&genre=0&searchtype=adv&sortby=title";
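
Because $query is dropped into the URL as typed, a multiword keyword such as "quad power" would put a raw space into the request. Here is a minimal sketch of percent-encoding the keyword first. It is hand-rolled to avoid extra dependencies (the uri_escape function in the CPAN module URI::Escape does the same job), and the names escape_query and search_url are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything outside the RFC 3986
# "unreserved" set. Assumes a byte string (ASCII or already-encoded UTF-8).
sub escape_query {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical helper: the hack's keyword URL with the keyword escaped.
sub search_url {
    my ($query) = @_;
    return "http://www.gamestop.com/search.asp?keyword=" . escape_query($query) .
           "&platform=&lookin=title&range=all&genre=0&searchtype=adv&sortby=title";
}

print search_url("quad power racing"), "\n";
```

With this in place, the magical URL line could read my $url = search_url($query);.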

This gives you the first page of results for your keyword. For example:

 % perl gamestop.pl racing
 "All Star Racing",7.99
 "Andretti Racing - Preowned",9.99
 "Andretti Racing - Preowned",7.99
 "Antz Extreme Racing - Preowned",16.99
 "Antz Racing",4.99
 "Antz Racing - Preowned",29.99
 "ATV Quad Power Racing 2 - Preowned",24.99
 "ATV Quad Power Racing 2 - Preowned",17.99
 "ATV Quad Power Racing 2 - Preowned",17.99
 "ATV: Quad Power Racing 2",19.99
 "Batman: Gotham City Racer - Preowned",27.99
 "Beetle Adventure Racing - Preowned",29.99

Putting the results in a different format

Of course, getting the results in a comma-delimited format might not be what you want. How about sorting results by price and saving them to an RSS file, so you can have an RSS feed of the cheapest games that match a keyword? (Unabashed capitalist hackers could even add an affiliate code to the link URL.)

Here's how to do it. The first thing you want to do is add use XML::RSS to the use lines at the top of the script. Then, as in the first example, you can add the query word from the command line, or you can hardcode it into the query. In this example, I hardcode it into the query, with the idea that you can add this to your server and run it as a cron job periodically:

 # the magical URL.
 my $url = "http://www.gamestop.com/search.asp?".
           "keyword=your search keyword here&platform=".
           "&lookin=title&range=all&genre=0&searchtype=adv&sortby=title";

Now, you want to change the output from a comma-delimited file to an RSS feed. Remove these lines:

 # comma spliced.
 print "\"$title\",$price\n";

and add these lines above the magical URL line:

 # start the RSS feed.
 my $rss = XML::RSS->new(version => '0.91');
 $rss->channel(
     'link'       => "http://www.gamestop.com",
     title        => "Game Prices from GameStop",
     description  => "Great Games and Stuff!"
 );

Then, inside the while loop, where the print line used to be, add the lines that put each game into the RSS feed:

 # add this item to our RSS feed.
 $rss->add_item(
    title       => "$title, $price...",
    'link'      => "http://www.gamestop.com/search.asp?keyword=$title".
                   "&platform=0&lookin=title&range=all&genre=0&sortby=title"
 );

Finally, add this as the last lines of the script, to save your output as a feed:

 # and save our RSS.
 $rss->save("gamestop.rdf");
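
As listed, the feed keeps the order the site returns (sorted by title), while the stated goal was a feed of the cheapest games first. One way to get there is to collect the games in an array inside the loop and sort them before adding them to the feed. The sketch below assumes that approach. The @games array and the by_price helper are hypothetical names, and the sample entries are taken from the output shown earlier.

```perl
use strict;
use warnings;

# Hypothetical stand-in for what the while loop would collect: one hash
# per game, pushed where the add_item call currently sits.
my @games = (
    { title => "All Star Racing",                price => "7.99"  },
    { title => "Antz Extreme Racing - Preowned", price => "16.99" },
    { title => "Antz Racing",                    price => "4.99"  },
);

# Hypothetical helper: cheapest first, with the title as a tie-breaker.
sub by_price {
    my @items = @_;
    return sort {
        $a->{price} <=> $b->{price} or $a->{title} cmp $b->{title}
    } @items;
}

for my $game (by_price(@games)) {
    # In the real script, this is where $rss->add_item(...) would go.
    print "$game->{title}, \$$game->{price}\n";
}
```

The call to $rss->save stays where it is, after the sorted items have been added.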

There are several minor hacks you can try with this script. GameStop.com offers several different search options; try experimenting with the different searches and see how they impact the result URLs. Experimenting with the URL options in the magical URL lines can get you lots of different results. Likewise, as written, the script reports on the first page of results; catering to the entire listing of search results can be done with WWW::Mechanize [Hack #21] or a manual loop.
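
The manual loop mentioned above might look roughly like the following sketch. It is deliberately offline: the fetch and parse steps are passed in as code references, and canned pages stand in for the site, because nothing here establishes how GameStop's search paginates. The collect_pages name is hypothetical, and so is the page parameter mentioned in the comment; check the site's own "next page" link for the real one.

```perl
use strict;
use warnings;

# Hypothetical helper: walk result pages until one comes back empty or
# the page cap is reached. $fetch takes a page number and returns that
# page's content (or undef); $parse returns a list of [title, price] pairs.
sub collect_pages {
    my ($fetch, $parse, $max_pages) = @_;
    my @all;
    for my $page (1 .. $max_pages) {
        my $html = $fetch->($page);
        last unless defined $html;
        my @items = $parse->($html);
        last unless @items;    # an empty page means we ran out of results
        push @all, @items;
    }
    return @all;
}

# Canned pages in a made-up "title|price" format, so the loop runs offline.
# In the real script, $fetch might be sub { get("$url&page=$_[0]") }, where
# "page" is a placeholder parameter name, and $parse would wrap the
# HTML::TokeParser loop from the hack.
my %pages = (
    1 => "All Star Racing|7.99\nAntz Racing|4.99\n",
    2 => "ATV: Quad Power Racing 2|19.99\n",
    3 => "",
);
my $fetch = sub { $pages{ $_[0] } };
my $parse = sub { map { [ split /\|/ ] } grep { length } split /\n/, $_[0] };

my @games = collect_pages($fetch, $parse, 10);
print scalar(@games), " games found\n";
```

The page cap keeps a parsing mistake from turning into an endless crawl; a real spider should also pause between requests.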



Spidering Hacks
ISBN: 0596005776
EAN: 2147483647
Year: 2005
Pages: 157
