#71 Building Web Pages on the Fly

Many websites have graphics and other elements that change on a daily basis. Good examples are the sites associated with specific comic strips, such as Kevin & Kell, by Bill Holbrook. On his site, the home page always features the most recent strip, and it turns out that the image-naming convention the site uses for the strip is easily reverse-engineered, allowing you to include the cartoon on your own page.

A word from our lawyers  

There are a lot of copyright issues to consider when scraping the content off another website for your own. For this example, we received explicit permission from Bill Holbrook to include his comic strip in this book. I encourage you to get permission to reproduce any copyrighted materials on your own site before you dig yourself into a deep hole surrounded by lawyers.

The Code

    #!/bin/sh

    # kevin-and-kell.cgi - Builds a web page on the fly to display the latest
    #     strip from the cartoon strip Kevin and Kell, by Bill Holbrook.
    #     <Strip referenced with permission of the cartoonist>

    month="$(date +%m)"
    day="$(date +%d)"
    year="$(date +%y)"

    echo "Content-type: text/html"
    echo ""

    echo "<html><body bgcolor=white><center>"
    echo "<table border=\"1\" cellpadding=\"2\" cellspacing=\"1\">"
    echo "<tr bgcolor=\"#000099\">"
    echo "<th><font color=white>Bill Holbrook's Kevin &amp; Kell</font></th></tr>"
    echo "<tr><td><img "

    # Typical URL: http://www.kevinandkell.com/2003/strips/kk20031015.gif

    echo -n " src=\"http://www.kevinandkell.com/20${year}/"
    echo "strips/kk20${year}${month}${day}.gif\">"

    echo "</td></tr><tr><td align=\"center\">"
    echo "&copy; Bill Holbrook. Please see "
    echo "<a href=\"http://www.kevinandkell.com/\">kevinandkell.com</a>"
    echo "for more strips, books, etc."
    echo "</td></tr></table></center></body></html>"

    exit 0

How It Works

A quick View Source of the home page for Kevin & Kell reveals that the URL for the graphic is built from the current year, month, and day, as demonstrated here:

 http://www.kevinandkell.com/2003/strips/kk20031015.gif 

To build a page that includes this strip on the fly, therefore, the script needs to ascertain the current year (as a two-digit value), month, and day (both with a leading zero, if needed). The rest of the script is just HTML wrapper to make the page look nice. In fact, this is a remarkably simple shell script, given the resultant functionality.
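The URL construction described above can be sketched as a small helper. This is an illustration rather than part of the original script: the function name strip_url is hypothetical, and it assumes the caller passes a four-digit year plus an already zero-padded month and day, which is what date +%Y, +%m, and +%d produce.

```shell
#!/bin/sh

# strip_url YEAR MONTH DAY
#   Print the image URL for the strip of the given date, following the
#   naming pattern seen in the site's page source.
#   YEAR is four digits; MONTH and DAY must already be zero-padded.
strip_url() {
  printf 'http://www.kevinandkell.com/%s/strips/kk%s%s%s.gif\n' \
    "$1" "$1" "$2" "$3"
}

# The sample URL from the text:
strip_url 2003 10 15

# Today's strip, using the four-digit year from date(1):
strip_url "$(date +%Y)" "$(date +%m)" "$(date +%d)"
```

Asking date for %Y instead of %y is a small design choice: it avoids hard-coding the "20" century prefix that the script above prepends to the two-digit year.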

Running the Script

Like the other CGI scripts in this chapter, this script must be placed in an appropriate directory so that it can be accessed via the Web, with the appropriate file permissions. Then it's just a matter of invoking the proper URL from a browser.
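Before invoking the URL from a browser, it can help to run the script from the command line and confirm that its output begins the way a web server expects. The following is a rough sketch, not something from the original chapter: check_cgi_output is a hypothetical helper, and it assumes the script emits plain newline-terminated lines, as the echo statements above do.

```shell
#!/bin/sh

# check_cgi_output
#   Read CGI output on standard input and succeed only if the first line
#   is a Content-type header and the second line is blank.
check_cgi_output() {
  awk 'NR == 1 && tolower($0) !~ /^content-type:/ { bad = 1 }
       NR == 2 && $0 != ""                        { bad = 1 }
       END { exit (bad || NR < 2) }'
}

# Demonstration with canned output:
printf 'Content-type: text/html\n\n<html></html>\n' |
  check_cgi_output && echo "header looks fine"

# With the real script in the current directory, the equivalent check
# would be (assuming the filename used in this section):
#   chmod 755 kevin-and-kell.cgi
#   ./kevin-and-kell.cgi | check_cgi_output && echo "header looks fine"
```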

The Results

The web page changes every day, automatically. For the strip of 9 October, 2003, the resulting page is shown in Figure 8-3.

Figure 8-3: The Kevin & Kell web page, built on the fly

Hacking the Script

This concept can be applied to almost anything on the Web if you're so inspired. You could scrape the headlines from CNN or the South China Morning Post, or get a random advertisement from a cluttered site. Again, if you're going to make it an integral part of your site, make sure that the material is either in the public domain or that you've arranged for permission.
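As a rough illustration of headline scraping, the sketch below pulls headline text out of HTML arriving on standard input. It rests on a loud assumption: that each headline sits on a single line inside an h2 element. Real news sites use their own markup, so the pattern would have to be adapted after a View Source of the page in question. The function name extract_headlines is hypothetical.

```shell
#!/bin/sh

# extract_headlines
#   Print the text of every <h2>...</h2> pair that fits on one input line,
#   with any nested tags (such as links) stripped out.
extract_headlines() {
  sed -n 's/.*<h2[^>]*>\(.*\)<\/h2>.*/\1/p' | sed 's/<[^>]*>//g'
}

# Demonstration with canned HTML; in practice the input might come from
# something like: lynx -source "$url" | extract_headlines
printf '%s\n' \
  '<h2 class="x"><a href="/a">First story</a></h2>' \
  '<p>body text</p>' \
  '<h2>Second story</h2>' | extract_headlines
```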

Turning Web Pages into Email Messages

By combining the method of reverse-engineering file-naming conventions with the website tracking utility shown in the previous chapter (Script #68, Tracking Changes on Web Pages), you can email yourself a web page that updates not only its content but its filename as well.

As an example, Cecil Adams writes a very witty and entertaining column for the Chicago Reader called "The Straight Dope." The specific page of the latest column has a URL of http://www.straightdope.com/columns/${now}.html, where now is the year, month, and day, in the format YYMMDD. The page is updated with a new column every Friday. Having the new column emailed to a specified address automatically is surprisingly straightforward:

    #!/bin/sh

    # getdope - grab the latest column of 'The Straight Dope'
    #  Set it up in cron to be run every Friday.

    now="$(date +%y%m%d)"
    url="http://www.straightdope.com/columns/${now}.html"
    to="testing@yourdomain.com"   # change this as appropriate

    ( cat << EOF
    Subject: The Straight Dope for $(date "+%A, %d %B, %Y")
    From: Cecil Adams <dont@reply.com>
    Content-type: text/html
    To: $to

    <html>
    <body border=0 leftmargin=0 topmargin=0>
    <div style='background-color:#309;color:#fC6;font-size:45pt;
     font-family:sans-serif;font-weight:900;text-align:center;
    margin:0;padding:3px;'>
    THE STRAIGHT DOPE</div>
    <div style='padding:3px;line-height:1.1'>
    EOF

      # Fetch the page, keep everything from the first <hr> onward, and
      # turn relative image and link references into absolute ones.
      lynx -source "$url" |
        sed -n '/<hr>/,$p' |
        sed 's|src="../art|src="http://www.straightdope.com/art|' |
        sed 's|href="..|href="http://www.straightdope.com|g'

      echo "</div></body></html>"
    ) | /usr/sbin/sendmail -t

    exit 0

Notice that this script adds its own header to the message and then sends it along, including all the footer and copyright information on the original web page.
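The link rewriting is the step most likely to need adjusting for another site, so here it is in isolation. This is a sketch under assumptions, not the author's code: absolutize_links is a hypothetical name, it takes the site root as its argument, and it assumes the root contains no | or & characters, which would confuse sed.

```shell
#!/bin/sh

# absolutize_links SITE_ROOT
#   Rewrite src="../..." and href="../..." references on standard input so
#   that they point at SITE_ROOT (given without a trailing slash).
absolutize_links() {
  sed -e "s|src=\"\.\./|src=\"$1/|g" \
      -e "s|href=\"\.\./|href=\"$1/|g"
}

# Demonstration with one line of canned HTML:
printf '%s\n' '<img src="../art/a.gif"> <a href="../columns/x.html">x</a>' |
  absolutize_links "http://www.straightdope.com"

# To run the mailer every Friday morning, a crontab entry along these
# lines would work (the path and the time are placeholders):
#   0 7 * * 5 /path/to/getdope
```

Unlike the one-off sed commands in the script, this version escapes the dots and applies the g flag to both substitutions, so it matches only a literal "../" and rewrites every occurrence on a line.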




Wicked Cool Shell Scripts. 101 Scripts for Linux, Mac OS X, and Unix Systems
ISBN: 1593270127
Year: 2004
Pages: 150
Authors: Dave Taylor
