A cool use of a shell-based CGI script is to log events by using a wrapper. Suppose that I'd like to have a Yahoo! search box on my web page, but rather than feed the queries directly to Yahoo!, I'd like to log them first, to build up a database of what people seek from my site.
First off, a bit of HTML and CGI: Input boxes on web pages are created inside forms, and a form hands the information the user enters to a remote program named in the value of the form's action attribute. The Yahoo! query box on any web page can be reduced to the following:
<form method="get" action="http://search.yahoo.com/bin/search">
Search Yahoo:
<input type="text" name="p">
<input type="submit" value="search">
</form>
However, rather than hand the search pattern directly to Yahoo!, we want to feed it to a script on our own server, which will log the pattern and then redirect the query along to the Yahoo! server. The form therefore changes in only one small regard: The action field becomes a local script rather than a direct call to Yahoo!:
<!-- Tweak action value if script is placed in /cgi-bin/ or other -->
<form method="get" action="log-yahoo-search.cgi">
The log-yahoo-search.cgi script is remarkably simple, as you will see.
#!/bin/sh
# log-yahoo-search - Given a search request, logs the pattern, then
#   feeds the entire sequence to the real Yahoo! search system.

# Make sure the directory path and file listed as 'logfile' are writable by
# user nobody, or whatever user you have as your web server uid.

logfile="/var/www/wicked/scripts/searchlog.txt"

if [ ! -f "$logfile" ] ; then
  touch "$logfile"
  chmod a+rw "$logfile"
fi

if [ -w "$logfile" ] ; then
  # Strip the "p=" field name and turn each + back into a space before logging.
  echo "$(date): $QUERY_STRING" | sed 's/p=//g;s/+/ /g' >> "$logfile"
fi

echo "Location: http://search.yahoo.com/bin/search?$QUERY_STRING"
echo ""

exit 0
The most notable elements of the script have to do with how web servers and web clients communicate. The information entered into the search box reaches the script in the environment variable QUERY_STRING, encoded by replacing spaces with the + sign and other non-alphanumeric characters with the appropriate %-prefixed hexadecimal sequences. Then, when the search pattern is logged, the p= field name is stripped and all + signs are translated back to spaces safely and simply, but otherwise the search pattern is not decoded, to ensure that no tricky hacks are attempted by users. (See the introduction to this chapter for more details.)
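To see what the logging step does to an encoded query, here is a minimal sketch that runs the script's own sed expression on a sample value. The clean_query function name and the sample query are made up for illustration; only the sed expression comes from the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): applies the same
# sed expression that log-yahoo-search.cgi uses before writing to the log.
clean_query() {
  # Remove the "p=" field name and turn each + back into a space.
  # Any %XX sequences are deliberately left encoded.
  printf '%s\n' "$1" | sed 's/p=//g;s/+/ /g'
}

sample='p=wicked+cool+shell%21'   # made-up query string for illustration
cleaned=$(clean_query "$sample")
echo "$cleaned"
```

The %21 (an encoded exclamation point) passes through untouched, which is exactly the conservative behavior described above.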
Once the query is logged, the web browser is redirected to the actual Yahoo! search page with the Location: HTTP header. Notice that simply appending ?$QUERY_STRING is sufficient to relay the search pattern, however simple or complex it may be, to its final destination.
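The redirect can be tried from the command line by setting QUERY_STRING by hand, much as the web server would before invoking the script. This is only a sketch: the redirect_header function name and the sample value are assumptions, while the two echo lines are the ones from the script.

```shell
#!/bin/sh
# Sample value; on a live site the web server sets this variable.
QUERY_STRING='p=nostarch'

# Hypothetical wrapper around the script's two echo statements.
redirect_header() {
  # A CGI redirect is a Location: header followed by a blank line.
  echo "Location: http://search.yahoo.com/bin/search?$QUERY_STRING"
  echo ""
}

# Command substitution drops the trailing blank line, leaving the header.
response=$(redirect_header)
echo "$response"
```

Because the query string is passed along exactly as it arrived, the script never needs to decode or re-encode it.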
The log file produced by this script has each query string prefaced by the current date and time, to build up a data file that not only shows popular searches but can also be analyzed by time of day, day of week, month of year, and so forth. There's lots of information that this script could mine on a busy site!
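As one possible starting point for that kind of analysis, the following sketch counts the most popular searches and tallies searches by hour of day. It assumes the log lines keep the default date format shown in this section; the third sample line is invented so that the counts are not all 1.

```shell
#!/bin/sh
# Build a small sample log. The first two lines match the log format the
# script produces; the third is invented for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
Fri Sep 5 11:16:37 MDT 2003: starch
Fri Sep 5 11:17:12 MDT 2003: nostarch
Fri Sep 5 14:02:45 MDT 2003: nostarch
EOF

# Most popular searches: drop everything through the ": " that follows the
# year, then count identical patterns, most frequent first.
popular=$(sed 's/^[^:]*:[^:]*:[^:]*: //' "$log" | sort | uniq -c | sort -rn)

# Searches per hour: the fourth whitespace-separated field is HH:MM:SS.
by_hour=$(awk '{ split($4, t, ":"); print t[1] }' "$log" | sort | uniq -c)

echo "$popular"
echo "$by_hour"
rm -f "$log"
```

The same pipelines should work on the real searchlog.txt by pointing them at that file instead of the temporary sample.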
To run this script, you need to create the HTML form, as shown earlier, and you need to have the script executable and located on your server. (See the earlier section "Running the Scripts in This Chapter" for more details.) Then simply submit a search query to the form, perhaps "nostarch." The results are from Yahoo!, exactly as expected, as shown in Figure 8-2.
As you can see, the user is prompted with a Yahoo! search box, submits a query, and, as shown in Figure 8-2, gets standard Yahoo! search results. But there's now a log of the searches:
$ cat searchlog.txt
Fri Sep 5 11:16:37 MDT 2003: starch
Fri Sep 5 11:17:12 MDT 2003: nostarch
On a busy website, you will doubtless find that monitoring searches with the command tail -f searchlog.txt is quite informative as you learn what kinds of things people seek online.