HTML and CGI, Part Two

Sending user input to a CGI program.

This is the fourth installment in a series of tutorials about providing Internet and intranet services. In the tutorial "HTML and CGI Part One," we began to explore HTML forms, which make it possible to collect information from users and route it to a common gateway interface (CGI) program. I showed a simple HTML form, along with the code that makes it work. I didn't get all the way through my discussion of the sample form before running out of column space, so let's conclude that now, before moving on to CGI.

For your reference, I've reproduced the code listing for the sample in Listing 1. If you'd like to see how this listing looks when displayed by a Web browser, see Figure 1.

Figure 1

I need to correct a statement I made about the listing last month. In discussing the text field, line 4 of Listing 1, I mentioned that if the maximum length you specify for the field (the maxlength option) exceeds the displayed size of the text box (the size option), a scroll bar will appear when the user types a long text string. After further testing, I've found that what you actually get is a scrolling text box, not a scroll bar. For example, if you have set size to 25 and maxlength to 30, the text begins to scroll once you type in more than 25 characters. Once you reach 30 characters, the text box will no longer accept keystrokes, and you'll get beeps instead.

Also in Listing 1 last month, the statements for creating check boxes used the syntax input type="check box" with a space between check and box. To get this to work correctly, you must specify the input type as one word, in this case checkbox. Listing 1 as displayed here is correct.

Another modification I've made to Listing 1 is to place a specific URL into the form's action statement (line 1). This lets me show more detail about how information gathered by the form gets turned over to a CGI program for processing.

Finishing Off The Form

Last month's discussion covered line 1 through line 11 of Listing 1, so let's take up with line 12, which creates a Submit button. When the user finishes filling out the form, he or she then clicks the Submit button to submit the completed form to a CGI program running on the Web server. As you can see from line 12, you create a Submit button by setting the input type to submit. You apply a label to the button by setting value equal to a text string.

When the user clicks on the Submit button, the Web browser takes whatever action you specified in the action statement for the form (for example, line 1 in Listing 1). In this example, the Web server will execute the Perl script named example.pl, located in the cgi-bin directory.

How does the Web server know that it should execute the referenced program, rather than simply delivering it to the requesting Web browser? It knows by virtue of the directory that the program is stored in: cgi-bin. On most Web servers, this directory is reserved specifically for executable files. (You may have other directories that hold executable files, but cgi-bin should hold all of the files that can be remotely executed via the Web.)
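The directory test described above can be sketched in a few lines. This is only an illustration written in Python, not the logic of any particular Web server: the function name is_cgi_request and the constant CGI_DIR are hypothetical, and real servers make this decision from their own configuration.

```python
from pathlib import PurePosixPath

# Assumption for illustration: the only executable directory is /cgi-bin.
CGI_DIR = "cgi-bin"

def is_cgi_request(url_path):
    """Return True if the requested path names a file directly under /cgi-bin."""
    parts = PurePosixPath(url_path).parts
    # Refuse ".." components so a request cannot climb out of the directory.
    if ".." in parts:
        return False
    # parts looks like ('/', 'cgi-bin', 'example.pl') for a CGI request.
    return len(parts) > 2 and parts[0] == "/" and parts[1] == CGI_DIR

print(is_cgi_request("/cgi-bin/example.pl"))  # True
print(is_cgi_request("/test.html"))           # False
```

A request that passes this test would be executed; anything else would be delivered as an ordinary document.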

Keeping all executable files in cgi-bin is also a good security measure. As Webmaster, you should allow the Web server to execute programs only if they are located in this directory. Also, thoroughly scrutinize and test any programs you place in this directory to ensure that they cannot be used to damage or replace files or circumvent your security measures. If, for example, someone can use one of your programs to read system password files, your system security is obviously at risk.

In my example, I mentioned Perl, the Practical Extraction and Report Language. For those unfamiliar with Perl, it's an interpreted language that was originally developed for use on Unix systems, but has since been ported to many other operating systems. Perl interpreters are available for Macintosh and Windows NT systems, for example. Perl programs are usually referred to as scripts, as they are relatively quick and easy to dish up compared with C or other "full-blown" programming languages. Perl is much closer to DOS batch file programming than it is to the more classical programming languages. One area where Perl excels is in string manipulation, which, as we'll see, is very important in CGI programming.

Line 13 of Listing 1 creates a Reset button. If the user clicks on this button, all values will be reset to their original default settings. In this example, the radio button for Bridge is checked, as are all three of the check boxes (IP, IPX, and AppleTalk), so the form would revert to these settings.

When the user clicks on the Submit button, the information will be transmitted to the Web server using pairs of field names (also called keys) and values. For example, suppose the user had entered the name SuperDuper as the product name in our sample form, checked the Router radio button, and checked IP and IPX for protocols handled. In this case, when the user clicks the Submit button, the form will be submitted with name=SuperDuper, ProdType=Router, ip=on, and ipx=on. (If the AppleTalk check box is unchecked, you won't get "AppleTalk=off"; AppleTalk will simply not be returned as a field.)
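To make the key/value idea concrete, here is a small Python sketch of how the example submission could be assembled. The helper name form_pairs is hypothetical and this is not how any particular browser is implemented; it simply shows that checked boxes contribute a pair with the value on, while unchecked boxes contribute nothing.

```python
from urllib.parse import urlencode

def form_pairs(product_name, product_type, checkboxes):
    """Build the key/value pairs the sample form would submit."""
    pairs = [("name", product_name), ("ProdType", product_type)]
    # Only checked boxes are sent; an unchecked box is simply omitted.
    pairs += [(box, "on") for box, checked in checkboxes.items() if checked]
    return pairs

pairs = form_pairs("SuperDuper", "Router",
                   {"ip": True, "ipx": True, "AppleTalk": False})
encoded = urlencode(pairs)
print(encoded)  # name=SuperDuper&ProdType=Router&ip=on&ipx=on
```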

We've seen how HTML forms collect information from a Web user and submit it to a CGI program as key/value pairs. Because the action specified in the opening <form> tag references a file located in the cgi-bin directory, the Web server knows that the referenced file should be executed as opposed to being displayed by the requesting Web browser, as it would be with a typical HTML document.

Getting To Know CGI

Writing CGI programs is only slightly different from writing other types of programs. At the risk of oversimplifying, I'll characterize programming as obtaining input from a user or from a data file, storing that input in program variables, manipulating those variables to achieve some desired purpose, and sending the results to a file or video display. If you're a programmer, you know that programs typically get their input from the logical device known as standard input (STDIN, for short), and send output to the device standard output (STDOUT). STDIN and STDOUT typically represent the computer console (keyboard and video display), but most operating systems support redirection, so STDIN and STDOUT could be disk files or other devices.

Many operating systems and programming languages also support the use of environment variables: variables that can be set in the operating system and read by programs, or vice versa. Environment variables allow information to be passed between the operating system and running programs or between programs written in different languages.

CGI programs are similar to regular programs. They typically get their input from STDIN or from environment variables and send output to STDOUT.
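The shape of such a program can be sketched as follows. I've used Python here rather than Perl, and the function name run_cgi is hypothetical. QUERY_STRING and CONTENT_LENGTH are standard CGI environment variables; passing the environment and the two streams in as arguments is my own choice, made so the sketch can be tried without a Web server.

```python
import io

def run_cgi(environ, stdin, stdout):
    """Read input from the environment or STDIN, write a reply to STDOUT."""
    # Input may arrive in an environment variable set by the Web server ...
    query = environ.get("QUERY_STRING", "")
    # ... or on standard input, in which case CONTENT_LENGTH says how much to read.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length) if length else ""
    # Output goes to standard output: a header, a blank line, then the document.
    stdout.write("Content-type: text/plain\n\n")
    stdout.write(f"query string: {query}\n")
    stdout.write(f"standard input: {body}\n")

# Try it with in-memory streams standing in for the real STDIN and STDOUT.
out = io.StringIO()
run_cgi({"QUERY_STRING": "color=sky%20blue&size=large"}, io.StringIO(""), out)
print(out.getvalue())
```

Under a real Web server, the same function would be called with os.environ, sys.stdin, and sys.stdout.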

To understand how a user's Web browser and a CGI program interact, we need to take a step back and examine how a browser submits simple HTML requests and how a Web server responds.

Suppose you embed the following hypertext link in an HTML document:

 <A HREF="test.html">

If you were to click on this link, the browser would issue the following request to the Web server:

 GET /test.html HTTP/1.0
 Accept: text/plain
 Accept: text/html

Each of these lines is referred to as a header. The first one is the get header, which tells the Web server that the browser wants to get the document test.html, and that it's using version 1.0 of the Hypertext Transfer Protocol. Because only the file name was specified in this case, the Web server defaults to looking for the file in the server's Web-document root directory. If you want to obtain a file that is located in a subsidiary directory, your hypertext link must specify the complete path name to the file, relative to the server's root directory for Web documents. If you want to reference a file located on another Web server, your hypertext link has to specify the complete URL for the new file.

Following the get header are two Accept headers, which state that the browser can accept plain text or HTML-formatted text files. If the browser can accept more data types, there will be more Accept headers, detailing each type in terms of Multipurpose Internet Mail Extensions (MIME).
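The request described above is just a few lines of text, which a short Python sketch can assemble. The function name build_get_request is hypothetical, and the sketch covers only the get line and Accept headers discussed here, not everything a real browser sends.

```python
def build_get_request(path, accept_types, version="HTTP/1.0"):
    """Return the text of a simple GET request with one Accept header per type."""
    lines = [f"GET {path} {version}"]
    lines += [f"Accept: {mime_type}" for mime_type in accept_types]
    # Each line ends with CRLF, and an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"

request = build_get_request("/test.html", ["text/plain", "text/html"])
print(request)
```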

The Web server's response to this request would look like this:

 HTTP/1.0 200 OK
 Date: Monday, 24-May-96 11:09:05 GMT
 Server: NCSA/1.3
 MIME-version: 1.0
 Content-type: text/html
 Content-length: 231

 <HTML>
 <HEAD>
 <TITLE>This is the document title</TITLE>
 </HEAD>
 This is a test HTML page
 </HTML>

The first line of the response gives the version of HTTP used and a status code; the Server header gives the Web server name and version number. Other headers describe the content type (HTML-formatted text, in this case) and content length (231 bytes) of the material being sent. The Web browser then reads and renders the HTML portion of the response.
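To show how a client might take such a response apart, here is a minimal Python sketch. The function name parse_response is hypothetical, the sample text is a shortened copy of the response above (so its body is not actually 231 bytes long), and a real browser handles many cases this ignores.

```python
# Shortened copy of the sample response; the body is abbreviated for illustration.
RESPONSE = (
    "HTTP/1.0 200 OK\r\n"
    "Date: Monday, 24-May-96 11:09:05 GMT\r\n"
    "Server: NCSA/1.3\r\n"
    "Content-type: text/html\r\n"
    "Content-length: 231\r\n"
    "\r\n"
    "<HTML>This is a test HTML page</HTML>"
)

def parse_response(raw):
    """Split a response into status line parts, a header dictionary, and the body."""
    # The first empty line separates the headers from the document itself.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split on the first colon only, since values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body

version, code, reason, headers, body = parse_response(RESPONSE)
print(version, code, reason)
print(headers["server"], headers["content-type"], headers["content-length"])
```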

With CGI, things are not much different, except that the file being requested will be in the cgi-bin directory, which tells the Web server that the requested file is to be executed, instead of merely sent to the Web browser for display as an HTML document.

In the tutorial "HTML and CGI, Part One," I mentioned that information gathered from users or their Web browsers can be sent to the Web server using one of two methods: the get method or the post method.

With the get method, all the form data is included in the URL in what's known as a query string. As an example, suppose we have a simple form that has only two fields, named color and size, and that the user typed sky blue and large, respectively, in response. Let's also assume that the CGI program that's going to process the data is a Perl script, named example.pl and located in the cgi-bin directory. When the user clicks on the Submit button, an HTTP request will be generated and sent to the Web server.

The code for our HTML form must contain an action statement, as well as tell the Web server which method (get, in this case) is being used to send data. Thus, the first statement for our form must read:

 <FORM ACTION="/cgi-bin/example.pl" METHOD="GET">

This lets the Web server know the complete path name of the program to be executed ("/cgi-bin/example.pl"), and that the get method will be used. As mentioned earlier, the get method uses a query string to pass data to the CGI program. In this example, when the user clicks on the Submit button, his or her Web browser will send the following request to the Web server:

 GET /cgi-bin/example.pl?color=sky%20blue&size=large HTTP/1.0

The continuous string of text that follows the question mark is the query string. In response to this request from the Web browser, the server executes the script example.pl and places the string color=sky%20blue&size=large in the QUERY_STRING environment variable. Your CGI program can then read that variable.
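As a rough preview of that reading step, the following Python sketch pulls the query string out of the environment and decodes it. The function name read_form_fields is hypothetical, and this is a Python stand-in, not the Perl script from the example; it keeps only the first value for each field, which is enough for this two-field form.

```python
import os
from urllib.parse import parse_qs

def read_form_fields(environ=os.environ):
    """Decode the QUERY_STRING environment variable into a field dictionary."""
    query = environ.get("QUERY_STRING", "")
    # parse_qs splits on "&" and "=", and turns %20 back into a space.
    return {key: values[0] for key, values in parse_qs(query).items()}

# A plain dictionary stands in for the environment the Web server would set up.
fields = read_form_fields({"QUERY_STRING": "color=sky%20blue&size=large"})
print(fields)  # {'color': 'sky blue', 'size': 'large'}
```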

In the tutorial "CGI and Web Servers," I'll discuss what your CGI program must do to process the query string. I'll also cover the post method of submitting data to CGI programs.

Listing 1: HTML code for a sample form
"Sample Form"

 1  <FORM ACTION="/cgi-bin/example.pl" METHOD="GET">
 2  <H5>Please enter the following information:</H5><BR>
 3  Product name:
 4  <INPUT TYPE="TEXT" NAME="name" size="25" MAXLENGTH="30"><BR>
 5  Select product type:
 6  <INPUT TYPE="RADIO" NAME="ProdType" VALUE="Bridge" CHECKED> Bridge
 7  <INPUT TYPE="RADIO" NAME="ProdType" VALUE="Router"> Router
 8  <INPUT TYPE="RADIO" NAME="ProdType" VALUE="Switch"> Switch<P>
 9  Protocols handled: <INPUT TYPE="CHECKBOX" NAME="ip" CHECKED> IP
10  <INPUT TYPE="CHECKBOX" NAME="ipx" CHECKED> IPX
11  <INPUT TYPE="CHECKBOX" NAME="AppleTalk" CHECKED> AppleTalk
12  <INPUT TYPE="SUBMIT" VALUE="Press here to submit your entry">
13  <INPUT TYPE="RESET" VALUE="Clear Form">
14  </FORM>

This tutorial, number 97, by Alan Frank, was originally published in the September 1996 issue of LAN Magazine/Network Magazine.

 