|CGI Programming with Perl|
A full discussion of HTML and user interface design is clearly beyond the scope of this book. Many other books are available which discuss these topics at length, such as HTML: The Definitive Guide, by Chuck Musciano and Bill Kennedy (O'Reilly & Associates, Inc.). However, many of these other resources do not discuss the relationship between HTML form elements and the corresponding data sent to the web server when a form is submitted. So let's run through a quick review of HTML form elements before we see how CGI scripts process them.
Before we get going, Table 4-1 shows a short list of all the available form tags.
<FORM ACTION="/cgi/register.cgi" METHOD="POST">
Start the form
<INPUT TYPE="text" NAME="name"
<INPUT TYPE="password" NAME="name"
<INPUT TYPE="hidden" NAME="name"
<INPUT TYPE="checkbox" NAME="name"
<INPUT TYPE="radio" NAME="name"
<SELECT NAME="name" SIZE=1>
<SELECT NAME="name" SIZE=n MULTIPLE>
<TEXTAREA ROWS=yy COLS=xx NAME="name">
Multiline text field
<INPUT TYPE="submit" NAME="name"
<INPUT TYPE="image" src="/books/1/64/1/html/2//image.gif"
<INPUT TYPE="reset" VALUE="Message!">
End the form
All forms begin with a <FORM> tag and end with a </FORM> tag:
<FORM ACTION="/cgi/register.cgi" METHOD="POST"> . . . </FORM>
Submitting a form generates an HTTP request just like clicking on a hyperlink, but a request generated by a form is almost always directed at a CGI script (or a similar dynamic resource). You specify the format of the HTTP request via attributes of the <FORM> tag:
METHOD specifies the HTTP request method used when calling the CGI script. The options are GET and POST, and they correspond to the request methods we've already seen as part of the HTTP request line, although they are not case-sensitive here. If the method is not specified, it defaults to GET.
ACTION specifies the URL of the CGI script that should receive the HTTP request made by the CGI script. By default, it is the same URL from which the browser retrieved the form. You are not limited to using a CGI program on your server to decode form information; you can specify a URL of a remote host if a program that does what you want is available elsewhere.
ENCTYPE specifies the media type used to encode the content of the HTTP request. Because GET requests do not have a body, this attribute is only meaningful if the form has POST as its method. This attribute is rarely included since the default -- application/x-www-form-urlencoded -- is appropriate in almost all cases. The only real reason to specify another media type is when creating a form that accepts file uploads. File uploads must use multipart/form-data instead. We will discuss this second option later.
A document can consist of multiple forms, but one form cannot be nested inside another form.
The <INPUT> tag generates a wide array of form widgets. They are differentiated by the TYPE attribute. Each <INPUT> tag has the same general format:
<INPUT TYPE="text" NAME="element_name" VALUE="Default value">
Like <BR>, this tag has no closing tag. The basic attributes that all input types share are as follows:
TYPE determines the type of the input widget to display. A presentation of each type follows this section.
The NAME attribute is important because the CGI script uses this name to access the value of those elements that are submitted.
The meaning of VALUE varies depending on the type of the input element. We will discuss this property in our discussion of each type.
Let's look at each of the input types.
One of the most basic uses of the <INPUT> tag is to generate a text fields where users may enter a line of data (see Figure 4-2). Text fields are the default input type; if you omit the TYPE attribute, you will get a text field. The HTML for a text field looks like this:
<INPUT TYPE="text" NAME="quantity" VALUE="1" SIZE="3" MAXLENGTH="3">
Here are the attributes that apply to text fields:
The VALUE of text fields is the default text displayed in the text field when the form is initially presented to the user. It defaults to an empty string. The user can edit the value of text fields; updates change what is displayed as well as the value passed when the form is submitted.
The SIZE attribute specifies the width of the text field displayed. It roughly corresponds to the number of characters the field can hold, but this is generally only accurate if the element is surrounded by <TT> or <PRE> tags, which indicate that a monospace font should be used. Unfortunately, Netscape and Internet Explorer render the width of fields very differently when monospaced fonts are not used, so certainly test your form with both browsers. The default SIZE for text fields is 20.
The MAXLENGTH attribute specifies the maximum number of characters that a text field can hold. Browsers generally do not allow users to enter more characters than this. Because the size of text fields can vary with variable-width fonts, it is possible to set MAXLENGTH and SIZE to the same value and yet have a field that appears too large or too small for that number of characters. A text field can have a MAXLENGTH set to more characters than its SIZE can display. By default, there is no specified limit on the size of text fields.
A password field is similar to a text field, except that instead of displaying the true value of the field, the browser represents each character with an asterisk or bullet (refer back to Figure 4-2):
<INPUT TYPE="password" NAME="set_password" VALUE="old_password" SIZE="8" MAXLENGTH="8">
This field does not provide any true security; it simply provides basic protection against someone looking over the shoulder of the user. The value is not encrypted when it is transferred to the web server, which means that passwords are displayed as part of the query string for GET requests.
All the attributes that apply to text fields also apply to password fields.
Hidden fields are not visible to the user. They are generally used only with forms which are themselves generated by a CGI script and are useful for passing information between a series of forms:
<INPUT TYPE="hidden" NAME="username" VALUE="msmith">
Like password fields, hidden fields provide no security. Users can view the name and value of hidden fields by viewing the HTML source in their browsers.
We'll discuss hidden fields in much more detail in our discussion of maintaining state in Chapter 11, "Maintaining State".
Hidden fields only use NAME and VALUE attributes.
Checkboxes are useful when users simply need to indicate whether they desire an option. See Figure 4-3.
The user can toggle between two states on a checkbox: checked or unchecked. The tag looks like this:
<INPUT TYPE="checkbox" NAME="toppings" VALUE="lettuce" CHECKED>
In this example, if the user selects the checkbox, then "toppings" returns a value of "lettuce". If the checkbox is not selected, neither the name nor the value is returned for the checkbox.
It is possible to have multiple checkboxes use the same name. In fact, this is not uncommon. The most typical situation in which you might do this is if you have a dynamic list of related options and the user could choose a similar action for all of them. For example, you may wish to list multiple options this way:
<INPUT TYPE="checkbox" NAME="lettuce"> Lettuce<BR> <INPUT TYPE="checkbox" NAME="tomato"> Tomato<BR> <INPUT TYPE="checkbox" NAME="onion"> Onion<BR>
If, however, the CGI script does not need to know the name of each of the options in order to perform its task, you may wish to do this instead:
<INPUT TYPE="checkbox" NAME="toppings" VALUE="lettuce"> Lettuce<BR> <INPUT TYPE="checkbox" NAME="toppings" VALUE="tomato"> Tomato<BR> <INPUT TYPE="checkbox" NAME="toppings" VALUE="onion"> Onion<BR>
If someone selects "lettuce" and "tomato" but not "onion", then the browser will encode this as toppings=lettuce&toppings=tomato. The CGI script can process these multiple toppings, and you may not need to update the CGI script if you later add items to the list. Attributes for checkboxes include:
The VALUE attribute is the value included in the request if the checkbox is checked. If a VALUE attribute is not specified, the checkbox will return "ON" as its value. If the checkbox is not checked, then neither its name nor value will be sent.
The CHECKED attribute indicates that the checkbox should be selected by default. Omitting this attribute causes the checkbox to be unselected by default.
Radio buttons are very similar to checkboxes except that any group of radio buttons that share the same name are exclusive: only one of them may be selected. See Figure 4-4.
The tag is used just like a checkbox:
<INPUT TYPE="radio" NAME="bread" VALUE="wheat" CHECKED> Wheat<BR> <INPUT TYPE="radio" NAME="bread" VALUE="white"> White<BR> <INPUT TYPE="radio" NAME="bread" VALUE="rye"> Rye<BR>
In this example, "wheat" is selected by default. Selecting "white" or "rye" will cause "wheat" to be unselected.
Although you may omit the VALUE attribute with checkboxes, doing so with radio buttons is meaningless since the CGI script will not be able to differentiate between different radio buttons if they all return "ON".
Using the CHECKED attribute with multiple radio buttons with the same name is not valid. Browsers will generally render both as selected, but they will be unselected as soon as the user selects a different option and the user will be unable to return the form to this initial state (unless it has a reset button of course).
Radio buttons use the same attributes as checkboxes.
The HTML for a submit button looks like this:
<INPUT TYPE="submit" NAME="submit_button" VALUE="Submit the Form">
Virtually all forms have a submit button, and you can have multiple submit buttons on one form:
<INPUT TYPE="submit" NAME="option" VALUE="Option 1"> <INPUT TYPE="submit" NAME="option" VALUE="Option 2">
Only the name and value of the submit button clicked is included in the form submission. Here are the attributes it supports:
The VALUE attribute for submit buttons specifies the text that should be displayed on the button as well as the value supplied for this element when the form is submitted. If the value is omitted, browsers supply a default label -- generally "Submit" -- and refrain from submitting a name and value for this element.
A reset button allows users to reset the value of all the fields in a form to their default values. From the user's perspective, it generally accomplishes the same thing as reloading the form but is much faster and more convenient. Because the browser accomplishes this event without consulting the web server, CGI scripts never respond to it. The HTML tag looks like this:
<INPUT TYPE="reset" VALUE="Reset the form fields">
You may have multiple reset buttons on the same form, although this would almost certainly be redundant.
The VALUE attribute specifies the text label that should appear on the button.
You can also have images as buttons. Image buttons function as submit buttons but give you much more flexibility over how the button looks. Keep in mind that users are generally used to having buttons displayed a particular way by the browser and operating system, and a button in a different format may be confusing to a novice. The HTML for an image button tag looks like this:
<INPUT TYPE="image" src="/books/1/64/1/html/2//icons/button.gif" NAME="result" VALUE="text only">
Graphical and text-only browsers treat this element very differently. A text-only browser, such as Lynx, sends the name and value together like most other form elements:
However, a graphical browser, like Netscape and Internet Explorer, send the coordinates where the user clicked on the image in addition to the name of the button. The value is not sent. These coordinates are measured in pixels from the upper-left corner of the image (see Figure 4-6).
In this example, a graphical browser would submit:
Here are the attributes for image buttons:
The VALUE attribute is sent as the value for this element by text browsers.
The SRC attribute specifies the URL to the image displayed for the button, just as it does in the more common <IMG> tag (if the <IMG> tag looks unfamiliar to you, it's because you probably only recognize it when combined with the SRC attribute: <IMG SRC=...>).
This attribute behaves just as it does with standard submit buttons.
The last type of button is just that -- a button; it has no special function. To avoid confusing this button with the other button types, we will refer to it as a plain button. A plain button tag looks like a submit or reset button:
<INPUT TYPE="button" VALUE="Click for a greeting..." onClick="alert( 'Hello!' );">
The name and value of a plain button is never passed to a CGI script. Because a plain button has no special action, it is meaningless without an onClick attribute:
The VALUE attribute specifies the name of the button.
The onClick attribute specifies the code to run when the button is clicked. The code's return value has no effect because plain buttons do not cause other behavior.
The <SELECT> tag is used to create a list for users to choose from. It can create two different elements that look quite different but have similar function: a scrolling box or a menu (also commonly referred to as a drop-down). Both elements are displayed in Figure 4-7. Unlike the <INPUT> elements, <SELECT> tags have an opening as well as a closing tag.
Here is an example of a menu:
Choose a method of payment: <SELECT NAME="card" SIZE=1> <OPTION SELECTED>American Express</OPTION> <OPTION>Discover</OPTION> <OPTION>Master Card</OPTION> <OPTION>Visa</OPTION> </SELECT>
Here is an example of a scrolling box:
Choose the activities you enjoy: <SELECT NAME="activity" SIZE=4 MULTIPLE> <OPTION>Aerobics</OPTION> <OPTION>Aikido</OPTION> <OPTION>Basketball</OPTION> <OPTION>Bicycling</OPTION> <OPTION>Golfing</OPTION> <OPTION>Hiking</OPTION> ... </SELECT>
Scrolling boxes may optionally allow the user to select multiple entries. Multiple options are encoded as separate name-value pairs, as if they had been entered by multiple form elements. For example, if someone selects Aikido, Bicycling, and Hiking, the browser will encode it as activity=Aikido&activity=Bicycling& activity=Hiking.
Attributes for the <SELECT> tag are:
The SIZE attribute determines the number of lines visible in the list. Specifying 1 for the SIZE indicates that the list should be a menu instead.
The MULTIPLE attribute allows the user to select multiple values. It is only possible if the SIZE attribute is assigned a value greater than 1. On some operating systems, the user may need to hold down certain modifier keys on their keyboard in order to select multiple items.
The <SELECT> tag does not have a value attribute. Each of its possible values must have an <OPTION> tag around it.
You may override the value used by a particular option by specifying a VALUE attribute like this:
<OPTION VALUE="AMEX" >American Express</OPTION>
Options have two optional attributes:
The SELECTED attribute specifies that the option should be selected by default. When a form is submitted, the name of the <SELECT> tag is submitted along with the value of the selected options.
The VALUE attribute is the value that is passed for the option if it is selected. If this attribute is omitted, then it defaults to the text between the <OPTION> and </OPTION> tags.
The final form element, the <TEXTAREA> tag, allows users to enter multiple lines of text. See Figure 4-8.
Text areas have an opening and a closing tag:
<TEXTAREA ROWS=10 COLS=40 NAME="comments" WRAP="virtual">Default text</TEXTAREA>
This creates a scrolled text field with a visible area of ten rows and forty columns.
There is no VALUE property for the <TEXTAREA> tag. Default text should be placed between the opening and closing tags. Unlike other HTML tags, white space -- including newlines -- is not ignored between <TEXTAREA> and </TEXTAREA> tags. A browser will render the example above with "Default" and "text" on separate lines.
Attributes for the <TEXTAREA> tag are:
The COLUMNS attribute specifies the width of the text area, but like the size of text fields, browsers size columns differently for variable-width fonts.
The ROWS attribute specifies the number of lines that the text area should display. Text bars have scrollbars to access text that does not fit within the display area.
The WRAP attribute specifies what the browser should do if the user types beyond the right margin, but note that the WRAP attribute is not implemented as uniformly as other tags and attributes. Although most browsers support it, it is actually not included in the HTML 4.0 standard. In general, specifying "virtual" as the WRAP results in the text wrapping within the text area, but it is submitted without newlines. Specifying "physical" as the WRAP also results in the text wrapping for the user, but the line breaks are submitted as part of the text. Users on different operating systems will submit different characters for end-of-line characters. If you specify to omit the WRAP attribute or specify "none" for it, then text will typically scroll beyond the right side of the text area.
|4. Forms and CGI||4.3. Decoding Form Input|
Copyright © 2001 O'Reilly & Associates. All rights reserved.