HTML Forms


Almost all interactive Web applications make use of HTML forms in one way or another. If you use the Internet, chances are that you have encountered an HTML form in almost all your Web surfing sessions when using a search engine, checking your Web-based e-mail, checking your credit card statement, sending an e-greeting card, and many other tasks. The HTML form is the interface that an application uses to interact with the user via the Web browser.

There are two key aspects of HTML forms: browser-side handling and server-side processing. While displaying a form, the browser's task is to ensure that the various HTML input elements are displayed properly and that the user can fill them with data in an appropriate manner. Once the user has filled in the data, she can submit the completed form. The browser then encodes the form data, using URL encoding, and submits it to the application program designated to receive it.
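As a rough illustration of the encoding step, the following Python sketch applies the same URL encoding rules to a set of field values. The field names and values here are hypothetical and chosen only to show how spaces, reserved characters, and line breaks are encoded.

```python
from urllib.parse import urlencode

# Hypothetical field values, as a user might type them into a form.
fields = {
    "user": "jack",
    "comment": "Jack & Jill\r\nwent up the hill",
}

# urlencode applies application/x-www-form-urlencoded rules:
# spaces become "+", reserved characters and line breaks become %XX.
encoded = urlencode(fields)
print(encoded)
```

The ampersand inside the comment value is encoded as %26, so it cannot be confused with the "&" that separates one field from the next.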

Once the data is received by the application program, the various input elements need to be separated and processed. The early CGI specification left it up to the application programmer to insert the necessary logic for parsing the URL encoded data and extracting the input element names and the submitted values. Modern Web servers and platforms have built-in routines to process form data automatically on submission, simplifying the task of the Web application programmer.
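The parsing logic that early CGI programmers had to write by hand can be sketched in a few lines of Python. The function name parse_form_data and the sample input are illustrative assumptions, not part of any particular CGI library.

```python
from urllib.parse import unquote_plus


def parse_form_data(raw: str) -> list[tuple[str, str]]:
    """Split URL-encoded form data into decoded (name, value) pairs."""
    pairs = []
    for chunk in raw.split("&"):
        if not chunk:
            continue  # skip empty segments such as a trailing "&"
        # Split on the first "=" only; a field without "=" gets an empty value.
        name, _, value = chunk.partition("=")
        # unquote_plus turns "+" into a space and %XX into the character it encodes.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


print(parse_form_data("user=jack&pass=s3cret%21&note=hello+world"))
```

Splitting on "&" and "=" before decoding matters: decoding first would turn an encoded %26 inside a value into a field separator.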

However, from a security point of view, the Web application programmer needs to be careful of the inputs allowed by the browser. At the end of this discussion we present some rules of thumb for processing HTML forms. Meanwhile, let's look at a few examples of how HTML forms work (and don't work), from the security perspective.

Anatomy of an HTML Form

An HTML form is identified by the <FORM> </FORM> tags (as discussed in Chapter 1). All HTML tags embedded in the form tags are treated as a part of the form. Among other HTML tags, the <INPUT> tags comprise the input elements of the form. They allow the user to enter data in the HTML form displayed on the browser. Figure 5-4 shows the HTML elements that make up a form.

Figure 5-4. Elements of an HTML form


You should keep in mind some key concepts regarding HTML forms.

         Method: Each form uses one submission method, either GET or POST (browsers default to GET if none is specified). It specifies which HTTP method the browser should use while sending form data to the server.

         Action: Each form must have an associated server-side application. The application should be designed to receive data from the various input elements of the form.

         Input elements: Each input element must have a name, which is used by the server-side application for parsing parameters and their values.

         Submit button: Each form must have a Submit button, which is a special type of input element shown as a clickable button by the browser. When the Submit button is clicked, the browser gathers and encodes the user-supplied data from the various form fields and sends it to the server-side application.

Let's take a look at the small login form on the main page of www.acme-art.com as an example. The source code of the form is:

<form method=POST action="/cgi-bin/login.cgi">
<table border=0>
<tr>
<td>Username:</td> <td><input name=user type=text width=20></td>
</tr>
<tr>
<td>Password:</td> <td><input name=pass type=password width=20></td>
</tr>
</table>
<input type=submit value="login">
</form>

The various elements of acme-art.com's login form can be summarized as follows:

| Element | Type | Value |
| --- | --- | --- |
| Method | | POST |
| Action | | http://www.acme-art.com/cgi-bin/login.cgi |
| Input | Text | "user" |
| Input | Password | "pass" |
| Submit button | | "login" |

Input Elements

In the preceding example, we encountered three types of input elements: TEXT, PASSWORD, and SUBMIT. A complete discussion of the various types of HTML form input elements found in the HTML 4.0 draft is available at http://www.w3c.org/. Table 5-4 lists a few commonly used input elements and their uses.
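To see which input elements a given form contains, the page source can be inspected mechanically. The following Python sketch uses the standard html.parser module on a trimmed copy of the acme-art.com login form. The class name FormSummary is a hypothetical helper, and the sketch assumes a page with a single form.

```python
from html.parser import HTMLParser


class FormSummary(HTMLParser):
    """Collect the method, action, and input elements of a form."""

    def __init__(self):
        super().__init__()
        self.method = None
        self.action = None
        self.inputs = []  # (type, name, value) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # Browsers fall back to GET when no method is given.
            self.method = (attrs.get("method") or "GET").upper()
            self.action = attrs.get("action")
        elif tag == "input":
            # TEXT is the default type when the attribute is missing.
            self.inputs.append(
                (attrs.get("type", "text"), attrs.get("name"), attrs.get("value"))
            )


# Trimmed copy of the login form, without the table markup.
LOGIN_FORM = """
<form method=POST action="/cgi-bin/login.cgi">
<input name=user type=text width=20>
<input name=pass type=password width=20>
<input type=submit value="login">
</form>
"""

summary = FormSummary()
summary.feed(LOGIN_FORM)
print(summary.method, summary.action)
for entry in summary.inputs:
    print(entry)
```

The same approach lists HIDDEN fields as readily as visible ones, since they appear in the page source like any other input element.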

Parameter Passing Via GET and POST

We wrap up our discussion of HTML forms with a description of how parameters are passed to server-side application programs via the two HTTP methods GET and POST. To best explain this procedure, we will use an example. Suppose that we have an HTML page called form_elements.html on server 192.168.7.102. The form_elements.html file has two HTML forms on the same page and each form contains the same set of input element fields. The only difference between them is that one form uses the GET method to submit the data and the other uses the POST method. The HTML source code for both these forms is:

<FORM METHOD="GET" ACTION="/cgi-bin/print-query.cgi"><P>
TEXT:<BR> <INPUT TYPE=TEXT NAME=1_text_elem SIZE=20><P>
PASSWORD:<BR> <INPUT TYPE=PASSWORD NAME=2_password_elem SIZE=20><P>
TEXTAREA:<BR> <TEXTAREA ROWS=5 COLS=20 NAME=3_textarea_elem></TEXTAREA><P>
HIDDEN FIELD:<BR> <INPUT TYPE=HIDDEN NAME=4_hidden_elem VALUE="Cant see me">
Yup, it's hidden.<P>
SUBMIT: <BR> <INPUT TYPE=SUBMIT VALUE="GET Method">
</FORM>
<FORM METHOD="POST" ACTION="/cgi-bin/print-query.cgi"><P>
TEXT:<BR> <INPUT TYPE=TEXT NAME=1_text_elem SIZE=20><P>
PASSWORD:<BR> <INPUT TYPE=PASSWORD NAME=2_password_elem SIZE=20><P>
TEXTAREA:<BR> <TEXTAREA ROWS=5 COLS=20 NAME=3_textarea_elem></TEXTAREA><P>
HIDDEN FIELD:<BR> <INPUT TYPE=HIDDEN NAME=4_hidden_elem VALUE="Cant see me">
Yup, it's hidden.<P>
SUBMIT: <BR> <INPUT TYPE=SUBMIT VALUE="POST Method">
</FORM>

Table 5-4. HTML Form Input Elements

| Input Type | Tag | Description |
| --- | --- | --- |
| TEXT | <INPUT TYPE=TEXT> | The default input element. It allows single-line ASCII character input. |
| PASSWORD | <INPUT TYPE=PASSWORD> | Used for entering secret data. The browser displays the contents typed into this field by using asterisks in place of the characters typed. Note that the only security offered is preventing peeping-over-the-shoulder attacks. Internally, the PASSWORD type is no different than the TEXT type. |
| TEXTAREA | <TEXTAREA> </TEXTAREA> | Used for multiline ASCII character input. |
| CHECKBOX | <INPUT TYPE=CHECKBOX> | Displays a check box on the browser. It can be toggled on or off and is used to pass boolean data. |
| RADIO | <INPUT TYPE=RADIO> | Displays a radio button. Only one radio button within a group of radio buttons can be activated at a time. It is used for multiple-choice type of data. |
| SELECT/OPTION | <SELECT> <OPTION> </OPTION> <OPTION> </OPTION> ... </SELECT> | Displays a scrollable list of items and allows for the selection of one or more items from within the list. Each <SELECT> </SELECT> tag set contains one or more <OPTION> tags, one for each option within the list. |
| HIDDEN | <INPUT TYPE=HIDDEN> | Does not display any input element on the browser. However, a hidden field may be used to pass predetermined information or preconfigured parameters to the server-side application program. Hidden fields are hidden only from the browser view. They can be easily spotted while going through the HTML page source. We cover this topic in more detail in Chapter 6. |
| SUBMIT | <INPUT TYPE=SUBMIT> | Displays a form Submit button. It causes the browser to gather the form data and pass it to the server-side application specified in the FORM ACTION= tag. |

Both forms submit data to a program called print-query.cgi. The full URL of this program is http://192.168.7.102/cgi-bin/print-query.cgi. The print-query.cgi file is a simple program that prints all input parameters and values in a table. It also displays the HTTP method used to send the data. The following is the source code for print-query.cgi:

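#!/usr/bin/perl
# print-query.cgi: echo every submitted form parameter in an HTML table.
require "cgi-lib.pl";                    # provides ReadParse()
&ReadParse(*input);                      # parse GET or POST data into %input
print "Content-type: text/html\n\n";     # CGI response header, then a blank line
print "<H1>Input received:</H1>";
print "Form method: $ENV{'REQUEST_METHOD'} \n";   # set by the Web server
print "<table border=1>\n";
# One table row per parameter, sorted by name. Values are printed unescaped.
foreach $param (sort(keys(%input))) {
  print "<tr><td>$param</td><td><pre>$input{$param} </pre></td></tr>\n";
}
print "</table>\n";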

The print-query.cgi program is written in Perl. The function ReadParse() is defined in cgi-lib.pl, which is a Perl library containing standard CGI processing routines. All the parameters sent to print-query.cgi are stored in an associative array called %input. The foreach loop steps through each name and value pair in the associative array, sorted by name, and prints out the parameter name and value. The environment variable REQUEST_METHOD is set by the Web server when it invokes print-query.cgi. In this example, REQUEST_METHOD is either GET or POST.
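For readers who don't use Perl, the same logic can be approximated in Python. This is a sketch under stated assumptions rather than a port of cgi-lib.pl: the function render_query is hypothetical, it takes the method and the raw URL-encoded data as arguments instead of reading them from the CGI environment, and it omits the Content-type response header. Unlike the Perl original, it also HTML-escapes names and values before printing them.

```python
from html import escape
from urllib.parse import parse_qsl


def render_query(method: str, data: str) -> str:
    """Return an HTML fragment listing the submitted parameters, sorted by name."""
    # keep_blank_values retains fields the user left empty.
    params = dict(parse_qsl(data, keep_blank_values=True))
    rows = "".join(
        f"<tr><td>{escape(name)}</td><td><pre>{escape(value)}</pre></td></tr>\n"
        for name, value in sorted(params.items())
    )
    return (
        "<H1>Input received:</H1>"
        f"Form method: {escape(method)}\n"
        f"<table border=1>\n{rows}</table>\n"
    )


print(render_query("GET", "1_text_elem=Jack&2_password_elem=Jill"))
```

In a real CGI program the data argument would come from the query string for a GET request and from standard input for a POST request.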

Let's see what happens if we insert some data in each form in form_elements.html and submit it, first with the GET method and then with the POST method. Figure 5-5 shows what we inserted in the forms.

Figure 5-5. form_elements.html filled with data


Now let's see what happens if we submit the form with the GET request. Figure 5-6 shows the results returned in our browser.

Figure 5-6. Results from print-query.cgi with the GET request


Take a look at the URL bar in the browser shown in Figure 5-6. Even though only part of the URL is displayed, all the data that we inserted was sent via the query string portion of the URL, using URL encoding, to the Web server for interpretation and action. The URL passed to the server looked like this:

http://192.168.7.102/cgi-bin/print-query.cgi?1_text_elem=Jack&2_password_elem=Jill&3_textarea_elem=Jack+and+Jill%0D%0Asent+a%0D%0AGET+request&4_hidden_elem=Cant+see+me

Note that the contents of both the password field, "Jill," and the hidden field, "Cant see me," are visible in plain text on the URL string. The carriage return and line-feed characters in the text area element were encoded as %0D%0A, in accordance with the URL encoding format.
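Anyone who sees this URL, for example in a browser history, a proxy log, or a Referer header, can recover the submitted values with a few lines of code. The following Python sketch decodes the query string using only the standard library.

```python
from urllib.parse import urlsplit, parse_qsl

url = (
    "http://192.168.7.102/cgi-bin/print-query.cgi"
    "?1_text_elem=Jack&2_password_elem=Jill"
    "&3_textarea_elem=Jack+and+Jill%0D%0Asent+a%0D%0AGET+request"
    "&4_hidden_elem=Cant+see+me"
)

# The query string is everything after "?" in the URL.
query = urlsplit(url).query
params = dict(parse_qsl(query))

for name, value in params.items():
    # repr() makes the decoded CR/LF characters visible as \r\n.
    print(f"{name} = {value!r}")
```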

The HTTP request sent by the browser is:

GET /cgi-bin/print-query.cgi?1_text_elem=Jack&2_password_elem=Jill&3_textarea_elem=Jack+and+Jill%0D%0Asent+a%0D%0AGET+request&4_hidden_elem=Cant+see+me HTTP/1.0
Referer: http://192.168.7.102/form_elements.html
Connection: Keep-Alive
User-Agent: Mozilla/4.76 [en] (Windows NT 5.0; U)
Host: 192.168.7.102
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8

What happens when this form is submitted via the POST request? Figure 5-7 shows the results returned by print-query.cgi when we submit the form with the POST request.

Figure 5-7. Results from print-query.cgi with the POST request


When a form is submitted via the POST method, the URL's query string isn't used to pass the parameters. Instead, they're sent in the body of the HTTP request, after the HTTP headers. The HTTP request sent by the browser is:

POST /cgi-bin/print-query.cgi HTTP/1.0
Referer: http://192.168.7.102/form_elements.html
Connection: Keep-Alive
User-Agent: Mozilla/4.76 [en] (Windows NT 5.0; U)
Host: 192.168.7.102
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding: gzip
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
Content-type: application/x-www-form-urlencoded
Content-length: 123
 
1_text_elem=Jack&2_password_elem=Jill&3_textarea_elem=Jack+and+Jill%0D%0Asent+a%0D%0APOST+request&4_hidden_elem=Cant+see+me

When the POST method is used instead of the GET method to request the resource print-query.cgi, the first difference is on the first line of the HTTP request, which names the method and no longer carries a query string. Another difference is that two more HTTP headers are added:

         Content-type

         Content-length

Content-type indicates the type of input content that follows. For forms, the Content-type is mostly application/x-www-form-urlencoded, which means that the input data is encoded by using the standard URL encoding format. Content-length gives the length of the input data in bytes.

The POST method is suited to sending large amounts of input data to the server-side application, because the data doesn't have to fit in the limited space of the URL query string. In this request, Content-length is the last header; a blank line follows it to mark the end of the HTTP headers. What follows the blank line is the input data in URL-encoded format.
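The relationship between the encoded form data, the Content-length header, and the blank line can be checked with a short Python sketch. It rebuilds a reduced version of the POST request shown earlier; only the headers needed to illustrate the layout are included, which is a simplification of what the browser actually sent.

```python
from urllib.parse import urlencode

# The values entered in the POST form of form_elements.html.
fields = [
    ("1_text_elem", "Jack"),
    ("2_password_elem", "Jill"),
    ("3_textarea_elem", "Jack and Jill\r\nsent a\r\nPOST request"),
    ("4_hidden_elem", "Cant see me"),
]

# URL-encode the fields; this becomes the request body.
body = urlencode(fields).encode("ascii")

headers = [
    "POST /cgi-bin/print-query.cgi HTTP/1.0",
    "Host: 192.168.7.102",
    "Content-type: application/x-www-form-urlencoded",
    f"Content-length: {len(body)}",  # length of the body in bytes
]

# Header lines end with CRLF; an empty line separates headers from the body.
request = "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + body
print(request.decode("ascii"))
```

The computed body length is 123 bytes, which matches the Content-length value in the captured request.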

As with the GET request, the contents of the password field and the hidden field are sent in plain text; the only difference is that they aren't sent as a part of the URL.

 



Web Hacking: Attacks and Defense
ISBN: 0201761769
Year: 2005
Pages: 156
