Introduction to Web Programming
Writing an application for the Web is not as simple as writing an application or script that executes on a local machine. Web applications must obey the HTTP protocol, which, by design, is stateless and connectionless. This poses a problem for anything beyond simple programs that submit a form.
To understand the problem, consider the steps in running a normal piece of software from the Tiger desktop (this is a generic fictitious application):
To translate these operations into a web application, however, requires working around the limitations of the HTTP protocol.
Understanding the Stateless Nature of HTTP
When HTTP (Hypertext Transfer Protocol) was developed, the Web was never expected to become the consumer-driven mish-mash that it is today. HTTP was created to be simple and fast. When retrieving a web page, the client performs four actions. It first opens a connection to the remote server. The client then requests a resource from the server and sends form data, if necessary. Next, the client receives the results, and finally, it closes the connection.
This happens repeatedly for different page elements (or, depending on the browser and server, multiple requests can be made in one connection). When the browser has finished downloading data, that data is displayed on the user's screen. At this point in time, there is no connection between the client computer and the server. They have effectively forgotten each other's existence.
If the user clicks a link to visit another page on the server, the same process is repeated. The server has no advance knowledge of who the client is, even though they've just been talking. If you've seen the movie Memento, you'll understand this concept. The HTTP protocol suffers from a severe lack of short-term memory (statelessness).
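The four-step cycle, and the forgetfulness that follows it, can be sketched with Python's standard library. This is only an illustration under stated assumptions: the tiny local server, the EchoHandler class, and the /page1 and /page2 paths are hypothetical stand-ins, not part of any real site.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers each GET and remembers nothing."""

    def do_GET(self):
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.connect()                 # 1. open a connection to the server
    conn.request("GET", path)      # 2. request a resource
    response = conn.getresponse()  # 3. receive the results
    text = response.read().decode()
    conn.close()                   # 4. close the connection
    return text


def demo():
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Two page loads: nothing in the second request ties it to the first.
        return [fetch(port, "/page1"), fetch(port, "/page2")]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Each call to fetch() is a complete conversation. The handler has no variable in which it could recognize a returning visitor, which is exactly the gap that session management has to fill.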
Applying this new knowledge to the steps of using an application, the problems become obvious:
So, how do you work around a protocol that was never designed to keep information between accesses? By employing session management techniques.
Maintaining State Through Session Management
A session, in web-speak, is the equivalent of running an application from start to finish. The goal of session management is to help the web server remember information about a user and what that user has done in previous requests to the server. Using session management techniques, you can quickly create web applications that function like conventional desktop applications. Unfortunately, there is no perfect session management technique. There are several ways to approach the problem, but none offers a completely satisfying solution.
URL Variable Passing
URL variable passing is the simplest form of session management. To make a value available on any number of web pages, you can use the URL to pass information from page to page. For example, suppose that I had a variable, name, with the value of johnray that I wanted to be available even after clicking a link to another portion of the program. I could create links that looked like this:
Each of the three web applications would receive the variable name with the value johnray upon clicking the links. These applications could then pass the values along even further by appending the same information (?name=johnray) to links within themselves. Obviously, this would require the web applications to generate links dynamically, but it's a small price to pay for being able to reliably pass information from page to page.
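This technique relies on the HTTP GET method. When a browser sends a GET request for a web resource, it can append additional data to the request as name=value pairs: a question mark follows the resource name, and the pairs are separated by ampersands, as in ?name1=value1&name2=value2.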
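As a rough sketch of how an application might generate such links and read the values back, the following uses Python's urllib.parse module. The page name page2.cgi and the helper names add_state and read_state are hypothetical, chosen only for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def add_state(link, **variables):
    """Return link with the variables appended as a GET query string."""
    separator = "&" if "?" in link else "?"
    return link + separator + urlencode(variables)


def read_state(url):
    """Recover the variables on the receiving page (first value per name)."""
    query = urlsplit(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


link = add_state("page2.cgi", name="johnray")
print(link)              # page2.cgi?name=johnray
print(read_state(link))  # {'name': 'johnray'}
```

Because urlencode() escapes spaces and punctuation, values that are not plain words still survive the trip through the URL.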
The trouble with this approach is that sending large amounts of data between pages requires extremely long URLs. These look ugly in the browser's URL field, and they invite users to bookmark a URL that captures the state of one particular run of the web application. Some of that state, such as the date or other time-sensitive information, might no longer be valid when the bookmark is used later.
In addition, users can easily modify the URL line of the browser to send back any information to the server that they want. If you've just created a shopping cart application that passes a user's total to a final billing page where it is charged against that user's credit card, it is unlikely that you want him to be able to adjust the price of the merchandise he's purchasing.
Form Variable Passing
Similar to passing variables within a URL (the GET method) is using the POST method of transferring data. Instead of being appended to the URL, the data is sent in the body of the request, so it does not appear in the browser's URL field and cannot be changed simply by editing the URL.
With POST, developers can use hidden form fields to hold values before they are needed. Assume that you have two forms: the first collects a first and last name, and the second collects an email address and phone number. Submitting the first form opens the second form, which, when submitted, saves the data to a file.
Each form could save its data to a file independently, but this is problematic when considering applications in which all data must be present before it can be saved. Session management can be used to ensure that all data is present when the final form is submitted.
For example, assume that the first form looks something like this:
<form action="form2.cgi" method="post">
  First Name: <input type="text" name="first"><br>
  Last Name: <input type="text" name="last"><br>
  <input type="submit">
</form>
This form submits two fields (first and last) to form2.cgi. If the second form must collect an email address and phone number and submit them together with the first and last values, form2.cgi could dynamically create a form that stores the original two values in hidden input fields:
<form action="savedata.cgi" method="post">
  Email Address: <input type="text" name="email"><br>
  Phone Number: <input type="text" name="phone"><br>
  <input type="hidden" name="first" value="first-value">
  <input type="hidden" name="last" value="last-value">
  <input type="submit">
</form>
Submitting this form would make all the field data available to the subsequent page (savedata.cgi).
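The dynamic step performed by form2.cgi might look something like the following sketch. It is written in Python purely for illustration; the function name render_second_form is hypothetical, and a real script would first read first and last from the submitted POST data.

```python
from html import escape


def render_second_form(first, last):
    """Build the second form, carrying the first form's values in hidden fields."""
    # escape(..., quote=True) keeps quotes or angle brackets in the submitted
    # values from breaking out of the value="..." attribute.
    return (
        '<form action="savedata.cgi" method="post">\n'
        '  Email Address: <input type="text" name="email"><br>\n'
        '  Phone Number: <input type="text" name="phone"><br>\n'
        f'  <input type="hidden" name="first" value="{escape(first, quote=True)}">\n'
        f'  <input type="hidden" name="last" value="{escape(last, quote=True)}">\n'
        '  <input type="submit">\n'
        '</form>'
    )


print(render_second_form("John", "Ray"))
```

Escaping the values matters here because they came from the user and are being written back into HTML.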
Unfortunately, the trouble with this approach is that only pages with forms can transfer data between one another. Form variable passing is usually used in conjunction with URL passing to cover all bases.
Data integrity is also an issue with this method because a savvy user could easily save an HTML form locally, edit the hidden field values, and then submit the data from the edited form.
Cookies
Another way to pass information is to use a cookie. Cookies are variable/value pairs that are stored on a user's computer and can be retrieved by the remote web server. Many people are cautious about cookies because of the fear of information being stolen from the cookie without their knowledge. Cookies, however, can be a valuable tool for web developers and users alike.
From the developer's perspective, assigning a cookie is much like setting a variable. You can name the cookie and give it a value and an expiration date and time. That value then remains available to the website regardless of whether the user jumps to another page, retypes the URL, or starts over. Only if the cookie is reassigned or reaches its expiration does the value cease to exist. There is even a special type of cookie expiration that limits a cookie's lifetime to the current browser session. In this case, the values are never written to the client computer's disk and are forgotten when the user quits the browser. Using this special type of expiration, a programmer can create a web application that, after the user quits, leaves no remnants of the login information. This is as close to traditional programming-language variables as a web developer can hope to get.
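A minimal sketch of both kinds of cookie, using Python's standard http.cookies module, is shown here. The helper name make_cookie_header and the one-hour lifetime are assumptions made for the example.

```python
from http.cookies import SimpleCookie


def make_cookie_header(name, value, max_age=None):
    """Return a Set-Cookie header line for the server's response."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"  # send the cookie back for every page on the site
    if max_age is not None:
        # Lifetime in seconds; the browser keeps the cookie until it runs out.
        cookie[name]["max-age"] = max_age
    # With no Max-Age (or Expires) attribute this is a session cookie, which
    # the browser discards when the user quits.
    return cookie.output()


print(make_cookie_header("name", "johnray", max_age=3600))  # lasts one hour
print(make_cookie_header("name", "johnray"))                # session cookie

# On later requests the browser returns the pair in a Cookie header,
# which the server can parse the same way.
returned = SimpleCookie("name=johnray")
print(returned["name"].value)
```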
Cookies are saved to the local computer's drive and can be viewed in many popular browsers. Safari, for example, enables the user to examine stored cookies within the Security Preferences pane, shown in Figure 24.1.
Figure 24.1. Popular browsers, such as Safari, enable the user to browse stored cookies.
Although it is possible to use other techniques for passing information, cookies are the fastest and easiest. Regardless of the technique used to maintain information, two final elements are missing from the big picture: the session database and the session ID. Together they form the Holy Grail of session management: session variables.
A session variable is a variable that can be set to any value, will be accessible by any portion of a web application, and will last only while the web application is being used. In principle, any of the techniques we've looked at so far can do this. Unfortunately, they all fall short when applied to a large system.
For example, imagine that you're passing variables using the URL method:
This works great for one or two variables, but extend it to a few thousand! Suddenly a two- or three-line URL seems short. There is a limit to the amount of data that can be contained within a URL, making this impossible for large amounts of information.
When using cookies or forms to pass data, you aren't necessarily limited by the size of the request string but by the overhead and complexity of the coding. For each variable that must be stored, a hidden field must be added to a form or a cookie sent back to the server. This process must be repeated on every page. This adds up, in terms of transmission time and processing.
Luckily, there is a solution that can be used with any of the approaches to variable passing: the use of a session database and a session ID.
The concept is simple: when a user comes to a website, his session starts. He is assigned a unique ID, called the session ID, by the remote web application. As the user interacts with the website, the web application passes the session ID from page to page. This process can be done using the URL, forms, or cookies. When the web application software wants to store a value, it stores it on the server, in a local database that is keyed to that particular session ID.
For programmers, this is a dream come true. They can store any information they want (including sensitive data), and it is never transmitted over the network. The only piece of data that is visible on the network wire is the session ID.
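The idea can be sketched in a few lines of Python. The SessionStore class below is a hypothetical, in-memory stand-in for a session database, and the stored total is made-up example data. A real system would persist the data and expire old sessions.

```python
import secrets


class SessionStore:
    """In-memory stand-in for a session database, keyed by session ID."""

    def __init__(self):
        self._sessions = {}

    def start(self, session_id=None):
        """Return a valid session ID, issuing a new one when needed."""
        if session_id not in self._sessions:
            session_id = secrets.token_hex(16)  # hard-to-guess random ID
            self._sessions[session_id] = {}
        return session_id

    def variables(self, session_id):
        """Return the dictionary of variables stored for this session."""
        return self._sessions[session_id]


store = SessionStore()
sid = store.start()                      # first visit: a new ID is issued
store.variables(sid)["total"] = "59.95"  # the value stays on the server
same_sid = store.start(sid)              # a later request presents the ID
print(same_sid == sid, store.variables(same_sid)["total"])  # True 59.95
```

Only sid would ever travel between browser and server. An ID the store has never issued is simply replaced with a fresh one, so a visitor cannot invent a session.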
Because a single piece of information can keep track of an unlimited number of variables, the session management system can be written to pass the session ID using URL/form methods or a cookie. Either way is entirely feasible. To make things even easier, developers have included these capabilities in programming languages such as JSP and PHP. For example, in PHP, you can activate session management and store a variable for use on another web page using syntax like this:
<?php
session_start();
// Default to 0 on the first visit, when x is not yet set.
$_SESSION["x"] = ($_SESSION["x"] ?? 0) + 1;
print $_SESSION["x"];
?>
This example uses session_start() to start a new session or resume an existing one; by default, the session ID is stored automatically in a cookie. Next, the variable x is incremented and stored again in the global $_SESSION array. Finally, the value of x is displayed. The result is a web page that displays an increasing count each time a user loads it.
More traditional languages (such as Perl or C) weren't created with web programming in mind. To implement session variables within Perl, you must create, manipulate, and manage session IDs and session databases. This has already been done so many times that a number of prebuilt solutions are available to work with, but none is as elegant as a language designed for the purposes of creating web applications.
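To give a feel for the bookkeeping involved when a language does not supply sessions, here is a rough sketch of the same page counter written by hand. It uses Python rather than Perl, and the cookie name sid, the SESSIONS dictionary, and the handle_request function are all hypothetical.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session database: session ID -> that visitor's variables


def handle_request(cookie_header):
    """Return (extra response headers, page body) for one page load."""
    cookie = SimpleCookie(cookie_header or "")
    session_id = cookie["sid"].value if "sid" in cookie else None
    headers = []
    if session_id not in SESSIONS:
        # Unknown or missing ID: start a new session and hand the ID back.
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {}
        headers.append(("Set-Cookie", f"sid={session_id}; Path=/"))
    session = SESSIONS[session_id]
    session["x"] = session.get("x", 0) + 1  # the counter lives on the server
    return headers, str(session["x"])


# First visit: no cookie yet, so a session is created and the count is 1.
headers, body = handle_request(None)
browser_cookie = headers[0][1].split(";")[0]  # the "sid=..." pair the browser keeps
# Second visit: the browser returns the cookie and the count continues.
print(body, handle_request(browser_cookie)[1])  # 1 2
```

Everything that session_start() and $_SESSION do behind the scenes appears here as explicit code, which is the work a prebuilt session library takes over.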