16.2. What's a Server-Side CGI Script?
Simply put, CGI scripts implement much of the interaction you typically experience on the Web. They are a standard and widely used mechanism for programming web-based systems and web site interaction. There are other ways to add interactive behavior to web sites with Python, including client-side solutions (e.g., Jython applets and Active Scripting), as well as server-side technologies that build on the basic CGI model (e.g., Python Server Pages, and the Zope, Webware, CherryPy, and Django frameworks). We will discuss such alternatives in Chapter 18.
But by and large, CGI server-side scripts are used to program much of the activity on the Web. They are perhaps the most primitive approach to implementing web sites, and they do not by themselves offer the tools that are often built into larger frameworks. CGI scripts, however, are in many ways the simplest technique for server-side scripting. As a result, they are an ideal way to get started with programming on the server side of the Web. Especially for simpler sites that do not require enterprise-level tools, CGI is sufficient, and can be augmented with additional libraries as needed.
16.2.1. The Script Behind the Curtain
Formally speaking, CGI scripts are programs that run on a server machine and adhere to the Common Gateway Interfacea model for browser/server communications, from which CGI scripts take their name. CGI is an application protocol that web servers use to transfer input data and results between web browsers and server-side scripts. Perhaps a more useful way to understand CGI, though, is in terms of the interaction it implies.
Most people take this interaction for granted when browsing the Web and pressing buttons in web pages, but a lot is going on behind the scenes of every transaction on the Web. From the perspective of a user, it's a fairly familiar and simple process:
And, believe it or not, that simple model is what makes most of the Web hum. But internally, it's a bit more complex. In fact, a subtle client/server socket-based architecture is at workyour web browser running on your computer is the client, and the computer you contact over the Web is the server. Let's examine the interaction scenario again, with all the gory details that users usually never see:
In other words, CGI scripts are something like callback handlers for requests generated by web browsers that require a program to be run dynamically. They are automatically run on the server machine in response to actions in a browser. Although CGI scripts ultimately receive and send standard structured messages over sockets, CGI is more like a higher-level procedural convention for sending and receiving information between a browser and a server.
16.2.2. Writing CGI Scripts in Python
If all of this sounds complicated, relaxPython, as well as the resident HTTP server, automates most of the tricky bits. CGI scripts are written as fairly autonomous programs, and they assume that startup tasks have already been accomplished. The HTTP web server program, not the CGI script, implements the server side of the HTTP protocol itself. Moreover, Python's library modules automatically dissect information sent up from the browser and give it to the CGI script in an easily digested form. The upshot is that CGI scripts may focus on application details like processing input data and producing a result page.
As mentioned earlier, in the context of CGI scripts, the stdin and stdout streams are automatically tied to sockets connected to the browser. In addition, the HTTP server passes some browser information to the CGI script in the form of shell environment variables, and possibly command-line arguments. To CGI programmers, that means:
The most complex parts of this scheme include parsing all the input information sent up from the browser and formatting information in the reply sent back. Happily, Python's standard library largely automates both tasks:
We'll study both of these interfaces in detail later in this chapter. For now, keep in mind that although any language can be used to write CGI scripts, Python's standard modules and language attributes make it a snap.
Perhaps less happily, CGI scripts are also intimately tied to the syntax of HTML, since they must generate it to create a reply page. In fact, it can be said that Python CGI scripts embed HTML, which is an entirely distinct language in its own right.[*] As we'll also see, the fact that CGI scripts create a user interface by printing HTML syntax means that we have to take special care with the text we insert into a web page's code (e.g., escaping HTML operators). Worse, CGI scripts require at least a cursory knowledge of HTML forms, since that is where the inputs and target script's address are typically specified.
This book won't teach HTML in depth; if you find yourself puzzled by some of the arcane syntax of the HTML generated by scripts here, you should glance at an HTML introduction, such as HTML and XHTML: The Definitive Guide. Also keep in mind that higher-level tools and frameworks can sometimes hide the details of HTML generation from Python programmers; with the HTMLgen extension we'll meet in Chapter 18, for instance, it's possible to deal in Python objects, not HTML syntax.