
16.6. The Hello World Selector

Let's get back to writing some code again. It's time for something a bit more useful than the examples we've seen so far (well, more entertaining, at least). This section presents a program that displays the basic syntax required by various programming languages to print the string "Hello World," the classic language benchmark.

To keep it simple, this example assumes that the string is printed to the standard output stream in the selected language, not to a GUI or web page. It also gives just the output command itself, not the complete programs. The Python version happens to be a complete program, but we won't hold that against its competitors here.

Structurally, the first cut of this example consists of a main page HTML file, along with a Python-coded CGI script that is invoked by a form in the main HTML page. Because no state or database data is stored between user clicks, this is still a fairly simple example. In fact, the main HTML page implemented by Example 16-17 is mostly just one big pull-down selection list within a form.

Example 16-17. PP3E\Internet\Web\languages.html

    <html><title>Languages</title>
    <body>
    <h1>Hello World selector</h1>
    <P>This demo shows how to display a "hello world" message in various
    programming languages' syntax.  To keep this simple, only the output command
    is shown (it takes more code to make a complete program in some of these
    languages), and only text-based solutions are given (no GUI or HTML
    construction logic is included). This page is a simple HTML file; the one
    you see after pressing the button below is generated by a Python CGI script
    which runs on the server. Pointers:
    <UL>
    <LI>To see this page's HTML, use the 'View Source' command in your browser.
    <LI>To view the Python CGI script on the server,
        <A HREF="cgi-bin/languages-src.py">click here</A> or
        <A HREF="cgi-bin/getfile.py?filename=cgi-bin/languages.py">here</A>.
    <LI>To see an alternative version that generates this page dynamically,
        <A HREF="cgi-bin/languages2.py">click here</A>.
    </UL></P>
    <hr>
    <form method=POST action="cgi-bin/languages.py">
        <P><B>Select a programming language:</B>
        <P><select name=language>
            <option>All
            <option>Python
            <option>Perl
            <option>Tcl
            <option>Scheme
            <option>SmallTalk
            <option>Java
            <option>C
            <option>C++
            <option>Basic
            <option>Fortran
            <option>Pascal
            <option>Other
        </select>
        <P><input type=Submit>
    </form>
    </body></html>

For the moment, let's ignore some of the hyperlinks near the middle of this file; they introduce bigger concepts like file transfers and maintainability that we will explore in the next two sections. When visited with a browser, this HTML file is downloaded to the client and is rendered into the new browser page shown in Figure 16-21.

Figure 16-21. The "Hello World" main page


That widget above the Submit button is a pull-down selection list that lets you choose one of the <option> tag values in the HTML file. As usual, selecting one of these language names and pressing the Submit button at the bottom (or pressing your Enter key) sends the selected language name to an instance of the server-side CGI script program named in the form's action option. Example 16-18 contains the Python script that is run by the web server upon submission.

Example 16-18. PP3E\Internet\Web\cgi-bin\languages.py

    #!/usr/bin/python
    #############################################################################
    # show hello world syntax for input language name; note that it uses r'...'
    # raw strings so that '\n' in the table are left intact, and cgi.escape()
    # on the string so that things like '<<' don't confuse browsers--they are
    # translated to valid HTML code; any language name can arrive at this script,
    # since explicit URLs "http://servername/cgi-bin/languages.py?language=Cobol"
    # can be typed in a web browser or sent by a script (e.g., urllib.urlopen).
    # caveats: the languages list appears in both the CGI and HTML files--could
    # import from single file if selection list generated by a CGI script too;
    #############################################################################

    debugme  = False                                 # True=test from cmd line
    inputkey = 'language'                            # input parameter name

    hellos = {
        'Python':    r" print 'Hello World'               ",
        'Perl':      r' print "Hello World\n";            ',
        'Tcl':       r' puts "Hello World"                ',
        'Scheme':    r' (display "Hello World") (newline) ',
        'SmallTalk': r" 'Hello World' print.              ",
        'Java':      r' System.out.println("Hello World"); ',
        'C':         r' printf("Hello World\n");          ',
        'C++':       r' cout << "Hello World" << endl;    ',
        'Basic':     r' 10 PRINT "Hello World"            ',
        'Fortran':   r" print *, 'Hello World'             ",
        'Pascal':    r" WriteLn('Hello World');            "
    }

    class dummy:                                     # mocked-up input obj
        def __init__(self, str): self.value = str

    import cgi, sys
    if debugme:
        form = {inputkey: dummy(sys.argv[1])}        # name on cmd line
    else:
        form = cgi.FieldStorage()                    # parse real inputs

    print 'Content-type: text/html\n'                # adds blank line
    print '<TITLE>Languages</TITLE>'
    print '<H1>Syntax</H1><HR>'

    def showHello(form):                             # HTML for one language
        choice = form[inputkey].value
        print '<H3>%s</H3><P><PRE>' % choice
        try:
            print cgi.escape(hellos[choice])
        except KeyError:
            print "Sorry--I don't know that language"
        print '</PRE></P><BR>'

    if not form.has_key(inputkey) or form[inputkey].value == 'All':
        for lang in hellos.keys():
            mock = {inputkey: dummy(lang)}
            showHello(mock)
    else:
        showHello(form)
    print '<HR>'

And as usual, this script prints HTML code to the standard output stream to produce a response page in the client's browser. Not much is new to speak of in this script, but it employs a few techniques that merit special focus:


Raw strings and quotes

Notice the use of raw strings (string constants preceded by an "r" character) in the language syntax dictionary. Recall that raw strings retain \ backslash characters in the string literally, instead of interpreting them as string escape-code introductions. Without them, the \n newline character sequences in some of the languages' code snippets would be interpreted by Python as line feeds, instead of being printed in the HTML reply as \n. The code also uses double quotes for strings that embed an unescaped single-quote character, per Python's normal string rules.
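To make the raw-string point concrete, here is a minimal sketch. It is written for Python 3 (the book's listing uses Python 2 print statements), and the variable names plain and raw are illustrative only:

```python
# Compare an ordinary string literal with a raw string literal.
plain = ' printf("Hello World\n"); '     # \n becomes one newline character
raw = r' printf("Hello World\n"); '      # backslash and n stay as two characters

print(len(plain), len(raw))    # the raw string is one character longer
print('\n' in plain)           # True: a real newline is embedded
print('\\n' in raw)            # True: a literal backslash followed by n
print(raw)                     # shows \n as text, which is what the table needs
```

Only the raw version would display the C snippet correctly in the reply page; the plain version would split it across two lines.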


Escaping text embedded in HTML and URLs

This script takes care to format the text of each language's code snippet with the cgi.escape utility function. This standard Python utility automatically translates characters that are special in HTML into HTML escape code sequences, so that they are not treated as HTML operators by browsers. Formally, cgi.escape translates characters to escape code sequences, according to the standard HTML convention: <, >, and & become &lt;, &gt;, and &amp;. If you pass a second true argument, the double-quote character (") is translated to &quot;.

For example, the << left-shift operator in the C++ entry is translated to &lt;&lt;, a pair of HTML escape codes. Because printing each code snippet effectively embeds it in the HTML response stream, we must escape any special HTML characters it contains. HTML parsers (including Python's standard htmllib module) translate escape codes back to the original characters when a page is rendered.
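The translation can be tried interactively. The sketch below assumes Python 3, where the same service is spelled html.escape rather than cgi.escape; passing quote=False mirrors the cgi.escape default of leaving quote characters alone:

```python
import html   # Python 3 home of the escaping function; cgi.escape is the Python 2 spelling

snippet = ' cout << "Hello World" << endl; '

# quote=False translates only <, >, and &
escaped = html.escape(snippet, quote=False)
print(escaped)        #  cout &lt;&lt; "Hello World" &lt;&lt; endl;

# quote=True also translates the double-quote characters to &quot;
print(html.escape(snippet, quote=True))

# html.unescape reverses the translation, much as a browser does on display
print(html.unescape(escaped) == snippet)    # True
```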

More generally, because CGI is based upon the notion of passing formatted strings across the Net, escaping special characters is a ubiquitous operation. CGI scripts almost always need to escape text generated as part of the reply to be safe. For instance, if we send back arbitrary text input from a user or read from a data source on the server, we usually can't be sure whether it will contain HTML characters, so we must escape it just in case.

In later examples, we'll also find that characters inserted into URL address strings generated by our scripts may need to be escaped as well. A literal & in a URL is special, for example, and must be escaped if it appears embedded in text we insert into a URL. However, URL syntax reserves different special characters than HTML code, and so different escaping conventions and tools must be used. As we'll see later in this chapter, cgi.escape implements escape translations in HTML code, but urllib.quote (and its relatives) escapes characters in URL strings.
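As a quick preview of the difference, the following sketch runs the same text through both kinds of escaping. It assumes Python 3, where the URL tools live in urllib.parse instead of urllib; the sample value is made up for illustration:

```python
import html
from urllib.parse import quote_plus, unquote_plus, urlencode

value = 'C++ & friends'          # illustrative text, not from the book's table

print(html.escape(value))        # HTML rules: C++ &amp; friends
print(quote_plus(value))         # URL rules:  C%2B%2B+%26+friends

# urlencode applies URL quoting to each name and value of a query string
query = urlencode({'language': 'C++'})
print('http://localhost/cgi-bin/languages.py?' + query)

# URL escapes are undone by the matching unquote call
print(unquote_plus(quote_plus(value)) == value)    # True
```

Notice that the + characters in C++ must be escaped in a URL, because an unescaped + in a query string is read as a space; HTML escaping leaves them untouched.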


Mocking up form inputs

Here again, form inputs are "mocked up" (simulated), both for debugging and for responding to a request for all languages in the table. If the script's global debugme variable is set to a true value, for instance, the script creates a dictionary that is plug-and-play compatible with the result of a cgi.FieldStorage call: its "language" key references an instance of the dummy mock-up class. This class in turn creates an object that has the same interface as the contents of a cgi.FieldStorage result: it makes an object with a value attribute set to a passed-in string.

The net effect is that we can test this script by running it from the system command line: the generated dictionary fools the script into thinking it was invoked by a browser over the Net. Similarly, if the requested language name is "All," the script iterates over all entries in the languages table, making a mocked-up form dictionary for each (as though the user had requested each language in turn).

This lets us reuse the existing showHello logic to display each language's code in a single page. As always in Python, object interfaces and protocols are what we usually code for, not specific datatypes. The showHello function will happily process any object that responds to the syntax form['language'].value.[*] Notice that we could achieve similar results with a default argument in showHello, albeit at the cost of introducing a special case in its code.

[*] If you are reading closely, you might notice that this is the second time we've used mock-ups in this chapter (see the earlier tutor4.cgi example). If you find this technique generally useful, it would probably make sense to put the dummy class, along with a function for populating a form dictionary on demand, into a module so that it can be reused. In fact, we will do that in the next section. Even for two-line classes like this, typing the same code the third time around will do much to convince you of the power of code reuse.
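The mock-up idea can be sketched in isolation. The version below is a Python 3 approximation, not the book's code: the names Dummy, make_form, and show_hello are hypothetical, the table is abbreviated to two entries, and the function returns its HTML as a string instead of printing it so that it is easy to test:

```python
import html

inputkey = 'language'
hellos = {                                   # two entries copied from the table
    'Python': r" print 'Hello World' ",
    'C':      r' printf("Hello World\n"); ',
}

class Dummy:
    """Stand-in for a parsed form field: it only carries a value attribute."""
    def __init__(self, text):
        self.value = text

def make_form(**fields):
    """Hypothetical helper: build a dictionary that looks like parsed inputs."""
    return {name: Dummy(text) for name, text in fields.items()}

def show_hello(form):
    """Build the HTML for one language from any object supporting form[key].value."""
    choice = form[inputkey].value
    try:
        body = html.escape(hellos[choice], quote=False)
    except KeyError:
        body = "Sorry--I don't know that language"
    # The choice comes from the client, so this sketch escapes it as well
    return '<H3>%s</H3><P><PRE>%s</PRE></P><BR>' % (html.escape(choice), body)

# The "All" case: one mocked-up form per table entry
for lang in hellos:
    print(show_hello(make_form(language=lang)))
```

Because show_hello relies only on indexing and a value attribute, the same function serves real parsed inputs and mocked-up ones alike.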

Now let's get back to interacting with this program. If we select a particular language, our CGI script generates an HTML reply of the following sort (along with the required content-type header and blank line). Use your browser's View Source option to see this:

    <TITLE>Languages</TITLE>
    <H1>Syntax</H1><HR>
    <H3>Scheme</H3><P><PRE>
     (display "Hello World") (newline)
    </PRE></P><BR>
    <HR>

Program code is marked with a <PRE> tag to specify preformatted text (the browser won't reformat it like a normal text paragraph). This reply code shows what we get when we pick Scheme. Figure 16-22 shows the page served up by the script after selecting Python in the pull-down selection list.

Figure 16-22. Response page created by languages.py


Our script also accepts a language name of "All" and interprets it as a request to display the syntax for every language it knows about. For example, here is the HTML that is generated if we set the global variable debugme to true and run from the system command line with a single argument, All. This output is the same as what is printed to the client's web browser in response to an "All" selection:[*]

[*] Interestingly, we also get the "All" reply if debugme is set to False when we run the script from the command line. Instead of throwing an exception, the cgi.FieldStorage call returns an empty dictionary if called outside the CGI environment, so the test for a missing key kicks in. It's likely safer to not rely on this behavior, however.

    C:\...\PP3E\Internet\Web\cgi-bin>python languages.py All
    Content-type: text/html

    <TITLE>Languages</TITLE>
    <H1>Syntax</H1><HR>
    <H3>C</H3><P><PRE>
     printf("Hello World\n");
    </PRE></P><BR>
    <H3>Java</H3><P><PRE>
     System.out.println("Hello World");
    </PRE></P><BR>
    <H3>Python</H3><P><PRE>
     print 'Hello World'
    </PRE></P><BR>
    <H3>Pascal</H3><P><PRE>
     WriteLn('Hello World');
    </PRE></P><BR>
    <H3>C++</H3><P><PRE>
     cout &lt;&lt; "Hello World" &lt;&lt; endl;
    </PRE></P><BR>
    <H3>Perl</H3><P><PRE>
     print "Hello World\n";
    </PRE></P><BR>
    <H3>Fortran</H3><P><PRE>
     print *, 'Hello World'
    </PRE></P><BR>
    <H3>Tcl</H3><P><PRE>
     puts "Hello World"
    </PRE></P><BR>
    <H3>Basic</H3><P><PRE>
     10 PRINT "Hello World"
    </PRE></P><BR>
    <H3>Scheme</H3><P><PRE>
     (display "Hello World") (newline)
    </PRE></P><BR>
    <H3>SmallTalk</H3><P><PRE>
     'Hello World' print.
    </PRE></P><BR>
    <HR>

Each language is represented here with the same code pattern: the showHello function is called for each table entry, along with a mocked-up form object. Notice the way that C++ code is escaped for embedding inside the HTML stream; this is the cgi.escape call's handiwork. Your web browser translates the &lt; escapes to < characters when the page is rendered. When viewed with a browser, the "All" response page is rendered as shown in Figure 16-23.

Figure 16-23. Response page for "All" languages choice


16.6.1. Checking for Missing and Invalid Inputs

So far, we've been triggering the CGI script by selecting a language name from the pull-down list in the main HTML page. In this context, we can be fairly sure that the script will receive valid inputs. Notice, though, that there is nothing to prevent a client from passing the requested language name at the end of the CGI script's URL as an explicit query parameter, instead of using the HTML page form. For instance, a URL of the following kind typed into a browser's address field or submitted with the urllib module:

 http://localhost/cgi-bin/languages.py?language=Python 

yields the same "Python" response page shown in Figure 16-22. However, because it's always possible for a user to bypass the HTML file and use an explicit URL, a user could invoke our script with an unknown language name, one that is not in the HTML file's pull-down list (and so not in our script's table). In fact, the script might be triggered with no language input at all if someone explicitly submits its URL with no language parameter (or no parameter value) at the end. Such an erroneous URL could be entered into a browser's address field, or be sent by another script using the urllib module techniques described earlier in this chapter:

    >>> from urllib import urlopen
    >>> request = 'http://localhost/cgi-bin/languages.py?language=Python'
    >>> reply = urlopen(request).read()
    >>> print reply
    <TITLE>Languages</TITLE>
    <H1>Syntax</H1><HR>
    <H3>Python</H3><P><PRE>
     print 'Hello World'
    </PRE></P><BR>
    <HR>

To be robust, the script checks for both cases explicitly, as all CGI scripts generally should. For instance, here is the HTML generated in response to a request for the fictitious language GuiDO (you can also see this by selecting your browser's View Source option, after typing the URL manually into your browser's address field):

    >>> request = 'http://localhost/cgi-bin/languages.py?language=GuiDO'
    >>> reply = urlopen(request).read()
    >>> print reply
    <TITLE>Languages</TITLE>
    <H1>Syntax</H1><HR>
    <H3>GuiDO</H3><P><PRE>
    Sorry--I don't know that language
    </PRE></P><BR>
    <HR>

If the script doesn't receive any language name input, it simply defaults to the "All" case (this can also be triggered if the URL ends with just ?language= and no language name value):

    >>> reply = urlopen('http://localhost/cgi-bin/languages.py').read()
    >>> print reply
    <TITLE>Languages</TITLE>
    <H1>Syntax</H1><HR>
    <H3>C</H3><P><PRE>
     printf("Hello World\n");
    </PRE></P><BR>
    <H3>Java</H3><P><PRE>
     System.out.println("Hello World");
    </PRE></P><BR>
    <H3>Python</H3><P><PRE>
     print 'Hello World'
    ...more...
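The three outcomes (known name, unknown name, and missing or empty input) can be summarized without a running server. This is only a sketch under stated assumptions: it uses Python 3's urllib.parse.parse_qs in place of cgi.FieldStorage, the classify function and the abbreviated known list are hypothetical, and it relies on parse_qs dropping blank values by default:

```python
from urllib.parse import urlsplit, parse_qs

known = ['Python', 'Perl', 'C']      # abbreviated stand-in for the script's table

def classify(url):
    """Hypothetical helper: report how a request URL would be handled."""
    params = parse_qs(urlsplit(url).query)    # blank values are dropped by default
    names = params.get('language')
    if not names or names[0] == 'All':
        return 'all'                          # missing or empty input: show everything
    if names[0] in known:
        return 'known'                        # show the one requested language
    return 'unknown'                          # e.g. GuiDO: reply with the "Sorry" message

base = 'http://localhost/cgi-bin/languages.py'
for tail in ['?language=Python', '?language=GuiDO', '?language=', '']:
    print(repr(tail), '->', classify(base + tail))
```

The point of the design is that every path yields a complete reply page; no input can push the script into an unhandled exception.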

If we didn't detect these cases, chances are that our script would silently die on a Python exception and leave the user with a mostly useless half-complete page or with a default error page (we didn't assign stderr to stdout here, so no Python error message would be displayed). Figure 16-24 shows the page generated if the script is invoked with an explicit URL like this:

 http://localhost/cgi-bin/languages.py?language=COBOL 

Figure 16-24. Response page for unknown language


To test this error case interactively, the pull-down list includes an "Other" name, which produces a similar error page reply. Adding code to the script's table for the COBOL "Hello World" program is left as an exercise for the reader.




Programming Python
ISBN: 0596009259
Year: 2004
Pages: 270
Authors: Mark Lutz