This section presents the source code of the utility modules imported and used by the page scripts shown above. There aren't any new screen shots to see here, because these are utilities, not CGI scripts (notice their .py extensions). Moreover, these modules aren't all that useful to study in isolation, and are included here primarily to be referenced as you go through the CGI scripts' code. See earlier in this chapter for additional details not repeated here.
13.6.1 External Components
When I install PyMailCgi and other server-side programs shown in this book, I simply upload the contents of the Cgi-Web examples directory on my laptop to the top-level web directory on my server account (public_html). The Cgi-Web directory also lives on this book's CD (see http://examples.oreilly.com/python2), a mirror of the one on my PC. I don't copy the entire book examples distribution to my web server, because code outside the Cgi-Web directory isn't designed to run on a web server.
When I first installed PyMailCgi, however, I ran into a problem: it's written to reuse modules coded in other parts of the book, and hence in other directories outside Cgi-Web. For example, it reuses the mailconfig and pymail modules we wrote in Chapter 11, but neither lives in the CGI examples directory. Such external dependencies are usually okay, provided we use package imports or configure sys.path appropriately on startup. In the context of CGI scripts, though, what lives on my development machine may not be what is available on the web server machine where the scripts are installed.
To work around this (and avoid uploading the full book examples distribution to my web server), I define a directory at the top-level of Cgi-Web called Extern, to which any required external modules are copied as needed. For this system, Extern includes a subdirectory called Email, where the mailconfig and pymail modules are copied for upload to the server.
Redundant copies of files are less than ideal, but this can all be automated with install scripts that automatically copy to Extern and then upload Cgi-Web contents via FTP using Python's ftplib module (discussed in Chapter 11). Just in case I change this structure, though, I've encapsulated all external name accesses in the utility module in Example 13-10.
Example 13-10. PP2E\Internet\Cgi-Web\PyMailCgi\externs.py
##############################################################
# Isolate all imports of modules that live outside of the
# PyMailCgi directory. Normally, these would come
# from PP2E.Internet.Email, but when I install PyMailCgi,
# I copy just the Cgi-Web directory's contents to public_html
# on the server, so there is no PP2E directory on the server.
# Instead, I either copy the imports referenced in this file to
# the PyMailCgi parent directory, or tweak the dir appended to
# the sys.path module search path here. Because all other
# modules get the externals from here, there is only one place
# to change when they are relocated. This may be arguably
# gross, but I only put Internet code on the server machine.
##############################################################

import sys
sys.path.append('..')                # see dir where Email installed on server
from Extern import Email             # assumes a ../Extern dir with Email dir
from Extern.Email import pymail      # can use names Email.pymail or pymail
from Extern.Email import mailconfig
This module appends the parent directory of PyMailCgi to sys.path to make the Extern directory visible (remember, PYTHONPATH might be anything when CGI scripts are run as user "nobody") and preimports all external names needed by PyMailCgi into its own namespace. It also supports future changes; because all external references in PyMailCgi are made through this module, I have to change only this one file if externals are later installed differently.
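The effect of this indirection is easy to try outside a web server. What follows is a minimal sketch in modern Python 3 syntax (the book's listings predate it), not the installed module itself; the temporary directory tree and the `popservername` value written into it are stand-ins invented for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway tree that mimics the server layout:
#   <root>/Extern/Email/mailconfig.py
root = tempfile.mkdtemp()
email_dir = os.path.join(root, 'Extern', 'Email')
os.makedirs(email_dir)
for pkg_dir in (os.path.join(root, 'Extern'), email_dir):
    # empty __init__.py files mark the directories as packages
    open(os.path.join(pkg_dir, '__init__.py'), 'w').close()
with open(os.path.join(email_dir, 'mailconfig.py'), 'w') as f:
    f.write("popservername = 'pop.example.com'   # hypothetical value\n")

# What externs.py does: extend the search path once, import once
sys.path.append(root)
importlib.invalidate_caches()
from Extern.Email import mailconfig

# Client modules would then simply write "from externs import mailconfig"
print(mailconfig.popservername)
```

If the externals move, only the `sys.path.append` argument and the import lines in this one place need to change.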
As a reference, Example 13-11 lists part of the external mailconfig module again. For PyMailCgi, it's copied to Extern, and may be tweaked as desired on the server (for example, the signature string differs slightly in this context). See the pymail.py file in Chapter 11, and consider writing an automatic copy-and-upload script for the Cgi-Web\Extern directory a suggested exercise; it hasn't proved painful enough to compel me to write one of my own.
Example 13-11. PP2E\Internet\Cgi-Web\Extern\Email\mailconfig.py
############################################
# email scripts get server names from here:
# change to reflect your machine/user names;
# could get these in command line instead
############################################

# SMTP email server machine (send)
smtpservername = 'smtp.rmi.net'          # or starship.python.net, 'localhost'

# POP3 email server machine, user (retrieve)
popservername  = 'pop.rmi.net'           # or starship.python.net, 'localhost'
popusername    = 'lutz'                  # password is requested when run

...rest omitted

# personal info used by PyMailGui to fill in forms;
# sig-- can be a triple-quoted block, ignored if empty string;
# addr--used for initial value of "From" field if not empty,

myaddress   = 'lutz@rmi.net'
mysignature = '--Mark Lutz (http://rmi.net/~lutz) [PyMailCgi 1.0]'
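For readers who take up the copy-and-upload exercise, here is one possible starting point, sketched in modern Python 3 under stated assumptions. The function names `copy_externals` and `upload_dir` are hypothetical, and the demonstration copies placeholder files between temporary directories rather than real book modules.

```python
import os
import shutil
import tempfile

def copy_externals(src_dir, extern_dir, names=('mailconfig.py', 'pymail.py')):
    """Copy the named module files from src_dir into extern_dir."""
    os.makedirs(extern_dir, exist_ok=True)
    copied = []
    for name in names:
        shutil.copy(os.path.join(src_dir, name), extern_dir)
        copied.append(name)
    return copied

def upload_dir(ftp, local_dir):
    """Store every plain file in local_dir through an open FTP connection."""
    for name in sorted(os.listdir(local_dir)):
        path = os.path.join(local_dir, name)
        if os.path.isfile(path):
            with open(path, 'rb') as f:
                ftp.storbinary('STOR ' + name, f)

# Demonstrate the copy step with stand-in files
src = tempfile.mkdtemp()
dst = os.path.join(tempfile.mkdtemp(), 'Extern', 'Email')
for name in ('mailconfig.py', 'pymail.py'):
    with open(os.path.join(src, name), 'w') as f:
        f.write('# placeholder\n')
copied = copy_externals(src, dst)
print(copied)
```

Here `upload_dir` expects an already connected and logged-in `ftplib.FTP` object, positioned in the target remote directory; the host name and account details are left to the reader.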
13.6.2 POP Mail Interface
The loadmail utility module in Example 13-12 depends on external files and encapsulates access to mail on the remote POP server machine. It currently exports one function, loadnewmail, which returns a list of all mail in the specified POP account; callers are unaware of whether this mail is fetched over the Net, lives in memory, or is loaded from a persistent storage medium on the CGI server machine. That is by design -- loadmail changes won't impact its clients.
Example 13-12. PP2E\Internet\Cgi-Web\PyMailCgi\loadmail.py
###################################################################
# mail list loader; future--change me to save mail list between
# cgi script runs, to avoid reloading all mail each time; this
# won't impact clients that use the interfaces here if done well;
# for now, to keep this simple, reloads all mail on each operation
###################################################################

from commonhtml import runsilent        # suppress print's (no verbose flag)
from externs    import Email

# load all mail from number 1 up
# this may trigger an exception

def loadnewmail(mailserver, mailuser, mailpswd):
    return runsilent(Email.pymail.loadmessages,
                     (mailserver, mailuser, mailpswd))
It's not much to look at -- just an interface and calls to other modules. The Email.pymail.loadmessages function (reused here from Chapter 11) uses the Python poplib module to fetch mail over sockets. All this activity is wrapped in a commonhtml.runsilent function call to prevent pymail print statements from going to the HTML reply stream (although any pymail exceptions are allowed to propagate normally).
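The silencing idea carries over to current Python with little change. The following is a sketch in Python 3 using the standard `contextlib.redirect_stdout`, not the book's own `runsilent` (which swaps in a dummy object with a `write` method); `chatty_loader` and its arguments are hypothetical stand-ins for a tool that prints status text.

```python
import contextlib
import io

def runsilent(func, args):
    """Call func(*args), discarding anything it prints to stdout."""
    with contextlib.redirect_stdout(io.StringIO()):
        return func(*args)          # exceptions still propagate to the caller

def chatty_loader(server, user):
    # stand-in for a tool like pymail.loadmessages that prints as it works
    print('Connecting to', server, 'as', user)
    return ['message 1', 'message 2']

result = runsilent(chatty_loader, ('pop.example.com', 'someuser'))
print(len(result))
```

Because the redirection lives in a `with` block, standard output is restored even when the wrapped function raises an exception, which mirrors the `try`/`finally` in the original.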
As it is, though, loadmail loads all incoming email to generate the selection list page, and reloads all email again every time you fetch a message from the list. This scheme can be horribly inefficient if you have lots of email sitting on your server; I've noticed delays on the order of a dozen seconds when my mailbox is full. On the other hand, servers can be slow in general, so the extra time taken to reload mail isn't always significant; I've witnessed similar delays on the server for empty mailboxes and simple HTML pages too.
More importantly, loadmail is intended only as a first-cut mail interface -- something of a usable prototype. If I work on this system further, it would be straightforward to cache loaded mail in a file, shelve, or database on the server, for example. Because the interface exported by loadmail would not need to change to introduce a caching mechanism, clients of this module would still work. We'll explore server storage options in the next chapter.
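To make the caching idea concrete, here is a rough sketch in modern Python 3 of a shelve-backed loader that keeps the same call signature as loadnewmail. It is only one way such a cache might look: `make_loader`, `fake_fetch`, and the cache file name are hypothetical, and a real version would also need to notice newly arrived mail and decide when cached entries expire.

```python
import os
import shelve
import tempfile

def make_loader(fetch, cachepath):
    """Return a loadnewmail-style function that caches fetch() results."""
    def loadnewmail(mailserver, mailuser, mailpswd):
        key = '%s@%s' % (mailuser, mailserver)   # password is never stored
        with shelve.open(cachepath) as cache:
            if key not in cache:                 # first call: go to the server
                cache[key] = fetch(mailserver, mailuser, mailpswd)
            return cache[key]
    return loadnewmail

calls = []
def fake_fetch(server, user, pswd):
    # stand-in for Email.pymail.loadmessages; records each real fetch
    calls.append(user)
    return ['From: a@example.com\n\nhello']

cachefile = os.path.join(tempfile.mkdtemp(), 'mailcache')
loadnewmail = make_loader(fake_fetch, cachefile)
first = loadnewmail('pop.example.com', 'someuser', 'xxx')
second = loadnewmail('pop.example.com', 'someuser', 'xxx')
print(first == second, len(calls))
```

Clients still call `loadnewmail(server, user, pswd)` and get a list back, so nothing outside the module has to change.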
13.6.3 POP Password Encryption
Time to call the cops. We discussed the approach to password security adopted by PyMailCgi earlier. In brief, it works hard to avoid ever passing the POP account username and password across the Net together in a single transaction, unless the password is encrypted according to module secret.py on the server. This module can be different everywhere PyMailCgi is installed and can be uploaded anew at any time -- encrypted passwords aren't persistent and live only for the duration of one mail-processing interaction session.[4] Example 13-13 is the encryptor module I installed on my server while developing this book.
[4] Note that there are other ways to handle password security, beyond the custom encryption schemes described in this section. For instance, Python's socket module now supports the server-side portion of the OpenSSL secure sockets protocol. With it, scripts may delegate the security task to web browsers and servers. On the other hand, such schemes do not afford as good an excuse to introduce Python's standard encryption tools in this book.
Example 13-13. PP2E\Internet\Cgi-Web\PyMailCgi\secret.py
###############################################################################
# PyMailCgi encodes the pop password whenever it is sent to/from client over
# the net with a user name as hidden text fields or explicit url params; uses
# encode/decode functions in this module to encrypt the pswd--upload your own
# version of this module to use a different encryption mechanism; pymail also
# doesn't save the password on the server, and doesn't echo pswd as typed, but
# this isn't 100% safe--this module file itself might be vulnerable to some
# malicious users; Note: in Python 1.6, the socket module will include standard
# (but optional) support for openSSL sockets on the server, for programming
# secure Internet transactions in Python; see 1.6 socket module docs;
###############################################################################

forceReadablePassword = 0
forceRotorEncryption  = 1

import time, string
dayofweek = time.localtime(time.time())[6]

###############################################################################
# string encoding schemes
###############################################################################

if not forceReadablePassword:
    # don't do anything by default: the urllib.quote or
    # cgi.escape calls in commonhtml.py will escape the
    # password as needed to embed it in a URL or HTML; the
    # cgi module undoes escapes automatically for us;

    def stringify(old):   return old
    def unstringify(old): return old

else:
    # convert encoded string to/from a string of digit chars,
    # to avoid problems with some special/nonprintable chars,
    # but still leave the result semi-readable (but encrypted);
    # some browsers had problems with escaped ampersands, etc.;

    separator = '-'

    def stringify(old):
        new = ''
        for char in old:
            ascii = str(ord(char))
            new   = new + separator + ascii       # '-ascii-ascii-ascii'
        return new

    def unstringify(old):
        new = ''
        for ascii in string.split(old, separator)[1:]:
            new = new + chr(int(ascii))
        return new

###############################################################################
# encryption schemes
###############################################################################

if (not forceRotorEncryption) and (dayofweek % 2 == 0):
    # use our own scheme on evenly-numbered days (0=monday)
    # caveat: may fail if encode/decode over midnite boundary

    def do_encode(pswd):
        res = ''
        for char in pswd:
            res = res + chr(ord(char) + 1)        # add 1 to each ascii code
        return str(res)

    def do_decode(pswd):
        res = ''
        for char in pswd:
            res = res + chr(ord(char) - 1)
        return res

else:
    # use the standard lib's rotor module to encode pswd
    # this does a better job of encryption than code above

    import rotor
    mykey = 'pymailcgi'

    def do_encode(pswd):
        robj = rotor.newrotor(mykey)              # use enigma encryption
        return robj.encrypt(pswd)

    def do_decode(pswd):
        robj = rotor.newrotor(mykey)
        return robj.decrypt(pswd)

###############################################################################
# top-level entry points
###############################################################################

def encode(pswd):
    return stringify(do_encode(pswd))             # encrypt plus string encode

def decode(pswd):
    return do_decode(unstringify(pswd))
This encryptor module implements two alternative encryption schemes: a simple ASCII character code mapping, and Enigma-style encryption using the standard rotor module. The rotor module implements a sophisticated encryption strategy, based on the "Enigma" encryption machine used by the Nazis to encode messages during World War II. Don't panic, though; Python's rotor module is much less prone to cracking than the Nazis'!
In addition to encryption, this module also implements an encoding method for already-encrypted strings. By default, the encoding functions do nothing, and the system relies on straight URL encoding. An optional encoding scheme translates the encrypted string to a string of ASCII code digits separated by dashes. Either encoding method makes non-printable characters in the encrypted string printable.
13.6.3.1 Default encryption scheme: rotor
To illustrate, let's test this module's tools interactively. First off, we'll experiment with Python's standard rotor module, since it's at the heart of the default encoding scheme. We import the module, make a new rotor object with a key (and optionally, a rotor count), and call methods to encrypt and decrypt:
C:\...\PP2E\Internet\Cgi-Web\PyMailCgi>python
>>> import rotor
>>> r = rotor.newrotor('pymailcgi')      # (key, [,numrotors])
>>> r.encrypt('abc123')                  # may return non-printable chars
' \323an\021\224'
>>> x = r.encrypt('spam123')             # result is same len as input
>>> x
'* _\344\011pY'
>>> len(x)
7
>>> r.decrypt(x)
'spam123'
Notice that the same rotor object can encrypt multiple strings, that the result may contain non-printable characters (printed as ascii escape codes when displayed, possibly in octal form), and that the result is always the same length as the original string. Most importantly, a string encrypted with rotor can be decrypted in a different process (e.g., in a later CGI script) if we recreate the rotor object:
C:\...\PP2E\Internet\Cgi-Web\PyMailCgi>python
>>> import rotor
>>> r = rotor.newrotor('pymailcgi')      # can be decrypted in new process
>>> r.decrypt('* _\344\011pY')           # use "ascii" escapes for two chars
'spam123'
Our secret module by default simply uses rotor to encrypt, and does no additional encoding of its own. It relies on URL encoding when the password is embedded in a URL parameter, and HTML escaping when the password is embedded in hidden form fields. For URLs, the following sorts of calls occur:
>>> from secret import encode, decode
>>> x = encode('abc$#<>&+')              # CGI scripts do this (rotor)
>>> x
' \323a\016\317\326\023\0163'
>>> import urllib                        # urllib.urlencode does this
>>> y = urllib.quote_plus(x)
>>> y
'+%d3a%0e%cf%d6%13%0e3'
>>> a = urllib.unquote_plus(y)           # cgi.FieldStorage does this
>>> a
' \323a\016\317\326\023\0163'
>>> decode(a)                            # CGI scripts do this (rotor)
'abc$#<>&+'
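The same round trip can be tried on current Python, where the quoting calls live in `urllib.parse` and `html.escape` plays the role of `cgi.escape`. This is a sketch under assumptions rather than a port of PyMailCgi: the rotor module is not used, so the encrypted value from the session above is simply written out as a literal, and treating it as Latin-1 text is my choice for the demonstration.

```python
import html
from urllib.parse import quote_plus, unquote_plus

# the rotor output from the session above, written as a Latin-1 text string
encrypted = ' \xd3a\x0e\xcf\xd6\x13\x0e3'

# embedding in a URL query string, then what the CGI input parser undoes
in_url = quote_plus(encrypted, encoding='latin-1')
back = unquote_plus(in_url, encoding='latin-1')

# embedding in a hidden form field: only HTML-special characters change
in_html = html.escape('abc$#<>&+', quote=True)

print(in_url)
print(back == encrypted)
print(in_html)
```

Python 3 emits uppercase hex digits in the escapes, but the quoted text is otherwise the same as the one shown in the session.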
13.6.3.2 Alternative encryption schemes
To show how to write alternative encryptors and encoders, secret also includes a digits-string encoder and a character-code shuffling encryptor; both are enabled with global flag variables at the top of the module:
forceReadablePassword
If set to true, the encrypted password is encoded into a string of ASCII code digits separated by dashes. Defaults to false to fall back on URL and HTML escape encoding.
forceRotorEncryption
If set to false and the encryptor is used on an even-numbered day of the week, the simple character-code encryptor is used instead of rotor. Defaults to true to force rotor encryption.
To show how these alternatives work, let's set forceReadablePassword to 1 and forceRotorEncryption to 0, and reimport. Note that these are global variables that must be set before the module is imported (or reloaded), because they control the selection of alternative def statements. Only one version of each kind of function is ever made by the module:
C:\...\PP2E\Internet\Cgi-Web\PyMailCgi>python
>>> from secret import *
>>> x = encode('abc$#<>&+')
>>> x
'-98-99-100-37-36-61-63-39-44'
>>> y = decode(x)
>>> y
'abc$#<>&+'
This really happens in two steps, though -- encryption and then encoding (the top-level encode and decode functions orchestrate the two steps). Here's what the steps look like when run separately:
>>> t = do_encode('abc$#<>&+')           # just our encryption
>>> t
"bcd%$=?',"
>>> stringify(t)                         # add our own encoding
'-98-99-100-37-36-61-63-39-44'
>>> unstringify(x)                       # undo encoding
"bcd%$=?',"
>>> do_decode(unstringify(x))            # undo both steps
'abc$#<>&+'
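For readers on current Python, the two steps shown above port almost directly. The following is a sketch in Python 3 with the flag logic removed, so that only the alternative branch remains; it is a toy for illustration, not something to rely on for real password protection.

```python
separator = '-'

def do_encode(pswd):
    # toy cipher: shift every character code up by one
    return ''.join(chr(ord(char) + 1) for char in pswd)

def do_decode(pswd):
    return ''.join(chr(ord(char) - 1) for char in pswd)

def stringify(old):
    # '-code-code-code': each character becomes its decimal code
    return ''.join(separator + str(ord(char)) for char in old)

def unstringify(old):
    return ''.join(chr(int(code)) for code in old.split(separator)[1:])

def encode(pswd):
    return stringify(do_encode(pswd))        # encrypt, then string-encode

def decode(pswd):
    return do_decode(unstringify(pswd))      # undo in the reverse order

x = encode('abc$#<>&+')
print(x)
print(decode(x))
```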
This alternative encryption scheme merely adds 1 to each character's ASCII code value, and the encoder inserts the ASCII code integers of the result. It's also possible to combine rotor encryption and our custom encoding (set both forceReadablePassword and forceRotorEncryption to 1), but URL encoding provided by urllib works just as well. Here are a variety of schemes in action; secret.py is edited and saved before each reload:
>>> import secret
>>> secret.encode('spam123')             # default: rotor, no extra encoding
'* _\344\011pY'
>>> reload(secret)                       # forcereadable=1, forcerotor=0
>>> secret.encode('spam123')
'-116-113-98-110-50-51-52'
>>> reload(secret)                       # forcereadable=1, forcerotor=1
>>> secret.encode('spam123')
'-42-32-95-228-9-112-89'
>>> ord('Y')                             # the last one is really a 'Y'
89
>>> reload(secret)                       # back to default rotor, no stringify
>>> import urllib
>>> urllib.quote_plus(secret.encode('spam123'))
'%2a+_%e4%09pY'
>>> 0x2a                                 # the first is really 42, '*'
42
>>> chr(42)
'*'
You can provide any kind of encryption and encoding logic you like in a custom secret.py, as long as it adheres to the expected protocol -- encoders and decoders must receive and return a string. You can also alternate schemes by days of the week as done here (but note that this can fail if your system is being used when the clock turns over at midnight!), and so on. A few final pointers:
Other Python encryption tools
There are additional encryption tools that come with Python or are available for Python on the Web; see http://www.python.org and the library manual for details. Some encryption schemes are considered serious business and may be protected by law from export, but these rules change over time.
Secure sockets support
As mentioned, Python 1.6 (not yet out as I wrote this) will have standard support for OpenSSL secure sockets in the Python socket module. OpenSSL is an open source implementation of the secure sockets protocol (you must fetch and install it separately from Python -- see http://www.openssl.org). Where it can be used, this will provide a better and less limiting solution for securing information like passwords than the manual scheme we've adopted here.
For instance, secure sockets allow usernames and passwords to be entered into and submitted from a single web page, thereby supporting arbitrary mail readers. The best we can do without secure sockets is to either avoid mixing unencrypted user and password values and assume that some account data and encryptors live on the server (as done here), or to have two distinct input pages or URLs (one for each value). Neither scheme is as user-friendly as a secure sockets approach. Most browsers already support SSL; to add it to Python on your server, see the Python 1.6 (and beyond) library manual.
Internet security is a much bigger topic than can be addressed fully here, and we've really only scratched its surface. For additional information on security issues, consult books geared exclusively towards web programming techniques.
13.6.4 Common Utilities Module
The file commonhtml.py, shown in Example 13-14, is the Grand Central Station of this application -- its code is used and reused by just about every other file in the system. Most of it is self-explanatory, and I've already said most of what I wanted to say about it earlier, in conjunction with the CGI scripts that use it.
I haven't talked about its debugging support, though. Notice that this module assigns sys.stderr to sys.stdout, in an attempt to force the text of Python error messages to show up in the client's browser (remember, uncaught exceptions print details to sys.stderr). That works sometimes in PyMailCgi, but not always -- the error text shows up in a web page only if a pageheader call has already printed a response preamble. If you want to see all error messages, make sure you call pageheader (or print Content-type: lines manually) before any other processing. This module also defines functions that dump lots of raw CGI environment information to the browser (dumpstatepage), and that wrap calls to functions that print status messages so their output isn't added to the HTML stream (runsilent).
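The ordering issue is easier to see in a small experiment. The sketch below, in modern Python 3, takes a related but different route from the module's stderr reassignment: it prints the reply preamble first and then reports any exception explicitly in the page. The names `run_page` and `broken_handler` are hypothetical, and the reply is captured in a string here only so it can be inspected.

```python
import contextlib
import html
import io
import traceback

def run_page(handler):
    """Print the reply preamble first, then report any exception in the page."""
    print('Content-type: text/html\n')       # header line plus a blank line
    try:
        handler()
    except Exception:
        # escape the traceback text so '<' and '&' display literally
        print('<pre>%s</pre>' % html.escape(traceback.format_exc()))

def broken_handler():
    # hypothetical page logic that fails
    return 1 / 0

# capture the reply text only to inspect it; a CGI script would not do this
reply = io.StringIO()
with contextlib.redirect_stdout(reply):
    run_page(broken_handler)
text = reply.getvalue()
print(text.startswith('Content-type: text/html\n\n'))
```

Because the preamble always goes out before the handler runs, the browser has a valid response to display whatever happens later.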
I'll leave the discovery of any remaining magic in this code up to you, the reader. You are hereby admonished to go forth and read, refer, and reuse.
Example 13-14. PP2E\Internet\Cgi-Web\PyMailCgi\commonhtml.py
#!/usr/bin/python
#########################################################
# generate standard page header, list, and footer HTML;
# isolates html generation-related details in this file;
# text printed here goes over a socket to the client,
# to create parts of a new web page in the web browser;
# uses one print per line, instead of string blocks;
# uses urllib to escape parms in url links auto from a
# dict, but cgi.escape to put them in html hidden fields;
# some of the tools here are useful outside pymailcgi;
# could also return html generated here instead of
# printing it, so it could be included in other pages;
# could also structure as a single cgi script that gets
# and tests a next action name as a hidden form field;
# caveat: this system works, but was largely written
# during a 2-hour layover at the Chicago O'Hare airport:
# some components could probably use a bit of polishing;
# to run standalone on starship via a commandline, type
# "python commonhtml.py"; to run standalone via a remote
# web browser, rename file with .cgi and run fixcgi.py.
#########################################################

import cgi, urllib, string, sys
sys.stderr = sys.stdout              # show error messages in browser
from externs import mailconfig       # from a package somewhere on server

# my address root
urlroot = 'http://starship.python.net/~lutz/PyMailCgi'

def pageheader(app='PyMailCgi', color='#FFFFFF', kind='main', info=''):
    print 'Content-type: text/html\n'
    print '<html><head><title>%s: %s page (PP2E)</title></head>' % (app, kind)
    print '<body bgcolor="%s"><h1>%s %s</h1><hr>' % (color, app, (info or kind))

def pagefooter(root='pymailcgi.html'):
    print '</p><hr><a href="http://www.python.org">'
    print '<img src="../PyErrata/PythonPoweredSmall.gif" '
    print 'align=left alt="[Python Logo]" border=0 hspace=15></a>'
    print '<a href="%s">Back to root page</a>' % root
    print '</body></html>'

def formatlink(cgiurl, parmdict):
    """
    make "%url?key=val&key=val" query link from a dictionary;
    escapes str() of all key and val with %xx, changes ' ' to +
    note that url escapes are different from html (cgi.escape)
    """
    parmtext = urllib.urlencode(parmdict)         # calls urllib.quote_plus
    return '%s?%s' % (cgiurl, parmtext)           # urllib does all the work

def pagelistsimple(linklist):                     # show simple ordered list
    print '<ol>'
    for (text, cgiurl, parmdict) in linklist:
        link = formatlink(cgiurl, parmdict)
        text = cgi.escape(text)
        print '<li><a href="%s">\n    %s</a>' % (link, text)
    print '</ol>'

def pagelisttable(linklist):                      # show list in a table
    print '<p><table border>'                     # escape text to be safe
    count = 1
    for (text, cgiurl, parmdict) in linklist:
        link = formatlink(cgiurl, parmdict)
        text = cgi.escape(text)
        print '<tr><th><a href="%s">View</a> %d<td>\n %s' % (link, count, text)
        count = count+1
    print '</table>'

def listpage(linkslist, kind='selection list'):
    pageheader(kind=kind)
    pagelisttable(linkslist)         # [('text', 'cgiurl', {'parm':'value'})]
    pagefooter()

def messagearea(headers, text, extra=''):
    print '<table border cellpadding=3>'
    for hdr in ('From', 'To', 'Cc', 'Subject'):
        val = headers.get(hdr, '?')
        val = cgi.escape(val, quote=1)
        print '<tr><th align=right>%s:' % hdr
        print '    <td><input type=text '
        print '    name=%s value="%s" %s size=60>' % (hdr, val, extra)
    print '<tr><th align=right>Text:'
    print '<td><textarea name=text cols=80 rows=10 %s>' % extra
    print '%s\n</textarea></table>' % (cgi.escape(text) or '?')   # if has </>s

def viewpage(msgnum, headers, text, form):
    """
    on View + select (generated link click)
    very subtle thing: at this point, pswd was url encoded in the
    link, and then unencoded by cgi input parser; it's being embedded
    in html here, so we use cgi.escape; this usually sends nonprintable
    chars in the hidden field's html, but works on ie and ns anyhow:
    in url:  ?user=lutz&mnum=3&pswd=%8cg%c2P%1e%f0%5b%c5J%1c%f3&...
    in html: <input type=hidden name=pswd value="...nonprintables..">
    could urllib.quote the html field here too, but must urllib.unquote
    in next script (which precludes passing the inputs in a URL instead
    of the form); can also fall back on numeric string fmt in secret.py
    """
    pageheader(kind='View')
    user, pswd, site = map(cgi.escape, getstandardpopfields(form))
    print '<form method=post action="%s/onViewSubmit.cgi">' % urlroot
    print '<input type=hidden name=mnum value="%s">' % msgnum
    print '<input type=hidden name=user value="%s">' % user     # from page|url
    print '<input type=hidden name=site value="%s">' % site     # for deletes
    print '<input type=hidden name=pswd value="%s">' % pswd     # pswd encoded
    messagearea(headers, text, 'readonly')

    # onViewSubmit.quotetext needs date passed in page
    print '<input type=hidden name=Date value="%s">' % headers.get('Date','?')
    print '<table><th align=right>Action:'
    print '<td><select name=action>'
    print '    <option>Reply<option>Forward<option>Delete</select>'
    print '<input type=submit value="Next">'
    print '</table></form>'                       # no 'reset' needed here
    pagefooter()

def editpage(kind, headers={}, text=''):
    # on Send, View+select+Reply, View+select+Fwd
    pageheader(kind=kind)
    print '<form method=post action="%s/onSendSubmit.cgi">' % urlroot
    if mailconfig.mysignature:
        text = '\n%s\n%s' % (mailconfig.mysignature, text)
    messagearea(headers, text)
    print '<input type=submit value="Send">'
    print '<input type=reset  value="Reset">'
    print '</form>'
    pagefooter()

def errorpage(message):
    pageheader(kind='Error')                      # or sys.exc_type/exc_value
    exc_type, exc_value = sys.exc_info()[:2]      # but safer,thread-specific
    print '<h2>Error Description</h2><p>', message
    print '<h2>Python Exception</h2><p>',  cgi.escape(str(exc_type))
    print '<h2>Exception details</h2><p>', cgi.escape(str(exc_value))
    pagefooter()

def confirmationpage(kind):
    pageheader(kind='Confirmation')
    print '<h2>%s operation was successful</h2>' % kind
    print '<p>Press the link below to return to the main page.</p>'
    pagefooter()

def getfield(form, field, default=''):
    # emulate dictionary get method
    return (form.has_key(field) and form[field].value) or default

def getstandardpopfields(form):
    """
    fields can arrive missing or '' or with a real value
    hard-coded in a url; default to mailconfig settings
    """
    return (getfield(form, 'user', mailconfig.popusername),
            getfield(form, 'pswd', '?'),
            getfield(form, 'site', mailconfig.popservername))

def getstandardsmtpfields(form):
    return getfield(form, 'site', mailconfig.smtpservername)

def runsilent(func, args):
    """
    run a function without writing stdout
    ex: suppress print's in imported tools
    else they go to the client/browser
    """
    class Silent:
        def write(self, line): pass
    save_stdout = sys.stdout
    sys.stdout  = Silent()                        # send print to dummy object
    try:                                          # which has a write method
        result = apply(func, args)                # try to return func result
    finally:                                      # but always restore stdout
        sys.stdout = save_stdout
    return result

def dumpstatepage(exhaustive=0):
    """
    for debugging: call me at top of a cgi to
    generate a new page with cgi state details
    """
    if exhaustive:
        cgi.test()                       # show page with form, environ, etc.
    else:
        pageheader(kind='state dump')
        form = cgi.FieldStorage()        # show just form fields names/values
        cgi.print_form(form)
        pagefooter()
    sys.exit()

def selftest(showastable=0):                      # make phony web page
    links = [                                     # [(text, url, {parms})]
        ('text1', urlroot + '/page1.cgi', {'a':1}),
        ('text2', urlroot + '/page1.cgi', {'a':2, 'b':'3'}),
        ('text3', urlroot + '/page2.cgi', {'x':'a b', 'y':'a<b&c', 'z':'?'}),
        ('te<>4', urlroot + '/page2.cgi', {'<x>':'', 'y':'<a>', 'z':None})]
    pageheader(kind='View')
    if showastable:
        pagelisttable(links)
    else:
        pagelistsimple(links)
    pagefooter()

if __name__ == '__main__':                        # when run, not imported
    selftest(len(sys.argv) > 1)                   # html goes to stdout