17.5. Reading POP Email

So far, we've stepped through the path the system follows to send new mail. Let's now see what happens when we try to view incoming POP mail.

17.5.1. The POP Password Page

If you flip back to the main page in Figure 17-2, you'll see a View link; pressing it triggers the script in Example 17-6 to run on the server.

Example 17-6. PP3E\Internet\Web\PyMailCgi\cgi-bin\onRootViewLink.py

 #!/usr/bin/python
 ##############################################################
 # on view link click on main/root HTML page
 # this could almost be an HTML file because there are likely
 # no input params yet, but I wanted to use standard header/
 # footer functions and display the site/usernames which must
 # be fetched;  On submission, doesn't send the user along with
 # password here, and only ever sends both as URL params or
 # hidden fields after the password has been encrypted by a
 # user-uploadable encryption module; put HTML in commonhtml?
 ##############################################################

 # page template

 pswdhtml = """
 <form method=post action=%sonViewPswdSubmit.py>
 <p>
 Please enter POP account password below, for user "%s" and site "%s".
 <p><input name=pswd type=password>
 <input type=submit value="Submit"></form></p>

 <hr><p><i>Security note</i>: The password you enter above will be transmitted
 over the Internet to the server machine, but is not displayed, is never
 transmitted in combination with a username unless it is encrypted, and is
 never stored anywhere: not on the server (it is only passed along as hidden
 fields in subsequent pages), and not on the client (no cookies are generated).
 This is still not guaranteed to be totally safe; use your browser's back button
 to back out of PyMailCgi at any time.</p>
 """

 # generate the password input page

 import commonhtml                                         # usual parms case:
 user, pswd, site = commonhtml.getstandardpopfields({})    # from module here,
 commonhtml.pageheader(kind='POP password input')          # from html|url later
 print pswdhtml % (commonhtml.urlroot, user, site)
 commonhtml.pagefooter( )

This script is almost all embedded HTML: the triple-quoted pswdhtml string is printed, with string formatting to insert values, in a single step. But because we need to fetch the username and server name to display on the generated page, this is coded as an executable script, not as a static HTML file. The module commonhtml either loads usernames and server names from script inputs (e.g., appended as query parameters to the script's URL) or imports them from the mailconfig file; either way, we don't want to hardcode them into this script or its HTML, so a simple HTML file won't do.
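To make the template-plus-formatting idea concrete, here is a minimal sketch written for Python 3 rather than the Python 2 used in this book's listings. The names PSWD_TEMPLATE and render_password_page are hypothetical stand-ins, not part of commonhtml, and the sample user and site values are passed in directly instead of being fetched from mailconfig.

```python
# Python 3 sketch: a page template filled in with string formatting.
# PSWD_TEMPLATE and render_password_page are illustrative names only.
from html import escape

PSWD_TEMPLATE = """
<form method=post action="%sonViewPswdSubmit.py">
<p>Please enter POP account password below, for user "%s" and site "%s".
<p><input name=pswd type=password>
<input type=submit value="Submit"></form>
"""

def render_password_page(urlroot, user, site):
    """Fill the template's three %s slots, HTML-escaping each inserted value."""
    return PSWD_TEMPLATE % (escape(urlroot), escape(user), escape(site))

# An empty urlroot yields a relative URL in the form's action attribute.
page = render_password_page("", "pp3e", "pop.earthlink.net")
print(page)
```

Because the values are inserted at run time, the same template serves any account; that flexibility is what a static HTML file cannot offer.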

Since this is a script, we can also use the commonhtml page header and footer routines to render the generated reply page with a common look-and-feel, as shown in Figure 17-7.

Figure 17-7. PyMailCGI view password login page


At this page, the user is expected to enter the password for the POP email account of the user and server displayed. Notice that the actual password isn't displayed; the input field's HTML specifies type=password, which works just like a normal text field, but shows typed input as stars. (See also the pymail program in Chapter 14 for doing this at a console and PyMailGUI in Chapter 15 for doing this in a GUI.)

17.5.2. The Mail Selection List Page

After you fill out the last page's password field and press its Submit button, the password is shipped off to the script shown in Example 17-7.

Example 17-7. PP3E\Internet\Web\PyMailCgi\cgi-bin\onViewPswdSubmit.py

 #!/usr/bin/python
 ############################################################
 # On submit in POP password input window--make view list;
 # in 2.0 we only fetch mail headers here, and fetch 1 full
 # message later upon request; we still fetch all headers
 # each time the index page is made: caching requires a db;
 ############################################################

 import cgi
 import loadmail, commonhtml
 from   externs import mailtools
 from   secret  import encode       # user-defined encoder module
 MaxHdr = 35                        # max length of email hdrs in list

 # only pswd comes from page here, rest usually in module
 formdata = cgi.FieldStorage( )
 mailuser, mailpswd, mailsite = commonhtml.getstandardpopfields(formdata)

 try:
     newmails = loadmail.loadmailhdrs(mailsite, mailuser, mailpswd)
     mailnum  = 1
     maillist = []
     for mail in newmails:                                     # list of hdr text
         msginfo = []
         hdrs = mailtools.MailParser( ).parseHeaders(mail)    # email.Message
         for key in ('Subject', 'From', 'Date'):
             msginfo.append(hdrs.get(key, '?')[:MaxHdr])
         msginfo = ' | '.join(msginfo)
         maillist.append((msginfo, commonhtml.urlroot + 'onViewListLink.py',
                                       {'mnum': mailnum,
                                        'user': mailuser,          # data params
                                        'pswd': encode(mailpswd),  # pass in URL
                                        'site': mailsite}))        # not inputs
         mailnum += 1
     commonhtml.listpage(maillist, 'mail selection list')
 except:
     commonhtml.errorpage('Error loading mail index')

This script's main purpose is to generate a selection list page for the user's email account, using the password typed into the prior page (or passed in a URL). As usual with encapsulation, most of the details are hidden in other files:


loadmail.loadmailhdrs

Reuses the mailtools module package from Chapter 14 to fetch email with the POP protocol; we need a message count and mail headers here to display an index list. In this version, the software fetches only mail header text to save time, not full mail messages (provided your server supports the TOP command of the POP interface; if not, see mailconfig to disable this).


commonhtml.listpage

Generates HTML to display a passed-in list of tuples (text, URL, parameter-dictionary) as a list of hyperlinks in the reply page; parameter values show up as query parameters at the end of URLs in the response.

The maillist list built here is used to create the body of the next page: a clickable email message selection list. Each generated hyperlink in the list page references a constructed URL that contains enough information for the next script to fetch and display a particular email message. As we learned in the last chapter, this is a simple kind of state retention between pages and scripts.
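The following Python 3 sketch suggests how a list of (text, URL, parameter-dictionary) tuples might be turned into hyperlink rows. It is not the commonhtml.listpage code (that module is listed later in the chapter); list_rows is a hypothetical name, and unlike the book's version this sketch also HTML-escapes the finished URL, so & separators appear as &amp; in the page source.

```python
# Python 3 sketch: (text, url, params) tuples become rows of "View" links.
# list_rows is a hypothetical helper, not the book's commonhtml.listpage.
from html import escape
from urllib.parse import urlencode

def list_rows(linklist):
    """Return one HTML table row per (text, url, params) tuple."""
    rows = []
    for number, (text, url, params) in enumerate(linklist, start=1):
        href = url + "?" + urlencode(params)        # URL-escape the parameters
        rows.append('<tr><th><a href="%s">View</a> %d<td>%s' %
                    (escape(href), number, escape(text)))  # then HTML-escape
    return rows

maillist = [("Hello | lutz@rmi.net", "onViewListLink.py",
             {"mnum": 1, "user": "pp3e", "site": "pop.earthlink.net"})]
rows = list_rows(maillist)
print("\n".join(rows))
```

Each row carries everything the next script needs in its href, which is the state-retention trick described above.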

If all goes well, the mail selection list page HTML generated by this script is rendered as in Figure 17-8. If your inbox is as large as some of mine, you'll probably need to scroll down to see the end of this page. This page follows the common look-and-feel for all PyMailCGI pages, thanks to commonhtml.

Figure 17-8. PyMailCGI view selection list page, top


If the script can't access your email account (e.g., because you typed the wrong password), its try statement handler instead produces a commonly formatted error page. Figure 17-9 shows one that gives the Python exception and details as part of the reply after a genuine exception is caught; as usual, the exception details are fetched from sys.exc_info, and Python's traceback module is used to generate a stack trace.
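As a rough illustration of how an error page can report a caught exception, the Python 3 sketch below pulls details from sys.exc_info and formats a stack trace with the traceback module. The function name error_page_html and its HTML layout are assumptions made for this example; the real commonhtml.errorpage appears later in the chapter.

```python
# Python 3 sketch: build an error reply from the exception being handled.
# error_page_html and its page layout are hypothetical.
import sys
import traceback
from html import escape

def error_page_html(message):
    """Return HTML describing the exception currently being handled."""
    exc_type, exc_value, exc_tb = sys.exc_info()
    trace = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    return ("<h2>%s</h2>\n<p>Exception type: %s</p>\n<p>Details: %s</p>\n"
            "<pre>%s</pre>" % (escape(message), escape(str(exc_type)),
                               escape(str(exc_value)), escape(trace)))

try:
    raise ValueError("bad password")        # stands in for a failed POP login
except Exception:
    reply = error_page_html("Error loading mail index")

print(reply)
```

Note that the trace text is escaped like any other text added to the reply, since exception messages may contain characters that are special in HTML.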

Figure 17-9. PyMailCGI login error page


17.5.3. Passing State Information in URL Link Parameters

The central mechanism at work in Example 17-7 is the generation of URLs that embed message numbers and mail account information. Clicking on any of the View links in the selection list triggers another script, which uses information in the link's URL parameters to fetch and display the selected email. As mentioned in Chapter 16, because the list's links are programmed to "know" how to load a particular message, they effectively remember what to do next. Figure 17-10 shows part of the HTML generated by this script (use your web browser's View Source option to see this for yourself).

Figure 17-10. PyMailCGI view list, generated HTML


Did you get all that? You may not be able to read generated HTML like this, but your browser can. For the sake of readers afflicted with human-parsing limitations, here is what one of those link lines looks like, reformatted with line breaks and spaces to make it easier to understand:

 <tr><th><a href="onViewListLink.py?
                     pswd=%C5%D9%E3%E1%A5%2C%C0D%A4u%AD%A3d%96%7EB&
                     mnum=8&
                     user=pp3e&
                     site=pop.earthlink.net">View</a> 8
 <td>America MP3 file | lutz@rmi.net | Wed, 18 Jan 2006 15:45:10 -0500 (ES

PyMailCGI generates minimal, relative URLs (server and pathname values come from the prior page, unless set in commonhtml). Clicking on the word View in the hyperlink rendered from this HTML code triggers the onViewListLink script as usual, passing it all the parameters embedded at the end of the URL: the POP username, the POP message number of the message associated with this link, and the POP password and site information. These values will be available in the object returned by cgi.FieldStorage in the next script run. Note that the mnum POP message number parameter differs in each link because each opens a different message when clicked, and that the text after <td> comes from message headers extracted by the mailtools package, using the email package.

The commonhtml module escapes all of the link parameters with the urllib module, not cgi.escape, because they are part of a URL. This is obvious only in the pswd password parameter: its value has been encrypted, but urllib additionally escapes nonsafe characters in the encrypted string per URL convention (that's where all the %xx characters come from). It's OK if the encryptor yields odd (even nonprintable) characters because URL encoding makes them legible for transmission. When the password reaches the next script, cgi.FieldStorage undoes URL escape sequences, leaving the encrypted password string without % escapes.
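The escape-then-unescape round trip can be tried in isolation. This Python 3 sketch uses urllib.parse.quote_plus on the sending side and urllib.parse.parse_qs as a stand-in for cgi.FieldStorage on the receiving side. The sample "encrypted" string is made up, and the latin-1 encoding is an assumption chosen so that each character maps to a single byte.

```python
# Python 3 sketch: URL-escape an odd string, then recover it on arrival.
# parse_qs stands in for cgi.FieldStorage; the sample string is invented.
from urllib.parse import quote_plus, parse_qs

encrypted = "\xc5\xd9\xe3 odd&chars=?"                 # pretend encoder output
escaped = quote_plus(encrypted, encoding="latin-1")    # safe to embed in a URL
query = "pswd=" + escaped + "&mnum=8"                  # text after the ? marker

fields = parse_qs(query, encoding="latin-1")           # undoes %xx and + escapes
print(escaped)
print(fields["pswd"][0] == encrypted)
```

The receiving script never sees the % escapes; it gets back the same string that the sender escaped, ready to be decrypted.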

It's instructive to see how commonhtml builds up the stateful link parameters. Earlier, we learned how to use the urllib.quote_plus call to escape a string for inclusion in URLs:

 >>> import urllib
 >>> urllib.quote_plus("There's bugger all down here on Earth")
 'There%27s+bugger+all+down+here+on+Earth'

The module commonhtml, though, calls the higher-level urllib.urlencode function, which translates a dictionary of name:value pairs into a complete URL query parameter string, ready to add after a ? marker in a URL. For instance, here is urlencode in action at the interactive prompt:

 >>> parmdict = {'user': 'Brian',
 ...             'pswd': '#!/spam',
 ...             'text': 'Say no more, squire!'}
 >>> urllib.urlencode(parmdict)
 'pswd=%23%21/spam&user=Brian&text=Say+no+more,+squire%21'
 >>> "%s?%s" % ("http://scriptname.py", urllib.urlencode(parmdict))
 'http://scriptname.py?pswd=%23%21/spam&user=Brian&text=Say+no+more,+squire%21'

Internally, urlencode passes each name and value in the dictionary to the built-in str function (to make sure they are strings), and then runs each one through urllib.quote_plus as they are added to the result. The CGI script builds up a list of similar dictionaries and passes it to commonhtml to be formatted into a selection list page.[*]

[*] Technically, again, you should generally escape & separators in generated URL links by running the URL through cgi.escape, if any parameter's name could be the same as that of an HTML character escape code (e.g., &amp=high). See Chapter 16 for more details; they aren't escaped here because there are no clashes between URL and HTML.
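The description of what urlencode does internally can be checked with a hand-built equivalent. The sketch below is written for Python 3, where these functions live in urllib.parse; urlencode_by_hand is a hypothetical name. The exact escapes it produces may differ from the Python 2 session shown earlier, because the set of characters treated as safe is not the same across versions.

```python
# Python 3 sketch: rebuild urlencode from str() and quote_plus().
# urlencode_by_hand is an illustrative name, not a library function.
from urllib.parse import quote_plus, urlencode

def urlencode_by_hand(params):
    """Join name=value pairs with &, escaping each name and value."""
    return "&".join(quote_plus(str(name)) + "=" + quote_plus(str(value))
                    for name, value in params.items())

parmdict = {"user": "Brian", "pswd": "#!/spam", "text": "Say no more, squire!"}
by_hand = urlencode_by_hand(parmdict)

print(by_hand)
print(by_hand == urlencode(parmdict))       # compare with the library call
```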

In broader terms, generating URLs with parameters like this is one way to pass state information to the next script (along with cookies, hidden form input fields, and server databases, discussed in Chapter 16). Without such state information, users would have to reenter the username, password, and site name on every page they visit along the way.

Incidentally, the list generated by this script is not radically different in functionality from what we built in the PyMailGUI program in Chapter 15, though the two differ cosmetically. Figure 17-11 shows this strictly client-side GUI's view on the same email list displayed in Figure 17-8.

Figure 17-11. PyMailGUI displaying the same view list


However, PyMailGUI uses the Tkinter GUI library to build up a user interface instead of sending HTML to a browser. It also runs entirely on the client and downloads mail from the POP server directly to the client machine over sockets on demand. Because it retains memory for the duration of the session, PyMailGUI can easily minimize mail server access. After the initial header load, it needs to load only newly arrived email headers on load requests. Moreover, it can update its email index in-memory on deletions instead of reloading anew from the server, and it has enough state to perform safe deletions of messages that check for server inbox matches. PyMailGUI also remembers emails you've already viewed; they need not be reloaded again while the program runs.

In contrast, PyMailCGI runs on the web server machine and simply displays mail text on the client's browser; mail is downloaded from the POP server machine to the web server, where CGI scripts are run. Due to the autonomous nature of CGI scripts, PyMailCGI by itself has no automatic memory that spans pages and may need to reload headers and already viewed messages during a single session. These architecture differences have some important ramifications, which we'll discuss later in this chapter.

17.5.4. Security Protocols

In onViewPswdSubmit's source code (Example 17-7), notice that password inputs are passed to an encode function as they are added to the parameters dictionary; hence they show up encrypted in hyperlinked URLs. They are also URL encoded for transmission (with % escapes) and are later decoded and decrypted within other scripts as needed to access the POP account. The password encryption step, encode, is at the heart of PyMailCGI's security policy.

In Python today, the standard library's socket module supports Secure Sockets Layer (SSL), if the required library is built into your Python. SSL automatically encrypts transmitted data to make it safe to pass over the Net. Unfortunately, for reasons we'll discuss when we reach the secret.py module later in this chapter (see Example 17-13), this wasn't a universal solution for PyMailCGI's password data. (In short, the web server we're using doesn't directly support its end of a secure HTTP encrypted dialog.) Because of that, an alternative scheme was devised to minimize the chance that email account information could be stolen off the Net in transit.

Here's how it works. When this script is invoked by the password input page's form, it gets only one input parameter: the password typed into the form. The username is imported from a mailconfig module installed on the server; it is not transmitted together with the unencrypted password (that could be intercepted).

To pass the POP username and password to the next page as state information, this script adds them to the end of the mail selection list URLs, but only after the password has been encrypted by secret.encode, a function in a module that lives on the server and may vary in every location that PyMailCGI is installed. In fact, PyMailCGI was written to not have to know about the password encryptor at all; because the encoder is a separate module, you can provide any flavor you like. Unless you also publish your encoder module, the encoded password shipped with the username won't be of much help to snoopers.

The upshot is that normally, PyMailCGI never sends or receives both username and password values together in a single transaction, unless the password is encrypted with an encryptor of your choice. This limits its utility somewhat (since only a single account username can be installed on the server), but the alternative of popping up two pages (one for password entry and one for user) seems even less friendly. In general, if you want to read your mail with the system as coded, you have to install its files on your server, edit its mailconfig.py to reflect your account details, and change its secret.py encryptor as desired.
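To show how a pluggable encoder fits into the flow, here is a toy Python 3 sketch. The XOR scheme, the KEY value, and these encode and decode bodies are invented for illustration; they are not the book's secret.py (shown later in the chapter), offer no real protection, and assume ASCII passwords.

```python
# Python 3 sketch: a stand-in encoder/decoder pair and its trip through a URL.
# The XOR "cipher" is a toy for illustration only, NOT real encryption.
from urllib.parse import urlencode, parse_qs

KEY = 0xA5                                  # hypothetical per-site secret value

def encode(pswd):
    """Obscure an ASCII password by XOR-ing each character with KEY."""
    return "".join(chr(ord(ch) ^ KEY) for ch in pswd)

def decode(pswd):
    """XOR with the same key restores the original text."""
    return encode(pswd)

# Sending script: the encoded password travels with the username in a URL.
query = urlencode({"user": "pp3e", "pswd": encode("guess")}, encoding="latin-1")

# Receiving script: undo URL escapes first, then undo the encoding.
fields = parse_qs(query, encoding="latin-1")
recovered = decode(fields["pswd"][0])
print(query)
print(recovered)
```

Because the calling scripts use only the names encode and decode, a site can swap in a stronger scheme without touching any other file.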

17.5.4.1. Reading mail with direct URLs

One exception: since any CGI script can be invoked with parameters in an explicit URL instead of form field values, and since commonhtml tries to fetch inputs from the form object before importing them from mailconfig, it is possible for any person to use this script to check his mail without installing and configuring a copy of PyMailCGI. For example, a URL such as the following typed into your browser's address field or submitted with tools such as urllib (but without the line break used to make it fit here):

 http://localhost:8000/cgi-bin/
   onViewPswdSubmit.py?user=pp3e&pswd=guess&site=pop.earthlink.net

will actually load email into a selection list page such as that in Figure 17-8, using whatever user, password, and mail site names are appended to the URL. From the selection list, you may then view, reply, forward, and delete email.
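The "inputs first, mailconfig second" lookup that makes such direct URLs work can be sketched as follows, in Python 3. The function get_pop_fields, the MAILCONFIG dictionary, and its placeholder values are hypothetical stand-ins for commonhtml.getstandardpopfields and the mailconfig module, whose real code is not shown in this section.

```python
# Python 3 sketch: prefer values sent by the client, else configured defaults.
# get_pop_fields and MAILCONFIG are hypothetical stand-ins.
from urllib.parse import urlsplit, parse_qs

MAILCONFIG = {"user": "configured-user",            # placeholder settings
              "pswd": None,
              "site": "pop.configured.example"}

def get_pop_fields(form, defaults=MAILCONFIG):
    """Return (user, pswd, site), taking each from form if present."""
    return tuple(form.get(name, defaults[name])
                 for name in ("user", "pswd", "site"))

def form_from_url(url):
    """Collect the first value of each query parameter in an explicit URL."""
    query = urlsplit(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}

url = ("http://localhost:8000/cgi-bin/"
       "onViewPswdSubmit.py?user=pp3e&pswd=guess&site=pop.earthlink.net")
from_url = get_pop_fields(form_from_url(url))       # all three come from URL
from_form = get_pop_fields({"pswd": "typed"})       # only the password is sent
print(from_url)
print(from_form)
```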

Notice that at this point in the interaction, the password you send in a URL of this form is not encrypted. Later scripts expect that the password inputs will be sent encrypted, though, which makes it more difficult to use them with explicit URLs (you would need to match the encrypted form produced by the secret module on the server). Passwords are encrypted as they are added to links in the reply page's selection list, and they remain encrypted in URLs and hidden form fields thereafter.

But you shouldn't use a URL like this, unless you don't care about exposing your email password. Sending your unencrypted mail user ID and password strings across the Net in a URL such as this is unsafe and open to snoopers. In fact, it's like giving away your email: anyone who intercepts this URL, or views it in a server logfile, will have complete access to your email account. It is made even more treacherous by the fact that this URL format appears in a book that will be distributed all around the world.


If you care about security and want to use PyMailCGI on a remote server, install it on your server and configure mailconfig and secret. That should at least guarantee that both your user and password information will never be transmitted unencrypted in a single transaction. This scheme may still not be foolproof, so be careful out there. Without secure HTTP and sockets, the Internet is a "use at your own risk" medium.


17.5.5. The Message View Page

Back to our page flow; at this point, we are still viewing the message selection list in Figure 17-8. When we click on one of its generated hyperlinks, the stateful URL invokes the script in Example 17-8 on the server, sending the selected message number and mail account information (user, password, and site) as parameters on the end of the script's URL.

Example 17-8. PP3E\Internet\Web\PyMailCgi\cgi-bin\onViewListLink.py

 #!/usr/bin/python
 ##############################################################
 # On user click of message link in main selection list;
 # cgi.FieldStorage undoes any urllib escapes in the link's
 # input parameters (%xx and '+' for spaces already undone);
 # in 2.0 we only fetch 1 mail here, not entire list again!
 # in 2.0 we also find mail's main text part intelligently
 # instead of blindly displaying full text (poss attachments),
 # and generate links to attachment files saved on the server;
 # saved attachment files only work for 1 user and 1 message;
 # most 2.0 enhancements inherited from the mailtools package;
 ##############################################################

 import cgi
 import commonhtml, secret
 from externs import mailtools
 #commonhtml.dumpstatepage(0)

 def saveAttachments(message, parser, savedir='partsdownload'):
     """
     save fetched email's parts to files on
     server to be viewed in user's web browser
     """
     import os
     if not os.path.exists(savedir):            # in CGI script's cwd on server
         os.mkdir(savedir)                      # will open per your browser
     for filename in os.listdir(savedir):       # clean up last message: temp!
         dirpath = os.path.join(savedir, filename)
         os.remove(dirpath)
     typesAndNames = parser.saveParts(savedir, message)
     filenames = [fname for (ctype, fname) in typesAndNames]
     for filename in filenames:
         os.chmod(filename, 0666)               # some srvrs may need read/write
     return filenames

 form = cgi.FieldStorage( )
 user, pswd, site = commonhtml.getstandardpopfields(form)
 pswd = secret.decode(pswd)

 try:
     msgnum   = form['mnum'].value                               # from URL link
     parser   = mailtools.MailParser( )
     fetcher  = mailtools.SilentMailFetcher(site, user, pswd)
     fulltext = fetcher.downloadMessage(int(msgnum))             # don't eval!
     message  = parser.parseMessage(fulltext)                    # email.Message
     parts    = saveAttachments(message, parser)                 # for URL links
     mtype, content = parser.findMainText(message)               # first txt part
     commonhtml.viewpage(msgnum, message, content, form, parts)  # encoded pswd
 except:
     commonhtml.errorpage('Error loading message')

Again, much of the work here happens in the commonhtml module, listed later in this section (see Example 17-14). This script adds logic to decode the input password (using the configurable secret encryption module) and extract the selected mail's headers and text using the mailtools module package from Chapter 14 again. The full text of the selected message is ultimately fetched and parsed by mailtools, using the standard library's poplib module and email package. Although we'll have to refetch this message if viewed again, this version does not grab all mails to get just the one selected.[*]

[*] Notice that the message number arrives as a string and must be converted to an integer in order to be used to fetch the message. But we're careful not to convert with eval here, since this is a string passed over the Net and could have arrived embedded at the end of an arbitrary URL (remember that earlier warning?).
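The int-versus-eval point is easy to demonstrate. In this Python 3 sketch, parse_message_number is a hypothetical helper (the script itself simply calls int directly); it converts the incoming string and rejects anything that is not a positive whole number, so text arriving in a URL is never executed as code.

```python
# Python 3 sketch: convert an untrusted message number without eval.
# parse_message_number is an illustrative helper name.
def parse_message_number(text):
    """Return text as a positive int, or raise ValueError."""
    try:
        number = int(text)                  # parses digits only, runs no code
    except (TypeError, ValueError):
        raise ValueError("bad message number: %r" % (text,))
    if number < 1:                          # POP message numbers start at 1
        raise ValueError("message number out of range: %d" % number)
    return number

print(parse_message_number("8"))

hostile = "__import__('os').getcwd()"       # eval would have run this string
try:
    parse_message_number(hostile)
except ValueError as exc:
    print("rejected:", exc)
```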

Also new in this version, the saveAttachments function in this script splits off the parts of a fetched message and stores them in a directory on the web server machine. This was discussed earlier in this chapter; the view page is then augmented with URL links that point at the saved part files. Your web browser will open them according to their filenames. All the work of part extraction and naming is inherited from mailtools. Part files are kept temporarily; they are deleted when the next message is fetched. They are also currently stored in a single directory and so apply to only a single user.
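For a feel of what part extraction involves, the Python 3 sketch below does a simplified version of the job with the standard email package alone. It is not the mailtools saveParts code; save_parts is a hypothetical name, the fallback naming of unnamed parts as .txt files is a shortcut, and the test message is built in memory instead of being fetched from a POP server.

```python
# Python 3 sketch: save each part of a parsed message to a directory.
# save_parts is a simplified, hypothetical stand-in for mailtools' logic.
import os
import tempfile
from email import message_from_string
from email.message import EmailMessage

def save_parts(message, savedir):
    """Write every non-container part to savedir; return the file paths."""
    os.makedirs(savedir, exist_ok=True)
    for old in os.listdir(savedir):                 # clean up the last message
        os.remove(os.path.join(savedir, old))
    saved = []
    for index, part in enumerate(message.walk()):
        if part.is_multipart():                     # skip multipart containers
            continue
        name = part.get_filename() or "part-%03d.txt" % index
        path = os.path.join(savedir, os.path.basename(name))  # no dir tricks
        with open(path, "wb") as out:
            out.write(part.get_payload(decode=True) or b"")
        saved.append(path)
    return saved

# Build a two-part message, flatten it to text, then parse and split it.
original = EmailMessage()
original["Subject"] = "testing"
original.set_content("main text\n")
original.add_attachment(b"raw bytes", maintype="application",
                        subtype="octet-stream", filename="data.bin")
message = message_from_string(original.as_string())

savedir = os.path.join(tempfile.mkdtemp(), "partsdownload")
saved = save_parts(message, savedir)
print([os.path.basename(path) for path in saved])
```

Clearing the directory on each call mirrors the temporary, one-user-at-a-time behavior described above.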

If the message can be loaded and parsed successfully, the result page, shown in Figure 17-12, allows us to view, but not edit, the mail's text. The function commonhtml.viewpage generates a "read-only" HTML option for all the text widgets in this page.

Figure 17-12. PyMailCGI view page


View pages like this have a pull-down action selection list near the bottom; if you want to do more, use this list to pick an action (Reply, Forward, or Delete) and click on the Next button to proceed to the next screen. If you're just in a browsing frame of mind, click the "Back to root page" link at the bottom to return to the main page, or use your browser's Back button to return to the selection list page.

Figure 17-12 shows the mail we sent earlier in this chapter, being viewed (we sent it to ourselves). Notice its "Parts:" links; when clicked, they trigger URLs that open the temporary part files on the server, according to your browser's rules for the file type. For instance, clicking on the "doc" file will likely open it in Microsoft Word on Windows; selecting the "jpg" link will open it either in a local image viewer or within the browser itself, as captured in Figure 17-13.

Figure 17-13. Attached part file link display


17.5.6. Passing State Information in HTML Hidden Input Fields

What you don't see on the view page in Figure 17-12 is just as important as what you do see. We need to defer to Example 17-14 for coding details, but something new is going on here. The original message number, as well as the POP user and (still encrypted) password information sent to this script as part of the stateful link's URL, wind up being copied into the HTML used to create this view page, as the values of hidden input fields in the form. The hidden field generation code in commonhtml looks like this:

     print '<form method=post action="%s/onViewPageAction.py">' % urlroot
     print '<input type=hidden name=mnum value="%s">' % msgnum
     print '<input type=hidden name=user value="%s">' % user     # from page|url
     print '<input type=hidden name=site value="%s">' % site     # for deletes
     print '<input type=hidden name=pswd value="%s">' % pswd     # pswd encoded

As we've learned, much like parameters in generated hyperlink URLs, hidden fields in a page's HTML allow us to embed state information inside this web page itself. Unless you view that page's source, you can't see this state information because hidden fields are never displayed. But when this form's Submit button is clicked, hidden field values are automatically transmitted to the next script along with the visible fields on the form.
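A small Python 3 sketch shows the same idea with the escaping step made explicit. The helper name hidden_fields and the sample state values are assumptions for this example; html.escape plays the role that cgi.escape plays in the book's Python 2 code.

```python
# Python 3 sketch: carry state to the next script in hidden form fields.
# hidden_fields and the sample values are illustrative only.
from html import escape

def hidden_fields(state):
    """Return one hidden <input> tag per name/value pair in state."""
    return "\n".join('<input type=hidden name=%s value="%s">' %
                     (escape(name), escape(str(value), quote=True))
                     for name, value in state.items())

state = {"mnum": 8, "user": "pp3e", "site": "pop.earthlink.net",
         "pswd": 'od"d<&>'}                 # pretend encoded password
html_text = hidden_fields(state)
print(html_text)
```

Escaping quotes as well as angle brackets matters here, because an encoded password could otherwise close the value attribute early.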

Figure 17-14 shows part of the source code generated for another message's view page; the hidden input fields used to pass selected mail state information are embedded near the top.

Figure 17-14. PyMailCGI view page, generated HTML


The net effect is that hidden input fields in HTML, just like parameters at the end of generated URLs, act like temporary storage areas and retain state between pages and user interaction steps. Both are the Web's equivalent to programming language variables. They come in handy anytime your application needs to remember something between pages.

Hidden fields are especially useful if you cannot invoke the next script from a generated URL hyperlink with parameters. For instance, the next action in our script is a form submit button (Next), not a hyperlink, so hidden fields are used to pass state. As before, without these hidden fields, users would need to reenter POP account details somewhere on the view page if they were needed by the next script (in our example, they are required if the next action is Delete).

17.5.7. Escaping Mail Text and Passwords in HTML

Notice that everything you see on the message view page in Figure 17-14 is escaped with cgi.escape. Header fields and the text of the mail itself might contain characters that are special to HTML and must be translated as usual. For instance, because some mailers allow you to send messages in HTML format, it's possible that an email's text could contain a </textarea> tag, which would throw the reply page hopelessly out of sync if not escaped.

One subtlety here: HTML escapes are important only when text is sent to the browser initially (by the CGI script). If that text is later sent out again to another script (e.g., by sending a reply mail), the text will be back in its original, nonescaped format when received again on the server. The browser parses out escape codes and does not put them back again when uploading form data, so we don't need to undo escapes later. For example, here is part of the escaped text area sent to a browser during a Reply transaction (use your browser's View Source option to see this live):

 <tr><th align=right>Text:
 <td><textarea name=text cols=80 rows=10 readonly>
 more stuff

 --Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]


 &gt; -----Original Message-----
 &gt; From: lutz@rmi.net
 &gt; To: lutz@rmi.net
 &gt; Date: Tue May  2 18:28:41 2000
 &gt;
 &gt; &lt;table&gt;&lt;textarea&gt;
 &gt; &lt;/textarea&gt;&lt;/table&gt;
 &gt; --Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]
 &gt;
 &gt;
 &gt; &gt; -----Original Message-----

After this reply is delivered, its text looks as it did before escapes (and exactly as it appeared to the user in the message edit web page):

 more stuff

 --Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]


 > -----Original Message-----
 > From: lutz@rmi.net
 > To: lutz@rmi.net
 > Date: Tue May  2 18:28:41 2000
 >
 > <table><textarea>
 > </textarea></table>
 > --Mark Lutz  (http://rmi.net/~lutz)  [PyMailCgi 2.0]
 >
 >
 > > -----Original Message-----
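The one-way nature of HTML escaping can be simulated without a browser. In this Python 3 sketch, html.escape stands in for cgi.escape on the server side, and html.unescape stands in for the browser's parsing of the page. That second step is only an approximation of what a real browser does, used here to show why the uploaded text needs no extra decoding.

```python
# Python 3 sketch: escape mail text for a page, then simulate the browser.
# html.unescape only approximates a browser's HTML parsing.
from html import escape, unescape

mail_text = "> <table><textarea>\n> </textarea></table>"

sent_to_browser = escape(mail_text)         # what the CGI script prints
print(sent_to_browser)

# The browser parses the escapes away, so the form uploads the original text.
uploaded_by_browser = unescape(sent_to_browser)
print(uploaded_by_browser == mail_text)
```

Without the escape step, the embedded </textarea> would end the page's own text area early; with it, the tag is just text.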

Did you notice the odd characters in the hidden password field of the generated HTML screenshot (Figure 17-14)? It turns out that the POP password is still encrypted when placed in hidden fields of the HTML. For security, it has to be. Values of a page's hidden fields can be seen with a browser's View Source option, and it's not impossible that the text of this page could be intercepted off the Net.

The password is no longer URL encoded when put in the hidden field, however, even though it was when it appeared at the end of the smart link URL. Depending on your encryption module, the password might now contain nonprintable characters when generated as a hidden field value here; the browser doesn't care, as long as the field is run through cgi.escape like everything else added to the HTML reply stream. The commonhtml module is careful to route all text and headers through cgi.escape as the view page is constructed.

As a comparison, Figure 17-15 shows what the mail message captured in Figure 17-12 looks like when viewed in PyMailGUI, the client-side Tkinter-based email tool from Chapter 15. In that program, message parts are listed with the Parts button and are extracted, saved, and opened with the Split button; we also get quick-access buttons to parts and attachments just below the message headers. The net effect is similar.

Figure 17-15. PyMailGUI viewer, same message


PyMailGUI doesn't need to care about things such as passing state in URLs or hidden fields (it saves state in Python in-process variables) or escaping HTML and URL strings (there are no browsers, and no network transmission steps once mail is downloaded). It also doesn't have to rely on temporary server file links to give access to message parts; the message is retained in memory attached to a window object and lives on between interactions. PyMailGUI does require Python to be installed on the client, but we'll return to that in a few pages.




Programming Python
ISBN: 0596009259
EAN: 2147483647
Year: 2004
Pages: 270
Authors: Mark Lutz
