17.6. Processing Fetched Mail
At this point in our PyMailCGI web interaction, we are viewing an email message (Figure 17-12) that was chosen from the selection list page. On the message view page, selecting an action from the pull-down list and clicking the Next button invokes the script in Example 17-9 on the server to perform a reply, forward, or delete operation for the selected message.
Example 17-9. PP3E\Internet\Web\PyMailCgi\cgi-bin\onViewPageAction.py
This script receives all information about the selected message as form input field data (some hidden and encrypted, some not) along with the selected action's name. The next step in the interaction depends upon the action selected:
All these actions use data passed in from the prior page's form, but only the Delete action cares about the POP username and password and must decode the password received (it arrives here from hidden form input fields generated in the prior page's HTML).
17.6.1. Reply and Forward
If you select Reply as the next action, the message edit page in Figure 17-16 is generated by the script. Text on this page is editable, and pressing this page's Send button again triggers the send mail script we saw in Example 17-4. If all goes well, we'll receive the same confirmation page we got earlier when writing new mail from scratch (Figure 17-4).
Figure 17-16. PyMailCGI reply page
Forward operations are virtually the same, except for a few email header differences. All of this busy-ness comes "for free," because Reply and Forward pages are generated by calling commonhtml.editpage, the same utility used to create a new mail composition page. Here, we simply pass preformatted header line strings to the utility (e.g., replies add "Re:" to the subject text). We applied the same sort of reuse trick in PyMailGUI, but in a different context. In PyMailCGI, one script handles three pages; in PyMailGUI, one superclass and callback method handles three buttons, but the architecture is similar in spirit.
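To make the header preformatting idea concrete, here is a minimal Python 3 sketch of how reply and forward header values might be built before being handed to an edit-page generator. It is not the code in commonhtml or onViewPageAction: the function names, the "Fwd:" prefix, the blank forward recipient, and the check that avoids stacking prefixes are all assumptions made for illustration.

```python
def reply_headers(orig_from, orig_subject, my_address):
    """Preformat header values for a Reply edit page (hypothetical helper)."""
    subject = orig_subject
    # assumption: don't stack prefixes if the subject already has one
    if not subject.lower().startswith('re:'):
        subject = 'Re: ' + subject
    # a reply goes back to whoever sent the original
    return {'From': my_address, 'To': orig_from, 'Subject': subject}


def forward_headers(orig_subject, my_address):
    """Preformat header values for a Forward edit page (hypothetical helper)."""
    subject = orig_subject
    # assumption: 'Fwd:' is the forward prefix
    if not subject.lower().startswith('fwd:'):
        subject = 'Fwd: ' + subject
    # the user fills in the recipient on the edit page
    return {'From': my_address, 'To': '', 'Subject': subject}
```

Because both functions return the same kind of mapping, a single page generator can render either one, which is the reuse the text describes.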
17.6.2. Delete
Selecting the Delete action on a message view page and pressing Next will cause the onViewPageAction script to immediately delete the message being viewed. Deletions are performed by calling a reusable delete utility function coded in Chapter 14's mailtools package. In the prior version, the call to the utility was wrapped in a commonhtml.runsilent call that prevents print statements in the utility from showing up in the HTML reply stream (they are just status messages, not HTML code). In this version, we get the same capability from the "Silent" classes in mailtools. Figure 17-17 shows a Delete operation in action.
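The general idea of keeping a utility's status messages out of the HTML reply stream can be sketched in Python 3 as follows. This is neither commonhtml.runsilent nor the mailtools "Silent" classes; it is a stand-alone illustration that uses the standard library's contextlib.redirect_stdout, and noisy_delete is a hypothetical stand-in for a chatty utility.

```python
import contextlib
import io


def run_silent(func, *args, **kwargs):
    """Call func, capturing anything it prints to stdout instead of emitting it."""
    sink = io.StringIO()
    with contextlib.redirect_stdout(sink):
        result = func(*args, **kwargs)
    # return the captured text too, in case the caller wants to log it
    return result, sink.getvalue()


def noisy_delete(msgnum):
    # hypothetical stand-in for a utility that prints status messages
    print('deleting message', msgnum)
    return True
```

A CGI script could call run_silent(noisy_delete, 3) and then print only its own HTML, so the status text never reaches the browser.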
Figure 17-17. PyMailCGI view page, Delete selected
As mentioned, Delete is the only action that uses the POP account information (user, password, and site) that was passed in from hidden fields on the prior message view page. By contrast, the Reply and Forward actions format an edit page, which ultimately sends a message to the SMTP server; no POP information is needed or passed.
But at this point in the interaction, the POP password has racked up more than a few frequent flyer miles. In fact, it may have crossed phone lines, satellite links, and continents on its journey from machine to machine. Let's trace through the journey:
Along the way, scripts have passed the password between pages as both a URL query parameter and an HTML hidden input field; either way, they have always passed its encrypted string and have never passed an unencrypted password and username together in any transaction. Upon a Delete request, the password must be decoded here using the secret module before passing it to the POP server. If the script can access the POP server again and delete the selected message, another confirmation page appears, as shown in Figure 17-18 (there is currently no verification for the delete, so be careful).
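The two transport routes mentioned here, a URL query parameter and a hidden input field, can be sketched in Python 3 as shown next. Everything in this block is an assumption for illustration: the field names user, pswd, and mnum are hypothetical, and encode_pswd and decode_pswd are trivial base64 stand-ins for the secret module, not its implementation. Base64 is an encoding, not encryption, and offers no protection on its own.

```python
import base64
import html
import urllib.parse


def encode_pswd(pswd):
    # stand-in only: base64 is NOT encryption; a real system needs a real cipher
    return base64.urlsafe_b64encode(pswd.encode('utf-8')).decode('ascii')


def decode_pswd(token):
    # inverse of encode_pswd, applied just before contacting the POP server
    return base64.urlsafe_b64decode(token.encode('ascii')).decode('utf-8')


def view_link(script, user, token, msgnum):
    """Build a URL carrying the encoded password as a query parameter."""
    query = urllib.parse.urlencode({'user': user, 'pswd': token, 'mnum': msgnum})
    return '%s?%s' % (script, query)


def hidden_field(name, value):
    """Build a hidden input field; escape quotes so the value can't break the tag."""
    return '<input type=hidden name="%s" value="%s">' % (
        html.escape(name, quote=True), html.escape(value, quote=True))
```

Either way, only the encoded token travels between pages; decoding happens on the server at the last possible moment.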
Figure 17-18. PyMailCGI delete confirmation
One subtlety: for replies and forwards, the onViewPageAction mail action script builds up a >-quoted representation of the original message, with original "From:", "To:", and "Date:" header lines prepended to the mail's original text. Notice, though, that the original message's headers are fetched from the CGI form input, not by reparsing the original mail (the mail is not readily available at this point). In other words, the script gets mail header values from the form input fields of the view page. Because there is no "Date" field on the view page, the original message's date is also passed along to the action script as a hidden input field to avoid reloading the message. Try tracing through the code in this chapter's listings to see whether you can follow dates from page to page.
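The quoting step can be sketched in a few lines of Python 3. The function name and the exact layout (a quoted blank line between the headers and the body) are assumptions; the script's real output format may differ.

```python
def quote_original(orig_from, orig_to, orig_date, orig_text):
    """Return the original mail as >-quoted text with key headers on top."""
    # header values come from form input fields, not from reparsing the mail
    lines = ['From: ' + orig_from, 'To: ' + orig_to, 'Date: ' + orig_date, '']
    lines += orig_text.splitlines()
    return '\n'.join('> ' + line for line in lines)
```

The result is plain text that can be dropped into the edit page's text area, where the user types a reply above or below it.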
17.6.3. Deletions and POP Message Numbers
Note that you probably should click the "Back to root page" link in Figure 17-18 after a successful deletion. Don't use your browser's Back button to return to the message selection list at this point, because the delete has changed the relative numbers of some messages in the list. The PyMailGUI client program worked around this problem by automatically updating its in-memory message cache and refreshing the index list on deletions, but PyMailCGI doesn't currently have a way to mark older pages as obsolete.
If your browser reruns server-side scripts as you press your Back button, you'll regenerate and hence refresh the list anyhow. If your browser displays cached pages as you go back, though, you might see the deleted message still present in the list. Worse, clicking on a view link in an old selection list page may not bring up the message you think it should, if it appears in the list after a message that was deleted.
This is a property of POP email in general, which we have discussed before in this book: incoming mail simply adds to the mail list with higher message numbers, but deletions remove mail from arbitrary locations in the list and hence change message numbers for all mail following the ones deleted.
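A toy model makes the renumbering effect easy to see. The sketch below treats the inbox as a plain Python list, as it would appear to successive server connections, with POP message numbers being list positions starting at 1; the message names are made up.

```python
def pop_number(inbox, message):
    """POP message numbers are simply 1-based positions in the inbox."""
    return inbox.index(message) + 1


inbox = ['msg A', 'msg B', 'msg C', 'msg D']

before = pop_number(inbox, 'msg C')         # number shown in an old list page

inbox.append('msg E')                       # new mail lands at the end
after_arrival = pop_number(inbox, 'msg C')  # unchanged by the arrival

inbox.remove('msg B')                       # a deletion earlier in the list
after_delete = pop_number(inbox, 'msg C')   # every later message shifts down

# a stale link that still uses the old number now points at a different mail
stale_target = inbox[before - 1]
```

Here the old number for 'msg C' selects 'msg D' after the deletion, which is exactly the hazard of following a view link from a cached selection list.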
17.6.3.1. Inbox synchronization error potential
As we saw in Chapter 15, even the PyMailGUI client has the potential to get some message numbers wrong if mail is deleted by another program while the GUI is open (in a second PyMailGUI instance, for example, or in a simultaneously running PyMailCGI server session). This can also occur if the email server automatically deletes a message after the mail list has been loaded; for instance, moving it from inbox to undeliverable on errors.
This is why PyMailGUI went out of its way to detect server inbox synchronization errors on loads and deletes, using mailtools package utilities. Its deletions, for instance, match saved email headers with those for the corresponding message number in the server's inbox, to ensure accuracy. Unfortunately, without additional state information, PyMailCGI cannot detect such errors: it has no email list to compare against when messages are viewed or deleted, only the message number in a link or hidden form field.
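The header-matching check described here can be sketched in Python 3 as follows. This is not the mailtools implementation; it is a simplified illustration that assumes a server object with poplib.POP3-style top and dele methods, and the exception and function names are hypothetical.

```python
class InboxSyncError(Exception):
    """Raised when the server's headers no longer match the saved ones."""


def delete_if_headers_match(server, msgnum, saved_hdrs):
    """Delete msgnum only if its current headers equal saved_hdrs.

    server is any object with poplib.POP3-style top() and dele() methods.
    """
    # TOP with 0 body lines returns just the header lines (bytes) of the message
    _resp, lines, _octets = server.top(msgnum, 0)
    current = b'\n'.join(lines).decode('utf-8', errors='replace')
    # normalize line ends and surrounding whitespace before comparing
    saved = saved_hdrs.replace('\r\n', '\n').strip()
    if current.strip() != saved:
        raise InboxSyncError('message %d has changed on the server' % msgnum)
    server.dele(msgnum)
```

The check costs one extra round trip per deletion, but it refuses to delete when the message number now refers to different mail. The catch for PyMailCGI is where saved_hdrs would come from, since a CGI script has no in-memory list to consult.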
In the worst case, PyMailCGI cannot guarantee that deletes remove the intended mail: it's unlikely but not impossible that a mail earlier in the list may have been deleted between the time message numbers were fetched and a mail is deleted at the server. Without extra state information on the server, PyMailCGI cannot use the safe deletion or synchronization error checks in the mailtools modules to check whether subject message numbers are still valid.
To guarantee safe deletes, PyMailCGI would require state retention, which maps message numbers passed in pages to saved mail headers fetched when the numbers were last determined, or a broader policy, which sidesteps the issue completely. The next three sections outline suggested improvements and potential exercises.
17.6.3.2. Passing header text in hidden input fields (PyMailCGI_2.1)
Perhaps the simplest way to guarantee accurate deletions is to embed the displayed message's full header text in the message view page itself, as hidden form fields, using the following:
This would be a small code change, but it might require an extra headers fetch in the first of these scripts (it currently loads the full mail text), and it would require building a phony list to represent all mail's headers (we would have headers for and delete only one mail here). Alternatively, the header text could be extracted from the fetched full mail text, by splitting on the blank line that separates headers and message body text.
Moreover, this would increase the size of the data transmitted both from client and server: mail header text is commonly greater than 1 KB in size, and it may be larger. This is a small amount of extra data in modern terms, but it's possible that this may run up against size limitations in some client or server systems.
And really, this scheme is incomplete. It addresses only deletion accuracy and does nothing about other synchronization errors in general. For example, the system still may fetch and display the wrong message from a message list page, after deletions of mails earlier in the inbox. In fact, this technique guarantees only that the message displayed in a view window will be the one deleted for that view window's delete action. It does not ensure that the mail displayed or deleted in the view window corresponds to the selection made by the user in the mail index list.
More specifically, because this scheme embeds headers in the HTML of view windows, its header matching on deletion is useful only if messages earlier in the inbox are deleted elsewhere after a mail has already been opened for viewing. If the inbox is changed elsewhere before a mail is opened in a view window, the wrong mail may be fetched from the index page. In that event, this scheme avoids deleting a mail other than the one displayed in a view window, but it assumes the user will catch the mistake and avoid deleting if the wrong mail is loaded from the index page. Though such cases are rare, this behavior is less than user friendly.
Even though it is incomplete, this change does at least avoid deleting the wrong email if the server's inbox changes while a message is being viewed: the mail displayed will be the only one deleted. A working but tentative implementation of this scheme is available in the following directory of the book's examples distribution:
It works under the Firefox web browser and requires just over 10 lines of code changes among 3 source files, listed here (search for "#EXPERIMENTAL" to find the changes made in the source files yourself):
# onViewListLink.py
. . .
hdrstext = fulltext.split('\n\n')[0]                      # use blank line
commonhtml.viewpage(                                      # encodes passwd
           msgnum, message, content, form, hdrstext, parts)

# commonhtml.py
. . .
def viewpage(msgnum, headers, text, form, hdrstext, parts=[]):
    . . .
    # delete needs hdrs text for inbox sync tests: can be multi-K large
    hdrstext = cgi.escape(hdrstext, quote=True)           # escape '"' too
    print '<input type=hidden name=Hdrstext value="%s">' % hdrstext

# onViewPageAction.py
. . .
fetcher = mailtools.SilentMailFetcher(site, user, pswd)
#fetcher.deleteMessages([msgnum])
hdrstext = getfield(form, 'Hdrstext') + '\n'
hdrstext = hdrstext.replace('\r\n', '\n')                 # get \n from top
dummyhdrslist = [None] * msgnum                           # only one msg hdr
dummyhdrslist[msgnum-1] = hdrstext                        # in hidden field
fetcher.deleteMessagesSafely([msgnum], dummyhdrslist)     # exc on sync err
commonhtml.confirmationpage('Delete')
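For readers experimenting on Python 3, where cgi.escape is no longer available (it was removed in Python 3.8), the same round trip can be sketched with html.escape. The helper names below are hypothetical and the code is independent of the book's modules; it only illustrates the three steps: extract the headers, embed them in a hidden field, and normalize them when the form comes back.

```python
import html


def split_headers(fulltext):
    """Headers are everything before the first blank line of the mail text."""
    return fulltext.split('\n\n')[0]


def hidden_headers_field(hdrstext):
    """Embed header text in a hidden field, escaping quotes as well as <, >, &."""
    return ('<input type=hidden name=Hdrstext value="%s">'
            % html.escape(hdrstext, quote=True))


def restore_headers(field_value):
    """Undo browser line-end changes and restore the final newline."""
    # browsers may submit \r\n line ends; header matching expects \n
    return field_value.replace('\r\n', '\n') + '\n'
```

The browser unescapes the field value before submitting the form, so the server-side script receives the original header text, apart from line-end differences that restore_headers smooths over.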
To run this version locally, run the webserver script from Example 16-1 (in Chapter 16) with the dev subdirectory name, and a unique port number if you want to run both the original and the experimental versions. For instance:
C:\...\PP3E\Internet\Web>webserver.py dev\PyMailCGI_2.1 9000       command line
http://localhost:9000/pymailcgi.html                              web browser URL
Although this version works on browsers tested, it is considered tentative (and was not used for this chapter) because it is an incomplete solution. In those rare cases where the server's inbox changes in ways that invalidate message numbers after server fetches, this version avoids inaccurate deletions, but index lists may still become out of sync. Message fetches may still be inaccurate.
Note that in most cases, the message-id header would be sufficient for matching against mails to be deleted in the inbox, and it might be all that is required to pass from page to page. However, because this field is optional and can be forged to have any value, this might not always be a reliable way to identify matched messages; full header matching is necessary to be robust. See the discussion of mailtools in Chapter 14 for more details.
17.6.3.3. Server-side files for headers
The main limitation of the prior section's technique is that it addressed only deletions of already fetched emails. To catch other kinds of inbox synchronization errors, we would have to also record headers fetched when the index list page was constructed.
Since the index list page uses URL query parameters to record state, adding large header texts as an additional parameter on the URLs is not likely a viable option. In principle, the header text of all mails in the list could be embedded in the index page as a single hidden field, but this might add prohibitive size and transmission overheads.
As a more complete approach, each time the mail index list page is generated in onViewPswdSubmit.py, fetched headers of all messages could be saved in a flat file on the server, with a generated unique name (possibly from time, process ID, and username). That file's name could be passed along with message numbers in pages as an extra hidden field or query parameter.
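One way such a header save file might be written and read back is sketched below in Python 3. Nothing here comes from PyMailCGI itself: the 'hdrs-' filename pattern, the JSON format, and the function names are all assumptions chosen for illustration.

```python
import json
import os
import time


def save_headers(hdrs_list, user, directory):
    """Save one header string per message; return the generated file name."""
    # keep only safe characters from the username before using it in a name
    safe_user = ''.join(c for c in user if c.isalnum())
    # unique name from username, process ID, and time, as the text suggests
    name = 'hdrs-%s-%d-%d.json' % (safe_user, os.getpid(), time.time_ns())
    with open(os.path.join(directory, name), 'w', encoding='utf-8') as f:
        json.dump(hdrs_list, f)
    return name


def load_headers(name, directory):
    """Load the header list saved under name."""
    # the name arrives from the client, so strip any directory components
    path = os.path.join(directory, os.path.basename(name))
    with open(path, encoding='utf-8') as f:
        return json.load(f)
```

Only the short file name needs to travel through pages, so this works with URL query parameters as well as hidden fields. Because the name comes back from the browser, the loader treats it as untrusted input.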
On deletions, the header's filename could be used by onViewPageAction.py to load the saved headers from the flat file, to be passed to the safe delete call in mailtools. On fetches, the header file could also be used for general synchronization tests to avoid loading and displaying the wrong mail. Some sort of aging scheme would be required to delete the header save files eventually (the index page script might clean up old files), and we might also have to consider multiuser issues.
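The aging scheme could be as simple as removing save files whose modification time is older than some limit whenever the index page script runs. The Python 3 sketch below assumes, hypothetically, that header save files share a 'hdrs-' name prefix and live in a directory of their own; the function name and parameters are likewise illustrative.

```python
import os
import time


def remove_old_header_files(directory, max_age_secs, prefix='hdrs-', now=None):
    """Delete header save files older than max_age_secs; return their names."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(directory)):
        if not name.startswith(prefix):
            continue                        # leave unrelated files alone
        path = os.path.join(directory, name)
        if now - os.path.getmtime(path) > max_age_secs:
            os.remove(path)
            removed.append(name)
    return removed
```

A page whose header file has been aged out would then have to be treated as expired, which is one of the backward-flow cases such a design needs to handle.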
This scheme essentially uses server-side files to emulate PyMailGUI's in-process memory, though it is complicated by the fact that users may back up in their browser: deleting from view pages fetched with earlier list pages, attempting to refetch from an earlier list page, and so on. In general, it may be necessary to analyze all possible forward and backward flows through pages (it is essentially a state machine). Header save files might also be used to detect synchronization errors on fetches and may be removed on deletions to effectively disable actions in prior page states, though header matching may suffice to ensure deletion accuracy.
17.6.3.4. Delete on load
Alternatively, mail clients could delete all email off the server as soon as it is downloaded, such that deletions wouldn't impact POP identifiers (Microsoft Outlook may use this scheme by default, for instance). However, this requires additional mechanisms for storing deleted email persistently for later access, and it means you can view fetched mail only on the machine to which it was downloaded. Since both PyMailGUI and PyMailCGI are intended to be used on a variety of machines, mail is kept on the POP server by default.
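A delete-on-load policy can be sketched in Python 3 against any object that offers poplib.POP3-style stat, retr, dele, and quit methods. The function name and the save callback are hypothetical; a real client would also need error handling and persistent local storage.

```python
def download_and_delete(server, save):
    """Fetch every message, hand it to save(), then mark it for deletion.

    server: object with poplib.POP3-style stat/retr/dele/quit methods.
    save:   callable that stores one message (bytes) persistently.
    """
    count, _size = server.stat()
    for msgnum in range(1, count + 1):
        _resp, lines, _octets = server.retr(msgnum)
        save(b'\n'.join(lines))   # store locally first...
        server.dele(msgnum)       # ...then mark for deletion on the server
    server.quit()                 # POP deletions take effect when the session ends
    return count
```

Saving before deleting means a failure in save leaves the mail on the server. Once the session ends, though, the local copy is the only copy, which is the trade-off the text describes.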