14.4. POP: Fetching Email

I admit it: up until just before 2000, I took a lowest-common-denominator approach to email. I preferred to check my messages by Telnetting to my ISP and using a simple command-line email interface. Of course, that's not ideal for mail with attachments, pictures, and the like, but its portability was staggering: because Telnet runs on almost any machine with a network link, I was able to check my mail quickly and easily from anywhere on the planet. Given that I make my living traveling around the world teaching Python classes, this wild accessibility was a big win.

Like web site maintenance, times have changed on this front, too: when my ISP took away Telnet access, they also took away my email access. Luckily, Python came to the rescue: by writing email access scripts in Python, I could still read and send email from any machine in the world that has Python and an Internet connection. Python can be as portable a solution as Telnet, but much more powerful. Moreover, I can still use these scripts as an alternative to tools suggested by the ISP, such as Microsoft Outlook.

Besides not being fond of delegating control to commercial products of large companies, tools like Outlook by default generally download mail to your PC and delete it from the mail server as soon as you access it. This keeps your email box small (and your ISP happy), but it isn't exactly friendly to traveling Python salespeople: once accessed, you cannot reaccess a prior email from any machine except the one to which it was initially downloaded. If you need to see an old email and don't have your PC handy, you're out of luck.

The next two scripts represent one first-cut solution to these portability and single-machine constraints (we'll see others in this and later chapters). The first, popmail.py, is a simple mail reader tool, which downloads and prints the contents of each email in an email account.
This script is admittedly primitive, but it lets you read your email on any machine with Python and sockets; moreover, it leaves your email intact on the server. The second, smtpmail.py, is a one-shot script for writing and sending a new email message. Later in this chapter, we'll implement an interactive console-based email client (pymail), and later in this book we'll code a full-blown GUI email tool (PyMailGUI) and a web-based email program (PyMailCGI). For now, we'll start with the basics.

14.4.1. Mail Configuration Module

Before we get to the scripts, let's first take a look at a common module they import and use. The module in Example 14-17 is used to configure email parameters appropriately for a particular user. It's simply a collection of assignments to variables used by mail programs that appear in this book (each major mail client has its own version, to allow content to vary). Isolating these configuration settings in this single module makes it easy to configure the book's email programs for a particular user, without having to edit actual program logic code.

If you want to use any of this book's email programs to do mail processing of your own, be sure to change its assignments to reflect your servers, account usernames, and so on (as shown, they refer to email accounts used for developing this book). Not all scripts use all of these settings; we'll revisit this module in later examples to explain more of them.

Note that to avoid spamming, some ISPs may require that you be connected directly to their systems in order to use their SMTP servers to send mail. For example, when connected directly by dial-up, I can use smtp.earthlink.net (my ISP's server), but when connected via broadband, I have to route requests through smtp.comcast.net (Comcast is my cable Internet provider). You may need to adjust these settings to match your configuration. Also, some SMTP servers check domain name validity in addresses, and may require an authenticating login step; see the SMTP section later in this chapter for interface details.

Example 14-17. PP3E\Internet\Email\mailconfig.py
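
The listing for this module is not reproduced in this text. As a rough illustration only, a configuration module of the kind described above might look like the following sketch. The variable names (popservername, popusername, smtpservername, myaddress) are assumptions made for this example, not necessarily the names the book's module uses; the server and account values are the ones mentioned in this section. No password is stored, because the reader script described next prompts for it at runtime.

```python
# Hypothetical sketch of a mailconfig-style settings module.
# The variable names are illustrative assumptions, not the book's listing.

# POP3 server used to fetch incoming mail, and the account name on it
popservername = 'pop.earthlink.net'
popusername = 'pp3e'

# SMTP server used to send mail; this may depend on how you are connected
smtpservername = 'smtp.comcast.net'

# address placed in the From header of outgoing messages
myaddress = 'pp3e@earthlink.net'
```

A client script would then import the module and read its attributes (for example, mailconfig.popservername) rather than hardcoding server names in program logic.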

14.4.2. POP Mail Reader Script

On to reading email in Python: the script in Example 14-18 employs Python's standard poplib module, an implementation of the client-side interface to POP, the Post Office Protocol. POP is a well-defined and widely available way to fetch email from servers over sockets. This script connects to a POP server to implement a simple yet portable email download and display tool.

Example 14-18. PP3E\Internet\Email\popmail.py
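
The listing for this script is not reproduced in this text. The following is a minimal sketch in the same spirit, not the book's code: it is written in Python 3 syntax (input instead of raw_input, and bytes returned by poplib), and the function names fetch_all and main, as well as the placeholder server and user names, are assumptions made for this example. It uses only the poplib calls discussed in this section (user, pass_, stat, retr, quit) plus getwelcome.

```python
import getpass
import poplib

def fetch_all(server, user, password):
    """Log in, download every message, and always quit to unlock the mailbox."""
    messages = []
    try:
        server.user(user)                 # log in to the mail server
        server.pass_(password)            # pass is a reserved word
        msg_count, msg_bytes = server.stat()
        print('There are', msg_count, 'mail messages in', msg_bytes, 'bytes')
        for i in range(msg_count):
            # message numbers start at 1; retr returns (response, lines, octets)
            hdr, lines, octets = server.retr(i + 1)
            messages.append(b'\n'.join(lines).decode('utf-8', 'replace'))
    finally:
        server.quit()                     # else mailbox stays locked till timeout
    return messages

def main(mailserver='pop.example.com', mailuser='someuser'):
    # placeholder defaults: replace with your own server and account name
    password = getpass.getpass('Password for %s? ' % mailserver)
    print('Connecting...')
    server = poplib.POP3(mailserver)      # connects on POP port 110
    print(server.getwelcome().decode('ascii', 'replace'))
    for text in fetch_all(server, mailuser, password):
        print('-' * 80)
        input('[Press Enter key]')        # pause between messages
        print(text)
    print('-' * 80)
    print('Bye.')
```

To try it, call main() with your own server and account names. Keeping the server object as a parameter of fetch_all is a design choice of this sketch: it lets the login and download logic be exercised with a stand-in object, without a network connection.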

Though primitive, this script illustrates the basics of reading email in Python. To establish a connection to an email server, we start by making an instance of the poplib.POP3 object, passing in the email server machine's name as a string:

    server = poplib.POP3(mailserver)

If this call doesn't raise an exception, we're connected (by socket) to the POP server listening for requests on POP port number 110 at the machine where our email account lives. The next thing we need to do before fetching messages is tell the server our username and password; notice that the password method is called pass_. Without the trailing underscore, pass would name a reserved word and trigger a syntax error:

    server.user(mailuser)                      # connect, log in to mail server
    server.pass_(mailpasswd)                   # pass is a reserved word

To keep things simple and relatively secure, this script always asks for the account password interactively; the getpass module we met in the FTP section of this chapter is used to input but not display a password string typed by the user.

Once we've told the server our username and password, we're free to fetch mailbox information with the stat method (number of messages, total bytes among all messages), and fetch the full text of a particular message with the retr method (pass the message number; they start at 1). The full text includes all headers, followed by a blank line, followed by the mail's text and any attached parts. The retr call sends back a tuple that includes a list of line strings representing the content of the mail:

    msgCount, msgBytes = server.stat()
    hdr, message, octets = server.retr(i+1)    # octets is byte count

When we're done, we close the email server connection by calling the POP object's quit method:

    server.quit()                              # else locked till timeout

Notice that this call appears inside the finally clause of a try statement that wraps the bulk of the script.
To minimize complications associated with changes, POP servers lock your email inbox between the time you first connect and the time you close your connection (or until an arbitrary, system-defined timeout expires). Because the POP quit method also unlocks the mailbox, it's crucial that we do this before exiting, whether an exception is raised during email processing or not. By wrapping the action in a try/finally statement, we guarantee that the script calls quit on exit to unlock the mailbox to make it accessible to other processes (e.g., delivery of incoming email).

14.4.3. Fetching Messages

Here is the popmail script of Example 14-18 in action, displaying two messages in my account's mailbox on machine pop.earthlink.net, the domain name of the mail server machine at earthlink.net, configured in the module mailconfig:

    C:\...\PP3E\Internet\Email>popmail.py
    Password for pop.earthlink.net?
    Connecting...
    +OK NGPopper vEL_6_10 at earthlink.net ready <12517.1139377094@pop-satin.atl.sa.earthlink.net>
    There are 2 mail messages in 1676 bytes
    ('+OK', ['1 876', '2 800'], 14)
    --------------------------------------------------------------------------------
    [Press Enter key]
    Status: U
    Return-Path: <lumber.jack@thelarch.com>
    Received: from sccrmhc13.comcast.net ([63.240.77.83])
            by mx-pinchot.atl.sa.earthlink.net (EarthLink SMTP Server) with SMTP id 1f6HNg7Ex3Nl34d0
            for <pp3e@earthlink.net>; Wed, 8 Feb 2006 00:23:06 -0500 (EST)
    Received: from [192.168.1.117] (c-67-161-147-100.hsd1.co.comcast.net[67.161.147.100])
            by comcast.net (sccrmhc13) with ESMTP
            id <2006020805230401300nvnlge>; Wed, 8 Feb 2006 05:23:04 +0000
    From: lumber.jack@TheLarch.com
    To: pp3e@earthlink.net
    Subject: I'm a Lumberjack, and I'm Okay
    Date: Wed, 08 Feb 2006 05:23:13 -0000
    X-Mailer: PyMailGUI 2.1 (Python)
    Message-Id: <200602080023.1f6HNg7Ex3Nl34d0@mx-pinchot.atl.sa.earthlink.net>
    X-ELNK-Info: spv=0;
    X-ELNK-AV: 0
    X-ELNK-Info: sbv=0; sbrc=.0; sbf=00; sbw=000;
    X-NAS-Language: English
    X-NAS-Bayes: #0: 1.55061E-015; #1: 1
    X-NAS-Classification: 0
    X-NAS-MessageID: 1469
    X-NAS-Validation: {388D038F-95BF-4409-9404-7726720152C4}

    I cut down trees, I skip and jump,
    I like to press wild flowers...

    --------------------------------------------------------------------------------
    [Press Enter key]
    Status: U
    Return-Path: <pp3e@earthlink.net>
    Received: from sccrmhc11.comcast.net ([204.127.200.81])
            by mx-canard.atl.sa.earthlink.net (EarthLink SMTP Server) with SMTP id 1f6HOh6uy3Nl36s0
            for <pp3e@earthlink.net>; Wed, 8 Feb 2006 00:24:09 -0500 (EST)
    Received: from [192.168.1.117] (c-67-161-147-100.hsd1.co.comcast.net[67.161.147.100])
            by comcast.net (sccrmhc11) with ESMTP
            id <2006020805235601100dkk93e>; Wed, 8 Feb 2006 05:23:56 +0000
    From: pp3e@earthlink.net
    To: pp3e@earthlink.net
    Subject: testing
    Date: Wed, 08 Feb 2006 05:24:06 -0000
    X-Mailer: PyMailGUI 2.1 (Python)
    Message-Id: <200602080024.1f6HOh6uy3Nl36s0@mx-canard.atl.sa.earthlink.net>
    X-ELNK-Info: spv=0;
    X-ELNK-AV: 0
    X-ELNK-Info: sbv=0; sbrc=.0; sbf=00; sbw=000;
    X-NAS-Classification: 0
    X-NAS-MessageID: 1470
    X-NAS-Validation: {388D038F-95BF-4409-9404-7726720152C4}

    Testing Python mail tools.

    --------------------------------------------------------------------------------
    Bye.

This interface is about as simple as it could be: after connecting to the server, it prints the complete and raw full text of one message at a time, pausing between each until you press the Enter key. The raw_input built-in is called to wait for the key press between message displays.
The pause keeps messages from scrolling off the screen too fast; to make them visually distinct, emails are also separated by lines of dashes. We could make the display fancier (e.g., we can use the email package to parse headers, bodies, and attachments; watch for examples in this and later chapters), but here we simply display the whole message that was sent. This works well for simple mails like these two, but it can be inconvenient for larger messages with attachments; we'll improve on this in later clients.

If you look closely at the text in these emails, you may notice that the emails were actually sent by another program called PyMailGUI (a program we'll meet in Chapter 15). The X-Mailer header line, if present, typically identifies the sending program. In fact, a variety of extra header lines can be sent in a message's text. The Received: headers, for example, trace the machines that a message passed through on its way to the target mailbox. Because popmail prints the entire raw text of a message, you see all headers here, but you may see only a few by default in end-user-oriented mail GUIs such as Outlook.

The script in Example 14-18 never deletes mail from the server. Mail is simply retrieved and printed and will be shown again the next time you run the script (barring deletion in another tool). To really remove mail permanently, we need to call other methods (e.g., server.dele(msgnum)), but such a capability is best deferred until we develop more interactive mail tools.

14.4.4. Fetching Email at the Interactive Prompt

If you don't mind typing code and reading POP server messages, it's possible to use the Python interactive prompt as a simple email client too. The following session uses two additional interfaces we'll apply in later examples:

    conn.list() returns a list of "message-number message-size" strings.
    conn.top(N, 0) returns the header text of message number N.

The top call also returns a tuple that includes the list of line strings sent back; its second argument tells the server how many additional lines after the headers to send, if any. If all you need are header details, top can be much quicker than the full text fetch of retr, provided your mail server implements the TOP command (most do).

    >>> from poplib import POP3
    >>> conn = POP3('pop.earthlink.net')
    >>> conn.user('pp3e')
    '+OK'
    >>> conn.pass_('XXXX')
    '+OK pp3e has 19 messages (14827231 octets).'
    >>> conn.stat()
    (19, 14827231)
    >>> conn.list()
    ('+OK', ['1 34359', '2 1995', '3 3549', '4 1218', '5 2162', '6 6450837',
    '7 9666', '8 178026', '9 841855', '10 289869', '11 2770', '12 2094',
    '13 2092', '14 30531', '15 5108864', '16 1032', '17 2729', '18 1850474',
    '19 13109'], 180)
    >>> conn.top(1, 0)
    ('+OK', ['Status: RO', 'To: pp3e@earthlink.net', 'X-ElinkBul: x+ZDXwyCjyELQI0yCm
    ...more deleted...
    ts, Wireless Security Tips, & More!', 'Content-Type: text/html', ''], 283)
    >>> conn.retr(16)
    ('+OK 1020 octets', ['Status: RO', 'Return-Path: <pp3e@earthlink.net>', 'Receive
    ...more deleted...
    '> Enjoy!', '> ', '', ''], 1140)
    >>> conn.quit()

Printing the full text of a message is easy: simply concatenate the line strings returned by retr or top, adding a newline between ('\n'.join(lines) will usually suffice). Parsing email text to extract headers and components is more complex, especially for mails with attached and possibly encoded parts, such as images. As we'll see later in this chapter, the standard library's email package can parse the mail's full or headers text after it has been fetched with poplib (or imaplib).

See the Python library manual for details on other POP module tools. As of Python 2.4, there is also a POP3_SSL class in the poplib module that connects to the server over an SSL-encrypted socket on port 995 by default (the standard port for POP over SSL).
It provides an identical interface, but it uses secure sockets for the conversation where supported by servers.
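
To make the earlier point about joining and parsing fetched lines concrete, here is a small sketch that needs no server. It is written in Python 3 syntax, where poplib hands back the lines as bytes, so they are joined with b'\n' and parsed with email.message_from_bytes. The helper name parse_fetched is hypothetical, and the sample lines are a trimmed copy of the second test message shown earlier, standing in for the list a real retr or top call would return.

```python
import email

def parse_fetched(lines):
    """Join the line list from retr or top and parse it into a message object."""
    raw = b'\n'.join(lines)               # add a newline between line strings
    return email.message_from_bytes(raw)

# sample lines in the shape poplib returns them (bytes under Python 3)
sample = [
    b'From: pp3e@earthlink.net',
    b'To: pp3e@earthlink.net',
    b'Subject: testing',
    b'',                                  # blank line separates headers from body
    b'Testing Python mail tools.',
]

msg = parse_fetched(sample)
print(msg['Subject'])                     # header lookup by name
print(msg.get_payload())                  # body text of a simple, one-part mail
```

With a live connection, the same helper could be handed the second item of the tuple returned by retr or top; for multipart mails, the message object's walk method visits each part in turn.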