Recipe 13.7. Unpacking a Multipart MIME Message

Credit: Matthew Cowles

Problem

You want to unpack a multipart MIME message.

Solution

The walk method of message objects generated by the email package makes this task easy. Here is a script that uses email to unpack each part of a message into its own file:

    import email.Parser
    import os, sys

    def main():
        if len(sys.argv) != 2:
            print "Usage: %s filename" % os.path.basename(sys.argv[0])
            sys.exit(1)
        mailFile = open(sys.argv[1], "rb")
        p = email.Parser.Parser()
        msg = p.parse(mailFile)
        mailFile.close()
        partCounter = 1
        for part in msg.walk():
            if part.get_main_type() == "multipart":
                continue
            name = part.get_param("name")
            if name is None:
                name = "part-%i" % partCounter
            partCounter += 1
            # In real life, make sure that name is a reasonable filename
            # for your OS; otherwise, mangle that name until it is!
            f = open(name, "wb")
            f.write(part.get_payload(decode=1))
            f.close()
            print name

    if __name__ == "__main__":
        main()
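The comment in the script about mangling the name deserves some care, because the name comes from the message and cannot be trusted. The following is a minimal sketch of one way to do it, written for Python 3. The helper safe_filename and its fallback parameter are hypothetical names introduced here for illustration; they are not part of the email package, and the set of allowed characters is an assumption you may want to relax.

```python
import os
import re


def safe_filename(name, fallback="part"):
    """Return a conservative filename derived from an untrusted MIME name.

    Hypothetical helper for illustration; not part of the email package.
    """
    # Treat backslashes as separators too, then drop any directory components.
    name = os.path.basename(name.replace("\\", "/"))
    # Replace everything except letters, digits, dot, dash, and underscore.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Strip leading dots so the result is never hidden, ".", or "..".
    name = name.lstrip(".")
    # Fall back to a fixed name if nothing usable is left.
    return name or fallback
```

With this sketch, a name such as "../../etc/passwd" is reduced to "passwd", and "my photo.jpg" becomes "my_photo.jpg". It does not guard against overwriting an existing file, which a real script would also need to handle.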

Discussion

The email package makes parsing MIME messages reasonably easy. This recipe shows how to unbundle a MIME message with the email package by using the walk method of message objects.

You can create a message object in several ways. For example, you can instantiate the email.Message.Message class and build the message object's contents with calls to its methods. In this recipe, however, I need to read and analyze an existing message, so I work the other way around, calling the parse method of an email.Parser.Parser instance. The parse method takes as its only argument a file-like object (in the recipe, I pass it a real file object that I just opened for binary reading with the built-in open function) and returns a message object, on which you can call message object methods.
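Both routes to a message object can be sketched in a few lines. This sketch assumes Python 3.6 or later, where the modules have lowercase names (email.message, email.parser) and BytesParser is the parser for files opened in binary mode. It uses an in-memory io.BytesIO object as a stand-in for a real mail file.

```python
import io
from email.message import EmailMessage
from email.parser import BytesParser

# Route 1: instantiate a message class and build the contents with method calls.
built = EmailMessage()
built["Subject"] = "demo"
built.set_content("hello\n")
built.add_attachment(b"\x00\x01\x02", maintype="application",
                     subtype="octet-stream", filename="blob.bin")

# Route 2: parse an existing message from a binary file-like object.
# io.BytesIO stands in for a file opened with open(path, "rb").
raw = io.BytesIO(built.as_bytes())
parsed = BytesParser().parse(raw)
```

Here parsed is a message object just like the one the recipe gets from its parse call, so methods such as walk and is_multipart are available on it.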

The walk method is a generator (i.e., it returns an iterator object on which you can loop with a for statement). You usually will use this method exactly as I use it in this recipe:

    for part in msg.walk():

The iterator sequentially returns (depth-first, in case of nesting) the parts that make up the message. If the message is not a container of parts (i.e., it has no attachments or alternates, so message.is_multipart returns false), no problem: the walk method then returns an iterator with a single element, the message itself. In any case, each element of the iterator is also a message object (an instance of email.Message.Message), so you can call on it any of the methods that a message object supplies.
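To make the traversal order concrete, here is a small sketch, assuming Python 3.6 or later and the EmailMessage class. It builds a nested message in memory and collects the content type of each part that walk yields. The message contents are invented for illustration.

```python
from email.message import EmailMessage

# A message with an alternative body (plain and HTML) plus one attachment.
# Adding the attachment wraps the alternative body in an outer multipart/mixed.
msg = EmailMessage()
msg.set_content("plain body\n")
msg.add_alternative("<p>html body</p>\n", subtype="html")
msg.add_attachment(b"\x89PNG", maintype="image", subtype="png",
                   filename="dot.png")

# walk() yields each container before the parts nested inside it.
nested_types = [part.get_content_type() for part in msg.walk()]

# A message that is not a container: walk() yields only the message itself.
simple = EmailMessage()
simple.set_content("just text\n")
simple_parts = list(simple.walk())
```

In nested_types, the outer multipart/mixed container comes first, followed by the multipart/alternative container and its two text parts, and finally the image attachment.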

In a multipart message, parts with a type of 'multipart/something' (i.e., a main type of 'multipart') may be present. In this recipe, I skip them explicitly, since they're just glue holding the true parts together. I use the get_main_type method to obtain the main type and compare it with 'multipart'; if they are equal, I skip this part and move on to the next one with a continue statement. When I know I have a real part in hand, I locate its name (or synthesize one if it has no name), open a file with that name, and write into it the part's contents (also known as its payload), which I get by calling the get_payload method. I pass the decode=1 argument to ensure that an encoded payload is decoded back to its original binary content (e.g., an image, a sound file, a movie) rather than remaining in its encoded text form. If the payload is not encoded, decode=1 is innocuous, so I don't have to check before I pass it.
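The same loop can be sketched for Python 3, with a few substitutions that should be read as assumptions rather than as the recipe's own method. get_main_type no longer exists in current versions of the email package, so the sketch calls get_content_maintype instead. It uses get_filename, which reads the filename from Content-Disposition and falls back to the name parameter of Content-Type, in place of get_param("name"). It also collects payloads in a dictionary rather than writing files, so it can run without touching the disk.

```python
from email.message import EmailMessage

# Build a message with one text part and one binary attachment.
msg = EmailMessage()
msg.set_content("see attachment\n")
original = bytes(range(256))  # arbitrary binary data; stored base64-encoded
msg.add_attachment(original, maintype="application",
                   subtype="octet-stream", filename="data.bin")

extracted = {}  # maps part name to decoded payload bytes
counter = 1
for part in msg.walk():
    if part.get_content_maintype() == "multipart":
        continue  # a container is only glue; it has no payload of its own
    # Use the part's declared name, or synthesize one if it has none.
    name = part.get_filename() or "part-%i" % counter
    counter += 1
    # decode=True undoes base64 or quoted-printable and returns bytes.
    extracted[name] = part.get_payload(decode=True)
```

After the loop, the attachment's payload is byte-for-byte identical to the original data, and the unencoded text part comes through unchanged under its synthesized name.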

See Also

Recipe 13.6; documentation for the standard library package email in the Library Reference.



Python Cookbook
ISBN: 0596007973
Year: 2004
Pages: 420