Recipe 13.7. Unpacking a Multipart MIME Message

Credit: Matthew Cowles

Problem

You want to unpack a multipart MIME message.

Solution

The walk method of message objects generated by the email package makes this task really easy. Here is a script that uses email to solve the task posed in the "Problem":

    import email.Parser
    import os, sys

    def main():
        if len(sys.argv) != 2:
            print "Usage: %s filename" % os.path.basename(sys.argv[0])
            sys.exit(1)
        mailFile = open(sys.argv[1], "rb")
        p = email.Parser.Parser()
        msg = p.parse(mailFile)
        mailFile.close()
        partCounter = 1
        for part in msg.walk():
            if part.get_main_type() == "multipart":
                continue
            name = part.get_param("name")
            if name is None:
                name = "part-%i" % partCounter
            partCounter += 1
            # In real life, make sure that name is a reasonable filename
            # for your OS; otherwise, mangle that name until it is!
            f = open(name, "wb")
            f.write(part.get_payload(decode=1))
            f.close()
            print name

    if __name__ == "__main__":
        main()

Discussion

The email package makes parsing MIME messages reasonably easy. This recipe shows how to unbundle a MIME message with the email package by using the walk method of message objects.

You can create a message object in several ways. For example, you can instantiate the email.Message.Message class and build the message object's contents with calls to its methods. In this recipe, however, I need to read and analyze an existing message, so I work the other way around, calling the parse method of an email.Parser.Parser instance. The parse method takes as its only argument a file-like object (in the recipe, I pass it a real file object that I just opened for binary reading with the built-in open function) and returns a message object, on which you can call message object methods.

The walk method is a generator (i.e., it returns an iterator object on which you can loop with a for statement).
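
As a minimal sketch of how walk behaves, the following builds a small nested message in memory, parses it back, and lists the content type of each part that walk yields. It assumes a Python 3 interpreter, where the parsing entry point used here is email.message_from_string rather than the recipe's email.Parser.Parser; the sample message and the helper name list_part_types are invented for illustration and are not part of the recipe.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def list_part_types(msg):
    """Return the content type of every part yielded by msg.walk()."""
    return [part.get_content_type() for part in msg.walk()]


# Sample message (an assumption for this sketch):
# mixed -> (alternative -> (plain, html), plain)
inner = MIMEMultipart("alternative")
inner.attach(MIMEText("hello", "plain"))
inner.attach(MIMEText("<p>hello</p>", "html"))
outer = MIMEMultipart("mixed")
outer.attach(inner)
outer.attach(MIMEText("an attachment", "plain"))

# Round-trip through text to mimic parsing a message read from a file.
parsed = email.message_from_string(outer.as_string())
nested_types = list_part_types(parsed)

# A message that is not multipart yields only itself.
simple = email.message_from_string("Subject: hi\n\njust text\n")
simple_types = list_part_types(simple)

print(nested_types)
print(simple_types)
```

The container parts appear in the listing alongside the leaf parts, which is why a script that only wants real content has to skip anything whose main type is multipart.
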
You usually will use this method exactly as I use it in this recipe:

    for part in msg.walk():

The iterator sequentially returns (depth-first, in case of nesting) the parts that make up the message. If the message is not a container of parts (i.e., it has no attachments or alternates, so message.is_multipart returns false), no problem: the walk method will then return an iterator with a single element, the message itself. In any case, each element of the iterator is also a message object (an instance of email.Message.Message), so you can call on it any of the methods that a message object supplies.

In a multipart message, parts with a type of 'multipart/something' (i.e., a main type of 'multipart') may be present. In this recipe, I skip them explicitly since they're just glue holding the true parts together. I use the get_main_type method to obtain the main type and check it for equality with 'multipart'; if equality holds, I skip this part and move to the next one with a continue statement.

When I know I have a real part in hand, I locate its name (or synthesize one if it has no name), open that name as a file, and write the part's contents (also known as its payload), which I get by calling the get_payload method, into the file. I use the decode=1 argument to ensure that the payload is decoded back to binary content (e.g., an image, a sound file, a movie) if needed, rather than remaining in text form. If the payload is not encoded, decode=1 is innocuous, so I don't have to check before I pass it.

See Also

Recipe 13.6; documentation for the standard library package email in the Library Reference.
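
To illustrate the decode argument of get_payload and the recipe's warning about file names, here is a small sketch. It assumes Python 3 spellings (decode=True, email.message_from_bytes); the helper safe_name and its character whitelist are hypothetical, and a real program would choose rules suited to its own operating system.

```python
import os
import re
from email import message_from_bytes, message_from_string
from email.mime.application import MIMEApplication


def safe_name(name, fallback):
    """Hypothetical helper: drop directories and replace unusual characters."""
    if not name:
        return fallback
    base = os.path.basename(name.replace("\\", "/"))
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Reject names that consist only of dots, such as "." or "..".
    return cleaned.strip(".") or fallback


# A binary part, base64-encoded by MIMEApplication, with an awkward name.
raw_bytes = bytes(range(256))
part = MIMEApplication(raw_bytes, name="../odd name.bin")
msg = message_from_bytes(part.as_bytes())

encoded = msg.get_payload()             # base64 text as stored in the message
decoded = msg.get_payload(decode=True)  # the original bytes
name = safe_name(msg.get_param("name"), "part-1")

# With no transfer encoding, decode=True simply returns the body as bytes.
plain = message_from_string("Subject: x\n\nplain\n").get_payload(decode=True)

print(name, len(decoded), plain)
```

Writing decoded (rather than encoded) to a file opened in binary mode is what reproduces the attachment byte for byte.
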