Thursday, July 11, 2013

NOTES on Implementing an Email Parser with Python

The most common method for reading an email message is:

import email
# Read a file that contains just one message
with open("mbox", "r") as f:
    emailMessage = email.message_from_file(f)

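Or you can read the entire file into a string first and parse that:

import email
# Read the whole file into a string, then parse the string
with open("mbox", "r") as f:
    text = f.read()
emailMessage = email.message_from_string(text)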
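To try the string-based approach without a file on disk, here is a minimal sketch. The addresses, subject, and body are made-up sample data, and the variable names are only for illustration. The parenthesized print form is used so the snippet runs under both Python 2 and Python 3.

```python
import email

# Hypothetical sample message: headers, a blank line, then the body.
raw_text = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Lunch on Friday\n"
    "\n"
    "Are you free at noon?\n"
)

# Parse the string into a message object.
sample = email.message_from_string(raw_text)

print(sample["Subject"])
print(sample.get_payload())
```

The blank line is what separates the headers from the body, so a hand-built test message needs one.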

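Either way, the email message has been parsed for ease of use. The headers are available through a dictionary-like interface on the message object, and the body is stored as the payload: a string for a simple message, or a list of sub-messages for a multipart one.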

The list of RFC 822 header names present in the message can be obtained with:

print(emailMessage.keys())

Next, you might want to get the value of a header field. The lookup is not case sensitive.

print(emailMessage.get('subject'))
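A short sketch of how header lookup behaves in a few common cases, using a made-up message (the header values and the "normal" fallback are assumptions for illustration). It shows that capitalization does not matter, that a missing header yields None or a default you supply, and that get_all() is needed when a header such as Received appears more than once.

```python
import email

# Hypothetical sample message with a repeated header.
raw_text = (
    "Received: from relay1.example.com\n"
    "Received: from relay2.example.com\n"
    "Subject: Status report\n"
    "\n"
    "All systems normal.\n"
)
msg = email.message_from_string(raw_text)

# Any capitalization finds the same header.
lower = msg.get("subject")
upper = msg.get("SUBJECT")
indexed = msg["Subject"]

# A missing header gives None, or the fallback passed as second argument.
missing = msg.get("X-Priority")
fallback = msg.get("X-Priority", "normal")

# get() returns only the first occurrence; get_all() returns every one.
hops = msg.get_all("Received")

print(lower)
print(fallback)
print(len(hops))
```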

And of course you'd like to get the body of the message.

print(emailMessage.get_payload())
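For a multipart message, get_payload() returns a list of sub-messages rather than a string, so printing it directly is not very useful. One possible way to pull out the readable text is to walk the parts and pick the first text/plain one. The sketch below assumes a made-up two-part message, and first_plain_text is a hypothetical helper name, not part of the email module.

```python
import email

# Hypothetical multipart message with a plain-text and an HTML version.
raw_text = (
    "Subject: Two versions\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Plain body\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>HTML body</p>\n"
    "--XYZ--\n"
)
msg = email.message_from_string(raw_text)


def first_plain_text(message):
    """Return the payload of the first text/plain part, or None."""
    # walk() yields the message itself and then every nested part,
    # so this also works for a simple, non-multipart message.
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            return part.get_payload()
    return None


body = first_plain_text(msg)
print(msg.is_multipart())
print(body)
```

If the parts use a transfer encoding such as base64 or quoted-printable, get_payload(decode=True) returns the decoded bytes instead of the encoded text.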
