Python: parse an email body and strip the MIME headers

Time: 2016-02-26 11:19:44

Tags: python email parsing mime

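I have an email body that looks something like the one shown in the code below.

I want to remove all the headers from it and keep only the conversational text of the email. How can I do that in Python?

I tried the email.parser module, but it did not give me the result I wanted.

Please see the following code for more information.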

import email
a="""--c66f5985-233d-4e89-b598-6398b60cbe00
Content-Type: multipart/alternative;
     differences="Content-Type";
    boundary="d5eff9f8-76b3-4320-adfb-1e51add8fa8f"

--d5eff9f8-76b3-4320-adfb-1e51add8fa8f
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

THis is a demo email body

Thanks And Regards,
Ana
"""



b = email.message_from_string(a)
if b.is_multipart():
    for payload in b.get_payload():
        # if payload.is_multipart(): ...
        print (payload.get_payload())
else:
    print (b.get_payload())

1 answer:

Answer 0 (score: 0)

import imaplib
import email

hst = "your.host.adresse.com"
usr = "login"
pwd = "password"

imap = imaplib.IMAP4(hst)

try:
    imap.login(usr, pwd)
except Exception as e:
    raise IOError(e)

try:
    imap.select("Inbox")  # Tell IMAP which mailbox to use
    result, data = imap.uid('search', None, "ALL")
    latest = data[0].split()[-1]  # UID of the most recent message
    result, data = imap.uid('fetch', latest, '(RFC822)')
    a = data[0][1]  # This contains the mail data (bytes on Python 3)
except Exception as e:
    raise IOError(e)

# The fetched data is bytes on Python 3, so parse it with message_from_bytes
msg = email.message_from_bytes(a)
if msg.is_multipart():
    for payload in msg.get_payload():
        b = payload.get_payload()  # ends up holding the last part's payload
else:
    b = msg.get_payload()

print(b)

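This strips everything from the message that you do not want in the final text. I have tested it with your code. You did not show how you load the mail (your a), so I suspect that is where a decoding problem could come from.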
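As a rough illustration of why the way the mail is loaded matters, here is a minimal sketch using only the standard library. The string from the question begins at a boundary line rather than at a header block, so the parser has no top-level Content-Type to work with and treats the whole string as a single non-multipart body. The sketch prepends a top-level header and then walks the parts. Note the assumptions: the multipart/mixed type of the outer container is a guess (only its boundary is visible in the question), and the names fragment, header and extract_plain_text are made up for this example.

```python
import email

# The fragment from the question. It starts at a boundary line, so it
# carries no top-level headers of its own.
fragment = """--c66f5985-233d-4e89-b598-6398b60cbe00
Content-Type: multipart/alternative;
     differences="Content-Type";
    boundary="d5eff9f8-76b3-4320-adfb-1e51add8fa8f"

--d5eff9f8-76b3-4320-adfb-1e51add8fa8f
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

THis is a demo email body

Thanks And Regards,
Ana
"""

# Assumption: the outer container is multipart/mixed. The real top-level
# header was not shown in the question; only the boundary is known.
header = (
    'Content-Type: multipart/mixed; '
    'boundary="c66f5985-233d-4e89-b598-6398b60cbe00"\n\n'
)


def extract_plain_text(raw):
    """Return the decoded text of every text/plain part in a raw message."""
    msg = email.message_from_string(raw)
    texts = []
    for part in msg.walk():  # walk() also descends into nested multiparts
        if part.get_content_type() == "text/plain":
            # decode=True undoes quoted-printable / base64 and returns bytes
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            texts.append(payload.decode(charset, errors="replace"))
    return "\n".join(texts)


# Without a header block the parser does not see a multipart message.
print(email.message_from_string(fragment).is_multipart())  # False

body = extract_plain_text(header + fragment)
print(body)
```

With a complete message, such as the RFC822 data fetched over IMAP, the header block is already present and nothing needs to be prepended. Using walk() rather than a single loop over get_payload() is a design choice here: it reaches the text/plain part even when it is nested inside a multipart/alternative container, as in the question's sample.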

If you run into any problems with HTML mails:

from bs4 import BeautifulSoup
soup = BeautifulSoup(b, 'html.parser')
soup = soup.get_text()  # keep only the text, dropping the HTML tags
print(soup)

That should do the job, but I recommend switching from Python's default parser to lxml or html5lib.
