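Searching a text file for a specific string

Time: 2016-08-10 17:13:53

Tags: python-2.7

I already have a solution to my problem, but for my own improvement I would like to know better ways to solve the same problem.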

This comes from a course: as part of a quiz, we look through the text for lines that start with:

From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008

and extract the email address from each of those lines, using split() once a matching line is found. The sample output should be:

louis@media.berkeley.edu
louis@media.berkeley.edu
ray@media.berkeley.edu
cwen@iupui.edu
cwen@iupui.edu
cwen@iupui.edu
There were 27 lines in the file with From as the first word

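This is the code I came up with for the assignment. I am open to suggestions for better ways to write my program to extract the email strings.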

import re

fname = raw_input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"

fh = open(fname)
count = 0
matches = []

for lines in fh :
    # skip every line that does not start with "From " (note the trailing space)
    if not lines.startswith("From ") : continue
    # increment the count variable for each match found
    count += 1
    # append the matching line to the matches list
    matches.append(lines)
    # loop through the list to access each line individually
    for email in matches :
        # place values in variable
        out = email
        # look through each line for any email address found
        found = re.findall(r'[\w\.-]+@[\w\.-]+', out)
        # loop through the found emails and keep the last one
        for i in found :
            ans = i
    # prints the address from the most recently matched line
    print ans
    # print count
print "There were", count, "lines in the file with From as the first word"
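
For comparison, here is a minimal sketch of the split()-based approach the quiz describes, without the regular expression or the second list. The function name extract_senders and the sample lines are made up for illustration (only the first sample line appears in the question), and the sketch assumes the address is always the second whitespace-separated word on a "From " line. It is written so that it runs under both Python 2.7 and Python 3.

```python
def extract_senders(lines):
    """Return the sender address from every line that starts with 'From '."""
    senders = []
    for line in lines:
        # 'From ' (with the trailing space) skips 'From:' header lines
        if not line.startswith("From "):
            continue
        words = line.split()
        # words[0] is 'From'; words[1] is assumed to be the address
        if len(words) > 1:
            senders.append(words[1])
    return senders


# Illustrative input in the same shape as the mail file (hypothetical lines)
sample = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008\n",
    "From: stephen.marquard@uct.ac.za\n",
    "Subject: example\n",
    "From louis@media.berkeley.edu Fri Jan  4 18:10:48 2008\n",
]

senders = extract_senders(sample)
for address in senders:
    print(address)
print("There were %d lines in the file with From as the first word" % len(senders))
```

Because a file object yields its lines when iterated, the same function could be called as extract_senders(open(fname)). Each line is visited once, so the work no longer grows with the number of matches already collected, which is what the inner loop over matches does in the code above.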

0 answers:

No answers