List comprehension for extracting re.findall matches

Time: 2017-06-14 22:32:28

Tags: python regex functional-programming iterator generator

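I have a regular expression with a single group, and I want to use it to map a list of strings to a filtered list of the matched strings. Currently, I am using the following: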

matches = (re.findall(r'wh(at)ever', line) for line in lines)
matches = [m[0] for m in matches if m]

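How can I do this more elegantly using only filter, map, and comprehensions? Obviously I could use a for loop, but I would like to know whether it can be done purely by manipulating iterators.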

2 answers:

Answer 0: (score: 1)

You can use map and filter. Here is one way.

matches = map(lambda x: x[0], filter(None, map(lambda x: re.findall(r'wh(at)ever', x), lines)))

If you are using Python 3, don't forget to wrap the result in list(...), because map returns a lazy iterator there.
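
To illustrate the Python 3 point, here is a minimal sketch. The sample `lines` list is made up for illustration and is not from the question. In Python 3, `map` and `filter` return lazy iterators, so the result has to be materialized with `list(...)` before it behaves like a list.

```python
import re

# Hypothetical sample input, not from the question
lines = ['whatever', 'nothing here', 'whatever and whatever']

found = map(lambda x: re.findall(r'wh(at)ever', x), lines)
matches = map(lambda x: x[0], filter(None, found))

# In Python 3 this is a lazy map object, not a list
is_lazy = not isinstance(matches, list)

# list(...) consumes the iterator and builds the actual list
result = list(matches)
print(is_lazy, result)
```

Note that the iterator can only be consumed once; calling `list(matches)` a second time would give an empty list.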

However, I don't think anything more "elegant" is needed here. What you are doing is perfectly fine.

Another option, courtesy of @juanpa.arrivillaga:

from functools import partial
from operator import itemgetter
list(map(itemgetter(0), filter(None, map(partial(re.findall, r'wh(at)ever'), lines))))
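
As a rough sketch of how the pieces of that version fit together (the sample data and the names `find` and `first` are invented for illustration): `partial(re.findall, pattern)` fixes the pattern argument so the resulting callable takes only the string, and `itemgetter(0)` plays the role of `lambda x: x[0]`.

```python
import re
from functools import partial
from operator import itemgetter

# Hypothetical sample input for illustration only
lines = ['whatever', '', 'whatever whatever']

# find(s) behaves like re.findall(r'wh(at)ever', s)
find = partial(re.findall, r'wh(at)ever')
# first(seq) behaves like seq[0]
first = itemgetter(0)

functional = list(map(first, filter(None, map(find, lines))))
comprehension = [m[0] for m in (find(line) for line in lines) if m]

# Both approaches should produce the same list
print(functional == comprehension)
```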

Answer 1: (score: 1)

There's no real advantage in obfuscating your code with map, filter, or other functional tricks, since a list comprehension is fast, simple, and clear:

import re

lines = ['wh1atever wh1btever', 'wh2atever', '', 'wh4atever wh4btever wh4ctever']

# Since you only want the first match on each line, re.findall does
# unnecessary work; re.search is more appropriate.

pat1 = re.compile(r'wh(..)tever')
res1 = [m.group(1) for m in (pat1.search(line) for line in lines) if m]

print(res1)
# ['1a', '2a', '4a']


# Or, if there are only a few lines, you can join them and use re.findall
# after all, with a pattern that consumes the rest of the line.

pat2 = re.compile(r'wh(..)tever.*')
res2 = pat2.findall("\n".join(lines))

print(res2)
# ['1a', '2a', '4a']
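
If Python 3.8 or later is available (an assumption; the answers here predate it), an assignment expression lets the `re.search` variant be written as a single comprehension without the inner generator. A minimal sketch using the same sample data:

```python
import re

lines = ['wh1atever wh1btever', 'wh2atever', '', 'wh4atever wh4btever wh4ctever']
pat = re.compile(r'wh(..)tever')

# (m := ...) binds the match object and skips lines where search returns None
res = [m.group(1) for line in lines if (m := pat.search(line))]
print(res)
```

Whether this reads better than the nested generator is a matter of taste; both make a single pass over `lines`.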