Python regex statements on one line

Time: 2014-07-28 05:07:54

Tags: python regex

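I have three regular expressions, and I would like to combine them into a single expression without changing the output. The code below collects the matching words from a text file into a list.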

import re
a=[]
with open('qwert.txt', 'r') as f:
    for line in f:
        res = re.findall(r'(?:Prof[.](\w+))', line)
        if res: 
            a.extend(res)
        res = re.findall(r'(?:As (\w+))', line)
        if res:
            a.extend(res)
        res = re.findall(r'\w+(?==\w)', line)
        if res:
            a.extend(res)

print(a)

qwert.txt

As every
prof.John and Prof.Keel and goodthing=him
Prof.Tensa
Keel a good person As kim
kim is fine
Prof.Jees
As John winning Nobel prize
As Mary wins all prize
sa for ask
car

he=is good

Output:

['every', 'Keel', 'goodthing', 'Tensa', 'kim', 'Jees', 'John', 'Mary', 'he']

How can I put the three regex statements on one line?

2 answers:

Answer 0: (score: 0)

You can use the "|" operator, which lets you match one expression or another:

res = re.findall(r'(?:Prof[.](\w+))|(?:As (\w+))|(?:\w+(?==\w))', line)
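
A quick check of what this combined pattern returns may be useful. Because it contains two capturing groups, `re.findall` yields 2-tuples, and a match that comes from the third alternative (which has no group) appears as a tuple of empty strings. The sketch below uses one line from qwert.txt in the question; the variable names `pattern` and `line` are only illustrative.

```python
import re

# Combined pattern from the answer: two capturing groups,
# while the third alternative captures nothing.
pattern = r'(?:Prof[.](\w+))|(?:As (\w+))|(?:\w+(?==\w))'

line = 'prof.John and Prof.Keel and goodthing=him'
res = re.findall(pattern, line)

# 'Keel' is captured by group 1; 'goodthing' matches but is not captured.
print(res)  # [('Keel', ''), ('', '')]
```

So with this exact pattern the word before "=" is matched but does not appear in the result.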

Answer 1: (score: 0)

You need to wrap the last \w+ in a capturing group; the example also enables the multiline flag (re.M).

>>> import re
>>> a=[]
>>> with open('qwert.txt', 'r') as f:
...     for line in f:
...         res = re.findall(r'(?:Prof[.](\w+))|(?:As (\w+))|(\w+)(?==\w)', line, re.M)
...         if res:
...             a.extend(res)
... 
>>> a
[('', 'every', ''), ('Keel', '', ''), ('', '', 'goodthing'), ('Tensa', '', ''), ('', 'kim', ''), ('Jees', '', ''), ('', 'John', ''), ('', 'Mary', ''), ('', '', 'he')]
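
If a flat list is wanted from the three-group pattern, one possible approach is to join each tuple: every match comes from exactly one alternative, so exactly one group per tuple is non-empty. This is a minimal sketch, with a few lines from qwert.txt given inline (the `lines` list is an assumption standing in for the file).

```python
import re

# Three alternatives, each with its own capturing group.
pattern = re.compile(r'(?:Prof[.](\w+))|(?:As (\w+))|(\w+)(?==\w)')

# A few sample lines from qwert.txt, used here instead of reading the file.
lines = ['As every', 'prof.John and Prof.Keel and goodthing=him', 'he=is good']

a = []
for line in lines:
    # Each match is a 3-tuple with exactly one non-empty string,
    # so joining the tuple leaves just the captured word.
    a.extend(''.join(groups) for groups in pattern.findall(line))

print(a)  # ['every', 'Keel', 'goodthing', 'he']
```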

Without any capturing groups:

>>> import re
>>> a=[]
>>> with open('qwert.txt', 'r') as f:
...     for line in f:
...         res = re.findall(r'(?<=Prof[.])\w+|(?<=As )\w+|\w+(?==\w)', line, re.M)
...         if res:
...             a.extend(res)
... 
>>> a
['every', 'Keel', 'goodthing', 'Tensa', 'kim', 'Jees', 'John', 'Mary', 'he']
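
As a variation, the lookbehind pattern could probably be applied once to the whole file content instead of line by line. The sketch below assumes the sample text from the question is available as a string (in practice it might come from `f.read()`). It omits `re.M`, since that flag only changes how `^` and `$` behave and this pattern uses neither.

```python
import re

# Sample text copied from qwert.txt in the question.
text = """As every
prof.John and Prof.Keel and goodthing=him
Prof.Tensa
Keel a good person As kim
kim is fine
Prof.Jees
As John winning Nobel prize
As Mary wins all prize
sa for ask
car

he=is good
"""

# No capturing groups, so findall returns plain strings.
# Both lookbehinds are fixed-width, which Python's re module requires.
pattern = re.compile(r'(?<=Prof[.])\w+|(?<=As )\w+|\w+(?==\w)')

a = pattern.findall(text)
print(a)
# ['every', 'Keel', 'goodthing', 'Tensa', 'kim', 'Jees', 'John', 'Mary', 'he']
```

One difference from the three separate `findall` calls in the question: a single combined pattern reports matches in the order they occur in the text, not grouped by expression. For this sample file both orders are the same.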