Counting adverbs using the re module

Time: 2019-06-15 14:53:29

Tags: python python-3.x

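Count the adverbs in a given string using re.findall() in Python. An adverb is any word that ends in "ly". Words that contain "ly" anywhere other than the last two characters (e.g. "flying") should not be counted.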

import re

def count_adverbs(text):

    advbs = re.findall(r"\w+ly", text)
    if advbs:
        return len(advbs)
    else:
        return 0

For example, I have these two strings:

a = "flying"
b = "i clearly i lying lonely"

print(count_adverbs(a)) gives 1, but it should be 0, because "ly" should only be counted when it is at the end of a word. print(count_adverbs(b)) works fine; it gives 2.

2 answers:

Answer 0 (score: 3)

You can use the \b token to indicate a word boundary:

\w+ly\b

However, you don't need regex here. String operations with split and endswith are enough, and should be faster than regex:

In [207]: [word for word in b.split() if word.endswith('ly')]
Out[207]: ['clearly', 'lonely']

In [208]: re.findall(r'\w+ly\b', b)
Out[208]: ['clearly', 'lonely']
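
Note that the two approaches agree on this input but are not strictly equivalent, which may matter for real text. A minimal sketch of the difference, assuming a sample sentence that is not from the question (the names sample, by_split and by_regex are illustrative only):

```python
import re

# Hypothetical sample text with punctuation and a bare "ly" token.
sample = "She spoke clearly, then slowly. ly"

# split() keeps punctuation attached to words, so "clearly," and "slowly."
# fail endswith('ly'), while the bare token "ly" passes.
by_split = [word for word in sample.split() if word.endswith('ly')]

# \w+ requires at least one word character before "ly", and \b matches
# between "y" and the punctuation that follows it.
by_regex = re.findall(r'\w+ly\b', sample)

print(by_split)
print(by_regex)
```

If the input can contain punctuation, the regex version is probably the safer of the two unless the words are stripped of punctuation first.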

Timing:

In [209]: %timeit [word for word in b.split() if word.endswith('ly')]
1.37 µs ± 13.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [210]: %timeit re.findall(r'\w+ly\b', b)
2.27 µs ± 106 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
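
The %timeit magic only exists in IPython. As a rough sketch, a similar comparison can be run in plain Python with the standard timeit module. The helper names with_split and with_regex and the precompiled pattern are assumptions added for illustration, and the absolute numbers will depend on the machine:

```python
import re
import timeit

b = "i clearly i lying lonely"

# Compile once so that looking up the pattern is not part of every call.
pattern = re.compile(r'\w+ly\b')

def with_split():
    return [word for word in b.split() if word.endswith('ly')]

def with_regex():
    return pattern.findall(b)

# Best of 5 repeats of 100,000 calls each.
split_time = min(timeit.repeat(with_split, number=100_000, repeat=5))
regex_time = min(timeit.repeat(with_regex, number=100_000, repeat=5))

print(f"split/endswith: {split_time:.4f} s per 100,000 calls")
print(f"compiled regex: {regex_time:.4f} s per 100,000 calls")
```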

Answer 1 (score: 2)

You need to use \b to define word boundaries in the regex, so your regex becomes \b\w+ly\b, which ensures that ly falls at the end of a word.

You can also simplify the return statement by checking whether advbs is non-empty in a ternary expression:

import re
def count_adverbs(text):

    advbs = re.findall(r"\b\w+ly\b", text)

    #Return length if advbs are non-empty, else return 0
    return len(advbs) if advbs else 0

print(count_adverbs("flying"))
print(count_adverbs("i clearly i lying lonely"))
print(count_adverbs("ly ly"))

The output will be

0
2
0
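
Since re.findall() returns an empty list when nothing matches, and len([]) is 0, the conditional could arguably be dropped altogether. A minimal sketch of that variant, using the same \b\w+ly\b pattern:

```python
import re

def count_adverbs(text):
    # re.findall returns [] when there is no match, and len([]) is 0,
    # so no explicit empty check is needed.
    return len(re.findall(r"\b\w+ly\b", text))

print(count_adverbs("flying"))
print(count_adverbs("i clearly i lying lonely"))
print(count_adverbs("ly ly"))
```

This should print the same three values as the version with the ternary expression.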