Python regex: match if surrounded by

Date: 2017-09-10 17:14:45

Tags: python regex

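I am trying to write a regex, used with findall, that counts how many times a specific word appears between a personal pronoun and a '?'. I did some research but could not find how to check whether a string starts with ... and ends with .... Is there something like that?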

Edit: here is an example. I have a long text, and I want to know how many times the word "crazy" appears in a question of this kind:

Are you crazy ? -> match, because the word sits between a personal pronoun and a '?'
you are crazy ? -> no match, because the word sits between a verb and a '?'
Is he crazy ? -> match, because the word sits between a personal pronoun and a '?'

2 answers:

Answer 0: (score: 0)

What you are after seems more complicated than a regex can fully handle, but this may help:

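import re

word_to_match = "crazy"
# Within one sentence, match a pronoun followed directly by the target word and a '?'.
pattern = r"[^.?]*\s(he|she|you|it|they|I)\s({})\s?\?".format(re.escape(word_to_match))

# findall returns one (pronoun, word) tuple per match; "You are crazy?" is skipped.
print(re.findall(pattern, "Are you crazy? You are crazy? Is he crazy?"))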
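
To get the count the question asks for, one option is to take the length of the findall result. The following is only a sketch: the helper name `count_pronoun_questions` and the `PRONOUNS` constant are made up for illustration, and the leading `\b` is an addition that is not in the pattern above.

```python
import re

# Same pronoun list as in the pattern above (assumed to be sufficient).
PRONOUNS = "he|she|you|it|they|I"

def count_pronoun_questions(text, word):
    """Hypothetical helper: count '<pronoun> <word> ?' occurrences in text."""
    # \b keeps the pronoun from matching inside a longer word.
    pattern = r"\b(?:{})\s+{}\s?\?".format(PRONOUNS, re.escape(word))
    return len(re.findall(pattern, text))

print(count_pronoun_questions("Are you crazy? You are crazy? Is he crazy?", "crazy"))
```

The match is case-sensitive, so a sentence starting with "You" is not treated as a pronoun question here; passing `re.IGNORECASE` would change that.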

Answer 1: (score: 0)

If you do not want to match the "you are crazy ?" form, this should do the trick:

>>> import re
>>> pat = r'(?:I|you|he|she|it|we|they|me|him|her|us|them) (?:an? )?(\w\w+)\s?\?'
>>> re.findall(pat, "is he available ? Isn't she a jerk ?")
['available', 'jerk']
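
One caveat with a pattern of this shape: a pronoun can match inside a longer word, for example the "he" in "the". The sketch below compares the pattern with and without a leading `\b`; the names `loose` and `strict` and the sample sentence are invented for this illustration.

```python
import re

pronouns = "I|you|he|she|it|we|they|me|him|her|us|them"
# Without a word boundary, "he" can match inside "the".
loose = r"(?:{}) (?:an? )?(\w\w+)\s?\?".format(pronouns)
# With \b, the pronoun has to start at a word boundary.
strict = r"\b(?:{}) (?:an? )?(\w\w+)\s?\?".format(pronouns)

text = "is the door ? is he available ?"
loose_hits = re.findall(loose, text)
strict_hits = re.findall(strict, text)
print(loose_hits)
print(strict_hits)
```

Only the strict variant ignores "the door ?", which is why adding `\b` in front of the pronoun group may be worth considering.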

Otherwise, this may work:

>>> import re
>>> pat = r'(?:I|you|he|she|it|we|they|me|him|her|us|them)((?: [a-z]+)+)\s?\?'
>>> filt = re.findall(pat, "is he available ? Isn't she a jerk ? you are crazy ?")
>>> filt
[' available', ' a jerk', ' are crazy']
>>> # Then, to get the number of times "crazy" appeared in a question:
>>> len([el for el in filt if "crazy" in el])
1

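With the second method you are in fact applying two filters: the first extracts everything between a personal pronoun and the "?", and the second counts how many times the target word appears in those question forms. For that second step, a more effective approach would be to use another regex, because my quick and dirty way would count "notsocrazy" as a match.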
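
A minimal sketch of that second regex, assuming a whole-word match with `\b` is what is wanted. The names `extract` and `target` and the sample text are made up for this example, and the leading `\b` on the pronoun group is an addition to the pattern shown earlier.

```python
import re

pronouns = "I|you|he|she|it|we|they|me|him|her|us|them"
# First filter: everything between a personal pronoun and the '?'.
extract = re.compile(r"\b(?:{})((?: [a-z]+)+)\s?\?".format(pronouns))
# Second filter: the target as a whole word, so "notsocrazy" does not count.
target = re.compile(r"\bcrazy\b")

text = "is he available ? you are crazy ? is she notsocrazy ?"
filt = extract.findall(text)
count = sum(1 for el in filt if target.search(el))
naive_count = len([el for el in filt if "crazy" in el])

print(filt)
print(count, naive_count)
```

Here the substring test counts both "crazy" and "notsocrazy", while the word-boundary regex counts only the former.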