Python regex - different results in findall and sub

Date: 2014-12-11 18:26:06

Tags: python regex

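I am trying to replace occurrences of the word 'brunch' with 'BRUNCH'. I am using a regex that correctly identifies the occurrences, but when I use re.sub, it replaces more text than re.findall reported. The regex I am using is: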

reg = re.compile(r'(?:^|\.)(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)[^.]*(brunch)',re.IGNORECASE)

The string is:

str = 'Valid only for dine-in January 2 - March 31, 2015. Excludes brunch, happy hour, holidays, and February 13 - 15, 2015.'

I expect it to produce:

'Valid only for dine-in January 2 - March 31, 2015. Excludes BRUNCH, happy hour, holidays, and February 13 - 15, 2015.'

Steps:

>>> reg.findall(str)
['brunch']
>>> reg.sub('BRUNCH',str)
'Valid only for dine-in January 2 - March 31, 2015BRUNCH, happy hour, holidays, and February 13 - 15, 2015.'

Edit:

The final solution I used was:

reg = re.compile(r'((?:^|\.))(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)([^.]*)(brunch)',re.IGNORECASE)
reg.sub(r'\g<1>\g<2>BRUNCH',str)
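
As a minimal runnable sketch of the solution above: the pattern captures everything before "brunch" in groups 1 and 2, and the replacement template writes those groups back so that only the word itself changes. The variable name `text` is an assumption used here in place of `str`, to avoid shadowing the built-in.

```python
import re

# Group 1: sentence boundary (start of string or a period).
# Group 2: the text between that boundary and "brunch".
# Group 3: the word "brunch" itself.
reg = re.compile(
    r'((?:^|\.))(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)([^.]*)(brunch)',
    re.IGNORECASE,
)

# "text" is used instead of "str" so the built-in type is not shadowed.
text = ('Valid only for dine-in January 2 - March 31, 2015. '
        'Excludes brunch, happy hour, holidays, and February 13 - 15, 2015.')

# Put groups 1 and 2 back unchanged and replace only group 3.
result = reg.sub(r'\g<1>\g<2>BRUNCH', text)
print(result)
```

The raw-string prefix on the replacement template keeps `\g` from being treated as a string escape.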

3 answers:

Answer 0: (score: 2)

Use re.sub with

(^|\.)(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)([^.]*)(brunch)

and replace with \1\2BRUNCH. See the demo.

https://regex101.com/r/eZ0yP4/16
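
To illustrate what the negative lookaheads in this pattern do, here is a small sketch. The three-sentence sample string is made up for illustration and is not from the question; it suggests that a sentence mentioning "weekend" is left alone while the others are rewritten.

```python
import re

pattern = re.compile(
    r'(^|\.)(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)([^.]*)(brunch)',
    re.IGNORECASE,
)

# Hypothetical sample text: the middle sentence contains "weekend",
# so its lookahead fails and that sentence is skipped.
sample = ('Excludes brunch. Not valid for weekend brunch. '
          'Free brunch on Fridays.')

# \1 and \2 restore the boundary and the text before "brunch".
replaced = pattern.sub(r'\1\2BRUNCH', sample)
print(replaced)
```

Because `[^.]*` is greedy, a sentence containing "brunch" more than once would only have its last occurrence replaced by this pattern.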

Answer 1: (score: 2)

Using the regex:

(^|\.)(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)([^.]*)brunch

DEMO

Replace the matched characters with \1\2BRUNCH.

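Answer 2: (score: 0)

Why does it match more than brunch?

Because your regex actually matches more than brunch: the whole match starts at the sentence boundary and runs through brunch.

See the link on how the regex matches.

Why doesn't it show up in findall?

Because you have wrapped only brunch in parentheses:
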
>>> reg = re.compile(r'(?:^|\.)(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)[^.]*(brunch)',re.IGNORECASE)
>>> reg.findall(str)
['brunch']

After wrapping the entire ([^.]*brunch) in parentheses:

>>> reg = re.compile(r'(?:^|\.)(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)([^.]*brunch)',re.IGNORECASE)
>>> reg.findall(str)
[' Excludes brunch']
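  • When the pattern contains a capturing group, re.findall returns only the captured text and ignores the parts of the match that are not captured, whereas re.sub replaces the entire match.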
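
The difference can be checked directly by comparing the full match with the captured group. This is a minimal sketch using the pattern and string from the question; the names `text`, `full_matches`, and `captured` are chosen here for illustration.

```python
import re

reg = re.compile(
    r'(?:^|\.)(?![^.]*saturday)(?![^.]*sunday)(?![^.]*weekend)[^.]*(brunch)',
    re.IGNORECASE,
)

text = ('Valid only for dine-in January 2 - March 31, 2015. '
        'Excludes brunch, happy hour, holidays, and February 13 - 15, 2015.')

# group(0) is the whole match: this is the span re.sub replaces.
full_matches = [m.group(0) for m in reg.finditer(text)]

# With one capturing group, findall returns only that group's text.
captured = reg.findall(text)

print(full_matches)
print(captured)
```

Here `full_matches` holds the period, the leading text, and the word, while `captured` holds only the word, which is why the two calls in the question appear to disagree.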