Question

Why doesn't the word boundary work?

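From reading this site, I understand that word boundaries work as follows:

There are three different positions that qualify as word boundaries:

The string a below appears to satisfy at least one of the positions listed above.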

a = 'Builders Club The Ohio State'
re.sub('\bThe\b', '', a, flags=re.IGNORECASE)

Output ('The' is left unchanged):

'Builders Club The Ohio State'

Why doesn't the word boundary work?

When I put a space before and after 'The' in the pattern, the regex seems to work.

a = 'Builders Club The Ohio State'
re.sub(' The ', ' ', a, flags=re.IGNORECASE)

Output:

'Builders Club Ohio State'

Answer 1

You need to use a raw string for the regular expression pattern (raw strings do not process escape sequences):

>>> import re
>>> a = 'Builders Club The Ohio State'
>>> re.sub(r'\bThe\b', '', a, flags=re.IGNORECASE)
'Builders Club  Ohio State'
>>>

Otherwise, \b is interpreted as a backspace character:

>>> print('x\by')
y
>>> print(r'x\by')
x\by
>>>
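
A minimal sketch of the same point, assuming Python 3: in a normal string literal, \b collapses into a single backspace character before re ever sees the pattern, while the raw string keeps the backslash so that re can treat \b as a word boundary. The variable names here are illustrative only.

```python
import re

a = 'Builders Club The Ohio State'

plain = '\bThe\b'   # each '\b' becomes one backspace character (0x08)
raw = r'\bThe\b'    # backslash and 'b' stay as two separate characters

# The plain literal holds 5 characters, the raw one holds 7.
plain_len = len(plain)
raw_len = len(raw)

# The plain pattern searches for backspace + 'The' + backspace,
# which does not occur in a, so nothing is replaced.
unchanged = re.sub(plain, '', a, flags=re.IGNORECASE)

# The raw pattern uses \b as a word boundary and removes the word.
changed = re.sub(raw, '', a, flags=re.IGNORECASE)

# Doubling the backslashes in a normal literal gives the same pattern.
same = ('\\bThe\\b' == raw)
```

Writing the pattern as '\\bThe\\b' works as well, but the raw string is usually easier to read.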

Answer 2

Try this:

import re
# The ur'' prefix is only valid in Python 2; in Python 3 use a plain raw string.
p = re.compile(r'\bThe\b', re.IGNORECASE)
test_str = u"Builders Club The Ohio State"
subst = u""

result = re.sub(p, subst, test_str)

Output:

Builders Club  Ohio State

Here is a DEMO
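
Removing only the word leaves both of its surrounding spaces behind, so the result contains two consecutive spaces. One possible way to tidy that up, sketched under the assumption that Python 3 is used and that swallowing the whitespace after the word is acceptable; the helper name remove_word is hypothetical and not part of the re module.

```python
import re

def remove_word(text, word):
    """Remove whole-word, case-insensitive matches of word and the whitespace after them."""
    # re.escape protects against regex metacharacters inside word.
    # \b only behaves as expected here when word starts and ends with a word character.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b\s*', re.IGNORECASE)
    # strip() drops a space left over when the match was at the very end.
    return pattern.sub('', text).strip()

cleaned = remove_word('Builders Club The Ohio State', 'The')
```

Because the pattern is still anchored with \b, a word such as 'Theory' is left alone.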