Python regex pattern does not match

Time: 2015-04-17 02:16:40

Tags: python regex

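I have a regex matching problem with the following pattern and string. The pattern is basically a name, followed by any number of characters, followed by one of several phrases (see the pattern below), followed by any number of characters, followed by the institution name.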

import re

pattern = "[David Maxwell|David|Maxwell] .* [educated at|graduated from|attended|studied at|graduate of] .* Eton College"
str = "David Maxwell was educated at Eton College, where he was a King's Scholar and Captain of Boats, and at Cambridge University where he rowed in the winning Cambridge boat in the 1971 and 1972 Boat Races."
match = re.search(pattern, str)

However, re.search reports no match for the str above. Is my regex correct? I am new to regex. Any help is appreciated.

1 answer:

Answer 0 (score: 5)

[...] means "any single character from this set of characters". If you want "any one word from this set of words", you need to use parentheses: (...|...)
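
To make the difference concrete, here is a minimal sketch that compares the two forms on a shortened version of the name list. The variable names (char_class, alternation, class_match, group_match) are only illustrative and do not come from the question.

```python
import re

# Square brackets build a character class: it matches exactly ONE character
# taken from the set {D, a, v, i, d, |, M, x, w, e, l}. The "|" is a literal here.
char_class = re.compile(r"[David|Maxwell]")

# Parentheses with "|" build an alternation between whole words.
alternation = re.compile(r"(David|Maxwell)")

class_match = char_class.search("Maxwell")
group_match = alternation.search("Maxwell")

print(repr(class_match.group()))  # a single character from the set
print(repr(group_match.group()))  # the whole word
```

Note that inside the brackets the pipe character loses its "or" meaning, so char_class would also match a literal "|" in the input.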

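There is another problem in your expression: " .* " (space, dot, asterisk, space) means "a space, followed by zero or more characters, followed by a space". In other words, the shortest thing it can match is two spaces. However, your text has only one space between "educated at" and "Eton College".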

>>> import re
>>> pattern = '(David Maxwell|David|Maxwell).*(educated at|graduated from|attended|studied at|graduate of).*Eton College'
>>> str = "David Maxwell was educated at Eton College, where he was a King's Scholar and Captain of Boats, and at Cambridge University where he rowed in the winning Cambridge boat in the 1971 and 1972 Boat Races."
>>> re.search(pattern, str)
<_sre.SRE_Match object at 0x1006d10b8>
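
The spacing issue can also be checked in isolation. The following is a rough sketch that keeps the alternation fix in both patterns and varies only the spaces around ".*". The names (names, phrases, strict, relaxed), the shortened sample text, and the use of non-capturing groups (?:...) are my own choices for illustration, not part of the question.

```python
import re

# Shortened version of the sentence from the question.
text = "David Maxwell was educated at Eton College, where he was a King's Scholar."

# Non-capturing groups: alternation between whole words or phrases.
names = r"(?:David Maxwell|David|Maxwell)"
phrases = r"(?:educated at|graduated from|attended|studied at|graduate of)"

# " .* " needs at least two spaces between the phrase and the institution.
strict = names + " .* " + phrases + " .* " + "Eton College"

# ".*" accepts anything in between, including a single space.
relaxed = names + ".*" + phrases + ".*" + "Eton College"

strict_match = re.search(strict, text)
relaxed_match = re.search(relaxed, text)

print(strict_match)           # no match: only one space before "Eton College"
print(relaxed_match.group())  # the span from the name to the institution
```

If you still want to require whitespace around the phrases, something like r"\s+" on each side of the phrase group (with ".*" only where extra words are allowed) may be closer to the intent than a bare ".*".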