Question

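I'm trying to create a regex that captures data at the start and the end of a string, but not the data in the middle. Here is a simplified example that gets the concept across: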

Player Hero wins the game on last minute goal. Score: 2. Opponent: 1. Points: 3.
Player Doug loses the game. Score: 1. Opponent: 2. Points: 0
Player Hero loses the game. Score: 1. Opponent: 3. Points: 0.
Player Guy wins the game. Score: 3. Opponent: 1. Points: 3.
Player Hero ties the game [2ycs]. Score: 2. Opponent: 2. Points: 1.
Player Jim has a tough go of it [1yc]. Score: 0. Opponent: 7. Points: 0.

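What I need is a regex that grabs "Player Hero", ignores the text in the middle, and then grabs the "Score: 2. Opponent: 1. Points: 3." data portion that goes with "Player Hero". (Note: I don't want the data for the other players.)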

I understand how to match the beginning:

re.compile('Player Hero')

And the end:

re.compile(r'Score: \d*\. Opponent: \d*\. Points: \d\.')

What I'm struggling with is how to handle the non-conforming text in the middle of the string, so that I can basically combine the two regexes above.

Answer 1

I believe the pattern you're looking for is simply:

^Player Hero .+ Score: \d*\. Opponent: \d*\. Points: \d\.$

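.+ matches one or more of any character (except a newline)
^ matches the start of a line (with the multiline flag enabled)
$ matches the end of a line (with the multiline flag enabled)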
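
As a minimal sketch of how this pattern could be applied in Python, assuming each record sits on its own line and `re.MULTILINE` is passed so that ^ and $ anchor at line boundaries (the names `text`, `pattern`, and `hero_lines` are illustrative, not from the original post):

```python
import re

# Sample data, assumed here to be one record per line.
text = (
    "Player Hero wins the game on last minute goal. Score: 2. Opponent: 1. Points: 3.\n"
    "Player Doug loses the game. Score: 1. Opponent: 2. Points: 0.\n"
    "Player Hero loses the game. Score: 1. Opponent: 3. Points: 0.\n"
    "Player Guy wins the game. Score: 3. Opponent: 1. Points: 3.\n"
)

# re.MULTILINE makes ^ and $ match at every line boundary,
# not only at the start and end of the whole string.
pattern = re.compile(
    r"^Player Hero .+ Score: \d*\. Opponent: \d*\. Points: \d\.$",
    re.MULTILINE,
)

# With no capture groups, findall returns each full matching line.
hero_lines = pattern.findall(text)
for line in hero_lines:
    print(line)
```

Because `.` does not match a newline by default, the `.+` in the middle cannot run into the next player's record.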

You can try it here: https://regex101.com/r/Y4FMXZ/1

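Note that the third candidate does not match because Score has no colon, but I assume that's a typo. That line also has trailing whitespace at the end. If that happens, just remove the $.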
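
As an alternative to dropping the $ entirely, one option is to allow optional spaces or tabs before the end of the line. The sketch below compares the two behaviours on illustrative data; the trailing spaces on the second line and the names `strict` and `tolerant` are assumptions made for the example:

```python
import re

text = (
    "Player Hero wins the game on last minute goal. Score: 2. Opponent: 1. Points: 3.\n"
    "Player Hero loses the game. Score: 1. Opponent: 3. Points: 0.   \n"  # trailing spaces
)

# Strict version: the final period must be the last character on the line.
strict = re.compile(
    r"^Player Hero .+ Score: \d*\. Opponent: \d*\. Points: \d\.$",
    re.MULTILINE,
)

# Tolerant version: [ \t]* permits trailing spaces or tabs before the line end.
tolerant = re.compile(
    r"^Player Hero .+ Score: \d*\. Opponent: \d*\. Points: \d\.[ \t]*$",
    re.MULTILINE,
)

strict_count = len(strict.findall(text))
tolerant_count = len(tolerant.findall(text))
print(strict_count, tolerant_count)
```

Keeping the $ anchor still guards against extra non-whitespace text after the Points field, which removing $ would silently accept.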

If you're interested in capturing the numbers, just wrap them in parentheses to put them in capture groups.

^Player Hero .+ Score: (\d*)\. Opponent:? (\d*)\. Points: (\d)\.$
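
A short sketch of reading those capture groups in Python follows. It assumes one record per line and `re.MULTILINE`; the variable names and the conversion to `int` are illustrative choices rather than part of the original answer:

```python
import re

text = (
    "Player Hero wins the game on last minute goal. Score: 2. Opponent: 1. Points: 3.\n"
    "Player Doug loses the game. Score: 1. Opponent: 2. Points: 0.\n"
    "Player Hero loses the game. Score: 1. Opponent: 3. Points: 0.\n"
    "Player Hero ties the game [2ycs]. Score: 2. Opponent: 2. Points: 1.\n"
)

pattern = re.compile(
    r"^Player Hero .+ Score: (\d*)\. Opponent:? (\d*)\. Points: (\d)\.$",
    re.MULTILINE,
)

# Each match exposes the three parenthesised groups in order:
# score, opponent score, points.
results = [
    tuple(int(value) for value in match.groups())
    for match in pattern.finditer(text)
]
print(results)  # [(2, 1, 3), (1, 3, 0), (2, 2, 1)]
```

Note that `\d*` can match an empty string, in which case `int()` would raise; using `\d+` avoids that if every field is expected to contain at least one digit.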

Answer 2

You have to use the multiline (?m) directive:

import re

# text is assumed to hold the sample data, one record per line
for item in re.findall(r"(?m)^Player Hero.*?Score:\s*(\d+)\.\s*Opponent:\s*(\d+)\.\s*Points:\s*(\d+)\.?\s*$", text):
    score, oppo, points = item
    print(f"score:{score},oppo:{oppo},points:{points}")

score:2,oppo:1,points:3
score:1,oppo:3,points:0
score:2,oppo:2,points:1
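
If the records are not separated by newlines and everything arrives as one long string, the ^ and $ anchors no longer help. One possible workaround, offered here as a sketch rather than as part of either answer, is to let the middle part match lazily while forbidding it from crossing into the next "Player " record with a negative lookahead. The names `text`, `pattern`, and `rows` are illustrative:

```python
import re

# The sample data as a single line, with no newline between records.
text = (
    "Player Hero wins the game on last minute goal. Score: 2. Opponent: 1. Points: 3. "
    "Player Doug loses the game. Score: 1. Opponent: 2. Points: 0 "
    "Player Hero loses the game. Score: 1. Opponent: 3. Points: 0. "
    "Player Guy wins the game. Score: 3. Opponent: 1. Points: 3. "
    "Player Hero ties the game [2ycs]. Score: 2. Opponent: 2. Points: 1. "
    "Player Jim has a tough go of it [1yc]. Score: 0. Opponent: 7. Points: 0."
)

# (?:(?!Player ).)*? consumes characters one at a time, lazily, and stops
# before any position where the next "Player " record begins.
pattern = re.compile(
    r"Player Hero\b(?:(?!Player ).)*?"
    r"Score:\s*(\d+)\.\s*Opponent:\s*(\d+)\.\s*Points:\s*(\d+)"
)

rows = pattern.findall(text)
for score, oppo, points in rows:
    print(f"score:{score},oppo:{oppo},points:{points}")
```

The lookahead matters when a "Player Hero" record is missing its Score field: without it, the lazy match would run on and pick up the next player's numbers.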

Python: creating a regex with non-conforming text in the middle of the string
