Python regex to capture the text between two matches?

Time: 2020-06-15 20:48:29

Tags: python regex

I want a regular expression that matches any text between two strings:

DATE POSTED: MAY 30, 2018, some text here, Garcia Answer 1: more text, DATE POSTED: MARCH 8, 2017, some text here, Smith Answer 2: more text, DATE POSTED: JUNE 17, 2018, some text here, Jones Answer 1: more text...

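In this example, I want to search for DATE POSTED: [*date*], and , [*Name*] Answer [*number*]: and grab everything between them.

In other words, I want to find every occurrence of some text here.

I am using Python 3.x.

3 answers:

Answer 0: (score: 0)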

#N/A

This prints everything as a list.

Answer 1: (score: 0)

Here is how:

import re

s = "DATE POSTED: MAY 17, 2018, some text here, Garcia Answer 1: more text"
# The lookbehind and lookahead anchor the match without consuming the delimiters.
print(re.findall(r'(?<=DATE POSTED: MAY 17, 2018, )(.*)(?=, Garcia Answer 1)', s))

Output:

['some text here']
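The pattern above hard-codes one date and one name. Python's re module only accepts fixed-width lookbehinds, so a variable-length date cannot go inside (?<=...). A minimal sketch of a more general variant follows, assuming every record looks like "DATE POSTED: MONTH D, YYYY, text, Name Answer N:" with a single-word name. That layout is inferred from the sample in the question and is not guaranteed.

```python
import re

text = (
    "DATE POSTED: MAY 30, 2018, some text here, Garcia Answer 1: more text, "
    "DATE POSTED: MARCH 8, 2017, some text here, Smith Answer 2: more text, "
    "DATE POSTED: JUNE 17, 2018, some text here, Jones Answer 1: more text..."
)

# Assumed layout: "DATE POSTED: <MONTH> <day>, <year>, <wanted text>, <Name> Answer <n>:"
pattern = re.compile(
    r"DATE POSTED: [A-Z]+ \d{1,2}, \d{4}, "  # opening delimiter, matched normally
    r"(.*?)"                                   # the text to capture, lazily
    r"(?=, \w+ Answer \d+:)"                   # closing delimiter, checked but not consumed
)

between = pattern.findall(text)
print(between)
```

Because the pattern contains exactly one capture group, re.findall returns only the captured text for each record.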

Answer 2: (score: 0)

You can try:

DATE POSTED: .*?, \d{4}, (.*?),

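Explanation of the regex above:

DATE POSTED: .*?, - matches DATE POSTED: literally, followed by everything up to the first comma. That is why I used lazy matching (i.e. .*?).

\d{4}, - matches the year part, i.e. \d{4} stands for the 4 digits before the comma.

(.*?), - a capturing group that lazily matches everything up to the next comma.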
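The breakdown above can also be written into the pattern itself. The sketch below uses re.VERBOSE so that each part carries its own comment. The names verbose_regex and sample are illustrative only, and the sample string is a single record taken from the question.

```python
import re

# The same pattern, written with re.VERBOSE so each part can carry a comment.
# In verbose mode unescaped whitespace is ignored, so literal spaces are written as [ ].
verbose_regex = re.compile(
    r"""
    DATE[ ]POSTED:[ ]   # literal prefix
    .*?,[ ]             # month and day, lazily, up to the first comma
    \d{4},[ ]           # four-digit year followed by a comma
    (.*?),              # capture group: everything up to the next comma
    """,
    re.VERBOSE,
)

sample = "DATE POSTED: MAY 30, 2018, some text here, Garcia Answer 1: more text"
match = verbose_regex.search(sample)
print(match.group(1))
```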

You can find a demo of the regex above here.


Implementation in Python:

import re

regex = r"DATE POSTED: .*?, \d{4}, (.*?),"

test_str = "DATE POSTED: MAY 30, 2018, some text here1, Garcia Answer 1: more text, DATE POSTED: MARCH 8, 2017, some text here2, Smith Answer 2: more text, DATE POSTED: JUNE 17, 2018, some text here3, Jones Answer 1: more text..."

matches = re.findall(regex, test_str)
print(matches)
# The sample texts are numbered (here1, here2, here3) to tell the matches apart.
# Output: ['some text here1', 'some text here2', 'some text here3']

You can find a sample run of the implementation above here.
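One caveat: because the capture group stops at the first comma, text that itself contains a comma would be cut short. A hedged sketch of an alternative follows, anchoring on the closing ", Name Answer N:" delimiter instead. It assumes single-word names and the date layout from the question. The sample string and the group names (date, text, name, number) are made up for illustration.

```python
import re

# Hypothetical sample: the wanted text of the first record contains a comma.
test_str = (
    "DATE POSTED: MAY 30, 2018, first part, second part, Garcia Answer 1: more text, "
    "DATE POSTED: MARCH 8, 2017, some text here2, Smith Answer 2: more text"
)

record = re.compile(
    r"DATE POSTED: (?P<date>[A-Z]+ \d{1,2}, \d{4}), "  # date such as "MAY 30, 2018"
    r"(?P<text>.*?), "                                  # wanted text, lazy, may contain commas
    r"(?P<name>\w+) Answer (?P<number>\d+):"            # closing delimiter
)

# One dict per record, keyed by the named groups.
rows = [m.groupdict() for m in record.finditer(test_str)]
for row in rows:
    print(row["date"], "|", row["text"], "|", row["name"], row["number"])
```

The named groups keep the date, name, and answer number available alongside the text, which may be convenient if the records are later loaded into a table.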