Question

I'm not sure of the best way to summarize this in a one-sentence title, so please edit it to make it clearer if necessary.

I have a list of strings (parsed from a web page), each in the format

"\tLocation\tNext Available Appointment: Date\n"

I want to convert it into a list of lists, each inner list in the format

["Location", "Date"]

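I know which regular expression I would use, but I don't know how to use the results.

(For reference, here is the regular expression that finds what I want.)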

^\t(.*)\t.*: (.*)$

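I found out how to match a regular expression against text, but not how to extract the results into something else. I'm new to Python, though, so I admit I may have missed something when searching.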

Answer 1

You can use the re.findall() function in a list comprehension:

import re
[re.findall(r'^\t(.*)\t.*: (.*)$',i) for i in my_list]

For example:

>>> my_list=["\tLocation\tNext Available Appointment: Date\n","\tLocation2\tNext Available Appointment: Date2\n"]
>>> [re.findall(r'^\t(.*)\t.*: (.*)$',i) for i in my_list]
[[('Location', 'Date')], [('Location2', 'Date2')]]

You can also use re.search() together with groups():

>>> [re.search(r'^\t(.*)\t.*: (.*)$',i).groups() for i in my_list]
[('Location', 'Date'), ('Location2', 'Date2')]

Note that the advantage of re.search() here is that you get a list of tuples rather than a list of lists of tuples (as with findall()).
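
To get exactly the list of lists the question asks for, one option is to convert each match's groups with list(). The sketch below also skips lines that do not match, because re.search() returns None for them and calling .groups() on None raises an AttributeError. The names pattern and rows and the sample data (including the non-matching line) are assumptions for illustration.

```python
import re

# Sample input, assumed to follow the format shown in the question.
my_list = [
    "\tLocation\tNext Available Appointment: Date\n",
    "\tLocation2\tNext Available Appointment: Date2\n",
    "a line that does not match\n",
]

# Compile once, since the same pattern is applied to every string.
pattern = re.compile(r'^\t(.*)\t.*: (.*)$')

rows = []
for line in my_list:
    match = pattern.search(line)
    # search() returns None when the line does not match; skip those lines.
    if match is not None:
        rows.append(list(match.groups()))

print(rows)
```

Whether silently skipping non-matching lines is acceptable depends on the data; raising an error or logging the line may be preferable.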

Answer 2

You can get a single flat list using

import re
p = re.compile(r'^\t(.*)\t.*: (.*)$')
test_str = "\tLocation\tNext Available Appointment: Date\n"
print([item for sublist in p.findall(test_str) for item in sublist])

Output:

['Location', 'Date']

See the IDEONE demo.

Edit:

Alternatively, you can use finditer:

import re
p = re.compile(r'(?m)^\t(.*)\t.*: (.*)$')
test_str = "\tLocation\tNext Available Appointment: Date\n\tLocation1\tNext Available Appointment: Date1\n"
print([(x.group(1), x.group(2)) for x in p.finditer(test_str)])

Output of another demo:

[('Location', 'Date'), ('Location1', 'Date1')]
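
The (?m) prefix in the finditer pattern is the inline form of the re.MULTILINE flag, which makes ^ and $ match at the start and end of every line rather than only at the start and end of the whole string. As a minimal sketch, the same result can be obtained by passing the flag explicitly, here also converting each match to a list as the question requested. The names pattern, text, and pairs are illustrative assumptions.

```python
import re

# re.MULTILINE is equivalent to the inline (?m) flag.
pattern = re.compile(r'^\t(.*)\t.*: (.*)$', re.MULTILINE)

# One string holding several tab-separated lines, as in the finditer example.
text = (
    "\tLocation\tNext Available Appointment: Date\n"
    "\tLocation1\tNext Available Appointment: Date1\n"
)

# finditer() yields one match object per matching line.
pairs = [list(m.groups()) for m in pattern.finditer(text)]
print(pairs)
```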

Using regex backreferences to create an array
