Question

I want to capture a pattern that repeats n times, where n >= 0. My strings look like this:

a = 'x="2"'
b = 'x="2,3", y="hello", z="true"'

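I want to extract 'x' and its value '2,3', 'y' and its value 'hello', and so on. The variables are separated by a comma followed by a space; the values are inside double quotes.

How can I do this with the re library in Python?

I naively tried the following: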

match = re.search(r'^((?P<variable>[0-9a-zA-Z_-]+)="(?P<value>.*)"(?:,\s)?)*', b)

If I print match.groupdict(), it outputs:

{'variable': 'x', 'value': '2,3", y="hello", z="true'}

Answer 1

The reason you are getting more than you bargained for is that you are matching (group naming removed):

".*"

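Because regular expressions are greedy by default, this grabs as much text as possible as long as it can still end on a ", even if the text in between also contains ". You can turn it into a non-greedy match: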

"(?P<value>.*?)"

Or greedily match non-" characters:

"(?P<value>[^"]*)"

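To make the difference concrete, here is a small sketch that runs the three variants against the sample string from the question. The names `greedy`, `lazy`, and `negated` are only illustrative and do not come from the original post.

```python
import re

b = 'x="2,3", y="hello", z="true"'

# Greedy: .* runs on to the last double quote in the string.
greedy = re.search(r'"(.*)"', b).group(1)

# Non-greedy: .*? stops at the first closing double quote.
lazy = re.search(r'"(.*?)"', b).group(1)

# Negated character class: [^"]* can never cross a double quote.
negated = re.search(r'"([^"]*)"', b).group(1)

print(greedy)   # 2,3", y="hello", z="true
print(lazy)     # 2,3
print(negated)  # 2,3
```
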
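The next problem is that you will find this matches only the last occurrence of the pattern in the string, because a repeated capture group keeps only its final iteration. If you want all of an unknown number of matches, you need re.findall(). Unfortunately, findall() does not give you groupdict(). Its cousin re.finditer(), however, returns match objects, which do have that method: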

for match in re.finditer(r'(?P<variable>[0-9a-zA-Z_-]+)="(?P<value>[^"]*)"', b):
    print(match.groupdict())

{'variable': 'x', 'value': '2,3'}
{'variable': 'y', 'value': 'hello'}
{'variable': 'z', 'value': 'true'}
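
As a rough illustration of both points, the sketch below first repeats the pair inside a single match (using the non-greedy-safe `[^"]*` value) and then collects every pair with finditer(). The names `pair`, `repeated`, and `pairs` are made up for this example.

```python
import re

b = 'x="2,3", y="hello", z="true"'
pair = r'(?P<variable>[0-9a-zA-Z_-]+)="(?P<value>[^"]*)"'

# Repeating the whole pair inside one match: every iteration overwrites
# the named groups, so only the final pair survives.
repeated = re.search(r'^(?:' + pair + r'(?:,\s)?)*', b)
print(repeated.groupdict())  # {'variable': 'z', 'value': 'true'}

# One match object per pair keeps all of them.
pairs = {m.group('variable'): m.group('value') for m in re.finditer(pair, b)}
print(pairs)  # {'x': '2,3', 'y': 'hello', 'z': 'true'}
```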

Answer 2

import re

a = 'x="2"'
b = 'x="2,3", y="hello", z="true"'

# \w+ captures the variable name, [^"]* captures the quoted value
p = r'(\w+)="([^"]*)"'

ms = re.findall(p, b)
print(ms)
ms = re.findall(p, a)
print(ms)

Output:

D:\>python reg.py
[('x', '2,3'), ('y', 'hello'), ('z', 'true')]
[('x', '2')]

D:\>
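
Since findall() returns a list of (name, value) tuples here, the result can be passed straight to dict(), and the n = 0 case from the question simply yields an empty list. A minimal sketch, with `as_dict` and `none_found` as illustrative names:

```python
import re

p = r'(\w+)="([^"]*)"'

# Two capture groups -> list of 2-tuples, which dict() accepts directly.
as_dict = dict(re.findall(p, 'x="2,3", y="hello", z="true"'))

# No pairs in the input: findall() returns an empty list, not None.
none_found = re.findall(p, '')

print(as_dict)     # {'x': '2,3', 'y': 'hello', 'z': 'true'}
print(none_found)  # []
```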

Answer 3

You can try a positive lookbehind in one line:

import re
pattern=r'(?<=((\w)=))"(.*?)"'
string="""'x="2,3", y="hello", z="true"'"""

print([(i.group(2),i.group(3)) for i in re.finditer(pattern,string)])

Output:

[('x', '2,3'), ('y', 'hello'), ('z', 'true')]
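
One caveat worth checking before relying on this: Python's re module requires lookbehind patterns to have a fixed width, and `(\w)=` is exactly two characters, so the name group can hold only a single character. The sketch below uses a hypothetical input with longer names to show the effect, next to a pattern that captures the name directly.

```python
import re

s = 'foo="1", bar="2"'  # hypothetical input with multi-character names

# The lookbehind sees only the two characters before the opening quote,
# so group 2 holds just the last character of each name.
lookbehind = r'(?<=((\w)=))"(.*?)"'
short_names = [(m.group(2), m.group(3)) for m in re.finditer(lookbehind, s)]
print(short_names)  # [('o', '1'), ('r', '2')]

# Capturing the name outside a lookbehind has no width restriction.
direct = r'(\w+)="(.*?)"'
full_names = re.findall(direct, s)
print(full_names)   # [('foo', '1'), ('bar', '2')]
```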
