Question

Every line of my file has this form:

[id=52, idRegion=3857, tipo=New, CustomerDetails=[id=10, countryCode=DE, ... and so on

What I want to do is read the file line by line and build, for each line, a tuple of the values of id, idRegion and so on, like this:

(52,3857,New,10,DE ....), (another line with tuple)...., so that I can later put the data into an Excel file.

I have tried the following, but it seems like too much for what I want:

import re

# file_txt holds the full text of the input file
a = re.findall(r"id=(\d+),.idRegion=\d+, tipo=.*?,", file_txt)
b = re.findall(r"id=\d+,.idRegion=(\d+),.tipo=.*?,", file_txt)
c = re.findall(r"id=\d+,.idRegion=\d+,.tipo=(.*?),", file_txt)
d = [tuple(j for j in i if j)[-1] for i in (a, b, c)]
print(c)

Answer 1

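We know very little about your input data format. Assuming that your keys consist only of alphanumeric characters and that your values consist of alphanumeric characters and spaces, you can use the regular expression \w+=([\w\s]+?)[,\]] to capture the values. Apply the expression to each line with findall():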

import re


data = """
[id=52, idRegion=3857, tipo=New, CustomerDetails=[id=10, countryCode=DE]
[id=100, idRegion=11, tipo=New Something, CustomerDetails=[id=20, countryCode=DE]
"""

pattern = re.compile(r"\w+=([\w\s]+?)[,\]]")

print([
    tuple(pattern.findall(line)) for line in data.splitlines() if line
])

Prints:

[
    ('52', '3857', 'New', '10', 'DE'), 
    ('100', '11', 'New Something', '20', 'DE')
]
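
Since the question mentions putting the tuples into Excel later, here is a minimal sketch of one way to do that step: write the tuples out as CSV, which Excel can open. It reuses the same pattern and the same assumptions about keys and values as above. The helper names parse_lines and write_csv and the sample input are made up for illustration; a real script would iterate over an opened input file instead of the in-memory sample.

```python
import csv
import io
import re

# Same pattern as above: capture each value that follows "key=".
PATTERN = re.compile(r"\w+=([\w\s]+?)[,\]]")


def parse_lines(lines):
    """Return one tuple of captured values per non-blank line."""
    return [tuple(PATTERN.findall(line)) for line in lines if line.strip()]


def write_csv(rows, out):
    """Write the tuples as CSV rows to an open text file object."""
    csv.writer(out).writerows(rows)


# Hypothetical sample input standing in for the real file.
sample = io.StringIO(
    "[id=52, idRegion=3857, tipo=New, CustomerDetails=[id=10, countryCode=DE]\n"
    "[id=100, idRegion=11, tipo=New Something, CustomerDetails=[id=20, countryCode=DE]\n"
)
rows = parse_lines(sample)

# Written to an in-memory buffer here so the result can be printed.
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

When writing to a real file rather than a buffer, open it with newline="" (for example open("output.csv", "w", newline="")), as the csv module documentation recommends, so that no extra blank rows appear on Windows.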

How to parse line by line and put multiple values in a tuple in Python
