long = """ADDRESS: Some place in the world
TEL: 555 5555 5555 TYPE: Apartment/High
Data Accuracy: Very heigh building with plenty of corroborating data"""
Suppose I have a long string like the one above, and I want to parse it and add the pieces to my dictionary:
mydict = {'Adress':[],'Tel':[],'Type':[],'Data Accuracy':[]}
I have already tried this:
import re
x = re.split('ADRESS',long)
However, this is not enough: I want to split the string into 4 pieces and add them to mydict. x = re.split('ADRESS',long) only returns a single piece.
Answer 0: (score: 1)
You still need to do some cleanup, namely removing leading and trailing whitespace with strip(), but this works as requested:
import re
long = """ADDRESS: Some place in the world
TEL: 555 5555 5555 TYPE: Apartment/High
Data Accuracy: Very heigh building with plenty of corroborating data"""
x = re.split(r"[a-zA-Z]+:",long)
print(x)
# ['', ' Some place in the world\n', ' 555 5555 5555 ', ' Apartment/High\nData ', ' Very heigh building with plenty of corroborating data']
clean = []
for item in x:
    if item != "":
        # Keep only the first line: this drops the stray "Data" left over from the "Data Accuracy" label.
        clean.append(item.split('\n')[0].strip())
print(clean)
# ['Some place in the world', '555 5555 5555', 'Apartment/High', 'Very heigh building with plenty of corroborating data']
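The list above holds only the values, so they still have to be paired with their labels. A minimal sketch of one way to do that, assuming the full set of labels is known in advance (the names `text`, `labels`, `pattern`, `parts` and `parsed` are illustrative and not from the question): a capturing group in the pattern makes re.split keep the matched labels in its result.

```python
import re

text = """ADDRESS: Some place in the world
TEL: 555 5555 5555 TYPE: Apartment/High
Data Accuracy: Very heigh building with plenty of corroborating data"""

# Assumption: every label that can appear is listed here.
labels = ["ADDRESS", "TEL", "TYPE", "Data Accuracy"]

# Build "(ADDRESS|TEL|TYPE|Data Accuracy):"; the parentheses form a capturing
# group, so re.split returns the labels as well as the text between them.
pattern = "(" + "|".join(re.escape(label) for label in labels) + "):"
parts = re.split(pattern, text)

# parts[0] is whatever precedes the first label; after that, labels and
# values alternate, so pair them up by slicing with a step of 2.
parsed = {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}
print(parsed)
```

Because the labels are matched literally, "Data Accuracy" is treated as a single label and no extra clean-up of a leftover "Data" is needed.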
Answer 1: (score: 1)
Without using regular expressions:
long = """ADDRESS: Some place in the world
TEL: 555 5555 5555 TYPE: Apartment/High
Data Accuracy: Very heigh building with plenty of corroborating data"""
indicators = ["ADDRESS", "TEL", "TYPE", "Data Accuracy"]
dict_ = dict()
for indicator in reversed(indicators):
    long, value = long.split(indicator + ":")
    dict_[indicator] = value.strip()
print(dict_)
Output:
{'Data Accuracy': 'Very heigh building with plenty of corroborating data', 'TYPE': 'Apartment/High', 'TEL': '555 5555 5555', 'ADDRESS': 'Some place in the world'}
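The question's mydict maps each key to a list, which suggests that several records may need to be collected. A rough sketch of how the same split-from-the-end idea could be extended to that case, assuming the label names themselves are used as keys; `parse_record` and `records` are hypothetical names introduced for illustration, and str.rpartition is swapped in for split so that a missing label does not raise an error.

```python
from collections import defaultdict


def parse_record(text, indicators):
    """Split one record into {indicator: value}, working from the last label backwards."""
    result = {}
    for indicator in reversed(indicators):
        # rpartition returns (head, separator, tail); the separator is ""
        # when the label does not occur in the text.
        head, sep, tail = text.rpartition(indicator + ":")
        if not sep:
            continue  # label missing from this record: skip it
        result[indicator] = tail.strip()
        text = head  # keep parsing what precedes this label
    return result


indicators = ["ADDRESS", "TEL", "TYPE", "Data Accuracy"]

# Assumption: each record is a separate string in this list.
records = [
    """ADDRESS: Some place in the world
TEL: 555 5555 5555 TYPE: Apartment/High
Data Accuracy: Very heigh building with plenty of corroborating data""",
]

mydict = defaultdict(list)
for record in records:
    for key, value in parse_record(record, indicators).items():
        mydict[key].append(value)
print(dict(mydict))
```

With more than one record, each list in mydict grows by one entry per record that contains the corresponding label.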