How can I take each value from a list of dictionaries and put the values into separate lists?

Time: 2018-12-11 20:27:56

Tags: python list dictionary

My code produces the following output:

[{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},

 {'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},

 {'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},

 {'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715},

...
]

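What I want to do is take all the values for 'Total Population:' and store them in one list, then take all the values for 'Total Water Ice Cover' and store them in another list, and so on. With a data structure like this, how do I extract these values and store them in separate lists?

Thanks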

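5 answers:

Answer 0 (score: 2)

If your goal is to calculate Pearson's correlation, you should use pandas.

Suppose the original list of dictionaries is stored in a variable named output. You can easily convert it into a pandas DataFrame with the following: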

import pandas as pd
df = pd.DataFrame(output)
print(df)
#   Total Barren Land  Total Developed  Total Forest  Total Population:  Total Water Ice Cover
#0           0.224399        17.205368     34.406421               4585               2.848142
#1           0.115141        37.271157     19.113414               4751               1.047784
#2           0.259720        23.504698     20.418608               3214               0.091666
#3           0.000000        66.375457     10.686713               5005               0.000000

Now you can easily generate the correlation matrix:

# this is just to make the output print nicer
pd.set_option("display.precision", 4)  # show 4 decimal places

# remove 'Total ' from column names to make printing smaller
df.rename(columns=lambda x: x.replace("Total ", ""), inplace=True)

corr = df.corr(method="pearson")
print(corr)
#                 Barren Land  Developed  Forest  Population:  Water Ice Cover
#Barren Land           1.0000    -0.9579  0.7361      -0.7772           0.4001
#Developed            -0.9579     1.0000 -0.8693       0.5736          -0.6194
#Forest                0.7361    -0.8693  1.0000      -0.1575           0.9114
#Population:          -0.7772     0.5736 -0.1575       1.0000           0.2612
#Water Ice Cover       0.4001    -0.6194  0.9114       0.2612           1.0000

Now you can access individual correlations by key:

print(corr.loc["Forest", "Water Ice Cover"])
#0.91135717479534217
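
For readers without pandas, the same coefficient can be cross-checked using only the standard library. This is a minimal sketch and not part of the answer above: the helper name pearson is made up for illustration, and output is assumed to hold the four dictionaries from the question, trimmed here to the two columns being compared.

```python
from math import sqrt

# Assumed: the four rows from the question, trimmed to the two keys used here.
output = [
    {'Total Water Ice Cover': 2.848142234497044, 'Total Forest': 34.40642126612868},
    {'Total Water Ice Cover': 1.047783534830167, 'Total Forest': 19.11341393206678},
    {'Total Water Ice Cover': 0.09166603009701321, 'Total Forest': 20.418608204109695},
    {'Total Water Ice Cover': 0.0, 'Total Forest': 10.68671271840715},
]


def pearson(xs, ys):
    """Pearson's r for two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Extract one list per key, which is what the question asks for.
forest = [row['Total Forest'] for row in output]
water = [row['Total Water Ice Cover'] for row in output]

r = pearson(forest, water)
print(round(r, 4))  # 0.9114
```

The two list comprehensions are the plain-Python answer to the original question; pandas just does the same extraction for every column at once.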

Answer 1 (score: 1)

I guess you could use something like:

d = [{'Total Population:': 4585, 'Total Water Ice Cover': 2.848142234497044, 'Total Developed': 17.205368316575324, 'Total Barren Land': 0.22439908514219134, 'Total Forest': 34.40642126612868},
 {'Total Population:': 4751, 'Total Water Ice Cover': 1.047783534830167, 'Total Developed': 37.27115716753022, 'Total Barren Land': 0.11514104778353484, 'Total Forest': 19.11341393206678},
 {'Total Population:': 3214, 'Total Water Ice Cover': 0.09166603009701321, 'Total Developed': 23.50469788404247, 'Total Barren Land': 0.2597204186082041, 'Total Forest': 20.418608204109695},
 {'Total Population:': 5005, 'Total Water Ice Cover': 0.0, 'Total Developed': 66.37545713124746, 'Total Barren Land': 0.0, 'Total Forest': 10.68671271840715}]

f = {}
for row in d:
    for k, v in row.items():
        # create the list the first time a key is seen
        if k not in f:
            f[k] = []
        f[k].append(v)
print(f)

{'Total Population:': [4585, 4751, 3214, 5005], 'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0], 'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746], 'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0], 'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]}


Answer 2 (score: 1)

You can use pandas:

import pandas as pd

# my_dict is the list of dictionaries from the question
pd.DataFrame(my_dict).to_dict(orient='list')

Returns:

{'Total Barren Land': [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
'Total Developed': [17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
'Total Forest': [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715],
'Total Population:': [4585, 4751, 3214, 5005],
'Total Water Ice Cover': [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0]}
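
Once the data is a dictionary of lists, each separate list the question asks for is a single lookup. Here is a small sketch that does not need pandas, assuming every dictionary has the same keys; rows is a shortened stand-in for the question's data (three rows, two keys), and the variable names are only illustrative.

```python
# Shortened stand-in for the question's data (an assumption for illustration).
rows = [
    {'Total Population:': 4585, 'Total Forest': 34.40642126612868},
    {'Total Population:': 4751, 'Total Forest': 19.11341393206678},
    {'Total Population:': 3214, 'Total Forest': 20.418608204109695},
]

# Build one list per key, using the first row's keys as the column names.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Pull out the individual lists by name.
population = columns['Total Population:']
forest = columns['Total Forest']

print(population)  # [4585, 4751, 3214]
```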

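Answer 3 (score: 0)

Call your list of dictionaries dictionary_list. Then:

keys = {k for d in dictionary_list for k in d.keys()}
list_of_values = [[v for d in dictionary_list for k, v in d.items() if k == key] for key in keys]

With your example, this outputs the following (the order of the inner lists may differ, because keys is a set):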

[[17.205368316575324, 37.27115716753022, 23.50469788404247, 66.37545713124746],
 [0.22439908514219134, 0.11514104778353484, 0.2597204186082041, 0.0],
 [2.848142234497044, 1.047783534830167, 0.09166603009701321, 0.0],
 [4585, 4751, 3214, 5005],
 [34.40642126612868, 19.11341393206678, 20.418608204109695, 10.68671271840715]]

If you want to create a new dictionary holding the list of values for each key, change the second line to:

new_dict = {key: [v for d in dictionary_list for k, v in d.items() if k == key] for key in keys}
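
Because a set has no fixed iteration order, the order of the keys above can change between runs. One possible variation, sketched here on made-up data, collects the keys with dict.fromkeys so they stay in first-seen order (guaranteed from Python 3.7), and looks each key up directly instead of scanning every item.

```python
# Made-up data with partly different keys, for illustration only.
dictionary_list = [
    {'a': 1, 'b': 2},
    {'b': 3, 'c': 4},
]

# dict.fromkeys removes duplicates while keeping first-seen order.
keys = list(dict.fromkeys(k for d in dictionary_list for k in d))

# Direct lookup per dictionary; dictionaries lacking the key are skipped.
new_dict = {key: [d[key] for d in dictionary_list if key in d] for key in keys}

print(new_dict)  # {'a': [1], 'b': [2, 3], 'c': [4]}
```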

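Answer 4 (score: 0)

If all the dictionaries have the same keys, you can use the keys of the first dictionary:

result = {k:[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()}

If the dictionaries may have different sets of keys, but you are fine with lists of different lengths, I would use a defaultdict to simplify things: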

from collections import defaultdict
result = defaultdict(list)
for d in dictionary_list:
    for k, v in d.items():
        result[k].append(v)

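If the dictionaries may have different sets of keys and you want all the lists to have the same length, you need to iterate twice. You also need some kind of placeholder value to use when a key is missing. If we use None, we can do this:

placeholder = None
keys = set()
for d in dictionary_list:
    keys |= set(d.keys())  # sets are merged with |=, not +=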
result = {k:[] for k in keys}
for d in dictionary_list:
    for k in keys:
        result[k].append(d.get(k, placeholder))
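
To make the effect of the placeholder concrete, here is a short worked run on made-up data in which some dictionaries lack some keys. The data values are invented for illustration, and the key collection uses set.update, which merges keys the same way as a set union.

```python
# Invented data: each dictionary is missing at least one key.
dictionary_list = [
    {'Total Population:': 4585, 'Total Forest': 34.4},
    {'Total Population:': 4751},
    {'Total Forest': 20.4, 'Total Developed': 23.5},
]

placeholder = None

# First pass: collect every key that appears anywhere.
keys = set()
for d in dictionary_list:
    keys.update(d)

# Second pass: one entry per dictionary, placeholder where the key is absent.
result = {k: [d.get(k, placeholder) for d in dictionary_list] for k in keys}

print(result['Total Population:'])  # [4585, 4751, None]
print(result['Total Forest'])       # [34.4, None, 20.4]
print(result['Total Developed'])    # [None, None, 23.5]
```

Every list has the same length as dictionary_list, so position i in each list always refers to the same original dictionary.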

In each case, result is a dictionary of lists. If you want a list of lists instead, it is actually even simpler:

result = [[d[k] for d in dictionary_list] for k in dictionary_list[0].keys()]

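If you want all the lists to have the same length and to include placeholders, you still need to build the dictionary of lists as an intermediate step. But converting from a dictionary of lists to a list of lists of values is easy: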

list_of_lists_of_values = list(dict_of_lists_of_values.values())

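That said, before Python 3.7 dictionaries had no well-defined iteration order, so you are probably better off sticking with a dictionary anyway, since otherwise it is hard to be sure you are getting the right values (for example, there is no guarantee that the 'Total Population:' values come first).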
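
If a list of lists is still needed, one way to keep the values identifiable is to fix the key order once and carry it alongside the lists. This is only a sketch on shortened, made-up data, assuming all dictionaries share the same keys.

```python
# Shortened, made-up data; all dictionaries are assumed to share the same keys.
dictionary_list = [
    {'Total Population:': 4585, 'Total Forest': 34.4},
    {'Total Population:': 4751, 'Total Forest': 19.1},
]

# Fix the key order explicitly instead of relying on dictionary iteration order.
keys = list(dictionary_list[0])
list_of_lists = [[d[k] for d in dictionary_list] for k in keys]

# Pair each list with its key when reading the values back.
for key, values in zip(keys, list_of_lists):
    print(key, values)

# Look a column up by name rather than by a hard-coded position.
population = list_of_lists[keys.index('Total Population:')]
```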
