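How do I convert a mixed list containing OrderedDicts into a DataFrame?

Time: 2019-06-03 04:46:28

Tags: python python-3.x

I have a list of tuples, each holding an ID and a list that contains an OrderedDict. I am trying to create a DataFrame from that list, but I am not sure how to iterate over the inner lists and the key-value pairs of the OrderedDicts.

Here is the list: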

l = [(-2322251069948147489, [OrderedDict([('lat', '46.72161'), ('lon', '-92.45936'), ('name', 'Cloquet'), ('admin1', 'Minnesota'), ('admin2', 'Carlton County'), ('cc', 'US')])]), 
(-2542975094649810558, [OrderedDict([('lat', '38.52491'), ('lon', '-121.9708'), ('name', 'Winters'), ('admin1', 'California'), ('admin2', 'Yolo County'), ('cc', 'US')])]), 
(-1984478776812705270, [OrderedDict([('lat', '38.88101'), ('lon', '-77.10428'), ('name', 'Arlington'), ('admin1', 'Virginia'), ('admin2', 'Arlington County'), ('cc', 'US')])]), 
(-2720329071386930320, [OrderedDict([('lat', '41.70054'), ('lon', '-93.46216'), ('name', 'Bondurant'), ('admin1', 'Iowa'), ('admin2', 'Polk County'), ('cc', 'US')])])]

I am trying to convert the list above into a DataFrame:

df = pd.DataFrame(l)

This gives me only 2 columns: the ID and the list holding the OrderedDict. What I want is the following:

           0                 1               2
   -2322251069948147489  Minnesota    Carlton County
   -2542975094649810558  California   Yolo County
   -1984478776812705270  Virginia     Arlington County
   -2720329071386930320  Iowa         Polk County

I am not sure how to get the key-value pairs into DataFrame columns. Any help would be appreciated.

2 answers:

Answer 0: (score: 1)

Use pd.concat together with pd.Series and pd.DataFrame:

import pandas as pd

# Unpack the single-element list so each tuple is (id, OrderedDict)
new_l = [(i[0], i[1][0]) for i in l]

# Split into a Series of IDs and a Series of OrderedDicts
ind, dicts = map(pd.Series, zip(*new_l))
# Expand the dicts into columns and place them next to the IDs
df = pd.concat([ind, pd.DataFrame(list(dicts))], axis=1)

Output:

                     0       lat        lon       name      admin1            admin2  cc
0 -2322251069948147489  46.72161  -92.45936    Cloquet   Minnesota    Carlton County  US
1 -2542975094649810558  38.52491  -121.9708    Winters  California       Yolo County  US
2 -1984478776812705270  38.88101  -77.10428  Arlington    Virginia  Arlington County  US
3 -2720329071386930320  41.70054  -93.46216  Bondurant        Iowa       Polk County  US

You can now select the columns you want:

df[[0, 'admin1', 'admin2']]
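As an alternative sketch (not part of the answer above), each tuple can be flattened into a single dict before building the DataFrame, which avoids the separate pd.concat step. The column name `id` is an assumption chosen for illustration, and only the first two rows of the question's list are reproduced to keep the example short.

```python
from collections import OrderedDict

import pandas as pd

# First two rows of the list from the question
l = [
    (-2322251069948147489, [OrderedDict([('lat', '46.72161'), ('lon', '-92.45936'),
                                         ('name', 'Cloquet'), ('admin1', 'Minnesota'),
                                         ('admin2', 'Carlton County'), ('cc', 'US')])]),
    (-2542975094649810558, [OrderedDict([('lat', '38.52491'), ('lon', '-121.9708'),
                                         ('name', 'Winters'), ('admin1', 'California'),
                                         ('admin2', 'Yolo County'), ('cc', 'US')])]),
]

# Merge the ID with the fields of the single OrderedDict in each tuple.
# 'id' is a hypothetical column name, not something the question specifies.
rows = [{'id': key, **dicts[0]} for key, dicts in l]

df = pd.DataFrame(rows)
result = df[['id', 'admin1', 'admin2']]
print(result)
```

This assumes every inner list holds exactly one OrderedDict, as in the question's data; a tuple with an empty list would raise an IndexError.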

Answer 1: (score: 0)

from collections import OrderedDict
import pandas as pd

l=[(-2322251069948147489, [OrderedDict([('lat', '46.72161'), ('lon', '-92.45936'), 
('name', 'Cloquet'), ('admin1', 'Minnesota'), ('admin2', 'Carlton County'), ('cc', 
'US')])]), 
(-2542975094649810558, [OrderedDict([('lat', '38.52491'), ('lon', '-121.9708'), 
('name', 'Winters'), ('admin1', 'California'), ('admin2', 'Yolo County'), ('cc', 
'US')])]), 
(-1984478776812705270, [OrderedDict([('lat', '38.88101'), ('lon', '-77.10428'), 
('name', 'Arlington'), ('admin1', 'Virginia'), ('admin2', 'Arlington County'), ('cc', 
'US')])]), 
(-2720329071386930320, [OrderedDict([('lat', '41.70054'), ('lon', '-93.46216'), 
('name', 'Bondurant'), ('admin1', 'Iowa'), ('admin2', 'Polk County'), ('cc', 
'US')])])]

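# Collect the ID and the fields of interest from each (id, [OrderedDict]) tuple
name = []
admin1 = []
_id = []
for i in l:
    _id.append(i[0])
    name.append(i[1][0]['name'])
    admin1.append(i[1][0]['admin1'])

# Each list is one row here, so transpose to turn the lists into columns
df = pd.DataFrame(data=[_id, admin1, name]).T
print(df)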
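A possible variation on the loop above, offered only as a sketch: building one tuple per row and passing explicit column names avoids the transpose, which on mixed int/str data typically leaves every column with object dtype. The column names are assumptions, and `admin2` is used here instead of `name` because that is what the question's desired output shows. Only two rows of the question's list are reproduced.

```python
from collections import OrderedDict

import pandas as pd

# First two rows of the list from the question
l = [
    (-2322251069948147489, [OrderedDict([('lat', '46.72161'), ('lon', '-92.45936'),
                                         ('name', 'Cloquet'), ('admin1', 'Minnesota'),
                                         ('admin2', 'Carlton County'), ('cc', 'US')])]),
    (-2542975094649810558, [OrderedDict([('lat', '38.52491'), ('lon', '-121.9708'),
                                         ('name', 'Winters'), ('admin1', 'California'),
                                         ('admin2', 'Yolo County'), ('cc', 'US')])]),
]

# One (id, admin1, admin2) tuple per row; no transpose needed afterwards
records = [(key, dicts[0]['admin1'], dicts[0]['admin2']) for key, dicts in l]

# The column names are hypothetical labels chosen for this sketch
df = pd.DataFrame(records, columns=['id', 'admin1', 'admin2'])
print(df)
print(df.dtypes)
```

Because each column is built from values of a single type, the `id` column keeps an integer dtype rather than falling back to object.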