Find the last value in a list column of a DataFrame

Time: 2018-12-09 17:25:47

Tags: python pandas dataframe

I would like to find the last value for a client in a DataFrame. How can I do this?

Example:

import pandas as pd

df = pd.DataFrame({
    'date': ['2018-06-13', '2018-06-14', '2018-06-15', '2018-06-16'],
    'gain': [[10, 12, 15], [14, 11, 15], [9, 10, 12], [6, 4, 2]],
    'how':  [['customer1', 'customer2', 'customer3'],
             ['customer4', 'customer5', 'customer6'],
             ['customer7', 'customer8', 'customer9'],
             ['customer5', 'customer6', 'customer10']]})


   df : 
           date       gain                    how
    0 2018-06-13  [10, 12, 15]    [customer1, customer2, customer3]
    1 2018-06-14  [14, 11, 15]    [customer4, customer5, customer6]
    2 2018-06-15  [9, 10, 12]     [customer7, customer8, customer9]
    3 2018-06-16  [6, 4, 2]       [customer5, customer6, customer10]

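Thanks a lot.

1 Answer:

Answer 0: (score: 4)

First flatten the list columns with the unnesting function (defined at the end of this answer), then call drop_duplicates with keep='last' so that only the most recent row for each customer remains: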

newdf=unnesting(df,['gain','how']).drop_duplicates('how',keep='last')
newdf
Out[25]: 
   gain         how        date
0    10   customer1  2018-06-13
0    12   customer2  2018-06-13
0    15   customer3  2018-06-13
1    14   customer4  2018-06-14
2     9   customer7  2018-06-15
2    10   customer8  2018-06-15
2    12   customer9  2018-06-15
3     6   customer5  2018-06-16
3     4   customer6  2018-06-16
3     2  customer10  2018-06-16

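Then use reindex to look up a list of customers. Customers that do not appear in the data are filled with 'not_find'.

Search list: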
l=['customer5','customer6','customer20']

newdf.loc[newdf.how.isin(l)].set_index('how').reindex(l,fill_value='not_find')
Out[34]: 
                gain        date
how                             
customer5          6  2018-06-16
customer6          4  2018-06-16
customer20  not_find    not_find
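
For intuition, the same "last occurrence wins" logic can be sketched in plain Python without pandas. This is only an illustration of the idea, not the method used in the answer; the names rows, last_seen, wanted and result are made up for this sketch.

```python
# Same data as the question, one tuple per DataFrame row: (date, gain, how).
rows = [
    ('2018-06-13', [10, 12, 15], ['customer1', 'customer2', 'customer3']),
    ('2018-06-14', [14, 11, 15], ['customer4', 'customer5', 'customer6']),
    ('2018-06-15', [9, 10, 12], ['customer7', 'customer8', 'customer9']),
    ('2018-06-16', [6, 4, 2], ['customer5', 'customer6', 'customer10']),
]

# Walk the rows in order; a later entry overwrites an earlier one,
# which mirrors drop_duplicates(..., keep='last').
last_seen = {}
for date, gains, customers in rows:
    for gain, customer in zip(gains, customers):
        last_seen[customer] = (gain, date)

# Look up the search list; unknown customers get a placeholder,
# which mirrors reindex(..., fill_value='not_find').
wanted = ['customer5', 'customer6', 'customer20']
result = {c: last_seen.get(c, 'not_find') for c in wanted}
print(result)
```

This relies on the rows already being in chronological order, as they are in the example data.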

Some interesting reading on this kind of problem:

How do I unnest a column in a pandas DataFrame?

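import numpy as np
import pandas as pd

def unnesting(df, explode):
    # Repeat each row label once per element of its list
    idx = df.index.repeat(df[explode[0]].str.len())
    df1 = pd.concat([pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
    df1.index = idx
    # Keyword form: pandas 2.0 no longer accepts the positional axis argument
    return df1.join(df.drop(columns=explode), how='left')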
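
As a possible alternative to the unnesting helper above, newer pandas versions ship a built-in DataFrame.explode. The sketch below assumes pandas 1.3 or later, where explode accepts a list of columns; the variable names flat, last, wanted and result are made up for illustration. It is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['2018-06-13', '2018-06-14', '2018-06-15', '2018-06-16'],
    'gain': [[10, 12, 15], [14, 11, 15], [9, 10, 12], [6, 4, 2]],
    'how': [['customer1', 'customer2', 'customer3'],
            ['customer4', 'customer5', 'customer6'],
            ['customer7', 'customer8', 'customer9'],
            ['customer5', 'customer6', 'customer10']],
})

# Assumption: pandas >= 1.3, which allows exploding several columns at once.
# The lists in 'gain' and 'how' must have matching lengths row by row.
flat = df.explode(['gain', 'how'])

# Keep only the last row seen for each customer, then index by customer.
last = flat.drop_duplicates('how', keep='last').set_index('how')

# Look up the search list; customers that are absent come back as NaN here
# (pass fill_value='not_find' to reindex to get the placeholder instead).
wanted = ['customer5', 'customer6', 'customer20']
result = last.reindex(wanted)
print(result)
```

Leaving missing customers as NaN keeps them easy to detect with isna(), whereas a string placeholder is easier to read in printed output.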