pandas pivot_table with dates as values

Asked: 2016-08-10 07:37:13

Tags: pandas dataframe pivot-table

Suppose I have the following table of customer data:

import pandas as pd

df = pd.DataFrame.from_dict({"Customer":[0,0,1], 
        "Date":['01.01.2016', '01.02.2016', '01.01.2016'], 
        "Type":["First Buy", "Second Buy", "First Buy"], 
        "Value":[10,20,10]})

which looks like this:

Customer |   Date   |   Type   |   Value
-----------------------------------------
       0 |01.01.2016|First Buy |     10 
-----------------------------------------
       0 |01.02.2016|Second Buy|     20 
-----------------------------------------
       1 |01.01.2016|First Buy |     10 

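I want to pivot the table on the Type column. However, pivoting keeps only the numeric Value column in the result. I would like a structure like this: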

 Customer | First Buy Date | First Buy Value | Second Buy Date | Second Buy Value
---------------------------------------------------------------------------------

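Missing values should be NaN or NaT. Is this possible with pivot_table? If not, I can imagine some workarounds, but they are quite long. Are there any other suggestions?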

1 answer:

Answer 0 (score: 6)

Use unstack:

df1 = df.set_index(['Customer', 'Type']).unstack()
df1.columns = ['_'.join(cols) for cols in df1.columns]
print (df1)
         Date_First Buy Date_Second Buy  Value_First Buy  Value_Second Buy
Customer                                                                  
0            01.01.2016      01.02.2016             10.0              20.0
1            01.01.2016            None             10.0               NaN
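
The output above shows None for the missing date because Date is still a column of strings. As a minimal sketch, assuming the dates are written as day.month.year (the question does not say which), parsing them with pd.to_datetime before unstacking should give NaT for the missing entry instead. The name wide is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Customer": [0, 0, 1],
    "Date": ["01.01.2016", "01.02.2016", "01.01.2016"],
    "Type": ["First Buy", "Second Buy", "First Buy"],
    "Value": [10, 20, 10],
})

# Assumption: the strings are day.month.year; adjust the format if they are not.
df["Date"] = pd.to_datetime(df["Date"], format="%d.%m.%Y")

# Same reshaping as above, now with a real datetime column.
wide = df.set_index(["Customer", "Type"]).unstack()
wide.columns = ["_".join(cols) for cols in wide.columns]
print(wide)
```

With a datetime column, the missing Second Buy date for customer 1 is a proper missing timestamp, so date arithmetic and comparisons on that column keep working.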

If you need the columns in a different order, use swaplevel and sort_index:

df1 = df.set_index(['Customer', 'Type']).unstack()

df1.columns = ['_'.join(cols) for cols in df1.columns.swaplevel(0,1)]
df1.sort_index(axis=1, inplace=True)
print (df1)
         First Buy_Date  First Buy_Value Second Buy_Date  Second Buy_Value
Customer                                                                  
0            01.01.2016             10.0      01.02.2016              20.0
1            01.01.2016             10.0            None               NaN
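
Since the question asks about pivot_table specifically, here is a sketch of one way it might be done there as well. It relies on aggfunc="first", which takes the first entry per Customer/Type pair instead of averaging, so the non-numeric Date column is not dropped. The space-joined column names are chosen to match the layout requested in the question, and wide is again just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Customer": [0, 0, 1],
    "Date": ["01.01.2016", "01.02.2016", "01.01.2016"],
    "Type": ["First Buy", "Second Buy", "First Buy"],
    "Value": [10, 20, 10],
})

# aggfunc="first" keeps the first entry per (Customer, Type) pair,
# so it also works for the string-valued Date column.
wide = df.pivot_table(index="Customer", columns="Type",
                      values=["Date", "Value"], aggfunc="first")

# Move Type to the outer column level, sort, then flatten the names.
wide = wide.swaplevel(0, 1, axis=1).sort_index(axis=1)
wide.columns = [" ".join(cols) for cols in wide.columns]
print(wide)
```

Note that if a customer had several rows with the same Type, this would silently keep only the first one, whereas the unstack approach raises an error on duplicate index entries.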