Pandas - manipulating a DataFrame to create multi-level columns

Time: 2018-10-04 00:27:32

Tags: python pandas pivot-table

Here is the DataFrame:

A  B     val  val2  loc
1  march 3    2     NY
1  april 5    1     NY
1  may   12   4     NY
2  march 4    1     NJ
2  april 7    5     NJ
2  may   12   1     NJ
3  march 1    8     CA
3  april 54   6     CA
3  may   2    9     CA

I want to convert it to:

       march march april april may  may
       val   val2  val   val2  val  val2
A  loc
1  NY  3     2     5     1     12   4
2  NJ  4     1     7     5     12   1
3  CA  1     8     54    6     2    9

I have been looking into pivot tables as well as stack and unstack, but I am really stuck. I am not sure where to start.

1 answer:

Answer 0 (score: 0)

Use pd.pivot_table, then sort and swap the column levels:

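import pandas as pd

# Pivot B into the columns, then move the B level above val/val2.
new_df = (pd.pivot_table(df, values=['val', 'val2'], index=['A', 'loc'], columns=['B'])
          .sort_index(axis=1, level=1)   # group the columns by month
          .swaplevel(0, 1, axis=1))      # month on top, val/val2 below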


>>> new_df
B     april      march      may     
        val val2   val val2 val val2
A loc                               
1 NY      5    1     3    2  12    4
2 NJ      7    5     4    1  12    1
3 CA     54    6     1    8   2    9
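
To see what each step of that chain contributes, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and runs the chain one step at a time. The intermediate names `pivoted` and `by_month` are only for illustration and are not part of the answer; the keyword arguments are spelled out for readability.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "B": ["march", "april", "may"] * 3,
    "val": [3, 5, 12, 4, 7, 12, 1, 54, 2],
    "val2": [2, 1, 4, 1, 5, 1, 8, 6, 9],
    "loc": ["NY"] * 3 + ["NJ"] * 3 + ["CA"] * 3,
})

# Step 1: pivot. The columns become a two-level MultiIndex: (val/val2, B).
pivoted = pd.pivot_table(df, values=["val", "val2"], index=["A", "loc"], columns="B")

# Step 2: sort the columns by the B level so each month's val/val2 sit together.
by_month = pivoted.sort_index(axis=1, level=1)

# Step 3: swap the two column levels so B ends up on top.
new_df = by_month.swaplevel(0, 1, axis=1)

print(new_df.columns.tolist())
```

Note that `pivot_table` aggregates with the mean by default, which is harmless here because each (A, loc, B) combination occurs only once.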

If the order of the columns matters (for example, you need them in the order march, april, may), you can make B an ordered categorical:

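new_df = (pd.pivot_table(df, values=['val', 'val2'], index=['A', 'loc'],
                         columns=[pd.Categorical(df['B'], categories=['march', 'april', 'may'],
                                                 ordered=True)])
          .dropna(how='all')             # drop rows that are entirely NaN
          .sort_index(axis=1, level=1)   # sorts in category order, not alphabetically
          .swaplevel(0, 1, axis=1))      # month on top, val/val2 below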

>>> new_df
B     march      april        may     
        val val2   val val2   val val2
A loc                                 
1 NY    3.0  2.0   5.0  1.0  12.0  4.0
2 NJ    4.0  1.0   7.0  5.0  12.0  1.0
3 CA    1.0  8.0  54.0  6.0   2.0  9.0
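
Since the question mentions stack and unstack, the same layout can also be sketched without `pivot_table`. This is an alternative to the answer above, not part of it. It assumes every (A, loc, B) combination appears exactly once (`unstack` raises on duplicate index entries), and the names `months`, `ordered` and `wide` are hypothetical. Because nothing is aggregated and no missing cells are introduced, the integer values should come through unchanged.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "B": ["march", "april", "may"] * 3,
    "val": [3, 5, 12, 4, 7, 12, 1, 54, 2],
    "val2": [2, 1, 4, 1, 5, 1, 8, 6, 9],
    "loc": ["NY"] * 3 + ["NJ"] * 3 + ["CA"] * 3,
})

months = ["march", "april", "may"]  # desired column order (assumption)

wide = (df.set_index(["A", "loc", "B"])[["val", "val2"]]
          .unstack("B")                 # columns: (val/val2, B)
          .swaplevel(0, 1, axis=1))     # columns: (B, val/val2)

# Spell out the exact column order: every month paired with val and val2.
ordered = pd.MultiIndex.from_product([months, ["val", "val2"]], names=["B", None])
wide = wide.reindex(columns=ordered)

print(wide)
```

Listing the columns explicitly keeps the month order independent of alphabetical sorting, at the cost of having to name every month by hand.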