Pandas DataFrame: flattening a crosstab with multi-level column headers

Time: 2017-10-19 12:22:44

Tags: python pandas dataframe

I have an Excel file that looks like this:

+-------+-------+-------+-------+-------+-------+
|       | Cat1  | Cat1  | Cat1  | Cat1  | Cat1  |
+-------+-------+-------+-------+-------+-------+
|       | Type1 | Type1 | Type1 | Type1 | Type2 |
+-------+-------+-------+-------+-------+-------+
|       | 2018  | 2018  | 2018  | 2018  | 2018  |
+-------+-------+-------+-------+-------+-------+
| Name  | 1Q    | 2Q    | 3Q    | 4Q    | 1Q    |
+-------+-------+-------+-------+-------+-------+
| Name1 | 1     | 5     | 3     | 5     | 4     |
+-------+-------+-------+-------+-------+-------+
| Name2 | 3     | 23    | 4     | 2     | 4     |
+-------+-------+-------+-------+-------+-------+
| Name3 | 4     | 3     | 5     | 3     | 44    |
+-------+-------+-------+-------+-------+-------+
| Name4 | 3     | 6     | 5     | 4     | 2     |
+-------+-------+-------+-------+-------+-------+

...and so on.

I want to reshape it so that it looks like this:

+-------+------+-------+------+---------+-------+
| Name  | Cat  | Type  | Year | Quarter | Value |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 1Q      | 1     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 2Q      | 5     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 3Q      | 3     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type1 | 2018 | 4Q      | 5     |
+-------+------+-------+------+---------+-------+
| Name1 | Cat1 | Type2 | 2018 | 1Q      | 4     |
+-------+------+-------+------+---------+-------+

I have loaded it into a pandas DataFrame, but I am not sure how to proceed. Should I use melt, stack, unstack, a MultiIndex...?

1 answer:

Answer 0 (score: 0)

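Use stack on all four column levels, then reset_index to turn the resulting index levels into ordinary columns: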

print (df.columns)
MultiIndex(levels=[['Cat1'], ['Type1', 'Type2'], ['2018'], ['1Q', '2Q', '3Q', '4Q']],
           labels=[[0, 0, 0, 0, 0], [0, 0, 0, 0, 1], [0, 0, 0, 0, 0], [0, 1, 2, 3, 0]])


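# Move all four column levels into the row index, then turn the index into columns.
df = df.stack([0, 1, 2, 3]).reset_index()
# Name the flattened columns; the last one holds the stacked values.
df.columns = ['Name', 'Cat', 'Type', 'Year', 'Quarter', 'Value']
print(df)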
     Name   Cat   Type  Year Quarter  Value
0   Name1  Cat1  Type1  2018      1Q    1.0
1   Name1  Cat1  Type1  2018      2Q    5.0
2   Name1  Cat1  Type1  2018      3Q    3.0
3   Name1  Cat1  Type1  2018      4Q    5.0
4   Name1  Cat1  Type2  2018      1Q    4.0
5   Name2  Cat1  Type1  2018      1Q    3.0
6   Name2  Cat1  Type1  2018      2Q   23.0
7   Name2  Cat1  Type1  2018      3Q    4.0
8   Name2  Cat1  Type1  2018      4Q    2.0
9   Name2  Cat1  Type2  2018      1Q    4.0
10  Name3  Cat1  Type1  2018      1Q    4.0
11  Name3  Cat1  Type1  2018      2Q    3.0
12  Name3  Cat1  Type1  2018      3Q    5.0
13  Name3  Cat1  Type1  2018      4Q    3.0
14  Name3  Cat1  Type2  2018      1Q   44.0
15  Name4  Cat1  Type1  2018      1Q    3.0
16  Name4  Cat1  Type1  2018      2Q    6.0
17  Name4  Cat1  Type1  2018      3Q    5.0
18  Name4  Cat1  Type1  2018      4Q    4.0
19  Name4  Cat1  Type2  2018      1Q    2.0
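
For anyone who wants to try this without the Excel file, here is a minimal self-contained sketch. It rebuilds the sample table by hand, which is an assumption about how the data looks after loading: the variable names (wide, long_stack, long_melt) and the level names Cat/Type/Year/Quarter are made up for illustration. It shows the stack approach from this answer and, for comparison, melt, which the question also asked about. Note that recent pandas releases may emit a FutureWarning from stack about its new implementation, and the Value dtype (int or float) can differ between versions.

```python
import pandas as pd

# Rebuild the sample table from the question with four column levels.
# The level names are illustrative; the original file does not name them.
columns = pd.MultiIndex.from_tuples(
    [
        ("Cat1", "Type1", "2018", "1Q"),
        ("Cat1", "Type1", "2018", "2Q"),
        ("Cat1", "Type1", "2018", "3Q"),
        ("Cat1", "Type1", "2018", "4Q"),
        ("Cat1", "Type2", "2018", "1Q"),
    ],
    names=["Cat", "Type", "Year", "Quarter"],
)
wide = pd.DataFrame(
    [[1, 5, 3, 5, 4], [3, 23, 4, 2, 4], [4, 3, 5, 3, 44], [3, 6, 5, 4, 2]],
    index=pd.Index(["Name1", "Name2", "Name3", "Name4"], name="Name"),
    columns=columns,
)

# Option 1: stack every column level into the row index (this yields a
# Series), name the values, then flatten the index into columns.
long_stack = (
    wide.stack(["Cat", "Type", "Year", "Quarter"])
    .rename("Value")
    .reset_index()
)

# Option 2: melt. With MultiIndex columns, each level becomes its own
# column; ignore_index=False keeps the Name index so it can be reset.
long_melt = wide.melt(ignore_index=False, value_name="Value").reset_index()

# Both results hold the same rows; only the row order may differ.
key = ["Name", "Cat", "Type", "Year", "Quarter"]
print(long_stack.sort_values(key).reset_index(drop=True))
print(long_melt.sort_values(key).reset_index(drop=True))
```

Because the columns carry level names here, reset_index produces the final column names directly and the manual df.columns assignment is not needed. The ignore_index argument of melt requires pandas 1.1 or later.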