I have a data frame like this:
import pandas as pd

data = [['A','ABC001','18M01',1,3],['A','ABC002','18M01',2,4],['A','ABC001','18M02',3,3],['B','ABC001','18M01',4,3],['B','ABC002','18M02',5,4],['B','ABC002','18M02',6,4]]
df = pd.DataFrame(data, columns=['Type','Product','Month','Requirement','Inventory'])
df
Input:
Type Product Month Requirement Inventory
A ABC001 18M01 1 3
A ABC002 18M01 2 4
A ABC001 18M02 3 3
B ABC001 18M01 4 3
B ABC002 18M02 5 4
B ABC002 18M02 6 4
What I want to do is summarize it into a data frame like this:
Output:
Type Product Values 18M01 18M02
A ABC001 Sum of Requirement 1 3
A ABC001 Average of Inventory 3 3
A ABC002 Sum of Requirement 2 NaN
A ABC002 Average of Inventory 4 NaN
B ABC001 Sum of Requirement 4 NaN
B ABC001 Average of Inventory 3 NaN
B ABC002 Sum of Requirement NaN 11
B ABC002 Average of Inventory NaN 4
I can easily create this with an Excel pivot table, but I have no idea how to do it with a pandas pivot. Please help.
Answer 0 (score: 1)
I think you need to aggregate with sum and mean, rename the aggregated columns, and then reshape with stack followed by unstack:
df1 = (df.groupby(['Type','Product','Month'])
         .agg({'Requirement': 'sum', 'Inventory': 'mean'})
         .rename(columns={'Requirement': 'Sum of Requirement',
                          'Inventory': 'Average of Inventory'})
         .stack()
         .unstack(2)
         .reset_index()
         .rename(columns={'level_2': 'Values'}))
print(df1)
Month Type Product Values 18M01 18M02
0 A ABC001 Sum of Requirement 1.0 3.0
1 A ABC001 Average of Inventory 3.0 3.0
2 A ABC002 Sum of Requirement 2.0 NaN
3 A ABC002 Average of Inventory 4.0 NaN
4 B ABC001 Sum of Requirement 4.0 NaN
5 B ABC001 Average of Inventory 3.0 NaN
6 B ABC002 Sum of Requirement NaN 11.0
7 B ABC002 Average of Inventory NaN 4.0
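The same groupby, stack, unstack idea can also be written without the positional `unstack(2)` and the `level_2` rename. The following is only a sketch of one possible variant, not part of the original answer: it assumes a pandas version that supports named aggregation (0.25 or later), names the column axis `Values` before stacking so the new index level gets that name, and unstacks `Month` by name. The variable name `out` is made up for illustration.

```python
import pandas as pd

data = [['A', 'ABC001', '18M01', 1, 3], ['A', 'ABC002', '18M01', 2, 4],
        ['A', 'ABC001', '18M02', 3, 3], ['B', 'ABC001', '18M01', 4, 3],
        ['B', 'ABC002', '18M02', 5, 4], ['B', 'ABC002', '18M02', 6, 4]]
df = pd.DataFrame(data, columns=['Type', 'Product', 'Month',
                                 'Requirement', 'Inventory'])

out = (
    df.groupby(['Type', 'Product', 'Month'])
      # Named aggregation: output column name -> (source column, function).
      .agg(**{'Sum of Requirement': ('Requirement', 'sum'),
              'Average of Inventory': ('Inventory', 'mean')})
      # Name the column axis so the level created by stack() is called 'Values'.
      .rename_axis(columns='Values')
      .stack()                      # aggregated columns -> innermost index level
      .unstack('Month')             # Month index level -> columns, by name
      .rename_axis(columns=None)    # drop the leftover 'Month' label on the columns
      .reset_index()
)
print(out)
```

Referring to the level by name means the chain should keep working if the grouping keys are reordered, which a positional `unstack(2)` would not.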
Answer 1 (score: 1)
One way to do it with pivot_table:
df1 = df.pivot_table('Requirement', ['Type','Product'], 'Month', aggfunc='sum')
df1['Values'] = 'Sum of Requirement'
df2 = df.pivot_table('Inventory', ['Type','Product'], 'Month', aggfunc='mean')
df2['Values'] = 'Average of Inventory'
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the two tables the same way
pd.concat([df1, df2])
Output:
Month 18M01 18M02 Values
Type Product
A ABC001 1.0 3.0 Sum of Requirement
ABC002 2.0 NaN Sum of Requirement
B ABC001 4.0 NaN Sum of Requirement
ABC002 NaN 11.0 Sum of Requirement
A ABC001 3.0 3.0 Average of Inventory
ABC002 4.0 NaN Average of Inventory
B ABC001 3.0 NaN Average of Inventory
ABC002 NaN 4.0 Average of Inventory
You can add reset_index() to make it look nicer:
pd.concat([df1, df2]).reset_index()
Month Type Product 18M01 18M02 Values
0 A ABC001 1.0 3.0 Sum of Requirement
1 A ABC002 2.0 NaN Sum of Requirement
2 B ABC001 4.0 NaN Sum of Requirement
3 B ABC002 NaN 11.0 Sum of Requirement
4 A ABC001 3.0 3.0 Average of Inventory
5 A ABC002 4.0 NaN Average of Inventory
6 B ABC001 3.0 NaN Average of Inventory
7 B ABC002 NaN 4.0 Average of Inventory
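The result above has the right numbers, but its rows are grouped by summary type and the Values column comes last, unlike the layout requested in the question. As a rough sketch (an addition, not the answerer's code), the two pivot tables could be built in a loop, combined with `pd.concat`, and then reordered. The names `specs`, `pieces`, `combined`, `key_cols`, `month_cols` and `result` are made up for illustration, and the ordered categorical is just one way to keep "Sum" ahead of "Average" when sorting.

```python
import pandas as pd

data = [['A', 'ABC001', '18M01', 1, 3], ['A', 'ABC002', '18M01', 2, 4],
        ['A', 'ABC001', '18M02', 3, 3], ['B', 'ABC001', '18M01', 4, 3],
        ['B', 'ABC002', '18M02', 5, 4], ['B', 'ABC002', '18M02', 6, 4]]
df = pd.DataFrame(data, columns=['Type', 'Product', 'Month',
                                 'Requirement', 'Inventory'])

# Hypothetical spec: row label -> (source column, aggregation function).
specs = {
    'Sum of Requirement': ('Requirement', 'sum'),
    'Average of Inventory': ('Inventory', 'mean'),
}

# One pivot table per summary, each tagged with its label.
pieces = []
for label, (column, func) in specs.items():
    piece = df.pivot_table(values=column, index=['Type', 'Product'],
                           columns='Month', aggfunc=func)
    piece['Values'] = label
    pieces.append(piece)

# Stack the tables, turn the index into columns, drop the 'Month' axis label.
combined = pd.concat(pieces).reset_index().rename_axis(columns=None)

# Ordered categorical so sorting follows the order given in specs.
combined['Values'] = pd.Categorical(combined['Values'],
                                    categories=list(specs), ordered=True)

key_cols = ['Type', 'Product', 'Values']
month_cols = [c for c in combined.columns if c not in key_cols]
result = (combined[key_cols + month_cols]
          .sort_values(key_cols)
          .reset_index(drop=True))
print(result)
```

Adding another summary row, such as a maximum, would then only need one more entry in `specs`.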