How do I get the average after grouping by a column?

Time: 2018-09-24 08:33:34

Tags: python pandas dataframe

I have a DataFrame named df, and I want to get the average usage time of each app within each gender group.

import pandas as pd
df = pd.DataFrame({'user': [2, 3, 4, 4, 5, 5], 'gender': [0, 0, 1, 1, 1, 1],
                   'app': ['k', 'k', 'k', 'k', 's', 's'], 'time': [6, 10, 10, 6, 3, 1]})

Input:

  app  gender  time  user
0   k    0     6     2
1   k    0    10     3
2   k    1    10     4
3   k    1     6     4
4   s    1     3     5
5   s    1     1     5

For app k, the gender 0 group's total time using app k is 16 (10 + 6), so the average usage time 0_k is 8.0.

The gender 1 group's total time using app k is 16 (10 + 6 + 0 + 0), so the average usage time 1_k is 4.0.
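The arithmetic behind these numbers can be sketched in plain Python. This is only an illustration of the definition as I read it: the `rows` list and the helper name `average_for` are hypothetical, and the key assumption is that the denominator is the number of rows in the whole gender group, with rows for other apps counted as 0.

```python
# Rows of the example DataFrame as (gender, app, time) tuples.
rows = [(0, 'k', 6), (0, 'k', 10), (1, 'k', 10), (1, 'k', 6), (1, 's', 3), (1, 's', 1)]

def average_for(gender, app):
    """Average time for one app over ALL rows of a gender group (hypothetical helper)."""
    # Rows of the group that belong to another app contribute 0 to the total.
    group_times = [t if a == app else 0 for g, a, t in rows if g == gender]
    return sum(group_times) / len(group_times)

print(average_for(0, 'k'))  # (6 + 10) / 2
print(average_for(1, 'k'))  # (10 + 6 + 0 + 0) / 4
print(average_for(1, 's'))  # (0 + 0 + 3 + 1) / 4
```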

Expected:

2 Answers:

Answer 0 (score: 2)

IIUC, I think you need:

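# Key such as '0_k': gender and app joined by an underscore
df['new_col'] = df.gender.astype(str)+'_'+df.app
# Total time per (gender, app) divided by the number of rows per gender
df['Average'] = df.groupby(['gender','app'])['time'].transform('sum')/\
                df.groupby(['gender'])['time'].transform('count')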

print(df)
   user  gender app  time new_col  Average
0     2       0   k     6     0_k      8.0
1     3       0   k    10     0_k      8.0
2     4       1   k    10     1_k      4.0
3     4       1   k     6     1_k      4.0
4     5       1   s     3     1_s      1.0
5     5       1   s     1     1_s      1.0

d = dict(df[['new_col','Average']].values)

print(d)
{'0_k': 8.0, '1_k': 4.0, '1_s': 1.0}
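If adding helper columns to df is not wanted, the same dictionary can probably be built from two grouped Series instead. This is a minimal sketch, not part of the answer above: it assumes the question's DataFrame, uses `Series.div` with `level='gender'` to match each (gender, app) total against its gender's row count, and the names `total`, `rows_per_gender`, `average` and `averages_by_key` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'user': [2, 3, 4, 4, 5, 5],
                   'gender': [0, 0, 1, 1, 1, 1],
                   'app': ['k', 'k', 'k', 'k', 's', 's'],
                   'time': [6, 10, 10, 6, 3, 1]})

# Total time per (gender, app) pair.
total = df.groupby(['gender', 'app'])['time'].sum()
# Number of rows per gender: the denominator used in the question.
rows_per_gender = df.groupby('gender')['time'].count()

# Divide each total by the row count of its gender, matching on the 'gender' level.
average = total.div(rows_per_gender, level='gender')

# Build string keys such as '0_k' from the (gender, app) index tuples.
averages_by_key = {f'{gender}_{app}': float(value)
                   for (gender, app), value in average.items()}
print(averages_by_key)
```

Unlike the transform-based version, this leaves df unchanged and yields one entry per (gender, app) pair directly.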

Answer 1 (score: 0)

(df.groupby(["app", "gender"]).sum()/df.groupby(["gender"]).count()).time


app  gender
k    0         8.0
     1         4.0
s    1         1.0

To convert it to a dictionary:

d = (df.groupby(["app", "gender"]).sum()/df.groupby(["gender"]).count()).time.to_dict()
print(d)

{('k', 0): 8.0, ('k', 1): 4.0, ('s', 1): 1.0}
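For comparison, here is a sketch of the ordinary per-group mean, which is not what the question asks for: it divides each total by the number of rows in that (app, gender) pair rather than by all rows of the gender, so the values for gender 1 come out differently. The variable name `plain_mean` is hypothetical, and the DataFrame is assumed to be the one from the question.

```python
import pandas as pd

df = pd.DataFrame({'user': [2, 3, 4, 4, 5, 5],
                   'gender': [0, 0, 1, 1, 1, 1],
                   'app': ['k', 'k', 'k', 'k', 's', 's'],
                   'time': [6, 10, 10, 6, 3, 1]})

# Ordinary mean: each (app, gender) total divided by that pair's own row count.
plain_mean = df.groupby(['app', 'gender'])['time'].mean().to_dict()

# k / gender 0: 16 / 2 = 8.0
# k / gender 1: 16 / 2 = 8.0  (the question wants 16 / 4 = 4.0)
# s / gender 1:  4 / 2 = 2.0  (the question wants  4 / 4 = 1.0)
print(plain_mean)
```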