pandas groupby with count, sum and avg

Time: 2017-03-03 09:12:37

Tags: python python-3.x pandas

I have the following DataFrame in pandas:

+---------+-----------+------------+------------------------------------------------+
| keyword | frequency | avg weight |                  sum other keywords            |
+---------+-----------+------------+------------------------------------------------+
| dog     |         3 | 0.14       | [cat, horse, pig, cat, horse, cat, horse, pig] |
| cat     |         1 | 0.5        | [dog, pig, camel]                              |
| horse   |         2 | 0.185      | [dog, camel, cat, camel]                       |
+---------+-----------+------------+------------------------------------------------+

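What I want to do is group by keyword while counting each keyword's frequency, averaging the weight, and summing (concatenating) the other keywords. The result would look like this: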

{{1}}

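Now, I know how to do this as several separate operations: value_counts, groupby.sum(), groupby.mean(), and then merging the results. However, that is very inefficient and requires a lot of manual adjustment.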
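For reference, the multi-step approach described above might look roughly like the sketch below. The input rows are hypothetical: they were chosen only so that they reproduce the summary values quoted in this post, and the names `freq`, `avg`, `others` and `result` are illustrative.

```python
import pandas as pd

# Hypothetical input rows (an assumption), consistent with the summary in this post.
df = pd.DataFrame({
    'keyword': ['dog', 'cat', 'horse', 'dog', 'dog', 'horse'],
    'weight': [0.12, 0.5, 0.07, 0.1, 0.2, 0.3],
    'other keywords': [
        ['cat', 'horse', 'pig'],
        ['dog', 'pig', 'camel'],
        ['dog', 'camel', 'cat'],
        ['cat', 'horse'],
        ['cat', 'horse', 'pig'],
        ['camel'],
    ],
})

# Step 1: how often each keyword occurs.
freq = df['keyword'].value_counts().rename('frequency')
# Step 2: mean weight per keyword.
avg = df.groupby('keyword')['weight'].mean().rename('avg weight')
# Step 3: summing lists concatenates them per keyword.
others = df.groupby('keyword')['other keywords'].sum().rename('sum other keywords')

# Step 4: align the three Series on the keyword index and flatten to columns.
result = (
    pd.concat([freq, avg, others], axis=1)
      .sort_index()
      .rename_axis('keyword')
      .reset_index()
)
print(result)
```

Each step scans the data separately and the pieces have to be realigned by hand, which is the overhead the question is trying to avoid.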

I would like to know whether this can be done in a single operation.

1 answer:

Answer 0: (score: 10)

You can use agg, which applies a different aggregation to each column in a single groupby pass:

df = df.groupby('keyword').agg({'keyword':'size', 'weight':'mean', 'other keywords':'sum'})
# set new ordering of columns (reindex_axis was removed in pandas 1.0; use reindex)
df = df.reindex(columns=['keyword','weight','other keywords'])
#reset index
df = df.rename_axis(None).reset_index()
#set new column names
df.columns = ['keyword','frequency','avg weight','sum other keywords']

print (df)
  keyword  frequency  avg weight  \
0     cat          1       0.500   
1     dog          3       0.140   
2   horse          2       0.185   

                               sum other keywords  
0                               [dog, pig, camel]  
1  [cat, horse, pig, cat, horse, cat, horse, pig]  
2                        [dog, camel, cat, camel]
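On pandas 0.25 or later, named aggregation can produce the final column names directly, which avoids the separate reordering and renaming steps. The following is a minimal sketch, not part of the original answer. The input rows are hypothetical (chosen to be consistent with the output above), and `summary` is an illustrative name. Because two of the target column names contain spaces, the keyword arguments are passed by unpacking a dict.

```python
import pandas as pd

# Hypothetical input rows (an assumption), consistent with the output shown above.
df = pd.DataFrame({
    'keyword': ['dog', 'cat', 'horse', 'dog', 'dog', 'horse'],
    'weight': [0.12, 0.5, 0.07, 0.1, 0.2, 0.3],
    'other keywords': [
        ['cat', 'horse', 'pig'],
        ['dog', 'pig', 'camel'],
        ['dog', 'camel', 'cat'],
        ['cat', 'horse'],
        ['cat', 'horse', 'pig'],
        ['camel'],
    ],
})

summary = (
    df.groupby('keyword')
      .agg(**{
          # output column name: (input column, aggregation)
          'frequency': ('weight', 'size'),                  # rows per keyword
          'avg weight': ('weight', 'mean'),                 # mean weight per keyword
          'sum other keywords': ('other keywords', 'sum'),  # list concatenation
      })
      .reset_index()
)
print(summary)
```

Here `'size'` counts every row in a group, including rows whose weight is missing; `'count'` would skip missing values instead.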