Count consecutive occurrences in a column

Time: 2021-06-01 19:47:42

Tags: python pandas numpy

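I am trying to count the consecutive occurrences of values in the Products column. The result should look like the "Total counts" column below. I tried using groupby together with cumsum, but my logic does not work.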

+----------+--------------+
| Products | Total counts |
+----------+--------------+
| 1        | 3            |
+----------+--------------+
| 1        | 3            |
+----------+--------------+
| 1        | 3            |
+----------+--------------+
| 2        | 1            |
+----------+--------------+
| 3        | 3            |
+----------+--------------+
| 3        | 3            |
+----------+--------------+
| 3        | 3            |
+----------+--------------+
| 4        | 2            |
+----------+--------------+
| 4        | 2            |
+----------+--------------+

1 answer:

Answer 0: (score: 1)

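Use groupby with transform('count'). This counts every row that holds the same value, so it matches the expected result only when each product appears in a single consecutive block: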

df['Total counts'] = df.groupby('Products')['Products'].transform('count')

Output:

   Products  Total counts
0         1             3
1         1             3
2         1             3
3         2             1
4         3             3
5         3             3
6         3             3
7         4             2
8         4             2

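If the same product can reappear later in the DataFrame and only consecutive runs should be counted, group by run instead: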

grp = (df['Products'] != df['Products'].shift()).cumsum()
df['Total counts'] = df.groupby(grp)['Products'].transform('count')

Output:

   Products  Total counts
0         1             3
1         1             3
2         1             3
3         2             1
4         3             3
5         3             3
6         3             3
7         4             2
8         4             2
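
The sample data in the question never repeats a product after its run ends, so both approaches print the same table. The sketch below uses hypothetical data (not from the question) in which product 1 appears in two separate runs, to show where the two approaches differ. The column names "By value" and "By run" and the variable name run_id are illustrative choices only.

```python
import pandas as pd

# Hypothetical data: product 1 appears in two separate runs.
df = pd.DataFrame({"Products": [1, 1, 2, 1, 1, 1, 3]})

# Count all rows that share the same value, regardless of position.
df["By value"] = df.groupby("Products")["Products"].transform("count")

# A new run starts wherever the value differs from the previous row;
# the cumulative sum of those start flags gives each run its own id.
run_id = (df["Products"] != df["Products"].shift()).cumsum()

# Count the rows within each run.
df["By run"] = df.groupby(run_id)["Products"].transform("count")

print(df)
```

With this data, "By value" is 5 on every row holding product 1, while "By run" is 2 for the first run of product 1 and 3 for the second.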