Nth occurrence of a value

Time: 2017-12-14 12:10:23

Tags: pandas conditional

    user_login  login_type  login_time
0   a   0   14:00:00
1   b   0   08:20:03
2   c   1   09:10:03
3   b   1   10:49:03
4   a   1   11:19:03
5   a   1   12:29:03
6   c   0   13:39:03
7   c   1   14:49:03

I have df1 and want to find the second occurrence of each user_login. If the corresponding value in login_type is 1, login_time should be put into a new 2nd_login_time column. The final result should look like this:

user_login  login_type  login_time  2nd_login_time
a   0   14:00:00    No 2nd login_time
b   0   8:20:03     No 2nd login_time
c   1   9:10:03     No 2nd login_time
b   1   10:49:03    10:49:03
a   1   11:19:03    11:19:03
a   1   12:29:03    No 2nd login_time
c   0   13:39:03    13:39:03
c   1   14:49:03    No 2nd login_time

How can I achieve this in pandas?

1 answer:

Answer 0 (score: 0)

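Use cumcount to get the position of each row within its user_login group (counting from 0, so the second occurrence has position 1), and chain that with the other condition. Finally, set the new values with loc: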

m = (df.groupby('user_login').cumcount() == 1) & (df['login_type'] == 1)

df.loc[m, 'new'] = df['login_time']
print (df)
  user_login  login_type login_time       new
0          a           0   14:00:00       NaN
1          b           0   08:20:03       NaN
2          c           1   09:10:03       NaN
3          b           1   10:49:03  10:49:03
4          a           1   11:19:03  11:19:03
5          a           1   12:29:03       NaN
6          c           0   13:39:03       NaN
7          c           1   14:49:03       NaN
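To try the snippet above end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption: it rebuilds the sample data from the question, with login_time held as plain strings (the original dtype is not stated).

```python
import pandas as pd

# Assumed reconstruction of the question's sample data; times kept as strings.
df = pd.DataFrame({
    'user_login': list('abcbaacc'),
    'login_type': [0, 0, 1, 1, 1, 1, 0, 1],
    'login_time': ['14:00:00', '08:20:03', '09:10:03', '10:49:03',
                   '11:19:03', '12:29:03', '13:39:03', '14:49:03'],
})

# cumcount numbers rows within each user_login group starting at 0,
# so position 1 marks the second occurrence of that user.
m = (df.groupby('user_login').cumcount() == 1) & (df['login_type'] == 1)

# Rows outside the mask are left as NaN in the new column.
df.loc[m, 'new'] = df['login_time']
print(df)
```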

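If you want to set both values (the login time where the condition holds, and a placeholder string everywhere else):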

import numpy as np

df['new'] = np.where(m, df['login_time'], 'No 2nd login_time')
print (df)
  user_login  login_type login_time                new
0          a           0   14:00:00  No 2nd login_time
1          b           0   08:20:03  No 2nd login_time
2          c           1   09:10:03  No 2nd login_time
3          b           1   10:49:03           10:49:03
4          a           1   11:19:03           11:19:03
5          a           1   12:29:03  No 2nd login_time
6          c           0   13:39:03  No 2nd login_time
7          c           1   14:49:03  No 2nd login_time

Details:

print (df.groupby('user_login').cumcount())
0    0
1    0
2    0
3    1
4    1
5    2
6    1
7    2
dtype: int64

print (m)
0    False
1    False
2    False
3     True
4     True
5    False
6    False
7    False
dtype: bool
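Since the question title asks about the nth occurrence, the same cumcount idea can be generalized by comparing against n - 1 instead of 1. The sketch below is not part of the original answer: nth_occurrence_mask is a hypothetical helper name, and the sample DataFrame is an assumed reconstruction of the question's data.

```python
import pandas as pd

def nth_occurrence_mask(frame, key, n):
    """Hypothetical helper: True on the n-th (1-based) row of each `key` group."""
    # cumcount is 0-based, so the n-th occurrence has position n - 1.
    return frame.groupby(key).cumcount() == n - 1

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    'user_login': list('abcbaacc'),
    'login_type': [0, 0, 1, 1, 1, 1, 0, 1],
    'login_time': ['14:00:00', '08:20:03', '09:10:03', '10:49:03',
                   '11:19:03', '12:29:03', '13:39:03', '14:49:03'],
})

# Third occurrence of each user, restricted to rows where login_type is 1.
m3 = nth_occurrence_mask(df, 'user_login', 3) & (df['login_type'] == 1)
print(df[m3])
```

Users with fewer than n logins (here, b for n = 3) simply contribute no True rows to the mask.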