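Create an index column by group

Time: 2019-02-11 13:45:49

Tags: python pandas dataframe

I want to index my dataframe so that, within each group, the index starts at 0 and counts up across the observations in that group. For example, given: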

pd.DataFrame([["John","Car"],["John","House"],["Sam","Skate"],["Sam","Disco"],["Sam","Space"]])

I want:

pd.DataFrame([["John","Car",0],["John","House",1],["Sam","Skate",0],["Sam","Disco",1],["Sam","Space",2]])

Thanks

2 answers:

Answer 0: (score: 1)

Use:

df.groupby(0)[0].apply(lambda x:x.duplicated().cumsum())
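
This works because every value in column 0 is identical within a group, so duplicated() flags every row after the first and the running sum counts them. A minimal sketch, assuming you want the counter attached as a new column (the name idx is made up for illustration). group_keys=False is my addition, intended to keep the original row index on pandas versions that would otherwise prepend the group key:

```python
import pandas as pd

df = pd.DataFrame(
    [["John", "Car"], ["John", "House"], ["Sam", "Skate"], ["Sam", "Disco"], ["Sam", "Space"]]
)

# Within a group, the first row is not a duplicate (False) and every later row is (True),
# so the cumulative sum of the flags is a 0-based position within the group.
counter = df.groupby(0, group_keys=False)[0].apply(lambda x: x.duplicated().cumsum())

# "idx" is a hypothetical column name; assignment aligns on the original row index.
df["idx"] = counter
print(df)
```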

Answer 1: (score: 1)

You are looking for the cumulative count function, cumcount:

import pandas as pd
df = pd.DataFrame([["John","Car"],["John","House"],["Sam","Skate"],["Sam","Disco"],["Sam","Space"]])
df.groupby(0).cumcount()
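
To get the frame from the question, the result can be assigned back as a new column. A small sketch; using the integer label 2 for the new column is an assumption chosen to match the desired output:

```python
import pandas as pd

df = pd.DataFrame(
    [["John", "Car"], ["John", "House"], ["Sam", "Skate"], ["Sam", "Disco"], ["Sam", "Space"]]
)

# cumcount() numbers the rows of each group from 0, in their original order.
# The column label 2 is an assumption, picked to mirror the frame asked for in the question.
df[2] = df.groupby(0).cumcount()
print(df)
```

If you need the numbering to start at 1 instead, adding 1 to the result of cumcount() should do it.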