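I have a DataFrame with two columns, "id" and "text". Each row holds one ID and one line of text for that ID. I want to group by id and, for each id, join only the second half of its text rows into a single string. Sample data: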
id text
0 AB001 hello this is samrey how may I assist you
1 AB001 thank you for contacting chat helpline
2 AB001 could you please tell me your problem
3 AB002 Thank you for contacting ou team
4 AB002 agent will be assigned to you shortly
5 AB003 I will reset the the handset
6 AB003 switch off and switch on
7 AB003 still not working
8 AB003 it should work
9 AB003 now its working thanks a lot
Answer 0 (score: 2)
You can join the strings within each group:
new_df = df.groupby('id').text.apply(' '.join).reset_index()
id text
0 AB001 hello this is samrey how may I assist you tha...
1 AB002 Thank you for contacting ou team agent will b...
2 AB003 I will reset the the handset switch off and s...
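Edit: based on the comment "suppose an ID has 6 rows, we have to join the last 3 rows for that particular ID". The line below takes the second half of each group's rows and joins them. The split point is len(x)//2, so for an odd row count the larger half is kept: an ID with 3 rows joins its last 2 rows.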
df.groupby('id').text.apply(lambda x: ' '.join(x.iloc[len(x)//2:])).reset_index()
id text
0 AB001 thank you for contacting chat helpline could you please tell me your problem
1 AB002 agent will be assigned to you shortly
2 AB003 still not working it should work now its working thanks a lot
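For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand. The helper name join_second_half and the variable names joined_all and joined_half are my own, not from the original answer. It uses .iloc so the slice is explicitly positional regardless of the group's index labels.

```python
import pandas as pd

# Sample data copied from the question (typos in the text kept as-is).
df = pd.DataFrame({
    "id": ["AB001"] * 3 + ["AB002"] * 2 + ["AB003"] * 5,
    "text": [
        "hello this is samrey how may I assist you",
        "thank you for contacting chat helpline",
        "could you please tell me your problem",
        "Thank you for contacting ou team",
        "agent will be assigned to you shortly",
        "I will reset the the handset",
        "switch off and switch on",
        "still not working",
        "it should work",
        "now its working thanks a lot",
    ],
})


def join_second_half(texts: pd.Series) -> str:
    """Join the second half of a group's rows with single spaces.

    The split point is len(texts) // 2, so a group with an odd
    number of rows keeps the larger half (3 rows -> last 2).
    """
    start = len(texts) // 2
    return " ".join(texts.iloc[start:])


# Join every row per id.
joined_all = df.groupby("id")["text"].apply(" ".join).reset_index()

# Join only the second half of the rows per id.
joined_half = df.groupby("id")["text"].apply(join_second_half).reset_index()

print(joined_half.to_string())
```

If you need exactly the last half rounded down instead (3 rows -> last 1), changing the split point to `len(texts) - len(texts) // 2` should do it.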