Merge the second half of the rows for each ID, where the number of rows differs per ID

Time: 2018-12-31 21:00:32

Tags: python pandas

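I have a DataFrame with two columns, "id" and "text". Each row holds one ID and one piece of text for that ID, and the number of rows per ID varies. I want to group by id and, for each id, merge only the second half of its text rows into a single string. Sample data: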

    id      text
0   AB001   hello this is samrey how may I assist you
1   AB001   thank you for contacting chat helpline
2   AB001   could you please tell me your problem
3   AB002   Thank you for contacting ou team
4   AB002   agent will be assigned to you shortly
5   AB003   I will reset the the handset
6   AB003   switch off and switch on
7   AB003   still not working
8   AB003   it should work
9   AB003   now its working thanks a lot

1 answer:

Answer 0 (score: 2)

You can join the strings within each group:

new_df = df.groupby('id').text.apply(' '.join).reset_index()


    id      text
0   AB001   hello this is samrey how may I assist you tha...
1   AB002   Thank you for contacting ou team agent will b...
2   AB003   I will reset the the handset switch off and s...

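Edit: based on the comment, "suppose an ID has 6 rows, we have to join the last 3 rows for that particular ID". The following takes the second half of each ID's rows and joins them with spaces. The starting position is len(x)//2, so when the row count is odd the middle row is included: an ID with 3 rows gets its last 2 rows joined.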

df.groupby('id').text.apply(lambda x: ' '.join(x.iloc[len(x)//2:])).reset_index()


    id      text
0   AB001   thank you for contacting chat helpline could you please tell me your problem
1   AB002   agent will be assigned to you shortly
2   AB003   still not working it should work now its working thanks a lot
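
For anyone who wants to reproduce the result, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample data in the question, and the helper name `join_second_half` is only an illustrative choice, not part of pandas. It assumes the rows of each ID are already in the desired order.

```python
import pandas as pd

# Sample data copied from the question (typos in the text are kept as-is).
df = pd.DataFrame({
    "id": ["AB001"] * 3 + ["AB002"] * 2 + ["AB003"] * 5,
    "text": [
        "hello this is samrey how may I assist you",
        "thank you for contacting chat helpline",
        "could you please tell me your problem",
        "Thank you for contacting ou team",
        "agent will be assigned to you shortly",
        "I will reset the the handset",
        "switch off and switch on",
        "still not working",
        "it should work",
        "now its working thanks a lot",
    ],
})


def join_second_half(texts: pd.Series) -> str:
    """Join the rows from position len(texts)//2 onward with single spaces.

    Hypothetical helper: for an odd row count the middle row is included.
    """
    start = len(texts) // 2
    # .iloc slices by position, regardless of the index labels in the group.
    return " ".join(texts.iloc[start:])


result = df.groupby("id")["text"].apply(join_second_half).reset_index()
print(result.to_string())
```

If the middle row should be left out for odd counts instead, the start position could be changed to `(len(texts) + 1) // 2`; which rounding is wanted depends on the requirement, which the question does not spell out.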