Pandas - Filter out all but the last X entries for each ID

Time: 2021-03-24 16:04:33

Tags: pandas

Given a flat table like this:

         Date   ID  A-1  A-2  A-3  A-4
0  2020-11-01  id1   10    2    6    8
1  2020-11-01  id2    1    1   10    4
2  2020-11-01  id3    9    4   10    4
3  2020-12-01  id1    4    8    6    6
4  2020-12-01  id3    1    2    3    9
5  2021-01-01  id1    2    2    2    7
6  2021-01-01  id2    9    7   10    9
7  2021-02-01  id2    1    5    9    1
8  2021-02-01  id3   10    2    5    1

How can I filter for the last X values per ID, for example if I want to keep only the 2 most recent rows for each ID?

         Date   ID  A-1  A-2  A-3  A-4
3  2020-12-01  id1    4    8    6    6
4  2020-12-01  id3    1    2    3    9
5  2021-01-01  id1    2    2    2    7
6  2021-01-01  id2    9    7   10    9
7  2021-02-01  id2    1    5    9    1
8  2021-02-01  id3   10    2    5    1

Dummy data:

import pandas as pd

# Column -> {row index -> value} mapping for the table shown above
d = {'Date': {0: '2020-11-01', 1: '2020-11-01', 2: '2020-11-01', 3: '2020-12-01', 4: '2020-12-01', 5: '2021-01-01', 6: '2021-01-01', 7: '2021-02-01', 8: '2021-02-01'}, 'ID': {0: 'id1', 1: 'id2', 2: 'id3', 3: 'id1', 4: 'id3', 5: 'id1', 6: 'id2', 7: 'id2', 8: 'id3'}, 'A-1': {0: 10, 1: 1, 2: 9, 3: 4, 4: 1, 5: 2, 6: 9, 7: 1, 8: 10}, 'A-2': {0: 2, 1: 1, 2: 4, 3: 8, 4: 2, 5: 2, 6: 7, 7: 5, 8: 2}, 'A-3': {0: 6, 1: 10, 2: 10, 3: 6, 4: 3, 5: 2, 6: 10, 7: 9, 8: 5}, 'A-4': {0: 8, 1: 4, 2: 4, 3: 6, 4: 9, 5: 7, 6: 9, 7: 1, 8: 1}}
df = pd.DataFrame(data=d)

1 Answer:

Answer 0: (score: 2)

groupby/tail

This assumes your data is already sorted by 'Date'.
If it is not, sort by that column first.

df.groupby('ID').tail(2)

         Date   ID  A-1  A-2  A-3  A-4
3  2020-12-01  id1    4    8    6    6
4  2020-12-01  id3    1    2    3    9
5  2021-01-01  id1    2    2    2    7
6  2021-01-01  id2    9    7   10    9
7  2021-02-01  id2    1    5    9    1
8  2021-02-01  id3   10    2    5    1
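
If the rows are not already in date order, the sorting step mentioned in the answer could be sketched roughly as follows. This is only an illustration, not part of the original answer: it assumes the dates are ISO-formatted strings (which sort chronologically as plain text), reproduces just three columns of the question's data, and uses made-up names (`shuffled`, `ordered`, `result`) plus a deliberately scrambled row order to mimic unsorted input.

```python
import pandas as pd

# Three columns of the question's data, in the original row order.
df = pd.DataFrame({
    "Date": ["2020-11-01", "2020-11-01", "2020-11-01", "2020-12-01", "2020-12-01",
             "2021-01-01", "2021-01-01", "2021-02-01", "2021-02-01"],
    "ID": ["id1", "id2", "id3", "id1", "id3", "id1", "id2", "id2", "id3"],
    "A-1": [10, 1, 9, 4, 1, 2, 9, 1, 10],
})

# Hypothetical unsorted input: the same rows in a scrambled order.
# The original index labels (0..8) travel with the rows.
shuffled = df.iloc[[7, 2, 5, 0, 8, 3, 6, 1, 4]]

# Sort by date first so that "last" means "most recent" within each ID.
ordered = shuffled.sort_values("Date", kind="stable")

# Keep the last 2 rows per ID, then restore the original row order by index label.
result = ordered.groupby("ID").tail(2).sort_index()

print(result)
```

If 'Date' were stored in a format that does not sort chronologically as text, converting it with `pd.to_datetime` before sorting would be the safer route.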