Given a flat table like this:
Date ID A-1 A-2 A-3 A-4
0 2020-11-01 id1 10 2 6 8
1 2020-11-01 id2 1 1 10 4
2 2020-11-01 id3 9 4 10 4
3 2020-12-01 id1 4 8 6 6
4 2020-12-01 id3 1 2 3 9
5 2021-01-01 id1 2 2 2 7
6 2021-01-01 id2 9 7 10 9
7 2021-02-01 id2 1 5 9 1
8 2021-02-01 id3 10 2 5 1
How can I filter the last X values per ID? For example, what if I only want to keep the 2 most recent rows for each ID?
Date ID A-1 A-2 A-3 A-4
3 2020-12-01 id1 4 8 6 6
4 2020-12-01 id3 1 2 3 9
5 2021-01-01 id1 2 2 2 7
6 2021-01-01 id2 9 7 10 9
7 2021-02-01 id2 1 5 9 1
8 2021-02-01 id3 10 2 5 1
Dummy data:
import pandas as pd

d = {'Date': {0: '2020-11-01', 1: '2020-11-01', 2: '2020-11-01', 3: '2020-12-01', 4: '2020-12-01', 5: '2021-01-01', 6: '2021-01-01', 7: '2021-02-01', 8: '2021-02-01'}, 'ID': {0: 'id1', 1: 'id2', 2: 'id3', 3: 'id1', 4: 'id3', 5: 'id1', 6: 'id2', 7: 'id2', 8: 'id3'}, 'A-1': {0: 10, 1: 1, 2: 9, 3: 4, 4: 1, 5: 2, 6: 9, 7: 1, 8: 10}, 'A-2': {0: 2, 1: 1, 2: 4, 3: 8, 4: 2, 5: 2, 6: 7, 7: 5, 8: 2}, 'A-3': {0: 6, 1: 10, 2: 10, 3: 6, 4: 3, 5: 2, 6: 10, 7: 9, 8: 5}, 'A-4': {0: 8, 1: 4, 2: 4, 3: 6, 4: 9, 5: 7, 6: 9, 7: 1, 8: 1}}
df = pd.DataFrame(data=d)
Answer 0 (score: 2)
groupby/tail
This assumes your data is already sorted by 'Date'. If it is not, sort it first (for example with df.sort_values('Date')).
df.groupby('ID').tail(2)
Date ID A-1 A-2 A-3 A-4
3 2020-12-01 id1 4 8 6 6
4 2020-12-01 id3 1 2 3 9
5 2021-01-01 id1 2 2 2 7
6 2021-01-01 id2 9 7 10 9
7 2021-02-01 id2 1 5 9 1
8 2021-02-01 id3 10 2 5 1
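If the rows are not already in date order, the sort step mentioned above can be folded into a small helper. The following is only a sketch under some assumptions: the function name last_n_per_id and the small unsorted sample frame are made up for illustration and are not part of the original question, and the 'Date' strings are assumed to parse cleanly with pd.to_datetime.

```python
import pandas as pd

# Hypothetical, deliberately unsorted sample (not the data from the question).
df = pd.DataFrame({
    "Date": ["2021-02-01", "2020-11-01", "2021-01-01", "2020-12-01", "2020-11-01"],
    "ID": ["id1", "id1", "id1", "id2", "id2"],
    "A-1": [1, 2, 3, 4, 5],
})

def last_n_per_id(frame, n=2):
    """Keep the n most recent rows (by 'Date') for each 'ID'."""
    # Parse the dates so ordering is chronological rather than lexical,
    # then sort; a stable sort keeps the original order among equal dates.
    ordered = frame.assign(Date=pd.to_datetime(frame["Date"])).sort_values(
        "Date", kind="stable"
    )
    # tail(n) takes the last n rows of each group in the sorted order;
    # sort_index() restores the original row order for readability.
    return ordered.groupby("ID").tail(n).sort_index()

result = last_n_per_id(df, n=2)
print(result)
```

Here the oldest row for id1 (2020-11-01) is dropped, while id2 keeps both of its rows because it has only two. Note that the helper returns 'Date' as a datetime column rather than the original strings.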