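I am trying to import multiple files, covering the range between two dates, into a pandas DataFrame. However, the resulting DataFrame contains multiple copies of the data instead of a single copy.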
Suppose I have files like these:
File A:
20170501 00:00:11 11 1
20170501 00:00:20 21 2
File B:
20170502 00:06:11 31 3
20170502 00:30:11 41 4
File C:
20170503 00:40:11 51 5
20170503 00:50:11 61 6
The resulting DataFrame looks like this:
20170501 00:00:11 11 1
20170501 00:00:20 21 2
20170502 00:06:11 31 3
20170502 00:30:11 41 4
20170503 00:40:11 51 5
20170503 00:50:11 61 6
20170501 00:00:11 11 1
20170501 00:00:20 21 2
20170502 00:06:11 31 3
20170502 00:30:11 41 4
20170503 00:40:11 51 5
20170503 00:50:11 61 6
20170501 00:00:11 11 1
20170501 00:00:20 21 2
20170502 00:06:11 31 3
20170502 00:30:11 41 4
20170503 00:40:11 51 5
20170503 00:50:11 61 6
What I want is:
20170501 00:00:11 11 1
20170501 00:00:20 21 2
20170502 00:06:11 31 3
20170502 00:30:11 41 4
20170503 00:40:11 51 5
20170503 00:50:11 61 6
My code looks like this:
{{1}}
How can I create the desired DataFrame?
Answer 0 (score: 3)
You can use drop_duplicates:
Mu = Mu.drop_duplicates()
Output:
0 20170501 00:00:11 11 1
1 20170501 00:00:20 21 2
2 20170502 00:06:11 31 3
3 20170502 00:30:11 41 4
4 20170503 00:40:11 51 5
5 20170503 00:50:11 61 6
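For illustration, here is a minimal sketch of the same idea that can be run from start to finish. The asker's loading code is not shown, so everything around the drop_duplicates call is an assumption: the file contents are held in memory in a hypothetical FILES dict rather than read from disk, the column names in COLUMNS are invented, and the helpers read_one and load_all are made up for this example. It reads each file once, concatenates the frames in a single pd.concat call, and then shows that drop_duplicates collapses a tripled frame back to one copy.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for File A, File B and File C from the question.
FILES = {
    "A": "20170501 00:00:11 11 1\n20170501 00:00:20 21 2\n",
    "B": "20170502 00:06:11 31 3\n20170502 00:30:11 41 4\n",
    "C": "20170503 00:40:11 51 5\n20170503 00:50:11 61 6\n",
}

# Assumed column names; the original files have no header row.
COLUMNS = ["date", "time", "value", "id"]


def read_one(text):
    """Parse one space-separated file into a DataFrame, keeping values as strings."""
    return pd.read_csv(
        io.StringIO(text), sep=" ", header=None, names=COLUMNS, dtype=str
    )


def load_all(files):
    """Read every file exactly once and concatenate the frames in one call."""
    frames = [read_one(text) for text in files.values()]
    return pd.concat(frames, ignore_index=True)


Mu = load_all(FILES)

# Reproduce the symptom from the question: the same rows appear three times.
tripled = pd.concat([Mu, Mu, Mu], ignore_index=True)

# drop_duplicates keeps the first occurrence of each row; reset_index renumbers 0..5.
deduped = tripled.drop_duplicates().reset_index(drop=True)

print(deduped)
```

Note that drop_duplicates also removes rows that are legitimately repeated in the source files, so if identical records can occur in real data, it may be safer to find out why each file is being added more than once than to deduplicate afterwards.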