Pandas: Accessing rows by filtering on date

Time: 2015-12-08 06:36:13

Tags: python datetime pandas

When I run:

import pandas
from datetime import datetime
timestampparse = lambda t: datetime.fromtimestamp(float(t))
df = pandas.read_csv('blah.csv', delimiter=';', parse_dates=True, date_parser=timestampparse, index_col='DateTime', names=['DateTime', 'Sell'], header=None)
print df.ix['2015-12-02 12:02:21.070':'2015-12-02 12:40:21.070']

with this blah.csv file:

1449054136.83;1.05905
1449054139.25;1.05906
1449054139.86;1.05906
1449054140.47;1.05906

I get this error:

KeyError

How can I access a slice of a pandas DataFrame filtered by date?

Why doesn't df.ix['2015-12-02 12:02:19.000':'2015-12-02 12:40:21.070'] work?
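For readers on a current pandas, note that `.ix` was removed in pandas 1.0 and `date_parser` was deprecated in pandas 2.0, so the snippet above no longer runs unchanged. The following is a minimal sketch of one way to load the same sample data today. The `CSV_TEXT` name and the inlined data are assumptions made so the example is self-contained. It converts the epoch floats to UTC timestamps, whereas `datetime.fromtimestamp` in the question uses the machine's local time zone, so the hours differ from those shown in the answers.

```python
import io

import pandas as pd

# The four sample rows from blah.csv, inlined (hypothetical name CSV_TEXT)
# so the sketch does not depend on a file on disk.
CSV_TEXT = """1449054136.83;1.05905
1449054139.25;1.05906
1449054139.86;1.05906
1449054140.47;1.05906
"""

df = pd.read_csv(
    io.StringIO(CSV_TEXT), sep=";", header=None, names=["DateTime", "Sell"]
)

# Interpret the float column as seconds since the Unix epoch (UTC).
df["DateTime"] = pd.to_datetime(df["DateTime"], unit="s")
df = df.set_index("DateTime")

# Label-based selection with .loc: a date-only string selects the whole day.
whole_day = df.loc["2015-12-02"]
print(whole_day)
```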

3 answers:

Answer 0: (score: 1)

Pad the fractional seconds with a zero, as in '2015-12-02 12:02:16.0859':

>>> df['2015-12-02 12:02:16.0859':'2015-12-02 12:03:20']
                              Sell
DateTime                           
2015-12-02 12:02:16.829999  1.05905
2015-12-02 12:02:19.250000  1.05906
2015-12-02 12:02:19.859999  1.05906
2015-12-02 12:02:20.470000  1.05906

This works:

>>> df['2015-12-02 12:02:17':'2015-12-02 12:03:20']
                               Sell
DateTime                           
2015-12-02 12:02:19.250000  1.05906
2015-12-02 12:02:19.859999  1.05906
2015-12-02 12:02:20.470000  1.05906

This works with version 0.16.2:

>>> from datetime import datetime
>>> df[datetime(2015, 12, 2, 12, 2, 16):datetime(2015, 12, 2, 12, 2, 20)]

                               Sell
DateTime                           
2015-12-02 12:02:16.829999  1.05905
2015-12-02 12:02:19.250000  1.05906
2015-12-02 12:02:19.859999  1.05906

Answer 1: (score: 1)

I think it doesn't work because of a possible precision problem between the DatetimeIndex and the float timestamps it was built from.
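One way to see the kind of precision issue this answer may be referring to is to look at what the float parser actually stores. The sketch below uses only the standard library and the first timestamp from blah.csv. The variable names are illustrative, and it does not by itself establish the cause of the KeyError.

```python
from decimal import Decimal

raw = "1449054136.83"      # first timestamp in blah.csv, as text
as_float = float(raw)      # what float(t) in the question's parser produces

# Decimal(float) exposes the exact binary value held by the float,
# which is close to, but not exactly, the decimal text in the file.
stored = Decimal(as_float)
error = stored - Decimal(raw)

print(stored)
print(error)
```

The stored value differs from the text by a tiny nonzero amount, which is consistent with the answers printing 16.829999 where the file says 16.83.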

You can use partial string indexing. I omit the digits at the end of the datetime and use only whole seconds:

print(df['2015-12-02 12:02:19':'2015-12-02 12:40:20'])

                            Sell
DateTime                        
2015-12-02 12:02:19.250  1.05906
2015-12-02 12:02:19.860  1.05906
2015-12-02 12:02:20.470  1.05906
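As a rough illustration of partial string indexing on a current pandas, the sketch below rebuilds a small frame with the timestamps shown in this thread (typed in directly as strings, which is an assumption made to avoid depending on the CSV file or the local time zone) and slices it with `.loc`. When a slice bound is a string given to whole seconds, the start bound snaps to the beginning of that second and the end bound to the end of it.

```python
import pandas as pd

# Timestamps as displayed in the answers, entered as strings for the sketch.
index = pd.to_datetime(
    [
        "2015-12-02 12:02:16.830",
        "2015-12-02 12:02:19.250",
        "2015-12-02 12:02:19.860",
        "2015-12-02 12:02:20.470",
    ]
)
df = pd.DataFrame({"Sell": [1.05905, 1.05906, 1.05906, 1.05906]}, index=index)
df.index.name = "DateTime"

# Both bounds are given to whole seconds, so every row from 12:02:19.000
# through the end of second 12:40:20 is included.
wide = df.loc["2015-12-02 12:02:19":"2015-12-02 12:40:20"]

# Using the same second for both bounds selects everything inside that second.
one_second = df.loc["2015-12-02 12:02:19":"2015-12-02 12:02:19"]

print(wide)
print(one_second)
```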

Answer 2: (score: 0)

From the docs (Time/Date Components), I understand that you need to specify the number of microseconds (the same as for a datetime object):

In [103]: df.loc["2015-12-02 14:02:10":"2015-12-02 14:02:19.899999"]
Out[103]:
                               Sell
DateTime
2015-12-02 14:02:16.829999  1.05905
2015-12-02 14:02:19.250000  1.05906
2015-12-02 14:02:19.859999  1.05906

Or use datetime to specify the exact number of microseconds:

In [104]: df.loc["2015-12-02 14:02:10":datetime(year=2015, month=12, day=2, hour=14, minute=2, second=20, microsecond=999999)]
Out[104]:
                               Sell
DateTime
2015-12-02 14:02:16.829999  1.05905
2015-12-02 14:02:19.250000  1.05906
2015-12-02 14:02:19.859999  1.05906
2015-12-02 14:02:20.470000  1.05906
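The effect of the microsecond component can be sketched on a current pandas as follows. The index is rebuilt from the timestamps printed in this answer (entered as strings, an assumption made to keep the example self-contained). A `datetime` bound is an exact instant, so an end bound of 14:02:20 stops at 14:02:20.000000, while adding `microsecond=999999` covers the rest of that second.

```python
from datetime import datetime

import pandas as pd

# Timestamps copied from the output above, entered as strings for the sketch.
index = pd.to_datetime(
    [
        "2015-12-02 14:02:16.829999",
        "2015-12-02 14:02:19.250000",
        "2015-12-02 14:02:19.859999",
        "2015-12-02 14:02:20.470000",
    ]
)
df = pd.DataFrame({"Sell": [1.05905, 1.05906, 1.05906, 1.05906]}, index=index)
df.index.name = "DateTime"

start = datetime(2015, 12, 2, 14, 2, 10)

# End bound at exactly 14:02:20.000000: the 14:02:20.470000 row is excluded.
end_whole_second = datetime(2015, 12, 2, 14, 2, 20)
up_to_20 = df.loc[start:end_whole_second]

# End bound at 14:02:20.999999: the 14:02:20.470000 row is included.
end_with_micros = datetime(2015, 12, 2, 14, 2, 20, 999999)
up_to_20_999999 = df.loc[start:end_with_micros]

print(up_to_20)
print(up_to_20_999999)
```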