Datetime index conflict in a pandas DataFrame

Asked: 2019-03-17 16:51:45

Tags: python pandas dataframe

I have a DataFrame with a raw datetime index:

olhcv.index

DatetimeIndex(['1989-01-31', '1989-02-01', '1989-02-02', '1989-02-03',
           '1989-02-06', '1989-02-07', '1989-02-08', '1989-02-09',
           '1989-02-10', '1989-02-13',
           ...
           '2019-03-01', '2019-03-04', '2019-03-05', '2019-03-06',
           '2019-03-07', '2019-03-08', '2019-03-11', '2019-03-12',
           '2019-03-13', '2019-03-14'],
          dtype='datetime64[ns]', length=7606, freq=None) 

I need to drop the non-trading days, using another index that comes from a different package:

import pandas_market_calendars as mcal

nyse = mcal.get_calendar('NYSE')
date = nyse.valid_days(start_date=min(olhcv.index), end_date=max(olhcv.index))
date
DatetimeIndex(['1989-01-31 00:00:00+00:00', '1989-02-01 00:00:00+00:00',
           '1989-02-02 00:00:00+00:00', '1989-02-03 00:00:00+00:00',
           '1989-02-06 00:00:00+00:00', '1989-02-07 00:00:00+00:00',
           '1989-02-08 00:00:00+00:00', '1989-02-09 00:00:00+00:00',
           '1989-02-10 00:00:00+00:00', '1989-02-13 00:00:00+00:00',
           ...
           '2019-03-01 00:00:00+00:00', '2019-03-04 00:00:00+00:00',
           '2019-03-05 00:00:00+00:00', '2019-03-06 00:00:00+00:00',
           '2019-03-07 00:00:00+00:00', '2019-03-08 00:00:00+00:00',
           '2019-03-11 00:00:00+00:00', '2019-03-12 00:00:00+00:00',
           '2019-03-13 00:00:00+00:00', '2019-03-14 00:00:00+00:00'],
          dtype='datetime64[ns, UTC]', length=7589, freq='C')

However, when I try to slice the first DataFrame with the new index:

olhcv2 = olhcv.loc[date]

Traceback (most recent call last):

File "<ipython-input-139-8a6e732943bb>", line 1, in <module>
olhcv2 = olhcv.loc[date]

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1500, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1902, in _getitem_axis
return self._getitem_iterable(key, axis=axis)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1205, in _getitem_iterable
raise_missing=False)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
raise_missing=raise_missing)

File "/Users/luca/miniconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 1246, in _validate_read_indexer
key=key, axis=self.obj._get_axis_name(axis)))

KeyError: "None of [DatetimeIndex(['1989-01-31 00:00:00+00:00', '1989-02-01 00:00:00+00:00',\n               '1989-02-02 00:00:00+00:00', '1989-02-03 00:00:00+00:00',\n               '1989-02-06 00:00:00+00:00', '1989-02-07 00:00:00+00:00',\n               '1989-02-08 00:00:00+00:00', '1989-02-09 00:00:00+00:00',\n               '1989-02-10 00:00:00+00:00', '1989-02-13 00:00:00+00:00',\n               ...\n               '2019-03-01 00:00:00+00:00', '2019-03-04 00:00:00+00:00',\n               '2019-03-05 00:00:00+00:00', '2019-03-06 00:00:00+00:00',\n               '2019-03-07 00:00:00+00:00', '2019-03-08 00:00:00+00:00',\n               '2019-03-11 00:00:00+00:00', '2019-03-12 00:00:00+00:00',\n               '2019-03-13 00:00:00+00:00', '2019-03-14 00:00:00+00:00'],\n              dtype='datetime64[ns, UTC]', length=7589, freq='C')] are in the [index]"

I believe the two indexes differ in some way (time zone, ...). How should I handle this?

Thanks

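1 Answer:

Answer 0: (score: 1)

Use DatetimeIndex.tz_convert with None, which converts the index to UTC and removes the time zone information so that it matches the tz-naive index of the DataFrame: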

olhcv2 = olhcv.loc[date.tz_convert(None)]
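
For readers who want to try this without the market calendar package, here is a minimal sketch of the same idea. The small olhcv frame, its close column, and the hand-written date index are hypothetical stand-ins for the data in the question, not the real NYSE calendar. The isin variant at the end is an alternative not taken from the answer; it may be useful if some calendar days are absent from the DataFrame, since .loc with missing labels raises a KeyError in recent pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for olhcv: a tz-naive daily index that
# includes one non-trading day (Saturday 2019-03-09).
naive_index = pd.to_datetime(
    ["2019-03-08", "2019-03-09", "2019-03-11", "2019-03-12"]
)
olhcv = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0]}, index=naive_index)

# Hypothetical stand-in for nyse.valid_days(...): tz-aware (UTC) days.
date = pd.DatetimeIndex(["2019-03-08", "2019-03-11", "2019-03-12"], tz="UTC")

# tz_convert(None) converts to UTC and drops the time zone, so the
# labels become comparable with the tz-naive DataFrame index.
naive_days = date.tz_convert(None)

# Label-based selection, as in the answer (every label must exist).
olhcv2 = olhcv.loc[naive_days]

# Alternative: a boolean mask, which tolerates days missing from olhcv.
olhcv3 = olhcv[olhcv.index.isin(naive_days)]

print(olhcv2)
```

Either selection keeps only the rows whose dates appear in the calendar index; which one fits better depends on whether a missing day should raise an error or be skipped silently.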