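I'm looking for a way to build a custom pandas.tseries.offsets class for trading hours at 1-second frequency. The main requirement is that the offset object be smart enough to know that the next second after '2015-06-18 16:00:00' is '2015-06-19 09:30:00' (or '09:30:01'), and that the time delta computed from those two timestamps is exactly 1s (a custom 1s offset, analogous to BDay(1) for business-day frequency) rather than the duration of the closed period.
The reason is that when I plot a pd.Series of intraday data spanning several trading days (see the simulated example below), there are many "step lines" (linear interpolation) between each close and the next day's open, representing the duration of the closed period. Is there a way to get rid of them? I looked through the source of pandas.tseries.offsets and found that pd.tseries.offsets.BusinessHour and pd.tseries.offsets.BusinessMixin might help, but I don't know how to use them.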
import datetime as dt
import pandas as pd
import numpy as np
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
# set as 'constant' object shared by all codes in this script
BDAY_US = CustomBusinessDay(calendar=USFederalHolidayCalendar())
sample_freq = '5min'
dates = pd.date_range(start='2015-01-01', end='2015-01-31', freq=BDAY_US).date
# exclude 09:30:00 as it is included in the first time bucket
times = pd.date_range(start='09:30:00', end='16:00:00', freq=sample_freq).time[1:]
time_stamps = [dt.datetime.combine(date, time) for date in dates for time in times]
s = pd.Series(np.random.randn(len(time_stamps)).cumsum() + 100, index=time_stamps)
s.plot()
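Another way I can think of to partially solve this is to call reset_index() first to get a default consecutive integer index for each row, and then treat the difference between consecutive integer indices as the elapsed time (in seconds). Plot the integer index on the x-axis and then relabel the ticks with the appropriate time labels. Could someone show me how to do this with matplotlib?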
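Thanks to Jeff for the comment. I just checked the online documentation for BusinessHour() and found that it may be useful in my case. A follow-up question: BusinessHour works at hourly frequency; is there a way to make it work at 1s frequency? Also, how can it be combined with a CustomBusinessDay object?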
Using BusinessHour():
from pandas.tseries.offsets import *
bhour = BusinessHour(start='09:30', end='16:00')
time = pd.Timestamp('2015-06-18 15:00:00')
print(time)
2015-06-18 15:00:00
# hourly increment works nicely
print(time + bhour * 1)
2015-06-19 09:30:00
# but not at minute or second frequency
print(time + Minute(61))
2015-06-18 16:01:00
print(time + Second(60*60 + 1))
2015-06-18 16:00:01
Thanks a lot; any help is highly appreciated.
Answer 0 (score: 5)
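As I mentioned in the comments, you potentially have two different problems.
The solution I'm giving addresses 1 (the plotting problem), since that seems to be your immediate issue. If you need 2, or both, let us know in the comments: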
Most plots in matplotlib can have an index formatter applied to an axis through the ticker API. I'll adapt this example to your case:
import pandas as pd
import numpy as np
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
# set as 'constant' object shared by all codes in this script
BDAY_US = CustomBusinessDay(calendar=USFederalHolidayCalendar())
sample_freq = '5min'
dates = pd.date_range(start='2015-01-01', end='2015-01-31', freq=BDAY_US).date
# exclude 09:30:00 as it is included in the first time bucket
times = pd.date_range(start='09:30:00', end='16:00:00', freq=sample_freq).time[1:]
time_stamps = [dt.datetime.combine(date, time) for date in dates for time in times]
s = pd.Series(np.random.randn(len(time_stamps)).cumsum() + 100, index=time_stamps)
data_length = len(s)
s.index.name = 'date_time_index'
s.name='stock_price'
s_new = s.reset_index()
ax = s_new.plot(y='stock_price') #plot the data against the new linearised index...
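def format_date(x, pos=None):
    # round the tick position to the nearest row and keep it inside the data range
    thisind = np.clip(int(x + 0.5), 0, data_length - 1)
    # label the tick with the real timestamp stored at that row
    return s_new.date_time_index[thisind].strftime('%Y-%m-%d %H:%M:%S')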
ax.xaxis.set_major_formatter(ticker.FuncFormatter(format_date))
fig = plt.gcf()
fig.autofmt_xdate()
plt.show()
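This gives the output shown below: first at the natural, zoomed-out scale, and second zoomed in so you can see the transition between Friday 16:00 and Monday 09:00.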
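The question also asks for second-level arithmetic that skips closed hours. The following is only a minimal sketch of that idea using the standard library, not a pandas offset class and not part of the original answer. The helper names (is_trading_day, next_trading_day, add_trading_seconds) and the holidays parameter are hypothetical, and the sketch assumes that each session runs from 09:30 up to and including 16:00, so the second after the close is 09:30:01 on the next trading day.

```python
import datetime as dt

OPEN = dt.time(9, 30)   # session open, as in the question
CLOSE = dt.time(16, 0)  # session close, as in the question


def is_trading_day(day, holidays=frozenset()):
    """True for Monday-Friday dates not in the (hypothetical) holiday set."""
    return day.weekday() < 5 and day not in holidays


def next_trading_day(day, holidays=frozenset()):
    """Return the first trading day strictly after `day`."""
    day += dt.timedelta(days=1)
    while not is_trading_day(day, holidays):
        day += dt.timedelta(days=1)
    return day


def add_trading_seconds(ts, seconds, holidays=frozenset()):
    """Advance `ts` by `seconds` of open-market time, skipping closed periods."""
    if seconds < 0:
        raise ValueError("only forward steps are handled in this sketch")
    day = ts.date()
    # Snap timestamps that fall outside a session onto the next open.
    if not is_trading_day(day, holidays) or ts.time() > CLOSE:
        ts = dt.datetime.combine(next_trading_day(day, holidays), OPEN)
    elif ts.time() < OPEN:
        ts = dt.datetime.combine(day, OPEN)
    remaining = dt.timedelta(seconds=seconds)
    while True:
        close = dt.datetime.combine(ts.date(), CLOSE)
        if ts + remaining <= close:
            return ts + remaining
        # Use up the rest of this session, then jump to the next open.
        remaining -= close - ts
        ts = dt.datetime.combine(next_trading_day(ts.date(), holidays), OPEN)


# One second after Thursday's close lands just after Friday's open.
print(add_trading_seconds(dt.datetime(2015, 6, 18, 16, 0, 0), 1))
# 2015-06-19 09:30:01
```

Looping session by session keeps the logic easy to check by hand; for 1s data over many days, a vectorized approach would likely be faster, but that is beyond this sketch.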