Daily high/low for intraday data

Time: 2018-06-01 14:00:30

Tags: python pandas dataframe

I have a DataFrame with a datetime index, like this:

date_time           open    high    low     close   vol
2018-05-13 18:00:00 70.54   70.60   70.42   70.55   2665
2018-05-13 18:15:00 70.55   70.59   70.53   70.58   378
2018-05-13 18:30:00 70.58   70.70   70.57   70.69   1470
2018-05-13 18:45:00 70.68   70.68   70.63   70.65   427
...
2018-05-14 00:00:00 70.46   70.47   70.40   70.41   1276
2018-05-14 00:15:00 70.41   70.45   70.38   70.39   1356
2018-05-14 00:30:00 70.39   70.48   70.39   70.46   1161
2018-05-14 00:45:00 70.46   70.47   70.43   70.46   359

I also need two columns holding the daily high and the daily low. I tried this:

df['day_high'] = df.resample('D').high.max()
df.day_high = df.day_high.fillna(method='ffill')

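This works perfectly for days that contain a 00:00:00 timestamp. For 2018-05-14 there is a row at 00:00:00, so my code works. But the data for 2018-05-13 starts at 18:00, and for that day my code returns NaN (I know why, but I don't know how to write the correct code).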

Can you help me? Thanks.

1 answer:

Answer 0: (score: 4)

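I think you need Resampler.transform, which returns a result with the same index as the original data instead of one row per day: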

# broadcast each day's max/min back onto every intraday row of that day
df['day_high'] = df.resample('D').high.transform('max')
df['day_low'] = df.resample('D').low.transform('min')
print(df)
                      open   high    low  close   vol  day_high  day_low
date_time                                                               
2018-05-13 18:00:00  70.54  70.60  70.42  70.55  2665     70.70    70.42
2018-05-13 18:15:00  70.55  70.59  70.53  70.58   378     70.70    70.42
2018-05-13 18:30:00  70.58  70.70  70.57  70.69  1470     70.70    70.42
2018-05-13 18:45:00  70.68  70.68  70.63  70.65   427     70.70    70.42
2018-05-14 00:00:00  70.46  70.47  70.40  70.41  1276     70.48    70.38
2018-05-14 00:15:00  70.41  70.45  70.38  70.39  1356     70.48    70.38
2018-05-14 00:30:00  70.39  70.48  70.39  70.46  1161     70.48    70.38
2018-05-14 00:45:00  70.46  70.47  70.43  70.46   359     70.48    70.38
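
For reference, here is a minimal self-contained sketch of why the original attempt gives NaN, together with an equivalent way to get the same columns. It rebuilds only the eight rows shown above (the rows elided by "..." are not included), and the names `daily_high`, `aligned` and `day` are made up for illustration. Aggregating with resample('D').max() yields one value per day labelled at midnight, so aligning it to the intraday index only matches rows stamped exactly 00:00:00, and a forward fill cannot reach rows that come before the first match. Grouping by the calendar day and using transform avoids that alignment step.

```python
import pandas as pd

# Rebuild the sample rows from the question (high/low columns only).
idx = pd.to_datetime([
    "2018-05-13 18:00:00", "2018-05-13 18:15:00",
    "2018-05-13 18:30:00", "2018-05-13 18:45:00",
    "2018-05-14 00:00:00", "2018-05-14 00:15:00",
    "2018-05-14 00:30:00", "2018-05-14 00:45:00",
])
df = pd.DataFrame(
    {
        "high": [70.60, 70.59, 70.70, 70.68, 70.47, 70.45, 70.48, 70.47],
        "low": [70.42, 70.53, 70.57, 70.63, 70.40, 70.38, 70.39, 70.43],
    },
    index=idx,
)
df.index.name = "date_time"

# Aggregation: one row per day, labelled 2018-05-13 00:00 and 2018-05-14 00:00.
daily_high = df["high"].resample("D").max()

# Aligning that result to the intraday index (which is what column
# assignment does) matches only the single row stamped 00:00:00.
aligned = daily_high.reindex(df.index)

# Even after a forward fill, the 2018-05-13 rows stay NaN because
# nothing precedes them to fill from.
filled = aligned.ffill()

# Alternative to Resampler.transform: group by calendar day, then transform.
day = df.index.normalize()
df["day_high"] = df.groupby(day)["high"].transform("max")
df["day_low"] = df.groupby(day)["low"].transform("min")

print(df)
```

With this sample, the groupby version should give the same day_high and day_low columns as the resample-based answer, since both bucket rows by calendar day.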