Pandas: loop over custom dates (month) to (month year + N) for plotting

Time: 2016-12-21 08:46:42

Tags: python datetime pandas matplotlib

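I have been analysing data that spans several months, generating and saving one figure per month. So far this has worked well when all the months fall within the same calendar year, but I am struggling to work out how to make the loop behave when the data runs over into the next year.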

Example code:

import pandas as pd
import datetime
import matplotlib.pyplot as plt

df = pd.read_csv("file.csv")
df.index = df.Datetime

for month in range(4,12): #Data starts in April in this example
    fig, axes = plt.subplots(nrows=2,ncols=1, sharex=True, figsize =(18,10))
    startDate = datetime.date(2016,month,1)
    stopDate = datetime.date(2016,month+1,1)
    date_val = startDate.strftime("%B %Y")

    k=0
    df.PRe[startDate:stopDate].plot(ax=axes[k])
    #ylim, xlim, title etc
    k=1
    df.PRp[startDate:stopDate].plot(ax=axes[k])

    plt.savefig("PRe and PRp in %s.png"%date_val,bbox_inches="tight")

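This SO question comes close, although they use pandas datetime objects rather than the datetime.date objects I have used. Should I adapt my code to fit that solution, and if so, how? Otherwise, is there a pandas/Pythonic way to make this work beyond 2016, either for a known start and end date or, better still, for any start and end date?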

2 answers:

Answer 0: (score: 1)

You can use a DateOffset:

month = 4
startDate = datetime.date(2016,month,1)
print (startDate)
stopDate = (startDate + pd.offsets.MonthBegin()).date()
print (stopDate)

2016-04-01
2016-05-01

month = 4
startDate = datetime.date(2016,month,1)
print (startDate)
stopDate = (startDate + pd.offsets.DateOffset(months=1)).date()
print (stopDate)

2016-04-01
2016-05-01
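
The question is about what happens at the end of the year, so here is a minimal sketch of the same two offsets applied to a December start date. The start dates are made up for illustration, and the date is wrapped in pd.Timestamp before adding the offset, which is an assumption of this sketch and not part of the answer above.

```python
import datetime

import pandas as pd

# Hypothetical start date in the last month of a year.
start_date = datetime.date(2016, 12, 1)

# Both offsets roll over into January of the following year.
stop_month_begin = (pd.Timestamp(start_date) + pd.offsets.MonthBegin()).date()
stop_date_offset = (pd.Timestamp(start_date) + pd.DateOffset(months=1)).date()
print(stop_month_begin)
print(stop_date_offset)

# The two offsets differ when the start is not the first of the month:
# MonthBegin snaps to the next month start, DateOffset keeps the day.
mid_month = pd.Timestamp(2016, 12, 15)
snapped = (mid_month + pd.offsets.MonthBegin()).date()
shifted = (mid_month + pd.DateOffset(months=1)).date()
print(snapped, shifted)
```

If the loop always starts on the first of a month, either offset should give the same stop date; MonthBegin is probably the safer choice when the first timestamp in the data falls mid-month.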

Another solution is DatetimeIndex partial string indexing, if you need to select by year and month:

df.PRe['2016-4'].plot(ax=axes[k])
df.PRe[str(2016)+'-'+str(month)].plot(ax=axes[k])
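
As a rough illustration of partial string indexing across a year boundary, the sketch below builds a synthetic daily series (the data and the names idx and pre are invented here) and selects one month by a "YYYY-MM" string. It uses .loc, which is one way to make the label-based lookup explicit.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series spanning a year boundary.
idx = pd.date_range("2016-11-01", "2017-02-28", freq="D")
pre = pd.Series(np.arange(len(idx)), index=idx, name="PRe")

year, month = 2017, 1
key = "%d-%02d" % (year, month)  # builds the string "2017-01"

# A year-month string selects every row that falls in that month.
january = pre.loc[key]
print(key, len(january))
print(january.index[0].date(), january.index[-1].date())
```

Because the key carries the year as well as the month, the same loop body works no matter which year a month belongs to.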

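If the solution needs to loop over the unique year and month combinations in the DatetimeIndex, convert the index with DatetimeIndex.to_period and loop over the unique month periods: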

import numpy as np
import pandas as pd

start = pd.to_datetime('2015-10-24')
rng = pd.date_range(start, periods=10, freq='3W')

df = pd.DataFrame({'PRe': np.random.randint(10, size=10)}, index=rng)
print (df)

            PRe
2015-10-25    2
2015-11-15    3
2015-12-06    3
2015-12-27    1
2016-01-17    8
2016-02-07    4
2016-02-28    2
2016-03-20    6
2016-04-10    8
2016-05-01    0

#one iteration per unique year-month in the index
for date in df.index.to_period('M').unique():
    print (df.PRe[str(date)])

2015-10-25    2
Freq: 3W-SUN, Name: PRe, dtype: int32
2015-11-15    3
Freq: 3W-SUN, Name: PRe, dtype: int32
2015-12-06    3
2015-12-27    1
Freq: 3W-SUN, Name: PRe, dtype: int32
2016-01-17    8
Freq: 3W-SUN, Name: PRe, dtype: int32
2016-02-07    4
2016-02-28    2
Freq: 3W-SUN, Name: PRe, dtype: int32
2016-03-20    6
Freq: 3W-SUN, Name: PRe, dtype: int32
2016-04-10    8
Freq: 3W-SUN, Name: PRe, dtype: int32
2016-05-01    0
Freq: 3W-SUN, Name: PRe, dtype: int32
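
A related variation, not taken from the answer itself, is to group the rows by their monthly period so that each iteration receives the period and that month's rows together. The sketch below assumes the same 3-weekly index as above but uses fixed values in place of random ones so the counts are reproducible.

```python
import numpy as np
import pandas as pd

# Same 3-weekly dates as the example above; values are fixed here.
rng = pd.date_range("2015-10-25", periods=10, freq="3W")
df = pd.DataFrame({"PRe": np.arange(10)}, index=rng)

# Group rows by calendar month; months without any rows are skipped.
counts = {}
for period, chunk in df.groupby(df.index.to_period("M")):
    counts[str(period)] = len(chunk)
    # period.strftime("%B %Y") would give a "Month Year" label for a
    # file name, and chunk.PRe.plot(...) would go here.

print(counts)
```

This works for any start and end date because the months come from the data, not from a hard-coded range of month numbers.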

Answer 1: (score: 0)

@jezrael's answer solved the problem; the solution is below for posterity.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("file.csv")
df.index = pd.to_datetime(df.Datetime) #DatetimeIndex, so index values are Timestamps

startDate = df.index[0] #seed the while loop, format Timestamp
while (startDate >= df.index[0]) and (startDate < df.index[-1]):
    fig, axes = plt.subplots(nrows=2,ncols=1, sharex=True, figsize =(18,10))

    stopDate = (startDate + pd.offsets.MonthBegin())#stopDate also Timestamp
    date_val = startDate.strftime("%B %Y")#Date as Month Year string

    k=0
    df.PRe[startDate:stopDate].plot(ax=axes[k])
    #ylim, xlim, title etc
    k=1
    df.PRp[startDate:stopDate].plot(ax=axes[k])
    #ylim, xlim, title etc
    plt.savefig("PRe and PRp in %s.png"%date_val,bbox_inches="tight")
    plt.close(fig) #release the figure before the next month is drawn
    startDate = stopDate
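
One detail of the loop above may be worth checking: label-based slices on a DatetimeIndex include both endpoints, so a row stamped exactly at stopDate would be drawn in two consecutive months. The sketch below uses a hypothetical helper, month_windows, and synthetic data to show the difference between the inclusive slice and a half-open mask. It also snaps the first window to the start of the month, which the original loop does not do.

```python
import pandas as pd


def month_windows(index):
    """Yield (start, stop) Timestamps, one pair per calendar month in index."""
    start = index[0].normalize().replace(day=1)  # first day of first month
    last = index[-1]
    while start <= last:
        stop = start + pd.offsets.MonthBegin()  # first day of next month
        yield start, stop
        start = stop


# Hypothetical daily series crossing a year boundary.
idx = pd.date_range("2016-11-01", "2017-02-01", freq="D")
s = pd.Series(range(len(idx)), index=idx)

windows = list(month_windows(s.index))
first_start, first_stop = windows[0]

# Label slicing keeps the row at first_stop; the mask leaves it out.
inclusive = s.loc[first_start:first_stop]
half_open = s[(s.index >= first_start) & (s.index < first_stop)]
print(len(windows), len(inclusive), len(half_open))
```

For monthly plots the extra point is usually harmless, but the half-open form avoids counting a boundary row twice if the slices are also used for statistics.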