Comparing date ranges in pandas

Time: 2017-03-06 07:03:29

Tags: python pandas

I have a pandas.DataFrame column that looks like this:

0   2013-07-01 13:20:05.072029
1   2013-07-01 15:49:33.110849
2   2013-07-01 13:39:18.608330
Name: invite_sent_time, dtype: datetime64[ns]

Now I want to create another column, month, that is Jul if the date falls between 2013-07-01 and 2013-08-01, and Aug otherwise.

I did something like the following:

# Creating a column for month.
invites_combined["month"]=np.where(((invites_combined.invite_sent_time.dt.date >= pd.Timestamp('2013-07-01')) & \
                                   (invites_combined.invite_sent_time.dt.date < pd.Timestamp('2013-08-01'))),"July","Aug")

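But it says a date cannot be compared with a Timestamp. I can't put the dates directly in quotes, because then they are treated as strings.

So where am I going wrong?

1 answer:

Answer 0: (score: 2)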

You need to call date() on each Timestamp so that dates are compared with dates:

dates = invites_combined.invite_sent_time.dt.date
mask = (dates>=pd.Timestamp('2013-07-01').date()) & (dates<pd.Timestamp('2013-08-01').date())
invites_combined["month"] = np.where(mask,"July","Aug")
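For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the values in the question (with the last row moved to August so both labels appear), and the names start and end are just illustrative helpers, not part of the original code.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the question; the last row is moved to August.
invites_combined = pd.DataFrame({
    "invite_sent_time": pd.to_datetime([
        "2013-07-01 13:20:05.072029",
        "2013-07-01 15:49:33.110849",
        "2013-08-01 13:39:18.608330",
    ])
})

# .dt.date yields datetime.date objects, so the bounds must be dates too.
dates = invites_combined["invite_sent_time"].dt.date
start = pd.Timestamp("2013-07-01").date()  # hypothetical helper name
end = pd.Timestamp("2013-08-01").date()    # hypothetical helper name

# Half-open interval: start <= date < end.
mask = (dates >= start) & (dates < end)
invites_combined["month"] = np.where(mask, "July", "Aug")
print(invites_combined)
```

Both sides of each comparison are plain datetime.date values here, which is what avoids the type mismatch from the question.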

Or use between, which works on the datetime column directly and includes both endpoints by default:

mask = invites_combined.invite_sent_time.between('2013-07-01', '2013-08-01')
invites_combined["month"] = np.where(mask ,"July","Aug")
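One thing worth checking with between is how the upper bound is treated. The sketch below uses a small made-up Series (not the asker's data) to compare between against an explicit half-open comparison; the variable names are only illustrative.

```python
import pandas as pd

# Made-up timestamps, including one exactly at midnight on the upper bound.
times = pd.Series(pd.to_datetime([
    "2013-07-01 13:20:05",
    "2013-08-01 00:00:00",
    "2013-08-01 13:39:18",
]))

# between() keeps both endpoints by default: left <= value <= right.
inclusive_mask = times.between("2013-07-01", "2013-08-01")

# Explicit comparison that excludes the upper bound: left <= value < right.
half_open_mask = (times >= "2013-07-01") & (times < "2013-08-01")

print(inclusive_mask.tolist())  # [True, True, False]
print(half_open_mask.tolist())  # [True, False, False]
```

The two masks differ only for a timestamp that sits exactly on 2013-08-01 00:00:00, so for data like the question's they should give the same labels. Note also that date strings are accepted here because the column is datetime64, not a column of date objects.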

But a better and more general approach is to use strftime, which derives the month abbreviation from each timestamp:

invites_combined["month"] = invites_combined.invite_sent_time.dt.strftime('%b')

Sample:

print (invites_combined)
            invite_sent_time
0 2013-07-01 13:20:05.072029
1 2013-07-01 15:49:33.110849
2 2013-08-01 13:39:18.608330 <-last date was changed to August

invites_combined["month"] = invites_combined.invite_sent_time.dt.strftime('%b')
print (invites_combined)
            invite_sent_time month
0 2013-07-01 13:20:05.072029   Jul
1 2013-07-01 15:49:33.110849   Jul
2 2013-08-01 13:39:18.608330   Aug