My dataframe is:
date,site,country_code,kind,ID,rank,votes,sessions,avg_score,count
2017-03-20,website1,US,0,84,226,0.0,15.0,3.370812,53.0
2017-03-21,website1,US,0,84,214,0.0,15.0,3.370812,53.0
2017-03-22,website1,US,0,84,226,0.0,16.0,3.370812,53.0
2017-03-23,website1,US,0,84,234,0.0,16.0,3.369048,54.0
2017-03-24,website1,US,0,84,226,0.0,16.0,3.369048,54.0
2017-03-25,website1,US,0,84,212,0.0,16.0,3.369048,54.0
2017-03-27,website1,US,0,84,228,0.0,16.0,3.369048,58.0
2017-02-15,website2,AU,1,91,144,4.0,148.0,4.727272,521.0
2017-02-16,website2,AU,1,91,144,3.0,147.0,4.727272,524.0
2017-02-20,website2,AU,1,91,100,4.0,148.0,4.727272,531.0
2017-02-21,website2,AU,1,91,118,6.0,149.0,4.727272,533.0
2017-02-22,website2,AU,1,91,114,4.0,151.0,4.727272,534.0
I want to find the difference between each date after grouping by the date+site+country+kind+ID tuples:
date,site,country_code,kind,ID,rank,votes,sessions,avg_score,count,day_diff
2017-03-20,website1,US,0,84,226,0.0,15.0,3.370812,0,0
2017-03-21,website1,US,0,84,214,0.0,15.0,3.370812,0,1
2017-03-22,website1,US,0,84,226,0.0,16.0,3.370812,0,1
2017-03-23,website1,US,0,84,234,0.0,16.0,3.369048,0,1
2017-03-24,website1,US,0,84,226,0.0,16.0,3.369048,0,1
2017-03-25,website1,US,0,84,212,0.0,16.0,3.369048,0,1
2017-03-27,website1,US,0,84,228,0.0,16.0,3.369048,4,2
2017-02-15,website2,AU,1,91,144,4.0,148.0,4.727272,0,0
2017-02-16,website2,AU,1,91,144,3.0,147.0,4.727272,3,1
2017-02-20,website2,AU,1,91,100,4.0,148.0,4.727272,7,4
2017-02-21,website2,AU,1,91,118,6.0,149.0,4.727272,3,1
2017-02-22,website2,AU,1,91,114,4.0,151.0,4.727272,1,1
This is AfterSelect, but now I need to find the difference between dates stored as 'YYYY-MM-DD'. Basically, the differences between values in the count column are what we need, but normalized by the number of days between each row.
One option is to convert the date column to a pandas datetime using pd.to_datetime() and use the diff function, but that yields values such as "x days", of type timedelta64. I'd like to use this difference to find the daily average, so if this can be done in one (or a less painful) step, that would work well.
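To illustrate the behaviour described above, here is a minimal sketch using a hypothetical three-row frame (the rows are made up for the example, not taken from the real data). It shows that diff on a datetime column returns Timedelta values rather than plain numbers:

```python
import pandas as pd

# Hypothetical sample; only the date column matters for this illustration.
df = pd.DataFrame({"date": ["2017-03-25", "2017-03-27", "2017-03-28"]})
df["date"] = pd.to_datetime(df["date"])

# diff() on a datetime column yields Timedelta values ("2 days", "1 days"),
# with NaT in the first row because it has no predecessor.
delta = df["date"].diff()
print(delta.dtype)  # a timedelta64 dtype
print(delta)
```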
Answer 0 (score: 2)
You can use the .dt.days accessor:
In [72]: df['date'] = pd.to_datetime(df['date'])
In [73]: df['day_diff'] = df.groupby(['site','country_code','kind','ID'])['date'] \
         .diff().dt.days.fillna(0)
In [74]: df
Out[74]:
          date      site  country_code  kind  ID  rank  votes  sessions  avg_score  count  day_diff
 0  2017-03-20  website1            US     0  84   226    0.0      15.0   3.370812   53.0       0.0
 1  2017-03-21  website1            US     0  84   214    0.0      15.0   3.370812   53.0       1.0
 2  2017-03-22  website1            US     0  84   226    0.0      16.0   3.370812   53.0       1.0
 3  2017-03-23  website1            US     0  84   234    0.0      16.0   3.369048   54.0       1.0
 4  2017-03-24  website1            US     0  84   226    0.0      16.0   3.369048   54.0       1.0
 5  2017-03-25  website1            US     0  84   212    0.0      16.0   3.369048   54.0       1.0
 6  2017-03-27  website1            US     0  84   228    0.0      16.0   3.369048   58.0       2.0
 7  2017-02-15  website2            AU     1  91   144    4.0     148.0   4.727272  521.0       0.0
 8  2017-02-16  website2            AU     1  91   144    3.0     147.0   4.727272  524.0       1.0
 9  2017-02-20  website2            AU     1  91   100    4.0     148.0   4.727272  531.0       4.0
10  2017-02-21  website2            AU     1  91   118    6.0     149.0   4.727272  533.0       1.0
11  2017-02-22  website2            AU     1  91   114    4.0     151.0   4.727272  534.0       1.0
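
If the goal is the daily average mentioned in the question, one possible follow-up is to diff the count column within the same groups and divide by the day gap. This is only a sketch: it uses a hypothetical subset of the question's rows, assumes the rows are already sorted by date within each group, and the column names count_diff and count_per_day are made up for illustration.

```python
import pandas as pd

# Hypothetical subset of the question's data (two groups, sorted by date).
df = pd.DataFrame({
    "date": ["2017-03-25", "2017-03-27", "2017-02-16", "2017-02-20", "2017-02-21"],
    "site": ["website1", "website1", "website2", "website2", "website2"],
    "country_code": ["US", "US", "AU", "AU", "AU"],
    "kind": [0, 0, 1, 1, 1],
    "ID": [84, 84, 91, 91, 91],
    "count": [54.0, 58.0, 524.0, 531.0, 533.0],
})
df["date"] = pd.to_datetime(df["date"])

keys = ["site", "country_code", "kind", "ID"]

# Day gap and change in count relative to the previous row of the same group.
df["day_diff"] = df.groupby(keys)["date"].diff().dt.days
df["count_diff"] = df.groupby(keys)["count"].diff()

# Change in count per elapsed day; the first row of each group has no
# predecessor, so it is left as NaN here instead of being filled with 0.
df["count_per_day"] = df["count_diff"] / df["day_diff"]
print(df[["date", "site", "day_diff", "count_diff", "count_per_day"]])
```

Leaving the first row of each group as NaN avoids a 0/0 division; whether to fill it with 0 afterwards depends on how the averages are used.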