How to calculate the number added in each period in pandas

Time: 2019-04-23 17:38:00

Tags: python pandas dataframe

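I have a df containing a summary of JIRA ticket statuses for each time period, with counts for OPEN, CLOSE and OTHER. I want to see how the number of tickets increases over time.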

period                              Status  Counts
No. 1 Apr 06 2019 to Apr 12 2019    CLOSE   1026
No. 1 Apr 06 2019 to Apr 12 2019    OPEN    2914
No. 1 Apr 06 2019 to Apr 12 2019    OTHER   264
No. 2 Mar 30 2019 to Apr 05 2019    CLOSE   1307
No. 2 Mar 30 2019 to Apr 05 2019    OPEN    2212
No. 2 Mar 30 2019 to Apr 05 2019    OTHER   256 

In period 1, the OPEN count increased from 2212 (period 2) to 2914, so 702 tickets were added in period 1. How can I add and display this as an extra column?

period                              Status  Counts   Added
No. 1 Apr 06 2019 to Apr 12 2019    CLOSE   1026     702 (2914-2212)
No. 1 Apr 06 2019 to Apr 12 2019    OPEN    2914     702 
No. 1 Apr 06 2019 to Apr 12 2019    OTHER   264      702 
No. 2 Mar 30 2019 to Apr 05 2019    CLOSE   1307     (2212 minus  xxx)
No. 2 Mar 30 2019 to Apr 05 2019    OPEN    2212     (2212 minus  xxx)
No. 2 Mar 30 2019 to Apr 05 2019    OTHER   256      (2212 minus  xxx)

3 answers:

Answer 0 (score: 2)

You can take the difference between the OPEN counts, then use transform('first') to fit those values back into the frame.

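# Difference between each OPEN count and the next OPEN count (the earlier period);
# rows with any other Status get NaN after index alignment.
u = df.assign(Added=df.loc[df.Status.eq('OPEN'), 'Counts'].diff(-1))

# Broadcast each period's first non-null Added value to all rows of that period.
u.assign(Added=u.groupby('period')['Added'].transform('first'))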

                             period Status  Counts  Added
0  No. 1 Apr 06 2019 to Apr 12 2019  CLOSE    1026  702.0
1  No. 1 Apr 06 2019 to Apr 12 2019   OPEN    2914  702.0
2  No. 1 Apr 06 2019 to Apr 12 2019  OTHER     264  702.0
3  No. 2 Mar 30 2019 to Apr 05 2019  CLOSE    1307    NaN
4  No. 2 Mar 30 2019 to Apr 05 2019   OPEN    2212    NaN
5  No. 2 Mar 30 2019 to Apr 05 2019  OTHER     256    NaN

Answer 1 (score: 0)

Use the diff() function, then use backward and forward fill to fill the NA values.
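A minimal sketch of what this could look like, using the sample data from the question. The filling is done per period here, which is an assumption on my part: a plain forward fill over the whole column would carry the 702 of period 1 into the rows of period 2.

```python
import pandas as pd

# Sample data copied from the question
p1 = 'No. 1 Apr 06 2019 to Apr 12 2019'
p2 = 'No. 2 Mar 30 2019 to Apr 05 2019'
df = pd.DataFrame({
    'period': [p1] * 3 + [p2] * 3,
    'Status': ['CLOSE', 'OPEN', 'OTHER'] * 2,
    'Counts': [1026, 2914, 264, 1307, 2212, 256],
})

# Keep only the OPEN counts and subtract the next (earlier) period's value
open_counts = df.loc[df['Status'].eq('OPEN'), 'Counts']
df['Added'] = open_counts.diff(-1)  # aligns on index; other rows become NaN

# Forward and backward fill inside each period (assumption: fill per group)
df['Added'] = df.groupby('period')['Added'].transform(
    lambda s: s.ffill().bfill()
)

print(df)
```

The earliest period has nothing to compare against, so its Added values stay NaN.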

Answer 2 (score: 0)

Start by defining the function that will be applied below:

def fn(src): return src.query("Status == 'OPEN'").Counts

Then apply this function to each period group:

df2 = df.groupby('period').apply(fn).diff(-1)\
    .fillna(0, downcast='infer')\
    .reset_index(level=1, drop=True).to_frame('Added')

This gives you a DataFrame, indexed by period, with an Added column.

The last step is to merge the two DataFrames (df and df2) on period.
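The merge itself could be sketched as follows. This is an illustration under assumptions, not necessarily the exact call the answer had in mind: df2 is rebuilt here with set_index instead of groupby.apply so the example runs on its own, and the last period is left as NaN rather than filled with 0.

```python
import pandas as pd

# Sample data copied from the question
p1 = 'No. 1 Apr 06 2019 to Apr 12 2019'
p2 = 'No. 2 Mar 30 2019 to Apr 05 2019'
df = pd.DataFrame({
    'period': [p1] * 3 + [p2] * 3,
    'Status': ['CLOSE', 'OPEN', 'OTHER'] * 2,
    'Counts': [1026, 2914, 264, 1307, 2212, 256],
})

# One OPEN count per period, indexed by period
open_per_period = df[df['Status'].eq('OPEN')].set_index('period')['Counts']

# Difference to the next row, i.e. the earlier period
df2 = open_per_period.diff(-1).to_frame('Added')

# Attach each period's Added value to every row of that period
result = df.merge(df2, left_on='period', right_index=True, how='left')

print(result)
```

Using how='left' keeps every row of df, including periods for which no difference could be computed.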