How to multiply all values in a column for a specific year in pandas

Asked: 2018-03-11 10:27:45

Tags: python pandas

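I am trying to multiply all the values that fall in a specific year and push the result into another column. With the code below I get this error: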

TypeError: ("'NoneType' object is not callable", 'occurred at index

When I use shift(1), I get NaT and NaN. How can I make this work?

def check_date():
    next_row = df.Date.shift(1)
    first_row = df.Date

    date1 = pd.to_datetime(first_row).year
    date2 = pd.to_datetime(next_row).year

    if date1 == date2:
        df['all_data_in_year'] = date1 * date2


df.apply(check_date(), axis=1)

Dataset:

Date      Open    High    Low     Last   Close   Total Trade Quantity  Turnover (Lacs)
31/12/10  816     824.5   807.3   815    818.45  1165987               9529.64
31/01/11  675     680     654     670.1  669.35  535039                3553.92
28/02/11  550     561.6   542     548.5  548.4   749166                4136.09
31/03/11  621.5   624.7   607.1   618    616.25  628572                3866
29/04/11  654.7   657.95  626     631    632.05  833213                5338.91
31/05/11  575     590     565.6   589.3  585.15  908185                5239.36
30/06/11  527     530.7   521.3   524    524.6   534496                2804.89
29/07/11  496.95  502.9   486     486.2  489.7   500743                2477.96
30/08/11  365.95  382.7   365     380    376.65  844439                3171.6
30/09/11  362.4   365.9   348.1   352    352.75  617537                2196.56
31/10/11  430     439.5   425     429.1  431.2   1033903               4493.97
30/11/11  349.05  354.95  344.15  348    350     686735                2404.1
30/12/11  353     355.9   340.1   340.1  342.75  740222                2565.39
31/01/12  443     451.45  428     445.5  446     1344942               5952.77
29/02/12  485.55  505.9   484     497    495.1   1011007               5004.46
30/03/12  421     436.45  418.4   432.5  432.95  867832                3740.04
30/04/12  410.35  419.4   406.85  414.3  414.05  418539                1733.81
31/05/12  362     363.05  351.2   359    358.3   840753                3000.41
29/06/12  385.05  395.3   382.9   388    389.75  1171690               4581.58
31/07/12  377.75  386     367.7   380.5  381.35  499246                1886.06
31/08/12  473.7   473.7   394.25  399    400.85  631225                2544.24

1 Answer:

Answer 0 (score: 1)

I think it is better to avoid loops (apply is a loop under the hood) and use numpy.where:

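import numpy as np
import pandas as pd

# sample DataFrame with sample datetimes
# ('8M' = every 8 months, month-end; newer pandas versions spell this '8ME')
rng = pd.date_range('2017-04-03', periods=10, freq='8M')
df = pd.DataFrame({'Date': rng, 'a': range(10)})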

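# year of the previous row (NaN for the first row) and of the current row
date1 = df.Date.shift(1).dt.year
date2 = df.Date.dt.year

# where both years match, store their product; otherwise NaN
df['all_data_in_year'] = np.where(date1 == date2, date1 * date2, np.nan)
print(df)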
        Date  a  all_data_in_year
0 2017-04-30  0               NaN
1 2017-12-31  1         4068289.0
2 2018-08-31  2               NaN
3 2019-04-30  3               NaN
4 2019-12-31  4         4076361.0
5 2020-08-31  5               NaN
6 2021-04-30  6               NaN
7 2021-12-31  7         4084441.0
8 2022-08-31  8               NaN
9 2023-04-30  9               NaN
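
For context on the two symptoms in the question, here is a minimal sketch. The three-row frame and the helper `row_year` are made up for illustration, and `check_date` is reduced to a stand-in with no return statement. Writing `df.apply(check_date(), axis=1)` calls the function first and hands its return value (`None`) to `apply`, which is the likely source of the "'NoneType' object is not callable" message. Separately, `shift(1)` has nothing to put in the first row, so that row becomes NaT and its year becomes NaN.

```python
import pandas as pd

# Hypothetical three-row frame, only for illustration
df = pd.DataFrame(
    {"Date": pd.to_datetime(["2010-12-31", "2011-01-31", "2011-02-28"])}
)


def check_date():
    # Stand-in for the question's function: it has no return statement
    pass


# check_date() is evaluated first, so apply would receive its return value
passed_to_apply = check_date()
print(passed_to_apply is None)  # True


# Passing the function object itself (no parentheses) lets apply call it per row
def row_year(row):
    return row["Date"].year


years = df.apply(row_year, axis=1)
print(years.tolist())

# shift(1) has no previous value for the first row, so it fills in NaT,
# and the year of NaT comes out as NaN
prev_year = df["Date"].shift(1).dt.year
print(prev_year.tolist())
```

The NaN in the first row is why the first row of `all_data_in_year` above is NaN as well: a comparison with NaN is never true, so `np.where` falls through to its default.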

EDIT1: To multiply all the Close values within each year, group by the year and use transform('prod'):

df['new'] = df.groupby(pd.to_datetime(df['Date']).dt.year)['Close'].transform('prod')
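
As a rough illustration of the groupby line, the sketch below runs it on a handful of Date/Close values copied from the question's dataset. The explicit `format="%d/%m/%y"` is an assumption about how the dates are written (day/month/two-digit year); it is not part of the original answer, and the `per_year` variable is only there for comparison.

```python
import pandas as pd

# A few rows copied from the question's dataset (Date and Close only)
df = pd.DataFrame(
    {
        "Date": ["31/12/10", "31/01/11", "28/02/11", "31/03/11", "31/01/12", "29/02/12"],
        "Close": [818.45, 669.35, 548.4, 616.25, 446.0, 495.1],
    }
)

# Assumption: dates are day/month/two-digit-year
year = pd.to_datetime(df["Date"], format="%d/%m/%y").dt.year

# transform('prod') broadcasts each year's product back to every row of that year
df["new"] = df.groupby(year)["Close"].transform("prod")

# prod() on its own returns one value per year instead
per_year = df.groupby(year)["Close"].prod()

print(df)
print(per_year)
```

`transform` keeps the original row count, which is what allows the result to be assigned straight back as a new column; plain `prod()` is the better fit when only one number per year is needed.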