I have the following data:
https://docs.google.com/spreadsheets/d/15Dg0JYXoQyqIVokrVoSJOBogJw_bDCY-IoBGtleOlm8/edit?usp=sharing
I need to compute the Pct_Change_Adjusted column in pandas:
Pct_Change_Adjusted = ((Value[1] + Dividend[1]) / Value[0] - 1)
For example, for rows #3, #4 and #5 (Google Sheet), the data are:
2019-01-02 9072 A 1020.0000 0.0000 0.0200 0.0200 9072A
2019-01-03 9072 A 1040.4000 0.0000 0.0200 0.0200 9072A
2019-01-04 9072 A 1009.1880 52.0200 -0.0300 0.0200 9072A
Pct_Change_Adjusted (row 4) = ((1040.4000 + 0.0000) / 1020.0000 - 1) = 0.0200
Pct_Change_Adjusted (row 5) = ((1009.1880 + 52.0200) / 1040.4000 - 1) = 0.0200
Is there a quick way to do this with pct_change (rather than looping over the data with conditions)?
So far, my code for Pct_Change is:
df.groupby(df[6])[3].pct_change(1)
Thanks!
Answer 0 (score: 1)
IIUC, you can most likely do the following:
df['Pct_Change_Adjusted'] = df.groupby(['Fund_ID', 'Fund_Series'], as_index=False) \
.apply(lambda x: (x.Value + x.Dividend)/x.Value.shift()-1) \
.reset_index(level=0, drop=True)
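A possible variant that avoids `apply` altogether: a minimal sketch, assuming the columns are named `Fund_ID`, `Fund_Series`, `Value` and `Dividend` as in the snippet above, and that rows are already sorted by date within each fund. The 9072/A rows come from the question; the 1111/B rows are made up purely to show that groups do not leak into each other.

```python
import pandas as pd

# 9072/A rows are from the question; 1111/B rows are hypothetical.
df = pd.DataFrame({
    "Date": ["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-02", "2019-01-03"],
    "Fund_ID": [9072, 9072, 9072, 1111, 1111],
    "Fund_Series": ["A", "A", "A", "B", "B"],
    "Value": [1020.0, 1040.4, 1009.188, 100.0, 99.0],
    "Dividend": [0.0, 0.0, 52.02, 0.0, 2.0],
})

# Previous Value within each (Fund_ID, Fund_Series) group.
# The first row of every group has no predecessor, so it becomes NaN.
prev_value = df.groupby(["Fund_ID", "Fund_Series"])["Value"].shift()

# Column arithmetic aligns on the original index, so no reset_index is needed.
df["Pct_Change_Adjusted"] = (df["Value"] + df["Dividend"]) / prev_value - 1
print(df)
```

Because the shifted series keeps the original index, this form should behave the same whether the frame holds one fund or many.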
Answer 1 (score: 0)
The same thing, but in more detail:
import numpy as np
import pandas as pd
import io
s = '''
Date Fund_ID Fund_Series Value Dividend
2019-01-02 9072 A 1020.0000 0.0000
2019-01-03 9072 A 1040.4000 0.0000
2019-01-04 9072 A 1009.1880 52.0200
'''
# Split on runs of whitespace; the raw string avoids an invalid-escape warning.
df = pd.read_csv(io.StringIO(s), sep=r'\s+')
print(df)
Date Fund_ID Fund_Series Value Dividend
0 2019-01-02 9072 A 1020.000 0.00
1 2019-01-03 9072 A 1040.400 0.00
2 2019-01-04 9072 A 1009.188 52.02
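# Note: .values[0] picks the first row of the apply result. This sample has
# only one (Fund_ID, Fund_Series) group, so this is not a general multi-group form.
df['Pct_Change_Adjusted'] = df.groupby(['Fund_ID', 'Fund_Series'], as_index=False) \
.apply(lambda x: (x.Value + x.Dividend)/x.Value.shift()-1) \
.reset_index(drop=True).values[0]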
print(df)
Date Fund_ID Fund_Series Value Dividend Pct_Change_Adjusted
0 2019-01-02 9072 A 1020.000 0.00 NaN
1 2019-01-03 9072 A 1040.400 0.00 0.02
2 2019-01-04 9072 A 1009.188 52.02 0.02
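On the original question of whether `pct_change` itself can be used: the adjusted figure splits into the plain percentage change plus the dividend divided by the previous value, since (Value[1] + Dividend[1]) / Value[0] - 1 = (Value[1] / Value[0] - 1) + Dividend[1] / Value[0]. The following is a minimal sketch of that identity, assuming the same column names as above and rows sorted by date within each fund; the intermediate `Pct_Change` column is only there for illustration.

```python
import pandas as pd

# The three rows quoted in the question.
df = pd.DataFrame({
    "Date": ["2019-01-02", "2019-01-03", "2019-01-04"],
    "Fund_ID": [9072, 9072, 9072],
    "Fund_Series": ["A", "A", "A"],
    "Value": [1020.0, 1040.4, 1009.188],
    "Dividend": [0.0, 0.0, 52.02],
})

grouped_value = df.groupby(["Fund_ID", "Fund_Series"])["Value"]

# Plain per-group percentage change of Value (ignores dividends).
df["Pct_Change"] = grouped_value.pct_change()

# Add the dividend yield relative to the previous Value in the same group.
df["Pct_Change_Adjusted"] = df["Pct_Change"] + df["Dividend"] / grouped_value.shift()
print(df)
```

For the third row this gives -0.03 + 52.02 / 1040.4 = -0.03 + 0.05 = 0.02, matching the value expected in the question.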