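Compute a column value based on previous rows

Time: 2017-07-08 15:33:55

Tags: python pandas numpy dataframe

I want to add a new column Y to each row that tells me the percentage of times the value in column X was greater than 1 over the previous 10 records.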

stock price history

   ticker       date    adj_open   ad_close       X(%) 
0    ABC     2017-10-06   12.10      13.11        8.0
1    ABC     2017-12-05   11.11      11.87        5.0
2    ABC     2017-12-04   12.08      11.40       -7.0
3    ABC     2017-12-03   12.01      13.03       10.1
4    ABC     2017-07-04   9.01        9.59        8.0
5    ABC     2017-07-03   7.89        8.19        4.0
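
For anyone who wants to reproduce the question, the sample above could be built roughly as follows. This is only a sketch: the dates are kept as plain strings, which is an assumption, since the question does not say how the date column is stored.

```python
import pandas as pd

# Sample data copied from the table above; column names are kept exactly as posted.
df = pd.DataFrame({
    "ticker": ["ABC"] * 6,
    "date": ["2017-10-06", "2017-12-05", "2017-12-04",
             "2017-12-03", "2017-07-04", "2017-07-03"],  # assumed to be strings
    "adj_open": [12.10, 11.11, 12.08, 12.01, 9.01, 7.89],
    "ad_close": [13.11, 11.87, 11.40, 13.03, 9.59, 8.19],
    "X(%)": [8.0, 5.0, -7.0, 10.1, 8.0, 4.0],
})

print(df)
```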

Resultant transformed data set

    ticker       date    adj_open ad_close    X(%)     Y(%)    
0    ABC     2017-10-06   12.10    13.11      8.0        80
1    ABC     2017-12-05   11.11    11.87      5.0        75
2    ABC     2017-12-04   12.08    11.40     -7.0       100
3    ABC     2017-12-03   12.01    13.03     10.1       100
4    ABC     2017-07-04   9.01     9.59       8.0       100
5    ABC     2017-07-03   7.89     8.19       4.0        0

2 Answers:

Answer 0 (score: 0)

Try a simple for loop with try/except. This is based on your example output, so adjust n to suit your data.

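n = 5  # number of following rows to look at (matches your example)
df['boolean'] = df['X(%)'] > 1
A = []
for i in range(len(df)):
    following = df['boolean'].iloc[i+1:i+n+1]  # up to n rows after row i
    try:
        A.append(sum(following) / len(following))
    except ZeroDivisionError:  # the last row has no following rows
        A.append(0)

df['Y(%)'] = A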


df

     ticker       date  adj_open  ad_close   X(%) boolean Y(%)
   0    ABC  10/6/2017     12.10     13.11   8.0   True  0.80
   1    ABC  12/5/2017     11.11     11.87   5.0   True  0.75
   2    ABC  12/4/2017     12.08     11.40  -7.0  False  1.00
   3    ABC  12/3/2017     12.01     13.03  10.1   True  1.00
   4    ABC   7/4/2017      9.01      9.59   8.0   True  1.00
   5    ABC   7/3/2017      7.89      8.19   4.0   True  0.00
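
The same look-ahead share can probably be computed without an explicit loop. The sketch below is not part of the answer above. It assumes the same n = 5 and the same rule (rows with nothing after them get 0), and it rebuilds only the X(%) column for brevity. The idea is to reverse the series so that "following rows" become "previous rows", take a rolling mean, shift by one to exclude the current row, and reverse back.

```python
import pandas as pd

df = pd.DataFrame({"X(%)": [8.0, 5.0, -7.0, 10.1, 8.0, 4.0]})
n = 5  # same look-ahead size as in the loop version

# 1.0 where X(%) > 1, else 0.0
flag = (df["X(%)"] > 1).astype(float)

# Reverse, roll over up to n rows, shift so the current row is excluded,
# then reverse back to the original order.
reversed_flag = flag.iloc[::-1]
share = (
    reversed_flag.rolling(window=n, min_periods=1)
    .mean()
    .shift(1)
    .iloc[::-1]
)

# The last row has no following rows, so it comes out as NaN; use 0 as the loop does.
df["Y(%)"] = share.fillna(0)
print(df)
```

On the six sample rows this should reproduce the fractions from the loop version, up to floating-point rounding.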

Answer 1 (score: 0)

You have:

df
    ticker  date        adj_open    ad_close    X(%)
0   ABC     2017-10-06  12.10       13.11       8.0
1   ABC     2017-12-05  11.11       11.87       5.0
2   ABC     2017-12-04  12.08       11.40      -7.0
3   ABC     2017-12-03  12.01       13.03      10.1
4   ABC     2017-07-04  9.01        9.59        8.0
5   ABC     2017-07-03  7.89        8.19        4.0

Let's define the window w and a function that computes the desired quantity:

import numpy as np

w = 2
def count_pcnt(x, window = w):
    # percentage of values in the window that are greater than 1
    return (np.sum(x>1)/window)*100.0

Finally, let's apply this function:

df["Y(%)"] = df["X(%)"].rolling(window=w).apply(count_pcnt)

df
    ticker  date        adj_open    ad_close    X(%)    Y(%)
0   ABC     2017-10-06  12.10       13.11       8.0     NaN
1   ABC     2017-12-05  11.11       11.87       5.0     100.0
2   ABC     2017-12-04  12.08       11.40      -7.0     50.0
3   ABC     2017-12-03  12.01       13.03      10.1     50.0
4   ABC     2017-07-04  9.01        9.59        8.0     100.0
5   ABC     2017-07-03  7.89        8.19        4.0     100.0

You can change w to 10 once you have more data.

Edit

If you prefer:

Edit 2

w=4
df["Y(%)"] = df["X(%)"].rolling(window=w).apply(lambda x: count_pcnt(x, window = w))

df
    ticker  date        adj_open    ad_close    X(%)    Y(%)
0   ABC     2017-10-06  12.10       13.11       8.0     NaN
1   ABC     2017-12-05  11.11       11.87       5.0     NaN
2   ABC     2017-12-04  12.08       11.40      -7.0     NaN
3   ABC     2017-12-03  12.01       13.03      10.1     75.0
4   ABC     2017-07-04  9.01        9.59        8.0     75.0
5   ABC     2017-07-03  7.89        8.19        4.0     75.0
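
One likely reason for passing window explicitly through the lambda: in Python, a default argument value is evaluated once, when the function is defined. If count_pcnt was defined while w was 2, changing w to 4 afterwards does not change the default. The small sketch below illustrates this with the four X(%) values from rows 0 to 3. The names stale and explicit are only for illustration.

```python
import numpy as np

w = 2

def count_pcnt(x, window=w):
    # The default for window is fixed at definition time (w == 2 here).
    return (np.sum(x > 1) / window) * 100.0

w = 4  # rebinding w later does not update the stored default
x = np.array([8.0, 5.0, -7.0, 10.1])  # three of the four values exceed 1

stale = count_pcnt(x)               # still divides by 2 -> 150.0
explicit = count_pcnt(x, window=w)  # divides by 4 -> 75.0
print(stale, explicit)
```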

Edit 3

w=4 # specify the desired window
df["Y(%)"] = df["X(%)"].rolling(window=w).apply(lambda x: (np.sum(x>1)/x.shape[0])* 100.0)
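
If the NaN values in the first w-1 rows are unwanted, one possible variation (not from the answer above) is to let the window shrink at the start with min_periods=1. The sketch below assumes that behaviour is acceptable. It also uses the fact that the mean of a 0/1 indicator equals the share of values above 1, which is the same quantity as np.sum(x>1)/x.shape[0].

```python
import pandas as pd

df = pd.DataFrame({"X(%)": [8.0, 5.0, -7.0, 10.1, 8.0, 4.0]})
w = 4  # desired window

# 1.0 where X(%) > 1, else 0.0
above = (df["X(%)"] > 1).astype(float)

# min_periods=1 lets the first rows use however many values are available,
# so no row is left as NaN.
df["Y(%)"] = above.rolling(window=w, min_periods=1).mean() * 100.0
print(df)
```

Rows 3 to 5 should match the 75.0 values shown earlier, while rows 0 to 2 are computed over 1, 2 and 3 values respectively.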