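Is there a way to vectorize this routine in pandas?

Time: 2013-10-20 01:44:39

Tags: python numpy pandas

I have a simple stock portfolio simulation that I am trying to model, but despite a few attempts I can't find a way to vectorize it. Maybe it isn't possible, but I wanted to see whether anyone has any ideas.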

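My sticking point is that the shares held on a given day are a function of the account value and the stock price from two days earlier. The account value on a given day, however, is a function of the previous day's account value, today's share count, and the change in the stock price. So shares and account value depend on each other in a back-and-forth way that I can't see how to vectorize, which is why my only solution so far is the for loop below.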
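Written out as a recurrence, the dependency described above looks roughly like this. The symbols are shorthand introduced here, not names from the code: A is 'Acct Val', S is 'Shares', P is 'Stock Px', W is 'Wgt', and D is 'Daily PNL'.

```latex
\begin{aligned}
S_t &= \begin{cases}
         \dfrac{A_{t-2}\, W_t}{P_{t-2}} & \text{if } W_t \neq W_{t-1} \\[6pt]
         S_{t-1} & \text{otherwise}
       \end{cases} \\
D_t &= S_t \,(P_t - P_{t-1}) \\
A_t &= A_{t-1} + D_t
\end{aligned}
\qquad t \ge 2,\quad A_0 = A_1 = 10000,\quad S_0 = S_1 = 0
```

Because S_t needs A_{t-2} and A_t needs S_t, each row depends on results computed for earlier rows, which is what blocks a simple column-wise expression.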

Thanks in advance!

import pandas as pd
import numpy as np
stats = pd.DataFrame(index = range(0,10))

stats['Acct Val'] = 0.0
stats['Shares'] = 0.0
stats['Stock Px'] = pd.Series([23,25,24,26,22,23,25,25,26,24],index=stats.index)
# Wgt is the percentage of the account value that should be invested in the stock on a given day
stats['Wgt'] = pd.Series([0.5,0.5,0.5,0.5,0.3,0.4,0.4,0.2,0.2,0.0,],index=stats.index)
stats['Daily PNL'] = 0.0
# Start the account value at $10,000.00
stats.loc[0:1, 'Acct Val'] = 10000.0
stats.loc[0:1, 'Wgt'] = 0
for date_loc in range(2, len(stats.index)):
    # Keep shares the same unless 'wgt' column changes
    if stats.at[date_loc,'Wgt'] != stats.at[date_loc-1,'Wgt']:
        # Rebalanced shares are based on the acct value and stock price two days before
        stats.at[date_loc,'Shares'] = stats.at[date_loc-2,'Acct Val'] * stats.at[date_loc,'Wgt'] / stats.at[date_loc-2,'Stock Px']
    else:
        stats.at[date_loc,'Shares'] = stats.at[date_loc-1,'Shares']
    # Daily PNL is simply the shares owned on a day times the change in stock price from the previous day
    stats.at[date_loc,'Daily PNL'] = stats.at[date_loc,'Shares'] * (stats.at[date_loc,'Stock Px'] - stats.at[date_loc-1,'Stock Px'])
    # Acct value is yesterday's acct value plus today's PNL
    stats.at[date_loc,'Acct Val'] = stats.at[date_loc-1,'Acct Val'] + stats.at[date_loc,'Daily PNL']


In [44]: stats
Out[44]:
       Acct Val      Shares  Stock Px  Wgt   Daily PNL
0  10000.000000    0.000000        23  0.0    0.000000
1  10000.000000    0.000000        25  0.0    0.000000
2   9782.608696  217.391304        24  0.5 -217.391304
3  10217.391304  217.391304        26  0.5  434.782609
4   9728.260870  122.282609        22  0.3 -489.130435
5   9885.451505  157.190635        23  0.4  157.190635
6  10199.832776  157.190635        25  0.4  314.381271
7  10199.832776   85.960448        25  0.2    0.000000
8  10285.793224   85.960448        26  0.2   85.960448
9  10285.793224    0.000000        24  0.0   -0.000000


Edit, October 19, 2013, 11:01 PM:

I tried to use foobarbecue's code, but I couldn't get it to work:

import pandas as pd
import numpy as np
stats = pd.DataFrame(index = range(0,10))
stats['Acct Val'] = 10000.0
stats['Shares'] = 0.0
stats['Stock Px'] = pd.Series([23,25,24,26,22,23,25,25,26,24],index=stats.index)
# Wgt is the percentage of the account value that should be invested in the stock on a given day
stats['Wgt'] = pd.Series([0.5,0.5,0.5,0.5,0.3,0.4,0.4,0.2,0.2,0.0,],index=stats.index)
stats['Daily PNL'] = 0.0
# Start the account value at $10,000.00
#stats.loc[0:1, 'Acct Val'] = 10000.0
stats.loc[0:1, 'Wgt'] = 0

def function1(df_row):
    #[stuff you want to do when Wgt changed]
    df_row['Shares'] = df_row['Acct Val'] * df_row['Wgt2ahead'] / df_row['Stock Px']
    return df_row

def function2(df_row):
    #[stuff you want to do when Wgt did not change]
    df_row['Shares'] = df_row['SharesPrevious']
    return df_row

#Find where the Wgt column changes
stats['WgtChanged'] = stats.Wgt.diff() != 0 # changed ">" to "!=" (the Python 2 "<>" operator no longer exists)
#Using boolean indexing, choose all rows where Wgt changed and apply a function
stats['Wgt2ahead'] = stats['Wgt'].shift(-2)
stats = stats.apply(lambda df_row: function1(df_row) if df_row['WgtChanged'] == True else df_row, axis=1)
stats['Shares'] = stats['Shares'].shift(2)
#Likewise, for rows where Wgt did not change
stats['SharesPrevious'] = stats['Shares'].shift(1)
stats = stats.apply(lambda df_row: function2(df_row) if df_row['WgtChanged'] == False else df_row, axis=1)

1 Answer:

Answer 0: (score: 0)

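def function1(df_row):
    # [stuff you want to do when Wgt changed]
    return df_row

def function2(df_row):
    # [stuff you want to do when Wgt did not change]
    return df_row

#Find where the Wgt column changes
#(note: "> 0" flags only increases in Wgt; "!= 0" would also catch decreases)
stats['WgtChanged'] = stats.Wgt.diff() > 0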
#Using boolean indexing, choose all rows where Wgt changed and apply a function
stats[stats['WgtChanged']].apply(function1, axis=1)
#Likewise, for rows where Wgt did not change
stats[~stats['WgtChanged']].apply(function2, axis=1)
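
A fully vectorized form may not exist here, because each rebalance uses account values produced by the previous one. One possible compromise, offered as a sketch rather than a definitive solution, is to loop only over the rows where Wgt changes and fill each stretch of constant shares with NumPy slices. The function name simulate_by_segment and its parameters are hypothetical. The inputs mirror the question's 'Stock Px' and 'Wgt' columns, with the first two weights already set to 0 as in the question.

```python
import numpy as np

def simulate_by_segment(px, wgt, start_val=10000.0):
    """Hypothetical helper: loop over rebalance rows only, not every row."""
    px = np.asarray(px, dtype=float)
    wgt = np.asarray(wgt, dtype=float)
    n = len(px)
    acct = np.full(n, start_val)   # account value, flat until the first rebalance
    shares = np.zeros(n)           # no shares held until the first rebalance

    # Rebalance rows: t >= 2 where the weight differs from the previous row
    changed = np.flatnonzero(wgt[2:] != wgt[1:-1]) + 2
    bounds = np.append(changed, n)

    for s, e in zip(bounds[:-1], bounds[1:]):
        # Shares are set from the account value and price two rows back
        shares[s:e] = acct[s - 2] * wgt[s] / px[s - 2]
        # With constant shares, the account value tracks the price
        # relative to the row just before the segment starts
        acct[s:e] = acct[s - 1] + shares[s] * (px[s:e] - px[s - 1])

    # Daily PNL is the row-to-row change in account value
    pnl = np.zeros(n)
    pnl[1:] = np.diff(acct)
    return acct, shares, pnl

# Data from the question (first two weights zeroed, as in the original setup)
px = [23, 25, 24, 26, 22, 23, 25, 25, 26, 24]
wgt = [0.0, 0.0, 0.5, 0.5, 0.3, 0.4, 0.4, 0.2, 0.2, 0.0]
acct, shares, pnl = simulate_by_segment(px, wgt)
```

The loop runs once per weight change, so this only helps when rebalances are infrequent relative to the number of rows. With the sample data above, half of the rows are rebalances, so the gain would be small.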