SciPy portfolio optimization with industry-level constraints

Time: 2017-06-13 08:15:34

Tags: python pandas optimization scipy portfolio

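I'm trying to optimize a portfolio weight allocation that maximizes my return function subject to a risk limit. I have no problem finding the optimal weights under simple constraints: the weights must sum to 1, and my total risk must stay below a target risk.

My question is: how can I add industry weight bounds for each group?

My code is below: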

# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import scipy.optimize as sco

dates = pd.date_range('1/1/2000', periods=8)
industry = ['industry', 'industry', 'utility', 'utility', 'consumer']
symbols = ['A', 'B', 'C', 'D', 'E']  
zipped = list(zip(industry, symbols))
index = pd.MultiIndex.from_tuples(zipped)

noa = len(symbols)

data = np.array([[10, 9, 10, 11, 12, 13, 14, 13],
                 [11, 11, 10, 11, 11, 12, 11, 10],
                 [10, 11, 10, 11, 12, 13, 14, 13],
                 [11, 11, 10, 11, 11, 12, 11, 11],
                 [10, 11, 10, 11, 12, 13, 14, 13]])

market_to_market_price = pd.DataFrame(data.T, index=dates, columns=index)

rets = market_to_market_price / market_to_market_price.shift(1) - 1.0
rets = rets.dropna(axis=0, how='all')

expo_factor = np.ones((5,5))
factor_covariance = market_to_market_price.cov()
delta = np.diagflat([0.088024, 0.082614, 0.084237, 0.074648,
                                 0.084237])
cov_matrix = np.dot(np.dot(expo_factor, factor_covariance),
                            expo_factor.T) + delta

def calculate_total_risk(weights, cov_matrix):
    port_var = np.dot(np.dot(weights.T, cov_matrix), weights)
    return port_var

def max_func_return(weights):
    return -np.sum(rets.mean() * weights)

# optimized return with given risk
tolerance_risk = 27
noa = market_to_market_price.shape[1]
cons = ({'type': 'eq', 'fun': lambda x:  np.sum(x) - 1},
         {'type': 'eq', 'fun': lambda x:  calculate_total_risk(x, cov_matrix) - tolerance_risk})
bnds = tuple((0, 1) for x in range(noa))
init_guess = noa * [1. / noa,]
opts_mean = sco.minimize(max_func_return, init_guess, method='SLSQP',
                       bounds=bnds, constraints=cons)


In [88]: rets
Out[88]: 
            industry             utility            consumer
                   A         B         C         D         E
2000-01-02 -0.100000  0.000000  0.100000  0.000000  0.100000
2000-01-03  0.111111 -0.090909 -0.090909 -0.090909 -0.090909
2000-01-04  0.100000  0.100000  0.100000  0.100000  0.100000
2000-01-05  0.090909  0.000000  0.090909  0.000000  0.090909
2000-01-06  0.083333  0.090909  0.083333  0.090909  0.083333
2000-01-07  0.076923 -0.083333  0.076923 -0.083333  0.076923
2000-01-08 -0.071429 -0.090909 -0.071429  0.000000 -0.071429

In[89]: opts_mean['x'].round(3)
Out[89]: array([ 0.233,  0.117,  0.243,  0.165,  0.243])

How can I add group bounds so that the summed weights of the 5 assets, grouped by industry, fall within the bounds below?

model = pd.DataFrame(np.array([.08,.12,.05]), index= set(industry), columns = ['strategic'])
model['tactical'] = [(.05,.41), (.2,.66), (0,.16)]
In [85]: model
Out[85]: 
          strategic      tactical
industry       0.08  (0.05, 0.41)
consumer       0.12   (0.2, 0.66)
utility        0.05     (0, 0.16)

I have read a similar post, "SciPy optimization with grouped bounds", but I still can't get any clue. Can anybody help? Thank you.

1 Answer:

Answer 0: (score: 3)

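First, consider using cvxopt, a module designed specifically for convex optimization. I'm not too familiar with it, but an efficient frontier example exists for it.

Now, getting to your question: here is a workaround that applies specifically to the problem you posted and uses minimize. (It could be generalized to allow more flexibility in input types and user-friendliness, and a class-based implementation would also be useful here.)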

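Regarding your question, "how can I add group bounds?", the short answer is that you actually need to do this through the constraints parameter rather than the bounds parameter, because

  Optionally, the lower and upper bounds for each element in x can also be specified using the bounds parameter. [emphasis added]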

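This specification doesn't match what you are trying to do. Instead, the example below adds a separate inequality constraint for the upper and the lower bound of each group. The function mapto_constraints returns a list of dicts that is added to your current constraints.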
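As a minimal sketch of that idea, a group bound lb <= sum(group weights) <= ub can be written as two 'ineq' dicts, each of which must evaluate to a non-negative number at a feasible point. The helper name group_bound_constraints, the column positions, and the sample weights are hypothetical and used only for illustration.

```python
import numpy as np

def group_bound_constraints(positions, lb, ub):
    """Return two 'ineq' dicts enforcing lb <= sum(x[positions]) <= ub.

    Hypothetical helper: `positions` are the integer column positions
    of the assets that belong to one group.
    """
    idx = np.asarray(positions, dtype=int)
    lower = {'type': 'ineq',
             'fun': lambda x: np.sum(np.asarray(x)[idx]) - lb}  # >= 0 when sum >= lb
    upper = {'type': 'ineq',
             'fun': lambda x: ub - np.sum(np.asarray(x)[idx])}  # >= 0 when sum <= ub
    return [lower, upper]

# Made-up weights for assets A..E (they sum to 1).
weights = np.array([0.10, 0.15, 0.30, 0.25, 0.20])

# 'industry' = columns 0 and 1, with the bounds (0.05, 0.41) from the question.
industry_cons = group_bound_constraints([0, 1], lb=0.05, ub=0.41)
industry_values = [c['fun'](weights) for c in industry_cons]

# 'utility' = columns 2 and 3, with the bounds (0, 0.16) from the question.
utility_cons = group_bound_constraints([2, 3], lb=0.0, ub=0.16)
utility_values = [c['fun'](weights) for c in utility_cons]

print(industry_values)  # both non-negative: the industry bound holds
print(utility_values)   # second value negative: the utility cap is violated
```

Because the lambdas are created inside a function call, each pair keeps its own idx, lb and ub. When such lambdas are built directly in a loop, the loop variables have to be bound explicitly (for example as default arguments), otherwise every constraint ends up using the values from the last iteration.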

To begin, here is some example data:

import pandas as pd
import numpy as np
import numpy.random as npr
from scipy.optimize import minimize

npr.seed(123)

# Create a DataFrame of hypothetical returns for 5 stocks across 3 industries,
# at daily frequency over a year.  Note that these will be in decimal
# rather than numeral form. (i.e. 0.01 denotes a 1% return)

dates = pd.bdate_range(start='1/1/2000', end='12/31/2000')
industry = ['industry'] * 2 + ['utility'] * 2 + ['consumer']
symbols = list('ABCDE')
zipped = list(zip(industry, symbols))
cols = pd.MultiIndex.from_tuples(zipped)
returns = pd.DataFrame(npr.randn(len(dates), len(cols)), index=dates, columns=cols)
# Scale to roughly percent-sized daily moves.  Note that, by operator
# precedence, this divides by (100 + 3e-3); no drift term is actually added.
returns /= 100 + 3e-3

returns.head()
Out[191]: 
           industry           utility          consumer
                  A        B        C        D        E
2000-01-03 -0.01484  0.00986 -0.00476  0.00235 -0.00630
2000-01-04  0.00518  0.00958 -0.01210 -0.00814 -0.01664
2000-01-05  0.00233 -0.01665 -0.00366  0.00520  0.02058
2000-01-06  0.00368  0.01253  0.00259  0.00309 -0.00211
2000-01-07 -0.00383  0.01174  0.00375  0.00336 -0.00608

You can see that the annualized figures "make sense":

(1 + returns.mean()) ** 252 - 1
Out[199]: 
industry  A   -0.05531
          B    0.32455
utility   C    0.10979
          D    0.14339
consumer  E   -0.12644

Now for some functions that will be used in the optimization. These are closely modeled on the examples in chapter 11 of Yves Hilpisch's book.

def logrels(rets):
    """Log of return relatives, ln(1+r), for a given DataFrame rets."""
    return np.log(rets + 1)

def statistics(weights, rets):
    """Compute expected portfolio statistics from individual asset returns.

    Parameters
    ==========
    rets : DataFrame
        Individual asset returns.  Use decimal rather than numeral form
        (0.01 denotes a 1% return), as ln(1+r) requires
    weights : array-like
        Individual asset weights, nx1 vector.

    Returns
    =======
    list of (pret, pvol, pstd); these are *per-period* figures (not annualized)
        pret : expected portfolio return
        pvol : expected portfolio variance
        pstd : expected portfolio standard deviation

    Note
    ====
    Note that Modern Portfolio Theory (MPT), being a single-period model,
    works with (optimizes using) continuously compounded returns and
    volatility, using log return relatives.  The difference between these and
    more commonly used geometric means will be negligible for small returns.
    """

    if isinstance(weights, (tuple, list)):
        weights = np.array(weights)
    pret = np.sum(logrels(rets).mean() * weights)
    pvol = np.dot(weights.T, np.dot(logrels(rets).cov(), weights))
    pstd = np.sqrt(pvol)
    return [pret, pvol, pstd]

# The below are a few convenience functions around statistics() above, needed
# because scipy minimize must optimize a function that returns a scalar

def port_ret(weights, rets):
    return -1 * statistics(weights=weights, rets=rets)[0]

def port_variance(weights, rets):
    return statistics(weights=weights, rets=rets)[1]

Here is the expected annualized standard deviation of an equal-weighted portfolio. I'm giving it here only to use as an anchor in the optimization (the risk_tol parameter).

statistics([0.2] * 5, returns)[2] * np.sqrt(252) # ew anlzd stdev
Out[192]: 0.06642120658640735

The next function takes a DataFrame that looks like your model DataFrame and builds constraints for each group. Note that this is quite inflexible, since you need to follow the specific format of the returns and model DataFrames used here.

def mapto_constraints(rets, model):
    tactical = model['tactical'].to_dict()  # values are (lower, upper) tuple bounds
    industries = rets.columns.get_level_values(0)
    group_cons = list()
    for key in tactical:
        # integer positions of the columns that belong to this group
        pos = np.flatnonzero(industries == key)
        lb, ub = tactical[key]  # lower and upper bounds
        # Bind pos, lb and ub as default arguments; otherwise every lambda
        # would use the values from the last loop iteration.
        lbdict = {'type': 'ineq',
                  'fun': lambda x, pos=pos, lb=lb: np.sum(x[pos]) - lb}
        ubdict = {'type': 'ineq',
                  'fun': lambda x, pos=pos, ub=ub: ub - np.sum(x[pos])}
        group_cons.append(lbdict)
        group_cons.append(ubdict)
    return group_cons

A note on how the constraints are built up above:

  Equality constraint means that the constraint function result is to be zero, whereas inequality means that it is to be non-negative.

Finally, the optimization itself:
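With group constraints of this kind in hand, the optimization step might look roughly like the following self-contained sketch. It is an illustration under stated assumptions rather than a definitive implementation: the random data, the annualized risk cap risk_tol = 0.10, the starting point x0, and the helper names neg_port_ret and port_std are all made up for this example. The group bounds are the tactical bounds from the question.

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize

# Hypothetical daily returns in decimal form, laid out like the example data.
rng = np.random.default_rng(123)
dates = pd.bdate_range(start='1/1/2000', end='12/31/2000')
cols = pd.MultiIndex.from_tuples(
    list(zip(['industry'] * 2 + ['utility'] * 2 + ['consumer'], 'ABCDE')))
returns = pd.DataFrame(rng.normal(0.0, 0.01, (len(dates), len(cols))),
                       index=dates, columns=cols)

log_rets = np.log1p(returns)              # ln(1 + r)
mu = log_rets.mean().to_numpy() * 252     # annualized expected returns
sigma = log_rets.cov().to_numpy() * 252   # annualized covariance matrix

def neg_port_ret(w):
    """Negative annualized portfolio return (minimize maximizes the return)."""
    return -float(np.dot(mu, w))

def port_std(w):
    """Annualized portfolio standard deviation."""
    return float(np.sqrt(np.dot(w, np.dot(sigma, w))))

# Group bounds taken from the question's `tactical` column.
tactical = {'industry': (0.05, 0.41),
            'utility': (0.0, 0.16),
            'consumer': (0.2, 0.66)}
groups = returns.columns.get_level_values(0)

risk_tol = 0.10  # assumed cap on annualized standard deviation
cons = [{'type': 'eq', 'fun': lambda w: np.sum(w) - 1.0},
        {'type': 'ineq', 'fun': lambda w: risk_tol - port_std(w)}]
for key, (lb, ub) in tactical.items():
    pos = np.flatnonzero(groups == key)
    # default arguments freeze pos/lb/ub for each group
    cons.append({'type': 'ineq',
                 'fun': lambda w, pos=pos, lb=lb: np.sum(w[pos]) - lb})
    cons.append({'type': 'ineq',
                 'fun': lambda w, pos=pos, ub=ub: ub - np.sum(w[pos])})

# Starting point chosen to satisfy the sum-to-one and group bounds.
x0 = np.array([0.20, 0.20, 0.08, 0.08, 0.44])
res = minimize(neg_port_ret, x0, method='SLSQP',
               bounds=[(0.0, 1.0)] * len(x0), constraints=cons)

weights = pd.Series(res.x, index=returns.columns)
group_weights = weights.groupby(level=0).sum()
print(weights.round(3))
print(group_weights.round(3))
```

Working in annualized units keeps the objective and the risk constraint on a similar scale, which tends to suit SLSQP's default tolerances better than raw daily variances. Note that the utility cap of 0.16 and the industry cap of 0.41 together force the consumer weight to be at least 0.43, so a risk cap that is too tight can make the problem infeasible.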