Portfolio construction - factor model

Time: 2021-03-05 12:52:27

Tags: r portfolio

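I am trying to build portfolios in R, and I need to sort the different stocks (PERMNO) into six different portfolios.

I want logic that classifies each stock by whether its mkt.cap is above or below the median mkt.cap of all stocks in a given year (e.g. 2010).

In addition, within each of those two groups, the stocks should be split into 3 groups according to BM (OBS).

The classification should look like this: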

                             Mkt. Cap
BM (OBS) percentile Over yearly median  Under yearly median
      >70%                  PF1                PF2
     30-70%                 PF3                PF4
      <30%                  PF5                PF6

A sample of my data table looks like this:

PERMNO  Date      ret     mkt.cap            BM (OBS)               
10001   2009-12  0,1626 44918,3008   0,00000000000000000000
75672   2009-12 -0,2062 43722,1389   0,00001104509093018260
80928   2009-12  0,1770 689062,2694  0,00000688713518454942
80912   2009-12 -0,0274 71494,3516   0,00000984511341873784
76261   2009-12  0,0315 382438,0821  0,00000213437164919912
90303   2009-12  0,1959 964578,8864  0,00000000000000000000
91161   2009-12  0,2808 371170,0671  0,00000504687787573149
89841   2009-12  0,0438 1235170,0000 0,00000000000000000000
82515   2009-12  0,0565 934767,3563  0,00002803828655806010
84330   2009-12 -0,1000 166769,8187  0,00014664615387307400
10001   2010-01 -0,0189 43871,6618   0,00000000000000000000
75672   2010-01 -0,0260 42586,5000   0,00001115063263397240
80928   2010-01 -0,0704 640548,3269  0,00000728527479914769
80912   2010-01  0,0256 73322,8542   0,00000943960571401137
76261   2010-01 -0,0334 369662,6679  0,00000217133254998311
90303   2010-01 -0,1095 858998,8864  0,00000000000000000000
91161   2010-01 -0,1217 325990,6705  0,00000565055792544003
89841   2010-01 -0,0480 1175881,8965 0,00000000000000000000
82515   2010-01 -0,0377 899493,1499  0,00002865219568686880
84330   2010-01  0,0873 181329,0906  0,00013295614165661100

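My dataset is very large, so the code needs to run quickly on large datasets.

I am thinking of creating 6 new binary variables, one per portfolio, each equal to 0 or 1 depending on whether the stock meets that portfolio's criteria, but I don't know how to do this.

Thanks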

1 answer:

Answer 0: (score: 0)

If you want to compute the new columns from yearly aggregates/quantiles, use this code:

# Extract the year from the "YYYY-MM" date string
df$YEAR <- substr(df$Date, 1, 4)
# ave() evaluates each condition within a year and returns it as 0/1.
# Values exactly at a breakpoint go to the upper group (>=).
# High BM (>= 70th percentile): PF1 = over the yearly median mkt.cap, PF2 = under
df$PF1 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x >= quantile(x, 0.7)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x >= median(x)}))
df$PF2 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x >= quantile(x, 0.7)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x < median(x)}))
# Middle BM (30th to 70th percentile): PF3 = over, PF4 = under
df$PF3 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.7) & x >= quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x >= median(x)}))
df$PF4 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.7) & x >= quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x < median(x)}))
# Low BM (< 30th percentile): PF5 = over, PF6 = under
df$PF5 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x >= median(x)}))
df$PF6 <- as.numeric(ave(df$BM_OBS, df$YEAR, FUN = function(x){x < quantile(x, 0.3)}) & ave(df$mkt.cap, df$YEAR, FUN = function(x){x < median(x)}))

This gives:

> df
   PERMNO    Date     ret    mkt.cap       BM_OBS YEAR PF1 PF2 PF3 PF4 PF5 PF6
1   10001 2009-12  0.1626   44918.30 0.000000e+00 2009   0   0   0   0   0   1
2   75672 2009-12 -0.2062   43722.14 1.104509e-05 2009   0   1   0   0   0   0
3   80928 2009-12  0.1770  689062.27 6.887135e-06 2009   0   0   1   0   0   0
4   80912 2009-12 -0.0274   71494.35 9.845113e-06 2009   0   0   0   1   0   0
5   76261 2009-12  0.0315  382438.08 2.134372e-06 2009   0   0   1   0   0   0
6   90303 2009-12  0.1959  964578.89 0.000000e+00 2009   0   0   0   0   1   0
7   91161 2009-12  0.2808  371170.07 5.046878e-06 2009   0   0   0   1   0   0
8   89841 2009-12  0.0438 1235170.00 0.000000e+00 2009   0   0   0   0   1   0
9   82515 2009-12  0.0565  934767.36 2.803829e-05 2009   1   0   0   0   0   0
10  84330 2009-12 -0.1000  166769.82 1.466462e-04 2009   0   1   0   0   0   0
11  10001 2010-01 -0.0189   43871.66 0.000000e+00 2010   0   0   0   0   0   1
12  75672 2010-01 -0.0260   42586.50 1.115063e-05 2010   0   1   0   0   0   0
13  80928 2010-01 -0.0704  640548.33 7.285275e-06 2010   0   0   1   0   0   0
14  80912 2010-01  0.0256   73322.85 9.439606e-06 2010   0   0   0   1   0   0
15  76261 2010-01 -0.0334  369662.67 2.171333e-06 2010   0   0   1   0   0   0
16  90303 2010-01 -0.1095  858998.89 0.000000e+00 2010   0   0   0   0   1   0
17  91161 2010-01 -0.1217  325990.67 5.650558e-06 2010   0   0   0   1   0   0
18  89841 2010-01 -0.0480 1175881.90 0.000000e+00 2010   0   0   0   0   1   0
19  82515 2010-01 -0.0377  899493.15 2.865220e-05 2010   1   0   0   0   0   0
20  84330 2010-01  0.0873  181329.09 1.329561e-04 2010   0   1   0   0   0   0
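
Since every row ends up in exactly one portfolio, a single label column may be easier to work with than six dummies, and it only needs each yearly breakpoint computed once. The following is a minimal base R sketch of that idea, not part of the original answer. The helper name `assign_pf` and the toy values are hypothetical, and it assumes columns named `YEAR`, `mkt.cap` and `BM_OBS` with no missing values.

```r
# Toy data (hypothetical values, not taken from the question)
toy <- data.frame(
  YEAR    = rep(c("2009", "2010"), each = 6),
  mkt.cap = c(10, 20, 30, 40, 50, 60, 15, 25, 35, 45, 55, 65),
  BM_OBS  = c(0.1, 0.9, 0.5, 0.2, 0.8, 0.4, 0.7, 0.3, 0.6, 0.05, 0.95, 0.45)
)

# Hypothetical helper: label each row PF1..PF6 from yearly breakpoints
assign_pf <- function(d) {
  # ave() applies FUN within each year; logical results come back as 0/1
  big  <- ave(d$mkt.cap, d$YEAR, FUN = function(x) x >= median(x)) == 1
  high <- ave(d$BM_OBS,  d$YEAR, FUN = function(x) x >= quantile(x, 0.7)) == 1
  low  <- ave(d$BM_OBS,  d$YEAR, FUN = function(x) x <  quantile(x, 0.3)) == 1
  # Row of the classification table: 1 = >70%, 2 = 30-70%, 3 = <30%
  bm_row <- ifelse(high, 1L, ifelse(low, 3L, 2L))
  # Column of the table: over the yearly median first, then under
  pf_num <- 2L * (bm_row - 1L) + ifelse(big, 1L, 2L)
  factor(paste0("PF", pf_num), levels = paste0("PF", 1:6))
}

toy$PF <- assign_pf(toy)

# If 0/1 columns are still wanted, they can be derived from the label
dummies <- sapply(levels(toy$PF), function(l) as.numeric(toy$PF == l))

# Sanity check: each row belongs to exactly one portfolio
stopifnot(all(rowSums(dummies) == 1))
```

The label column can then be used directly for grouping, for example with `split(toy, toy$PF)` or `table(toy$YEAR, toy$PF)`.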

Data used:

df <- structure(list(PERMNO = c(10001L, 75672L, 80928L, 80912L, 76261L, 
90303L, 91161L, 89841L, 82515L, 84330L, 10001L, 75672L, 80928L, 
80912L, 76261L, 90303L, 91161L, 89841L, 82515L, 84330L), Date = c("2009-12", 
"2009-12", "2009-12", "2009-12", "2009-12", "2009-12", "2009-12", 
"2009-12", "2009-12", "2009-12", "2010-01", "2010-01", "2010-01", 
"2010-01", "2010-01", "2010-01", "2010-01", "2010-01", "2010-01", 
"2010-01"), ret = c(0.1626, -0.2062, 0.177, -0.0274, 0.0315, 
0.1959, 0.2808, 0.0438, 0.0565, -0.1, -0.0189, -0.026, -0.0704, 
0.0256, -0.0334, -0.1095, -0.1217, -0.048, -0.0377, 0.0873), 
    mkt.cap = c(44918.3008, 43722.1389, 689062.2694, 71494.3516, 
    382438.0821, 964578.8864, 371170.0671, 1235170, 934767.3563, 
    166769.8187, 43871.6618, 42586.5, 640548.3269, 73322.8542, 
    369662.6679, 858998.8864, 325990.6705, 1175881.8965, 899493.1499, 
    181329.0906), BM_OBS = c(0, 1.10450909301826e-05, 6.88713518454942e-06, 
    9.84511341873784e-06, 2.13437164919912e-06, 0, 5.04687787573149e-06, 
    0, 2.80382865580601e-05, 0.000146646153873074, 0, 1.11506326339724e-05, 
    7.28527479914769e-06, 9.43960571401137e-06, 2.17133254998311e-06, 
    0, 5.65055792544003e-06, 0, 2.86521956868688e-05, 0.000132956141656611
    )), class = "data.frame", row.names = c(NA, -20L))

   PERMNO    Date     ret    mkt.cap       BM_OBS
1   10001 2009-12  0.1626   44918.30 0.000000e+00
2   75672 2009-12 -0.2062   43722.14 1.104509e-05
3   80928 2009-12  0.1770  689062.27 6.887135e-06
4   80912 2009-12 -0.0274   71494.35 9.845113e-06
5   76261 2009-12  0.0315  382438.08 2.134372e-06
6   90303 2009-12  0.1959  964578.89 0.000000e+00
7   91161 2009-12  0.2808  371170.07 5.046878e-06
8   89841 2009-12  0.0438 1235170.00 0.000000e+00
9   82515 2009-12  0.0565  934767.36 2.803829e-05
10  84330 2009-12 -0.1000  166769.82 1.466462e-04
11  10001 2010-01 -0.0189   43871.66 0.000000e+00
12  75672 2010-01 -0.0260   42586.50 1.115063e-05
13  80928 2010-01 -0.0704  640548.33 7.285275e-06
14  80912 2010-01  0.0256   73322.85 9.439606e-06
15  76261 2010-01 -0.0334  369662.67 2.171333e-06
16  90303 2010-01 -0.1095  858998.89 0.000000e+00
17  91161 2010-01 -0.1217  325990.67 5.650558e-06
18  89841 2010-01 -0.0480 1175881.90 0.000000e+00
19  82515 2010-01 -0.0377  899493.15 2.865220e-05
20  84330 2010-01  0.0873  181329.09 1.329561e-04
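
The sample in the question uses decimal commas, while the data above uses decimal points. If the raw data is whitespace-delimited text with decimal commas (an assumption about the input file), it could be read along these lines. The object names `raw` and `df_in` are illustrative, and the `BM (OBS)` header is simplified to `BM_OBS` so that it parses as a single column name.

```r
# Two rows copied from the question's sample; header simplified to BM_OBS
raw <- "PERMNO Date ret mkt.cap BM_OBS
10001 2009-12 0,1626 44918,3008 0,00000000000000000000
75672 2009-12 -0,2062 43722,1389 0,00001104509093018260"

# dec = "," tells read.table to treat the comma as the decimal mark
df_in <- read.table(text = raw, header = TRUE, dec = ",",
                    stringsAsFactors = FALSE)

# ret, mkt.cap and BM_OBS should now be numeric; Date stays a character string
str(df_in)
```

For a file on disk, the same `dec = ","` argument applies with a file path in place of `text = raw`.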