Calculating Fama-French factors in data.table

Time: 2018-09-10 00:38:14

Tags: r data.table finance

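I am trying to calculate the Fama-French factors in R. After days of sweat and despair I managed to calculate the returns of the 6 portfolios... only to run into a problem I can't seem to solve.

My data looks roughly like this. It is a simplified data set, just to illustrate my problem: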

> TestX = data.table(Group = c("SM", "SM", "SM", "SH", "SH", "SH", "SL", "SL", "SL"), Date= as.Date(c("1995-07-30","1995-07-30","1995-07-30","1995-07-30","1995-07-30","1995-07-30","1995-07-30","1995-07-30", "1995-07-30")), Code= c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9"), SMRet = c(2,3,3, NA, NA, NA, NA, NA, NA), SHRet = c(NA, NA, NA, 5,5,5, NA, NA, NA), SLRet = c(NA, NA, NA, NA, NA, NA, 0,1,2) )
> TestX
   Group       Date Code SMRet SHRet SLRet
1:    SM 1995-07-30   C1     2    NA    NA
2:    SM 1995-07-30   C2     3    NA    NA
3:    SM 1995-07-30   C3     3    NA    NA
4:    SH 1995-07-30   C4    NA     5    NA
5:    SH 1995-07-30   C5    NA     5    NA
6:    SH 1995-07-30   C6    NA     5    NA
7:    SL 1995-07-30   C7    NA    NA     0
8:    SL 1995-07-30   C8    NA    NA     1
9:    SL 1995-07-30   C9    NA    NA     2

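Group gives the portfolio (SmallMedium, SmallHigh, SmallLow; my real data.table has other groups as well). Code gives the respective company code, and so on. What I want to do is create a new column holding the corresponding factor. For that I need the following calculation: (SMRet + SHRet + SLRet)/3. But how do I do that?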

TestX[, Factor := (SMRet+SHRet+SLRet)/3, by = Date]

This does not work; I only get NA everywhere.

  Group       Date Code SMRet SHRet SLRet Factor
1:    SM 1995-07-30   C1     2    NA    NA     NA
2:    SM 1995-07-30   C2     3    NA    NA     NA
3:    SM 1995-07-30   C3     3    NA    NA     NA
4:    SH 1995-07-30   C4    NA     5    NA     NA
5:    SH 1995-07-30   C5    NA     5    NA     NA
6:    SH 1995-07-30   C6    NA     5    NA     NA
7:    SL 1995-07-30   C7    NA    NA     0     NA
8:    SL 1995-07-30   C8    NA    NA     1     NA
9:    SL 1995-07-30   C9    NA    NA     2     NA
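The all-NA column is consistent with how R arithmetic treats missing values: `+` works row by row, and in this layout every row has NA in two of the three return columns. A minimal sketch with made-up vectors (`sm`, `sh`, `sl` are illustrative names, not part of the original data):

```r
# Arithmetic with a missing value yields a missing value
2 + NA + NA                     # NA

# Element-wise addition: each position contains at least one NA,
# so every element of the result is NA
sm <- c(2, 3, NA)
sh <- c(NA, NA, 5)
sl <- c(NA, NA, NA)
(sm + sh + sl) / 3              # NA NA NA

# sum() with na.rm = TRUE drops the NAs and adds everything that is left
sum(sm, sh, sl, na.rm = TRUE)   # 10
```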

I also need to group by date. The real data.table has another 402 months.

Thanks.

Edit: here is a better data.table to illustrate my problem

TestX = data.table(Group = c("SM", "SM", "SH", "SH", "SL", "SL", "SM", "SM", "SH", "SH", "SL", "SL"), Date= as.Date(c("1995-07-30","1995-07-30","1995-07-30","1995-07-30","1995-07-30","1995-07-30","1995-08-30","1995-08-30", "1995-08-30", "1995-08-30","1995-08-30","1995-08-30")), Code= c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "c10", "c11", "12"), SMRet = c(2,3, NA, NA, NA, NA, 4, 5, NA, NA, NA, NA), SHRet = c(NA, NA, 5, 5, NA, NA, NA, NA, 3, 4, NA, NA), SLRet = c(NA, NA, NA, NA, 0, 1, NA,NA,NA, NA, 2,3))
> TestX

        Group       Date Code SMRet SHRet SLRet
     1:    SM 1995-07-30   C1     2    NA    NA
     2:    SM 1995-07-30   C2     3    NA    NA
     3:    SH 1995-07-30   C3    NA     5    NA
     4:    SH 1995-07-30   C4    NA     5    NA
     5:    SL 1995-07-30   C5    NA    NA     0
     6:    SL 1995-07-30   C6    NA    NA     1
     7:    SM 1995-08-30   C7     4    NA    NA
     8:    SM 1995-08-30   C8     5    NA    NA
     9:    SH 1995-08-30   C9    NA     3    NA
    10:    SH 1995-08-30  c10    NA     4    NA
    11:    SL 1995-08-30  c11    NA    NA     2
    12:    SL 1995-08-30   12    NA    NA     3

This is the desired result:

    Group       Date Code SMRet SHRet SLRet   Factor
 1:    SM 1995-07-30   C1     2    NA    NA 5.333333
 2:    SM 1995-07-30   C2     3    NA    NA 5.333333
 3:    SH 1995-07-30   C3    NA     5    NA 5.333333
 4:    SH 1995-07-30   C4    NA     5    NA 5.333333
 5:    SL 1995-07-30   C5    NA    NA     0 5.333333
 6:    SL 1995-07-30   C6    NA    NA     1 5.333333
 7:    SM 1995-08-30   C7     4    NA    NA 7.000000
 8:    SM 1995-08-30   C8     5    NA    NA 7.000000
 9:    SH 1995-08-30   C9    NA     3    NA 7.000000
10:    SH 1995-08-30  c10    NA     4    NA 7.000000
11:    SL 1995-08-30  c11    NA    NA     2 7.000000
12:    SL 1995-08-30   12    NA    NA     3 7.000000

So, for each month: (SMRet + SHRet + SLRet) / 3
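Checking the desired values by hand suggests that they equal the sum of all non-missing returns in a month divided by 3, rather than the average of the three portfolio means. A quick check in base R, with the numbers copied from the table above (the vector names `july` and `august` are only for illustration):

```r
# Returns for 1995-07-30 and 1995-08-30, copied from the example table
july   <- c(2, 3, 5, 5, 0, 1)
august <- c(4, 5, 3, 4, 2, 3)

sum(july) / 3     # 5.333333, matches the desired Factor for July
sum(august) / 3   # 7, matches the desired Factor for August

# For comparison: the average of the three portfolio means in July
july_mean_of_means <- (mean(c(2, 3)) + mean(c(5, 5)) + mean(c(0, 1))) / 3
july_mean_of_means   # 2.666667, which is not the value in the table
```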

2 answers:

Answer 0 (score: 1)

You can use the following code to calculate the Fama-French factor in R:

TestX[ , newvar := sum(SMRet, SHRet, SLRet, na.rm=TRUE)/3, by=Date]
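To see this line in context, here is a self-contained sketch that rebuilds the edited example table from the question and applies the same expression. It assumes the data.table package is installed; the `rep()` calls are only a shorter way of writing the same Group and Date columns.

```r
library(data.table)

# Same values as the edited example in the question
TestX <- data.table(
  Group = rep(rep(c("SM", "SH", "SL"), each = 2), times = 2),
  Date  = as.Date(rep(c("1995-07-30", "1995-08-30"), each = 6)),
  Code  = c("C1", "C2", "C3", "C4", "C5", "C6",
            "C7", "C8", "C9", "c10", "c11", "12"),
  SMRet = c(2, 3, NA, NA, NA, NA, 4, 5, NA, NA, NA, NA),
  SHRet = c(NA, NA, 5, 5, NA, NA, NA, NA, 3, 4, NA, NA),
  SLRet = c(NA, NA, NA, NA, 0, 1, NA, NA, NA, NA, 2, 3)
)

# Within each Date, add up all non-missing returns across the three
# columns, divide by 3, and assign the result to every row of that Date
TestX[, newvar := sum(SMRet, SHRet, SLRet, na.rm = TRUE) / 3, by = Date]

# One row per month, to compare against the desired Factor column
unique(TestX[, .(Date, newvar)])
```

Because `sum()` is called inside `by = Date`, it sees all rows of the month at once, so the NA pattern within individual rows no longer matters.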

Answer 1 (score: 1)

I think tidyverse is a convenient package for this task, although it may not be as fast as data.table. You can easily do the calculation by group with tidyverse.
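The answer's own code is not shown here. As a rough sketch only, a grouped calculation with dplyr (part of tidyverse) could look like the following, assuming the same formula as the desired result in the question and the column name `Factor` used there:

```r
library(dplyr)

# Same values as the edited example in the question, as a plain data frame
TestX <- data.frame(
  Group = rep(rep(c("SM", "SH", "SL"), each = 2), times = 2),
  Date  = as.Date(rep(c("1995-07-30", "1995-08-30"), each = 6)),
  Code  = c("C1", "C2", "C3", "C4", "C5", "C6",
            "C7", "C8", "C9", "c10", "c11", "12"),
  SMRet = c(2, 3, NA, NA, NA, NA, 4, 5, NA, NA, NA, NA),
  SHRet = c(NA, NA, 5, 5, NA, NA, NA, NA, 3, 4, NA, NA),
  SLRet = c(NA, NA, NA, NA, 0, 1, NA, NA, NA, NA, 2, 3)
)

# Group by month, sum all non-missing returns, divide by 3
result <- TestX %>%
  group_by(Date) %>%
  mutate(Factor = sum(SMRet, SHRet, SLRet, na.rm = TRUE) / 3) %>%
  ungroup()

result
```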
