Applying a function based on column names in data.table (R)

Time: 2016-12-06 10:51:39

Tags: r data.table

I want to apply a user-defined function to columns selected by their names. Here is some example data:
library(data.table)

dt <- data.table(gr_id = 1, id = seq(1, 10), min_c = runif(10, 10, 30),
                 ml_c = runif(10, 30, 50), mx_c = runif(10, 50, 100),
                 min_t = runif(10, 10, 20), ml_t = runif(10, 20, 25),
                 mx_t = runif(10, 25, 30))

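I want to apply a function that computes (min(min) + min(ml)) / min(mx) for the "c" columns and for the "t" columns. At the moment I do it as shown below, but this becomes hard to maintain when I want to add more column groups (say, "a"):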
dt[,{
  temp1 = min(min_c)
  temp2 = min(ml_c)
  temp3 = min(mx_c)
  score_c = (temp1+temp2)/temp3
  temp4 = min(min_t)
  temp5 = min(ml_t)
  temp6 = min(mx_t)
  score_t = (temp4+temp5)/temp6
  list(score_c = score_c,
       score_t = score_t)
},by = gr_id
  ]

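1 answer:

Answer 0 (score: 0)

I think this will work. The basic idea is to build each column name as a string and use get() to look the column up by that name.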

# the original code could be simplified to:
dt[, .(
    score_c = (min(min_c) + min(ml_c)) / min(mx_c),
    score_t = (min(min_t) + min(ml_t)) / min(mx_t)
    ), by = gr_id]
# 
#    gr_id   score_c score_t
# 1:     1 0.9051556 1.28054

# using `get`
cols <- c('c', 't')
dt[, {
    res <- lapply(cols, function(i){
        vars <- paste(c('min', 'ml', 'mx'), i, sep = '_')
        (min(get(vars[1])) + min(get(vars[2]))) / min(get(vars[3]))
    })
    names(res) <- paste('score', cols, sep = '_')
    res
}, by = gr_id]

#    gr_id   score_c score_t
# 1:     1 0.9051556 1.28054
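
The question mentions adding a further column group such as "a". As a rough sketch of how the same name-building idea extends, the version below moves the formula into a small helper and reads columns from `.SD` with `[[` instead of `get`. The helper name `score_for`, the `min_a`/`ml_a`/`mx_a` columns, the second group id, and the fixed seed are assumptions made for illustration only; they are not part of the original post.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
dt <- data.table(gr_id = rep(1:2, each = 5), id = 1:10,
                 min_c = runif(10, 10, 30), ml_c = runif(10, 30, 50),
                 mx_c = runif(10, 50, 100),
                 min_t = runif(10, 10, 20), ml_t = runif(10, 20, 25),
                 mx_t = runif(10, 25, 30),
                 # hypothetical "a" columns, added only for illustration
                 min_a = runif(10, 1, 5), ml_a = runif(10, 5, 10),
                 mx_a = runif(10, 10, 20))

# Hypothetical helper: computes (min(min_x) + min(ml_x)) / min(mx_x)
# for one suffix x, reading the columns from the data.table `d`.
score_for <- function(d, suffix) {
  vars <- paste(c("min", "ml", "mx"), suffix, sep = "_")
  (min(d[[vars[1]]]) + min(d[[vars[2]]])) / min(d[[vars[3]]])
}

# Adding a new column group only requires extending this vector.
cols <- c("c", "t", "a")

res <- dt[, setNames(lapply(cols, function(s) score_for(.SD, s)),
                     paste("score", cols, sep = "_")),
          by = gr_id]
print(res)
```

With this layout, supporting another group should only mean adding its suffix to `cols`, provided the `min_`, `ml_`, and `mx_` columns for that suffix exist in the table.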