Applying a list of functions over a list of data frames without nested apply()

Date: 2014-07-26 09:14:26

Tags: r

Given the data dat below, I am trying to get the following result, only without the nested lapply(, sapply...) call shown here.

> lapply(dat, function(x) sapply(funs, function(y) y(x)))
# $bondsba01
#   AVG   SLG 
# 0.223 0.300 
#
# $pujolal01
#   AVG   SLG 
# 0.329 0.422 

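I am familiar with rapply(), but I cannot get it to work on this list. My reasoning was that since dat is a list of data frames, which amounts to a list of lists, rapply would be appropriate.

I have tried several variations of rapply() and hit the same error almost every time.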

> rapply(funs, function(x) x(dat), how = "replace")
#  Error in eval(expr, envir, enclos) : object 'H' not found 

I get the same error with how = "list" and how = "unlist". How can I do this without nesting sapply inside lapply?

Sample data:

dat <- 
structure(list(bondsba01 = structure(list(AB = 413L, R = 72L, 
    H = 92L, X2B = 26L, X3B = 3L, HR = 16L, RBI = 48L, SB = 36L, 
    CS = 7L, BB = 65L, SO = 102L, IBB = 2L, HBP = 2L, SH = 2L, 
    SF = 2L), .Names = c("AB", "R", "H", "X2B", "X3B", "HR", 
"RBI", "SB", "CS", "BB", "SO", "IBB", "HBP", "SH", "SF"), row.names = 1L, 
    class = "data.frame"), 
    pujolal01 = structure(list(AB = 590L, R = 112L, H = 194L, 
        X2B = 47L, X3B = 4L, HR = 37L, RBI = 130L, SB = 1L, CS = 3L, 
        BB = 69L, SO = 93L, IBB = 6L, HBP = 9L, SH = 1L, SF = 7L), 
    .Names = c("AB", "R", "H", "X2B", "X3B", "HR", "RBI", "SB", "CS", "BB",
    "SO", "IBB", "HBP", "SH", "SF"), row.names = 1L, class = "data.frame")),
    .Names = c("bondsba01", "pujolal01"))

List of functions:

funs <- 
structure(list(AVG = function (x) 
with(x, round(H/AB, 3)), SLG = function (x) 
with(x, round(((H - X2B - X3B - HR) + 2 * X2B + 3 * X3B + HR)/AB, 
    3))), .Names = c("AVG", "SLG"))

Link to the actual data

1 Answer:

Answer 0 (score: 3)

Just because it is Saturday morning and I wanted to try foreach, here is a solution:

library(foreach)
library(iterators)

foreach(x=iter(dat), .combine=cbind) %:% 
  foreach(f=iter(funs), .combine=c)  %do% 
  f(x)


     result.1 result.2
[1,]    0.223    0.329
[2,]    0.300    0.422
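
To make explicit what the nested foreach computes, here is a minimal base R sketch that uses plain loops. It assumes a reduced copy of dat (only the columns the two functions use) and labels the result with names(funs) and names(dat), which the foreach output above does not do. The name res is illustrative only.

```r
# Reduced, hypothetical copy of the question's data: only the columns
# that AVG and SLG actually read.
dat <- list(
  bondsba01 = data.frame(AB = 413L, H = 92L,  X2B = 26L, X3B = 3L, HR = 16L),
  pujolal01 = data.frame(AB = 590L, H = 194L, X2B = 47L, X3B = 4L, HR = 37L)
)

# Same function bodies as in the question.
funs <- list(
  AVG = function(x) with(x, round(H / AB, 3)),
  SLG = function(x) with(x, round(((H - X2B - X3B - HR) + 2 * X2B + 3 * X3B + HR) / AB, 3))
)

# One row per function, one column per data frame, labeled up front.
res <- matrix(NA_real_, nrow = length(funs), ncol = length(dat),
              dimnames = list(names(funs), names(dat)))

for (j in seq_along(dat)) {      # plays the role of the outer foreach (.combine = cbind)
  for (i in seq_along(funs)) {   # plays the role of the inner foreach (.combine = c)
    res[i, j] <- funs[[i]](dat[[j]])
  }
}

res
```

The values match the foreach result; only the row and column labels differ.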

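This should be quite fast, but more importantly, foreach is very easy to parallelize. You only need to make two changes:

  • Load your preferred parallel backend package (I use doParallel) and register a cluster
  • Change %do% to %dopar%

Like this: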

library(doParallel)
cl <- makePSOCKcluster(2)
registerDoParallel(cl)
foreach(x=iter(dat), .combine=cbind) %:% 
  foreach(f=iter(funs), .combine=c)  %dopar% 
  f(x)

     result.1 result.2
[1,]    0.223    0.329
[2,]    0.300    0.422

stopCluster(cl)
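
For comparison, and not as part of the answer above, the same table can apparently also be built in base R with a single mapply() call over every function/data-frame pair, so that no apply call is nested inside another. This is only a sketch, again assuming a reduced copy of dat; grid, vals, and res are illustrative names.

```r
# Reduced, hypothetical copy of the question's data: only the columns
# that AVG and SLG actually read.
dat <- list(
  bondsba01 = data.frame(AB = 413L, H = 92L,  X2B = 26L, X3B = 3L, HR = 16L),
  pujolal01 = data.frame(AB = 590L, H = 194L, X2B = 47L, X3B = 4L, HR = 37L)
)

# Same function bodies as in the question.
funs <- list(
  AVG = function(x) with(x, round(H / AB, 3)),
  SLG = function(x) with(x, round(((H - X2B - X3B - HR) + 2 * X2B + 3 * X3B + HR) / AB, 3))
)

# Every (function, data frame) pair; the function name varies fastest.
grid <- expand.grid(f = names(funs), d = names(dat), stringsAsFactors = FALSE)

# One flat pass over the pairs: look up each function and data frame by name.
vals <- mapply(function(f, d) funs[[f]](dat[[d]]), grid$f, grid$d)

# matrix() fills column by column, which matches the order of the pairs in grid.
res <- matrix(vals, nrow = length(funs),
              dimnames = list(names(funs), names(dat)))

res
```

Because expand.grid() varies its first argument fastest, each column of res holds all the statistics for one player.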