R: Why does lapply() double my results?

Asked: 2019-01-16 05:26:49

Tags: r function

I am writing a function that collects diagnostics and test error for a series of linear regression models.

My input is a list of lists. Each inner list holds the information for one model.

model.1 <- list("medv","~.","Boston_Ready")
names(model.1) <- c("response", "input","dataset")

model.2 <- list("medv","~lstat","Boston_Ready")
names(model.2) <- c("response", "input","dataset")

models <- list(model.1,model.2)

Given a list that names a data frame, a response variable, and the inputs, my function computes the regression diagnostics.

TestError <- function(model){
    library('boot')

    df <- get(model$dataset)
    formula <- paste(model$response,model$input)
    response <- model$response

    ##Diagnostics
    fit <- lm(formula,data=df)
    fit_summ <- summary(fit)
    F_Stat <- fit_summ$fstatistic[1]
    Adj_R_Sq <- fit_summ$adj.r.squared
    RSS <- with(fit_summ, df[2] * sigma^2)
    AIC <- AIC(fit)
    BIC <- BIC(fit)

    ##Cross-Validation
    #5-fold cross validation
    glm.fit <- glm(formula, data=df)
    cv.err <- cv.glm(df, glm.fit, K=5)
    Five.Fold_MSE <- cv.err$delta[1]

    #10-fold cross validation
    glm.fit <- glm(formula, data=df)
    cv.err <- cv.glm(df, glm.fit, K=10)
    Ten.Fold_MSE <- cv.err$delta[1]

    #LOOCV
    glm.fit <- glm(formula, data=df)
    cv.err <- cv.glm(df, glm.fit)
    LOOCV_MSE <- cv.err$delta[1]

    #Output
    label <- c("lm","formula =",paste(model$response,model$input), "data= ",model$dataset)
    print(paste(label))
    Results <- (c(LOOCV_MSE,Five.Fold_MSE,Ten.Fold_MSE,F_Stat,Adj_R_Sq, RSS, AIC, BIC))
    names(Results) <- c("LOOCV MSE", "5-Fold MSE", "10-Fold MSE","F-Stat","Adjusted R^2","RSS","AIC","BIC")

    print(Results)
}

For some reason, the same output is produced twice:

lapply(models,TestError)

> lapply(models,TestError)
[1] "lm"           "formula ="    "medv ~."      "data= "       "Boston_Ready"
   LOOCV MSE   5-Fold MSE  10-Fold MSE       F-Stat Adjusted R^2          RSS          AIC          BIC 
   0.3250332    0.3288020    0.3251508  114.3744328    0.6918372  152.5405737  853.2181335  903.9365735 
[1] "lm"           "formula ="    "medv ~lstat"  "data= "       "Boston_Ready"
   LOOCV MSE   5-Fold MSE  10-Fold MSE       F-Stat Adjusted R^2          RSS          AIC          BIC 
   0.4597660    0.4622565    0.4593045  601.6178711    0.5432418  230.2061197 1043.4596316 1056.1392416 
[[1]]
   LOOCV MSE   5-Fold MSE  10-Fold MSE       F-Stat Adjusted R^2          RSS          AIC          BIC 
   0.3250332    0.3288020    0.3251508  114.3744328    0.6918372  152.5405737  853.2181335  903.9365735 

[[2]]
   LOOCV MSE   5-Fold MSE  10-Fold MSE       F-Stat Adjusted R^2          RSS          AIC          BIC 
   0.4597660    0.4622565    0.4593045  601.6178711    0.5432418  230.2061197 1043.4596316 1056.1392416 

Is this a quirk of lapply()?

1 answer:

Answer 0 (score: 1)

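It is not a quirk of lapply(). Because the function ends with print(Results), each call prints the results and then also returns them: print() returns its argument invisibly, and the value of the last expression is the function's return value. lapply() collects those return values into a list, and since the call is made at the top level of the console, R auto-prints that list. The first copy comes from print() inside the function; the second is the returned list ([[1]], [[2]]).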
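The effect can be reproduced without the Boston data. The sketch below uses two toy functions, `noisy` and `quiet`, which are hypothetical stand-ins for TestError() and not part of the original post. It shows two possible ways to get a single copy of the output: wrapping the call in invisible(), or returning the value without printing and printing the collected list once.

```r
# Toy stand-in for TestError(): ends with print(), like the original.
noisy <- function(x) {
  res <- c(value = x^2)
  print(res)  # prints res AND returns it invisibly, so it is the function's value
}

# Same computation, but the result is returned without being printed.
quiet <- function(x) {
  c(value = x^2)
}

inputs <- list(1, 2)

# Option 1: keep print() inside the function, suppress auto-printing of the list.
invisible(lapply(inputs, noisy))

# Option 2: return silently, store the list, and print it once when needed.
results <- lapply(inputs, quiet)
print(results)
```

Applied to the question, Option 2 would mean ending TestError() with `Results` (or `return(Results)`) instead of `print(Results)`, and assigning the lapply() output to a variable. Assigning the result, as in `res <- lapply(models, TestError)`, also suppresses the second copy, because assignments are not auto-printed.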