Error in datadist$limits: object of type 'closure' is not subsettable

Date: 2016-02-20 00:34:40

Tags: r linear-regression

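When running a simple multivariable regression with the rms package and its ols() function, I'm not entirely sure why I get this error. The lm() function works fine.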

Error:

Error in datadist$limits : object of type 'closure' is not subsettable

Example data:

dat <- structure(list(value = c(153.7, 137.2, 137.2, 137.2, 137.2, 137.2, 
137.2, 137.2, 137.2, 144.3), x1 = c(1586.30574782368, 1827.63764435891, 
1274.37779664208, 1470.22193641518, 1424.71486797217, 1588.96099774091, 
1768.09933607758, 1447.4030640002, 1586.11159875168, 1741.04342002899
), x2 = c(9.37073885963036, 79.466637406771, 3.07432642677304, 
5.32614246511366, 9.65257915442635, 9.70809241832467, 47.0161105721418, 
39.7744598414865, 13.2940602286908, 26.6250313249184)), .Names = c("value", 
"x1", "x2"), row.names = c(NA, 10L), class = "data.frame")

Model using ols:

library(rms)
datadist <- datadist(dat)
options("datadist" = "datadist")

mod <- ols(log(value) ~ x1 + x2, data = dat, x = TRUE, y = TRUE)

> mod <- ols(log(value) ~ x1 + x2, data = dat, x = TRUE, y = TRUE)
Error in datadist$limits : object of type 'closure' is not subsettable

Model using lm:

> mod <- lm(log(value) ~ x1 + x2, data = dat)
> summary(mod)

Call:
lm(formula = log(value) ~ x1 + x2, data = dat)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.032746 -0.021049 -0.004316  0.010937  0.080848 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.7514306  0.1487970   31.93 7.64e-09 ***
x1           0.0001335  0.0001019    1.31    0.232    
x2          -0.0009582  0.0007205   -1.33    0.225    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.03755 on 7 degrees of freedom
Multiple R-squared:  0.2261,    Adjusted R-squared:  0.004997 
F-statistic: 1.023 on 2 and 7 DF,  p-value: 0.4077

1 answer:

Answer 0 (score: 1)

You've confused things by naming the object returned by datadist() "datadist" as well. Try this:

d <- datadist(dat)
options(datadist = "d")  ## don't need quotes around argument name ...
mod <- ols(log(value) ~ x1 + x2, data = dat, x = TRUE, y = TRUE)

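With all due respect to Frank Harrell, the author of rms (whose book Regression Modeling Strategies is great), this way of saving options (i.e., storing a name and retrieving the object from the environment by that name) is a bit fragile and likely to get messed up in situations like this one.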
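To illustrate why a name-based lookup can be fragile, here is a minimal base-R sketch. It is not rms's actual code: `user_env`, `pkg_env`, and `lookup_by_name()` are hypothetical stand-ins, and the assumption that the stored name is resolved from inside the package (where a function called `datadist` is visible first) is only one plausible explanation of the error.

```r
# Hypothetical stand-ins: user_env plays the user's workspace,
# pkg_env plays a package namespace whose parent is that workspace.
user_env <- new.env()
pkg_env  <- new.env(parent = user_env)

# The "package" defines a function named datadist.
pkg_env$datadist <- function(x) list(limits = range(x))

# The user stores the same summary object under two different names.
user_env$datadist <- list(limits = c(0, 1))  # clashes with the function name
user_env$d        <- list(limits = c(0, 1))  # no clash

# Resolve a stored name, starting the search inside the package.
lookup_by_name <- function(name) get(name, envir = pkg_env)

found_clash <- lookup_by_name("datadist")  # finds the function first
found_ok    <- lookup_by_name("d")         # falls through to the user's object

is.function(found_clash)  # TRUE
is.list(found_ok)         # TRUE

# Subsetting the function with $ fails, much like the reported error.
err <- tryCatch(found_clash$limits, error = function(e) conditionMessage(e))
```

Under these assumptions, choosing a name that no function uses (such as `d`) lets the lookup reach the intended object.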

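This kind of problem is also why experienced R users advise against masking function names (datadist()) by giving other objects the same name (datadist). R is usually smart enough to figure out what you mean, but in the relatively rare cases where it gets confused, the symptoms are usually obscure and hard to debug. (It is also why you shouldn't name your data frame data or df ...)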
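A small base-R sketch of the masking behaviour described above, using `sum` rather than anything from rms. When a name appears in call position, R looks for a function and skips non-function objects, which is why masking usually goes unnoticed; when the same name is used as a value or fetched by name, the masking object is what comes back.

```r
sum <- c(1, 2, 3)                  # a vector that shares its name with base::sum

total <- sum(sum)                  # call position: base::sum is still found
value_is_function <- is.function(sum)        # FALSE: as a value, sum is the vector

fetched    <- get("sum")                     # lookup by name returns the vector
fetched_fn <- get("sum", mode = "function")  # restricting the mode finds base::sum

rm(sum)                            # remove the masking object again
```

Here `total` is 6 because the call still reaches `base::sum`, while `get("sum")` without a mode restriction hands back the numeric vector.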