I'm trying to understand predict()

Posted: 2020-08-18 21:20:33

Tags: r predict

I'm trying to understand how predict() works in R. If I run this R code:

growth <- c(12,10,8,11,6,7,2,3,3)
tannin <- c(0,1,2,3,4,5,6,7,8)
plot(growth, tannin, pch = 20, col = "blue")
model <- lm(growth ~ tannin)
new.d <- data.frame(tannin = seq(0,10, 0.5))
mod2 <- predict(model, newdata = new.d, se.fit = T, interval = "confidence", level = 0.95)
col <- c("red", "green", "green")
plot(tannin, growth, pch = 20, col = "blue")
matlines(new.d$tannin, mod2$fit[,1:3], col = col, lty =2)

I get a scatterplot of the data and some nice confidence interval lines.

But if I run this R code:

y.hat <- c(973.6536, 1620.5231, 882.3643, 1529.2338, 1266.6586, 1281.8735, 1205.7990, 928.0090, 1574.8784, 1297.0883, 1190.5841, 1251.4437, 1187.5412, 1305.3495)
gain <- c(1004, 1636, 852, 1506, 1272, 1270, 1269, 903, 1555, 1260, 1146, 1276, 1225, 1321)
plot(y.hat, gain, pch = 20, col = "blue")
model.2 <- lm(gain ~ y.hat)
new.d.2 <- data.frame(gain = seq(800,1700, length.out = 14))
mod.3 <- predict(model.2, newdata = new.d.2, se.fit = T, interval = "confidence", level = 0.95)
matlines(new.d.2$gain, mod.3$fit[,1:3], col = col, lty = 2)

The scatterplot is fine, but the confidence lines are all strange.

Please help me understand why.

Thanks, Kirk

1 Answer:

Answer 0: (score: 0)

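As @MrFlick pointed out, your predict() call is most likely the problem...

predict() matches the columns of newdata to the predictor names in the model formula. Your model, lm(gain ~ y.hat), has y.hat as its predictor, but the newdata you pass contains only a gain column. If you want to predict from gain, since gain is what you supply to predict(), then the model should be lm(y.hat ~ gain), and your plot will look fine.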
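To see why the original version misbehaves, a quick check along these lines may help. It is only a sketch: `model.orig`, `new.orig`, `predictors`, and `p.new` are names made up here, and it assumes the vectors live in the global workspace, as they do in the question.

```r
y.hat <- c(973.6536, 1620.5231, 882.3643, 1529.2338, 1266.6586, 1281.8735, 1205.7990,
           928.0090, 1574.8784, 1297.0883, 1190.5841, 1251.4437, 1187.5412, 1305.3495)
gain <- c(1004, 1636, 852, 1506, 1272, 1270, 1269, 903, 1555, 1260, 1146, 1276, 1225, 1321)

# The model and newdata exactly as in the question (names are hypothetical)
model.orig <- lm(gain ~ y.hat)
new.orig <- data.frame(gain = seq(800, 1700, length.out = 14))

# Predictor names the model needs vs. the columns actually supplied
predictors <- all.vars(delete.response(terms(model.orig)))
print(predictors)                              # what predict() will look for
print(names(new.orig))                         # what newdata provides
print(setdiff(predictors, names(new.orig)))    # predictors missing from newdata

# y.hat is not in newdata, so R falls back to the y.hat vector in the workspace.
# Both have 14 rows, so there is no warning about mismatched lengths.
p.new <- predict(model.orig, newdata = new.orig)

# Compare the "new" predictions with the fitted values at the original points
print(all.equal(unname(p.new), unname(fitted(model.orig))))
```

If the comparison comes back equal, the "new" predictions are just the fitted values in the original, unsorted order of y.hat. Plotting them against an evenly spaced gain sequence is what produces the zigzag lines.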

# Line colours: fitted line in red, lower and upper confidence bounds in green
col <- c("red", "green", "green")
y.hat <- c(973.6536, 1620.5231, 882.3643, 1529.2338, 1266.6586, 1281.8735, 1205.7990, 928.0090, 1574.8784, 1297.0883, 1190.5841, 1251.4437, 1187.5412, 1305.3495)
gain <- c(1004, 1636, 852, 1506, 1272, 1270, 1269, 903, 1555, 1260, 1146, 1276, 1225, 1321)

# gain is now the predictor, so it matches the column name in newdata
model.2 <- lm(y.hat ~ gain)

# New gain values at which to predict
new.d.2 <- data.frame(gain = seq(800,1700, length.out = 14))
mod.3 <- predict(model.2, newdata = new.d.2, se.fit = TRUE, interval = "confidence", level = 0.95)

# Predictor on the x-axis, response on the y-axis, then overlay fit, lwr and upr
plot(gain, y.hat, pch = 20, col = "blue")
matlines(new.d.2$gain, mod.3$fit[,1:3], col = col, lty = 2)

Created on 2020-08-18 by the reprex package (v0.3.0)
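If you would rather keep gain as the response, as in your original model, another option is to leave lm(gain ~ y.hat) alone and name the newdata column after the predictor instead. This is a sketch of that variant, not something from the original thread; the choice of 50 grid points is arbitrary.

```r
y.hat <- c(973.6536, 1620.5231, 882.3643, 1529.2338, 1266.6586, 1281.8735, 1205.7990,
           928.0090, 1574.8784, 1297.0883, 1190.5841, 1251.4437, 1187.5412, 1305.3495)
gain <- c(1004, 1636, 852, 1506, 1272, 1270, 1269, 903, 1555, 1260, 1146, 1276, 1225, 1321)
col <- c("red", "green", "green")

# Original model direction: gain is the response, y.hat the predictor
model.2 <- lm(gain ~ y.hat)

# The newdata column must carry the predictor's name, y.hat
# (50 grid points is an arbitrary choice for a smooth line)
new.d.2 <- data.frame(y.hat = seq(800, 1700, length.out = 50))
mod.3 <- predict(model.2, newdata = new.d.2, se.fit = TRUE,
                 interval = "confidence", level = 0.95)

# x-axis is the predictor in both the scatterplot and the overlaid lines
plot(y.hat, gain, pch = 20, col = "blue")
matlines(new.d.2$y.hat, mod.3$fit[, 1:3], col = col, lty = 2)
```

Either way, the rule is the same: the column names in newdata have to match the variables on the right-hand side of the formula, and the x values passed to matlines() should be that same predictor.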