我是R和统计数据的新手,并且无法弄清楚如何在运行多元线性回归后绘制预测值与实际值的关系。我遇到过类似的问题(只是无法理解代码)。如果您解释一下代码,我将不胜感激。 这是我到目前为止所做的:
# Attach file containing variables and responses
q <- read.csv("C:/Users/A/Documents/Design.csv")
attach(q)
# Run a linear regression
model <- lm(qo~P+P1+P4+I)
# Summary of linear regression results
summary(model)
预测与实际的关系是如此,我可以用图形方式看到我的回归在我的实际数据中的适合程度。
答案 0 :(得分:8)
如果你提供reproducible example会更好,但这是我编写的一个例子:
set.seed(101)
dd <- data.frame(x=rnorm(100),y=rnorm(100),
z=rnorm(100))
dd$w <- with(dd,
rnorm(100,mean=x+2*y+z,sd=0.5))
使用data
参数(更好) - 你几乎不应该使用attach()
..
m <- lm(w~x+y+z,dd)
plot(predict(m),dd$w,
xlab="predicted",ylab="actual")
abline(a=0,b=1)
答案 1 :(得分:4)
答案 2 :(得分:0)
一种简单的方法是使用modelsummary :: augment():
library(tidyverse)
library(cowplot)
library(modelsummary)
set.seed(101)
# Using Ben's data above:
dd <- data.frame(x=rnorm(100),y=rnorm(100),
z=rnorm(100))
dd$w <- with(dd,rnorm(100,mean=x+2*y+z,sd=0.5))
m <- lm(w~x+y+z,dd)
m %>% augment() %>%
ggplot() +
geom_point(aes(.fitted, w)) +
geom_smooth(aes(.fitted, w), method = "lm", se = FALSE, color = "lightgrey") +
labs(x = "Actual", y = "Fitted") +
theme_bw()
这特别适用于深度嵌套的回归列表。
为说明这一点,请考虑一些嵌套的回归列表:
Reglist <- list()
Reglist$Reg1 <- dd %>% do(reg = lm(as.formula("w~x*y*z"), data = .)) %>% mutate( Name = "Type 1")
Reglist$Reg2 <- dd %>% do(reg = lm(as.formula("w~x+y*z"), data = .)) %>% mutate( Name = "Type 2")
Reglist$Reg3 <- dd %>% do(reg = lm(as.formula("w~x"), data = .)) %>% mutate( Name = "Type 3")
Reglist$Reg4 <- dd %>% do(reg = lm(as.formula("w~x+z"), data = .)) %>% mutate( Name = "Type 4")
现在,上述整洁的绘图框架的力量得以发挥……
Graph_Creator <- function(Reglist){
Reglist %>% pull(reg) %>% .[[1]] %>% augment() %>%
ggplot() +
geom_point(aes(.fitted, w)) +
geom_smooth(aes(.fitted, w), method = "lm", se = FALSE, color = "lightgrey") +
labs(x = "Actual", y = "Fitted",
title = paste0("Regression Type: ", Reglist$Name) ) +
theme_bw()
}
Reglist %>% map(~Graph_Creator(.)) %>%
cowplot::plot_grid(plotlist = ., ncol = 1)
答案 3 :(得分:0)
与@Ben Bolker 的解决方案相同,但获取 ggplot 对象而不是使用基础 R
#first generate the dd data set using the code in Ben's solution, then...
require(ggpubr)
m <- lm(w~x+y+z,dd)
ggscatter(x = "prediction",
y = "actual",
data = data.frame(prediction = predict(m),
actual = dd$w)) +
geom_abline(intercept = 0,
slope = 1)