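How to extract specific residual data from a linear model in R

Time: 2016-03-23 03:53:40

Tags: r dplyr broom

How can I extract the residuals for a specific baseball team from the linear model below? For example, how do I extract the residuals for "CLE"?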

library(Lahman)
library(dplyr)
library(broom)

# create baseball team data
data(Teams)
teams <- Teams
teams <- teams %>% mutate(win_percentage = (W / (W + L)) * 100)

# summarize baseball team salary by year
salaries <- Salaries
salaries <- salaries %>% 
  group_by(teamID, yearID, lgID) %>%
  summarise(payroll_M = sum(as.numeric(salary)) / 10^6) %>% 
  ungroup()

# add winning percentage to the salary table
salaries <- teams %>% 
  select(yearID, teamID, win_percentage) %>% 
  right_join(salaries, by = c("yearID", "teamID"))

# compute linear model of winning vs team salary
model <- salaries %>% 
  group_by(yearID) %>%
  do(fit = augment(lm(win_percentage ~ payroll_M, data = .)))

# extract residuals for Cleveland ??????

1 answer:

Answer 0 (score: 3)

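You are close, but you need to make two changes to the augment line.

  1. You are saving the resulting (augmented) data frame into a column named fit. Instead, pass it directly to do (remove fit =).

  2. The augment function needs to keep the teamID column as part of the resulting data, even though it is not in the model. Note that augment takes a second argument, data, for this purpose (see help(augment.lm) for details).

So the new line looks like: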

    do(augment(lm(win_percentage ~ payroll_M, data = .), data = .))
    

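The resulting data frame has one row per original observation and includes teamID along with the residuals and fitted values, which lets you filter for CLE.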
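
As a minimal sketch of the final filtering step, the snippet below applies the corrected do(augment(...)) line to a small made-up data frame and then keeps only the CLE rows. The toy data frame and the names toy, augmented, and cle_resid are assumptions for illustration; with the real data you would use the joined salaries table from the question instead. The .fitted and .resid columns are the ones augment adds for lm models.

```r
library(dplyr)
library(broom)

# Hypothetical toy data standing in for the joined `salaries` table;
# the numbers are made up for illustration only.
toy <- data.frame(
  yearID         = rep(c(2014, 2015), each = 4),
  teamID         = rep(c("BOS", "CLE", "NYA", "OAK"), times = 2),
  payroll_M      = c(140, 85, 200, 75, 180, 88, 215, 80),
  win_percentage = c(44, 52, 52, 54, 48, 50, 54, 42)
)

# Fit one model per year; `data = .` keeps the original columns
# (including teamID) next to the fitted values and residuals.
augmented <- toy %>%
  group_by(yearID) %>%
  do(augment(lm(win_percentage ~ payroll_M, data = .), data = .)) %>%
  ungroup()

# Keep only Cleveland's rows and the columns of interest.
cle_resid <- augmented %>%
  filter(teamID == "CLE") %>%
  select(yearID, teamID, win_percentage, .fitted, .resid)

print(cle_resid)
```

Each row of cle_resid holds Cleveland's residual for one year, that is, the observed win_percentage minus the value fitted by that year's model.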
