Can you filter at multiple levels?

Time: 2020-03-20 15:16:38

Tags: r dplyr filtering

I have a dataset called Behavioral (a sample of a larger dataset; each subject has 800 Stim.RT and Stim.ACC values per session).

[Image: sample of the Behavioral dataset]

I want to get the overall mean Stim.RT and mean Stim.ACC for each Text category. Normally I would do this:

Dataset<-Behavioral%>%
  select(Subject, Session, Stim.ACC, Stim.RT, Text) %>%
  group_by(Text) %>%
  summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT))

It returns something like this:

[Image: summary table with one row per Text category]

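The only problem is that I want to filter out every Subject-Session pair whose mean.ac is below .50 before computing that second table. That is, if Subject 1 has a mean.ac of 0.45 in Session 1, I want all of that subject's Session 1 values removed.

I tried: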

Dataset<-Behavioral%>%
  select(Subject, Session, Stim.ACC, Stim.RT, Text) %>%
  group_by(Subject, Session) %>%
  summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT))%>%
  group_by(Text)

I get this error: Error: Column `Text` is unknown

2 Answers:

Answer 0: (score: 1)

library(dplyr)

Behavioral %>%
  select(Subject, Session, Stim.ACC, Stim.RT, Text) %>%
  # Mean accuracy and RT for each Subject-Session pair
  group_by(Subject, Session) %>%
  summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT)) %>%
  ungroup() %>%
  # Keep only the pairs whose mean accuracy is at least 0.5
  filter(mean.ac >= 0.5) %>%
  select(Subject, Session) %>%
  # Join back to the original data to recover the rows of the surviving pairs
  inner_join(Behavioral, by = c("Subject" = "Subject", "Session" = "Session")) %>%
  select(Subject, Session, Stim.ACC, Stim.RT, Text) %>%
  # Summarize the remaining rows by Text
  group_by(Text) %>%
  summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT)) %>%
  ungroup()

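The error occurs because Text is not one of the group_by variables. When summarise follows group_by, the only columns left in the result are the grouping variables and the variables you create in summarise. So in your case,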

Dataset<-Behavioral%>%
select(Subject, Session, Stim.ACC, Stim.RT, Text) %>%
group_by(Subject, Session) %>%
summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT))

will contain only Subject, Session, mean.ac and mean.RT.
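To see this behaviour in isolation, here is a minimal sketch on a made-up data frame. The `toy` object and its values are hypothetical stand-ins for Behavioral, not the asker's data.

```r
library(dplyr)

# Hypothetical toy data standing in for the Behavioral dataset
toy <- data.frame(
  Subject  = c(1, 1, 1, 1, 2, 2, 2, 2),
  Session  = c(1, 1, 2, 2, 1, 1, 2, 2),
  Text     = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Stim.ACC = c(0, 1, 1, 1, 1, 0, 1, 1),
  Stim.RT  = c(500, 520, 480, 510, 450, 470, 430, 440)
)

per_pair <- toy %>%
  group_by(Subject, Session) %>%
  summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT)) %>%
  ungroup()

# Only the grouping columns and the new summaries remain; Text is gone
names(per_pair)
```

Because Text no longer exists in `per_pair`, a later group_by(Text) has nothing to group on.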

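So what I did was filter for mean.ac >= 0.5, as you required, keep only Subject and Session, and inner_join them to the original dataset. The result contains only the Subject and Session pairs that satisfy the condition, because here inner_join acts like a filtering join. I then go on to calculate mean.ac and mean.RT by Text after the filter.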
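A possible alternative that avoids the join is a grouped filter: when filter is applied to grouped data, summary functions such as mean are evaluated per group, so whole Subject-Session pairs can be dropped in one step. This is only a sketch on made-up data (the `toy` data frame is hypothetical), not the method used in the answer's code.

```r
library(dplyr)

# Hypothetical toy data standing in for the Behavioral dataset
toy <- data.frame(
  Subject  = c(1, 1, 1, 1, 2, 2, 2, 2),
  Session  = c(1, 1, 2, 2, 1, 1, 2, 2),
  Text     = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Stim.ACC = c(0, 0, 1, 1, 1, 0, 1, 1),
  Stim.RT  = c(500, 520, 480, 510, 450, 470, 430, 440)
)

result <- toy %>%
  group_by(Subject, Session) %>%
  # mean() is computed within each Subject-Session pair, so every row
  # of a pair is kept or dropped together
  filter(mean(Stim.ACC) >= 0.5) %>%
  # Regroup by Text and summarize the surviving rows
  group_by(Text) %>%
  summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT))

result
```

In this toy data, Subject 1's Session 1 has a mean accuracy of 0, so both of its rows are removed before the per-Text means are computed.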

Answer 1: (score: 0)

It looks like you need three grouping variables.

Dataset<-Behavioral%>%
  select(Subject, Session, Stim.ACC, Stim.RT, Text) %>%
  group_by(Subject, Session, Text) %>%
  summarize(mean.ac = mean(Stim.ACC), mean.RT = mean(Stim.RT))%>%
  filter(mean.ac >= 0.5) %>%
  group_by(Text) %>%
  summarize(mean.ac = mean(mean.ac), mean.RT = mean(mean.RT))

Next time, try to make a reprex so that we can work with your data directly.
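One common way to share data in a question is base R's dput(), which prints code that recreates an object. The sketch below uses a made-up four-row miniature; the values are hypothetical and only the column names come from the question.

```r
# Hypothetical miniature of the Behavioral dataset; all values are made up
Behavioral <- data.frame(
  Subject  = c(1, 1, 2, 2),
  Session  = c(1, 1, 1, 1),
  Text     = c("A", "B", "A", "B"),
  Stim.ACC = c(1, 0, 1, 1),
  Stim.RT  = c(512, 498, 455, 470)
)

# dput() prints R code that rebuilds the object; paste that output
# into the question instead of a screenshot
dput(Behavioral)
```

With real data, something like dput(head(Behavioral, 20)) keeps the pasted output short.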