Filtering join with multiple conditions using dplyr

Time: 2019-04-01 09:32:30

Tags: r dplyr dbplyr

I am trying to accomplish the operation described below by creating a df named event_f.

Using the detail df as the filter criteria, I want every event_id that has type_id == 6, excluding those that have a 6 and 3 or a 6 and 7 combination.

Note that other combinations can occur, and those should all be included.

library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.5.3
#> Warning: package 'purrr' was built under R version 3.5.3

event <- tibble(id = c("00_1", "00_2", "00_3", "00_4", "00_5", "00_6", "00_7"),
                type_id = c("A", "B", "C", "B", "A", "B", "C"))

detail <- tibble(id = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L),
                 event_id = c("00_1", "00_1", "00_2", "00_2", "00_3", "00_4",
                              "00_4", "00_5", "00_6", "00_6", "00_7", "00_8"),
                 type_id = c(3L, 4L, 6L, 7L, 2L, 6L, 3L, 2L, 6L, 5L, 2L, 1L))

event_f <- event %>%
  semi_join(detail %>%
              filter(event_id %in% event$id,
                     type_id == 6,
                     type_id != (7 | 3)),
            by = c("id" = "event_id"))

Created on 2019-04-01 by the reprex package (v0.2.1)

I would like to end up with a df that has a single row, the one with id = "00_6". I guess the problem lies in the last two filter operations, but I am not sure how to combine them?

1 answer:

Answer 0 (score: 1)

I think you need

library(dplyr)

event %>%
   semi_join(detail %>%
               group_by(event_id) %>%
               filter(any(type_id == 6) & all(!type_id %in% c(3, 7))),
    by = c("id" = "event_id"))

# id    type_id
# <chr> <chr>  
#1 00_6  B     

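We group_by event_id because we are trying to find each event_id whose type_id values satisfy the required condition. If we don't group_by, the filter condition is applied to the entire data frame, and since the data frame as a whole contains the values 3 and 7, it would return 0 rows.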
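To make the difference concrete, here is a minimal sketch that runs the same condition with and without group_by on the detail data from the question. The names ungrouped and grouped are only illustrative. The last two lines are my own reading of why the original type_id != (7 | 3) did not exclude 3 and 7, not something stated in the question.

```r
library(dplyr)

# Same detail data as in the question
detail <- tibble::tibble(
  id       = 1:12,
  event_id = c("00_1", "00_1", "00_2", "00_2", "00_3", "00_4",
               "00_4", "00_5", "00_6", "00_6", "00_7", "00_8"),
  type_id  = c(3L, 4L, 6L, 7L, 2L, 6L, 3L, 2L, 6L, 5L, 2L, 1L)
)

# Ungrouped: any() and all() see the whole type_id column.
# The column contains 3 and 7, so all(...) is FALSE and no row survives.
ungrouped <- detail %>%
  filter(any(type_id == 6) & all(!type_id %in% c(3, 7)))
nrow(ungrouped)
#> [1] 0

# Grouped: the condition is evaluated once per event_id, so only events
# that have a 6 and neither a 3 nor a 7 are kept.
grouped <- detail %>%
  group_by(event_id) %>%
  filter(any(type_id == 6) & all(!type_id %in% c(3, 7))) %>%
  ungroup()
unique(grouped$event_id)
#> [1] "00_6"

# `7 | 3` is a logical OR of two non-zero numbers, so it is TRUE.
# TRUE is coerced to 1 in the comparison, so the original test was
# effectively type_id != 1.
7 | 3
#> [1] TRUE
identical(detail$type_id != (7 | 3), detail$type_id != 1)
#> [1] TRUE
```

Passing grouped to semi_join with by = c("id" = "event_id") then keeps only the 00_6 row of event, as in the output above.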