Question

I am trying to filter my data so that only people who appear more than once remain. In my scenario, I want to go from

ID    LUNCH
1     Sandwich
2     Cheese
3     Soup
1     Salad

to

ID     LUNCH
1      Sandwich
1      Salad

because the only ID that appears more than once is 1.

Answer 1

A base R solution:

subset(df, ID %in% ID[duplicated(ID)])

#   ID    LUNCH
# 1  1 Sandwich
# 4  1    Salad

The dplyr version:

library(dplyr)

df %>%
  filter(ID %in% ID[duplicated(ID)])
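
Both versions rely on the same index expression, so it may help to see it evaluated step by step. The following is a minimal base R sketch on the sample data; the intermediate names `dup_flags`, `repeated_ids`, and `result` are introduced here for illustration only.

```r
# Sample data from the question
df <- data.frame(ID = c(1L, 2L, 3L, 1L),
                 LUNCH = c("Sandwich", "Cheese", "Soup", "Salad"),
                 stringsAsFactors = FALSE)

# Step 1: duplicated() marks an ID only from its second occurrence onward
dup_flags <- duplicated(df$ID)

# Step 2: the IDs that were flagged are the ones that repeat
repeated_ids <- df$ID[dup_flags]

# Step 3: %in% matches every row with a repeated ID, including the first one
result <- df[df$ID %in% repeated_ids, ]
print(result)
```

The `%in%` lookup in step 3 is what brings back the first occurrence of each repeated ID, which `duplicated()` on its own does not flag.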

Data

df <- structure(list(ID = c(1L, 2L, 3L, 1L), LUNCH = c("Sandwich", 
"Cheese", "Soup", "Salad")), class = "data.frame", row.names = c(NA, -4L))

Answer 2

You can try this:

library(dplyr)

#Data
df <- structure(list(ID = c(1L, 2L, 3L, 1L), LUNCH = c("Sandwich", 
"Cheese", "Soup", "Salad")), class = "data.frame", row.names = c(NA, 
-4L))
#Code
# Count rows per ID, join the counts back, keep IDs seen more than once
df %>% left_join(df %>% group_by(ID) %>% summarise(N=n()), by = "ID") %>%
  filter(N>1) %>% select(-N)

  ID    LUNCH
1  1 Sandwich
2  1    Salad

Answer 3

You can try this:

With the dplyr package:

# duplicated() alone skips the first occurrence, so also scan from the end
newdata <- filter(data, duplicated(ID) | duplicated(ID, fromLast = TRUE))
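
One detail worth checking with this approach: `duplicated()` on its own does not flag the first occurrence of a repeated value, so used alone as a filter it would return only the `Salad` row. A small base R sketch of the difference, using the IDs from the question (the names `ids`, `later_only`, and `all_rows` are illustrative):

```r
ids <- c(1L, 2L, 3L, 1L)

# Scanning forward flags only the second and later occurrences
later_only <- duplicated(ids)

# Combining forward and backward scans flags every occurrence of a repeated ID
all_rows <- duplicated(ids) | duplicated(ids, fromLast = TRUE)

print(later_only)  # FALSE FALSE FALSE  TRUE
print(all_rows)    #  TRUE FALSE FALSE  TRUE
```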

Answer 4

Another dplyr solution:

library(dplyr)

df %>%
    group_by(ID) %>%      # group by ID, the column whose frequency matters
    filter(n() > 1) %>%   # keep the rows of groups with more than one row
    ungroup()
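
The count-per-group idea can also be sketched in base R with an adjustable threshold, which may be useful when "more than once" later becomes "at least n times". This is only a sketch on the question's sample data; `min_count`, `n_per_row`, and `kept` are hypothetical names introduced here.

```r
# Sample data from the question
df <- data.frame(ID = c(1L, 2L, 3L, 1L),
                 LUNCH = c("Sandwich", "Cheese", "Soup", "Salad"),
                 stringsAsFactors = FALSE)

min_count <- 2L  # hypothetical threshold: keep IDs seen at least this many times

# ave() applies length() within each ID group and returns one count per row
n_per_row <- ave(seq_along(df$ID), df$ID, FUN = length)

# Keep the original rows whose ID meets the threshold
kept <- df[n_per_row >= min_count, ]
print(kept)
```

Because the counts are aligned with the rows of `df`, the filter returns the original rows rather than a summary table of counts.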

Filter data by frequency
