Question

I have a simple data frame that looks like this:

x <- c("aa", "aa", "aa", "bb", "cc", "cc", "cc")
y <- c(101, 102, 113, 201, 202, 344, 407)
df = data.frame(x, y)    

    x   y
1   aa  101
2   aa  102
3   aa  113
4   bb  201
5   cc  202
6   cc  344
7   cc  407

I want to use dplyr::filter() and a RegEx to filter out all observations whose y value starts with 1.

I imagine the code would look something like this:

df %>%
  filter(y != grep("^1"))

But I get: Error in grep("^1") : argument "x" is missing, with no default

Answer 1

You need to double-check the documentation for grepl and filter.

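For grep/grepl you also have to supply the vector to search in (y in this case), and filter takes a logical vector, so you need grepl rather than grep. If you want to supply a vector of row indices (from grep), you can use slice instead.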

df %>% filter(!grepl("^1", y))

Or, using the indices returned by grep:

df %>% slice(grep("^1", y, invert = TRUE))
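
To make the difference between the two functions concrete, here is a minimal base-R sketch using the y vector from the question. The variable names idx, hit and keep_idx are only illustrative.

```r
y <- c(101, 102, 113, 201, 202, 344, 407)

# grep() returns the integer positions of the elements that match
idx <- grep("^1", y)
# grepl() returns one TRUE/FALSE per element, which is what filter() expects
hit <- grepl("^1", y)

print(idx)
print(hit)

# invert = TRUE returns the positions that do NOT match. This is safer than
# negating idx, because y[-idx] returns nothing when idx is empty.
keep_idx <- grep("^1", y, invert = TRUE)

# Both approaches select the same elements
print(identical(y[keep_idx], y[!hit]))
```

The numeric y is coerced to character before matching, which is why the pattern "^1" works on numbers here.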

But you can also use substr, since you are only interested in the first character:

df %>% filter(substr(y, 1, 1) != 1)
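
The substr comparison relies on implicit type coercion, which the following base-R sketch spells out. The startsWith variant is an extra alternative not mentioned in the answer, and the variable names are only illustrative.

```r
y <- c(101, 102, 113, 201, 202, 344, 407)

# substr() coerces the numeric y to character, so the result is a character vector
first_char <- substr(y, 1, 1)
print(first_char)

# Comparing a character vector with the number 1 coerces 1 to "1",
# so both comparisons below give the same logical vector
keep_implicit <- first_char != 1
keep_explicit <- first_char != "1"

# startsWith() (base R, 3.3.0 or later) needs character input, so convert first
keep_starts <- !startsWith(as.character(y), "1")

print(identical(keep_implicit, keep_explicit))
print(identical(keep_explicit, keep_starts))
```

Writing the comparison against "1" instead of 1 gives the same result and makes the string comparison explicit.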

Answer 2

Combining dplyr and stringr (to stay within the tidyverse), you can do this:

df %>% filter(!str_detect(y, "^1"))

This works because str_detect returns a logical vector.
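
For completeness, here is a self-contained sketch of that pipeline, assuming dplyr and stringr are installed. The explicit as.character() call is a defensive assumption of this sketch, not part of the answer above, and the names mask and result are illustrative.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  x = c("aa", "aa", "aa", "bb", "cc", "cc", "cc"),
  y = c(101, 102, 113, 201, 202, 344, 407)
)

# One TRUE/FALSE per row: TRUE where y starts with "1"
mask <- str_detect(as.character(df$y), "^1")
print(mask)

# Keep only the rows where the pattern does not match
result <- df %>% filter(!str_detect(as.character(y), "^1"))
print(result)
```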

Regular expressions (RegEx) and dplyr::filter()
