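Search for a list of strings within a string and return the matches

Asked: 2018-05-05 17:01:39

Tags: r string dataframe character

In R, I want to do what the title says: search a character column and return the word that matches. Given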

as.data.frame(
    c("yellow carrot","big car","green tomato","orange car","fertile goat","red snapper")
    )

and

c("yellow","red","orange","green","blue")

I want to return

as.data.frame(
    cbind(
        c("yellow carrot","big car","green tomato","orange car","fertile goat","red snapper"),
        c("yellow","NA","green","orange","NA","red")
        )
    )

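3 answers:

Answer 0 (score: 0)

We can use str_extract to get the matching substring. Collapsing the vector with "|" builds a single regex alternation, and str_extract returns the first match in each string, or NA when nothing matches.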

library(stringr)
df1$new <- str_extract(df1[[1]], paste(vec1, collapse="|")) 
df1$new
#[1] "yellow" NA       "green"  "orange" NA       "red"   

Data

vec1 <- c("yellow","red","orange","green","blue")
df1 <- data.frame(col1 = c("yellow carrot","big car",
  "green tomato","orange car","fertile goat","red snapper"))
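
For readers who would rather not load stringr, the same idea can be sketched in base R with regexpr and regmatches. This is only an illustrative sketch, not part of the original answer: the \\b word-boundary anchors and the variable names pattern, m, and new are assumptions added here, so that a color such as "red" would not match inside a longer word.

```r
# Same sample data as in the answer above
vec1 <- c("yellow", "red", "orange", "green", "blue")
col1 <- c("yellow carrot", "big car", "green tomato",
          "orange car", "fertile goat", "red snapper")

# One alternation pattern; the \\b anchors are an added assumption
# that restricts matches to whole words
pattern <- paste0("\\b(", paste(vec1, collapse = "|"), ")\\b")

# regexpr gives the position of the first match per string, or -1 if none
m <- regexpr(pattern, col1, perl = TRUE)

# Start with all NA, then fill in only the strings that matched;
# regmatches returns one substring per successful match
new <- rep(NA_character_, length(col1))
new[m != -1] <- regmatches(col1, m)
new
```

The result can be assigned to a data frame column in the same way as in the stringr version, for example df1$new <- new.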

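Answer 1 (score: 0)

Use dplyr with nested ifelse statements. Because grepl matches anywhere in the string, this also works when the color is not at the beginning.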

library(dplyr)

data.frame(
    vary_1 = c(
        "yellow carrot",
        "big car",
        "green tomato",
        "orange car",
        "fertile goat",
        "red snapper"
    )
) %>%
    # Check each color in turn; the first pattern that matches wins
    mutate(new = ifelse(grepl('yellow', vary_1), 'yellow',
        ifelse(grepl('green', vary_1), 'green',
            ifelse(grepl('red', vary_1), 'red',
                ifelse(grepl('orange', vary_1), 'orange',
                    NA)))))

         vary_1    new
1 yellow carrot yellow
2       big car   <NA>
3  green tomato  green
4    orange car orange
5  fertile goat   <NA>
6   red snapper    red

Answer 2 (score: 0)

A base R solution using grepl:

# Sample data
df <- data.frame(V1 = c("yellow carrot","big car","green tomato","orange car","fertile goat","red snapper"))
s <- c("yellow","red","orange","green","blue")

df$new <- apply(df, 1, function(x)
    ifelse(length(ret <- s[sapply(s, function(y) grepl(y, x))]) > 0, ret, NA))
df;
#             V1    new
#1 yellow carrot yellow
#2       big car   <NA>
#3  green tomato  green
#4    orange car orange
#5  fertile goat   <NA>
#6   red snapper    red