Question

I have a column of text like the following:

str1 = "ABCID 123456789 is what I'm looking for, could you help me to check this Item's status?"

I want to use the gsub function in R to extract "ABCID 123456789" from it. The number can vary from row to row, but ABCID is constant. Does anyone know a solution? Thanks very much!

Answer 1

We can use str_extract to match the fixed word, followed by a space and one or more digits (\\d+):

library(stringr)
str_extract(df1$col1, "ABCID \\d+")

If a string can contain more than one instance, use str_extract_all, which returns a list with all matches for each element:

str_extract_all(df1$col1, "ABCID \\d+")

Note: the OP stated that the goal is to extract "ABCID 123456789" from the text.
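For readers without stringr installed, the same two extractions can be sketched in base R with regexpr/gregexpr and regmatches. This is a swapped-in base R technique, not the stringr calls above. The vector x is a hypothetical stand-in for df1$col1, and its second and third elements are made up for illustration.

```r
# Hypothetical input standing in for df1$col1 (elements 2 and 3 are invented)
x <- c(
  "ABCID 123456789 is what I'm looking for, could you help me to check this Item's status?",
  "Compare ABCID 111 with ABCID 222",
  "no id here"
)

pattern <- "ABCID \\d+"

# First match per element; elements without a match stay NA
m <- regexpr(pattern, x)
first_id <- rep(NA_character_, length(x))
first_id[m != -1] <- regmatches(x, m)

# All matches per element, as a list of character vectors
all_ids <- regmatches(x, gregexpr(pattern, x))

print(first_id)
print(all_ids)
```

regmatches with a regexpr result drops the elements that did not match, which is why the matches are assigned back into a pre-filled NA vector.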

Answer 2

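Match the literal ABCID at the start of the string (^), a space, one or more digits (\\d+), and everything after that (.*), then replace the whole match with the captured part, i.e. the part in parentheses. Note that sub is the right choice here rather than gsub, because only one replacement is made per string.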

sub("^(ABCID \\d+).*", "\\1", str1)
## [1] "ABCID 123456789"
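One thing to keep in mind with this approach: sub returns the input unchanged when the pattern does not match, so a string without an ID comes back whole rather than as NA. A minimal sketch of one way to guard against that is below. The helper name extract_id and the extra test strings are hypothetical, and the pattern is loosened with a lazy prefix on the assumption that the ID may not be at the start of the string.

```r
# Hypothetical helper: returns the ID, or NA when no ID is present
extract_id <- function(x) {
  # Lazy prefix (.*?) lets the ID appear anywhere; perl = TRUE selects PCRE
  pattern <- "^.*?(ABCID \\d+).*$"
  ifelse(grepl(pattern, x, perl = TRUE),
         sub(pattern, "\\1", x, perl = TRUE),
         NA_character_)
}

# Invented inputs, apart from the first, which is str1 from the question
x <- c(
  "ABCID 123456789 is what I'm looking for, could you help me to check this Item's status?",
  "no id here",
  "Please check ABCID 42 today"
)

ids <- extract_id(x)
print(ids)
```

If the ID is guaranteed to start every string, the original anchored pattern can be kept inside the same grepl guard.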

Answer 3

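If the number always has a fixed length (9 digits), a positive lookbehind can be used. A PCRE lookbehind must have a fixed length, which is why \\d{9} is used here rather than \\d+: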

sub("(?<=ABCID \\d{9}).*", "", str1, perl = TRUE)
# [1] "ABCID 123456789"
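When the number of digits is not fixed, the lookbehind cannot be written with \\d+. One possible alternative, offered as a sketch rather than as part of the original answer, is PCRE's \\K, which discards everything matched so far from the reported match. The second input string is invented for illustration.

```r
str1 <- "ABCID 123456789 is what I'm looking for, could you help me to check this Item's status?"
str2 <- "ABCID 42 has a shorter number"  # invented example with fewer digits

# \\K resets the start of the match, so only the text after the ID is replaced
res1 <- sub("ABCID \\d+\\K.*", "", str1, perl = TRUE)
res2 <- sub("ABCID \\d+\\K.*", "", str2, perl = TRUE)

print(res1)
print(res2)
```

Like the lookbehind version, this relies on perl = TRUE and leaves any text before the ID in place.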