Extracting substrings from a string with a regular expression

Time: 2013-09-12 23:12:11

Tags: regex r

I have a string:

s <- "test.test AS field1, ablh.blah AS field2, faslk.lsdf AS field3"

that I would like to convert to:

"field1, field2, field3"

I know that the regular expression (\w+)(?:,|$) matches the strings I want ('field1,' and so on), but I can't figure out how to use gsub to extract them.

2 answers:

Answer 0 (score: 10)

## Preparation
s <- "test.test AS field1, ablh.blah AS field2, faslk.lsdf AS field3"
pat <- "(\\w+)(?:,|$)"  ## Note the doubly-escaped \\w

## Use the powerful gregexpr/regmatches one-two punch
m <- gregexpr(pat, s)
paste(regmatches(s, m)[[1]], collapse=" ")
# [1] "field1, field2, field3"
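
Since the question asks about gsub specifically, here is a minimal sketch of a substitution-based alternative. It assumes every entry has the shape "something AS alias", with single spaces around AS and no spaces inside the part before it; that assumption is mine and is not stated in the question. Instead of extracting the aliases, it deletes everything up to and including each " AS ".

```r
s <- "test.test AS field1, ablh.blah AS field2, faslk.lsdf AS field3"

## Remove each run of non-space characters followed by " AS ",
## which leaves only the aliases and the separators between them.
## (Assumes the "something AS alias" shape described above.)
result <- gsub("\\S+ AS ", "", s)
print(result)
```

If the part before AS can itself contain spaces, this pattern is too narrow, and the gregexpr/regmatches approach above is the safer choice.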

Answer 1 (score: 0)

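strapplyc in the gsubfn package can do this with a particularly simple regular expression that extracts each run of word characters following " AS ". (If a field can contain non-word characters, replace \\w with an appropriate expression, such as any character that is not a space or comma: [^ ,].)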

> library(gsubfn)
> strapplyc(s, " AS (\\w+)", simplify = toString)[[1]]
[1] "field1, field2, field3"
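
If installing gsubfn is not an option, roughly the same idea can be sketched in base R. This is an illustration rather than part of the answer above: it uses a Perl-style lookbehind to require " AS " before each match, together with the [^ ,] character class mentioned earlier, so it assumes aliases never contain spaces or commas.

```r
s <- "test.test AS field1, ablh.blah AS field2, faslk.lsdf AS field3"

## (?<= AS ) is a lookbehind: it requires " AS " immediately before the
## match without making it part of the match. Lookbehind needs perl = TRUE.
m <- gregexpr("(?<= AS )[^ ,]+", s, perl = TRUE)

## regmatches returns a list with one element per input string.
fields <- regmatches(s, m)[[1]]

## toString joins the pieces with ", ".
toString(fields)
```

Here fields is a character vector holding the individual aliases, which is handy if you need them separately before joining them back together.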