Loop through a list of regex matches and grab the first capture group inside the loop

Date: 2015-12-06 08:28:49

Tags: regex

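I'm trying to loop through regex results and put the first capture group of each match into a variable that gets processed inside the loop, but I can't figure out how to do it. Here is what I have so far, but it only prints the second match: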

aQuote = "The big boat has a big assortment of big things."
theMatches = regmatches(aQuote, gregexpr("big ([a-z]+)", aQuote ,ignore.case = TRUE))

results = lapply(theMatches, function(m){
    capturedItem = m[[2]]
    print(capturedItem)
})

Right now this prints

[1] "big assortment"

What I want it to print is

[1] boat
[1] assortment
[1] things
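The single line of output can be traced by inspecting the structure of theMatches. The following is a minimal sketch that reuses the variables from the question: regmatches() returns a list with one element per input string, so lapply() runs the function only once, and m[[2]] is simply the second full match in that element.

```r
aQuote <- "The big boat has a big assortment of big things."
theMatches <- regmatches(aQuote, gregexpr("big ([a-z]+)", aQuote, ignore.case = TRUE))

# One list element per input string, so the list has length 1 here.
print(length(theMatches))

# That single element is a character vector holding all three full matches.
print(theMatches[[1]])

# lapply() passes that vector as m, so m[[2]] is the second full match.
print(theMatches[[1]][[2]])
```

Note that the vector holds full matches ("big boat" and so on), not the contents of the capture group.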

2 answers:

Answer 0: (score: 2)

Try this:

regmatches(aQuote, gregexpr("(?<=big )[a-z]+", aQuote ,ignore.case = TRUE,perl=TRUE))[[1]]
#[1] "boat"       "assortment" "things"
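To process each word inside a loop, as the question asks, the resulting vector can be iterated directly. This is a small sketch; the names words and capturedItem are chosen for illustration. The pattern relies on the lookbehind (?<=big ), which requires "big " before the word without including it in the match, and this syntax needs perl = TRUE.

```r
aQuote <- "The big boat has a big assortment of big things."

# (?<=big ) asserts that "big " precedes the word but keeps it out of the match.
words <- regmatches(
  aQuote,
  gregexpr("(?<=big )[a-z]+", aQuote, ignore.case = TRUE, perl = TRUE)
)[[1]]

# Each element of words is one captured word; process it inside the loop.
for (capturedItem in words) {
  print(capturedItem)
}
```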

Answer 1: (score: 1)

Include the g (global) modifier in your code as well. The equivalent regex in Perl/JavaScript is: /big ([a-z]+)/ig

Sample Perl program:

$aQuote =  "The big boat has a big assortment of big things.";
print $1."\n" while ($aQuote =~ /big ([a-z]+)/ig);

JS fiddle here

Edit: in R, we can write:

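aQuote = "The big boat has a big assortment of big things."
theMatches = regmatches(aQuote, gregexpr("big ([a-z]+)", aQuote, ignore.case = TRUE))

# theMatches has one element per input string; each element is a
# character vector of matches, so loop over that vector.
results = lapply(theMatches, function(m){
    # seq_along() is safe even when there are no matches (1:0 would run twice)
    for (i in seq_along(m))
    {
        # m[[i]] is the full match, e.g. "big boat", not just the capture group
        print(m[[i]])
    }
})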

R fiddle here
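The loop above prints the full matches ("big boat" and so on), because regmatches() combined with gregexpr() does not return capture groups. One possible way to get only the first capture group while keeping the original pattern is to run regexec() on each full match. This is a sketch, not the only approach; the names pattern, fullMatches, groups, and captured are introduced here for illustration.

```r
aQuote <- "The big boat has a big assortment of big things."
pattern <- "big ([a-z]+)"

# Step 1: collect all full matches in the string.
fullMatches <- regmatches(aQuote, gregexpr(pattern, aQuote, ignore.case = TRUE))[[1]]

# Step 2: regexec() records capture-group positions, so for each full match
# regmatches() returns a vector of the form c(full match, group 1).
groups <- regmatches(fullMatches, regexec(pattern, fullMatches, ignore.case = TRUE))

# Step 3: keep the second element of each vector, i.e. the first capture group.
captured <- vapply(groups, function(g) g[2], character(1), USE.NAMES = FALSE)

for (capturedItem in captured) {
  print(capturedItem)
}
```

This costs a second regex pass over each match, but it works with any pattern that contains a capture group and does not require rewriting the pattern as a lookbehind.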