Match a limited number of spaces, ignoring those that belong to tags

Time: 2015-05-28 13:36:27

Tags: php regex xml pcre

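I am trying to create a regular expression that matches the last five "words" of the input, where a "word" is defined as anything matching [^ ]+[^ ]*<[^>]*>[^ ]* (so any space acts as a delimiter, but spaces between < and > count as letters).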

I tried:

/([^ ]+(?:(?<!<[^>]+) +(?![^<]*>)(?:.*?)){0,4})$/

But it gives me an error saying that the lookbehind must be fixed length.
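The error comes from the lookbehind (?<!<[^>]+): its body can match any number of characters, and PCRE rejects a lookbehind whose length has no upper bound. The following is a minimal sketch of how to confirm this; the helper name patternCompiles and the second, fixed-length pattern are assumptions added for illustration and are not part of the original question.

```php
<?php
// Hypothetical helper: true if PCRE accepts the pattern, false if it fails to compile.
function patternCompiles(string $pattern): bool
{
    // preg_match() returns false (and raises a warning) when the pattern
    // cannot be compiled; @ silences the warning so only the result is checked.
    return @preg_match($pattern, '') !== false;
}

// The pattern from the question: the lookbehind body <[^>]+ has no fixed length.
$variable = '/([^ ]+(?:(?<!<[^>]+) +(?![^<]*>)(?:.*?)){0,4})$/';

// Illustrative only: a lookbehind whose body is always exactly two characters.
$fixed = '/(?<!<a) +/';

var_dump(patternCompiles($variable));
var_dump(patternCompiles($fixed));
```

The exact wording of the warning depends on the PCRE version bundled with PHP, so the sketch only checks the return value.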

Say I have the following string:

'It\'s just that he <span class="verb">appear</span>ed rather late.'

It should match:

'that he <span class="verb">appear</span>ed rather late.'

2 answers:

Answer 0: (score: 1)

I think your solution is already very close. Take a look at this:

$str = 'It\'s just that he <span class="verb">appear</span>ed rather late.';
$reg = '/(([^ ]*<[^>]*>[^ ]*)+|[^ ]+)/'; // let me know if you need explanation
if (preg_match_all($reg, $str, $m)) { // "_all" to match more than one
    $m = array_slice($m[0], -5, 5, true); // last 5 words
    //$m = implode(' ', $m); // uncomment this if you want a string instead of array
    print_r($m);
}

Returns:

Array
(
    [2] => that
    [3] => he
    [4] => <span class="verb">appear</span>ed
    [5] => rather
    [6] => late.
)
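For readers who want the pattern spelled out: the first alternative matches a run of non-space characters containing one or more <...> tags (spaces inside a tag are consumed by [^>]*), and the second alternative matches a plain run of non-space characters. A minimal sketch of the same idea wrapped in a function might look like this; the name lastWords and the $count parameter are assumptions made for illustration, and the groups are written as non-capturing, which does not change the full matches in $m[0].

```php
<?php
// Hypothetical wrapper around the approach above (name and signature are illustrative).
function lastWords(string $str, int $count = 5): string
{
    if ($count <= 0) {
        return '';
    }
    // Alternative 1: a "word" that contains one or more tags; spaces inside <...> stay part of it.
    // Alternative 2: an ordinary run of non-space characters.
    $reg = '/(?:[^ ]*<[^>]*>[^ ]*)+|[^ ]+/';
    if (!preg_match_all($reg, $str, $m)) {
        return '';
    }
    // Keep only the last $count words and join them with single spaces.
    return implode(' ', array_slice($m[0], -$count));
}

echo lastWords('It\'s just that he <span class="verb">appear</span>ed rather late.'), "\n";
```

Because the words are rejoined with implode(' ', ...), runs of several spaces between words in the input are collapsed to one space in the result.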

Answer 1: (score: 0)

A simple approach:

preg_match('~^(?:\s*[^>\s]*(?:>[^<]*<[^>\s]*)*){0,5}~', strrev(rtrim($str)), $m);
$result = strrev($m[0]);
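The trick here is that reversing the string turns "the last five words" into "the first five words", so the pattern can be anchored with ^ and no lookbehind is needed. In the reversed text a tag reads as >...<, which is why the inner group starts at > and runs to the next <. The following sketch wraps the idea in a function with a configurable word count; the name lastWordsReversed and the $count parameter are assumptions added for illustration.

```php
<?php
// Hypothetical wrapper around the reversed-string approach (name and signature are illustrative).
function lastWordsReversed(string $str, int $count = 5): string
{
    if ($count <= 0) {
        return '';
    }
    // Each repetition consumes optional whitespace, then one "word":
    // non-space, non-'>' characters, optionally followed by reversed tags (>...<).
    $reg = '~^(?:\s*[^>\s]*(?:>[^<]*<[^>\s]*)*){0,' . $count . '}~';
    if (!preg_match($reg, strrev(rtrim($str)), $m)) {
        return '';
    }
    // Reverse the match back to restore the original character order.
    return strrev($m[0]);
}

echo lastWordsReversed('It\'s just that he <span class="verb">appear</span>ed rather late.'), "\n";
```

Unlike the array-based approach, this returns the original substring unchanged, so the spacing between words is preserved as it appears in the input.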