preg_replace replacing the wrong text

Date: 2011-07-29 04:59:55

Tags: php regex templates preg-replace

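I was getting help from a friend (who is now on holiday), but I've run into a problem with a preg_replace search and replace. I don't know why, but it is replacing the wrong string, which then affects the next replacement it is supposed to make.

This is basically part of a template class that handles 'if' and 'else' queries.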

function if_statement($a, $b, $if, $type, $else = NULL){
    if($type == "1" && is_numeric($a) && is_numeric($b)){
        $statement = ($a === $b) ? $if : $else;
    } else if($type == "1"){
        $statement = ($a == $b) ? $if : $else;
    } else if($type == "2"){
        $statement = ($a != $b) ? $if : $else;
    }
    return stripslashes($statement);
}

$output = file_get_contents("template.tpl");

$replace = array(
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<else\>(.*?)\<\/endif\>#sei',
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<\/endif\>#sei'
);  
$functions = array(
  "if_statement('\\1', '\\2', '\\3', '1', '\\4')",
  "if_statement('\\1', '\\2', '\\3', '1')"
);
$output = preg_replace($replace, $functions, $output);
echo $output;

Template:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />
    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
    <if:"'1' == '2'">1 equals 2!<else>1 doesn't equal 2</endif>
</body>
</html>

The current output is shown below:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />

        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    **</endif>**
</head>
<body>
    **<if:"'{TODAY}' == 'Monday'">**Today is Monday
    1 doesn't equal 2
</body>
</html>

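In the output above, the parts marked in bold / with asterisks should not be there, and it isn't Monday today either. The admin-bar.css file is correctly included when an admin is logged in, but for some reason the </endif> tag isn't being picked up. In fact, it looks as though the match has gone on to the <else> tag in the next statement instead. In other words, preg_replace has matched the wrong thing, and so the second <if> statement was never picked up.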

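The template tags are being replaced correctly. I've even put the data into the statements by hand (just to check), so they aren't the problem...

I don't know why, but it looks to me as though preg_replace isn't finding the right sequence to replace and act on. If anyone could lend a fresh pair of eyes or a helping hand, I'd appreciate it.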

Thanks!

1 answer:

Answer 0 (score: 3)

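The first <if> in your example has no <else> clause. So when <if:"'(.*?)' == '(.*?)'">(.*?)<else>(.*?)</endif> (in which <else> is not optional) is applied to it, the lazy (.*?) keeps expanding until it finds an <else>, and the regex matches all of this: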

    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>

In that match, group $3 contains:

        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday
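
The over-match is easy to reproduce in isolation. The following is a minimal sketch, assuming a shortened copy of the template held in a hypothetical $tpl variable, and using the question's first pattern with the e modifier dropped so that preg_match can be used to inspect the groups:

```php
<?php
// Shortened copy of the template from the question (hypothetical $tpl variable).
$tpl = <<<'TPL'
<if:"'{ISADMIN}' == '1'">
    <link rel="stylesheet" href="admin-bar.css" />
</endif>
<if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
TPL;

// The question's first pattern, without the "e" modifier.
$pattern = '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<else\>(.*?)\<\/endif\>#si';

preg_match($pattern, $tpl, $m);

// Groups 1 and 2 come from the FIRST <if>, but group 3 runs past its
// </endif> and into the second <if>, the only one that contains an <else>.
$overran = strpos($m[3], '</endif>') !== false;

echo $m[1], "\n";     // {ISADMIN}
var_dump($overran);   // bool(true)
echo $m[4], "\n";     // Today is not Monday
```

So the condition of the first block ends up paired with the else branch of the second, which is the mix-up visible in the question's output.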

You can avoid this by using lookahead assertions to forbid the regex from matching across a </endif>:

'%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si'
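
As a quick check of that behaviour, here is a small sketch, again assuming a shortened template in a hypothetical $tpl variable. It runs the pattern through preg_match_all and shows that the block without an <else> is skipped rather than swallowed:

```php
<?php
// Shortened copy of the question's template (hypothetical $tpl variable).
$tpl = <<<'TPL'
<if:"'{ISADMIN}' == '1'">
    <link rel="stylesheet" href="admin-bar.css" />
</endif>
<if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
TPL;

// Lookahead-guarded pattern: groups 3 and 4 may not run past an </endif>.
$pattern = '%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si';

$count = preg_match_all($pattern, $tpl, $matches, PREG_SET_ORDER);

echo $count, "\n";           // 1
echo $matches[0][1], "\n";   // {TODAY}
echo $matches[0][3], "\n";   // Today is Monday
```

Only the second block matches, so the {ISADMIN} block is left untouched for the else-less pattern to handle.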

Or, in commented form (which may be more helpful the next time the programmer is "on holiday"):

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:         # The following group...
      (?!        # only if we\'re not right before...
       <else>    # <else>
      |          # or
       </endif>  # </endif>
      )          # (End of lookahead assertion)
      .          # Match any character
     )*          # Repeat as necessary
    )            # End of capturing group 3
    <else>       # Match <else>
    (            # Same construction as above, group 4
     (?:
      (?!
       </endif>  # this time only looking for </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

The second regex should be improved in the same way:

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:
      (?!
       </endif>  # Any text until </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

As a bonus, these regexes should be faster, because they specify more precisely what may and may not be matched, which avoids a lot of backtracking.
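
One caveat for anyone reading this on a newer PHP: the e modifier used in these patterns was deprecated in PHP 5.5 and removed in PHP 7.0, so the same regexes have to be driven through preg_replace_callback instead. The following is a minimal sketch under that assumption. It uses a simplified if_statement() that only handles the == case, and the render_ifs() wrapper and the $tpl string are hypothetical names made up for illustration:

```php
<?php
// Simplified version of the question's helper: "==" comparison only.
function if_statement($a, $b, $if, $else = '') {
    return ($a == $b) ? $if : $else;
}

// Hypothetical wrapper: applies both lookahead-guarded patterns via
// callbacks instead of the removed "e" modifier.
function render_ifs($tpl) {
    $head = '<if:\s*"\'([^\']*)\' == \'([^\']*)\'">';
    $withElse = '%' . $head
        . '((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si';
    $noElse = '%' . $head . '((?:(?!</endif>).)*)</endif>%si';

    // First pass: blocks that contain an <else>.
    $tpl = preg_replace_callback($withElse, function ($m) {
        return if_statement($m[1], $m[2], $m[3], $m[4]);
    }, $tpl);

    // Second pass: the remaining blocks without an <else>.
    return preg_replace_callback($noElse, function ($m) {
        return if_statement($m[1], $m[2], $m[3]);
    }, $tpl);
}

$tpl = '<if:"\'1\' == \'1\'">admin</endif>|'
     . '<if:"\'Friday\' == \'Monday\'">Monday<else>not Monday</endif>|'
     . '<if:"\'1\' == \'2\'">equal<else>not equal</endif>';

echo render_ifs($tpl), "\n"; // admin|not Monday|not equal
```

With a callback the captured text arrives unescaped, so the stripslashes() call from the question's helper is not needed in this variant.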