preg_replace not working as expected

Asked: 2012-11-07 11:51:44

Tags: php regex preg-replace

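I want to strip HTML tags, along with the contents of style and script tags, but my code does not remove the contents of the style tags and I don't know why. Any ideas?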

$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript 
               '@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags 
               '@<style[^>]*?>.*?</style>@si',    // Strip style tags properly 
               '@<![\s\S]*?--[ \t\n\r]*>@'         // Strip multi-line comments including CDATA 
               ); 

$htmlstring = 'Which brand(s) of single serve coffee brewer do you own? <style type="text/css"> #answer67627X49X1159other {display:none;}</style>';
$htmlstring .= '<style> #answer67627X49X1159999 {display:none;}</style><script>alert(123);</script>';

$htmlstring = preg_replace($search,'',$htmlstring);

echo '<input style="width:90%" type="text" value="'.$htmlstring.'" />';

Here is the output that ends up in the input tag:

Which brand(s) of single serve coffee brewer do you own? #answer67627X49X1159other {display:none;}#answer67627X49X1159999 {display:none;}

2 answers:

Answer 0 (score: 0)

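The patterns are in the wrong order. preg_replace applies an array of patterns one after another, so the generic tag pattern removes the <style> and </style> tags before the style pattern runs, and the CSS between them is left behind. Put the style pattern ahead of the generic tag pattern: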

<?php
$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript 
               '@<style[^>]*?>.*?</style>@si',    // Strip style tags and their content 
               '@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags 
               '@<![\s\S]*?--[ \t\n\r]*>@'         // Strip multi-line comments including CDATA 
               ); 

$htmlstring = 'Which brand(s) of single serve coffee brewer do you own? <style type="text/css"> #answer67627X49X1159other {display:none;}</style>';
$htmlstring .= '<style> #answer67627X49X1159999 {display:none;}</style><script>alert(123);</script>';

$htmlstring = preg_replace($search, '' ,$htmlstring);
var_dump($htmlstring);

// string(57) "Which brand(s) of single serve coffee brewer do you own? "
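
To see why the original order fails, here is a minimal sketch that applies the question's first three patterns one at a time and records each intermediate string. The shortened input string and the $trace array are made up for illustration; the patterns are copied from the question.

```php
<?php
// Hypothetical trace (not from the thread): the question's patterns,
// applied one by one in the ORIGINAL order.
$patterns = array(
    'script' => '@<script[^>]*?>.*?</script>@si',
    'tags'   => '@<[\/\!]*?[^<>]*?>@si',
    'style'  => '@<style[^>]*?>.*?</style>@si',
);

// Shortened sample input, assumed for illustration only.
$html = 'Q? <style> #a {display:none;}</style><script>alert(123);</script>';

$trace = array();
foreach ($patterns as $name => $pattern) {
    // Each pattern works on the result left by the previous one.
    $html = preg_replace($pattern, '', $html);
    $trace[$name] = $html;
    echo $name, ': ', var_export($html, true), PHP_EOL;
}
// After the 'tags' step the <style> and </style> tags are already gone,
// so the 'style' step has nothing left to match and the CSS text remains.
```

The 'style' entry in $trace is identical to the 'tags' entry, which is the behaviour the question describes.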

Answer 1 (score: 0)

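By the time you reach the style pattern, you have already stripped the HTML tags. Change the order of the replacements so that script and style are handled before the rest: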

$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript                
                '@<style[^>]*?>.*?</style>@si',    // Strip style tags properly 
                '@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags 
                '@<![\s\S]*?--[ \t\n\r]*>@'         // Strip multi-line comments including CDATA 
           );
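
As a rough sketch, the reordered patterns could be wrapped in a small helper. The function name strip_markup is hypothetical, and the htmlspecialchars call is an addition that is not part of the thread; it is there only because the result is echoed into an attribute value.

```php
<?php
// Hypothetical helper (name assumed): same patterns as above, with the
// script and style blocks removed before the generic tag pattern runs.
function strip_markup($html)
{
    $search = array(
        '@<script[^>]*?>.*?</script>@si',  // script blocks and their content
        '@<style[^>]*?>.*?</style>@si',    // style blocks and their content
        '@<[\/\!]*?[^<>]*?>@si',           // any remaining HTML tags
        '@<![\s\S]*?--[ \t\n\r]*>@',       // multi-line comments including CDATA
    );
    return preg_replace($search, '', $html);
}

// Usage with the string from the question.
$htmlstring = 'Which brand(s) of single serve coffee brewer do you own? <style type="text/css"> #answer67627X49X1159other {display:none;}</style>';
$htmlstring .= '<style> #answer67627X49X1159999 {display:none;}</style><script>alert(123);</script>';

$clean = strip_markup($htmlstring);

// Assumption: escaping is added here so quotes in the text cannot break the attribute.
echo '<input style="width:90%" type="text" value="' . htmlspecialchars($clean, ENT_QUOTES, 'UTF-8') . '" />';
```

With this order, the input's value attribute contains only the question text, matching the var_dump output shown in the first answer.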