Replacing HTML tags with preg_replace

Asked: 2015-12-07 10:59:14

Tags: php regex preg-replace preg-match preg-match-all

I want to manipulate the following content:

$description = 'Title <div data-type="page" data-id="1">Abcdef</div><div data-type="page" data-id="2">Fghij</div>';

I use preg_match to get the values of data-type and data-id:

if (preg_match('/<div data-type="(.*?)" data-id="(.*?)">(.*?)<\/div>/s', $description) ) { ... }

I am not sure this regular expression is right. I would like to get:

array(0 => array('type' => 'page', 'id' => 1), 1 => array('type' => 'page', 'id' => 2))

Then, using a function get_page_content($id) that returns the actual content by ID, my final content should look like this: Title this is page id 1 this is page id 2
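The body of get_page_content() is not shown in the question. To run the snippets in this thread, a hypothetical stand-in such as the one below can be used. Its return text is an assumption, modeled on the expected output quoted above.

```php
<?php
// Hypothetical stand-in: the real get_page_content() is not part of the question.
// It is assumed to return the page content for a given ID as a string.
function get_page_content($id)
{
    // Cast to int so that the captured string "1" and the integer 1 behave the same.
    return 'this is page id ' . (int) $id;
}

echo get_page_content('1'); // prints "this is page id 1"
```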

Solution (temporary):

$regex = '/<div data-type="(.*?)" data-id="(.*?)">(.*?)<\/div>/';
$description = 'Title <div data-type="page" data-id="1">Abcdef</div><div data-type="page" data-id="2">Fghij</div>';

echo preg_replace_callback(
        $regex,
        function($match) {
            if (isset($match[1]) && isset($match[2])) {
                if ($match[1] == 'page') {
                    return '<div class="page-embed">'. get_page_content($match[2]) .'</div>';
                }
            }

            // Leave any other div unchanged; returning false would replace it with an empty string.
            return $match[0];
        },
        $description);

1 answer:

Answer 0 (score: 1)

First of all, regular expressions are not the best tool for DOM-related text. For your specific problem, however, this should do the job:

$string = 'Title <div data-type="page" data-id="1">Abcdef</div><div data-type="page" data-id="2">Fghij</div>';
$pattern = '<div data-type="(?P<type>[^"]+?)" data-id="(?P<id>[^"]+?)">(?P<div_contents>[^<]+?)<\/div>';

preg_match('/'.$pattern.'/', $string, $matches);

The $matches array will contain the following (preg_match stops at the first matching div):

Array ( [0] => <div data-type="page" data-id="1">Abcdef</div> [type] => page [1] => page [id] => 1 [2] => 1 [div_contents] => Abcdef [3] => Abcdef )

So, to get the type, the ID, and the contents of the div, you can do this:

$type = $matches['type'];
$id = $matches['id'];
$contents = $matches['div_contents'];
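Since preg_match only returns the first div, the array of type/ID pairs asked for in the question needs every match. A minimal sketch, assuming the markup always has the exact attribute order shown above, is to switch to preg_match_all with the PREG_SET_ORDER flag. The $result variable name is made up for illustration.

```php
<?php
$string = 'Title <div data-type="page" data-id="1">Abcdef</div><div data-type="page" data-id="2">Fghij</div>';
$pattern = '/<div data-type="(?P<type>[^"]+)" data-id="(?P<id>[^"]+)">(?P<div_contents>[^<]+)<\/div>/';

// PREG_SET_ORDER gives one sub-array per matched div instead of one per capture group.
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);

$result = array();
foreach ($matches as $match) {
    // Keep only the named captures the question asks for; cast the ID to an integer.
    $result[] = array('type' => $match['type'], 'id' => (int) $match['id']);
}

print_r($result);
```

With the sample string, $result holds two entries, one for page ID 1 and one for page ID 2. Each ID can then be passed to get_page_content().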