Regular expression to extract the complete contents of a div

Asked: 2015-02-26 12:40:31

Tags: php html regex dom

How can I extract the complete HTML content of a div? I tried this code:

$html= '<html>
            <body>
                <div id="test">
                    <div id="mydiv1">Hello</div>
                    <div id="mydiv2">How are you</div>
                </div>
            </body>
        </html>';

$attr = "id";
$value = "test";

$tag_regex = '/<div[^>]*'.$attr.'="'.$value.'">(.*?)<\\/div>/si';
preg_match($tag_regex,$html,$matches);

echo $matches[0];

Running this code gives me this result:

 <div id="test">
    <div id="mydiv1">Hello</div>

Expected result:

<div id="test">
   <div id="mydiv1">Hello</div>
   <div id="mydiv2">How are you</div>
</div>

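In my code, the regular expression only matches up to the first occurrence of </div>. How can I get the complete markup of <div id="test">, including the nested divs?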

1 Answer:

Answer 0: (score: 4)

Use DOMDocument:

$dom = new DOMDocument;
$dom->loadHTML($html);

// Look up the outer div by its id attribute
$div = $dom->getElementById('test');

// Serialize only that node, nested divs included
$result = $dom->saveHTML($div);
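
The lazy `(.*?)` in the question stops at the first `</div>` it sees, and a plain regex has no easy way to track nesting, which is why a parser is the safer tool here. Below is a slightly fuller sketch of the same DOMDocument idea for the case where the attribute is not always `id`. The helper name `extractElementHtml` and its parameters are made up for illustration and are not part of the original answer. It assumes PHP 7.1+ with the DOM extension, and that `$tag` and `$attr` are trusted names, because they are interpolated into the XPath query.

```php
<?php
// Hypothetical helper (illustration only): returns the outer HTML of the
// first <$tag> element whose attribute $attr equals $value, or null if none.
function extractElementHtml(string $html, string $tag, string $attr, string $value): ?string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally so imperfect markup does not spam output.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Assumption: $tag and $attr are trusted identifiers. The value is
    // compared in PHP rather than in the query to avoid XPath quoting issues.
    foreach ($xpath->query("//{$tag}[@{$attr}]") as $node) {
        if ($node->getAttribute($attr) === $value) {
            $out = $dom->saveHTML($node);
            return $out === false ? null : $out;
        }
    }

    return null;
}

$html = '<html>
            <body>
                <div id="test">
                    <div id="mydiv1">Hello</div>
                    <div id="mydiv2">How are you</div>
                </div>
            </body>
        </html>';

$result = extractElementHtml($html, 'div', 'id', 'test');
echo $result;
```

With the sample document from the question, `$result` should hold the outer `<div id="test">` together with both nested divs. If only the inner content is wanted, one option is to loop over `$node->childNodes` and concatenate `saveHTML($child)` for each child.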