Remove empty tags and blank lines from HTML

Time: 2016-07-01 05:10:49

Tags: php preg-replace

I have the following HTML:

<body>Summary: <br>
    <table class="stats data tablesorter marg-bottom">
        <thead><tr><th>Team</th><th>Wins</th><th>Losses</th><th>Ties</th><th>Win %</th></tr></thead>
        <tbody>

            <tr>
                <td>Team 1</td>
                <td>95</td>
                <td>74</td>
                <td>0</td>
                <td>56.21</td>
            </tr>

            <tr>
                <td>Team 2</td>
                <td>74</td>
                <td>95</td>
                <td>0</td>
                <td>43.79</td>
            </tr>

        </tbody>
    </table>


<div>
    </div>
</body>

I want this result:

<body>Summary: <br>
    <table class="stats data tablesorter marg-bottom">
        <thead><tr><th>Team</th><th>Wins</th><th>Losses</th><th>Ties</th><th>Win %</th></tr></thead>
        <tbody>
            <tr>
                <td>Team 1</td>
                <td>95</td>
                <td>74</td>
                <td>0</td>
                <td>56.21</td>
            </tr>
            <tr>
                <td>Team 2</td>
                <td>74</td>
                <td>95</td>
                <td>0</td>
                <td>43.79</td>
            </tr>
        </tbody>
    </table>
</body>

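The simplest fix would be to generate correct markup in the first place. Unfortunately, this is a very old version of CKEditor and I cannot upgrade it (because of other side effects).

Can I run preg_replace, a recursive function, or a loop to remove the empty <div> tags and the unwanted blank lines?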

1 Answer:

Answer 0: (score: 0)

Assuming you have this HTML in a variable named $html:

// Replace empty <div> tags with nothing
$html = preg_replace("/<div>\s*<\/div>/", "", $html);

// Replace multiple newlines in a row with a single newline
$html = preg_replace("/\n+/", "\n", $html);

echo $html;
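
Note that the two patterns above rely on the blank lines being completely empty and on the line endings being plain "\n". If the editor emits whitespace-only lines, "\r\n" line endings, or a <div> that carries attributes, a more tolerant variant might look like the sketch below. The function name stripEmptyDivsAndBlankLines is made up for illustration and is not part of the original answer.

```php
<?php

// Hypothetical helper (an assumption, not from the original answer):
// removes empty <div> elements, then drops lines that are empty or
// contain only spaces and tabs.
function stripEmptyDivsAndBlankLines(string $html): string
{
    // Match <div> with optional attributes whose body is only whitespace.
    $html = preg_replace('/<div\b[^>]*>\s*<\/div>/i', '', $html) ?? $html;

    // With the "m" flag, ^ matches at the start of every line. \h is
    // horizontal whitespace and \R is any line break (\n, \r\n or \r).
    return preg_replace('/^\h*\R/m', '', $html) ?? $html;
}
```

Because whole blank lines are deleted rather than runs of "\n" being collapsed, the indentation of the remaining lines is left untouched.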

Edit

Complete working code, including output:

<?php

$html = <<<END
<body>Summary: <br>
    <table class="stats data tablesorter marg-bottom">
        <thead><tr><th>Team</th><th>Wins</th><th>Losses</th><th>Ties</th><th>Win %</th></tr></thead>
        <tbody>

            <tr>
                <td>Team 1</td>
                <td>95</td>
                <td>74</td>
                <td>0</td>
                <td>56.21</td>
            </tr>

            <tr>
                <td>Team 2</td>
                <td>74</td>
                <td>95</td>
                <td>0</td>
                <td>43.79</td>
            </tr>

        </tbody>
    </table>


<div>
    </div>
</body>

END;

// Replace empty <div> tags with nothing
$html = preg_replace("/<div>\s*<\/div>/", "", $html);

// Replace multiple newlines in a row with a single newline
$html = preg_replace("/\n+/", "\n", $html);

echo $html;

// OUTPUT:

// <body>Summary: <br>
//     <table class="stats data tablesorter marg-bottom">
//         <thead><tr><th>Team</th><th>Wins</th><th>Losses</th><th>Ties</th><th>Win %</th></tr></thead>
//         <tbody>
//             <tr>
//                 <td>Team 1</td>
//                 <td>95</td>
//                 <td>74</td>
//                 <td>0</td>
//                 <td>56.21</td>
//             </tr>
//             <tr>
//                 <td>Team 2</td>
//                 <td>74</td>
//                 <td>95</td>
//                 <td>0</td>
//                 <td>43.79</td>
//             </tr>
//         </tbody>
//     </table>
// </body>

?>
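
The question also asks about a loop or a recursive function. A single pass of the <div> pattern does not remove a <div> that only becomes empty once an empty <div> inside it has been removed. One possible approach, sketched here under that assumption, is to repeat the replacement until preg_replace reports zero replacements through its by-reference $count argument. The function name stripNestedEmptyDivs is hypothetical.

```php
<?php

// Hypothetical helper (an assumption, not from the original answer):
// repeatedly strips empty <div> elements so that wrappers which only
// contained other empty <div>s are removed as well.
function stripNestedEmptyDivs(string $html): string
{
    do {
        // -1 means "no limit"; $count receives the number of replacements.
        $result = preg_replace('/<div\b[^>]*>\s*<\/div>/i', '', $html, -1, $count);
        if ($result === null) {
            break; // Regex error: keep the last good string.
        }
        $html = $result;
    } while ($count > 0);

    return $html;
}

echo stripNestedEmptyDivs("<p>kept</p><div>\n  <div></div>\n</div>");
// prints: <p>kept</p>
```

The loop ends because every pass that makes a replacement shortens the string. For markup that is more irregular than this, a real HTML parser is usually a safer choice than regular expressions.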