Regex: add title="" to <img/> tags, but not to <img/> tags between <script> and </script>

Time: 2012-09-19 20:27:06

Tags: javascript php regex

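The regex /<img[^>]+/ from an SEO plugin breaks my mapping plugin, because it adds unescaped slashes to my JavaScript output.

I would like to suggest a fix in which <img> tags are only enhanced with title="" when they are not inside <script ...> tags.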

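To clarify:

  • The attribute title="" should only be added in code such as blabla <img src=""> blabla

  • <script>blabla <img src=""> blabla</script> should stay as it is

  • <script type="text/javascript">blabla <img src=""> blabla</script> should also stay as it is

Can anyone help with the regex? I found that (?!expression) is used to define exclusions, but I am not sure how to use it.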

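2 answers:

Answer 0 (score: 3)

Don't use regex for this kind of problem. Use DOMDocument instead. Adding the attribute only to the appropriate nodes is then trivial, and you can be confident that similar problems will not come up again.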
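
A minimal sketch of the DOMDocument approach, assuming the whole page is available as a string. The function name add_empty_titles_with_dom is hypothetical and not part of the original answer. It relies on the HTML parser treating the contents of <script> as plain text, so any <img ...> written inside a script never becomes an element node.

```php
<?php
// Hypothetical helper (assumed name, not from the answer or the plugin).
function add_empty_titles_with_dom(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world markup is often not strictly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Only real <img> elements are returned here; text inside <script>
    // is kept as script content and is not visited.
    foreach ($doc->getElementsByTagName('img') as $img) {
        if (!$img->hasAttribute('title')) {
            $img->setAttribute('title', '');
        }
    }

    return $doc->saveHTML();
}
```

Two caveats with this sketch: loadHTML() wraps fragments in a doctype plus <html>/<body> elements, and it assumes ISO-8859-1 unless the markup declares a charset, so it may need adjusting before being used on plugin output.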

Answer 1 (score: 1)

I don't know a regex for this, but I do know a workaround:

function add_titles_no_scripts($page)
{
    /* split the page at each start of a script; str_ireplace() first normalizes the case of "<script" */
    $parts = explode("<script", str_ireplace("<script", "<script", $page));

    /* remove the first part from the list, as it comes before the first script */
    $first_part = array_shift($parts);

    /* add titles to the first part as usual */
    $new_page = add_title($first_part);

    /* for all other parts */
    foreach($parts as $current_part)
    {
        /* split the part in two: before and after the end of the script */
        $sub_parts = explode("</script", str_ireplace("</script", "</script", $current_part), 2);

        /* make sure we have 2 parts */
        if(count($sub_parts) == 2)
        {
            /* re-add the script body unchanged, restoring the tags removed by explode */
            $new_page .= "<script" . $sub_parts[0] . "</script";

            /* add titles to the second part as usual */
            $new_page .= add_title($sub_parts[1]);
        }
        else
        {
            /* only one part means we are inside an unterminated script tag */
            $new_page .= "<script" . $sub_parts[0];
        }
    }

    /* return the new page */
    return $new_page;
}
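
The function above calls add_title(), which the answer does not show; presumably it is whatever the SEO plugin already does to the markup. As a rough stand-in, assuming it only needs to add title="" to <img> tags that do not have a title yet, it could look like the sketch below. The body is an assumption, not the plugin's actual code. It also shows the (?!...) negative lookahead from the question: the tag is only matched when no title attribute follows before the closing ">".

```php
<?php
// Hypothetical stand-in for the plugin's add_title(); assumed behavior.
function add_title($html)
{
    // <img\b                 start of an img tag
    // (?![^>]*\stitle\s*=)   negative lookahead: skip tags that already have a title
    // ([^>]*?)               the existing attributes
    // \s*(\/?)>              optional self-closing slash, then the end of the tag
    return preg_replace(
        '/<img\b(?![^>]*\stitle\s*=)([^>]*?)\s*(\/?)>/i',
        '<img$1 title=""$2>',
        $html
    );
}
```

This pattern assumes that attribute values do not contain a literal ">" character. It is only safe here because add_titles_no_scripts() never passes script contents to it.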