Question

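First I fetch the HTML of a web page, then I remove the href links that usually appear in the left or right sidebar (not in the main body of the page). The href links are being removed, but their labels (the link text) are not.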

Example:

<a href='http://test.blogspot.com/2012/11/myblog.html'>London</a>

The link is removed, but not its label, i.e. "London". How can I remove the complete line from the HTML source? I am using the following code:

$string = strip_tags($html_source_code, '<a>', TRUE); 

function strip_tags($text, $tags = '', $invert = FALSE) {
      preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags); 
      $tags = array_unique($tags[1]); 
      if(is_array($tags) AND count($tags) > 0) { 
        if($invert == FALSE) { 
          return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text); 
        } 
        else { 
          return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text); 
        } 
      } 
      elseif($invert == FALSE) { 
        return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text); 
      } 
return $text; 
}

Answer 1

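$link = "<a href='http://test.blogspot.com/2012/11/myblog.html'>London</a>";

// Returns an empty string if $theLink contains $checkTag,
// otherwise returns $theLink unchanged.
function erraser($theLink, $checkTag){

    // strpos() returns 0 for a match at the very start of the string,
    // so compare strictly against false instead of using == true.
    if(strpos($theLink, $checkTag) !== false){
        return '';
    }

    return $theLink;

}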

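Now, let's walk through it:

You only need to pass two arguments to the erraser() function: the link variable, and any text that identifies the string as a link.

For example, echo erraser($link, 'href'); removes the link, so nothing is printed. If you instead pass text that does not occur in the link, such as echo erraser($link, '----');, you get the London link back unchanged. In other words, the function checks whether the string is a link and removes it only if it is.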
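One detail worth keeping in mind with a strpos()-based check like this: strpos() returns the integer offset of the match, and that offset is 0 when the match is at the very start of the string. The short sketch below shows why a strict comparison against false is the safer test. The helper name containsText is made up for illustration and is not part of the answer's code.

```php
<?php
$link = "<a href='http://test.blogspot.com/2012/11/myblog.html'>London</a>";

// Hypothetical helper: true if $needle occurs anywhere in $haystack.
function containsText(string $haystack, string $needle): bool
{
    // Strict comparison, because a match at offset 0 is still a match.
    return strpos($haystack, $needle) !== false;
}

// '<a' sits at offset 0, so a loose "== true" check misses it.
$looseCheck  = (strpos($link, '<a') == true);
$strictCheck = containsText($link, '<a');

var_dump($looseCheck);   // bool(false)
var_dump($strictCheck);  // bool(true)
```

With 'href' as the marker the loose check happens to work, because the match is at offset 3, but the strict form behaves the same way for any marker.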

Answer 2

If I use your code, I get a fatal error: Cannot redeclare strip_tags().

Renaming the function to something like my_strip_tags makes it work fine.

function my_strip_tags($text, $tags = '', $invert = FALSE) {
      preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags); 
      $tags = array_unique($tags[1]); 
      if(is_array($tags) AND count($tags) > 0) { 
        if($invert == FALSE) { 
          return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text); 
        } 
        else { 
          return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text); 
        } 
      } 
      elseif($invert == FALSE) { 
        return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text); 
      } 
return $text; 
}

$html_source_code = "Beginning of content ... <a href='http://test.blogspot.com/2012/11/myblog.html'>London</a> ... end of content.";

echo "<p>".$html_source_code."</p>";

$string = my_strip_tags($html_source_code, '<a>', TRUE);

echo "<p>".$string."</p>";

This prints:

Beginning of content ... London ... end of content.

Beginning of content ... ... end of content.
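Regex-based tag removal like this can misfire on nested or malformed markup, so a DOM parser may be a more robust option. What follows is only a minimal sketch using PHP's built-in DOMDocument, not the code from the question. The function name removeAnchors and the wrapper div are assumptions made for the example, and character-encoding handling is left out for brevity.

```php
<?php
// Hypothetical helper: drop every <a> element together with its link text.
function removeAnchors(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is often not well-formed.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so its contents can be extracted afterwards.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live, so copy it to an array before removing nodes.
    $anchors = iterator_to_array($doc->getElementsByTagName('a'));
    foreach ($anchors as $anchor) {
        $anchor->parentNode->removeChild($anchor);
    }

    // The wrapper is the first <div> in document order.
    $wrapper = $doc->getElementsByTagName('div')->item(0);

    // Serialize only the wrapper's children, not the added html/body tags.
    $result = '';
    foreach ($wrapper->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return $result;
}

$html_source_code = "Beginning of content ... <a href='http://test.blogspot.com/2012/11/myblog.html'>London</a> ... end of content.";
$cleaned = removeAnchors($html_source_code);
echo "<p>" . $cleaned . "</p>";
```

Removing the whole element node takes the label with it, which is the behaviour asked for in the question. To target only sidebar links, the same loop could be restricted to anchors inside a particular container element.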

Remove href links and their labels using an HTML DOM parser
