Question

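Good day!

I would like help removing a string enclosed in square brackets, including the brackets themselves.

The string looks like this: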

$string = "Lorem ipsum dolor<br /> [ Context are found on www.example.com ] <br />some text here. Text here. [test] Lorem ipsum dolor.";

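I only want to remove the brackets (and their contents) that contain "www.example.com". I want to keep "[test]" in the string, along with any other brackets that do not contain "www.example.com".

Thanks!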

Answer 1

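Note: The OP has changed the question significantly. This solution was written to handle the question in its original (harder) form, before the "www.example.com" constraint was added. Although the solution below has been modified to handle this additional constraint, a simpler solution (i.e. anubhava's answer) may now be sufficient.

Here is my tested solution: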

function strip_bracketed_special($text) {
    $re = '% # Remove bracketed text having "www.example.com" within markup.
          # Skip comments, CDATA, SCRIPT & STYLE elements, and HTML tags.
          (                      # $1: HTML stuff to be left alone.
            <!--.*?-->           # HTML comments (non-SGML compliant).
          | <!\[CDATA\[.*?\]\]>  # CDATA sections
          | <script.*?</script>  # SCRIPT elements.
          | <style.*?</style>    # STYLE elements.
          | <\w+                 # HTML element start tags.
            (?:                  # Group optional attributes.
              \s+                # Attributes separated by whitespace.
              [\w:.-]+           # Attribute name is required
              (?:                # Group for optional attribute value.
                \s*=\s*          # Name and value separated by "="
                (?:              # Group for value alternatives.
                  "[^"]*"        # Either double quoted string,
                | \'[^\']*\'     # or single quoted string,
                | [\w:.-]+       # or un-quoted string (limited chars).
                )                # End group of value alternatives.
              )?                 # Attribute values are optional.
            )*                   # Zero or more start tag attributes.
            \s*/?>               # End of start tag (optional self-close).
          | </\w+>               # HTML element end tags.
          )                      # End #1: HTML Stuff to be left alone.
        | # Or... Bracketed structures containing www.example.com
          \s*\[                  # (optional ws), Opening bracket.
          [^\]]*?                # Match up to required content.
          www\.example\.com      # Required bracketed content.
          [^\]]*                 # Match up to closing bracket.
          \]\s*                  # Closing bracket, (optional ws).
        %six';
    return preg_replace($re, '$1', $text);
}

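Note that the regex skips bracketed material found inside HTML comments, CDATA sections, SCRIPT and STYLE elements, and HTML tag attribute values. Given the following XHTML markup (which tests these scenarios), the function above correctly removes only the bracketed content that appears within HTML element contents: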

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title>Test special removal. [Remove this www.example.com]</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <style type="text/css">
        .test.before {
            content: "[Do not remove www.example.com]";
        }
    </style>
    <script type="text/javascript">
        // <![CDATA[ ["Do not remove www.example.com"] ]]>
        var ob = {};
        ob["Do not remove www.example.com"] = "stuff";
        var str = "[Do not remove www.example.com]";
    </script>
</head>
<body>
<!-- <![CDATA[ ["Do not remove www.example.com"] ]]> -->
<div title="[Do not remove www.example.com]">
<h1>Test special removal. [Remove this www.example.com]</h1>
<p>Test special removal. [Remove this www.example.com]</p>
<p onclick='var str = "[Do not remove www.example.com]"; return false;'>
    Test special removal. [Do not remove this]
    Test special removal. [Remove this www.example.com]
</p>
</div>
</body>
</html>

Here is the same markup after being run through the PHP function above:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title>Test special removal.</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <style type="text/css">
        .test.before {
            content: "[Do not remove www.example.com]";
        }
    </style>
    <script type="text/javascript">
        // <![CDATA[ ["Do not remove www.example.com"] ]]>
        var ob = {};
        ob["Do not remove www.example.com"] = "stuff";
        var str = "[Do not remove www.example.com]";
    </script>
</head>
<body>
<!-- <![CDATA[ ["Do not remove www.example.com"] ]]> -->
<div title="[Do not remove www.example.com]">
<h1>Test special removal.</h1>
<p>Test special removal.</p>
<p onclick='var str = "[Do not remove www.example.com]"; return false;'>
    Test special removal. [Do not remove this]
    Test special removal.</p>
</div>
</body>
</html>

This solution should work quite well for pretty much any valid (X)HTML you can throw at it. (But please, no funky shorttags or SGML comments!)
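To illustrate the core idea in isolation, here is a much-reduced sketch of the same "capture what to keep, delete the rest" technique. It is not the function above: the helper name strip_brackets_outside_tags is hypothetical, and the tag pattern assumes no ">" appears inside attribute values and does not handle comments, CDATA, SCRIPT, or STYLE.

```php
<?php
// Simplified sketch of the "capture what to keep" technique.
// Group 1 matches a whole tag (assumption: no ">" inside attribute values);
// the second alternative matches a bracket pair containing www.example.com.
function strip_brackets_outside_tags(string $text): string
{
    $re = '%(<[^>]*>)|\s*\[[^\]]*?www\.example\.com[^\]]*\]\s*%i';
    // '$1' re-inserts the tag when group 1 matched; otherwise it is empty,
    // so the bracketed text (and surrounding whitespace) is deleted.
    return preg_replace($re, '$1', $text);
}

$in = '<p title="[keep www.example.com]">Text. [drop www.example.com] </p> [test]';
echo strip_brackets_outside_tags($in), "\n";
// prints: <p title="[keep www.example.com]">Text.</p> [test]
```

Because tags are consumed by the first alternative before the bracket alternative gets a chance to match inside them, bracketed text in attribute values is left alone. Note that whitespace on both sides of a removed bracket is deleted too, so a bracket in the middle of a sentence would join the neighbouring words.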

Answer 2

$str = "Lorem ipsum dolor<br /> [ Context are found on www.example.com ] <br />some text here. Text here. [test] Lorem ipsum dolor.";
$str = preg_replace('~\[[^]]*?www\.example\.com[^]]*\]~si', "", $str);
var_dump($str);

Output

string(83) "Lorem ipsum dolor<br />  <br />some text here. Text here. [test] Lorem ipsum dolor."

PS: It also works when the bracketed text is broken across multiple lines.
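A quick sketch of the multi-line case, using the same pattern wrapped in a hypothetical helper (remove_example_brackets is not part of the answer). The negated class [^]] matches newlines as well, so a bracket pair spanning several lines is removed in one go.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function remove_example_brackets(string $str): string
{
    return preg_replace('~\[[^]]*?www\.example\.com[^]]*\]~si', '', $str);
}

// The bracketed text spans three lines; "[test]" must survive.
$multiline = "Intro [ see\nwww.example.com\nfor details ] outro [test]";
echo remove_example_brackets($multiline), "\n";
// prints: Intro  outro [test]
```

The spaces on either side of the removed bracket are kept, which is why two spaces remain between "Intro" and "outro".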

Answer 3

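Use a regex like /\[.*?\]/. The backslashes are necessary; without them, [.*?] is a character class that matches any single one of the characters ., *, or ?.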
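A small sketch of the difference the backslashes make (the sample string is made up for illustration). Note that this pattern removes every bracketed group, so by itself it would also remove "[test]".

```php
<?php
$s = 'a.b [x] c?';

// Escaped brackets: matches a literal "[", lazily anything, then a literal "]".
$escaped = preg_replace('/\[.*?\]/', '', $s);

// Unescaped brackets: a character class matching one of ".", "*" or "?".
$unescaped = preg_replace('/[.*?]/', '', $s);

echo $escaped, "\n";   // prints: a.b  c?
echo $unescaped, "\n"; // prints: ab [x] c
```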

Answer 4

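The simplest way I can think of is to use a regular expression to match everything between [ and ], and then replace it with "". The code below will do the replacement for the string you used in your example. If the actual strings to be removed are more complex, you can change the regex to match them. I recommend using regexpal.com to test your regular expressions.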

// PCRE patterns need delimiters (here "/"); without them preg_replace() fails.
$string = preg_replace('/\[[A-Za-z .]*\]/', '', $string);

Answer 5

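The following code removes the bracketed text. The <br /> tags are left in place, so they show up as line breaks when the result is rendered in a browser: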

$str = "Lorem ipsum dolor<br />[ Context are found on www.example.com ] <br />some text here";
$str = preg_replace( "/\[[^\]]*\]/m", "", $str);
echo $str;

Output:

Lorem ipsum dolor

some text here
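For reference, a sketch of what the raw string looks like before a browser renders it. The pattern only removes the bracketed text; the line breaks in the output above come from the <br /> tags. If actual newline characters are wanted in plain text, a second replacement can be added. That second step is an assumption on my part and not part of the answer.

```php
<?php
$str = "Lorem ipsum dolor<br />[ Context are found on www.example.com ] <br />some text here";

// Step from the answer: remove every bracketed group.
$raw = preg_replace('/\[[^\]]*\]/', '', $str);
echo $raw, "\n";
// prints: Lorem ipsum dolor<br /> <br />some text here

// Assumed extra step: turn <br>, <br/> or <br /> into real newlines.
$plain = preg_replace('~<br\s*/?>~i', "\n", $raw);
echo $plain, "\n";
```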
