Regex to replace the text between tags, including the tags themselves

Time: 2013-02-06 15:52:18

Tags: c# regex

I have the following text (a META title):

Buy [ProductName][Text] at a great price [/Text] from [ShopName] today.

I replace the bracketed tokens depending on the values I have.

That part works as I need, but I can't find the right regex to replace:

[Text] at a great price [/Text]

The words between the square brackets change, so the only parts that stay the same are:

[][/] 

That is, I might also want to replace

[TestText]some test text[/TestText] with nothing.

I have this working:

System.Text.RegularExpressions.Regex.Replace(SEOContent, @"\[Text].*?\[/Text]", @"");

I assumed the regex:

[.*?].*?\[/.*?]

would work, but it doesn't! I'm coding in ASP.NET with C#. Thanks in advance,

Dave

1 answer:

Answer 0: (score: 1)

Use a named capture to grab the name inside the [..] tag, then use \k<..> to find that same name again in the closing tag.

(\[(?<Tag>[^\]]+)\][^\[]+\[/\k<Tag>\])
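For comparison, here is a small sketch of why the pattern guessed in the question does not behave as hoped. The class and method names (BracketEscapeDemo, Unescaped, EscapedOnly) are made up for illustration. An unescaped [.*?] is a character class that matches a single '.', '*' or '?', and even with the brackets escaped nothing ties the closing tag to the opening one, so the match can swallow a neighbouring token.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketEscapeDemo
{
    // The pattern from the question: the unescaped [.*?] is a character class
    // that matches one literal '.', '*' or '?', not a bracketed tag.
    public static string Unescaped(string input)
    {
        return Regex.Replace(input, @"[.*?].*?\[/.*?]", "");
    }

    // Brackets escaped, but the closing tag name is not tied to the opening one.
    public static string EscapedOnly(string input)
    {
        return Regex.Replace(input, @"\[.*?\].*?\[/.*?\]", "");
    }

    public static void Main()
    {
        string title = "Buy [ProductName][Text] at a great price [/Text] from [ShopName] today.";
        Console.WriteLine(Unescaped(title));   // the title comes back unchanged
        Console.WriteLine(EscapedOnly(title)); // [ProductName] is removed along with the [Text] block
    }
}
```

The named capture plus \k<Tag> in the pattern above avoids the second problem, because the closing tag has to repeat the opening tag's name.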

Here is a breakdown using RegexOptions.IgnorePatternWhitespace, with a sample program.

using System;
using System.Text.RegularExpressions;

string pattern = @"
(                # Begin our Match
  \[             # Look for the [ escape anchor
  (?<Tag>[^\]]+) # Place anything that is not another ] into the named match Tag
  \]             # Anchor of ]
  [^\[]+         # Get all the text to the next anchor
  \[/            # Anchor of the closing [...] tag
  \k<Tag>        # Use the named capture subgroup Tag to balance it out
  \]             # Properly closed end tag/node.
)                # Match is done";

string text = "[TestText]some test text[/TestText] with nothing.";

Console.WriteLine (Regex.Replace(text, pattern, "Jabberwocky", RegexOptions.IgnorePatternWhitespace));
// Outputs
// Jabberwocky with nothing.
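To tie this back to the question, which wants the tagged block replaced with nothing, the same pattern can be used with an empty replacement. The following is only a sketch: TagStripper and Strip are illustrative names, not part of any library, and the pattern is the one above written on a single line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Same pattern as above, on one line (no IgnorePatternWhitespace needed).
    private static readonly Regex TaggedBlock =
        new Regex(@"\[(?<Tag>[^\]]+)\][^\[]+\[/\k<Tag>\]");

    // Removes every [Name]...[/Name] block, tags included.
    public static string Strip(string input)
    {
        return TaggedBlock.Replace(input, string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(Strip("Buy [ProductName][Text] at a great price [/Text] from [ShopName] today."));
        // Buy [ProductName] from [ShopName] today.
        Console.WriteLine(Strip("[TestText]some test text[/TestText] with nothing."));
        //  with nothing.
    }
}
```

Note that [^\[]+ requires at least one character between the tags and stops at any '[', so an empty block such as [Text][/Text], or a block containing another bracketed token, is left untouched by this pattern.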

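As an aside, I would actually build a tokenizing regex (a regex conditional based on the pattern above) and do the replacement inside the matches, using named captures to identify each part. Then, in the replace call, a match evaluator swaps out the identified tokens, for example: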

string pattern = @"
(?(\[(?<Tag>[^\]]+)\][^\[]+\[/\k<Tag>\]) # If statement to  check []..[/] situation
  (                                      # Yes it is, match into named captures
   \[
   (?<Token>[^\]]+)                      # What is the text inside the [ ], into Token
   \]
   (?<TextOptional>[^\[]+)               # Optional text to reuse
   \[
   (?<Closing>/[^\]]+)                   # The closing tag info
   \]
  )
|                                        # Else, let us start a new check for either [] or plain text
 (?(\[)                                  # If a [ is found it is a token.
   (                                     # Yes process token
    \[
    (?<Token>[^\]]+)                     # What is the text inside the [ ], into Token
    \]
   )
  |                                      # Or (No of the second if) it is just plain text
  (?<Text>[^\[]+)                        # Put it into the text match capture.
 )
)
";


string text = @"Buy [ProductName] [Text]at a great price[/Text] from [ShopName] today.";

Console.WriteLine(
    Regex.Replace(text,
                  pattern,
                  new MatchEvaluator(mtch =>
                  {
                      // Plain text between tokens: return it unchanged.
                      if (mtch.Groups["Text"].Success)
                          return mtch.Groups["Text"].Value;

                      // A successful Closing capture means a [Tag]...[/Tag] block was matched.
                      if (mtch.Groups["Closing"].Success)
                          return string.Format("Reduced Beyond Comparison (Used to be {0})", mtch.Groups["TextOptional"].Value);

                      // Otherwise it's just a plain old token, swap it out.
                      switch (mtch.Groups["Token"].Value)
                      {
                          case "ProductName": return "Jabberwocky";
                          case "ShopName":    return "StackOverFlowiZon";
                      }

                      return "???"; // If we get to here, we have failed and need to determine why.
                  }),
                  RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture));
// Outputs:
// Buy Jabberwocky Reduced Beyond Comparison (Used to be at a great price) from StackOverFlowiZon today.
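If the token values come from data rather than a hard-coded switch, the evaluator can look them up in a dictionary. This is a simplified sketch rather than a drop-in replacement for the code above: it uses plain alternation instead of the regex conditional, and TitleBuilder, Render and the lookup policy (a tagged block with no entry is dropped, an unknown lone token is left as-is) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TitleBuilder
{
    // Alternation order matters: a full [Tag]...[/Tag] block is tried
    // before a lone [Token].
    private static readonly Regex Parts = new Regex(
        @"\[(?<Tag>[^\]]+)\](?<Inner>[^\[]+)\[/\k<Tag>\]|\[(?<Token>[^\]]+)\]");

    public static string Render(string template, IDictionary<string, string> values)
    {
        return Parts.Replace(template, m =>
        {
            if (m.Groups["Tag"].Success)
            {
                // Tagged block: use the configured value, or drop the block entirely.
                return values.TryGetValue(m.Groups["Tag"].Value, out var blockValue)
                    ? blockValue
                    : string.Empty;
            }

            // Lone token: use the configured value, or leave the token untouched.
            return values.TryGetValue(m.Groups["Token"].Value, out var tokenValue)
                ? tokenValue
                : m.Value;
        });
    }

    public static void Main()
    {
        var values = new Dictionary<string, string>
        {
            { "ProductName", "Jabberwocky" },
            { "ShopName", "StackOverFlowiZon" }
        };

        Console.WriteLine(Render(
            "Buy [ProductName][Text] at a great price [/Text] from [ShopName] today.",
            values));
        // Buy Jabberwocky from StackOverFlowiZon today.
    }
}
```

Text outside the brackets never matches, so Regex.Replace copies it through unchanged and no separate plain-text capture is needed.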