Question

I have the following regular expression:

var value = "hello";
@"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + @")(?<end>\W.*?)"

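I am trying to work out what it means, because it does not work on a single word. For example, it matches "they said hello to us", but it fails on just "hello".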

Can you help me decode what this regex string means?

PS: it is a .NET regex.

Answer 1

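Because of the \W in the last part. \W matches any non-word character, roughly anything other than a letter, digit, or underscore.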

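In "they said hello to us" there is a space after hello, but in "hello" on its own nothing follows the word, so the required \W has nothing to match. That is the reason.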

If you change it to (?<end>\W*.*?), it might work.
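
As a rough check of this explanation, the sketch below runs the pattern from the question and the relaxed variant against both inputs. The class name EndGroupDemo and the helper BuildPattern are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndGroupDemo
{
    // Hypothetical helper: rebuilds the question's pattern, with the body of
    // the (?<end>...) group passed in so the two variants can be compared.
    public static string BuildPattern(string value, string endGroup)
    {
        return @"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + @")(?<end>" + endGroup + @")";
    }

    public static void Main()
    {
        string original = BuildPattern("hello", @"\W.*?");  // needs one non-word character after the term
        string relaxed = BuildPattern("hello", @"\W*.*?");  // the non-word character becomes optional

        foreach (string input in new[] { "they said hello to us", "hello" })
        {
            Console.WriteLine("{0}: original={1}, relaxed={2}",
                input,
                Regex.IsMatch(input, original),
                Regex.IsMatch(input, relaxed));
        }
    }
}
```

Only the original pattern applied to the bare "hello" should report False, because nothing follows the word for \W to consume.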

Actually, the regex itself does not make much sense to me; it should be more like

@"\b" + Regex.Escape(value) + @"\b"

\b is a word boundary.
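
One C# detail matters when writing the word boundary: in a regular string literal "\b" is the backspace character (U+0008), so the pattern has to be a verbatim string (@"\b") for the regex engine to see a word boundary. The sketch below compares the two; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbatimBoundaryDemo
{
    // Verbatim strings pass the backslash through, so the engine sees \b (word boundary).
    public static string WordPattern(string value)
    {
        return @"\b" + Regex.Escape(value) + @"\b";
    }

    // Regular string literals turn \b into a backspace character before
    // the regex engine ever sees the pattern.
    public static string BackspacePattern(string value)
    {
        return "\b" + Regex.Escape(value) + "\b";
    }

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch("hello", WordPattern("hello")));                 // True
        Console.WriteLine(Regex.IsMatch("they said hello to us", WordPattern("hello"))); // True
        Console.WriteLine(Regex.IsMatch("hello", BackspacePattern("hello")));            // False
    }
}
```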

Answer 2

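The regular expression is probably trying to find the value as a whole word, so that your hello example does not match Othello. If so, the word boundary \b is tailor-made for this purpose: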

@"\b(" + Regex.Escape(value) + @")\b"
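
To illustrate the whole-word behaviour, here is a minimal sketch that collects where the pattern above matches. WholeWordDemo and FindWholeWord are illustrative names, and the sample sentence is an assumption, not taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordDemo
{
    // Returns the start index of every whole-word occurrence of value in input.
    public static int[] FindWholeWord(string input, string value)
    {
        string pattern = @"\b(" + Regex.Escape(value) + @")\b";
        MatchCollection matches = Regex.Matches(input, pattern);
        int[] indexes = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            // Group 1 is the parenthesised term between the two boundaries.
            indexes[i] = matches[i].Groups[1].Index;
        }
        return indexes;
    }

    public static void Main()
    {
        // The "hello" inside "Othello" is skipped; only the standalone word is found.
        Console.WriteLine(string.Join(",", FindWholeWord("Othello said hello", "hello"))); // 13
        Console.WriteLine(FindWholeWord("Othello", "hello").Length);                       // 0
    }
}
```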

Answer 3

If this is a .NET regex and the Regex.Escape() part is replaced with 'hello', RegexBuddy says it means:

(?<start>.*?\W*?)(?<term>hello)(?<end>\W.*?)

Options: case insensitive

Match the regular expression below and capture its match into backreference with name “start” «(?<start>.*?\W*?)»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
   Match a single character that is a “non-word character” «\W*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the regular expression below and capture its match into backreference with name “term” «(?<term>hello)»
   Match the characters “hello” literally «hello»
Match the regular expression below and capture its match into backreference with name “end” «(?<end>\W.*?)»
   Match a single character that is a “non-word character” «\W»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
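
To see what each named group actually captures, the following sketch applies the expanded pattern to the sentence from the question and prints the three groups. The class name NamedGroupDemo is made up for illustration; RegexOptions.IgnoreCase mirrors the "case insensitive" option listed above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    public const string Pattern = @"(?<start>.*?\W*?)(?<term>hello)(?<end>\W.*?)";

    public static void Main()
    {
        Match m = Regex.Match("they said hello to us", Pattern, RegexOptions.IgnoreCase);

        // Brackets make leading and trailing spaces visible in the output.
        Console.WriteLine("start = [{0}]", m.Groups["start"].Value); // [they said ]
        Console.WriteLine("term  = [{0}]", m.Groups["term"].Value);  // [hello]
        Console.WriteLine("end   = [{0}]", m.Groups["end"].Value);   // [ ]
    }
}
```

Because the trailing .*? is lazy and nothing follows it in the pattern, it matches the empty string, so the end group holds only the single non-word character after the term.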

Decoding a regex string that matches a word in a string
