Decoding a regex string that matches a word within a string

Time: 2009-12-25 07:08:18

Tags: regex

I have the following regular expression:

var value = "hello";
@"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + @")(?<end>\W.*?)"

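I'm trying to figure out what it means, because it doesn't work for a single word. For example, it matches "they say hello to us" but fails for just "hello".

Can you help me decode what this regex string means?

PS: it is a .NET regex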

3 answers:

Answer 0: (score: 3)

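It is because of the \W in the last group. \W matches any character that is not a word character (letters, digits, and the underscore).

In "they say hello to us" there is a space after hello, but in "hello" alone nothing follows the term, so the required \W has nothing to match. That is why it fails.

If you change that group to (?<end>\W*.*?), it will probably work.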
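The behaviour described above can be checked with a small sketch. The class name TrailingNonWordDemo and the helper Build are hypothetical, introduced only for illustration; the patterns are the one from the question and the relaxed variant suggested here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingNonWordDemo
{
    // Hypothetical helper: builds the question's pattern with a
    // caller-supplied body for the "end" group.
    public static Regex Build(string value, string endGroup)
    {
        return new Regex(
            @"(?<start>.*?\W*?)(?<term>" + Regex.Escape(value) + ")(?<end>" + endGroup + ")");
    }

    public static void Main()
    {
        Regex original = Build("hello", @"\W.*?");  // needs one non-word char after the term
        Regex relaxed = Build("hello", @"\W*.*?");  // accepts nothing after the term

        Console.WriteLine(original.IsMatch("they say hello to us")); // True
        Console.WriteLine(original.IsMatch("hello"));                // False
        Console.WriteLine(relaxed.IsMatch("hello"));                 // True
        Console.WriteLine(relaxed.IsMatch("Othello"));               // True
    }
}
```

The last line shows the cost of the relaxed group: once nothing is required after the term, the pattern also matches hello inside a longer word such as Othello.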

Actually, the regex itself doesn't make much sense to me. It should look more like

@"\b" + Regex.Escape(value) + @"\b"

\b is a word boundary.
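One detail worth checking in C#: \b only reaches the regex engine as a word boundary when it is written in a verbatim string, because in a regular string literal "\b" is the escape for a backspace character (U+0008). The following is a minimal sketch of that difference; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Verbatim strings: the regex engine sees \b, a word-boundary anchor.
    public static string BoundaryPattern(string value)
    {
        return @"\b" + Regex.Escape(value) + @"\b";
    }

    // Regular strings: the C# compiler turns "\b" into a backspace
    // character, so the pattern looks for literal backspaces.
    public static string BackspacePattern(string value)
    {
        return "\b" + Regex.Escape(value) + "\b";
    }

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch("hello", BoundaryPattern("hello")));                // True
        Console.WriteLine(Regex.IsMatch("they say hello to us", BoundaryPattern("hello"))); // True
        Console.WriteLine(Regex.IsMatch("hello", BackspacePattern("hello")));               // False
    }
}
```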

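Answer 1: (score: 1)

The regex is probably trying to match the term only as a whole word, so that your hello example does not match Othello. If so, the word-boundary anchor \b is tailor-made for that purpose: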

@"\b(" + Regex.Escape(value) + @")\b"
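As a rough illustration of the whole-word behaviour, the sketch below collects the start index of every match of this pattern. FindWholeWord is a hypothetical helper, not part of .NET; it reads the position from capture group 1, the parenthesised term.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WholeWordFinder
{
    // Hypothetical helper: start index of each whole-word occurrence of value.
    public static List<int> FindWholeWord(string input, string value)
    {
        string pattern = @"\b(" + Regex.Escape(value) + @")\b";
        var indexes = new List<int>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            indexes.Add(m.Groups[1].Index); // group 1 is the escaped term
        }
        return indexes;
    }

    public static void Main()
    {
        // Only the standalone word is found; the "hello" inside "Othello" is skipped.
        Console.WriteLine(string.Join(",", FindWholeWord("Othello said hello", "hello"))); // 13
        Console.WriteLine(FindWholeWord("Othello", "hello").Count);                         // 0
    }
}
```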

Answer 2: (score: 0)

If this is a .NET regex and the Regex.Escape() part is replaced with 'hello', RegexBuddy says it means:

(?<start>.*?\W*?)(?<term>hello)(?<end>\W.*?)

Options: case insensitive

Match the regular expression below and capture its match into backreference with name “start” «(?<start>.*?\W*?)»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
   Match a single character that is a “non-word character” «\W*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the regular expression below and capture its match into backreference with name “term” «(?<term>hello)»
   Match the characters “hello” literally «hello»
Match the regular expression below and capture its match into backreference with name “end” «(?<end>\W.*?)»
   Match a single character that is a “non-word character” «\W»
   Match any single character that is not a line break character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
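To see what the three named groups actually capture, here is a small sketch that runs the expanded pattern against the sentence from the question. The class and method names are hypothetical, and unlike the RegexBuddy run above it uses default options (case sensitive), which makes no difference for this lowercase input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Runs the expanded pattern from this answer against the input.
    public static Match MatchHello(string input)
    {
        return Regex.Match(input, @"(?<start>.*?\W*?)(?<term>hello)(?<end>\W.*?)");
    }

    public static void Main()
    {
        Match m = MatchHello("they say hello to us");
        // Brackets make leading and trailing spaces visible.
        Console.WriteLine("[" + m.Groups["start"].Value + "]"); // [they say ]
        Console.WriteLine("[" + m.Groups["term"].Value + "]");  // [hello]
        Console.WriteLine("[" + m.Groups["end"].Value + "]");   // [ ]
    }
}
```

Because the final .*? is lazy and nothing follows it in the pattern, it matches the empty string, so the "end" group holds only the single non-word character after the term.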