Question

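Example: Book = a string containing the full text

startChar = where it should start capturing = |

endChar = where it should stop capturing = §

Word to ignore in the capture = gray

So if it weren't for the word "gray", my capture would be simple: \|(.+)§

Here is an example of what I mean: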

Book = "The gray fox is |so gray that its pretty gray§."

Capture = "so that its pretty"

I'm using C# and PHP, but I don't want to use any replace function; I just want a pure regex.

Answer 1

You can use this pattern with a global search:

(?:\G(?!\A)|\|)(?:\bgray\b)?\K((?:(?!\bgray\b)[^§])+)(?=(?:gray)?(§)?)


Details:

(?:                     # the two entry points
    \G(?!\A)            # position at the end of the previous match
  |
    \|                  # the start
)
(?:\bgray\b)?           # optional "gray"
\K
((?:(?!\bgray\b)[^§])+) # all that is not the word "gray" (see the note)
(?=(?:gray)?(§)?)       # trick to capture the last §
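
As a rough illustration, the annotated layout above can be run as it stands by adding the x (free-spacing) modifier, which makes it easier to see how the matches chain through \G. This is only a sketch: the nowdoc wrapper, the u modifier (which assumes the script is saved as UTF-8), the comment on \K, and the PREG_SET_ORDER loop are additions for the demonstration and are not part of the original answer.

```php
<?php
// Same pattern as above, kept in its commented form thanks to the x modifier.
$reg = <<<'REGEX'
~
(?:                     # the two entry points
    \G(?!\A)            # position at the end of the previous match
  |
    \|                  # the start
)
(?:\bgray\b)?           # optional "gray"
\K                      # keep what was matched so far out of the result
((?:(?!\bgray\b)[^§])+) # all that is not the word "gray"
(?=(?:gray)?(§)?)       # trick to capture the last §
~xu
REGEX;

$book = "The gray fox is |so gray that its pretty gray§.";

// PREG_SET_ORDER groups the results per match, one array per chained piece.
preg_match_all($reg, $book, $matches, PREG_SET_ORDER);

foreach ($matches as $i => $m) {
    // Group 2 is only set on the piece that is followed by the closing §.
    $end = !empty($m[2] ?? '') ? 'yes' : 'no';
    printf("match %d: [%s] end reached: %s\n", $i + 1, $m[1], $end);
}
// match 1: [so ] end reached: no
// match 2: [ that its pretty ] end reached: yes
```

Each piece starts either at the | or exactly where the previous piece stopped, so nothing outside the delimiters is picked up.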

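Note: this subpattern is a well-known trick for matching text that avoids a word. However, it is particularly slow. For a long text that contains few occurrences of the word to avoid, it can be replaced with ((?>[^g§]+|\Bg|g(?!ray\b))+), which may be faster (but is less easy to build programmatically).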
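
To check that the two subpatterns are interchangeable, here is a small sketch that plugs each one into the full pattern and compares the results. The helper name capture(), the sample sentence (extended with "grayish" and "dog" to exercise the letter g inside other words), and the u modifier (assuming a UTF-8 script) are assumptions made for this illustration. It does not measure speed.

```php
<?php
// Hypothetical sample text, using the same delimiters as in the question.
$book = "The gray fox is |so gray that its grayish dog is pretty gray§.";

// Subpattern from the main pattern: a lookahead is tested before every character.
$tempered = '((?:(?!\bgray\b)[^§])+)';
// Alternative from the note: atomic group, only the letter "g" gets a closer look.
$unrolled = '((?>[^g§]+|\Bg|g(?!ray\b))+)';

// Hypothetical helper: builds the full pattern around a subpattern and
// returns the joined capture, or '' if the closing § was never reached.
function capture(string $sub, string $book): string
{
    $reg = '~(?:\G(?!\A)|\|)(?:\bgray\b)?\K' . $sub . '(?=(?:gray)?(§)?)~u';
    if (preg_match_all($reg, $book, $m) && !empty(end($m[2]))) {
        return implode('', $m[1]);
    }
    return '';
}

var_dump(capture($tempered, $book) === capture($unrolled, $book)); // bool(true)
var_dump(capture($unrolled, $book));
// string(35) "so  that its grayish dog is pretty "
```

The word "grayish" and the g in "dog" are kept in both cases, because only the whole word "gray" is excluded.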

Example of use with PHP:

$book = "The gray fox is |so gray that its pretty gray§.";

$reg = '~(?:\G(?!\A)|\|)(?:\bgray\b)?\K((?:(?!\bgray\b)[^§])+)(?=(?:gray)?(§)?)~';

if ( preg_match_all($reg, $book, $matches) && !empty(end($matches[2])) )
    echo implode('', $matches[1]);

Note: the last capture group is only there to make sure the end has been reached. The "if" condition checks this with !empty(end($matches[2])).
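
To see why this check matters, consider a text where the closing § is missing. The sketch below is only an illustration: the helper name reachedEnd(), the second sample string, and the u modifier (assuming a UTF-8 script) are assumptions, not part of the original answer.

```php
<?php
$reg = '~(?:\G(?!\A)|\|)(?:\bgray\b)?\K((?:(?!\bgray\b)[^§])+)(?=(?:gray)?(§)?)~u';

// Hypothetical inputs: the first has the closing §, the second does not.
$closed   = "The gray fox is |so gray that its pretty gray§.";
$unclosed = "The gray fox is |so gray that its pretty gray.";

// Hypothetical helper: true only if the last match saw the closing §.
function reachedEnd(string $reg, string $book): bool
{
    preg_match_all($reg, $book, $matches);
    // Group 2 is an empty string for every match that is not followed by §.
    return !empty(end($matches[2]));
}

var_dump(reachedEnd($reg, $closed));   // bool(true)
var_dump(reachedEnd($reg, $unclosed)); // bool(false)
```

Without the check, the second text would still produce matches, running from the | to the end of the string, so the joined result would include text that was never closed by §.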

How to capture a group and exclude a word in the capture?
