Question

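I want to replace every run of consecutive non-lowercase letters with a single space per run. The code below does that, but why does it also inject spaces between the letters?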

const string pattern = @"[^a-z]*";
const string replacement = @" ";
var reg = new Regex(pattern);

string a = "the --fat- cat";
string b = reg.Replace(a, replacement);  // b = " t h e  f a t  c a t " should be "the fat cat"

Answer 1

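Because of the * (which repeats the previous token zero or more times). It has to find a match at every boundary, because an empty string exists at each of those boundaries. Use + (one or more times) instead: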

const string pattern = @"[^a-z]+";
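For completeness, here is a minimal sketch of the question's code with the + quantifier swapped in. The class name CollapseDemo and the helper Collapse are hypothetical names used only for illustration; the pattern and the input string come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CollapseDemo
{
    // Hypothetical helper: replaces each run of one or more characters
    // outside a-z with a single space. "+" cannot match an empty string,
    // so nothing is inserted between adjacent letters.
    public static string Collapse(string input)
    {
        return Regex.Replace(input, @"[^a-z]+", " ");
    }

    public static void Main()
    {
        Console.WriteLine(Collapse("the --fat- cat")); // the fat cat
    }
}
```

Note that a run at the very start or end of the input would still become a single leading or trailing space; call Trim() on the result if that is not wanted.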

Answer 2

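If you only want to remove the non-lowercase letters, you don't need a regex. The following keeps lowercase letters and whitespace and drops everything else (it does not collapse repeated whitespace):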

string a = "the --fat- cat";
string res = String.Join("", a.Where(c => Char.IsLower(c) || Char.IsWhiteSpace(c)));

Console.WriteLine(res); // the fat cat

Answer 3

Just a follow-up answer that may come in handy: if you need to match any character other than a Unicode lowercase letter, you can use

var res = Regex.Replace(str, @"\P{Ll}+", " ");
// "моя НЕ знает" > "моя знает"

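The \P{Ll} construct matches any character in the Unicode table except lowercase letters. The + quantifier matches one or more occurrences, so it does not cause the problem described in the OP.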
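To make the difference from the ASCII-only class concrete, here is a small sketch that runs both patterns on the Cyrillic sample above. The class and method names (UnicodeLowerDemo, CollapseAscii, CollapseUnicode) are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeLowerDemo
{
    // Treats only ASCII a-z as "lowercase"; every Cyrillic letter counts as a non-match character.
    public static string CollapseAscii(string s)
    {
        return Regex.Replace(s, @"[^a-z]+", " ");
    }

    // Treats any character in Unicode category Ll (lowercase letter) as lowercase.
    public static string CollapseUnicode(string s)
    {
        return Regex.Replace(s, @"\P{Ll}+", " ");
    }

    public static void Main()
    {
        string s = "моя НЕ знает";
        Console.WriteLine("[" + CollapseAscii(s) + "]");   // [ ] : the whole string is one run
        Console.WriteLine("[" + CollapseUnicode(s) + "]"); // [моя знает]
    }
}
```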

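As for the current problem caused by [^a-z]*: Regex.Replace finds an empty-string match before every lowercase letter and at the end of the string, and each of those empty matches is replaced with a space.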
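One way to see these match positions is to replace every match with a vertical pipe and to list each match's index and length. This is only a sketch; EmptyMatchDemo and MarkMatches are hypothetical names, and the input is the string from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyMatchDemo
{
    // Hypothetical helper: marks every match of [^a-z]* (empty or not) with "|".
    public static string MarkMatches(string input)
    {
        return Regex.Replace(input, @"[^a-z]*", "|");
    }

    public static void Main()
    {
        string a = "the --fat- cat";
        Console.WriteLine(MarkMatches(a)); // |t|h|e||f|a|t||c|a|t|

        // A length of 0 means the regex matched the empty string at that index.
        foreach (Match m in Regex.Matches(a, @"[^a-z]*"))
        {
            Console.WriteLine("index " + m.Index + ", length " + m.Length);
        }
    }
}
```

The doubled pipes show where a real run (" --" or "- ") is immediately followed by an empty match before the next letter, which is why the question's output contains double spaces.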

Rule of thumb: avoid unanchored patterns that can match an empty string!
