Question

Not cached:

var sw = Stopwatch.StartNew();
foreach (var str in testStrings)
{
    foreach (var pair in flex)
    {
        if (Regex.IsMatch(str, "^(" + pair.Value + ")$", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture))
            ;
    }
}
Console.WriteLine("\nRan in {0} ms", sw.ElapsedMilliseconds); // 76 ms

Cached:

var cache = flex.ToDictionary(p => p.Key, p => new Regex("^(" + p.Value + ")$", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture | RegexOptions.Compiled));

var sw = Stopwatch.StartNew();
foreach (var str in testStrings)
{
    foreach (var pair in cache)
    {
        if(pair.Value.IsMatch(str))
            ;
    }
}
Console.WriteLine("\nRan in {0} ms", sw.ElapsedMilliseconds); // 263 ms

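I can't explain why it runs slower when I precompile all of the regexes. On top of that, iterating over flex should be the slower option anyway, since it has to do more work on each pass.

What could be causing this?

In fact, if I remove the Compiled switch, the cached version runs in 8 ms. I thought "Compiled" would compile the regex when it is constructed. If not, when does it do so?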

Answer 1

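In fact, regexes are cached not only when they are first used but also when they are constructed (judging from the 4.0 code in Reflector; this may not be the case in other framework versions).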

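So the major differences here are:

The non-cached version does some trivial string concatenation inside the timed loop that the cached version does not, plus the construction overhead apart from the regex compilation itself.
The cached version enumerates a different collection than the non-cached version.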

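It isn't clear what type of collection flex is. If it is something other than a Dictionary, this result wouldn't surprise me: Dictionary isn't particularly fast at enumeration, so most other collections will beat it.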

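Beyond that, the second version isn't really a case of caching, because it caches something that is already being retrieved from an in-memory cache, so there is no reason to expect it to be faster.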
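As a rough illustration of the point about the built-in cache, the sketch below compares the static Regex.IsMatch call with a pre-constructed instance that does not use RegexOptions.Compiled. The pattern, input, and iteration count are made up for the example and are not taken from the question. Regex.CacheSize is the static property that controls how many patterns the static methods keep cached, so a benchmark with more distinct patterns than that may behave differently.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class StaticCacheSketch
{
    static void Main()
    {
        // Hypothetical pattern and input, chosen only for illustration.
        const string pattern = "^(foo|bar)$";
        const string input = "FOO";
        const int iterations = 100000;
        const RegexOptions options =
            RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture;

        // Number of patterns the static Regex methods keep in their cache.
        Console.WriteLine("Regex.CacheSize = {0}", Regex.CacheSize);

        // Static call: concatenation-free here, but each call still goes
        // through the pattern lookup in the internal cache.
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            Regex.IsMatch(input, pattern, options);
        Console.WriteLine("static Regex.IsMatch: {0} ms", sw.ElapsedMilliseconds);

        // Instance call: the Regex is constructed once, outside the timed loop.
        var regex = new Regex(pattern, options);
        sw.Restart();
        for (int i = 0; i < iterations; i++)
            regex.IsMatch(input);
        Console.WriteLine("instance IsMatch:     {0} ms", sw.ElapsedMilliseconds);
    }
}
```

The absolute numbers depend on the runtime and the machine, so treat the output only as a way to compare the two call styles on your own setup.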

Answer 2

The problem is the RegexOptions.Compiled flag, which actually makes this run much slower. Jeff more or less explains this in his blog. Without the flag, the cached version is much faster.
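To see where the extra time goes, one option is to time construction, the first match, and later matches separately, with and without RegexOptions.Compiled. The following is only a sketch: the Measure helper, the pattern, and the input are hypothetical. As far as I know, on the .NET Framework the additional work for Compiled is split between the constructor (which emits IL) and the first match (when that IL is JIT-compiled), which would put part of the cost inside the timed loop in the question.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class CompiledCostSketch
{
    // Times three phases separately: construction, first match, later matches.
    static void Measure(string label, RegexOptions options)
    {
        const string pattern = "^(foo|bar|ba[zx]+)$"; // hypothetical pattern
        const string input = "BAZZX";                 // hypothetical input
        const int iterations = 100000;

        var sw = Stopwatch.StartNew();
        var regex = new Regex(pattern, options);
        double construct = sw.Elapsed.TotalMilliseconds;

        sw.Restart();
        regex.IsMatch(input);
        double firstMatch = sw.Elapsed.TotalMilliseconds;

        sw.Restart();
        for (int i = 0; i < iterations; i++)
            regex.IsMatch(input);
        double laterMatches = sw.Elapsed.TotalMilliseconds;

        Console.WriteLine(
            "{0}: construct {1:F3} ms, first match {2:F3} ms, next {3} matches {4:F3} ms",
            label, construct, firstMatch, iterations, laterMatches);
    }

    static void Main()
    {
        var baseOptions = RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture;
        Measure("interpreted", baseOptions);
        Measure("compiled", baseOptions | RegexOptions.Compiled);
    }
}
```

If the one-time cost dominates, a benchmark that builds many distinct patterns and uses each only a few times, as the question's loop may do, would not run long enough to recover it.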

Why is caching causing my code to run slower?
