Memory efficiency and performance of String.Replace in the .NET Framework

Date: 2008-12-30 08:44:00

Tags: c# .net string

 string str1 = "12345ABC...\\...ABC100000"; 
 // Hypothetically huge string of 100000 + Unicode Chars
 str1 = str1.Replace("1", string.Empty);
 str1 = str1.Replace("22", string.Empty);
 str1 = str1.Replace("656", string.Empty);
 str1 = str1.Replace("77ABC", string.Empty);

 // ... this replace anti-pattern might happen with up to 50 consecutive lines of code.

 str1 = str1.Replace("ABCDEFGHIJD", string.Empty);

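I've inherited some code that does the same as the snippet above. It takes a huge string and replaces (removes) constant smaller strings from it.

I believe this is a very memory-intensive process, because every replace allocates a new large immutable string that then waits to be collected by the GC.

1. What is the fastest way of replacing these values, ignoring memory concerns?

2. What is the most memory-efficient way of achieving the same result?

I am hoping that these are the same answer!

Practical solutions that fit somewhere in between these goals are also appreciated.

Assumptions:

  • All replacements are constant and known in advance
  • The underlying characters do contain some unicode [non-ascii] chars

10 answers: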

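Answer 0 (score: 23)

All characters in a .NET string are "unicode chars". Do you mean they're non-ascii? That shouldn't make any difference, unless you run into composition issues, e.g. an "e + acute accent" sequence not being replaced when you try to replace a precomposed "e acute".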
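The composition issue can be shown with a minimal sketch. The `CompositionDemo` class and `RemoveEAcute` helper are hypothetical names used only for illustration; the sketch assumes that normalizing the input to Unicode Normalization Form C before replacing is acceptable for your data.

```csharp
using System;
using System.Text;

static class CompositionDemo
{
    // Hypothetical helper: normalizes to NFC so that "e" + U+0301 becomes the
    // precomposed U+00E9, then removes U+00E9 with an ordinal replace.
    public static string RemoveEAcute(string input)
    {
        string composed = input.Normalize(NormalizationForm.FormC);
        return composed.Replace("\u00E9", string.Empty);
    }

    static void Main()
    {
        // "e" followed by a combining acute accent (decomposed form).
        string decomposed = "cafe\u0301";

        // string.Replace(string, string) is ordinal, so the precomposed
        // character is not found in the decomposed string.
        Console.WriteLine(decomposed.Replace("\u00E9", string.Empty).Length); // 5
        Console.WriteLine(RemoveEAcute(decomposed));                          // caf
    }
}
```

Whether normalization is appropriate depends on where the text comes from; if the output must keep the original composition of all other characters, this approach may not fit.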

You could try using a regular expression with Regex.Replace, or StringBuilder.Replace. Here's sample code doing the same thing with both:

using System;
using System.Text;
using System.Text.RegularExpressions;

class Test
{
    static void Main(string[] args)
    {
        string original = "abcdefghijkl";

        Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);

        string removedByRegex = regex.Replace(original, "");
        string removedByStringBuilder = new StringBuilder(original)
            .Replace("a", "")
            .Replace("c", "")
            .Replace("e", "")
            .Replace("g", "")
            .Replace("i", "")
            .Replace("k", "")
            .ToString();

        Console.WriteLine(removedByRegex);
        Console.WriteLine(removedByStringBuilder);
    }
}

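I wouldn't like to guess which is more efficient; you'd have to benchmark with your specific application. The regex approach may be able to do it all in one pass, but that pass will be relatively CPU-intensive compared with each of the many replaces in StringBuilder.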
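For the multi-character constants in the question, the single-pass regex idea could be sketched as follows. The `MultiRemove` class and its token list are hypothetical; the sketch assumes the constants are known up front, escapes them with Regex.Escape, and orders longer tokens first so the alternation prefers them.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class MultiRemove
{
    // Hypothetical constants to remove, taken from the question's snippet.
    static readonly string[] Tokens = { "1", "22", "656", "77ABC" };

    // One shared, compiled pattern: escaped tokens joined with "|",
    // longest first so a longer token wins over a shorter prefix.
    static readonly Regex Pattern = new Regex(
        string.Join("|", Tokens.OrderByDescending(t => t.Length)
                               .Select(t => Regex.Escape(t))),
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    public static string RemoveAll(string input)
    {
        return Pattern.Replace(input, string.Empty);
    }

    static void Main()
    {
        Console.WriteLine(RemoveAll("a1b22c656d77ABCe")); // abcde
    }
}
```

Note that one pass is not always equivalent to a chain of Replace calls: a chain can remove text that only becomes a match after an earlier removal, while the single pass never rescans its own output.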

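Answer 1 (score: 13)

If you want to be really fast, and I mean really fast, you'll have to look beyond StringBuilder and write well-optimized code yourself.

One thing your computer doesn't like to do is branching. If you can write a replace method that operates on a fixed array (char *) and doesn't branch, you get great performance.

The replace operation searches for a sequence of characters and, if it finds such a substring, replaces it. In effect you copy the string and perform the find and replace while copying.

You'll rely on these functions to pick the index in some buffer to read from or write to. The goal is to shape the replace method so that, when nothing has to change, you write junk instead of branching.

You should be able to do this without a single if statement, and remember to use unsafe code. Otherwise you'll be paying for an index check on every element access.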

unsafe
{
    fixed( char * p = myStringBuffer )
    {
        // Do fancy string manipulation here
    }
}

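I've written code like this in C# for fun and seen significant performance improvements, almost a 300% speed-up for find and replace. While the .NET BCL (base class library) performs quite well, it is riddled with branching constructs and exception handling, and that will slow down your code if you use the built-in methods. Also, these optimizations, while perfectly sound, are not performed by the JIT compiler, and you'll have to run the code as a release build without any debugger attached to observe the performance gain.

I could provide more complete code, but it's a substantial amount of work. However, I can guarantee that it will be faster than anything else suggested so far.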

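Answer 2 (score: 4)

StringBuilder: http://msdn.microsoft.com/en-us/library/2839d5h5.aspx

The performance of the Replace operation itself should be roughly the same as string.Replace and, according to Microsoft, it should not produce garbage.

Answer 3 (score: 4)

Here's a quick benchmark...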

        Stopwatch s = new Stopwatch();
        s.Start();
        string replace = source;
        replace = replace.Replace("$TS$", tsValue);
        replace = replace.Replace("$DOC$", docValue);
        s.Stop();

        Console.WriteLine("String.Replace:\t\t" + s.ElapsedMilliseconds);

        s.Reset();

        s.Start();
        StringBuilder sb = new StringBuilder(source);
        sb = sb.Replace("$TS$", tsValue);
        sb = sb.Replace("$DOC$", docValue);
        string output = sb.ToString();
        s.Stop();

        Console.WriteLine("StringBuilder.Replace:\t\t" + s.ElapsedMilliseconds);

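I didn't see much difference on my machine (string.Replace took 85 ms and StringBuilder.Replace took 80 ms), and that was against about 8 MB of text in "source"...

Answer 4 (score: 4)

1. What is the fastest way of replacing these values, ignoring memory concerns?

The fastest way is to build a custom component that is specific to your use case. As of .NET 4.6, there is no class in the BCL designed for multiple string replacements.

If you need something fast out of the BCL, StringBuilder is the fastest BCL component for simple string replacement. Its source code is published, and it is quite efficient at replacing a single string. Only use Regex if you really need the pattern-matching power of regular expressions. It is slower and a bit more cumbersome, even when compiled.

2. What is the most memory-efficient way of achieving the same result?

The most memory-efficient way is to perform a filtered stream copy from the source to the destination (explained below). Memory consumption is limited to your buffer, but this is more CPU-intensive; as a rule of thumb, you trade CPU performance for memory consumption.

Technical details

String replacements are tricky. Even when performed in a mutable memory space (such as with StringBuilder), a replacement is expensive. If the replacement string has a different length than the original string, every character after the replacement has to be relocated to keep the whole string contiguous. This results in a lot of memory writes and, even with StringBuilder, causes most of the string to be rewritten in memory on every call to Replace.

So what is the fastest way to do string replacements? Write the new string in a single pass: don't let your code go back and rewrite anything. Writes are more expensive than reads. You'll have to code this yourself for best results.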
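A minimal sketch of such a single-pass writer, under stated assumptions: the `SinglePassRemove` class is a hypothetical name, matching is ordinal, the tokens are removed rather than replaced, and at each position the longest matching token wins. It is a plain nested loop, not an optimized search algorithm.

```csharp
using System;
using System.Text;

static class SinglePassRemove
{
    // Copies input to a new buffer once, skipping any token found at the
    // current position. Nothing already written is ever moved or rewritten.
    public static string RemoveAll(string input, params string[] tokens)
    {
        var output = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            int skip = 0;
            foreach (string token in tokens)
            {
                // Prefer the longest token that matches at position i.
                if (token.Length > skip &&
                    string.CompareOrdinal(input, i, token, 0, token.Length) == 0)
                {
                    skip = token.Length;
                }
            }

            if (skip > 0)
                i += skip;                  // drop the matched token
            else
                output.Append(input[i++]);  // copy one character
        }
        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(RemoveAll("a1b22c656d77ABCe", "1", "22", "656", "77ABC")); // abcde
    }
}
```

With up to 50 tokens the inner loop becomes the dominant cost; a trie or an Aho-Corasick automaton would be the usual next step, at the price of more code.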

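High-memory solution

The class I've written generates strings based on templates. I place tokens ($ReplaceMe$) in a template to mark the places where I want to insert a string later. I use it in cases where XmlWriter is too onerous for XML that is largely static and repetitive, and I need to produce large XML (or JSON) data streams.

The class works by slicing the template into parts and placing each part into a numbered dictionary. Parameters are also enumerated. The order in which the parts and parameters are inserted into a new string is stored in an integer array. When a new string is generated, the parts and parameters are picked from the dictionary and used to build it.

It is neither fully optimized nor bulletproof, but it works great for generating very large data streams from templates.

Low-memory solution

You'll need to read small chunks from the source string into a buffer, search the buffer using an optimized search algorithm, and then write the new string to the destination stream or string. There are a lot of potential caveats here, but it would be memory-efficient and a better solution for source data that is dynamic and can't be cached, such as whole-page translations, or source data that is too large to reasonably cache. I don't have a sample solution for this handy.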
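The chunked approach could look roughly like the sketch below. This is not the answerer's implementation: `StreamingRemove.CopyRemoving` is a hypothetical helper that removes a single token, uses a plain ordinal IndexOf rather than an optimized search algorithm, and carries the last `token.Length - 1` characters of each chunk over to the next so that a token split across a chunk boundary is still found.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamingRemove
{
    public static void CopyRemoving(TextReader source, TextWriter destination,
                                    string token, int chunkSize = 4096)
    {
        if (string.IsNullOrEmpty(token))
            throw new ArgumentException("Token must not be empty.", "token");

        var pending = new StringBuilder(); // tail carried over between chunks
        var chunk = new char[chunkSize];
        int read;
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
        {
            pending.Append(chunk, 0, read);
            string text = pending.ToString();
            pending.Clear();

            int pos = 0;
            int hit;
            while ((hit = text.IndexOf(token, pos, StringComparison.Ordinal)) >= 0)
            {
                destination.Write(text.Substring(pos, hit - pos));
                pos = hit + token.Length; // skip the token itself
            }

            // Hold back a tail that might be the start of a split token.
            int keep = Math.Min(token.Length - 1, text.Length - pos);
            destination.Write(text.Substring(pos, text.Length - pos - keep));
            pending.Append(text, text.Length - keep, keep);
        }
        destination.Write(pending.ToString()); // flush the final tail
    }

    static void Main()
    {
        var output = new StringWriter();
        // A tiny chunk size forces tokens to straddle chunk boundaries.
        CopyRemoving(new StringReader("xxABCyyABCzz"), output, "ABC", 4);
        Console.WriteLine(output); // xxyyzz
    }
}
```

Memory use is bounded by the chunk size plus the token length rather than by the size of the source, which is the trade-off described above: more bookkeeping and CPU work in exchange for a small, fixed footprint.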

Sample code

Desired results

<DataTable source='Users'>
  <Rows>
    <Row id='25' name='Administrator' />
    <Row id='29' name='Robert' />
    <Row id='55' name='Amanda' />
  </Rows>
</DataTable>

Template

<DataTable source='$TableName$'>
  <Rows>
    <Row id='$0$' name='$1$'/>
  </Rows>
</DataTable>

Test case

class Program
{
  static string[,] _users =
  {
    { "25", "Administrator" },
    { "29", "Robert" },
    { "55", "Amanda" },
  };

  static StringTemplate _documentTemplate = new StringTemplate(@"<DataTable source='$TableName$'><Rows>$Rows$</Rows></DataTable>");
  static StringTemplate _rowTemplate = new StringTemplate(@"<Row id='$0$' name='$1$' />");
  static void Main(string[] args)
  {
    _documentTemplate.SetParameter("TableName", "Users");
    _documentTemplate.SetParameter("Rows", GenerateRows);

    Console.WriteLine(_documentTemplate.GenerateString(4096));
    Console.ReadLine();
  }

  private static void GenerateRows(StreamWriter writer)
  {
    for (int i = 0; i <= _users.GetUpperBound(0); i++)
      _rowTemplate.GenerateString(writer, _users[i, 0], _users[i, 1]);
  }
}

StringTemplate source

public class StringTemplate
{
  private string _template;
  private string[] _parts;
  private int[] _tokens;
  private string[] _parameters;
  private Dictionary<string, int> _parameterIndices;
  private string[] _replaceGraph;
  private Action<StreamWriter>[] _callbackGraph;
  private bool[] _graphTypeIsReplace;

  public string[] Parameters
  {
    get { return _parameters; }
  }

  public StringTemplate(string template)
  {
    _template = template;
    Prepare();
  }

  public void SetParameter(string name, string replacement)
  {
    int index = _parameterIndices[name] + _parts.Length;
    _replaceGraph[index] = replacement;
    _graphTypeIsReplace[index] = true;
  }

  public void SetParameter(string name, Action<StreamWriter> callback)
  {
    int index = _parameterIndices[name] + _parts.Length;
    _callbackGraph[index] = callback;
    _graphTypeIsReplace[index] = false;
  }

  private static Regex _parser = new Regex(@"\$(\w{1,64})\$", RegexOptions.Compiled);
  private void Prepare()
  {
    _parameterIndices = new Dictionary<string, int>(64);
    List<string> parts = new List<string>(64);
    List<object> tokens = new List<object>(64);
    int param_index = 0;
    int part_start = 0;

    foreach (Match match in _parser.Matches(_template))
    {
      if (match.Index > part_start)
      {
        //Add Part
        tokens.Add(parts.Count);
        parts.Add(_template.Substring(part_start, match.Index - part_start));
      }


      //Add Parameter
      var param = _template.Substring(match.Index + 1, match.Length - 2);
      if (!_parameterIndices.TryGetValue(param, out param_index))
        _parameterIndices[param] = param_index = _parameterIndices.Count;
      tokens.Add(param);

      part_start = match.Index + match.Length;
    }

    //Add last part, if it exists.
    if (part_start < _template.Length)
    {
      tokens.Add(parts.Count);
      parts.Add(_template.Substring(part_start, _template.Length - part_start));
    }

    //Set State
    _parts = parts.ToArray();
    _tokens = new int[tokens.Count];

    int index = 0;
    foreach (var token in tokens)
    {
      var parameter = token as string;
      if (parameter == null)
        _tokens[index++] = (int)token;
      else
        _tokens[index++] = _parameterIndices[parameter] + _parts.Length;
    }

    _parameters = _parameterIndices.Keys.ToArray();
    int graphlen = _parts.Length + _parameters.Length;
    _callbackGraph = new Action<StreamWriter>[graphlen];
    _replaceGraph = new string[graphlen];
    _graphTypeIsReplace = new bool[graphlen];

    for (int i = 0; i < _parts.Length; i++)
    {
      _graphTypeIsReplace[i] = true;
      _replaceGraph[i] = _parts[i];
    }
  }

  public void GenerateString(Stream output)
  {
    var writer = new StreamWriter(output);
    GenerateString(writer);
    writer.Flush();
  }

  public void GenerateString(StreamWriter writer)
  {
    //Resolve graph
    foreach(var token in _tokens)
    {
      if (_graphTypeIsReplace[token])
        writer.Write(_replaceGraph[token]);
      else
        _callbackGraph[token](writer);
    }
  }

  public void SetReplacements(params string[] parameters)
  {
    int index;
    for (int i = 0; i < _parameters.Length; i++)
    {
      if (!Int32.TryParse(_parameters[i], out index))
        continue;
      else
        SetParameter(index.ToString(), parameters[index]); // positional token "$n$" takes the n-th argument
    }
  }

  public string GenerateString(int bufferSize = 1024)
  {
    using (var ms = new MemoryStream(bufferSize))
    {
      GenerateString(ms);
      ms.Position = 0;
      using (var reader = new StreamReader(ms))
        return reader.ReadToEnd();
    }
  }

  public string GenerateString(params string[] parameters)
  {
    SetReplacements(parameters);
    return GenerateString();
  }

  public void GenerateString(StreamWriter writer, params string[] parameters)
  {
    SetReplacements(parameters);
    GenerateString(writer);
  }
}

Answer 5 (score: 3)

StringBuilder sb = new StringBuilder("Hello string");
sb.Replace("string", String.Empty);
Console.WriteLine(sb);  

StringBuilder, a mutable string.

Answer 6 (score: 1)

Here is my benchmark:

using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

internal static class MeasureTime
{
    internal static TimeSpan Run(Action func, uint count = 1)
    {
        if (count <= 0)
        {
            throw new ArgumentOutOfRangeException("count", "Must be greater than zero");
        }

        long[] arr_time = new long[count];
        Stopwatch sw = new Stopwatch();
        for (uint i = 0; i < count; i++)
        {
            sw.Start();
            func();
            sw.Stop();
            arr_time[i] = sw.ElapsedTicks;
            sw.Reset();
        }

        return new TimeSpan(count == 1 ? arr_time.Sum() : Convert.ToInt64(Math.Round(arr_time.Sum() / (double)count)));
    }
}

public class Program
{
    public static string RandomString(int length)
    {
        Random random = new Random();
        const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        return new String(Enumerable.Range(1, length).Select(_ => chars[random.Next(chars.Length)]).ToArray());
    }

    public static void Main()
    {
        string rnd_str = RandomString(500000);
        Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);
        TimeSpan ts1 = MeasureTime.Run(() => regex.Replace(rnd_str, "!!!"), 10);
        Console.WriteLine("Regex time: {0:hh\\:mm\\:ss\\:fff}", ts1);

        StringBuilder sb_str = new StringBuilder(rnd_str);
        TimeSpan ts2 = MeasureTime.Run(() => sb_str.Replace("a", "").Replace("c", "").Replace("e", "").Replace("g", "").Replace("i", "").Replace("k", ""), 10);
        Console.WriteLine("StringBuilder time: {0:hh\\:mm\\:ss\\:fff}", ts2);

        TimeSpan ts3 = MeasureTime.Run(() => rnd_str.Replace("a", "").Replace("c", "").Replace("e", "").Replace("g", "").Replace("i", "").Replace("k", ""), 10);
        Console.WriteLine("String time: {0:hh\\:mm\\:ss\\:fff}", ts3);

        char[] ch_arr = {'a', 'c', 'e', 'g', 'i', 'k'};
        TimeSpan ts4 = MeasureTime.Run(() => new String((from c in rnd_str where !ch_arr.Contains(c) select c).ToArray()), 10);
        Console.WriteLine("LINQ time: {0:hh\\:mm\\:ss\\:fff}", ts4);
    }

}
  

Regex time: 00:00:00:008

StringBuilder time: 00:00:00:015

String time: 00:00:00:005

LINQ could not process rnd_str (Fatal Error: Memory usage limit was exceeded)

String.Replace is the fastest

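Answer 7 (score: 1)

I've ended up in this thread a couple of times, and I was not fully convinced after reading the previous answers, because some of the benchmarks were done with Stopwatch, which gives some indication but is not very reliable.

My use case is that I have a string that might be quite big, i.e. the HTML output of a website. I need to replace a number of placeholders inside this string (about 10, at most 20) with values.

I created a Benchmark.NET test to get some robust data, and here is what I found:

TLDR:

  • Don't use String.Replace if performance or memory is a concern.
  • Regex.Replace is the fastest, but it uses slightly more memory than StringBuilder.Replace. A compiled regex is the fastest if you intend to reuse the same pattern; for a low number of uses it is cheaper to create a non-compiled Regex instance.
  • If you care only about memory consumption and can accept slower execution, use StringBuilder.Replace.

Test results: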

|                Method | ItemsToReplace |       Mean |     Error |    StdDev |   Gen 0 |  Gen 1 | Gen 2 | Allocated |
|---------------------- |--------------- |-----------:|----------:|----------:|--------:|-------:|------:|----------:|
|         StringReplace |              3 |  21.493 us | 0.1182 us | 0.1105 us |  3.6926 | 0.0305 |     - |  18.96 KB |
|  StringBuilderReplace |              3 |  35.383 us | 0.1341 us | 0.1119 us |  2.5024 |      - |     - |  13.03 KB |
|          RegExReplace |              3 |  19.620 us | 0.1252 us | 0.1045 us |  3.4485 | 0.0305 |     - |  17.75 KB |
| RegExReplace_Compiled |              3 |   4.573 us | 0.0318 us | 0.0282 us |  2.7084 | 0.0610 |     - |  13.91 KB |
|         StringReplace |             10 |  74.273 us | 0.7900 us | 0.7390 us | 12.2070 | 0.1221 |     - |  62.75 KB |
|  StringBuilderReplace |             10 | 115.322 us | 0.5820 us | 0.5444 us |  2.6855 |      - |     - |  13.84 KB |
|          RegExReplace |             10 |  24.121 us | 0.1130 us | 0.1002 us |  4.4250 | 0.0916 |     - |  22.75 KB |
| RegExReplace_Compiled |             10 |   8.601 us | 0.0298 us | 0.0279 us |  3.6774 | 0.1221 |     - |  18.92 KB |
|         StringReplace |             20 | 150.193 us | 1.4508 us | 1.3571 us | 24.6582 | 0.2441 |     - | 126.89 KB |
|  StringBuilderReplace |             20 | 233.984 us | 1.1707 us | 1.0951 us |  2.9297 |      - |     - |   15.3 KB |
|          RegExReplace |             20 |  28.699 us | 0.1179 us | 0.1045 us |  4.8218 | 0.0916 |     - |  24.79 KB |
| RegExReplace_Compiled |             20 |  12.672 us | 0.0599 us | 0.0560 us |  4.0894 | 0.1221 |     - |  20.95 KB |

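So my conclusions are:

  • Regex.Replace is the way to go for fast execution and reasonable memory usage. Use a compiled, shared instance to speed it up.
  • StringBuilder has the lowest memory footprint but is a lot slower than Regex.Replace. I would use it only if memory is the only thing that matters.

The code for the benchmark looks like this: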

[MemoryDiagnoser]
[HtmlExporter]
[PlainExporter]
[RPlotExporter]
public class String_Replace
{
    private Dictionary<string, string> _valuesToReplace = new Dictionary<string, string>()
    {
        {"foo","fooRep" },
        {"bar","barRep" },
        {"lorem","loremRep" },
        {"ipsum","ipsumRep" },
        {"x","xRep" },
        {"y","yRep" },
        {"z","zRep" },
        {"yada","yadaRep" },
        {"old","oldRep" },
        {"new","newRep" },

        {"enim","enimRep" },
        {"amet","ametRep" },
        {"sit","sitRep" },
        {"convallis","convallisRep" },
        {"vehicula","vehiculaRep" },
        {"suspendisse","suspendisseRep" },
        {"accumsan","accumsanRep" },
        {"suscipit","suscipitRep" },
        {"ligula","ligulaRep" },
        {"posuere","posuereRep" }
    };

    private Regex _regexCompiled;

    private string GetText_With_3_Tags()
    {
        return @"<html>
        <body>
        Lorem ipsum dolor sit [foo], consectetur [bar] elit. Proin nulla quam, faucibus a ligula quis, posuere commodo elit. Nunc at tincidunt elit. Sed ipsum ex, accumsan sed viverra sit amet, tincidunt id nibh. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Nam interdum ex eget blandit lacinia. Nullam a tortor id sapien fringilla pellentesque vel ac purus. Fusce placerat dapibus tortor id luctus. Aenean in lacinia neque. Fusce quis ultrices odio. Nam id leo neque.

Etiam erat lorem, tincidunt volutpat odio at, finibus pharetra felis. Sed magna enim, accumsan at convallis a, aliquet eu quam. Vestibulum faucibus tincidunt ipsum et lacinia. Sed cursus ut arcu a commodo. Integer euismod eros at efficitur sollicitudin. In quis magna non orci sollicitudin condimentum. Fusce sed lacinia lorem, nec varius erat. In quis odio viverra, pharetra ex ac, hendrerit ante. Mauris congue enim et tellus sollicitudin pulvinar non sit amet tortor. Suspendisse at ex pharetra, semper diam ut, molestie velit. Cras lacinia urna neque, sit amet laoreet ex venenatis nec. Mauris at leo massa.

Aliquam mollis ultrices mi, sit amet venenatis enim rhoncus nec. Integer sit amet lectus tempor, finibus nisl quis, sodales ante. Curabitur suscipit dolor a dignissim consequat. Nulla eget vestibulum massa. Nam fermentum congue velit a placerat. Vivamus bibendum ex velit, id auctor ipsum bibendum eu. Praesent id gravida dui. Curabitur sollicitudin lobortis purus ac tempor. Sed felis enim, ornare et est egestas, blandit tincidunt lacus. Ut commodo dignissim augue, eget bibendum augue facilisis non.

Ut tortor neque, dignissim sit amet [lorem] ut, facilisis sit amet quam. Nullam et leo ut est congue vehicula et accumsan dolor. Aliquam erat dolor, eleifend a ipsum et, maximus suscipit ipsum. Nunc nec diam ex. Praesent suscipit aliquet condimentum. Nulla sodales lobortis fermentum. Maecenas ut laoreet sem. Ut id pulvinar urna, vel gravida lacus. Integer nunc urna, euismod eget vulputate sit amet, pharetra nec velit. Donec vel elit ac dolor varius faucibus tempus sed tortor. Donec metus diam, condimentum sit amet odio at, cursus cursus risus. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nullam maximus tellus id quam consequat vestibulum. Curabitur rutrum eros tellus, eget commodo mauris sollicitudin a. In dignissim non est at pretium. Nunc bibendum pharetra dui ac ullamcorper.

Sed rutrum vehicula pretium. Morbi eu felis ante. Aliquam vel mauris at felis tempus dictum ac a justo. Suspendisse ultricies nisi turpis, non sagittis magna porttitor venenatis. Aliquam ac risus quis leo semper viverra non ac nunc. Phasellus lacinia augue sed libero elementum, at interdum nunc posuere. Duis lacinia rhoncus urna eget scelerisque. Morbi ullamcorper tempus bibendum. Proin at est eget nibh dignissim bibendum. Fusce imperdiet ut urna nec mattis. Aliquam massa mauris, consequat tristique magna non, sodales tempus massa. Ut lobortis risus rhoncus, molestie mi vitae, accumsan enim. Quisque dapibus libero elementum lectus dignissim, non finibus lacus lacinia.
        </p><p>
        </body>
        </html>";
    }

    
    private string GetText_With_10_Tags()
    {
          return @"<html>
        <body>
        Lorem ipsum dolor sit [foo], consectetur [bar] elit. Proin nulla quam, faucibus a ligula quis, posuere commodo elit. Nunc at tincidunt elit. Sed ipsum ex, accumsan sed viverra sit amet, tincidunt id nibh. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Nam interdum ex eget blandit lacinia. Nullam a tortor id sapien fringilla pellentesque vel ac purus. Fusce placerat dapibus tortor id luctus. Aenean in lacinia neque. Fusce quis ultrices odio. Nam id leo neque.

Etiam erat lorem, tincidunt volutpat odio at, finibus pharetra felis. Sed magna enim, [z] at convallis a, aliquet eu quam. Vestibulum faucibus tincidunt ipsum et lacinia. Sed cursus ut arcu a commodo. Integer euismod eros at efficitur sollicitudin. In quis magna non orci sollicitudin condimentum. Fusce sed lacinia lorem, nec varius erat. In quis odio viverra, pharetra ex ac, hendrerit ante. Mauris congue enim et tellus sollicitudin pulvinar non sit amet tortor. Suspendisse at ex pharetra, semper diam ut, molestie velit. Cras lacinia urna neque, sit amet laoreet ex venenatis nec. Mauris at leo massa.

Aliquam mollis ultrices mi, sit amet venenatis enim rhoncus nec. Integer sit amet [y] tempor, finibus nisl quis, sodales ante. Curabitur suscipit dolor a dignissim consequat. Nulla eget vestibulum massa. Nam fermentum congue velit a placerat. Vivamus bibendum ex velit, id auctor ipsum bibendum eu. Praesent id gravida dui. Curabitur sollicitudin lobortis purus ac tempor. Sed felis enim, ornare et est egestas, blandit tincidunt lacus. Ut commodo dignissim augue, eget bibendum augue facilisis non.

Ut tortor neque, dignissim sit amet [lorem] ut, [ipsum] sit amet quam. [x] et leo ut est congue [new] et accumsan dolor. Aliquam erat dolor, eleifend a ipsum et, maximus suscipit ipsum. Nunc nec diam ex. Praesent suscipit aliquet condimentum. Nulla sodales lobortis fermentum. Maecenas ut laoreet sem. Ut id pulvinar urna, vel gravida lacus. Integer nunc urna, euismod eget vulputate sit amet, pharetra nec velit. Donec vel elit ac dolor varius faucibus tempus sed tortor. Donec metus diam, condimentum sit amet odio at, cursus cursus risus. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nullam maximus tellus id quam consequat vestibulum. Curabitur rutrum eros tellus, eget commodo mauris sollicitudin a. In dignissim non est at pretium. Nunc bibendum pharetra dui ac ullamcorper.

Sed rutrum vehicula pretium. Morbi eu felis ante. Aliquam vel [old] at felis [yada] dictum ac a justo. Suspendisse ultricies nisi turpis, non sagittis magna porttitor venenatis. Aliquam ac risus quis leo semper viverra non ac nunc. Phasellus lacinia augue sed libero elementum, at interdum nunc posuere. Duis lacinia rhoncus urna eget scelerisque. Morbi ullamcorper tempus bibendum. Proin at est eget nibh dignissim bibendum. Fusce imperdiet ut urna nec mattis. Aliquam massa mauris, consequat tristique magna non, sodales tempus massa. Ut lobortis risus rhoncus, molestie mi vitae, accumsan enim. Quisque dapibus libero elementum lectus dignissim, non finibus lacus lacinia.
        </p><p>
        </body>
        </html>";
    }

    private string GetText_With_20_Tags()
    {
           return @"<html>
        <body>
        Lorem ipsum dolor sit [foo], consectetur [bar] elit. Proin nulla [convallis], faucibus a [vehicula] quis, posuere commodo elit. Nunc at tincidunt elit. Sed ipsum ex, accumsan sed viverra sit amet, tincidunt id nibh. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Nam interdum ex eget blandit lacinia. Nullam a tortor id sapien fringilla pellentesque vel ac purus. Fusce placerat dapibus tortor id luctus. Aenean in lacinia neque. Fusce quis ultrices odio. Nam id leo neque.

Etiam erat lorem, tincidunt [posuere] odio at, finibus pharetra felis. Sed magna enim, [z] at convallis a, [enim] eu quam. Vestibulum faucibus tincidunt ipsum et lacinia. Sed cursus ut arcu a commodo. Integer euismod eros at efficitur sollicitudin. In quis magna non orci sollicitudin condimentum. Fusce sed lacinia lorem, nec varius erat. In quis odio viverra, pharetra ex ac, hendrerit ante. Mauris congue enim et tellus sollicitudin pulvinar non sit amet tortor. Suspendisse at ex pharetra, semper diam ut, molestie velit. Cras lacinia urna neque, sit amet laoreet ex venenatis nec. Mauris at leo massa.

[suspendisse] mollis [amet] mi, sit amet venenatis enim rhoncus nec. Integer sit amet [y] tempor, finibus nisl quis, sodales ante. Curabitur suscipit dolor a dignissim consequat. Nulla eget vestibulum massa. Nam fermentum congue velit a placerat. Vivamus bibendum ex velit, id auctor ipsum bibendum eu. Praesent id gravida dui. Curabitur sollicitudin lobortis purus ac tempor. Sed felis enim, ornare et est egestas, blandit tincidunt lacus. Ut commodo dignissim augue, eget bibendum augue facilisis non.

Ut tortor neque, dignissim sit amet [lorem] ut, [ipsum] sit amet quam. [x] et leo ut est congue [new] et accumsan [ligula]. Aliquam erat dolor, eleifend a ipsum et, maximus suscipit ipsum. Nunc nec diam ex. Praesent suscipit aliquet condimentum. Nulla sodales lobortis fermentum. Maecenas ut laoreet sem. Ut id pulvinar urna, vel gravida lacus. Integer nunc urna, euismod eget vulputate sit amet, pharetra nec velit. Donec vel elit ac dolor varius faucibus tempus sed tortor. Donec metus diam, condimentum sit amet odio at, cursus cursus risus. Interdum et malesuada fames ac ante ipsum primis in faucibus. Nullam maximus tellus id quam consequat vestibulum. Curabitur rutrum eros tellus, eget commodo mauris sollicitudin a. In dignissim non est at pretium. Nunc bibendum pharetra dui ac ullamcorper.

Sed rutrum vehicula [accumsan]. Morbi eu [suscipit] [sit]. Aliquam vel [old] at felis [yada] dictum ac a justo. Suspendisse ultricies nisi turpis, non sagittis magna porttitor venenatis. Aliquam ac risus quis leo semper viverra non ac nunc. Phasellus lacinia augue sed libero elementum, at interdum nunc posuere. Duis lacinia rhoncus urna eget scelerisque. Morbi ullamcorper tempus bibendum. Proin at est eget nibh dignissim bibendum. Fusce imperdiet ut urna nec mattis. Aliquam massa mauris, consequat tristique magna non, sodales tempus massa. Ut lobortis risus rhoncus, molestie mi vitae, accumsan enim. Quisque dapibus libero elementum lectus dignissim, non finibus lacus lacinia.
        </p><p>
        </body>
        </html>";
    }

    private string GetText(int numberOfReplace)
    {
        if (numberOfReplace == 3)
            return GetText_With_3_Tags();
        if (numberOfReplace == 10)
            return GetText_With_10_Tags();
        if (numberOfReplace == 20)
            return GetText_With_20_Tags();

        return "";
    }

    public String_Replace()
    {
        _regexCompiled = new Regex(@"\[([^]]*)\]",RegexOptions.Compiled);
    }

    [Params(3,10,20)]
    public int ItemsToReplace { get; set; }

    [Benchmark]
    public void StringReplace()
    {
        var str = GetText(ItemsToReplace);
        foreach (var rep  in _valuesToReplace.Take(ItemsToReplace))
        {
            str = str.Replace("[" + rep.Key + "]", rep.Value);
        }
    }

    [Benchmark]
    public void StringBuilderReplace()
    {
        var sb = new StringBuilder(GetText(ItemsToReplace));
        foreach (var rep  in _valuesToReplace.Take(ItemsToReplace))
        {
            sb.Replace("[" + rep.Key + "]", rep.Value);
        }
        var res = sb.ToString();
    }

    [Benchmark]
    public void RegExReplace()
    {
        var str = GetText(ItemsToReplace);
        Regex regex = new Regex(@"\[([^]]*)\]");
        
        str = regex.Replace(str, Replace);
        var res = str;
    }

    

    [Benchmark]
    public void RegExReplace_Compiled()
    {
        var str = GetText(ItemsToReplace);

        str = _regexCompiled.Replace(str, Replace);
        var res = str;
    }

    private string Replace(Match match)
    {
        if(match.Groups.Count > 0)
        { 
            string collectionKey = match.Groups[1].Value;

            return _valuesToReplace[collectionKey];
        }
        return string.Empty;
    }

}

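Answer 8 (score: 0)

If you want a built-in class in .NET, I think StringBuilder is the best. To do it manually, you can use unsafe code with char*, iterate through your string, and replace based on your criteria.

Answer 9 (score: 0)

Since you have multiple replacements on one string, I would recommend using Regex over StringBuilder.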
