
Time: 2016-07-05 18:23:00

Tags: c# winforms performance string-comparison



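I currently have a string, string string1 = "xxxxxx";. It is 6 characters long, and each character is 0, 1 or x. I need to compare this string with another string that has the same number of characters, but whose values are only 0 or 1.

  • The character x in the first string means the corresponding character in the second string can be anything

  • The character 0 in the first string accepts only 0 in the second string

  • The character 1 in the first string accepts only 1 in the second string

For example:

    string pattern = "xxxxxx"; string test1 = "010101"; // pass
    pattern = "1xxxxx"; string test2 = "010101"; // not pass
    pattern = "0xxxxx"; string test3 = "010101"; // pass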


    public bool passCombination(string pattern, string combination)
    {
        bool combination_passed = true;
        for (int i = 0; i < pattern.Length; i++)
        {
            char test_char = pattern[i];
            // 'x' is a wildcard; any other character must match exactly.
            if (test_char != 'x' && combination[i] != test_char)
            {
                combination_passed = false;
            }
        }

        return combination_passed;
    }



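It's pretty simple. Basically I iterate char after char. If the pattern character is x, I don't care about the value in the second string. If it's any other character, I compare.

Since this is a string-based approach, I am wondering whether there is another solution. In my real scenario I have to perform roughly 700k such checks, about 1.5 million times over. And that is a very optimistic number :)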




I did run a comparison test. With the old approach I got an execution time of 2.5 minutes, and with the new approach suggested below (the accepted answer) about 2 minutes. That is roughly a 20% reduction in run time.

2 answers:

Answer 0 (score: 4)




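If you're having performance problems, you could "compile" each pattern into two integers: one is the pattern, with a 1 for every 1 and a 0 for every 0 or x; the other is a mask, with a 0 for every x and a 1 for every 0 or 1. You're wasting 26 bits per integer, but I won't tell anybody.

Then compile the values into integers as well: 1 for 1, 0 for 0.

Write a class that holds those pattern/mask integers and has a method that compares them to a value int. You would "precompile" the "values" and store them as integers instead of strings, or perhaps as a class with an int property and a string property if you need to display them (or you could write a function that converts those ints back to strings).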

public class PatternMatcher
{
    public PatternMatcher(String pattern)
    {
        Pattern = CompilePattern(pattern);
        Mask = CompileMask(pattern);
    }

    #region Fields
    //  Could we save any cycles by making these fields instead of properties? 
    //  I think the optimizer is smarter than that. 
    public int Pattern { get; private set; }
    public int Mask { get; private set; }
    #endregion Fields

    public bool CheckValue(String value)
    {
        return CheckValue(CompileValue(value));
    }

    public bool CheckValue(int value)
    {
        //  a & b: Bitwise And
        //      Any bit that's "true" in both numbers is "true" in the result. 
        //      Any bit that's "false" in EITHER number is "false" in the result.

        //      11 & 11 == 11
        //      11 & 01 == 01
        //      11 & 10 == 10
        //      11 & 00 == 00

        //      01 & 11 == 01
        //      01 & 01 == 01
        //      01 & 10 == 00
        //      01 & 00 == 00

        //  So xx0011 -> 
        //  (bits written in the same order as the characters, for readability)
        //      Pattern: 000011
        //      Mask:    001111
        //      Value    110011

        //  value & Mask:   (110011 & 001111) == 000011
        //  Pattern & Mask: (000011 & 001111) == 000011
        //  000011 == 000011, so these two match. 

        return (value & Mask) == (Pattern & Mask);
    }

    public static int CompileMask(string patternString)
    {
        int mask = 0;
        int bitoffset = 0;

        //  For each character in patternString, set one bit in mask.
        //  Start with bit zero and move left one bit for each character.
        //  The bits end up in reverse order relative to the characters in 
        //  the string (the first character is the least significant bit), 
        //  but that doesn't matter as long as patterns and values are 
        //  compiled the same way. 
        foreach (var ch in patternString)
        {
            switch (ch)
            {
                //  If the pattern has a '0' or a '1', we'll be examining that 
                //  character in the value, so put a 1 at that spot in the mask.
                case '1':
                case '0':
                    //  a | b: Bitwise OR: If a bit is "true" in EITHER number, it's 
                    //  true in the result. So 0110 | 1000 == 1110.
                    //  a << b: Bitwise left shift: Take all the bits in a and move 
                    //  them leftward by b bits, so 0010 << 1 == 0100. 
                    //  So here we shift 1 to the left by some number of bits, and 
                    //  then set that bit in mask to 1. 
                    mask |= 1 << bitoffset;
                    break;

                //  If it's an 'x', we'll ignore that character in the value by 
                //  putting a 0 at that spot in the mask. 
                //  All the bits are zero already.
                case 'x':
                    break;

                default:
                    throw new ArgumentOutOfRangeException(nameof(patternString), "Invalid pattern character: " + ch);
            }

            ++bitoffset;
        }

        return mask;
    }

    public static int CompilePattern(string patternString)
    {
        int pattern = 0;
        int bitoffset = 0;

        foreach (var ch in patternString)
        {
            //  For each character in patternString, set one bit in pattern.
            //  Start with bit zero and move left one bit for each character.
            switch (ch)
            {
                //  If the pattern has a 1, require 1 in the result.
                case '1':
                    pattern |= 1 << bitoffset;
                    break;

                //  For 0, require 0 in the result.
                case '0':
                    //  All the bits were zero already so don't waste time setting 
                    //  it to zero. 
                    break;

                //  Doesn't matter what we do for 'x', since it'll be masked out. 
                //  Just don't throw an exception on it. 
                case 'x':
                    break;

                default:
                    throw new ArgumentOutOfRangeException(nameof(patternString), "Invalid pattern character: " + ch);
            }

            ++bitoffset;
        }

        return pattern;
    }

    public static int CompileValue(string valueString)
    {
        int value = 0;
        int bitoffset = 0;

        //  For each character in valueString, set one bit in value.
        //  Start with bit zero and move left one bit for each character.
        foreach (var ch in valueString)
        {
            switch (ch)
            {
                //  If the value has a '1', have a 1 for that bit
                case '1':
                    value |= 1 << bitoffset;
                    break;

                //  If the value has a '0', leave a 0 for that bit
                //  All the bits were zero already.
                case '0':
                    break;

                default:
                    throw new ArgumentOutOfRangeException(nameof(valueString), "Invalid value character: " + ch);
            }

            ++bitoffset;
        }

        return value;
    }
}
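
The answer mentions that a function could convert the compiled ints back to strings. A minimal sketch of such a function follows. The names ValueFormatter and ValueToString are hypothetical and not part of the original answer, and the sketch assumes the same bit layout as the class above: bit 0 holds the first character, bit 1 the second, and so on.

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class ValueFormatter
{
    // Assumes bit 0 holds the first character, bit 1 the second, and so on.
    // The length must be supplied because an int does not record how many
    // characters the original string had.
    public static string ValueToString(int value, int length)
    {
        var chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            // Test bit i and emit the matching character.
            chars[i] = ((value >> i) & 1) == 1 ? '1' : '0';
        }
        return new string(chars);
    }
}
```

For instance, the string "010101" compiles to 42 under this layout (bits 1, 3 and 5 set), and ValueFormatter.ValueToString(42, 6) gives back "010101".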

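Obviously, if you can't precompile your values and store them as integers, you've wasted your time (and that's a big "if"). But if you can, you create one matcher per pattern and use it 700k+ times in a loop. That is probably faster than looping over a string 700k+ times.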
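
The loop described here might look roughly like the sketch below: every pattern and every value is parsed once, and the inner loop then works only on integers. All names (CompiledPattern, Precompiled, CountMatches) are hypothetical, and the compact struct stands in for the class above so that the block is self-contained. Whether this beats the string version in a given workload is something to measure rather than assume.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical compact stand-in for a precompiled pattern.
public struct CompiledPattern
{
    public int Bits;  // 1 where the pattern requires '1'
    public int Mask;  // 1 where the pattern has '0' or '1', 0 where it has 'x'

    public bool Matches(int value)
    {
        // Bits is only ever set where Mask is set, so no extra masking is needed.
        return (value & Mask) == Bits;
    }
}

public static class Precompiled
{
    public static CompiledPattern CompilePattern(string pattern)
    {
        var result = new CompiledPattern();
        for (int i = 0; i < pattern.Length; i++)
        {
            switch (pattern[i])
            {
                case '1': result.Bits |= 1 << i; result.Mask |= 1 << i; break;
                case '0': result.Mask |= 1 << i; break;
                case 'x': break;
                default: throw new ArgumentException("Invalid pattern character: " + pattern[i]);
            }
        }
        return result;
    }

    public static int CompileValue(string value)
    {
        int bits = 0;
        for (int i = 0; i < value.Length; i++)
        {
            if (value[i] == '1') bits |= 1 << i;
            else if (value[i] != '0') throw new ArgumentException("Invalid value character: " + value[i]);
        }
        return bits;
    }

    // Counts matching (pattern, value) pairs; each string is parsed exactly once.
    public static long CountMatches(IList<string> patterns, IList<string> values)
    {
        var compiledPatterns = new CompiledPattern[patterns.Count];
        for (int p = 0; p < patterns.Count; p++) compiledPatterns[p] = CompilePattern(patterns[p]);

        var compiledValues = new int[values.Count];
        for (int v = 0; v < values.Count; v++) compiledValues[v] = CompileValue(values[v]);

        long count = 0;
        foreach (var pattern in compiledPatterns)
        {
            foreach (int value in compiledValues)
            {
                if (pattern.Matches(value)) count++;
            }
        }
        return count;
    }
}
```

With the three patterns from the question ("xxxxxx", "1xxxxx", "0xxxxx") and the single value "010101", CountMatches returns 2, which agrees with the pass / not pass / pass results listed there.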

Answer 1 (score: 0)



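Also, if you are checking about a million patterns against about a million combinations, there are better algorithms that can bring the current complexity down from O(n^2) to something closer to O(n): http://bigocheatsheet.com/#chart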


foreach (string pattern in patterns)
{
    // Save the indexes of the non-x characters once per pattern.
    var indexes = new List<int>();
    for (int i = 0; i < pattern.Length; i++)
    {
        if (pattern[i] != 'x') indexes.Add(i);
    }

    foreach (string combination in combinations)
    {
        bool combination_passed = true;
        // Only the saved positions need to be compared.
        foreach (int i in indexes)
        {
            if (combination[i] != pattern[i])
            {
                combination_passed = false;
                break;
            }
        }
        // ...
    }
}