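Count the number of commas in a string, except those between double quotes

Time: 2012-04-11 19:32:06

Tags: java string performance counter

I have the following function to count the commas (or any other character) in a string, without counting those inside double quotes. I would like to know whether there is a better way to achieve this, or whether you can find cases that would make this function break.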

public int countCharOfString(char c, String s) {
    int numberOfC = 0;
    boolean doubleQuotesFound = false;
    for(int i = 0; i < s.length(); i++){
        if(s.charAt(i) == c && !doubleQuotesFound){
            numberOfC++;
        }else if(s.charAt(i) == c && doubleQuotesFound){
            continue;
        }else if(s.charAt(i) == '\"'){
            doubleQuotesFound = !doubleQuotesFound;
        }
    }
    return numberOfC;
}

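Thanks for any suggestions.

7 answers:

Answer 0 (score: 3)

This implementation differs in two ways:

  • It uses CharSequence instead of String.
  • It needs no boolean to track whether we are inside a quoted subsequence.

The function: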

public static int countCharOfString(char quote, CharSequence sequence) {

    int total = 0, length = sequence.length();

    for(int i = 0; i < length; i++){
        char c = sequence.charAt(i);
        if (c == '"') {
            // Skip quoted sequence
            for (i++; i < length && sequence.charAt(i)!='"'; i++) {}
        } else if (c == quote) {
            total++;
        }
    }

    return total;
}
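Because the parameter is a CharSequence, the same method accepts a StringBuilder as well as a String. The sketch below repeats the method so it runs on its own; the class name CharSequenceDemo and the parameter name target (the answer calls it quote) are assumptions made for illustration.

```java
class CharSequenceDemo {
    // Same logic as the answer's method; only the first parameter is renamed.
    static int countCharOfString(char target, CharSequence sequence) {
        int total = 0, length = sequence.length();
        for (int i = 0; i < length; i++) {
            char c = sequence.charAt(i);
            if (c == '"') {
                // Skip ahead to the closing quote (or to the end of the input)
                for (i++; i < length && sequence.charAt(i) != '"'; i++) {}
            } else if (c == target) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String text = "a,\"b,c\",d";                          // a,"b,c",d
        StringBuilder builder = new StringBuilder("x,\"y,z\""); // x,"y,z"
        System.out.println(countCharOfString(',', text));    // prints 2
        System.out.println(countCharOfString(',', builder)); // prints 1
    }
}
```

Note that an unterminated quote makes the inner loop run to the end of the input, so everything after it is treated as quoted.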

Answer 1 (score: 2)

public static int countCharOfString(char c, String s)
{
    int numberOfC = 0;
    int innerC = 0;
    boolean holdDoubleQuotes = false;
    for(int i = 0; i < s.length(); i++)
    {
        char r = s.charAt(i);
        if(i == s.length() - 1 && r != '\"')
        {
            numberOfC += innerC;
            if(r == c) numberOfC++;
        }
        else if(r == c && !holdDoubleQuotes) numberOfC++;
        else if(r == c && holdDoubleQuotes) innerC++;
        else if(r == '\"' && holdDoubleQuotes)
        {
            holdDoubleQuotes = false;
            innerC = 0;
        }
        else if(r == '\"' && !holdDoubleQuotes) holdDoubleQuotes = true;
    }
    return numberOfC;
}

System.out.println(countCharOfString(',', "Hello, BRabbit27, how\",,,\" are, you?"));

Output:

3

Another approach is to use a regular expression:

public static int countCharOfString(char c, String s)
{
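   // Pad both ends so that a leading or trailing match is not dropped by split()
   s = " " + s + " ";
   // Note: [^"c*"] is a character class, so this only skips a c that directly
   // follows a quote, another c, or '*'; it does not track quoted regions.
   return s.split("[^\"" + c + "*\"][" + c + "]").length - 1;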
}
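The character-class pattern above gives 3 for the sample string, but it does not skip a comma that sits deeper inside a quoted region, such as the one in "a,b". A commonly used alternative is a lookahead that accepts the character only when an even number of double quotes follows it. The sketch below is not part of the answer: the class and method names are hypothetical, it assumes the quotes in the input are balanced, and it counts matches with a Matcher instead of splitting.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadCounter {
    // Counts occurrences of c that are followed by an even number of quotes,
    // i.e. occurrences outside any quoted region (assuming balanced quotes).
    static int countOutsideQuotes(char c, String s) {
        String regex = Pattern.quote(String.valueOf(c))   // treat c literally
                + "(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";     // pairs of quotes up to the end
        Matcher matcher = Pattern.compile(regex).matcher(s);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOutsideQuotes(',', "Hello, BRabbit27, how\",,,\" are, you?")); // prints 3
        System.out.println(countOutsideQuotes(',', "\"a,b\""));                                // prints 0
    }
}
```

The lookahead rescans the rest of the string for every candidate, so this is likely to be much slower than the single-pass loops on long inputs.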

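Answer 2 (score: 1)

  • Don't call charAt() more than once per loop iteration; store the result in a char variable.
  • Don't call length() on every iteration; store it in an int before the loop.
  • Avoid comparing against c repeatedly; use a nested if/else.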
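Applied to the function from the question, the three points above could look roughly like this. It is a sketch rather than the answerer's own code; the class name SingleReadCounter and the variable names are assumptions.

```java
class SingleReadCounter {
    static int countCharOfString(char c, String s) {
        int count = 0;
        boolean inQuotes = false;
        final int length = s.length();            // length() is read once
        for (int i = 0; i < length; i++) {
            final char current = s.charAt(i);     // charAt() is read once per iteration
            if (current == c) {                   // one comparison against c
                if (!inQuotes) {
                    count++;
                }
            } else if (current == '"') {
                inQuotes = !inQuotes;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Same sample string as in Answer 1
        System.out.println(countCharOfString(',', "Hello, BRabbit27, how\",,,\" are, you?")); // prints 3
    }
}
```

The branch order matches the question's code, so the result is the same for every input, including the odd case where c is the double quote itself.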

Answer 3 (score: 1)

Maybe not the fastest...

public int countCharOfString(char c, String s) {
    final String removedQuoted = s.replaceAll("\".*?\"", "");
    int total = 0;
    for(int i = 0; i < removedQuoted.length(); ++i)
        if(removedQuoted.charAt(i) == c)
            ++total;
    return total;
}

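Answer 4 (score: 1)

It takes a large string for this to make much of a difference.

This code is faster because it performs 1.5 checks per iteration on average, rather than 3. It does so by using two loops: one for the unquoted state and one for the quoted state.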

import java.util.Random; // needed by generateString()

public static void main(String... args) {
    String s = generateString(20 * 1024 * 1024);
    for (int i = 0; i < 15; i++) {
        long start = System.nanoTime();
        countCharOfString(',', s);
        long mid = System.nanoTime();
        countCharOfString2(',', s);
        long end = System.nanoTime();
        System.out.printf("countCharOfString() took %.3f ms, countCharOfString2() took %.3f ms%n",
                (mid - start) / 1e6, (end - mid) / 1e6);
    }
}

private static String generateString(int length) {
    StringBuilder sb = new StringBuilder(length);
    Random rand = new Random(1);
    while (sb.length() < length)
        sb.append((char) (rand.nextInt(96) + 32)); // includes , and "
    return sb.toString();
}

public static int countCharOfString2(char c, String s) {
    int numberOfC = 0, i = 0;
    while (i < s.length()) {
        // not quoted
        while (i < s.length()) {
            char ch = s.charAt(i++);
            if (ch == c)
                numberOfC++;
            else if (ch == '"')
                break;
        }
        // quoted
        while (i < s.length()) {
            char ch = s.charAt(i++);
            if (ch == '"')
                break;
        }
    }
    return numberOfC;
}


public static int countCharOfString(char c, String s) {
    int numberOfC = 0;
    boolean doubleQuotesFound = false;
    for (int i = 0; i < s.length(); i++) {
        if (s.charAt(i) == c && !doubleQuotesFound) {
            numberOfC++;
        } else if (s.charAt(i) == c && doubleQuotesFound) {
            continue;
        } else if (s.charAt(i) == '\"') {
            doubleQuotesFound = !doubleQuotesFound;
        }
    }
    return numberOfC;
}

Prints:

countCharOfString() took 33.348 ms, countCharOfString2() took 31.381 ms
countCharOfString() took 28.265 ms, countCharOfString2() took 25.801 ms
countCharOfString() took 28.142 ms, countCharOfString2() took 14.576 ms
countCharOfString() took 28.372 ms, countCharOfString2() took 14.540 ms
countCharOfString() took 28.191 ms, countCharOfString2() took 14.616 ms

Answer 5 (score: 1)

Simpler and less error-prone (and yes, less performant than walking the string char by char and tracking everything by hand):

public static int countCharOfString(char c, String s) {
  s = s.replaceAll("\".*?\"", "");
  int cnt = 0;
  for (int foundAt = s.indexOf(c); foundAt > -1; foundAt = s.indexOf(c, foundAt+1)) 
    cnt++;
  return cnt;
}
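One case where this version and the flag-based loop from the question disagree is an unterminated quote: the regex only removes complete quoted pairs, while the loop treats everything after a lone quote as quoted. The comparison below is a sketch; the class and method names are assumptions, and the two bodies are copied from this answer and from the question.

```java
class UnbalancedQuoteDemo {
    // Regex-based variant from this answer
    static int countWithRegex(char c, String s) {
        s = s.replaceAll("\".*?\"", "");
        int cnt = 0;
        for (int foundAt = s.indexOf(c); foundAt > -1; foundAt = s.indexOf(c, foundAt + 1))
            cnt++;
        return cnt;
    }

    // Flag-based variant from the question
    static int countWithFlag(char c, String s) {
        int numberOfC = 0;
        boolean doubleQuotesFound = false;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == c) {
                if (!doubleQuotesFound) numberOfC++;
            } else if (ch == '"') {
                doubleQuotesFound = !doubleQuotesFound;
            }
        }
        return numberOfC;
    }

    public static void main(String[] args) {
        String balanced = "a,\"b,c\",d"; // a,"b,c",d
        String unbalanced = "a,\"b,c";   // a,"b,c
        System.out.println(countWithRegex(',', balanced) + " " + countWithFlag(',', balanced));     // prints 2 2
        System.out.println(countWithRegex(',', unbalanced) + " " + countWithFlag(',', unbalanced)); // prints 2 1
    }
}
```

Which result is right depends on how malformed input should be treated. Separately, "." does not match line terminators by default, so a quoted region that spans a line break would not be removed by this pattern.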

Answer 6 (score: 0)

You can also use a regular expression and String.split().

It might look something like this:

public int countNonQuotedOccurrences(String inputString, char searchChar)
{
  String regexPattern = "[^\"]" + searchChar + "[^\"]";
  return inputString.split(regexPattern).length - 1;
}

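Disclaimer:

This only demonstrates the basic approach.

The code above does not check for searchChar at the beginning or end of the string.

You can check for that manually or add it to regexPattern.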
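One way to fold the start and end checks into the pattern is to replace the two character classes with negative lookarounds, which also succeed at the string boundaries. This is only a sketch of that idea, not the answerer's code: the class name LookaroundCounter is an assumption, and counting matches with a Matcher replaces split() so that trailing empty strings cannot skew the result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundCounter {
    // Counts searchChar wherever it is not directly next to a double quote.
    // Like the pattern in the answer, this checks adjacency only.
    static int countNonQuotedOccurrences(String inputString, char searchChar) {
        String literal = Pattern.quote(String.valueOf(searchChar)); // treat the char literally
        Pattern pattern = Pattern.compile("(?<!\")" + literal + "(?!\")");
        Matcher matcher = pattern.matcher(inputString);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNonQuotedOccurrences(",a,b,", ','));    // prints 3
        System.out.println(countNonQuotedOccurrences("a\",\"b", ','));  // prints 0
    }
}
```

Because only the immediate neighbours are inspected, a comma inside longer quoted text such as "a,b" is still counted. Tracking quoted regions needs a different pattern or one of the loop-based answers.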